Realtime streams

Zero Limits, Instant Response

When it comes to realtime data processing, dedicated inference allows you to have:

  • No request limits. You can run hundreds of requests a seconds.

  • Stable latency. As compute is dedicated to you, there is stable performance.

In this simple example of a reddit bot demonstrates this. All comments from reddit are classified in realtime. Dozens of requests are sent each second to the classification chain, which classifies each comment into one of the categories Art, Finance, Science, Taylor Swift or Other. In case a comment about Taylor Swift is detected, the bot, a huge Taylor Swift fan, will create a comment in response.

import praw
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

from cortecs_py import Cortecs
from cortecs_py.integrations import DedicatedLLM

# this example demonstrates dedicated inference in realtime settings
if __name__ == '__main__':
    model_id = 'neuralmagic--Meta-Llama-3.1-8B-Instruct-FP8'
    cortecs = Cortecs()
    reddit = praw.Reddit(user_agent='Read-only example bot')

    with DedicatedLLM(cortecs, model_id, context_length=1500, temperature=0.) as llm:  # todo decrease context_length

        prompt = ChatPromptTemplate.from_template("""
        Given the reddit post below, classify it as either `Art`, `Finance`, `Science`, `Taylor Swift` or `Other`.
        Do not provide an explanation.
        
        {channel}: {title}\n Classification:""")
        classification_chain = prompt | llm | StrOutputParser()

        prompt = ChatPromptTemplate.from_messages([
            ("system", "You are the biggest Taylor Swift fan."),
            ("user", "Respond to this post:\n {comment}")
        ])
        response_chain = prompt | llm

        # scan reddit in realtime and shill about tay tay
        for post in reddit.subreddit("all").stream.comments():
            topic = classification_chain.invoke({'channel': post.subreddit_name_prefixed, 'title': post.link_title})
            print(f'{post.subreddit_name_prefixed} {post.link_title}')
            if topic == 'Taylor Swift':
                response = response_chain.invoke({'comment': post.body})
                print(post.body + '\n---> ' + response.content)

The example is build on praw, so if you want to run this example on your own machine you have to setup a Reddit account with API-Access first

Last updated