By Jerry D. Boonstra
The secret of getting ahead is getting started. — Mark Twain
Introduction
I’ve spent this year diving into the LLM application development scene, and there is a lot going on!
Using my newfound knowledge, I built a multi-user chat assistant at work that can help data scientists. I was challenged to do it quickly and at low cost, and I'll share some of what I learned in the process.
This is the first article in a series where we’ll delve into the intricacies of creating a multi-user chat assistant using AWS serverless and OpenAI. Our goal is to provide you with a clear roadmap for building and continuously improving your solution.
Ultimately, this series aims to empower you to build a full-fledged AI assistant using open-source models, your data, and your compute resources.
The series as planned
- (this article) 👉 From Zero to Hero: Want to Cheaply Build a Robust Multi-User Chat Assistant?
- From Zero to Hero: Building and deploying your first multi-user Chat Assistant using AWS Lambda, OpenAI Assistant API, and TypeScript
- From Zero to Hero: Adding Logging Traces, Ratings and Unit Tests to your Chat Assistant
- From Zero to Hero: Adding Evals to your Chat Assistant
- From Zero to Hero: Fine-tuning your LLM Application to Balance Accuracy, Robustness, and Cost
- …?
Some things you should know
Before we proceed, this series will make a few assumptions about your use case:
- You are OK with sharing your input data with OpenAI under their Terms of Service.
- A multi-turn chatbot-style interface suits your business problem.
- You think AWS serverless technologies are cool (enough :)), and you have admin access to your AWS account.
- You are comfortable with TypeScript.
If these do not hold, this series might not be for you.
Beyond a Demo
Evaluating quality in a continuous improvement process is important if we want a robust application that works “beyond a demo”.
To paraphrase AI consultant Hamel Husain:
Many people focus exclusively on changing the behavior of the system… which prevents them from improving their LLM products beyond a demo.
Success with AI hinges on how fast you can iterate. In addition to a process for changing your system, you must have processes and tools for:
- Evaluating quality (ex: tests; a minimal example follows below)
- Debugging issues (ex: logging & inspecting data)
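To make the evaluation step concrete before we get to the dedicated article, here is a minimal sketch of what an early quality test could look like. The `askAssistant()` helper and the Jest-style syntax are assumptions for illustration, not part of this series' stack yet.

```typescript
// A minimal sketch of an early quality-evaluation test. It assumes a
// hypothetical askAssistant() wrapper around your assistant call and a
// Jest-style test runner; adapt both to your own setup.
import { askAssistant } from "./assistant"; // hypothetical helper

describe("chat assistant quality", () => {
  it("answers a known question with the expected fact", async () => {
    const reply = await askAssistant("Which file format does our ETL job write?");
    // Cheap, deterministic check: look for a required keyword.
    expect(reply.toLowerCase()).toContain("parquet");
  });

  it("declines questions outside its scope", async () => {
    const reply = await askAssistant("Write me a poem about pirates.");
    // A crude guardrail check; real evals come later in the series.
    expect(reply.toLowerCase()).not.toContain("ahoy");
  });
});
```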
Controlled Change
A continuous improvement process for LLM-based applications always includes a quality evaluation step and looks like this:
It’s a lot. Not all of it is necessary for every application. So how do you really start?
First Steps
Here is a suggested series of steps for building a robust application, starting from zero.
Step 0: Understand the Problem
Understand your problem well enough to describe it in natural language.
First Prompt
Write down your first prompt, prioritizing these aspects:
- Clarity: Ensure the prompt is clear and unambiguous. Avoid vague language and be specific about what you’re asking.
- Context: Provide sufficient context to frame the question or task. This might include background information, specific scenarios, or examples.
- Conciseness: Keep the prompt as brief as possible while retaining necessary detail. This helps the model focus on the main task without unnecessary complexity.
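As a concrete illustration (entirely hypothetical, for a data-science helper assistant), a first prompt that aims for all three aspects might look like this:

```typescript
// A hypothetical first prompt for a data-science helper assistant.
// The company, library, and limits below are invented for illustration.
const firstPrompt = [
  // Context: who the assistant serves and what it should know about
  "You are an assistant for data scientists at Example Corp.",
  "You answer questions about our internal Python ETL library, 'pipelines'.",
  // Clarity: be specific about the expected behavior
  "When asked for code, reply with a short, runnable snippet.",
  "If you are unsure of an answer, say so instead of guessing.",
  // Conciseness: bound the output so responses stay focused
  "Keep answers under 200 words unless asked for more detail.",
].join("\n");
```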
Step 1: Choose Model and Tools
The OpenAI Assistants API supports multiple models, with multiple tools available per model. If you choose gpt-3.5-turbo or newer, you can use any of these tools:
- Code Interpreter: Allows Assistants to write and run Python code in a sandboxed execution environment
- File Search: Augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users
- Function calling: Allows you to describe functions to the Assistants API and have it intelligently return the functions that need to be called along with their arguments.
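As a rough sketch of what this choice looks like in code with the openai Node SDK (the assistant name, instructions, and the lookup_dataset function below are made up for illustration):

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Sketch: create an Assistant with a chosen model and the tools it may use.
const assistant = await openai.beta.assistants.create({
  name: "Data Science Helper", // illustrative
  instructions: "You help data scientists with our internal tooling.",
  model: "gpt-3.5-turbo", // or a newer model
  tools: [
    { type: "code_interpreter" },
    { type: "file_search" },
    {
      type: "function",
      function: {
        name: "lookup_dataset", // hypothetical function your backend implements
        description: "Return metadata for a dataset by name",
        parameters: {
          type: "object",
          properties: { name: { type: "string" } },
          required: ["name"],
        },
      },
    },
  ],
});

console.log(`Created assistant ${assistant.id}`);
```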
Step 2: Do Prompt Engineering in the Playground
Prompt engineering is the discipline of developing and optimizing prompts to efficiently apply and build with LLMs for various applications and use cases.
- To get started, it's helpful to browse categorized OpenAI example prompts and PromptingGuide.ai example prompts.
- To further improve your prompts, familiarize yourself with PromptingGuide.ai's detailed guide on this topic.
- With your chosen model, do iterative prompt engineering until you get something that works most of the time. More on this below.
Use the OpenAI Assistant Playground, the quickest way to prove the concept.
When in the playground:
- Include as much context as your application needs. Providing context can take the form of:
  - User-uploaded documents, which are inserted directly into the context.
  - Administrator-added content with Retrieval Augmented Generation (RAG), where standard NLP search techniques identify relevant documents to supply to the LLM.
- Providing few-shot examples often helps for otherwise hard-to-address cases (see the sketch after this list).
- Prompt chaining, where a task is split into subtasks to create a chain of prompt operations, is useful for approaching complex tasks.
- For the most deterministic output, use the lowest temperature (t=0).
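For reference, here is a minimal sketch of the few-shot and temperature ideas expressed with the openai Node SDK's Chat Completions API; the classification task and examples are invented, and in practice you would iterate on them in the playground first.

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Sketch: few-shot examples plus temperature 0 for the most deterministic output.
const completion = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  temperature: 0,
  messages: [
    {
      role: "system",
      content:
        "Classify each support question as 'data', 'infra', or 'other'. Reply with the label only.",
    },
    // Few-shot examples for otherwise hard-to-address cases
    { role: "user", content: "Why did my Spark job run out of memory?" },
    { role: "assistant", content: "infra" },
    { role: "user", content: "Which table holds last quarter's revenue?" },
    { role: "assistant", content: "data" },
    // The real input
    { role: "user", content: "Can I expense a conference ticket?" },
  ],
});

console.log(completion.choices[0].message.content); // expected: "other"
```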
You will eventually hit a good-enough threshold where you can stop iterating and commit to your prompt and model choice.
Step 3: Build and Deploy a Multi-User Application
Once you have something that seems to be working, it's time to get it in front of some users!
For this, you'll need a multi-user application, which is the topic of the next article in the series.
Stay Tuned!
If you enjoyed this article, please consider sharing it with your friends or follow me @jerrydboonstra on Twitter/X to receive notifications about new articles.