
Using ChatGPT as an Assistant Chatbot: are we there yet?

ChatGPT’s release has shaken the web, and everybody is talking about it and the impact its use will have on our lives. It can write entire articles from scratch (although not this one, yet), code entire C++ functions from just a description, and even write music and poetry.

But can it be used as a Sales Assistant Chatbot on an eCommerce store? In this article we will try to answer that question by writing a ChatBot able to answer user inquiries about the products available at a Shirt Store.

No technical knowledge is needed to understand this article! But if you want to dive into the implementation, the code is available at

https://colab.research.google.com/drive/1IKED3XWe5H4uW6bXPp2x0APMq6Y5Xabl

Context is all you need

Let’s think about the capabilities such a chatbot needs in order to be fully functional. First, it needs to be clever enough to understand customer inquiries and assist with purchases. We already know ChatGPT is able to keep up a decent conversation as long as its limits are not intentionally pushed.

Besides intelligence, the ChatBot needs to be aware of the different items the store is offering, their prices and stock, and even certain information particular to the store, like its location and opening hours. We will call this information context, and while it is likely to be available on the store website or internal systems, it’s not possible to tell ChatGPT to go and search for it on the web, or to request it from some external API or database.

Ultimately, in order for any conversational AI to go beyond general conversation, it requires a level of context awareness to successfully perform specific tasks. This means having access to a specific knowledge base the AI can draw from to generate relevant responses. However, if the required knowledge was not available during the AI’s training, whether because the information is missing from public sources, too specific or granular, or newer than the training data, the AI may not be able to perform the desired task effectively. Hence the need to find a way to provide ChatGPT with this context, so that it unlocks its full potential and can perform a wider range of tasks.

Context by Prompt Engineering

The ChatGPT prompt is the starting point of each new conversation: the initial text or input that a user provides to the model to generate a response. This initial prompt has tremendous influence on how ChatGPT will answer and behave for the rest of the conversation. Prompt Engineering is the art of finding the right initial prompt so that the model behaves exactly as we intend. Used wisely, it is a very powerful tool that can push ChatGPT to its real limits, as demonstrated in a famous Reddit post proposing a token reward system in an attempt to encourage ChatGPT to violate its behavioral guidelines.

We can use Prompt Engineering to make ChatGPT aware of the current context every time a new chat is about to start. A simple description of what’s in the store and how it should behave is enough for ChatGPT to start performing as the new Store Assistant.
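For illustration, a minimal sketch of what such an initial prompt could look like (the catalog and store details below are made up for this example, not the exact prompt from the demo):

```
You are a helpful assistant working at an online Shirt Store.
The store sells only these shirts:
- Pikachu shirt, size Small, $20, 5 units in stock
- Kurt Cobain shirt, size Medium, $25, 2 units in stock
The store is located downtown, opens from 9am to 6pm, and ships
only to the southern area of the city.
Answer customer questions using only this information, and help
them choose and purchase shirts.
```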

Done! We have essentially flashed the AI’s memory, as if we had used the MiB Neuralyzer. Further questions are answered based on this context information.

Thanks to its trained intelligence, ChatGPT is even able to tell that a young kid will probably prefer the Pikachu shirt over Kurt Cobain’s face. We can check that it also understands the other contextual information given:

In the initial prompt, we gave some information in a rather vague way. We said orders could be shipped to the southern area of the city, but we didn’t specify exactly which locations that area covered. ChatGPT seems able to discern the location of certain areas, as in this question, where we ask for shipment to Punta de Rieles, a location in the north of the city.

It’s clear that, when provided the proper initial prompt, ChatGPT is capable of correctly performing as a Store Assistant Chatbot. So, how can we integrate it into our website so that new visitors are able to talk with it? We require a way of communicating with ChatGPT that lets us access it from an external site, and not just from the OpenAI demo page.

A chatbot working as a Store Assistant on a Shirt Store, as generated by MidJourney

Sadly, that’s not yet possible. ChatGPT has not been released as an API, so it can only be used from the demo site. OpenAI has not made an official announcement about when it’s planning to release one, and given all the controversy around biased AI responses and the disturbing confessions it made to a NYT reporter, that may not be in the short-term plans.

DaVinci as a ChatBot

The best API alternative to ChatGPT at the moment is another OpenAI model called DaVinci. It’s also a large language model, trained on a massive amount of data, but it wasn’t fine-tuned on dialogue as ChatGPT was. Unlike ChatGPT, DaVinci does not possess conversational abilities out of the box; it only generates the most probable continuation of its input text. However, this doesn’t prevent its use as a chatbot. All we need to do is feed in the conversation as a written dialogue, and DaVinci will generate text as a continuation of that conversation.

Similarly to how we approached ChatGPT, we will start the conversation with an initial prompt containing context information, and then instruct DaVinci to generate text as if it were a dialogue between an AI and a human.

Every written exchange between the Human and the AI is appended to this initial dialogue and used as the prompt for the next AI-generated answer. By doing this, we give the AI the context of the previous conversation, so it can generate new text that follows on from it. This is done automatically for us on the ChatGPT demo site, but we have to do it manually when using the OpenAI API.
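A minimal sketch of this loop in Python, assuming the openai library available at the time of writing (the model name, store details, and generation parameters are illustrative):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # replace with your OpenAI API key

# Illustrative initial prompt: context first, then the dialogue so far.
conversation = """The following is a conversation between a Human and an AI
assistant working at a Shirt Store. The store sells a Pikachu shirt (Small,
$20, 5 in stock) and ships only to the southern area of the city.

Human: Hi!
AI: Hello! Welcome to the Shirt Store. How can I help you?"""

def ask(conversation, user_message):
    """Append the user's message and let DaVinci continue the dialogue."""
    conversation += f"\nHuman: {user_message}\nAI:"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=conversation,
        max_tokens=150,
        temperature=0.7,
        stop=["Human:"],  # stop before the model writes the human's next turn
    )
    answer = response["choices"][0]["text"].strip()
    return conversation + " " + answer, answer

conversation, answer = ask(conversation, "How much is the Pikachu shirt?")
print(answer)
```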

The code for the ChatBot is available as a Google Colab here. If you have an OpenAI account, it is easy to configure and start using. Otherwise, you can just check the generation samples shown there.

Some differences between DaVinci and ChatGPT answers are easily spotted:

  • ChatGPT answers are more verbose, while DaVinci prefers short answers.
  • DaVinci often confuses the stock information of different shirts.
  • DaVinci makes up information that was not given! It even offers a 10% discount for signing up to a newsletter whose existence we never confirmed.

While creating new promotions for each customer could be a marketing feature rather than a bug, a ChatBot giving misleading information about products could be harmful to the store’s reputation. When faced with questions not clearly covered beforehand, current AI models sometimes indulge in the spontaneous generation of false or imaginary information.

While this confabulation problem is likely to be solved in upcoming GPT generations, it’s certainly a risk to weigh when using a GPT-based system in production. We could reduce AI confabulation by giving more precise item descriptions, and even by instructing the AI to restrict itself to what was explicitly said in the initial prompt.

But the main limitation of this Prompt Engineering approach is the maximum prompt length of the DaVinci model. If the number of items sold in our store grows and their descriptions push the prompt beyond roughly 4,000 tokens, the OpenAI API call will simply fail and ask us to reduce the prompt.
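As a rough guard, we can count tokens before calling the API. A sketch using OpenAI’s tiktoken library (assuming p50k_base is the encoding used by text-davinci-003):

```python
import tiktoken

# Assumption: p50k_base is the encoding used by text-davinci-003.
encoding = tiktoken.get_encoding("p50k_base")

def fits_in_prompt(prompt, limit=4000):
    """Return whether the prompt is under the token limit, and its token count."""
    n_tokens = len(encoding.encode(prompt))
    return n_tokens <= limit, n_tokens

ok, n_tokens = fits_in_prompt("Our store sells the following shirts: ..." * 400)
print(ok, n_tokens)
```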

Certainly, one way to avoid reaching the prompt limit is to reduce the context information the conversation starts with, and only load more information once we know what the user is asking for. We could start by asking the user whether they are looking for a long- or short-sleeved shirt, and on the next question load only the descriptions of that particular category, as in the sketch below. This approach could work, but we would need to manually engineer the different chat flows, and decide how to populate the prompt at each conversation step.
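A toy sketch of such a manually engineered flow, with a hypothetical catalog split by category so that only the relevant descriptions reach the prompt:

```python
# Hypothetical catalog, split by the category we ask the user about first.
CATALOG = {
    "long sleeve": ["Kurt Cobain long-sleeve shirt, $25, 2 in stock"],
    "short sleeve": ["Pikachu short-sleeve shirt, $20, 5 in stock"],
}

def build_prompt(category, dialogue):
    """Populate the prompt with only the chosen category's item descriptions."""
    items = "\n".join(CATALOG[category])
    return f"You are a Shirt Store assistant. Items available:\n{items}\n\n{dialogue}"

print(build_prompt("short sleeve", "Human: How much is the Pikachu one?\nAI:"))
```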

Context by Fine-tuning

OpenAI gives us the possibility to fine-tune DaVinci and create a customized model for our application. Fine-tuning involves taking a pre-existing model and training it on your specific dataset so it better understands the nuances of your customers’ queries and responses. By training on our own data, we can teach it to perform a custom task tailored to our organization’s requirements. Ideally, once the model has been fine-tuned, we no longer need to provide context or examples in the prompt, as it has already learned the needed information.

To fine-tune a chatbot, we need data in the form of conversations between the assistant and visitors. The number of conversations needed will vary with the variety of potential contexts and types of customer inquiries, but typically several thousand conversations will be necessary.
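As a sketch, OpenAI fine-tuning (at the time of writing) expects training data as JSONL prompt/completion pairs, which we could derive from historical chats; the exchanges below are made up for illustration:

```python
import json

# Hypothetical historical exchanges between store assistants and visitors.
chats = [
    ("Human: Do you have Pikachu shirts?\nAI:", " Yes! In Small, for $20."),
    ("Human: When are you open?\nAI:", " Every day from 9am to 6pm."),
]

# One JSON object per line, with "prompt" and "completion" fields.
with open("train.jsonl", "w") as f:
    for prompt, completion in chats:
        f.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")

# Then, from a shell:
#   openai api fine_tunes.create -t train.jsonl -m davinci
```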

If we already have historical chat conversations between store assistants and visitors, then fine-tuning is a good way of leveraging that existing data and improving the model’s response quality.

Fine-tuning a chatbot can be a time-consuming and costly process, which means it may not be feasible to use it to teach the chatbot about the current store context. Fine-tuning the model every time a new product becomes available or prices change is just prohibitively expensive, and even if we had infinite resources to do so, it would increase behaviour consistency but wouldn’t completely remove confabulation.

So, if the initial prompt length is limited, and we can’t rely on just wiring context deep inside the model’s brain, how can we make our ChatBot aware of the store context so that it answers accordingly?

Context as external knowledge

To optimize the use of limited space in a GPT prompt, we must selectively include contextual information that pertains to the specific user query. Although the store’s inventory may be extensive, the language model can focus on items that are pertinent to the user’s search criteria, such as a particular type or feature of the item. This approach allows us to narrow down the search and only load information relevant to the user’s needs. For instance, if the language model determines that only 10 items are potentially relevant to the user’s query, we can limit the information in the prompt to those items.

To find which items are relevant to the last user message, we need to establish a method for measuring similarity. We can use OpenAI text embeddings for that. Text embeddings let us represent a given text as a vector, which is essentially a list of numbers. These numbers convey information about the meaning and content of the text, so if two texts are similar in meaning, the distance between their embeddings will be relatively small.
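A minimal sketch of this idea, assuming OpenAI’s text-embedding-ada-002 model and cosine similarity as the closeness measure:

```python
import numpy as np
import openai

def embed(text):
    """Represent a text as an embedding vector using OpenAI's API."""
    response = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(response["data"][0]["embedding"])

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

shirt = embed("Short-sleeved shirt with a Pikachu print, $20")
query = embed("Do you have any Pokemon shirts?")
unrelated = embed("Store location and opening hours")

print(cosine_similarity(shirt, query))      # should be relatively high
print(cosine_similarity(shirt, unrelated))  # should be noticeably lower
```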

Steps for generating an answer based on external stored knowledge

Generating OpenAI text embeddings for every product in the store can be accomplished within a matter of minutes. This involves describing the features and characteristics of each product in written form, in a similar manner to how we provided context in the initial prompt. This information can then be used to generate the corresponding embeddings, which can be compared against the user’s query to identify the most relevant products.

To store the embeddings we create and to enable similarity searches over them, we can use Pinecone, a vector database designed to simplify the process of adding vector-search capabilities to applications. Once the product information embeddings have been saved to Pinecone, we can run similarity searches in response to user questions.
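A sketch of indexing the products, reusing the embed helper from above; the API key, environment, index name, and product list are hypothetical, and 1536 is the dimension of text-embedding-ada-002 vectors:

```python
import pinecone

pinecone.init(api_key="YOUR_PINECONE_KEY", environment="us-west1-gcp")
if "shirt-store" not in pinecone.list_indexes():
    pinecone.create_index("shirt-store", dimension=1536, metric="cosine")
index = pinecone.Index("shirt-store")

products = [
    ("pikachu-small", "Pikachu shirt, size Small, $20, 5 units in stock"),
    ("cobain-medium", "Kurt Cobain shirt, size Medium, $25, 2 units in stock"),
]

# Embed each product description and store it, keeping the text as metadata.
index.upsert(vectors=[
    (product_id, embed(description).tolist(), {"text": description})
    for product_id, description in products
])
```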

Each time a user poses a question to the chatbot, we generate an embedding for the message using the same OpenAI model that was used to create the embeddings for product descriptions. By querying this embedding vector in Pinecone, we can retrieve the most closely related products — those whose descriptions are semantically similar to the user’s question.
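Continuing the same sketch, the retrieval step embeds the question with the same model and queries the Pinecone index:

```python
def retrieve_relevant_products(question, top_k=3):
    """Embed the user's question and fetch the most similar product descriptions."""
    results = index.query(
        vector=embed(question).tolist(),
        top_k=top_k,
        include_metadata=True,
    )
    return [match.metadata["text"] for match in results.matches]

print(retrieve_relevant_products("How much is the Pikachu shirt?"))
```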

After identifying the products that are most relevant to the user's message, we can augment the GPT prompt with their descriptions, providing contextual information to the model. For instance, if a user inquires about the cost of a Pokemon shirt, the ChatBot will consult its knowledge repository to identify information relevant to this query. The resulting information will encompass the shirt's description, stock availability, and pricing. By incorporating this contextual information into the initial prompt, GPT will be able to accurately provide the user with the shirt's price.
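Putting it all together, a sketch of the final step: the retrieved descriptions are pasted into the prompt before asking DaVinci for an answer:

```python
def answer_with_context(question):
    """Build a prompt from the retrieved product info and let DaVinci answer."""
    context = "\n".join(retrieve_relevant_products(question))
    prompt = (
        "You are a helpful Shirt Store assistant. Answer using ONLY the "
        "product information below.\n\n"
        f"{context}\n\nHuman: {question}\nAI:"
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0,  # a low temperature helps reduce made-up details
        stop=["Human:"],
    )
    return response["choices"][0]["text"].strip()

print(answer_with_context("How much is the Pikachu shirt?"))
```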

Let’s try asking for the Pikachu shirt; in the database it is set at $20, with only 5 units available in size Small.

It seems there are many shirts with different Pikachu variations in stock, but the ChatBot was able to retrieve the price for the one we were actually asking about! Let’s ask for the price of a different shirt and see how the ChatBot responds:

Impressive, as that’s exactly what was in the products database at the time!

Conclusions

We have successfully developed a ChatBot that promptly responds to user inquiries about the store’s products, including constantly changing information like item prices and stock.

Remarkably, our implementation is designed to work solely with unstructured data. We converted all our structured database information into text in a very simple way, enabling the chatbot to operate at a higher semantic level.

Let’s review some important takeaways:

  • To be effective in business-specific functions, ChatBots require contextual information. While the GPT prompt is highly robust and capable of conveying context, it is limited to roughly 4,000 tokens, constraining its capacity.
  • To overcome this limitation, we restricted the contextual information fed into the GPT prompt to only what is relevant to the user’s query, using OpenAI embeddings and Pinecone.
  • The GPT model may occasionally generate incorrect or fictitious information when asked about something not explicitly mentioned in its prompt. As a Large Language Model, it is ultimately a statistical tool that predicts the most likely next words without truly understanding them, producing a statistically plausible answer.

GPT models may still have their flaws, but these limitations are expected to diminish in upcoming versions. AI is poised to become even more capable in the coming years, and even though it may not truly grasp meaning, a statistical model may be intelligent enough to manage a basic conversation, such as that of a Store ChatBot.

Meanwhile, present versions of GPT can already be leveraged for ChatBot implementations, with the help of tailored code or manually defined conversation flows suited to the specific business application.
