Retrieval-Augmented Generation (RAG) models should be in every organisation’s AI toolbox.
With RAG, you make your organisation’s data accessible via a Large Language Model (LLM), like ChatGPT, enabling users to converse with that data. The next step is to convert your RAG implementation into multiple user- or role-specific AI agents.
❗Not all users are equal and shouldn’t interact with the same RAG model. You wouldn’t want your customers to chat with the same AI as your marketing or legal department does. At first, this looks like a data access topic, but there is more to it. You want the AI to act and respond differently to different users. This is where the concept of AI Agents comes in.
👩🎓In my earlier blog, I described the building blocks of a RAG model: Query Translation, Routing, Query Construction, Embedding and Indexing, Retrieval, and Generation.
In every one of these building blocks, decisions need to be made on how to handle the user’s prompts. Effectively, if you’re building one RAG model for your organisation, you have created one AI Agent for all users.
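One way to picture this is a single RAG pipeline parameterised per role. The sketch below is purely illustrative: the role names, data sources, and model labels are assumptions, not a real framework or real model identifiers.

```python
# Minimal sketch: one RAG pipeline, configured per agent/role.
# All names and settings here are illustrative assumptions.

AGENT_CONFIGS = {
    "marketing": {
        "sources": ["web", "crm", "brand_guidelines"],
        "tone": "persuasive",
        "model": "large-context-llm",   # hypothetical model name
    },
    "service_desk": {
        "sources": ["internal_docs"],
        "tone": "short and friendly",
        "model": "small-fast-llm",      # hypothetical model name
    },
}

def build_agent(role: str) -> dict:
    """Resolve the RAG configuration for a given user role."""
    if role not in AGENT_CONFIGS:
        raise ValueError(f"No agent defined for role: {role}")
    return AGENT_CONFIGS[role]
```

Each building block in the pipeline would then read its settings from the resolved config instead of from one global choice.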
AI Agents
❓Different questions, different ways of coming to an answer. Your marketing manager will use a very different methodology and a wide range of sources to create their marketing plan and content compared to the service desk agent, who mainly uses internal documentation. One generic RAG model is unlikely to provide both users with the best possible answers.
Instead, we can create an AI Marketing agent that aligns with how a Marketing Manager works, what data sources they use and what tone of voice the content should have.
We create an AI Agent for the service desk employees that is focused on giving quick, short, and friendly answers that they can use to answer the customer.
🤏One size LLM doesn’t fit all
While building your RAG application, you need to make decisions that influence the model’s behaviour. Take, for example, Query Translation: the process of transforming or augmenting the prompt to retrieve the correct documents that enable the LLM to provide the best possible answer. How you transform or augment the prompt might work perfectly for the customer service agents but not for the marketing manager or sales manager. Another example is choosing the right LLM. For your customer-facing chatbot, you want a small LLM that produces quick answers, while for internal use, you might go for the biggest model with a large context window.
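To make Query Translation concrete, here is a toy sketch of role-aware query expansion. The expansion terms are made-up assumptions for illustration; real systems typically ask an LLM to rewrite the query rather than using static rules like these.

```python
# Illustrative sketch: role-aware query translation.
# The expansion terms below are invented for this example.

ROLE_EXPANSIONS = {
    "service_desk": ["troubleshooting", "known issues"],
    "marketing": ["campaign examples", "audience insights"],
}

def translate_query(query: str, role: str) -> list[str]:
    """Return the original query plus one variant per role-specific term."""
    terms = ROLE_EXPANSIONS.get(role, [])
    return [query] + [f"{query} {term}" for term in terms]
```

The same user question then fans out into different retrieval queries depending on who is asking.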
🤔Choosing the right RAG model
RAG models come in different flavours: standard RAG, CRAG (Corrective RAG) or FLARE, to name a few, and they can be combined with model architectures such as Mixture-of-Experts (MoE). New techniques are popping up quickly; one of them is Mixture-of-Depths (MoD), which promises enhanced focus and reduced costs.
For the customer-facing chatbot, you might choose a relatively simple CRAG model with limited data sources, or perhaps just one. CRAG is a good fit because it evaluates the retrieved documents and triggers corrective action when they are unlikely to support a correct response, which is vital when you are in direct contact with a customer.
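A hedged sketch of the CRAG idea: score each retrieved document and fall back to a corrective action when nothing scores well enough. The keyword-overlap scorer below is a trivial stand-in for the trained retrieval evaluator a real CRAG setup would use.

```python
# Toy stand-in for a CRAG-style retrieval evaluator.
# Real implementations use a trained model, not keyword overlap.

def relevance_score(query: str, document: str) -> float:
    """Fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def corrective_retrieve(query: str, documents: list[str],
                        threshold: float = 0.5) -> list[str]:
    """Keep documents that pass the evaluator; otherwise flag a miss."""
    good = [d for d in documents if relevance_score(query, d) >= threshold]
    if good:
        return good
    # Corrective action: a real CRAG setup might fall back to web
    # search here; this sketch just signals that retrieval failed.
    return ["NO_RELEVANT_DOCUMENT_FOUND"]
```

For a chatbot, that fallback branch is exactly where you would route to a safe "let me connect you to a colleague" answer instead of guessing.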
For your marketing and sales department, you might implement an MoE or MoD model connected with as many data sources as possible to provide an exhaustive answer.
🆘Generic LLMs need help
You might think of LLMs like ChatGPT as the oracle that knows it all, but that isn’t how they were built in the first place. First and foremost, LLMs aim to ‘understand’ our language by predicting the next word. Having accomplished that, we discovered they are pretty good at answering questions but need help.
For instance, we discovered that LLMs produce better answers when we tell them to show us the steps they take to come up with the answer. Also, when we add additional information to the user’s prompt, such as ‘you are a customer service agent from company X; reply in a friendly tone’, the LLM will produce more appropriate answers.
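Both tricks amount to augmenting the prompt before it reaches the model. A minimal sketch, with the role context and wording chosen purely for illustration:

```python
def augment_prompt(user_prompt: str, role_context: str,
                   show_steps: bool = True) -> str:
    """Prepend role context and, optionally, a step-by-step instruction."""
    parts = [role_context]
    if show_steps:
        parts.append("Think step by step and show your reasoning.")
    parts.append(user_prompt)
    return "\n".join(parts)
```

The same user question can then be wrapped differently for each agent, which is most of what makes one agent ‘friendly’ and another ‘thorough’.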
🧑🤝🧑Human LLMs
Personally, I find it funny to see that LLMs work very much like humans. They produce better answers when they:
· are experts focused on one topic,
· work in a structured way, and
· have access to the right data.
And finally, they even lie, or as we call it, hallucinate, when given free rein.
🤖Multi-Agent Models: The Future of AI Agents
The next step in the evolution of AI agents could be having multiple agents working together to solve complex problems. You might have seen the visuals where different AI agents represent every expertise in an organisation, and together, they ‘run’ the company.
Imagine answering a Request For Proposal (RFP) from a prospect. You import the RFP into the AI model and request it to make a proposal. You could have the following workflow:
· The Pre-sales engineer Agent selects the correct services,
· The Marketing Agent creates content,
· The Sales Agent sets the pricing,
· And the Finance and Legal Agents add the T&Cs.
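The workflow above can be sketched as a simple pipeline of specialists. Each ‘agent’ here is a plain function with hard-coded, invented outputs; in practice each would wrap its own LLM and RAG setup.

```python
# Toy sketch of the RFP workflow: each agent enriches a shared
# proposal dict. All outputs are invented for illustration.

def presales_agent(rfp: dict) -> dict:
    rfp["services"] = ["consulting", "support"]    # illustrative
    return rfp

def marketing_agent(rfp: dict) -> dict:
    rfp["content"] = f"Proposal covering {', '.join(rfp['services'])}"
    return rfp

def sales_agent(rfp: dict) -> dict:
    rfp["price"] = 10_000 * len(rfp["services"])   # illustrative pricing
    return rfp

def legal_agent(rfp: dict) -> dict:
    rfp["terms"] = "Standard T&Cs apply."
    return rfp

def answer_rfp(rfp: dict) -> dict:
    """Run the specialist agents in sequence over the shared proposal."""
    for agent in (presales_agent, marketing_agent, sales_agent, legal_agent):
        rfp = agent(rfp)
    return rfp
```

Even in this toy form you can see the coordination problem: each agent depends on the previous one’s output, so one weak link degrades the whole proposal.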
Every agent is an expert in its field, so the theory is that this would provide the best results. This is promising but also very much in its infancy, with systems like this producing output of a highly inconsistent quality level.
📢Conclusion
Agent and Multi-Agent systems represent a significant step forward in the field of AI. By implementing AI agents, LLMs can provide more accurate and comprehensive responses to complex user queries.