A Guide to using RAG with AskYoda

Ever wondered how to enhance Large Language Models (LLMs) with your own data? Enter retrieval-augmented generation (RAG)! This article explores RAG's benefits, mechanics, use cases, and implementation in Eden AI (AskYoda). Let's dive in!

What is RAG?

Large Language Models (LLMs) have changed the way we interact with technology, thanks to their remarkable ability to generate high-quality, human-like text, translate languages, create diverse forms of creative content, and answer questions informatively.

Despite this, LLMs face some limitations, particularly in consistently producing accurate and relevant responses, as they lack a clear source for the information they generate.

In response to this, researchers have introduced a new approach known as Retrieval-Augmented Generation (RAG), which combines the strengths of retrieval and generative models to improve the accuracy of LLMs.

How Retrieval Augmented Generation works

RAG is a method that enhances the quality and relevance of LLM-generated responses by allowing them to access additional data resources without needing to be retrained.

Retrieval models are good at extracting relevant information from vast datasets, while generative models are good at generating creative text. So, RAG uses retrieval models to find relevant documents or data points and adds them to the LLM's prompt to produce more accurate responses.

RAG works by collecting and adding relevant documents or data points to an LLM's prompt to generate a more accurate response. Here's a step-by-step breakdown of the process:

  1. The context window of a Large Language Model (LLM) defines what the model can "see" at any given moment. RAG acts like a prompt card, placing the crucial details inside that window so the LLM can produce precise, well-informed responses.
  2. To find relevant information, retrieval-augmented generation uses semantic search. The user's query or prompt is converted into embeddings and sent to a vector database.
  3. The vector database then searches for the nearest neighbor to the user's intent and returns the relevant results. These results are passed to the LLM via its context window to generate a response.
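The three steps above can be sketched end to end in a few lines. This is a minimal illustration only: the `embed` function below is a toy bag-of-words stand-in for a real embedding provider, and the "vector database" is just an in-memory list.

```python
import re

import numpy as np

# Toy bag-of-words embedding over a shared vocabulary. A real system would
# call an embedding provider (OpenAI, Cohere, Google, ...); this stand-in
# only illustrates the mechanics.
def embed(text: str, vocab: list[str]) -> np.ndarray:
    words = re.findall(r"[a-z]+", text.lower())
    vec = np.array([float(words.count(w)) for w in vocab])
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# 1. The knowledge base, stored as embeddings in a vector index.
documents = [
    "RAG combines retrieval with generation.",
    "The Eiffel Tower is in Paris.",
    "Vector databases store embeddings.",
]
query = "How does retrieval augmented generation work?"
vocab = sorted({w for t in documents + [query]
                for w in re.findall(r"[a-z]+", t.lower())})
index = [(doc, embed(doc, vocab)) for doc in documents]

# 2. Embed the user query and find the nearest neighbor by cosine similarity.
q_vec = embed(query, vocab)
best_doc = max(index, key=lambda pair: float(np.dot(q_vec, pair[1])))[0]

# 3. Inject the retrieved document into the LLM's context window.
prompt = f"Context: {best_doc}\n\nQuestion: {query}\nAnswer:"
```

Because the query shares the words "retrieval" and "generation" with the first document, that document is retrieved and placed in the prompt.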

The Many Benefits of Using RAG

Source Verification and Trust Building

One of the biggest advantages of using retrieval-augmented generation (RAG) is that it lets LLMs provide sources to users, allowing them to verify answers much as one would check the footnotes in a research paper. This helps build trust in the model's responses.

Hallucination Resolution and Error Reduction

Using RAG can also help clear up ambiguity in a user's query and reduce the chances of the model making wrong guesses or hallucinating, which in turn lowers the risk of incorrect or fabricated information.

Enhanced Scalability

What's more, the implementation of RAG can also enhance the scalability of a system, making it more adept at handling large datasets and intricate inquiries.

Using RAG as a workflow

The RAG method can be summarized into a straightforward workflow comprising the following steps:

1. Select Text Embedding Provider:

Choose an appropriate text embedding provider, such as Google, Cohere, or OpenAI Ada, for converting text into vector representations.

2. Connect to Vectorial Database:

Establish a connection to a vector database, such as Elasticsearch, Faiss, Qdrant, or Supabase, where the embedded text data will be stored and searched.

3. Embed Existing Data:

Convert all existing data in your knowledge base, including PDFs, HTML documents, and audio files, into text representations using the chosen embedding provider. Apply preprocessing steps if necessary to handle different data formats and ensure consistency.
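Before embedding, long documents are typically split into overlapping chunks so each piece fits the embedding model's input limit. Here is one common way to do that; the window size and overlap values are illustrative defaults, not prescribed by any particular provider.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows so each chunk fits the
    embedding model's input limit while preserving context at boundaries."""
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

Each resulting chunk is then passed to the embedding provider and stored in the vector database alongside a reference to its source document.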

4. Apply Semantic Search and Retrieval on the user query:

  • Receive User Query: Receive the user's query in natural language.
  • Embed User Query: Transform the user query into a vector representation using the selected embedding provider.
  • Perform Semantic Search: Conduct a semantic search within the vector database, comparing the embedded user query to the stored text representations.
  • Identify K Nearest Neighbors: Identify the K nearest neighbors (chunks) of the embedded query within the vector database. These chunks represent the most relevant text segments in the knowledge base that semantically align with the user's query.
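The K-nearest-neighbor step above can be sketched with plain NumPy. In practice the vector database performs this search (often with approximate-nearest-neighbor indexes for speed); this sketch assumes all vectors are already L2-normalized, so the dot product equals the cosine similarity.

```python
import numpy as np

def top_k(query_vec: np.ndarray, index_vecs: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k stored vectors closest to the query,
    ranked by cosine similarity (vectors assumed L2-normalized,
    so a dot product gives the cosine)."""
    sims = index_vecs @ query_vec          # one similarity score per stored chunk
    return np.argsort(sims)[::-1][:k].tolist()
```

For example, with three stored unit vectors, a query pointing along the first axis ranks the first and third vectors ahead of the orthogonal second one.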

5. Answer Generation with LLM

  • Select Language Model: Choose an appropriate large language model (LLM), such as OpenAI GPT, Google PaLM 2, Anthropic Claude, or Cohere, to generate comprehensive and informative responses.
  • Prepare Context: Create a context document that combines the user query and the K nearest neighbors identified in the semantic search. This context provides the LLM with the relevant information to formulate a comprehensive answer.
  • Feed Context to LLM: Pass the context document to the selected LLM, allowing it to access the embedded representations of the user's query and the relevant text chunks from the knowledge base.
  • Generate Answer: The LLM analyzes the context and generates a natural language response that addresses the user's query, incorporating information from the K nearest neighbors.
  • Refine and Return Response: The generated response may undergo refinement steps, such as text simplification, paraphrasing, and ensuring consistency with the knowledge base. Finally, the refined response is returned to the user.
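The "Prepare Context" and "Feed Context to LLM" steps boil down to assembling a single prompt from the retrieved chunks and the user's query. A minimal sketch of that assembly is shown below; the actual call to the LLM is omitted because it depends on the provider's SDK, and the instruction wording is only one possible choice.

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user query into a single prompt
    that instructs the LLM to answer only from the supplied context."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "Cite the numbered sources you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

Numbering the chunks in the prompt is what enables the source-citation benefit discussed earlier: the model can refer back to `[1]`, `[2]`, and so on in its answer.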

This workflow effectively utilizes text embeddings, semantic search, and a powerful LLM to provide comprehensive and relevant answers to user queries, leveraging the knowledge base and the user's query in a semantically meaningful way.

Use RAG workflow in Eden AI (AskYoda)

This workflow can be easily implemented using Eden AI's AskYoda –a user-friendly platform that streamlines the entire process.

Eden AI's AskYoda RAG Workflow

1. Multiple data integrations:

AskYoda simplifies the first steps of the RAG workflow by offering an intuitive interface to upload and manage your data. Whether it's PDFs, HTML documents, or audio files, AskYoda handles the data preprocessing for you, ensuring a smooth transition from raw information to text representations.

2. Effortless Semantic Search:

Connecting to a vector database becomes a breeze with AskYoda. The platform seamlessly integrates with popular databases such as Qdrant and Supabase, allowing you to establish a robust connection for storing and retrieving the embedded text data.

3. Intelligent Retrieval of Information:

With AskYoda's user-friendly interface, performing semantic search and retrieving information is just a few clicks away. The platform takes care of transforming user queries into vector representations and efficiently identifies the K nearest neighbors within the vector database, presenting you with the most relevant text segments aligned with the user's query.

4. Diverse LLM Selection:

AskYoda empowers users by providing a range of powerful LLMs to choose from, including OpenAI GPT, Google PaLM 2, Anthropic Claude, and Cohere. This flexibility ensures that you can tailor your responses based on the specific requirements of your application.

5. Streamlined Answer Generation:

Creating a context document and feeding it to the selected LLM is made simple with AskYoda's intuitive workflow. The platform enables you to effortlessly generate natural language responses by analyzing the context, incorporating information from the semantic search, and delivering comprehensive answers to user queries.

6. User-Centric Experience:

Eden AI's AskYoda prioritizes a user-centric experience, making the implementation of the RAG workflow accessible to both beginners and seasoned professionals. The platform's user-friendly design and powerful features combine to create a seamless and efficient process from data upload to response generation. It is also available as an API.

Here's a detailed guide on creating your personalized AI assistant using AskYoda. Alternatively, you can watch the instructional video below:

Add your chatbot to your website

Your chatbot can be integrated into a website or application to allow users to ask questions and receive responses based on the data the chatbot has been trained on. The repository on GitHub contains the source code for using and displaying the Yoda Chatbot in a website, with branches for the unframed source code and the embed code.


RAG, an innovative technique for boosting LLM accuracy and consistency, is becoming an indispensable tool in the field of natural language processing. Its integration into Eden AI's AskYoda workflow simplifies the process, allowing users to tap into the power of text embeddings, semantic search, and LLMs without the intricacies of manual implementation.
