RAG Architecture

Overview

Aurory AI leverages sophisticated Large Language Models (LLMs) to create advanced question-answering (Q&A) chatbots. These chatbots can answer questions about specific source information using a technique known as Retrieval Augmented Generation, or RAG.

What is RAG?

Retrieval Augmented Generation (RAG) is a technique used to enhance the knowledge of LLMs with additional data. While LLMs can reason about a wide range of topics, their knowledge is restricted to the public data available up to their training cutoff. To build AI applications that can reason about private data, or data introduced after that cutoff, it is essential to augment the model's knowledge with the specific information it needs. RAG does this by retrieving the appropriate information and incorporating it into the model prompt so the model can generate accurate and relevant responses.
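The core idea above, incorporating retrieved information into the model prompt, can be sketched in a few lines. This is an illustrative example only; the function and variable names are hypothetical, and a real system would pass the resulting prompt to an LLM.

```python
# Illustrative sketch of the core RAG idea: augmenting a model prompt
# with retrieved context before it is sent to an LLM.
# All names here are hypothetical, not part of any specific library.

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved source passages with the user's question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example usage with two pre-retrieved passages:
chunks = [
    "Aurory is a Web3 gaming studio.",
    "RAG augments LLM prompts with retrieved data.",
]
prompt = build_rag_prompt("What does RAG do?", chunks)
```

The prompt string produced here is what the model actually sees: its "knowledge" of the private or post-cutoff data comes entirely from the retrieved passages spliced into the context.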

Aurory AI utilizes LangChain components designed to facilitate the development of Q&A applications and RAG applications more broadly.

RAG Architecture

A typical RAG application within Aurory AI comprises two main components:

  • Indexing: This involves ingesting data from a source and indexing it, usually performed offline.

  • Retrieval and Generation: This is the RAG chain that takes the user query at runtime, retrieves the relevant data from the index, and passes it to the model to generate responses.

Full Sequence from Raw Data to Answer

Indexing

  • Load: The data is loaded from multiple types of documents using document loaders.

  • Split: Large documents are broken into smaller chunks using text splitters. This step is crucial for both indexing data and passing it to a model, as large chunks are harder to search over and do not fit well within a model's finite context window.

  • Store: The split chunks are embedded with an embeddings model and indexed in a vector store, allowing efficient search and retrieval.
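The load → split → store pipeline above can be sketched as follows. This is a deliberately simplified, self-contained stand-in: the character-based splitter and the bag-of-words "embedding" substitute for a real text splitter and embedding model, and the in-memory list substitutes for a vector store. All names are hypothetical.

```python
# Minimal indexing sketch (load -> split -> embed -> store).
# The chunking and "embedding" are toy stand-ins for a real text
# splitter, embedding model, and vector store.
from collections import Counter

def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Break a document into overlapping character chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system uses an embedding model."""
    return Counter(text.lower().split())

def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """The 'store': a list of (chunk, vector) pairs standing in for a vector store."""
    index = []
    for doc in documents:
        for chunk in split_text(doc):
            index.append((chunk, embed(chunk)))
    return index

# Example: one "loaded" document is split into several indexed chunks.
docs = ["Aurory AI uses Retrieval Augmented Generation. " * 5]
index = build_index(docs)
```

The overlap between chunks is a common design choice: it reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.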

Retrieval and Generation

  • Retrieve: Upon receiving a user input, relevant data splits are retrieved from storage using a Retriever.

  • Generate: A ChatModel/LLM generates an answer by using a prompt that includes the user's question and the retrieved data.
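The two runtime steps above, retrieve then generate, can be sketched end to end. Retrieval here uses a toy bag-of-words cosine similarity in place of a real embedding model and vector store, and the final model call is left as a comment; every name is hypothetical.

```python
# Minimal retrieval-and-generation sketch over a small in-memory index.
# Bag-of-words cosine similarity stands in for a real Retriever backed
# by an embedding model; the LLM call itself is stubbed out.
from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, index: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

index = [
    "Aurory AI builds Q&A chatbots with RAG.",
    "Large documents are split into smaller chunks.",
    "The sky appears blue because of Rayleigh scattering.",
]
question = "How does Aurory AI use RAG chatbots?"
top = retrieve(question, index)
prompt = "Context:\n" + "\n".join(top) + f"\n\nQuestion: {question}\nAnswer:"
# `prompt` would then be passed to a ChatModel/LLM to generate the answer.
```

Note that only the retrieved splits enter the prompt, not the whole corpus; this is what keeps the input within the model's finite context window.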

Application to Aurory AI

Aurory AI utilizes RAG to provide accurate and contextually relevant responses in its Q&A chatbots. By employing this technique, Aurory AI ensures that its applications can reason about both historical public data and newly introduced or private data, thereby delivering precise and comprehensive answers.

Through a structured approach involving indexing and retrieval, Aurory AI's RAG-based chatbots are equipped to handle a wide array of queries, making them powerful tools for various applications within the Web3 ecosystem and beyond. Whether it's answering questions based on unstructured data, SQL databases, or code snippets, Aurory AI's implementation of RAG sets a new standard for intelligent, data-driven interactions.