RAG knowledge retrieval system

RAG (Retrieval-Augmented Generation) is a generative AI architecture that combines retrieval and generation techniques.

It retrieves information from external databases or knowledge bases and combines the retrieved results with a large language model to generate responses that are more accurate and contextually relevant.

At its core, RAG combines the language capabilities of generative AI with knowledge retrieval. When answering a question, the model does not rely only on its internal training data: it can also dynamically retrieve up-to-date, more specialized information from external databases and incorporate that information into its response.
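The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MaiAgent's implementation: the `DOCUMENTS` list is made-up sample data, `embed` is a toy bag-of-words stand-in for a real embedding model, and the final call to a language model is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter

# Made-up sample knowledge base; a real system would index chunks of your documents.
DOCUMENTS = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available by email on weekdays from 9am to 6pm.",
    "Enterprise plans include on-premises deployment.",
]

def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words count vector, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, contexts: list[str]) -> str:
    """Combine the retrieved text with the question; this is what the LLM would receive."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\nAnswer:"
    )

question = "How many days is the refund window?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS, top_k=1))
print(prompt)
```

In a production system the toy `embed` function would be replaced by an embedding model and a vector index, and the prompt would be sent to the chosen large language model.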

Figure: RAG process

High-accuracy RAG system

Although RAG knowledge-base retrieval systems can be quickly implemented using vector search and deployed as a basic version, further improving their reply accuracy is challenging. Reply accuracy is crucial to the user experience because it directly affects users' trust in and satisfaction with the system's responses. If reply accuracy is insufficient, users may doubt the system's answers and thus be less willing to use it.

According to material from the 2023 OpenAI Developers Conference, a RAG system that performs only simple vector similarity search with a well-chosen embedding model reaches about 45% reply accuracy. Adding HyDE retrieval, fine-tuned (FT) embeddings, and chunking/embedding experiments raises reply accuracy to 65%.
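To make the HyDE (Hypothetical Document Embeddings) idea concrete: instead of embedding the user's question directly, the system first drafts a hypothetical answer and searches with the embedding of that draft, which tends to share more vocabulary with the stored documents. The sketch below is an illustration under assumptions only. `generate_hypothetical_answer` is a hypothetical stub that returns a canned string where a real system would call an LLM, `embed` is a toy bag-of-words stand-in for an embedding model, and the documents are made-up sample data.

```python
import math
import re
from collections import Counter

# Made-up sample documents for illustration.
DOCUMENTS = [
    "Refunds are issued within 30 days of purchase.",
    "You can get help from the support team by email.",
]

def embed(text: str) -> Counter:
    """Toy embedding (assumption): bag-of-words counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query_text: str, documents: list[str]) -> str:
    """Return the single document most similar to query_text."""
    q = embed(query_text)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def generate_hypothetical_answer(question: str) -> str:
    """Hypothetical stub for an LLM call; a real system would generate this draft."""
    return "Yes, refunds are issued if you request one within 30 days of purchase."

def hyde_search(question: str, documents: list[str]) -> str:
    """HyDE: search with the embedding of a drafted answer, not of the question."""
    draft = generate_hypothetical_answer(question)
    return search(draft, documents)

question = "Can I get my money back?"
plain_hit = search(question, DOCUMENTS)      # matches on surface words like "can" and "get"
hyde_hit = hyde_search(question, DOCUMENTS)  # matches on the vocabulary of the drafted answer
print("plain:", plain_hit)
print("HyDE: ", hyde_hit)
```

With this toy data, the plain search is misled by shared filler words, while the drafted answer lands on the refund document. The drafted answer does not need to be factually correct; it only has to resemble the kind of text that holds the real answer.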

In addition to the RAG techniques presented at the OpenAI Developers Conference, MaiAgent RAG integrates various classic NLP algorithms and proprietary retrieval technologies. On internal datasets, both MaiAgent RAG and OpenAI RAG can reach 95% reply accuracy.

Figure: MaiAgent RAG reply accuracy

The MaiAgent platform offers two RAG options: MaiAgent RAG and OpenAI RAG. The following table compares them across different aspects:

| Aspect | MaiAgent RAG | OpenAI RAG |
| --- | --- | --- |
| Model support | Supports all models 👍 | Only supports OpenAI models |
| Environment support | Supports cloud and on-premises 👍 | Only supports cloud; data needs to be sent to OpenAI |
| Reply accuracy | Extremely high 👍 | Extremely high 👍 |
| Supported file formats | Supports all common formats 👍: doc, docx, xls, xlsx, csv, ppt, pptx, pdf, txt, json, jsonl, md | Does not support xlsx, csv, jsonl, or legacy Office files (doc, xls, ppt) |
| Supports images in documents | Yes (currently experimental) 👍 | No |
| Supports tables in documents | Yes (currently experimental) 👍 | No |
| Supports attachment uploads in conversations | Supported 👍 | Supported 👍 |
| Data slice transparency | Visualized 👍 | Black box |
| Debug difficulty | Normal 👍 | Black box, cannot debug |
| Top K adjustment | Enterprise edition customization feature 👍 | No |
| Switch embedding model | Enterprise edition customization feature 👍 | No |
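
For readers unfamiliar with the "data slice" row in the comparison above, the following sketch shows one common way documents are cut into overlapping slices (chunks) before they are embedded, and prints each slice so it can be inspected. It is an assumption-laden illustration, not a description of how either product slices data: the function name `split_into_chunks`, the character-based splitting, and the `chunk_size` and `overlap` values are all chosen for the example.

```python
def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size character slices that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last slice already reaches the end of the text
    return chunks

# Made-up sample text for illustration.
document = ("RAG systems split documents into slices before indexing them. " * 2).strip()
chunks = split_into_chunks(document, chunk_size=40, overlap=10)

# Printing the slices is a simple form of the "visualized" inspection idea.
for i, chunk in enumerate(chunks):
    print(f"slice {i}: {chunk!r}")
```

The overlap keeps a sentence that straddles a boundary partly visible in both neighboring slices. Top K then controls how many of the best-matching slices are passed to the model for each question.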
