# Embedding Model

## What is Embedding?

In the RAG (Retrieval-Augmented Generation) process, the core task of **Embedding** is to convert the large volumes of text processed by the Parser, such as documents and knowledge base content, into a numerical form that computers can understand and compare: vectors.

This conversion process enables the RAG system to:

### 1. **Understand Semantics**

Embedding models capture the deep semantic meaning of text, not just surface vocabulary. This means that even if the wording in queries and documents differs, the system can identify their relevance as long as the semantics are similar.

### 2. **Efficient Retrieval**

After converting text to vectors, the system can use efficient vector similarity search algorithms to quickly find the most relevant text segments from massive databases.
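As a minimal illustration of similarity-based retrieval, the sketch below ranks a few hand-made toy vectors by cosine similarity. The document labels and 3-dimensional values are invented for demonstration; real embedding models produce dense vectors with hundreds to thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings.
doc_vectors = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "account settings": [0.0, 0.2, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # e.g. "how do I get my money back"

# Rank documents by similarity to the query.
ranked = sorted(
    doc_vectors,
    key=lambda d: cosine_similarity(query_vector, doc_vectors[d]),
    reverse=True,
)
# The top-ranked document is the semantically closest one: "refund policy".
```

Production systems apply the same idea at scale with approximate nearest-neighbor indexes rather than an exhaustive sort.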

### 3. **Improve Generation Quality**

Retrieved relevant text segments are provided to Large Language Models (LLMs) as contextual references, helping LLMs generate more accurate responses.
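A common way to hand retrieved segments to the LLM is to assemble them into a prompt. The function below is a generic, hypothetical sketch of that pattern, not MaiAgent's actual prompt format.

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble retrieved text segments into a context block for the LLM."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is the refund window?",
    ["Refunds are issued within 14 days of purchase."],
)
```

Grounding the model in retrieved context this way is what lets the LLM answer from the knowledge base rather than from its training data alone.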

## Embedding Plays Two Key Roles in MaiAgent's RAG Technology

1. **Converting knowledge base content into vectors** and storing them in a vector database
2. **Vectorizing user questions** and comparing them against the stored vectors to find the most relevant content

<figure><img src="https://3415477754-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FNBTi475lqozGpB7xObpE%2Fuploads%2Fgit-blob-0ce05108866ecded0daa5a9fb3cd8b9137ffef26%2F2%20(1).png?alt=media" alt=""><figcaption><p>RAG Process</p></figcaption></figure>

Simply put, Embedding is the cornerstone of RAG systems, transforming unstructured text data into computable and comparable vectors, which is a key prerequisite for achieving precise information retrieval and high-quality content generation.
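These two roles can be sketched end to end. The `embed` function below is a deterministic bag-of-words stand-in for a real embedding model (real models produce dense semantic vectors, as discussed above); the vocabulary and knowledge-base sentences are invented for illustration.

```python
import math

# Tiny fixed vocabulary; a stand-in for a real embedding model's learned
# representation. A real model maps any text to a dense semantic vector.
VOCAB = ["refund", "refunds", "shipping", "days", "purchase", "business"]

def embed(text: str) -> list[float]:
    """Map text to an L2-normalized bag-of-words vector over VOCAB."""
    tokens = text.lower().replace(".", "").split()
    vec = [float(tokens.count(word)) for word in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# Role 1: convert knowledge base content into vectors and store them.
knowledge_base = [
    "Refunds are issued within 14 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
]
vector_store = [(doc, embed(doc)) for doc in knowledge_base]

# Role 2: vectorize the user question and compare by similarity
# (dot product of normalized vectors = cosine similarity).
query_vec = embed("How long does shipping take?")
best_doc, _ = max(
    vector_store,
    key=lambda pair: sum(q * d for q, d in zip(query_vec, pair[1])),
)
# best_doc is the shipping sentence, the most relevant stored chunk.
```

Swapping `embed` for a real model from the table below turns this sketch into the actual RAG retrieval step.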

## Impact of Embedding on RAG Systems

### 1. Retrieval Quality

* **Semantic Understanding Depth**: High-quality embedding models can more accurately capture the semantic content of text, improving retrieval relevance
* **Context Awareness**: Excellent embeddings can understand textual context relationships, ensuring retrieval result coherence
* **Multilingual Support**: Powerful multilingual embedding models can handle cross-language knowledge retrieval needs

### 2. System Performance

* **Retrieval Speed**: The vector dimensions and computational efficiency of embedding models directly affect retrieval response time
* **Resource Consumption**: Different embedding models have varying computational resource requirements, affecting system scalability
* **Parallel Processing**: Efficient embedding models can support large-scale parallel retrieval
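To make the resource point concrete: a flat float32 vector index stores `num_vectors × dim × 4` bytes before any ANN index overhead, so dimensionality scales both memory and per-query compute linearly. The corpus size and dimensionalities below are illustrative, not tied to any specific model in the table.

```python
def index_memory_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for a flat float32 vector index (ignores ANN overhead)."""
    return num_vectors * dim * bytes_per_value

# One million text chunks at two illustrative embedding dimensionalities:
small = index_memory_bytes(1_000_000, 384)   # a lightweight model
large = index_memory_bytes(1_000_000, 3072)  # a large model
# 384 dims needs ~1.5 GB of raw vectors; 3072 dims needs ~12.3 GB.
```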

## Embedding Models Supported by MaiAgent

| Model Name                               | Model Developer | Origin | Features                                                                                                       | Open Source | Deployment Method                     | MTEB Average Score                             |
| ---------------------------------------- | --------------- | ------ | -------------------------------------------------------------------------------------------------------------- | ----------- | ------------------------------------- | ---------------------------------------------- |
| Cohere Embed v4.0                        | Cohere          | Canada | Multilingual support, highest performance                                                                      | No          | Requires cloud API inference service  | <p>Not yet public<br>Reference v3.0: 64.47</p> |
| Cohere Embed Multilingual v3.0 (Bedrock) | Cohere          | Canada | Multilingual support, high performance                                                                         | No          | Requires cloud API inference service  | 64.47                                          |
| OpenAI text-embedding-3-large            | OpenAI          | USA    | Multilingual (especially strong in English context), medium performance                                        | No          | Requires cloud API inference service  | 64.68                                          |
| EmbeddingGemma                           | Google          | USA    | <mark style="color:red;">Open source</mark>, multilingual support, lightweight                                 | Yes         | Can be deployed on cloud or local GPU | 61.15                                          |
| Mxbai-embed-large                        | Mixedbread AI   | USA    | <mark style="color:red;">Open source</mark>, good balance of performance and resources; strong in long context | Yes         | Can be deployed on cloud or local GPU | 64.68                                          |
| BGE-Large                                | BAAI            | China  | <mark style="color:red;">Open source</mark>, multilingual support, lightweight                                 | Yes         | Can be deployed on cloud or local GPU | 64.23                                          |
| Nomic-embed-text                         | Nomic AI        | USA    | <mark style="color:red;">Open source</mark>, lightweight                                                       | Yes         | Can be deployed on cloud or local GPU | 62.39                                          |
| Qwen3-Embedding 0.6B                     | Alibaba         | China  | <mark style="color:red;">Open source</mark>, lightweight                                                       | Yes         | Can be deployed on cloud or local GPU | 61.82                                          |
| Granite-embedding-278m-multilingual      | IBM             | USA    | <mark style="color:red;">Open source</mark>, multilingual, lightweight                                         | Yes         | Can be deployed on cloud or local GPU | 56.1                                           |

{% hint style="info" %}
To ensure semantic representation accuracy and retrieval precision, MaiAgent's embedding models are selected and evaluated against MTEB (Massive Text Embedding Benchmark). MTEB is currently the mainstream benchmark for text embedding models, covering task types including:

* Retrieval
* Classification
* Clustering
* Reranking
* STS (Semantic Textual Similarity)
* Summarization / QA / Pair Classification, etc.
{% endhint %}

## MaiAgent's Embedding Technology Advantages

### 1. Flexible Model Selection

Offers multiple embedding models to meet different needs.

### 2. Diverse Deployment Options

Supports both cloud and local deployment to help ensure data security.

### 3. Performance Optimization

Models are specially optimized for RAG scenarios to provide the best retrieval results.

### 4. Cost Effectiveness

Choose the appropriate model based on actual needs, balancing performance and cost.
