Deployment Architecture

MaiAgent is a scalable generative AI platform that supports diverse application scenarios. To accommodate different business requirements and resource allocation methods, the platform offers two main deployment modes: Monolithic Deployment and Distributed Deployment.

This section explains how the two architectures differ, where each one applies, and what their respective advantages and disadvantages are. It also provides practical deployment references.

Monolithic Deployment

Architecture Overview

In the monolithic deployment mode, all core components of MaiAgent (the main service, task scheduling service, data storage, database, frontend service, and so on) are installed and run on the same server. This mode offers centralized management and simple deployment, which makes it suitable for rapid launches and testing environments.

The MaiAgent platform itself does not require a GPU and can be deployed and run in standard CPU environments. On machines with GPU resources, the platform can also be deployed alongside the models in the same environment to take full advantage of hardware acceleration. The two common architectures are described below for reference.

Architecture Diagrams

  1. Deployment on a GPU-enabled server:

The MaiAgent platform and the model services are installed on the same machine. The platform handles request coordination and traffic control through internal APIs, while the models use the GPU for efficient inference. Because the platform and the model services share one machine, there is no need to purchase additional servers solely to run the platform, which reduces overall hardware costs.

  2. Deployment on a non-GPU server:

When MaiAgent is deployed on a server without a GPU, model services are still required, so the platform must connect through APIs to GPU servers or to cloud inference services. Because the platform and the model services are separated, each can be scaled independently: computing power can be increased or decreased as needed, which makes the architecture more flexible and easier to maintain.
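The choice between the two layouts above comes down to where the platform sends inference requests. The following is a minimal sketch of that decision, not MaiAgent's actual configuration logic: the `ModelEndpoint` type, the `resolve_model_endpoint` function, the default port, and the example URL are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEndpoint:
    """Where the platform sends inference requests (hypothetical type)."""
    base_url: str
    colocated: bool  # True when the model runs on the same machine


def resolve_model_endpoint(
    has_local_gpu: bool,
    remote_url: Optional[str] = None,
    local_port: int = 8000,  # assumed port, not a documented default
) -> ModelEndpoint:
    """Pick a model endpoint for a monolithic deployment."""
    if has_local_gpu:
        # GPU-enabled server: the model service runs beside the platform
        # and is reached over the loopback interface.
        return ModelEndpoint(f"http://127.0.0.1:{local_port}", colocated=True)
    if not remote_url:
        # Non-GPU server: a model service is still required, so an
        # external GPU server or cloud inference API must be configured.
        raise ValueError("a remote inference URL is required without a local GPU")
    return ModelEndpoint(remote_url.rstrip("/"), colocated=False)
```

For example, `resolve_model_endpoint(False, "https://inference.example.com/")` yields a non-colocated endpoint, and the remote service behind that URL can then be scaled without touching the platform host.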

Distributed Deployment

Architecture Overview

In the distributed deployment mode, MaiAgent's core modules are split into independent services and distributed across multiple servers. Different modules can be horizontally scaled according to needs, achieving high availability and large-scale processing capabilities.

  • Cloud Platform (Cloud PaaS) Environment: In public or private cloud environments, you can directly use Platform as a Service (PaaS) capabilities such as Kubernetes, AWS ECS/EKS, GCP Cloud Run, and Azure App Service. These services provide container orchestration, load balancing, auto-scaling, and monitoring, which allows distributed modules to be deployed quickly and their resources adjusted dynamically while reducing the infrastructure maintenance burden.

  • On-Premise VM Environment: In on-premise scenarios, you can set up container platforms or application service frameworks on virtual machines or bare-metal servers to achieve distributed management and scalability similar to cloud environments. Cluster resources, monitoring, and redundancy must be planned independently, but high availability and elastic scaling can still be achieved.
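To make horizontal scaling concrete, the sketch below applies the replica calculation used by the Kubernetes Horizontal Pod Autoscaler, `desired = ceil(current_replicas * current_metric / target_metric)`, to a single module. It is an illustration only: the function name and the replica bounds are assumptions, and the tolerance and stabilization behavior of a real autoscaler is deliberately left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # assumed lower bound
    max_replicas: int = 10,  # assumed upper bound
) -> int:
    """Return how many replicas a module needs to reach the target metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Scale the replica count by how far the observed metric is from target.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds so scaling stays within capacity.
    return max(min_replicas, min(max_replicas, raw))
```

With 3 replicas averaging 90% CPU against a 60% target, `desired_replicas(3, 90, 60)` asks for 5 replicas. Only the bottleneck module needs to grow, and the other services keep their current size.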

Architecture Diagram

Deployment Mode Comparison Table

| Characteristics | Monolithic Deployment | Distributed Deployment |
| --- | --- | --- |
| Architecture Design | All components centralized on a single server/container | Components split into independent services, distributed across multiple nodes |
| Infrastructure Cost | Low, single server sufficient | High, requires multiple servers or cloud resources |
| Deployment Cost | Low | High, complex deployment, requires DevOps team |
| Maintenance Cost | Low, centralized management | High, requires cross-server and cross-service maintenance and monitoring |
| Scalability | None, limited by single machine resources | Yes, can independently scale bottleneck modules |
| High Availability | None, single point of failure leads to system-wide outage | Yes, single service failure doesn't affect overall system |
| Suitable Scenarios | PoC, development testing, small-scale applications | Production deployment, large-scale, multi-department |
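The comparison above can be read as a simple decision rule. The sketch below encodes one such reading and is not an official MaiAgent sizing tool: the function name, its parameters, and the caveat text are hypothetical, and real decisions will also weigh budget and workload.

```python
from typing import List, Tuple


def recommend_deployment(
    needs_high_availability: bool,
    needs_horizontal_scaling: bool,
    has_devops_team: bool,
) -> Tuple[str, List[str]]:
    """Suggest a deployment mode and list any caveats from the comparison."""
    if needs_high_availability or needs_horizontal_scaling:
        caveats = []
        if not has_devops_team:
            # The comparison notes that distributed deployment is complex
            # and requires a DevOps team.
            caveats.append("distributed deployment requires DevOps capacity")
        return "distributed", caveats
    # PoC, development testing, and small-scale applications fit on one server.
    return "monolithic", []
```

A proof of concept with no availability or scaling requirement maps to `"monolithic"`, while a production rollout that needs high availability maps to `"distributed"`, with a caveat when no DevOps team is available.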
