Gaia-X Guide
Welcome to the Gaia-X Guide! This guide will help you understand various aspects of the Gaia-X project, including installation, configuration, and usage.
Introduction
This introduction covers the basic concepts and usage of Gaia-X.
What is Gaia-X?
Gaia-X is an open-source, enterprise-grade AI application platform that includes a cross-platform client and an enterprise management center. It provides a one-stop solution for building enterprise AI applications more efficiently. Key features include:
- Enterprise Management: Provides an enterprise management center supporting multi-user, multi-permission management, centralized model management, and billing
- Highly Extensible: A plugin system based on the Model Context Protocol (MCP) supports integration with any MCP Server
- Multi-Agent Intelligent Collaboration: Supports multiple agents with intelligent agent selection for optimal task execution
Why Choose Gaia-X?
Most current open-source chatbot products target individual users and lack enterprise-grade capabilities. Gaia-X is designed specifically for enterprise users and has the following distinctive features:
1. Enterprise-Grade Features
The enterprise management center, developed in Golang, provides core enterprise management capabilities:
- Enterprise user management, including user authentication and information management through OAuth 2.0, LDAP, DingTalk, and other providers
- Large language model API authorization management, eliminating the need to share API keys with users
- Billing and quota management: Supports monthly user quotas and dual daily/monthly quota control for Agent APIs
- Various enterprise-grade reports
- Enterprise plugin marketplace and agent marketplace
- Integration with other Agent platforms, including Dify and Coze
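The dual daily/monthly quota control listed above can be pictured with a minimal sketch. The `DualQuota` class, its field names, and the use of a generic usage amount are assumptions made for illustration; they are not taken from the Gaia-X management center, which is written in Golang.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DualQuota:
    """Hypothetical quota for one Agent API with a daily and a monthly ceiling."""
    daily_limit: int
    monthly_limit: int
    daily_used: dict = field(default_factory=dict)    # keyed by (year, month, day)
    monthly_used: dict = field(default_factory=dict)  # keyed by (year, month)

    def try_consume(self, amount: int, today: date) -> bool:
        """Record usage only if both the daily and the monthly limit allow it."""
        day_key = (today.year, today.month, today.day)
        month_key = (today.year, today.month)
        day_total = self.daily_used.get(day_key, 0) + amount
        month_total = self.monthly_used.get(month_key, 0) + amount
        if day_total > self.daily_limit or month_total > self.monthly_limit:
            return False  # reject the call without recording any usage
        self.daily_used[day_key] = day_total
        self.monthly_used[month_key] = month_total
        return True


quota = DualQuota(daily_limit=100, monthly_limit=250)
first = quota.try_consume(80, date(2025, 1, 1))    # fits both limits
second = quota.try_consume(30, date(2025, 1, 1))   # would exceed the daily limit
third = quota.try_consume(100, date(2025, 1, 2))   # new day, monthly total is 180
fourth = quota.try_consume(100, date(2025, 1, 3))  # monthly total would reach 280
```

Checking both limits before recording anything means a rejected call never counts against either quota.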
2. MCP Support
The first enterprise-grade AI product in China to support the Model Context Protocol (MCP), including:
- Support for various standard MCP Servers from the MCP community
- Internal system APIs can be configured through the management center to provide services as standard MCP Servers with controlled permissions
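As a rough illustration of exposing internal APIs as MCP tools with controlled permissions, the sketch below turns a configuration record into a tool descriptor with the `name`, `description`, and `inputSchema` fields that MCP uses for tool listings. The `InternalApi` record, the role names, and the example URL are hypothetical; the actual management center configuration format is not described in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InternalApi:
    """Hypothetical configuration record for one internal HTTP API."""
    name: str
    description: str
    method: str
    url: str
    parameters: dict          # parameter name -> JSON Schema type
    allowed_roles: frozenset  # roles permitted to see and call the tool


def to_mcp_tool(api: InternalApi) -> dict:
    """Build a tool descriptor in the shape MCP uses for tool listings."""
    return {
        "name": api.name,
        "description": api.description,
        "inputSchema": {
            "type": "object",
            "properties": {k: {"type": t} for k, t in api.parameters.items()},
            "required": sorted(api.parameters),
        },
    }


def visible_tools(apis: list, role: str) -> list:
    """Return descriptors only for the tools the given role may use."""
    return [to_mcp_tool(api) for api in apis if role in api.allowed_roles]


apis = [
    InternalApi(
        name="create_leave_request",
        description="Submit a leave request to the HR system",
        method="POST",
        url="https://hr.example.internal/leave",  # placeholder address
        parameters={"start_date": "string", "days": "integer"},
        allowed_roles=frozenset({"employee", "manager"}),
    ),
    InternalApi(
        name="approve_leave_request",
        description="Approve a pending leave request",
        method="POST",
        url="https://hr.example.internal/leave/approve",  # placeholder address
        parameters={"request_id": "string"},
        allowed_roles=frozenset({"manager"}),
    ),
]

employee_tools = visible_tools(apis, "employee")
manager_tools = visible_tools(apis, "manager")
```

Filtering at listing time keeps tools a user is not entitled to out of the model's context entirely. A real deployment would also need to enforce the same check when a tool is called.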
3. Multi-Agent Intelligent Collaboration
Built-in multi-agent management with the following features:
- All agents are automatically stored in a vector store
- When a user submits a question, a large language model rewrites it and RAG-based retrieval finds the most similar agents
- A classifier automatically selects the best single agent or group of agents to process and respond to the query
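The rewrite, retrieve, and select flow above can be sketched in a few lines. This is a toy under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, `rewrite` stands in for the LLM rewriting step, and a similarity threshold replaces the classifier. The agent names and descriptions are invented for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical agents; their descriptions are vectorized once, up front.
AGENTS = {
    "hr-assistant": "answer questions about leave vacation payroll and benefits",
    "it-helpdesk": "reset password fix vpn laptop and network problems",
    "sales-analyst": "report sales revenue and customer pipeline figures",
}
AGENT_VECTORS = {name: embed(desc) for name, desc in AGENTS.items()}


def rewrite(question: str) -> str:
    """Placeholder for the LLM rewriting step; here it only normalizes whitespace."""
    return " ".join(question.split())


def select_agents(question: str, top_k: int = 2, threshold: float = 0.2) -> list:
    """Retrieve the most similar agents and keep those above the threshold."""
    query = embed(rewrite(question))
    scored = sorted(
        ((cosine(query, vec), name) for name, vec in AGENT_VECTORS.items()),
        reverse=True,
    )
    return [name for score, name in scored[:top_k] if score >= threshold]


single = select_agents("How do I reset my VPN password?")
multiple = select_agents("payroll and vpn problems")
```

With these descriptions, the first question matches only the IT agent, while the second is similar enough to two agents that both are selected.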
4. Natural Language-Driven RPA
Traditional RPA requires coding or script recording. Gaia-X instead uses the planning and computer-operation capabilities of large language models to convert user requirements directly into a sequence of computer operation steps, much as a human would operate the computer. Users do not need to understand technical details such as webpage source code.
5. Human-Confirmable ReAct Operations
To address risks in enterprise scenarios when large language models call tools, especially when directly submitting data to enterprise system APIs, we've introduced human-confirmable tool invocation:
- Support for configuring tools in MCP Servers to require human confirmation
- Tools requiring human confirmation automatically render a display mode suited to their parameter types, such as a form. Users can modify and confirm the parameters before the subsequent steps execute, similar to the confirmation step Cursor uses for code edits
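The confirmation gate described above might look like the following sketch. The `requires_confirmation` flag, the widget names, and the `confirm` callback are hypothetical. In the real client the form would be rendered in the UI rather than passed to a Python function.

```python
from typing import Callable, Optional

# Hypothetical mapping from JSON Schema types to form widgets.
WIDGETS = {
    "string": "text-input",
    "number": "number-input",
    "integer": "number-input",
    "boolean": "checkbox",
}


def build_form(input_schema: dict, arguments: dict) -> list:
    """Describe one form field per parameter, pre-filled with the model's arguments."""
    return [
        {
            "name": name,
            "widget": WIDGETS.get(spec.get("type"), "json-editor"),
            "value": arguments.get(name),
        }
        for name, spec in input_schema.get("properties", {}).items()
    ]


def invoke_tool(tool: dict, arguments: dict,
                confirm: Callable[[list], Optional[dict]]) -> dict:
    """Run the tool, pausing for human review when the tool is configured to need it."""
    if tool["requires_confirmation"]:
        reviewed = confirm(build_form(tool["input_schema"], arguments))
        if reviewed is None:
            return {"status": "rejected"}  # the human declined, so nothing is executed
        arguments = reviewed               # the human may have edited the values
    return {"status": "executed", "result": tool["handler"](**arguments)}


def submit_expense(amount: float, reason: str) -> str:
    return f"submitted {amount:.2f} for {reason}"


TOOL = {
    "name": "submit_expense",
    "requires_confirmation": True,
    "input_schema": {
        "type": "object",
        "properties": {"amount": {"type": "number"}, "reason": {"type": "string"}},
    },
    "handler": submit_expense,
}


def reviewer_lowers_amount(form: list) -> dict:
    """Simulated reviewer who corrects the amount before confirming."""
    values = {f["name"]: f["value"] for f in form}
    values["amount"] = 120.0
    return values


proposed = {"amount": 1200.0, "reason": "team dinner"}
approved = invoke_tool(TOOL, proposed, reviewer_lowers_amount)
rejected = invoke_tool(TOOL, proposed, lambda form: None)
```

The handler only ever receives the reviewed arguments, so a value corrected by the human replaces what the model proposed.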
6. Canvas Support
In addition to dynamic form rendering for tool parameters, the canvas automatically adapts to other artifact types:
- Code highlighting and execution (requires corresponding language MCP Server support)
- SVG and HTML display
- Common chart rendering, such as ECharts and Mermaid
7. Text Selection and Floating Ball
- Text selection triggers a dedicated toolbar that can send selected content to corresponding agents for generation
- Text selection toolbar supports custom configuration or selection from existing agent lists
- Global floating ball enables Gaia-X invocation in any context
- Both text selection and floating ball can be easily disabled to avoid interfering with user operations
Technical Architecture
The client is implemented with Electron and Ant Design X and supports both macOS and Windows. The text selection component is developed in C++ for Windows and Objective-C for macOS. MCP Servers written in TypeScript and Python are supported, and each MCP Server runs in an independent, isolated space for security.
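This guide does not describe how Gaia-X isolates MCP Servers. As one hedged illustration of the general idea, the sketch below starts a command in its own process, with a throwaway working directory and an environment reduced to an allowlist, so that secrets held by the parent process are not inherited. The function names and the allowlist are assumptions. This is process-level separation only and not a full sandbox, and a real MCP Server would be a long-running process rather than a one-shot command.

```python
import os
import subprocess
import sys
import tempfile

# Variables the child typically needs to start at all; everything else is dropped.
ENV_ALLOWLIST = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


def minimal_env(extra: dict = None) -> dict:
    """Build a child environment from the allowlist plus explicitly granted values."""
    env = {key: os.environ[key] for key in ENV_ALLOWLIST if key in os.environ}
    env.update(extra or {})
    return env


def run_in_separate_process(command: list, extra_env: dict = None,
                            timeout: float = 30.0) -> str:
    """Run a command in its own process, scratch directory, and reduced environment."""
    with tempfile.TemporaryDirectory() as workdir:
        completed = subprocess.run(
            command,
            cwd=workdir,                 # the child cannot rely on the parent's cwd
            env=minimal_env(extra_env),  # parent secrets are not inherited
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    return completed.stdout.strip()


# Demonstration: a secret set in the parent is invisible to the child process.
os.environ["DEMO_SECRET"] = "parent-only"
probe = "import os; print('DEMO_SECRET' in os.environ, os.environ.get('SERVER_NAME'))"
output = run_in_separate_process([sys.executable, "-c", probe], {"SERVER_NAME": "demo"})
```

Granting variables explicitly through `extra_env` makes it visible which configuration each server receives.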
Computer operation supports three models: Claude 3.7 Sonnet, Zhipu CogAgent, and ByteDance UI-TARS. Users can configure their preferred model. Zhipu CogAgent and ByteDance UI-TARS are open-source models that require self-deployment. For deployment details, please refer to documentation xxx.
Core Concepts
Before starting with Gaia-X, understanding these core concepts will help you better comprehend and use it:
- Model Context Protocol (MCP): An open protocol introduced by Anthropic for connecting models to external tools and context, with an active community and numerous MCP Server options
- Agent: Gaia-X agents that can be intelligently coordinated in conversations and utilize tools from multiple MCP Servers
- Plugin: Components that extend Gaia-X functionality
Next Steps
- Getting Started - Learn how to install and use Gaia-X
- Advanced Features - Explore advanced features and usage of Gaia-X