Gaia-X Guide

Welcome to the Gaia-X Guide! This guide will help you understand various aspects of the Gaia-X project, including installation, configuration, and usage.

Introduction

This introduction covers the basic concepts and usage of Gaia-X.

Gaia-X is an open-source, enterprise-grade AI application platform that includes a cross-platform client and an enterprise management center. It provides a one-stop solution for enterprise AI applications, helping users build them more efficiently. Key features include:

Enterprise Management: Provides an enterprise management center supporting multi-user, multi-permission management, centralized model management, and billing
Highly Extensible: A plugin system based on the Model Context Protocol (MCP) supports integration with any MCP Server
Multi-Agent Intelligent Collaboration: Supports multiple agents with intelligent agent selection for optimal task execution

Why Choose Gaia-X?

Most current open-source chatbot products target individual users, and few offer truly enterprise-grade capabilities. Gaia-X is designed specifically for enterprise users, with the following distinctive features:

1. Enterprise-Grade Features

The enterprise management center, developed in Golang, provides core enterprise management capabilities:

Enterprise user management, with user authentication and information management through OAuth 2.0, LDAP, DingTalk, and other providers
Large language model API authorization management, eliminating the need to share API keys with users
Billing and quota management: Supports monthly user quotas and dual daily/monthly quota control for Agent APIs
Various enterprise-grade reports
Enterprise plugin marketplace and agent marketplace
Integration with other Agent platforms, including Dify and Coze
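
The dual daily/monthly quota control for Agent APIs listed above could be sketched as follows. This is a minimal illustration, not the management center's actual implementation (which is written in Golang); the `AgentQuota` class, its fields, and the `try_consume` method are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AgentQuota:
    """Hypothetical dual quota for one Agent API: a daily cap and a monthly cap."""
    daily_limit: int
    monthly_limit: int
    daily_used: dict = field(default_factory=dict)    # keyed by (year, month, day)
    monthly_used: dict = field(default_factory=dict)  # keyed by (year, month)

    def try_consume(self, amount: int, today: date) -> bool:
        """Record usage only if both the daily and the monthly cap allow it."""
        day_key = (today.year, today.month, today.day)
        month_key = (today.year, today.month)
        day_total = self.daily_used.get(day_key, 0) + amount
        month_total = self.monthly_used.get(month_key, 0) + amount
        if day_total > self.daily_limit or month_total > self.monthly_limit:
            return False  # reject without recording anything
        self.daily_used[day_key] = day_total
        self.monthly_used[month_key] = month_total
        return True
```

A request is accepted only when it fits under both caps, so a user who stays within the daily limit can still be stopped once the monthly total is exhausted.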

2. MCP Support

The first enterprise-grade AI product in China to support the Model Context Protocol (MCP), including:

Support for various standard MCP Servers from the MCP community
Internal system APIs can be configured through the management center to provide services as standard MCP Servers with controlled permissions
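
As a rough sketch of the permission-controlled exposure described above: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, so a permission check can sit in front of building that request. The tool names, the `TOOLS` registry, the role names, and `build_tool_call` are all hypothetical and only illustrate the idea; they do not describe how the management center is configured.

```python
import json

# Hypothetical registry: internal APIs published as MCP tools, each with the
# roles that are allowed to call it. Names and fields are illustrative only.
TOOLS = {
    "create_leave_request": {"allowed_roles": {"employee", "manager"}},
    "approve_leave_request": {"allowed_roles": {"manager"}},
}


def build_tool_call(request_id: int, tool: str, arguments: dict, role: str) -> str:
    """Return a JSON-RPC 2.0 `tools/call` request if the role may use the tool."""
    entry = TOOLS.get(tool)
    if entry is None:
        raise KeyError(f"unknown tool: {tool}")
    if role not in entry["allowed_roles"]:
        raise PermissionError(f"role {role!r} may not call {tool!r}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })
```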

3. Multi-Agent Intelligent Collaboration

Built-in multi-agent management with the following features:

Automatic vector storage for all agents
When a user submits a question, a large language model automatically rewrites it, and RAG-based retrieval finds the most similar agents
A classifier then automatically selects the best single agent or set of agents to process and answer the query
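
The retrieval step above can be illustrated with a small similarity search over agent descriptions. This sketch covers only the retrieval part, and it makes several assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `select_agents`, `top_k`, and `threshold` are hypothetical names rather than Gaia-X parameters. Question rewriting and the final classifier are not shown.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_agents(question: str, agents: dict, top_k: int = 2, threshold: float = 0.1) -> list:
    """Rank agents by similarity to the question; keep at most top_k above threshold."""
    q = embed(question)
    scored = sorted(((cosine(q, embed(desc)), name) for name, desc in agents.items()), reverse=True)
    return [name for score, name in scored[:top_k] if score >= threshold]
```

In practice the stored vectors would be computed once per agent and kept in a vector store, so only the question needs to be embedded at query time.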

4. Natural Language-Driven RPA

Traditional RPA requires coding or script recording. Gaia-X instead uses the planning and computer-operation capabilities of large language models to convert user requirements directly into a sequence of computer operation steps, much as a person would operate the computer, without requiring knowledge of technical details such as webpage source code.

5. Human-Confirmable ReAct Operations

To address risks in enterprise scenarios when large language models call tools, especially when directly submitting data to enterprise system APIs, we've introduced human-confirmable tool invocation:

Support for configuring tools in MCP Servers to require human confirmation
Tools that require human confirmation automatically render an appropriate display mode, such as a form, based on their parameter types. Users can modify and confirm the parameters before the subsequent steps run, similar to the code-change confirmation flow in Cursor
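
A minimal sketch of such a confirmation gate, under stated assumptions: `run_tool`, the `requires_confirmation` flag, and the `confirm` callback are hypothetical names for this example, and the callback stands in for whatever form the client actually renders.

```python
from typing import Callable, Optional


def run_tool(
    tool: Callable[..., object],
    arguments: dict,
    requires_confirmation: bool,
    confirm: Callable[[dict], Optional[dict]],
) -> object:
    """Run a tool, pausing for human review first when the tool is flagged.

    `confirm` receives the proposed arguments and returns the (possibly edited)
    arguments to proceed with, or None to cancel the call.
    """
    if requires_confirmation:
        reviewed = confirm(dict(arguments))  # pass a copy so edits are explicit
        if reviewed is None:
            return {"status": "cancelled"}
        arguments = reviewed
    return tool(**arguments)
```

The tool only ever sees the arguments the reviewer returned, so a human edit or a cancellation takes effect before any data reaches the enterprise system API.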

6. Canvas Support

In addition to dynamic form rendering for tool parameters, the canvas automatically adapts to other artifact types:

Code highlighting and execution (requires corresponding language MCP Server support)
SVG and HTML display
Common chart rendering, such as ECharts and Mermaid
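
The dynamic form rendering for tool parameters mentioned at the start of this section could work roughly as follows. MCP tools describe their inputs with a JSON Schema, so a client can derive form fields from it. The widget names, the `WIDGETS` mapping, and `form_fields` are hypothetical and only illustrate the idea.

```python
# Hypothetical mapping from JSON Schema types to form widgets.
WIDGETS = {
    "string": "text_input",
    "integer": "number_input",
    "number": "number_input",
    "boolean": "checkbox",
}


def form_fields(input_schema: dict) -> list:
    """Derive a list of form field descriptions from a JSON Schema object."""
    required = set(input_schema.get("required", []))
    fields = []
    for name, prop in input_schema.get("properties", {}).items():
        # Enumerated values become a dropdown; unknown types fall back to raw JSON.
        widget = "select" if "enum" in prop else WIDGETS.get(prop.get("type"), "json_editor")
        fields.append({"name": name, "widget": widget, "required": name in required})
    return fields
```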

7. Text Selection and Floating Ball

Text selection triggers a dedicated toolbar that can send selected content to corresponding agents for generation
Text selection toolbar supports custom configuration or selection from existing agent lists
Global floating ball enables Gaia-X invocation in any context
Both text selection and floating ball can be easily disabled to avoid interfering with user operations

Technical Architecture

The client is implemented using Electron + Ant Design X and supports both macOS and Windows. The text selection component is developed in C++ for Windows and Objective-C for macOS. MCP Servers written in TypeScript and Python are both supported, and each MCP Server runs in its own isolated space for security.

Computer operations support three models: Claude 3.7 Sonnet, Zhipu CogAgent, and ByteDance UI-TARS. Users can configure whichever model they prefer. Note that Zhipu CogAgent and ByteDance UI-TARS are open-source models that require self-deployment. For deployment details, please refer to documentation xxx.

Core Concepts

Before starting with Gaia-X, understanding these core concepts will help you better comprehend and use it:

Model Context Protocol (MCP): An open-source protocol from Anthropic for passing context between applications and models, with an active community and many available MCP Servers
Agent: Gaia-X agents that can be intelligently coordinated in conversations and utilize tools from multiple MCP Servers
Plugin: Components that extend Gaia-X functionality

Next Steps

Getting Started - Learn how to install and use Gaia-X
Advanced Features - Explore advanced features and usage of Gaia-X