Architecture Overview
Our system consists of three specialized units that work together to provide a complete AI infrastructure solution:

Computing Unit
IP Address: 10.10.99.98
Port Range: 58000-58999
Function: Runs inference frameworks and AI models, provides computing services for machine learning workloads.
Application Unit
IP Address: 10.10.99.99
Port Range: 59000-59299
Function: Hosts user applications and system platforms, manages application lifecycle and deployment.
Hardware Control Unit
IP Address: 10.10.99.97
Port: 80 (monitoring service)
Function: Provides hardware monitoring, configuration, and matrix display management.
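The addresses and port ranges above can be kept in a small shell map for use in later checks. A minimal sketch (the short unit keys are labels of convenience, not product identifiers):

```shell
#!/usr/bin/env bash
# Unit addresses and port ranges as documented in the overview above.
declare -A UNIT_IP=( [computing]=10.10.99.98 [application]=10.10.99.99 [hwctl]=10.10.99.97 )
declare -A UNIT_PORTS=( [computing]=58000-58999 [application]=59000-59299 [hwctl]=80 )

# Print a quick summary table of the three units.
for u in computing application hwctl; do
  printf '%-12s %-13s %s\n' "$u" "${UNIT_IP[$u]}" "${UNIT_PORTS[$u]}"
done
```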
Computing Unit Configuration
The Computing Unit serves as the core processing engine, handling all AI inference tasks and model operations. Proper configuration ensures optimal performance for your machine learning workloads.
Inference Framework Setup
The Computing Unit runs multiple inference frameworks simultaneously:

vLLM Inference Framework
- Service: Large Language Model (LLM) processing
- Port: 58000
- Purpose: High-performance LLM inference

TEI Embedding Model
- Port: 58080
- Purpose: Text embedding generation

TEI Reranker Model
- Port: 58081
- Purpose: Search result reranking
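Once the frameworks are up, their endpoints can be exercised from any machine on the network. A hedged sketch: vLLM serves an OpenAI-compatible API, and TEI exposes `/embed` and `/rerank` routes; the model name `my-model` is a placeholder for whatever model you actually deployed.

```shell
# Query the vLLM OpenAI-compatible chat endpoint ("my-model" is a placeholder).
vllm_chat() {
  curl -s --connect-timeout 5 http://10.10.99.98:58000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d "{\"model\": \"my-model\", \"messages\": [{\"role\": \"user\", \"content\": \"$1\"}]}"
}

# Generate embeddings via the TEI /embed route.
tei_embed() {
  curl -s --connect-timeout 5 http://10.10.99.98:58080/embed \
    -H 'Content-Type: application/json' \
    -d "{\"inputs\": \"$1\"}"
}

# Rerank two candidate texts against a query via the TEI /rerank route.
tei_rerank() {
  curl -s --connect-timeout 5 http://10.10.99.98:58081/rerank \
    -H 'Content-Type: application/json' \
    -d "{\"query\": \"$1\", \"texts\": [\"$2\", \"$3\"]}"
}
```

Example usage from a networked machine: `vllm_chat "Hello"` or `tei_embed "example text"`.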
Model Storage Configuration
Store models that you manually configure and deploy:
Use this directory for models that require custom configuration or manual deployment processes.
Auto-launch Configuration
Configure automatic model startup behavior:

Configuration File: ~/cfe/autorun.sh
This script is user-editable and unencrypted. You can modify it to specify which models should automatically start when the system boots.
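A hedged sketch of what ~/cfe/autorun.sh might contain, assuming the image ships vLLM's `vllm serve` CLI and TEI's `text-embeddings-router` launcher — adjust to the launchers actually present on your unit; all model paths below are placeholders:

```shell
#!/bin/sh
# Sketch only: launcher names and model paths are assumptions, not product defaults.
vllm serve /path/to/llm-model --port 58000 &
text-embeddings-router --model-id /path/to/embedding-model --port 58080 &
text-embeddings-router --model-id /path/to/reranker-model --port 58081 &
```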
Application Unit Configuration
The Application Unit provides a flexible deployment environment for user applications and system platforms. It includes both web serving capabilities and specialized AI platforms.
Core Services
Nginx Web Server
- Port: 80
- Purpose: Serves user applications and handles HTTP requests
- Configuration: Standard nginx setup with reverse proxy capabilities

Dify Platform
- Port: 59080
- Purpose: AI application development platform
- Features: Workflow builder, model management, API endpoints

Altai (Local Deployment)
- Port: 59299
- Purpose: Local AI deployment and management
- Features: Self-hosted AI model serving
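To illustrate the reverse-proxy role mentioned above, here is a minimal nginx sketch that fronts the Dify platform, assuming Dify listens locally on 59080 as documented; the `server_name` and location prefix are placeholders, not shipped defaults:

```nginx
# Hypothetical reverse-proxy sketch; adapt to your actual nginx layout.
server {
    listen 80;
    server_name app.local;  # placeholder

    location /dify/ {
        proxy_pass http://127.0.0.1:59080/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```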
Application Storage Structure
Location for imported applications before deployment:
Applications in this directory are unencrypted. Ensure proper security measures are in place before importing sensitive applications.
Hardware Control Unit Configuration
The Hardware Control Unit provides comprehensive monitoring and configuration capabilities. It ensures system stability and allows customization of hardware displays.
Monitoring Service
Hardware Monitoring Frontend
- Path: /sdcard/web
- Port: 80
- Purpose: Web-based hardware monitoring interface

The monitoring frontend is user-modifiable. You can customize the interface by editing files in the /sdcard/web directory.

Display Configuration
Matrix Display Settings
- Configuration File: /sdcard/matrix.json
- Purpose: Configure matrix display logos and patterns
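The schema of /sdcard/matrix.json is not documented here, so the following is an illustrative sketch only — every key name and value is hypothetical; inspect the file shipped on your device for the real structure before editing:

```json
{
  "logo": "default",
  "pattern": "scroll",
  "brightness": 80
}
```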
Setup and Configuration Steps
1. Verify Network Configuration
Ensure all three units can communicate with each other:
All units should respond to ping requests within the local network.
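The ping check can be scripted. A minimal sketch that pings each unit once with a 2-second timeout and reports reachability; run `check_units` after all three units have booted:

```shell
# Report reachability of each documented unit IP.
check_units() {
  for ip in 10.10.99.97 10.10.99.98 10.10.99.99; do
    if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
    fi
  done
}
```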
2. Configure Port Access
Verify that the required ports are available and not blocked by firewalls:
- Computing Unit: Ports 58000-58999
- Application Unit: Ports 59000-59299
- Hardware Control Unit: Port 80
Ensure no port conflicts exist with other services running on your network.
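Individual ports can be probed without extra tooling. A sketch using bash's /dev/tcp pseudo-device with a 2-second timeout; it succeeds only if something accepts the TCP connection:

```shell
# port_open HOST PORT -> exit 0 if a TCP connection succeeds within 2 seconds.
port_open() {
  timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null
}

# Example: probe the vLLM port on the Computing Unit.
# port_open 10.10.99.98 58000 && echo "vLLM port open"
```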
3. Set Up Model Storage
Create the required directory structure on the Computing Unit:
Verify directories are created with proper permissions for model storage.
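This guide does not state the model directory path, so the sketch below uses a placeholder `MODEL_DIR` — substitute the path your Computing Unit actually documents:

```shell
# MODEL_DIR is a placeholder; replace it with your unit's documented model path.
MODEL_DIR="${MODEL_DIR:-$HOME/models}"
mkdir -p "$MODEL_DIR/manual"     # models you configure and deploy by hand
chmod -R 755 "$MODEL_DIR"        # readable/executable by services, writable by owner
ls -ld "$MODEL_DIR" "$MODEL_DIR/manual"
```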
4. Configure Auto-launch Script
Edit the auto-launch configuration:
Test your auto-launch script manually before relying on automatic startup.
5. Deploy Applications
Set up application storage on the Application Unit:
Verify applications can be deployed and accessed through the configured ports.
Security Considerations
Important Security Notes:
- All imported applications and scripts are stored in non-encrypted format
- The auto-launch script is user-editable and unencrypted
- Ensure proper access controls are in place for sensitive operations
- Regularly update and monitor all system components
Troubleshooting
Port Connection Issues
Common Problems:
- Firewall blocking configured ports
- Network connectivity issues between units
- Port conflicts with existing services
Solutions:
- Check firewall rules for the required port ranges
- Verify network connectivity using ping tests
- Use netstat to identify port conflicts
Model Loading Failures
Common Problems:
- Incorrect model directory paths
- Insufficient permissions for model files
- Auto-launch script configuration errors
Solutions:
- Verify model files are in the correct directories
- Check file permissions with ls -la
- Test auto-launch script manually
Application Deployment Issues
Common Problems:
- Application storage path configuration
- NVMe SSD access issues
- Port assignment conflicts
Solutions:
- Verify storage paths exist and are accessible
- Check NVMe SSD mount status
- Review port assignments for conflicts
Next Steps
After completing the system setup:
- Deploy Your First Model: Upload a model to the Computing Unit and test inference
- Configure Applications: Set up your first application on the Application Unit
- Monitor System Health: Use the Hardware Control Unit to monitor system performance
- Customize Display: Configure matrix display settings for your environment
For additional technical support or advanced configuration guidance, contact your system administrator or refer to the detailed API documentation.
© 2025 Panidea (Chengdu) Artificial Intelligence Technology Co., Ltd. All rights reserved.