Infrastructure
This document describes the infrastructure components and architecture of the TradeX platform.
Database Architecture
MongoDB
Purpose: Primary database for document storage
Used By:
- Backend Service (user data, preferences)
- User Service (profiles, preferences)
- Matching Engine (trade history, order history)
Characteristics:
- Document-based storage
- Flexible schema
- Horizontal scaling
- Replication support
PostgreSQL
Purpose: Relational data storage
Used By:
- Auth Service (user authentication, sessions)
- User Service (KYC data, account settings)
- Wallet Service (balances, transactions)
- Metadata Service (instruments, configuration)
- Market Data Service (trades, candles, order books)
Characteristics:
- ACID compliance
- Strong consistency
- Complex queries
- Transactions
TimescaleDB
Purpose: Time-series data storage
Used By:
- Market Data Service (OHLCV candles, trades)
Characteristics:
- Optimized for time-series data
- Automatic data retention policies
- Compression
- Continuous aggregates
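To make the OHLCV candle idea concrete, the following is a minimal sketch of the roll-up a continuous aggregate would maintain: trades bucketed into fixed time windows, each reduced to open, high, low, close, and volume. It runs in plain Python rather than in the database, and the `Candle` type, the `build_candles` function, and the `(timestamp, price, quantity)` trade shape are assumptions made for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    open: float
    high: float
    low: float
    close: float
    volume: float


def build_candles(trades, bucket_seconds=60):
    """Group (timestamp, price, quantity) trades into fixed-width OHLCV buckets.

    Timestamps are assumed to be integer seconds. The result maps the
    bucket start time to the candle for that bucket.
    """
    candles = {}
    # Sort by timestamp only; the sort is stable, so trades sharing a
    # timestamp keep their arrival order.
    for ts, price, qty in sorted(trades, key=lambda t: t[0]):
        bucket = ts - (ts % bucket_seconds)
        candle = candles.get(bucket)
        if candle is None:
            # First trade in the bucket sets open, high, low, and close.
            candles[bucket] = Candle(price, price, price, price, qty)
        else:
            candle.high = max(candle.high, price)
            candle.low = min(candle.low, price)
            candle.close = price  # last trade seen so far in this bucket
            candle.volume += qty
    return candles
```

In a real deployment the database would keep this aggregate up to date incrementally; the sketch only shows what each candle contains.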
ClickHouse
Purpose: Analytics and historical data
Used By:
- Market Data Service (historical trades, analytics)
Characteristics:
- Columnar storage
- High compression
- Fast analytical queries
- Horizontal scaling
In-Memory Cache
Purpose: Caching and session management
Used By:
- All services (caching)
- Market Data Service (order book caching)
- Metadata Service (configuration caching)
- Auth Service (session storage)
Characteristics:
- In-memory storage
- Sub-millisecond latency
- Pub/sub support
- Persistence options
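The caching role described above usually follows a cache-aside pattern: read from the cache, fall back to the source of truth on a miss, and store the result with an expiry. The sketch below illustrates that flow with an in-process dictionary standing in for the real cache. `TTLCache`, `get_or_load`, and the `loader` callback are hypothetical names chosen for this example.

```python
import time


class TTLCache:
    """Cache-aside helper with per-entry expiry (in-process stand-in for a cache server)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        # Miss or expired: load from the source of truth and cache the result.
        value = loader(key)
        self._store[key] = (now + self._ttl, value)
        return value
```

A short TTL suits fast-changing data such as order books, while configuration can tolerate a longer one; the right values depend on how stale each consumer can afford to be.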
Message Queue Architecture
Kafka
Purpose: Event streaming and messaging
Configuration:
- Brokers: 3-node cluster (kafka-1, kafka-2, kafka-3)
- Replication: 3x replication factor
- Partitions: 3 partitions per topic (default)
- Schema Registry: Confluent Schema Registry
Topics:
- engine.event.v1: Matching engine events
- engine.snapshot.v1: Order book snapshots
- md.instrument.*: Metadata events
- wallet.*: Wallet events
- auth.*: Authentication events
- user.*: User events
Characteristics:
- High throughput
- Event ordering per partition
- At-least-once delivery
- Schema evolution support
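At-least-once delivery means a consumer may receive the same event more than once, for example after a retry or a consumer restart, so handlers need to be idempotent. The sketch below shows one common approach: remembering recently processed event IDs and skipping repeats. The `event_id` field and the `IdempotentConsumer` class are assumptions for illustration; the actual event schema is not specified here.

```python
from collections import OrderedDict


class IdempotentConsumer:
    """Wraps a handler so that redelivered events are applied only once."""

    def __init__(self, handler, max_remembered=10_000):
        self._handler = handler
        self._seen = OrderedDict()  # event_id -> True, oldest first
        self._max = max_remembered

    def consume(self, event):
        """Return True if the event was handled, False if it was a duplicate."""
        event_id = event["event_id"]  # hypothetical unique ID carried by each event
        if event_id in self._seen:
            return False
        self._handler(event)
        # Record the ID only after the handler succeeds, so a failed
        # attempt is retried on redelivery instead of being dropped.
        self._seen[event_id] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest remembered ID
        return True
```

An in-memory set like this is lost on restart; a durable variant would keep the processed IDs in the same transaction as the state change.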
Service Architecture
Microservices
Each service is:
- Independent: Can be deployed independently
- Scalable: Horizontal scaling support
- Resilient: Fault-tolerant design
- Observable: Metrics, logging, tracing
Service Communication
- REST: External APIs
- gRPC: Internal APIs
- Kafka: Event-driven communication
- WebSocket: Real-time client updates
Monitoring and Observability
Prometheus
Purpose: Metrics collection and storage
Metrics:
- Service metrics (latency, throughput, errors)
- Infrastructure metrics (CPU, memory, disk)
- Business metrics (orders, trades, users)
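Prometheus scrapes metrics as plain text, one sample per line. The sketch below hand-rolls a labelled counter and renders it in that text format, purely to show what a scrape returns. A real service would normally use an official client library instead. The `Counter` class here, the metric name `orders_total`, and the `service` label are illustrative assumptions.

```python
from collections import defaultdict


class Counter:
    """Minimal labelled counter that renders Prometheus-style text output."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)  # sorted label tuple -> running total

    def inc(self, amount=1.0, **labels):
        key = tuple(sorted(labels.items()))
        self._values[key] += amount

    def render(self):
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            # Label values are not escaped here; a real client handles that.
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"
```

Usage: create `Counter("orders_total", "Orders accepted")`, call `inc(service="matching-engine")` on each accepted order, and serve `render()` from the service's metrics endpoint.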
Grafana
Purpose: Metrics visualization and dashboards
Dashboards:
- Service health dashboards
- Infrastructure dashboards
- Business metrics dashboards
OpenTelemetry
Purpose: Distributed tracing
Features:
- Request tracing across services
- Span correlation
- Performance analysis
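Span correlation across services depends on propagating a shared trace ID with each request. The sketch below builds and forwards a W3C Trace Context `traceparent` header value by hand to show the mechanics: the trace ID stays constant along the request path while each hop gets its own span ID. In practice an OpenTelemetry SDK does this automatically; the function names here are hypothetical.

```python
import secrets


def new_traceparent() -> str:
    """Start a new trace: version 00, random trace and span IDs, sampled flag set."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent: str) -> str:
    """Derive the header for an outgoing call made while handling `parent`.

    The trace ID and flags are carried over unchanged; only the span ID
    is replaced, which is what lets a backend stitch spans into one trace.
    """
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"
```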
Logging
Purpose: Structured logging
Features:
- Centralized logging
- Log aggregation
- Search and analysis
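Structured logging typically means emitting one machine-parseable record per line so that an aggregator can index fields directly. A minimal sketch using Python's standard `logging` module follows. The field names (`ts`, `level`, `logger`, `message`, `trace_id`) and the `make_logger` helper are assumptions for illustration, not the platform's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional correlation field, supplied via logger.info(..., extra={...}).
        trace_id = getattr(record, "trace_id", None)
        if trace_id is not None:
            payload["trace_id"] = trace_id
        return json.dumps(payload)


def make_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger
```

Including the trace ID in each record is one way to link log lines to the distributed traces collected separately.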
Deployment Architecture
Containerization
- Docker: Container runtime
- Docker Compose: Local development
- Kubernetes: Production deployment (optional)
Service Discovery
- DNS-based: Service discovery via DNS
- Environment variables: Service URLs
- Service mesh: Optional (Istio, Linkerd)
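The first two discovery options can be combined: a service looks for an explicit URL in its environment and otherwise falls back to a DNS name that matches the service name. The sketch below assumes a hypothetical naming convention (`WALLET_SERVICE_URL` for `wallet-service`) and a hypothetical default port; neither comes from the platform's configuration.

```python
import os


def service_url(name, default_port=8080, env=None):
    """Resolve the base URL of a service.

    Looks for an environment override first (for example WALLET_SERVICE_URL
    for "wallet-service"), then falls back to the DNS name of the service.
    """
    env = os.environ if env is None else env
    var = name.upper().replace("-", "_") + "_URL"
    override = env.get(var)
    if override:
        return override.rstrip("/")
    # DNS-based default: the service name resolves inside the cluster network.
    return f"http://{name}:{default_port}"
```

With Docker Compose the fallback works because each service is reachable by its service name on the shared network; the environment override is useful when pointing at an instance outside that network.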
Load Balancing
- Gateway: API gateway for external traffic
- Service mesh: Internal load balancing
- Kafka: Partition-based load balancing
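Partition-based load balancing works by hashing each message key to a partition, so that all events for one key land on the same partition (and keep their order) while different keys spread across consumers. The sketch below illustrates the idea with three partitions. It uses CRC32 as a stand-in hash, which is an assumption for illustration: real Kafka clients use their own partitioner, so the partition numbers here will not match a live cluster. Instrument symbols as keys are also an assumption.

```python
import zlib

NUM_PARTITIONS = 3  # illustrative; matches a 3-partition topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition deterministically (stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(events):
    """Distribute (key, payload) events across partitions, preserving arrival order."""
    partitions = {p: [] for p in range(NUM_PARTITIONS)}
    for key, payload in events:
        partitions[partition_for(key)].append((key, payload))
    return partitions
```

Because the mapping is deterministic, every event keyed by the same instrument is consumed by whichever consumer owns that partition, in the order it was produced.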
Security Architecture
Authentication
- JWT Tokens: Stateless authentication
- API Keys: Service-to-service authentication
- mTLS: Mutual TLS for gRPC (optional)
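JWT authentication is stateless because the token carries its own claims and a signature that any service holding the key can verify without a session lookup. The sketch below signs and verifies an HS256 token using only the standard library to show the mechanics. The choice of HS256, the shared secret, and the claim names are assumptions for illustration; a production service would normally rely on a maintained JWT library and may use a different algorithm.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Return a compact HS256-signed token for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes, now=None) -> dict:
    """Check signature, algorithm, and expiry; return the claims or raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return claims
```

The trade-off of stateless tokens is that they cannot be revoked before expiry without extra state, which is why short lifetimes are common.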
Authorization
- RBAC: Role-based access control
- Service-level permissions: Per-service permissions
- Resource-level permissions: Per-resource permissions
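Role-based access control reduces to a lookup: roles map to sets of permitted (resource, action) pairs, and a request is allowed if any of the caller's roles grants the pair. The following sketch shows that check with deny-by-default behavior. The role names, resources, and actions are hypothetical and are not taken from the platform's actual permission model.

```python
# Hypothetical role table: role -> set of (resource, action) pairs.
ROLE_PERMISSIONS = {
    "trader": {("orders", "create"), ("orders", "cancel"), ("wallet", "read")},
    "support": {("orders", "read"), ("wallet", "read")},
    "admin": {("*", "*")},  # wildcard grants every action on every resource
}


def is_allowed(roles, resource, action):
    """Return True if any of the given roles permits the action on the resource."""
    for role in roles:
        # Unknown roles grant nothing, so access is denied by default.
        for res, act in ROLE_PERMISSIONS.get(role, ()):
            if res in ("*", resource) and act in ("*", action):
                return True
    return False
```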
Network Security
- TLS: Encryption in transit
- Network policies: Service-to-service restrictions
- Firewall rules: External access restrictions
Scalability
Horizontal Scaling
- Stateless services: Easy horizontal scaling
- Database sharding: For high-volume data
- Kafka partitioning: Parallel processing
Vertical Scaling
- Resource limits: Per-service resource limits
- Auto-scaling: Based on metrics
- Resource optimization: Efficient resource usage
High Availability
Replication
- Database replication: Master-replica setup
- Kafka replication: 3x replication factor
- Service replication: Multiple service instances
Failover
- Automatic failover: Database failover
- Service restart: Automatic service restart
- Circuit breakers: Fault tolerance
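A circuit breaker protects callers from a failing dependency by rejecting calls outright after repeated failures, then allowing a trial call once a cool-down has passed. The sketch below is a minimal version of that state machine. The thresholds, the class names, and the injectable clock are assumptions for illustration and do not describe a specific library used by the platform.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock       # injectable so timeouts can be tested without sleeping
        self._failures = 0
        self._opened_at = None    # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures in a row, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast while the circuit is open keeps request threads from piling up behind a dependency that is already struggling.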
Backup and Recovery
Database Backups
- Regular backups: Automated backups
- Point-in-time recovery: Time-based recovery
- Backup storage: Off-site backup storage
Disaster Recovery
- Recovery procedures: Documented procedures
- Recovery testing: Regular testing
- RTO/RPO: Recovery time and point objectives