Cubik AI

A comprehensive platform for building, deploying, and scaling AI applications with privacy in mind. Visit the live application.

Platform Overview

Cubik AI (powered by Unicron) is an end-to-end platform that covers everything from infrastructure management to model deployment, with a focus on simplifying the path from AI model to production.

The platform manages the infrastructure needed to deploy Large Language Models (LLMs) at scale, supports custom workflows, and simplifies the MLOps (Machine Learning Operations) process.

Tech Stack

  • Go (Golang)
  • Python
  • PostgreSQL
  • Redis
  • Next.js
  • TypeScript
  • NextAuth
  • Docker
  • Kubernetes
  • AWS (EC2, EKS, ECR)
  • Terraform
  • gRPC
  • JWT
  • SQLC
  • Gin (HTTP web framework)
  • Prometheus & Grafana (monitoring)
  • Zerolog (logging)

Core Features

Model Deployment

  • Zero-Code Deployment: Deploy AI models without writing infrastructure code
  • Auto-Build System: Automatic handling of dependencies and setup
  • Performance Optimization: Automatic scaling and resource allocation
  • Real-time Monitoring: Track metrics and logs for all deployments
  • Production-Ready Endpoints: Instantly available API endpoints
  • Provider Support: Provider-specific configurations for various ML model providers
  • Provider API Keys: API key management for external providers
  • Open Source Models: Access to numerous open source models

Application Types

  • Chat Applications: Customizable chat interfaces, message history, context management, multi-model support, conversation storage
  • AI Agents: Task planning and execution, multi-step reasoning, tool/API integration, memory and context awareness
  • Code Assistants: Code generation/completion, code review/analysis, documentation generation, bug detection/fixing
  • Workflow Automation: Visual workflow builder, conditional branching, external service integration, automated task scheduling

Infrastructure Management

  • Cluster Provisioning: Kubernetes cluster creation, node group management, auto-scaling configuration
  • Instance Type Selection: Support for various compute types (CPU, GPU), region-specific options, spot instance support
  • Server Infrastructure: Auto-scaling based on request volume, handling of high-traffic periods, no infrastructure code required
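
Auto-scaling based on request volume can be illustrated with a simple proportional rule. The following is a minimal sketch, not Cubik AI's actual scaling logic: the function name DesiredReplicas, the per-replica throughput target, and the replica bounds are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// DesiredReplicas returns how many replicas are needed to serve totalRPS
// when each replica is expected to handle targetRPSPerReplica requests per
// second. The result is clamped to [minReplicas, maxReplicas].
// All names and parameters here are hypothetical.
func DesiredReplicas(totalRPS, targetRPSPerReplica float64, minReplicas, maxReplicas int) int {
	if targetRPSPerReplica <= 0 {
		// Without a valid target there is nothing to scale against.
		return minReplicas
	}
	n := int(math.Ceil(totalRPS / targetRPSPerReplica))
	if n < minReplicas {
		return minReplicas
	}
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	fmt.Println(DesiredReplicas(450, 100, 1, 10))  // 5
	fmt.Println(DesiredReplicas(0, 100, 1, 10))    // 1 (lower bound)
	fmt.Println(DesiredReplicas(5000, 100, 1, 10)) // 10 (upper bound)
}
```

Clamping to an upper bound is what keeps a traffic spike from provisioning unbounded capacity; the lower bound keeps at least one replica warm.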

User and Workspace Management

  • User registration, login, and authentication
  • Email verification
  • OAuth integration (Google, GitHub)
  • Workspace creation and management
  • Team collaboration with role-based access control

Application and Service Management

  • App creation and deployment
  • Service configuration and deployment
  • API endpoints for deployed models
  • API key management for secure access
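
One common way to manage API keys for secure access is to show the key to the user once and store only a hash of it. The sketch below assumes that approach; the "cbk_" prefix and the function names are hypothetical and are not taken from the Cubik AI codebase.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// HashKey returns the hex-encoded SHA-256 digest of an API key.
// Only this digest would be persisted, never the key itself.
func HashKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}

// NewAPIKey generates a random key (shown to the user once) and the
// hash to store. The "cbk_" prefix is a hypothetical naming choice.
func NewAPIKey() (key string, hash string, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	key = "cbk_" + hex.EncodeToString(buf)
	return key, HashKey(key), nil
}

// VerifyAPIKey compares the hash of the presented key with the stored
// hash in constant time.
func VerifyAPIKey(presented, storedHash string) bool {
	got := HashKey(presented)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedHash)) == 1
}

func main() {
	key, hash, err := NewAPIKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("valid key accepted:", VerifyAPIKey(key, hash))
	fmt.Println("tampered key accepted:", VerifyAPIKey(key+"x", hash))
}
```

Because the keys are high-entropy random values, a fast hash such as SHA-256 is a reasonable choice here; user passwords would need a dedicated password-hashing function instead.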

RLHF (Reinforcement Learning from Human Feedback)

  • Preference Learning: Rank model outputs, pair-wise comparison, batch preference collection
  • Direct Feedback: Numerical ratings, qualitative feedback, targeted improvements
  • Reward Modeling: Custom reward functions, multi-objective optimization, automated evaluation
  • Constitutional AI: Rule-based constraints, ethical guidelines, safety protocols
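
Pair-wise comparison data can be turned into a ranking in several ways. The sketch below uses the simplest one, a per-output win rate, as a stand-in for a learned reward model; the Comparison type and function names are hypothetical, and the platform's actual aggregation method is not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

// Comparison records one human judgment: Winner was preferred over Loser.
type Comparison struct {
	Winner, Loser string
}

// WinRates returns, for every output that appears in cs, the fraction of
// its comparisons that it won.
func WinRates(cs []Comparison) map[string]float64 {
	wins := map[string]int{}
	total := map[string]int{}
	for _, c := range cs {
		wins[c.Winner]++
		total[c.Winner]++
		total[c.Loser]++
	}
	rates := make(map[string]float64, len(total))
	for id, n := range total {
		rates[id] = float64(wins[id]) / float64(n)
	}
	return rates
}

// Rank orders outputs by win rate, highest first, breaking ties by ID so
// the result is deterministic.
func Rank(cs []Comparison) []string {
	rates := WinRates(cs)
	ids := make([]string, 0, len(rates))
	for id := range rates {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if rates[ids[i]] != rates[ids[j]] {
			return rates[ids[i]] > rates[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	batch := []Comparison{{"A", "B"}, {"A", "C"}, {"B", "C"}}
	fmt.Println(Rank(batch)) // [A B C]
}
```

Win rates ignore the strength of the opponents each output faced; a reward model trained on the same comparisons would account for that.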

Model Training & Improvement

  • Use Now: Fast and scalable interface for using and fine-tuning models
  • Security: Protection against model misuse and attacks
  • Organic Growth: Fine-tune models with your data at any time
  • Scale as You Go: Auto-scaling clusters to keep up with demand without overpaying

System Architecture

The platform consists of several core components:

  • Main Application: Go-based backend service exposing REST APIs for managing deployments, users, and clusters
  • Zeus: gRPC service that handles interactions with cloud resources, containerization, and model deployment
  • Database Layer: PostgreSQL for storing user data, deployment configurations, and other persistent state
  • Task Processing: Asynchronous processing using Redis and the Asynq library for background jobs
  • Cluster Management: Management of Kubernetes clusters for model deployment and scaling
  • WebSocket Server: Real-time updates for long-running operations such as cluster creation
  • Authentication & Authorization: JWT-based with Role-Based Access Control (RBAC)
  • API Gateway: Routing and load balancing for deployed model endpoints

Deployment Options

  • Docker Compose: For simple deployments (Main application, Zeus gRPC, PostgreSQL, Redis, Nginx)
  • AWS: Primary cloud provider with EC2, EKS, ECR, and other services, using Terraform for infrastructure provisioning
  • Kubernetes: For more advanced deployments with EKS, custom node groups, and auto-scaling
  • CI/CD Pipeline: GitHub Actions for automated testing, Docker image building/pushing, and deployment

Security Features

  • Authentication: JWT-based authentication, secure password hashing, OAuth integrations, email verification
  • Authorization: Role-based access control, fine-grained permissions, API key management
  • Network Security: HTTPS encryption, IP whitelisting for load balancers, VPC isolation
  • Data Security: Database encryption, credential management, secret key rotation
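
To make the JWT-based authentication concrete, here is a minimal sketch of issuing and verifying an HS256-signed token using only the Go standard library. The claim fields, function names, and the choice of HS256 are assumptions for illustration; the source does not state which signing algorithm or library the platform uses, and production code would normally rely on a vetted JWT library.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a hypothetical payload, not the platform's real claim set.
type Claims struct {
	Subject string `json:"sub"`
	Role    string `json:"role"`
	Expires int64  `json:"exp"` // Unix seconds
}

var b64 = base64.RawURLEncoding

// sign returns the base64url-encoded HMAC-SHA256 of input.
func sign(input string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(input))
	return b64.EncodeToString(mac.Sum(nil))
}

// IssueToken builds a compact token of the form header.payload.signature.
func IssueToken(c Claims, secret []byte) (string, error) {
	payload, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	header := b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	input := header + "." + b64.EncodeToString(payload)
	return input + "." + sign(input, secret), nil
}

// VerifyToken checks the signature first, then the expiry, and returns
// the decoded claims. nowUnix is passed in to keep the function testable.
func VerifyToken(token string, secret []byte, nowUnix int64) (Claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return Claims{}, errors.New("malformed token")
	}
	want := sign(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return Claims{}, errors.New("invalid signature")
	}
	payload, err := b64.DecodeString(parts[1])
	if err != nil {
		return Claims{}, err
	}
	var c Claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return Claims{}, err
	}
	if nowUnix >= c.Expires {
		return Claims{}, errors.New("token expired")
	}
	return c, nil
}

func main() {
	secret := []byte("demo-secret") // placeholder; real secrets come from configuration
	now := time.Now().Unix()
	token, err := IssueToken(Claims{Subject: "user-1", Role: "admin", Expires: now + 3600}, secret)
	if err != nil {
		panic(err)
	}
	claims, err := VerifyToken(token, secret, now)
	fmt.Println(claims.Subject, claims.Role, err)
}
```

The verifier always recomputes an HMAC-SHA256 signature regardless of what the token header claims, so a forged header cannot switch it to a weaker algorithm.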

User Management

  • User Types: Regular users, system administrators, workspace administrators
  • Roles and Permissions: Predefined roles, custom roles with specific permissions, workspace-specific role assignments
  • Team Collaboration: Workspace sharing, invitation system, user activity tracking
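
Workspace-specific role assignments can be pictured as a lookup from (workspace, user) to a role, and from the role to a set of permissions. The following sketch assumes that model; the Authorizer type, its methods, and the permission strings are hypothetical and do not describe the platform's real schema.

```go
package main

import "fmt"

// Permission is a hypothetical permission identifier such as "deployment:create".
type Permission string

// Authorizer holds role definitions and per-workspace role assignments.
type Authorizer struct {
	roles       map[string]map[Permission]bool // role name -> permissions
	assignments map[string]map[string]string   // workspace -> user -> role name
}

// NewAuthorizer returns an empty Authorizer.
func NewAuthorizer() *Authorizer {
	return &Authorizer{
		roles:       map[string]map[Permission]bool{},
		assignments: map[string]map[string]string{},
	}
}

// DefineRole creates or replaces a role with the given permissions.
func (a *Authorizer) DefineRole(name string, perms ...Permission) {
	set := make(map[Permission]bool, len(perms))
	for _, p := range perms {
		set[p] = true
	}
	a.roles[name] = set
}

// Assign gives user the named role inside one workspace only.
func (a *Authorizer) Assign(workspace, user, role string) {
	if a.assignments[workspace] == nil {
		a.assignments[workspace] = map[string]string{}
	}
	a.assignments[workspace][user] = role
}

// Can reports whether user holds permission p in workspace.
// Unknown workspaces, users, and roles all resolve to false.
func (a *Authorizer) Can(workspace, user string, p Permission) bool {
	role, ok := a.assignments[workspace][user]
	if !ok {
		return false
	}
	return a.roles[role][p]
}

func main() {
	auth := NewAuthorizer()
	auth.DefineRole("admin", "deployment:create", "member:invite")
	auth.DefineRole("viewer", "deployment:read")
	auth.Assign("ws-1", "alice", "admin")
	auth.Assign("ws-2", "alice", "viewer")

	fmt.Println(auth.Can("ws-1", "alice", "deployment:create")) // true
	fmt.Println(auth.Can("ws-2", "alice", "deployment:create")) // false
}
```

The same user can be an administrator in one workspace and a viewer in another, which is the behavior workspace-specific assignments are meant to provide.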

Monitoring and Logging

  • Metrics Collection: Infrastructure metrics, application metrics, deployment performance
  • Logging: Structured logging using Zerolog, build logs, deployment logs, API access logs
  • Alerting: Health status monitoring, error rate alerting, resource utilization alerts
  • Troubleshooting: Log analysis tools, deployment status tracking, cluster health monitoring

Plans & Features by Tier

  • Basic Plan: Common API server, 3 members per workspace, 10 GB storage, 3 deployed jobs, unlimited instant deployments, community support, restricted instances
  • Team Plan: API server deployment capability, unlimited workspace members, 20 GB storage + pay for excess, unlimited batch/task jobs, unlimited instant deployments, community support, RBAC, SSO, access to all instances
  • Enterprise Plan: Team features + private API server, unlimited workspace members, unlimited storage, custom domain support, private cluster / self-hosting option, unlimited jobs, private support + custom integration, RBAC, SSO, all instances + custom instances
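
The tier limits above could be encoded as data and checked before an action is allowed. This is a sketch under stated assumptions: the Plan struct, the Unlimited sentinel, and CanAddMember are hypothetical, and treating the "deployed jobs" and "batch/task jobs" limits as the same field is an interpretation of the list above rather than something the source confirms.

```go
package main

import "fmt"

// Unlimited is a hypothetical sentinel meaning "no cap".
const Unlimited = -1

// Plan captures a few of the per-tier limits listed above.
type Plan struct {
	Name              string
	MaxMembers        int // members per workspace
	IncludedStorageGB int // storage included before any excess charges
	MaxJobs           int // assumed to cover deployed and batch/task jobs
	RBAC, SSO         bool
}

// Plans mirrors the Basic, Team, and Enterprise tiers.
var Plans = map[string]Plan{
	"basic":      {Name: "Basic", MaxMembers: 3, IncludedStorageGB: 10, MaxJobs: 3},
	"team":       {Name: "Team", MaxMembers: Unlimited, IncludedStorageGB: 20, MaxJobs: Unlimited, RBAC: true, SSO: true},
	"enterprise": {Name: "Enterprise", MaxMembers: Unlimited, IncludedStorageGB: Unlimited, MaxJobs: Unlimited, RBAC: true, SSO: true},
}

// CanAddMember reports whether a workspace that currently has
// currentMembers members may add one more under plan p.
func (p Plan) CanAddMember(currentMembers int) bool {
	return p.MaxMembers == Unlimited || currentMembers < p.MaxMembers
}

func main() {
	fmt.Println(Plans["basic"].CanAddMember(2)) // true: third member fits
	fmt.Println(Plans["basic"].CanAddMember(3)) // false: limit of 3 reached
	fmt.Println(Plans["team"].CanAddMember(50)) // true: unlimited members
}
```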

Additional Benefits

  • No Data Privacy Issues: Fine-tune models on your data without privacy concerns
  • Continuous Learning: Improve models over time with feedback
  • Human-in-the-Loop: Keep humans involved in model training and improvement
  • Rule-Based Governance: Apply explicit rules and principles to model behavior