Architecture Overview

StarNion is a fully self-hostable AI personal assistant. All data is stored on the userโ€™s own server, and the system is composed of five core services.


Overall System Structure

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                      User Access                         โ”‚
โ”‚                                                         โ”‚
โ”‚   Web Browser          Telegram App                      โ”‚
โ”‚       โ”‚                    โ”‚                            โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
        โ”‚                    โ”‚
        โ–ผ                    โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  UI (Next.js) โ”‚   โ”‚         Gateway (Go)               โ”‚
โ”‚   :3893       โ”‚โ”€โ”€โ–ถโ”‚              :8080                 โ”‚
โ”‚               โ”‚   โ”‚                                    โ”‚
โ”‚  - Chat UI    โ”‚   โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
โ”‚  - Dashboard  โ”‚   โ”‚  โ”‚ REST API โ”‚  โ”‚ Telegram Bot   โ”‚  โ”‚
โ”‚  - 24+ Pages  โ”‚   โ”‚  โ”‚ /api/v1/ โ”‚  โ”‚   Manager      โ”‚  โ”‚
โ”‚  - Settings   โ”‚   โ”‚  โ””โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚       โ”‚                โ”‚            โ”‚
                    โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜            โ”‚
                    โ”‚  โ”‚  WebSocket Hub (/ws/chat)        โ”‚
                    โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                    โ”‚                   โ”‚ gRPC
                    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                        โ”‚
                                        โ–ผ
                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚          Agent (TypeScript)            โ”‚
                    โ”‚             :50051                     โ”‚
                    โ”‚                                       โ”‚
                    โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
                    โ”‚  โ”‚     AI SDK v5 ยท Multi-LLM       โ”‚  โ”‚
                    โ”‚  โ”‚                                 โ”‚  โ”‚
                    โ”‚  โ”‚  24+ Skills: finance, diary,    โ”‚  โ”‚
                    โ”‚  โ”‚  goals, search, wellness, ...   โ”‚  โ”‚
                    โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚
                    โ”‚                 โ”‚                      โ”‚
                    โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
                    โ”‚  โ”‚      SSE Streaming               โ”‚  โ”‚
                    โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚
                    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                      โ”‚
                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚                                         โ”‚
                    โ–ผ                         โ–ผ               โ”‚
         โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”‚
         โ”‚  PostgreSQL      โ”‚    โ”‚       MinIO           โ”‚    โ”‚
         โ”‚  (pgvector)      โ”‚    โ”‚  (Object Storage)     โ”‚    โ”‚
         โ”‚                  โ”‚    โ”‚                       โ”‚    โ”‚
         โ”‚  - Conversations โ”‚    โ”‚  - Images             โ”‚    โ”‚
         โ”‚  - Finances      โ”‚    โ”‚  - Audio              โ”‚    โ”‚
         โ”‚  - Diary/Memos   โ”‚    โ”‚  - Document files     โ”‚    โ”‚
         โ”‚  - Embeddings    โ”‚    โ”‚  - Generated files    โ”‚    โ”‚
         โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ”‚
                                                              โ”‚
         โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
         โ”‚              LLM Providers                     โ”‚โ”€โ”€โ”˜
         โ”‚  Gemini / OpenAI / Claude / GLM / Ollama       โ”‚
         โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Five Core Services

1. UI (Next.js) โ€” Port 3893

The web front end. This is the interface users interact with directly in the browser.

  • Chat interface: Real-time streaming responses, file attachments, conversation history
  • Dashboard: Finance summary, goal status, D-Day, diary, memos, documents, images
  • 24+ Feature pages: Finance, budget, analytics, diary, wellness, garden, goals, D-Day, memos, memory, reports, statistics, search, skills, personas, models, channels, logs, usage, files, and more
  • Settings: Provider & model management, Telegram channel config, notification center (cron)
  • i18n: 4-language support (Korean, English, Japanese, Chinese) via next-intl

Next.js API Routes act as a proxy, forwarding requests to the Gatewayโ€™s REST API.

2. Gateway (Go) โ€” Port 8080

The hub for all traffic. It acts as an intermediary between the UI and the AI agent.

  • REST API (/api/v1/): Chat, file upload, settings, skill management, channel configuration, and more
  • WebSocket (/ws/chat): Real-time streaming chat connections
  • Telegram BotManager: Dynamically starts and stops Telegram bot instances per user
  • gRPC client: Communicates with the TypeScript Agent
  • Cron Scheduler: Per-user toggleable notification jobs (weekly report, budget warning, daily summary, etc.)
  • MinIO integration: Stores uploaded files in object storage

Goโ€™s high concurrency capability ensures stable operation even when many users are connected simultaneously.

3. Agent (TypeScript/Node.js) โ€” Port 50051 (gRPC)

The AI brain. A Vercel AI SDK v5-based agent analyzes messages and executes skills across multiple LLM providers.

  • AI SDK v5 Agent: Message processing, skill selection, response generation
  • Multi-LLM: Anthropic Claude, Google Gemini, OpenAI, GLM (Z.AI), Ollama
  • Skill system: 24+ built-in skills โ€” finance, diary, goals, wellness, search, memo, documents, image, audio, and more
  • SSE Streaming: Real-time responses via AI SDK standard streaming format
  • Embedding service: Converts text to vectors and stores them in PostgreSQL (pgvector)
  • RAG Memory: 4-layer semantic memory across all user data

4. PostgreSQL (pgvector)

The primary data store. The pgvector extension also stores vector embeddings alongside regular data.

Stored data: conversation history, expense records, diary entries, memos, goals, D-Days, document indices, embedding vectors, channel settings, skill settings, personas, cron schedules, usage logs

5. MinIO (Object Storage)

The file store. It provides an S3-compatible API, so it can be replaced with AWS S3.

Stored files: uploaded images, audio, documents; AI-generated files (QR codes, generated images, etc.)


Data Flow: Message Processing

Here is the processing flow when a user types โ€œlunch 12,000 won today.โ€

1. User โ†’ UI (Next.js)
   "lunch 12,000 won today"

2. UI โ†’ Gateway (HTTP POST /api/v1/chat or WebSocket)
   { message: "lunch 12,000 won today", user_id: "...", thread_id: "..." }

3. Gateway โ†’ Agent (gRPC Chat RPC)
   Called with server streaming

4. Agent: AI SDK v5 processing
   4-1. Message analysis: "recognized as food expense 12,000 won"
   4-2. Skill selection: finance skill
   4-3. DB query: check current month's food total
   4-4. Record expense: INSERT INTO finance_entries
   4-5. Generate response: "Recorded lunch 12,000 won. This month's food total: 87,500 won"

5. Agent โ†’ Gateway (gRPC streaming)
   Stream response token by token

6. Gateway โ†’ UI (WebSocket or SSE)
   Deliver real-time streaming

7. UI โ†’ User
   Display response on screen

gRPC Communication

Gateway (Go) and Agent (TypeScript) communicate via gRPC.

// proto/starnion/v1/agent.proto (summary)
service AgentService {
  // Regular chat (server streaming)
  rpc Chat(ChatRequest) returns (stream ChatResponse);

  // Health check
  rpc HealthCheck(HealthCheckRequest) returns (HealthCheckResponse);
}

Reasons for choosing gRPC:

  • Server streaming: Delivers LLM responses in real time, token by token
  • Type safety: Interface guaranteed via Protobuf schema
  • Efficiency: HTTP/2-based with low latency

WebSocket: Real-time Chat

Web UI chat is implemented with WebSocket. The Gatewayโ€™s WebSocket Hub manages connections.

Browser โ”€โ”€WebSocketโ”€โ”€ Gateway Hub โ”€โ”€gRPC Streamโ”€โ”€ Agent
  โ”‚              /ws/chat           server streaming  โ”‚
  โ”‚โ—€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”‚
        real-time token-by-token streaming

Connection flow:

  1. Browser establishes WebSocket connection to /ws/chat?user_id=...
  2. User types a message โ†’ JSON is sent
  3. Gateway sends a gRPC streaming request to Agent
  4. Agent response tokens are immediately relayed over WebSocket
  5. Characters appear on the browser screen in real time

Multi-Channel: Single Agent

The Web UI and Telegram connect to the same TypeScript Agent.

Telegram user โ”€โ”€โ–ถ Telegram Bot โ”€โ”€โ–ถ Gateway โ”€โ”€โ–ถ Agent โ”€โ”€โ–ถ same DB
Web user      โ”€โ”€โ–ถ WebSocket    โ”€โ”€โ–ถ Gateway โ”€โ”€โ–ถ Agent โ”€โ”€โ–ถ same DB

Since anything recorded in either channel is stored in the same PostgreSQL database, a memo written on the web can be retrieved on Telegram and vice versa.

Each channel message is identified by the platform field: web, telegram.


4-Layer RAG Memory System

The four-layer memory structure the Agent uses when referencing past records.

Query: "What did I eat last week?"

Layer 1: Daily Logs (daily log vectors)
  โ”œโ”€ Vector search over conversations from the past 7 days
  โ””โ”€ Extract food-expense-related entries

Layer 2: Knowledge Base (knowledge base vectors)
  โ”œโ”€ Spending pattern analysis results
  โ””โ”€ Frequently visited restaurant patterns

Layer 3: Document Sections (document section vectors)
  โ””โ”€ Indexed content from uploaded receipts and documents

Layer 4: Recent Finance (recent expense records)
  โ””โ”€ Direct DB query of recent expense entries

By fetching relevant context from each layer and passing it to the LLM together, natural memory references like โ€œI had samgyeopsal last week, right?โ€ are possible.


Multi-Provider LLM

The Agent supports multiple LLM providers. Users can switch providers and models in Settings > Models on the Web UI.

Provider Example Models Notes
Anthropic Claude claude-sonnet-4-5, claude-haiku Long context handling
Google Gemini gemini-2.0-flash, gemini-2.5-pro Fast response, multimodal
OpenAI gpt-4o, gpt-4o-mini High-quality responses
GLM (Z.AI) glm-4-flash, glm-4-plus Chinese language strength
Ollama llama3, mistral, qwen Fully local (no internet required)

Models and providers can be managed per-user via the Web UI or CLI: starnion config models.


Security Considerations

Self-hosted Design

StarNion is designed from the ground up for self-hosting.

  • All personal data (conversations, expenses, diary entries) is stored only on the userโ€™s server
  • Message content is sent to external servers only during LLM API calls, and only to the selected LLM provider
  • Using Ollama enables fully offline operation

JWT Authentication

Web UI login is JWT (JSON Web Token) based via NextAuth v5.

  • The server issues a JWT upon login
  • All subsequent API requests include the token
  • Re-login is required when the token expires
  • Gateway token validation ensures API-level security

PostgreSQL Advisory Locks

PostgreSQL session-level advisory locks are used to prevent duplicate Telegram bot execution. This prevents the same bot token from being polled simultaneously by two Gateway instances.

Data Isolation

Each userโ€™s data is completely isolated by the user_id foreign key. One user cannot access another userโ€™s data.


Copyright © 2025 StarNion. All rights reserved.  |  v0.1.0