The roadmap.
Our vision for a high-throughput future.
Focusing on optimization, resilience, and scale.
Phase 1: Storage Optimization & Idempotency
Byte-based Batching
Solving the "Small Files Problem"
The Problem
Currently, logs are batched by count. At scale, this generates millions of tiny files in S3, causing high API costs and slow listing and sync operations.
The Solution
Shifting to size-based batching (5 MB to 10 MB per file). A single GZIP file can hold tens of thousands of logs, reducing the number of S3 objects and the associated API request costs by up to 99%.
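A minimal sketch of what size-based flushing could look like in Go. The Batcher type and its method names are hypothetical, and the threshold is measured on uncompressed bytes purely as an assumption; the roadmap does not specify either.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// Batcher buffers log lines and signals a flush once the buffered
// (uncompressed) size reaches maxBytes. Hypothetical type for illustration.
type Batcher struct {
	maxBytes int
	size     int
	lines    [][]byte
}

func NewBatcher(maxBytes int) *Batcher {
	return &Batcher{maxBytes: maxBytes}
}

// Add buffers one log line and reports whether the batch should be flushed.
func (b *Batcher) Add(line []byte) bool {
	b.lines = append(b.lines, line)
	b.size += len(line) + 1 // +1 for the newline written in Flush
	return b.size >= b.maxBytes
}

// Flush gzips all buffered lines into one object body, returns it together
// with the number of lines it contains, and resets the batch.
func (b *Batcher) Flush() ([]byte, int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	for _, l := range b.lines {
		if _, err := zw.Write(l); err != nil {
			return nil, 0, err
		}
		if _, err := zw.Write([]byte{'\n'}); err != nil {
			return nil, 0, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, 0, err
	}
	n := len(b.lines)
	b.lines, b.size = nil, 0
	return buf.Bytes(), n, nil
}

func main() {
	b := NewBatcher(5 << 20) // flush at 5 MB of raw log data
	line := []byte(`{"level":"INFO","msg":"request served"}`)
	for !b.Add(line) {
	}
	body, n, err := b.Flush()
	if err != nil {
		panic(err)
	}
	fmt.Printf("flushed %d logs into %d compressed bytes\n", n, len(body))
}
```

In a real pipeline the flush would likely also fire on a timer, so a quiet service does not hold logs indefinitely while waiting to reach the size threshold.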
Idempotent Syncing
Solving "Double Processing"
The Problem
If the Syncer crashes before deleting a processed file from S3, the next cycle will duplicate those logs.
The Solution
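Migrating to the ClickHouse ReplacingMergeTree engine. Each log gets a unique fingerprint that is part of the sorting key, so ClickHouse collapses duplicate rows during background merges. Deduplication is eventual, so queries that need exact results before a merge has run can use FINAL.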
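One way the fingerprint could be generated, sketched in Go. The LogEntry fields, the choice of SHA-256, and the table schema are all assumptions for illustration; the roadmap does not define which fields make a log unique.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// LogEntry is a hypothetical log record.
type LogEntry struct {
	Service   string
	Timestamp int64 // Unix nanoseconds
	Message   string
}

// Fingerprint hashes the fields assumed to identify a log line. Re-syncing
// the same S3 file yields the same fingerprints, so the rows share a sorting
// key and ReplacingMergeTree can collapse them.
func Fingerprint(e LogEntry) string {
	h := sha256.New()
	// NUL separators keep ("ab","c") and ("a","bc") from hashing identically.
	h.Write([]byte(e.Service))
	h.Write([]byte{0})
	h.Write([]byte(strconv.FormatInt(e.Timestamp, 10)))
	h.Write([]byte{0})
	h.Write([]byte(e.Message))
	return hex.EncodeToString(h.Sum(nil))
}

// createTable is an assumed schema: rows with an identical ORDER BY key are
// treated as duplicates by ReplacingMergeTree.
const createTable = `
CREATE TABLE IF NOT EXISTS logs (
    fingerprint String,
    service     String,
    ts          DateTime64(9),
    message     String
) ENGINE = ReplacingMergeTree
ORDER BY (service, ts, fingerprint)`

func main() {
	e := LogEntry{Service: "api", Timestamp: 1700000000000000000, Message: "upstream timeout"}
	fmt.Println(Fingerprint(e))
	fmt.Println(createTable)
}
```

With this field choice, two genuinely distinct logs that share service, timestamp, and message would also collapse into one row. Adding a per-log sequence number or request ID to the hash would avoid that, if the SDK can supply one.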
Phase 2: Resilience & Latency Reduction
Parallel Syncer
Solving "Explosive Backlogs"
The Problem
During database downtime, massive S3 backlogs accumulate. A single-threaded syncer might take days to catch up.
The Solution
Implementing a multi-worker pool in Go with backpressure control to parallelize downloads and inserts, clearing backlogs in minutes rather than days.
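A rough sketch of such a pool, assuming a bounded channel is the backpressure mechanism. SyncAll, its parameters, and errDownload are hypothetical names; the process callback stands in for the real download-and-insert step.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDownload is a placeholder error used only by the demo in main.
var errDownload = errors.New("download failed")

// SyncAll processes every key with a fixed number of workers. The jobs
// channel is bounded: once `queue` keys are waiting, the producer blocks,
// so listing S3 never runs far ahead of what the workers can insert.
func SyncAll(keys []string, workers, queue int, process func(key string) error) (done int, errs []error) {
	jobs := make(chan string, queue)
	var (
		wg sync.WaitGroup
		mu sync.Mutex // guards done and errs
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range jobs {
				err := process(key)
				mu.Lock()
				if err != nil {
					errs = append(errs, fmt.Errorf("%s: %w", key, err))
				} else {
					done++
				}
				mu.Unlock()
			}
		}()
	}
	for _, k := range keys {
		jobs <- k // blocks while the queue is full
	}
	close(jobs)
	wg.Wait()
	return done, errs
}

func main() {
	keys := []string{"logs/0001.gz", "logs/0002.gz", "logs/0003.gz", "logs/bad.gz"}
	done, errs := SyncAll(keys, 2, 2, func(key string) error {
		if key == "logs/bad.gz" {
			return errDownload
		}
		return nil // a real worker would download the file and insert its rows
	})
	fmt.Println("synced:", done, "failed:", len(errs))
}
```

Failed keys are collected rather than aborting the run, on the assumption that a file left in S3 is simply retried on the next cycle.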
Hybrid Path & Live Tail
Solving "Near Real-Time Latency"
The Problem
The S3 staging step introduces a 30-60 second delay, which is too slow for live incident response or debugging.
The Solution
Dual-path routing: ERROR logs bypass S3 and stream directly into ClickHouse or the UI for instant visibility.
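The routing decision itself can be very small. The sketch below assumes the log level arrives as a plain string and that both paths sit behind a common interface; Sink, Router, and memSink are hypothetical names, not existing AstraLog types.

```go
package main

import "fmt"

// Log is a hypothetical log record.
type Log struct {
	Level   string
	Message string
}

// Sink is any destination that accepts a log.
type Sink interface {
	Write(Log) error
}

// Router sends ERROR logs down the fast path and everything else to the
// batched S3 staging path.
type Router struct {
	Fast  Sink // direct ClickHouse insert or live-tail stream
	Batch Sink // S3 staging
}

// Route picks a path based on the log level.
func (r Router) Route(l Log) error {
	if l.Level == "ERROR" {
		return r.Fast.Write(l)
	}
	return r.Batch.Write(l)
}

// memSink is an in-memory stand-in for a real sink, used for the demo.
type memSink struct {
	logs []Log
}

func (m *memSink) Write(l Log) error {
	m.logs = append(m.logs, l)
	return nil
}

func main() {
	fast, batch := &memSink{}, &memSink{}
	r := Router{Fast: fast, Batch: batch}
	for _, l := range []Log{
		{Level: "INFO", Message: "request served"},
		{Level: "ERROR", Message: "upstream timeout"},
		{Level: "DEBUG", Message: "cache miss"},
	} {
		if err := r.Route(l); err != nil {
			panic(err)
		}
	}
	fmt.Println("fast path:", len(fast.logs), "batched:", len(batch.logs))
}
```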
Phase 3: The SaaS Ecosystem
AstraLog Cloud
BYOC & Managed
The Problem
Managing ClickHouse and S3 permissions requires dedicated DevOps time.
The Solution
Launching a Management API and Dashboard to deploy Bring Your Own Cloud (BYOC) setups in a few clicks, with a SQL-powered UI.
Dynamic Edge SDKs
Remote Configuration
The Problem
SDKs currently rely on static variables and cannot be throttled during massive traffic spikes.
The Solution
Remote configuration lets Edge functions fetch batching limits dynamically, enabling server-side throttling.
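A sketch of how a client might fetch limits and fall back safely, written in Go for consistency with the syncer even though the SDK languages are not stated here. The Limits fields, the JSON shape, and the endpoint are assumptions; the demo serves the config from a local test server rather than a real Management API.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// Limits is a hypothetical remote configuration payload.
type Limits struct {
	MaxBatchBytes int     `json:"max_batch_bytes"`
	SampleRate    float64 `json:"sample_rate"`
}

// ParseLimits decodes a config body. Missing fields keep the fallback value;
// malformed or out-of-range input returns the fallback unchanged.
func ParseLimits(body []byte, fallback Limits) Limits {
	l := fallback
	if err := json.Unmarshal(body, &l); err != nil {
		return fallback
	}
	if l.MaxBatchBytes <= 0 || l.SampleRate < 0 || l.SampleRate > 1 {
		return fallback
	}
	return l
}

// FetchLimits retrieves limits from url with a short timeout so a slow or
// unavailable config service never blocks logging.
func FetchLimits(ctx context.Context, url string, fallback Limits) Limits {
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return fallback
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fallback
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fallback
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<16))
	if err != nil {
		return fallback
	}
	return ParseLimits(body, fallback)
}

func main() {
	// Local stand-in for the config endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"max_batch_bytes": 1048576, "sample_rate": 0.1}`)
	}))
	defer srv.Close()

	defaults := Limits{MaxBatchBytes: 5 << 20, SampleRate: 1}
	fmt.Printf("%+v\n", FetchLimits(context.Background(), srv.URL, defaults))
}
```

Falling back to built-in defaults on any failure keeps the SDK logging when the config service is down, at the cost of ignoring a throttle during that window.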