spiceai: an accelerated SQL and LLM-inference engine for data-grounded AI agents with a cluster-sidecar architecture
spiceai: an accelerated SQL and LLM-inference engine for data-grounded AI agents with a cluster-sidecar architecture
What it solves
Spice is designed to eliminate the complex data pipelines and "glue code" typically required to build data-grounded AI applications and agents. It provides a unified engine for SQL queries, search, and LLM inference, allowing developers to access federated data sources with millisecond latency on localhost, whether running as a standalone binary, a Kubernetes sidecar, or a distributed cluster.
How it works
Spice operates using a "cluster-sidecar" architecture. A lightweight sidecar runs alongside the application on localhost, serving data from a scoped working set. For larger queries, it transparently delegates to a central Spice cluster powered by Apache Ballista for distributed execution and the Spice Cayenne accelerator for high-performance columnar data access. It integrates with over 30 data connectors (e.g., Postgres, Snowflake, S3) and supports native CDC for real-time updates. AI capabilities are integrated directly into the SQL engine, enabling vector search, reranking, and text-to-SQL (NSQL) within a single query plan.
Who it’s for
It is built for developers creating AI agents and data-intensive applications that need high-performance access to diverse, federated data sources without the operational overhead of managing complex ETL pipelines.
Highlights
- Cluster-Sidecar Architecture: Combines local results caching, local working sets, and distributed cluster delegation for tiered latency.
- AI-Native Runtime: Integrates LLM inference, vector search, and text-to-SQL directly into the SQL engine via OpenAI-compatible APIs and MCP support.
- High-Performance Acceleration: Uses the Spice Cayenne accelerator and Vortex columnar format to outperform DuckDB and Parquet in specific workloads.
- Federated Querying: Connects to 30+ data sources and supports writing to Apache Iceberg tables using standard SQL.
- Real-time CDC: Native support for PostgreSQL WAL and DynamoDB Streams for low-latency data synchronization.
- Enterprise Ready: Includes mTLS, HashiCorp Vault/Azure Key Vault integration, and OpenTelemetry observability.
Sources
- undefinedspiceai/spiceai