smile: a high-performance JVM machine learning framework with integrated LLM inference and an agentic data science IDE
What it solves
SMILE is a high-performance machine learning framework for the JVM. It lets developers apply a wide range of statistical and AI algorithms without leaving the Java, Scala, or Kotlin ecosystems, so data science code can run in the same JVM-based production environment as the rest of an application.
How it works
SMILE is organized into several specialized modules:
- Core ML: Implements standard algorithms for classification, regression, clustering, manifold learning, and anomaly detection.
- Deep Learning & LLMs: Uses a LibTorch backend for GPU/CPU tensor operations and provides a full LLaMA-3 inference stack, including BPE tokenizers and an OpenAI-compatible REST server.
- NLP: Offers tools for text normalization, POS tagging, stemming, and relevance ranking.
- Base: Provides the foundational math, linear algebra, and data structures (like DataFrames) required for ML.
- Visualization: Includes Swing-based interactive plots and declarative Vega-Lite charts.
- SMILE Studio: An agentic IDE that lets users work with data through natural-language prompts, with support for Python, Java, and Scala.
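As an illustration of what "OpenAI-compatible" usually implies for a client, the sketch below builds a chat completion request with the JDK's own HTTP classes (Java 11+). It is a minimal sketch under stated assumptions, not SMILE's documented API: the base URL `http://localhost:8080`, the model name `llama3`, and the `/v1/chat/completions` path (the OpenAI convention) are all assumptions, and the request is only constructed, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class ChatRequestSketch {
    /** Escapes the characters that must be escaped inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds an OpenAI-style chat completion body with a single user message. */
    static String chatBody(String model, String userMessage, boolean stream) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
             + "\"messages\":[{\"role\":\"user\",\"content\":\"" + jsonEscape(userMessage) + "\"}],"
             + "\"stream\":" + stream + "}";
    }

    /** Wraps the body in a POST request; the path follows the OpenAI convention (assumed here). */
    static HttpRequest chatRequest(String baseUrl, String body) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/chat/completions"))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical host, port, and model name; adjust to the actual server configuration.
        String body = chatBody("llama3", "Summarize the columns of iris.csv", true);
        HttpRequest request = chatRequest("http://localhost:8080", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

With `stream` set to true, OpenAI-style servers typically answer with server-sent events, so a real client would read the response line by line rather than as one JSON document.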
Who it’s for
Data scientists and software engineers working within the JVM ecosystem (Java, Scala, Kotlin) who need a robust, production-ready machine learning library with integrated deep learning and LLM capabilities.
Highlights
- Comprehensive Algorithm Suite: Covers supervised methods such as Random Forests and SVMs as well as embedding methods such as t-SNE and UMAP.
- LLM Integration: Native LLaMA-3 inference and an OpenAI-compatible server for chat streaming.
- JVM Native: High-performance implementation with idiomatic APIs for Java, Scala, and Kotlin.
- Agentic IDE: Includes SMILE Studio for natural language data interaction.
- Enterprise Ready: Supports model serialization and integration with Apache Spark ML pipelines.
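The model serialization point can be sketched with plain Java object serialization. This is a minimal sketch that assumes the trained model implements `java.io.Serializable`, which this summary does not confirm; `LinearModel` is a hypothetical stand-in written for the example, not a SMILE class, and the helper names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ModelRoundTrip {
    /** Hypothetical stand-in for a trained model; not part of SMILE. */
    static class LinearModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;
        final double bias;

        LinearModel(double[] weights, double bias) {
            this.weights = weights.clone();
            this.bias = bias;
        }

        /** Returns the dot product of weights and x, plus the bias. */
        double predict(double[] x) {
            double sum = bias;
            for (int i = 0; i < weights.length; i++) sum += weights[i] * x[i];
            return sum;
        }
    }

    /** Serializes any Serializable object to a byte array. */
    static byte[] toBytes(Serializable model) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        }
        return buffer.toByteArray();
    }

    /** Restores an object previously written by toBytes. */
    @SuppressWarnings("unchecked")
    static <T> T fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LinearModel trained = new LinearModel(new double[] {2.0, 3.0}, 0.5);
        LinearModel restored = fromBytes(toBytes(trained));
        // The restored copy should give the same prediction as the original.
        System.out.println(restored.predict(new double[] {1.0, 2.0}));
    }
}
```

In a deployment, the byte array would normally be written to a file or object store; only deserialize data from a trusted source, since Java deserialization of untrusted input is unsafe.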
Sources
- haifengl/smile