On-Device ML Infrastructure

Everyone has a device. Not everyone has a cloud.

Three billion phones already have the compute to run ML models, serve inference, and host autonomous agents. What they've never had is the infrastructure. Dust is that infrastructure — the serving layer underneath that everything else stands on.

Think CUDA, but for the three billion devices that already exist.

v0.1.0 · 15 modules · 3 platform tiers · Apache 2.0

Device Unified Serving Toolkit

15 Open-source modules
3 Platform tiers
0 Cloud dependencies

Demo

Qwen running on-device via Dust — no server, no cloud.

Run it yourself in 3 commands

  1. git clone https://github.com/rogelioRuiz/dust-llm-capacitor.git
  2. cd dust-llm-capacitor
  3. npm install && npm run test:ios

Demo

YOLO object detection running on-device via Dust on Android — no server, no cloud.

Run it yourself in 3 commands

  1. git clone https://github.com/rogelioRuiz/dust-onnx-capacitor.git
  2. cd dust-onnx-capacitor
  3. npm install && npm run test:android

Features

Built for shipping on-device inference, not stitching together runtime fragments.

Dust keeps bridge code thin, runtime code native, and contracts portable, so the same serving model spans app shells and platforms.

On-Device Inference

Run locally with native session managers, memory-aware lifecycle controls, and offline-friendly model access.

Multi-Platform

Ship one architecture across Capacitor apps, Android/JVM libraries, and iOS/macOS packages.

Modular Architecture

Adopt the exact layer you need: bridge, runtime, or shared contract package, without dragging the full stack in.

GGUF & ONNX Support

Cover llama.cpp GGUF workflows and ONNX Runtime pipelines from the same project family.

Embeddings & Vector Search

Tokenizers, embedding services, and vector store contracts are first-class pieces of the platform model.
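To make the vector-search idea concrete, here is a minimal in-memory sketch. The `VectorStore` name comes from the platform contracts above, but this class, its method names, and the cosine-similarity ranking are purely illustrative — not the Dust API:

```typescript
// Minimal in-memory vector store sketch: cosine similarity over
// fixed-dimension embeddings. Names are illustrative only.
type SearchHit = { id: string; score: number };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class InMemoryVectorStore {
  private vectors = new Map<string, number[]>();

  upsert(id: string, embedding: number[]): void {
    this.vectors.set(id, embedding);
  }

  // Return the top-k entries ranked by cosine similarity to the query.
  search(query: number[], k: number): SearchHit[] {
    const hits: SearchHit[] = [];
    for (const [id, v] of this.vectors) {
      hits.push({ id, score: cosine(query, v) });
    }
    return hits.sort((a, b) => b.score - a.score).slice(0, k);
  }
}

const store = new InMemoryVectorStore();
store.upsert("doc-a", [1, 0, 0]);
store.upsert("doc-b", [0, 1, 0]);
const top = store.search([0.9, 0.1, 0], 1);
console.log(top[0].id); // "doc-a"
```

In Dust, the embedding vectors would come from an on-device embedding model rather than being hard-coded, and the store contract would be backed by a native implementation.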

Model Lifecycle Management

Registries, downloads, verification, status transitions, and ref-counted sessions stay inside the framework.

Architecture

Thin bridges on top, platform-native runtimes underneath, shared contracts at the core.

The bridge layer exposes app-facing APIs while Kotlin and Swift libraries own actual inference, downloads, and model serving behavior.

Bridge Layer
capacitor-core · capacitor-embeddings · capacitor-llm · capacitor-onnx · capacitor-serve
Native Runtimes

Kotlin / Android

dust-embeddings-kotlin · dust-llm-kotlin · dust-onnx-kotlin · dust-serve-kotlin

Swift / Apple

dust-embeddings-swift · dust-llm-swift · dust-onnx-swift · dust-serve-swift
Core Contracts
dust-core-kotlin · dust-core-swift (ModelServer, ModelSession, VectorStore, EmbeddingService)
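Rendered in TypeScript, the shapes implied by those contract names might look roughly like this. This is a sketch of the concepts only — the actual dust-core-kotlin and dust-core-swift signatures will differ:

```typescript
// Hypothetical TypeScript rendering of the core contract names.
// Signatures are illustrative; see the dust-core packages for the real ones.
interface EmbeddingService {
  embed(texts: string[]): Promise<number[][]>;
}

interface VectorStore {
  upsert(id: string, embedding: number[]): Promise<void>;
  search(query: number[], k: number): Promise<{ id: string; score: number }[]>;
}

interface ModelSession {
  generate(prompt: string): Promise<string>;
  close(): Promise<void>;
}

interface ModelServer {
  open(modelId: string): Promise<ModelSession>;
}

// A trivial mock server, just to show how the contracts compose.
const mockServer: ModelServer = {
  async open(modelId) {
    return {
      async generate(prompt) { return `[${modelId}] echo: ${prompt}`; },
      async close() {},
    };
  },
};

(async () => {
  const session = await mockServer.open("demo-model");
  console.log(await session.generate("hi")); // "[demo-model] echo: hi"
  await session.close();
})();
```

The point of keeping these contracts in a shared core package is that a bridge plugin, an Android library, and a Swift package can all target the same serving model.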

Ecosystem

Fifteen packages. One serving model.

The ecosystem spans bridge packages, platform-native runtimes, and portable core contracts designed to compose cleanly.

Capacitor

capacitor-core

Shared bridge-facing ML contracts, interfaces, and value types for Dust plugins.

View on GitHub →
Capacitor

capacitor-embeddings

On-device embedding model tokenization and inference for Capacitor apps.

View on GitHub →
Capacitor

capacitor-llm

Capacitor JavaScript bridge for on-device GGUF and llama.cpp inference.

View on GitHub →
Capacitor

capacitor-onnx

Capacitor JavaScript bridge for ONNX Runtime loading, preprocessing, and inference.

View on GitHub →
Capacitor

capacitor-serve

Downloads, registry, sessions, and model lifecycle management for Capacitor hosts.

View on GitHub →
Kotlin

dust-core-kotlin

Core contracts and service abstractions for Android and JVM consumers.

View on GitHub →
Kotlin

dust-embeddings-kotlin

Embedding and tokenizer runtime primitives for Android.

View on GitHub →
Kotlin

dust-llm-kotlin

Kotlin-native LLM runtime and GGUF session integrations.

View on GitHub →
Kotlin

dust-onnx-kotlin

ONNX Runtime session management and tensor preprocessing for Android.

View on GitHub →
Kotlin

dust-serve-kotlin

Model registry, downloads, and session lifecycle services for Android.

View on GitHub →
Swift

dust-core-swift

Core protocol and contract types for Swift consumers.

View on GitHub →
Swift

dust-embeddings-swift

Tokenizers and embedding runtime primitives for iOS and macOS.

View on GitHub →
Swift

dust-llm-swift

GGUF, llama.cpp inference, and chat runtime for iOS and macOS.

View on GitHub →
Swift

dust-onnx-swift

ONNX Runtime session management and preprocessing for Apple platforms.

View on GitHub →
Swift

dust-serve-swift

Model registry, downloads, and serving lifecycle services for Swift apps.

View on GitHub →

Get Started

Drop into the layer you need, from a bridge package to a native core library.

Pick the layer you need and install from the published registry.

# Capacitor bridge packages
npm install dust-core-capacitor dust-llm-capacitor dust-serve-capacitor
npx cap sync

If you know the cloud stack

Dust is the on-device equivalent of the tools you already use.

Every piece of the cloud ML stack has a device-native counterpart in Dust — same concepts, no server required.

| Cloud | Dust equivalent |
| --- | --- |
| CUDA | dust-core |
| vLLM / Triton | dust-serve |
| TensorRT | dust-onnx |
| vLLM engine | dust-llm |
| TEI | dust-embeddings |