Google Releases LiteRT-LM Framework Powering On-Device AI in Chrome, Chromebook Plus, and Pixel Watch

  • Sep 26, 2025
  • 1 min read

Google has released LiteRT-LM, the production-ready inference framework that enables on-device deployment of large language models like Gemini Nano across hundreds of millions of devices including Chrome browsers, Chromebooks, and Pixel Watches.


Technical Architecture:

  1. Engine/Session Design: A singleton Engine manages shared resources while individual Sessions handle stateful conversations, letting multiple AI features share a single foundation model with task-specific LoRA adapters.

  2. Cross-Platform Support: Deploys across Android, Linux, macOS, Windows, and Raspberry Pi with CPU, GPU, and NPU hardware acceleration through the underlying LiteRT runtime.

  3. Memory Optimization: Features session cloning, copy-on-write KV-cache, and context switching to minimize memory footprint while enabling sub-second time-to-first-token latency.

  4. Modular Components: The open-source framework allows custom pipelines to be assembled from core modules such as the executor, tokenizer, and sampler for resource-constrained deployments.
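The Engine/Session split and copy-on-write KV-cache described above can be sketched in a few dozen lines of C++. This is a conceptual illustration only; the class names, methods, and the byte-sized "cache" stand-in are assumptions, not the real LiteRT-LM API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Shared, immutable resources owned by the singleton Engine.
struct FoundationModel {
    std::string name;
};

// Stand-in for cached key/value tensors. Held behind a shared_ptr so
// cloned Sessions can share it copy-on-write style.
struct KvCache {
    std::vector<int> tokens;
};

class Session {
public:
    Session(std::shared_ptr<const FoundationModel> model, std::string lora)
        : model_(std::move(model)), lora_adapter_(std::move(lora)),
          cache_(std::make_shared<KvCache>()) {}

    // Cloning is cheap: the KV-cache pointer is shared, not deep-copied.
    Session Clone() const { return *this; }

    // Appending triggers a deep copy only if the cache is still shared.
    void Append(int token) {
        if (cache_.use_count() > 1) {
            cache_ = std::make_shared<KvCache>(*cache_);  // copy-on-write
        }
        cache_->tokens.push_back(token);
    }

    size_t CachedTokens() const { return cache_->tokens.size(); }
    const std::string& LoraAdapter() const { return lora_adapter_; }

private:
    std::shared_ptr<const FoundationModel> model_;
    std::string lora_adapter_;  // task-specific adapter, per Session
    std::shared_ptr<KvCache> cache_;
};

// One Engine per process owns the model; Sessions are cheap and stateful.
class Engine {
public:
    static Engine& Instance() {
        static Engine engine;
        return engine;
    }
    Session NewSession(const std::string& lora_adapter) {
        return Session(model_, lora_adapter);
    }

private:
    Engine()
        : model_(std::make_shared<FoundationModel>(
              FoundationModel{"foundation-model"})) {}
    std::shared_ptr<const FoundationModel> model_;
};
```

Under this pattern, cloning a session to fork a conversation costs almost nothing up front; the memory price is paid only when one of the clones diverges by generating new tokens.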


Production Deployments: The framework currently powers Web AI in Chrome through its built-in AI APIs, AI capabilities for tab management and text analysis on Chromebook Plus, and the Smart Replies feature on Pixel Watch. Each deployment demonstrates the system's scalability, from high-performance multi-task environments to severely resource-constrained wearables.


Developer Access: For the first time, Google provides direct access to LiteRT-LM's C++ interface, enabling developers to build custom high-performance AI pipelines. The framework complements existing high-level APIs, including the MediaPipe LLM Inference API and the Chrome Built-in AI APIs.
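A custom pipeline built from the modular components mentioned earlier (tokenizer, executor, sampler) might look like the following. This is a hedged sketch of the composition pattern, not LiteRT-LM's actual C++ API; the interfaces, the toy byte-level tokenizer, and the zero-logit executor are all illustrative assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy byte-level tokenizer: one token per byte. Illustrative only.
struct Tokenizer {
    std::vector<int> Encode(const std::string& text) const {
        return std::vector<int>(text.begin(), text.end());
    }
    std::string Decode(const std::vector<int>& ids) const {
        return std::string(ids.begin(), ids.end());
    }
};

// Stand-in for the model forward pass; returns uniform logits.
struct Executor {
    std::vector<float> Logits(const std::vector<int>& /*context*/) const {
        return std::vector<float>(256, 0.0f);
    }
};

// Greedy sampler: picks the highest-scoring token id.
struct GreedySampler {
    int Pick(const std::vector<float>& logits) const {
        int best = 0;
        for (int i = 1; i < static_cast<int>(logits.size()); ++i) {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    }
};

// A custom pipeline is just the three modules composed in a loop:
// encode the prompt, run the executor, sample, repeat, then decode.
std::string Generate(const Tokenizer& tok, const Executor& exec,
                     const GreedySampler& sampler,
                     const std::string& prompt, int max_tokens) {
    std::vector<int> ids = tok.Encode(prompt);
    for (int i = 0; i < max_tokens; ++i) {
        ids.push_back(sampler.Pick(exec.Logits(ids)));
    }
    return tok.Decode(ids);
}
```

The value of this decomposition for constrained devices is that each module can be swapped independently: a smaller tokenizer, an NPU-backed executor, or a cheaper sampler, without rewriting the generation loop.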


Market Impact: LiteRT-LM addresses the fundamental challenge of deploying gigabyte-scale models across diverse edge hardware while maintaining offline availability and eliminating per-API-call costs for high-frequency AI tasks.

