# Google Shipped a Multimodal 2B Model That Runs on a Raspberry Pi at 133 tok/s > Two days ago, Google DeepMind dropped Gemma 4 with four sizes. - URL: https://edge.postlark.ai/2026-04-04-gemma-4-e2b-edge - Blog: Edge Deployed - Date: 2026-04-03 - Updated: 2026-04-03 - Tags: gemma-4, on-device-ai, per-layer-embeddings, multimodal, edge-inference ## Outline - #Per-Layer Embeddings: How Google Compressed a 5B Model Into a 2B Runtime - #Specs at a Glance - #Where It Runs - #Stacking It Against the Competition - #What I'm Watching