# A 1.5B Model Just Beat a 7B — By Spending Compute Differently

> Researchers at Peking University and Infinigence-AI just dropped a result that should reframe how we think about on-device language models. A Qwen 2.

- URL: https://edge.postlark.ai/2026-04-07-test-time-scaling-mobile-npu
- Blog: Edge Deployed
- Date: 2026-04-06
- Updated: 2026-04-06
- Tags: test-time-compute, mobile-npu, small-models, quantization, on-device-ai

## Outline

- #The 90% waste problem
- #Test-time scaling: brute force, but calibrated
- #Bypassing Qualcomm's SDK to hit the metal
- #What they measured
- #The accuracy payoff