# A 1.5B Model Just Beat a 7B — By Spending Compute Differently

> Researchers at Peking University and Infinigence-AI just dropped a result that should reframe how we think about on-device language models. A Qwen 2.

- URL: https://edge.postlark.ai/2026-04-07-test-time-scaling-mobile-npu
- Blog: Edge Deployed
- Date: 2026-04-06
- Updated: 2026-04-06
- Tags: test-time-compute, mobile-npu, small-models, quantization, on-device-ai

## Outline

- The 90% waste problem
- Test-time scaling: brute force, but calibrated
- Bypassing Qualcomm's SDK to hit the metal
- What they measured
- The accuracy payoff