Product · March 20, 2026 · 7 min read

What 'private AI' actually means when the model is on the phone

Privacy isn't a checkbox in your settings screen. Here's how on-device inference changes the threat model — and the four places it can still leak.


Every AI app says it cares about privacy. Most of them mean: *we have a privacy policy*. A native AI app with on-device inference can mean something stronger: your prompts and the model's responses never touch a server.

But 'on-device' is not a magic word. There are four places a native AI app can still leak, and you have to design around all of them before you earn the privacy claim.

1. Telemetry. The single most common failure. You ship a beautiful local model, then send a `user_sent_message` analytics event with the message length, sentiment score, and detected intent. Now you've reconstructed the conversation server-side without the words. The fix: aggregate everything client-side, ship cohort-level metrics on a schedule, and never send per-event data.
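The aggregation approach above can be sketched in a few lines. This is an illustrative sketch, not a real analytics SDK; the `MetricsAggregator` name, the bucket thresholds, and the `flush` schedule are all assumptions.

```python
from collections import Counter

class MetricsAggregator:
    """Accumulates usage locally; exports only bucketed, cohort-level totals."""

    def __init__(self):
        self._buckets = Counter()

    def record_message(self, length: int) -> None:
        # Bucket instead of storing exact values: no per-event records,
        # and no content-derived fields like sentiment or intent.
        if length < 50:
            bucket = "short"
        elif length < 500:
            bucket = "medium"
        else:
            bucket = "long"
        self._buckets[bucket] += 1

    def flush(self) -> dict:
        # Called on a schedule (say, weekly), never per event.
        snapshot = dict(self._buckets)
        self._buckets.clear()
        return snapshot

agg = MetricsAggregator()
for n in (12, 340, 4000, 30):
    agg.record_message(n)
print(agg.flush())  # {'short': 2, 'medium': 1, 'long': 1}
```

The point of the design is that the exact message lengths never exist outside the device, so there is nothing per-event to subpoena, breach, or correlate.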

2. Crash logs. Your stack trace will cheerfully include the last prompt in a buffer somewhere. Scrub before upload, or — better — keep crash reports fully local and let the user opt in to share specific ones.
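A scrubbing pass could look like the following sketch. The field names (`last_input`, `breadcrumbs`, and so on) are hypothetical; a real implementation would enumerate whatever prompt-bearing fields its own crash reporter actually emits.

```python
# Fields that may carry prompt or response text (hypothetical names).
SENSITIVE_KEYS = {"last_input", "prompt", "response", "breadcrumbs"}

def scrub(report: dict) -> dict:
    """Return a copy of a crash report with prompt-bearing fields redacted."""
    clean = {}
    for key, value in report.items():
        if key in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            # Recurse so nested payloads get the same treatment.
            clean[key] = scrub(value)
        else:
            clean[key] = value
    return clean

report = {
    "signal": "SIGSEGV",
    "last_input": "my medical history...",
    "context": {"prompt": "something sensitive"},
}
print(scrub(report))
```

An allowlist (keep only known-safe fields) is stricter than this blocklist and is the safer default when you can't enumerate every place prompt text might land.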

3. The escalation path. When the on-device model bails out and you hit a cloud LLM, the user needs to *know* and *consent* in that moment. A small badge isn't enough. The handoff should feel like a deliberate gear shift, not a silent fallback.
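One way to make the handoff structurally deliberate is to put the consent prompt in the only code path that can reach the cloud. A minimal sketch, assuming hypothetical `EscalationGate` and `ask_user` names rather than any real framework API:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class EscalationGate:
    # UI hook that blocks on an explicit, in-the-moment dialog.
    ask_user: Callable[[str], bool]

    def escalate(self, query: str,
                 cloud_call: Callable[[str], str]) -> Optional[str]:
        # Consent is requested per escalation: no silent fallback,
        # no remembered blanket opt-in.
        if not self.ask_user("Send this conversation to a cloud model?"):
            # Stay local; the caller shows a degraded but private answer.
            return None
        return cloud_call(query)

gate = EscalationGate(ask_user=lambda msg: False)  # simulate "user declined"
result = gate.escalate("complex query", cloud_call=lambda q: "cloud answer")
print(result)  # None: nothing left the device
```

Because the network call only exists inside `escalate`, a refactor can't quietly add a fallback path that skips the dialog.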

4. The keyboard. Third-party keyboards see every keystroke. You can't fix this for the user, but you can detect it and surface a gentle warning when they're using one for a sensitive conversation.

Get those four right and you have something genuinely defensible. Get any one of them wrong and the on-device story is marketing.