Receipt Scanning Almost Killed My MVP
Two weeks nearly lost to one 'cool' feature I parked in P0. How a single question in a 7-lens PRD framework caught it before a line of code shipped.
The short version
Stoka had a 2-week runway to a B2B pitch. My v1 PRD had receipt OCR as the headline P0 feature. One lens in the Observation Haki framework — 'which single feature is non-negotiable for the AHA moment?' — forced me to realize OCR wasn't it. AI meal suggestions was. Reshuffled, shipped on time.
Stoka — my PWA for tracking what’s in your fridge plus AI recipe suggestions — had a hard deadline: ~2 weeks to a B2B partnership pitch. Ship or miss. No room for scope bloat, but my v1 PRD was already bloated before a single line of code was written.
PRD v1: OCR in P0
The feature I put at the top: receipt scanning. I pictured the flow: user comes home with groceries, snaps the receipt, AI extracts item names + quantities, auto-populates the fridge inventory. Zero typing. 20 items added in 5 seconds.
The rationale I wrote in the PRD: “maximum convenience, differentiates from basic reminder apps.”
It sounded convincing. So I passed it to Shaka.
Shaka satellite, Guided mode
Shaka is my PRD-writing satellite — one of the specialized agent skills in Stella Protocol. For tight-deadline projects I usually use Shaka Express — 10 minutes, done. This time I picked Guided mode because something in the PRD felt off, and I couldn’t articulate what.
Guided mode runs through Observation Haki — 7 lenses that each force a question you want to skip. (Deep-dive per lens in the seven lenses post.)
Lenses 1–3 passed without drama. Pain was clear, Victory was clear, and Boundary was already written. Then Lens #4 asked:
Shape — Which single feature is non-negotiable for the first user to hit the AHA moment?
I paused.
Instinct answer: OCR. So that the scan is fast. So that the user doesn't have to type.
But I forced myself to answer again from the user’s angle, not the feature list’s. A first-time user opens Stoka not for tracking — tracking is work, not a reward. A user opens Stoka because it’s 6pm, they’re in the kitchen, and they don’t know “what should I cook tonight with what’s in the fridge right now.”
The AHA sits in AI meal suggestions. OCR just lowers friction for tracking — which may not even happen in the first session. A user can hit AHA by typing 5 items manually and seeing a recipe immediately.
Reshuffle
The call became:
- OCR: P0 → P1. Ships in v0.2 if demand is proven.
- AI meal suggestions: stays P0. This is the core loop.
- Onboarding wizard: P1 → P0. If OCR is skipped, the user has to be able to track manually as fast as possible. A 3-screen wizard with 5 pre-filled sample items so AI recipe suggestions fire on the first session.
Activation metric didn’t change: user tracks ≥5 items AND views ≥1 AI recipe within 7 days. OCR doesn’t affect this metric — the user can still hit 5 items manually. Just slower. And “slower” is a P1 problem, not a P0 one.
Impact math
The reason this reshuffle wasn’t just philosophical: OCR’s concrete costs.
- Option A: Google Vision API — $1.50 per 1K requests. The MVP is on free tier, so this lands in pre-revenue unit economics. A concern, not a killer, but it needs an infra call (rate limiting, caching already-scanned receipts, etc.).
- Option B: TensorFlow.js client-side — bundle size +18MB for the OCR model. This is a PWA. Battery drain. Install friction on thin mobile data, especially for the UAE/GCC target users who are often on the go.
Both need decisions, and decisions need time. Cutting OCR from P0 = ~2 weeks saved on build + infra research. Which is exactly the time I needed to finish AI recipes and onboarding properly.
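The Option A math is easy to make tangible. A back-of-envelope sketch; only the $1.50 per 1K price comes from the post, while user counts, scan frequency, and cache rates are assumptions for illustration:

```python
def monthly_ocr_cost(users: int, scans_per_user: float,
                     price_per_1k: float = 1.50,
                     cache_hit_rate: float = 0.0) -> float:
    """Back-of-envelope monthly Google Vision OCR cost in dollars.
    Only price_per_1k ($1.50 per 1K requests) is from the post;
    everything else here is an assumed scenario."""
    billable = users * scans_per_user * (1 - cache_hit_rate)
    return billable / 1000 * price_per_1k

# Assumed scenario: 1,000 users scanning ~8 receipts/month, no caching:
# monthly_ocr_cost(1000, 8) -> 12.0 (dollars/month)
```

The absolute number is small, which is why it's "a concern, not a killer": the real cost isn't the API bill, it's the engineering time to decide on rate limiting and caching before launch.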
Without Lens 4, I would have spent 2 weeks on a feature that did not gate activation. MVP ships late. Pitch missed.
Sidebar: 7 lenses of Observation Haki
Quick reference, not a deep-dive:
- Pain — What friction does the user feel today without your product?
- Victory — What does “winning” look like? An observable state, not a vague goal.
- Boundary — What are you explicitly not building?
- Shape — The single non-negotiable feature for AHA?
- Skeleton — The minimum architecture to support Shape + Victory?
- Surface — What does the user see first?
- Fog — What you don’t yet know, and the cheapest way to de-risk it?
Sequential order matters: Pain + Victory establish the why. Boundary + Shape establish the what. Skeleton establishes the how. Surface + Fog handle the unknowns.
If you want why it’s 7 and not 5, plus concrete examples per lens, head to the seven lenses post.
What I’m taking away
A “cool” feature isn’t always an AHA feature. OCR is cool. OCR is impressive in a demo. OCR does not bring users back on day 2. What brings users back is recipe suggestions that match what’s actually in their fridge tonight.
My builder instinct says “automate first so it’s smooth.” My PM instinct should say “AHA first, automation later.” These two instincts clash often, and in a PRD without structure, the builder instinct wins because it’s louder.
Observation Haki isn't magic. It's 7 questions you're forced to type out answers to. But one of those 7 questions, Lens 4, caught a prioritization mistake I would have skipped past if the question hadn't been asked out loud.
Key Takeaways
- Demo-impressive is not AHA-impressive. OCR wowed in concept; AI meal suggestions actually made the user come back tomorrow. Write the PRD in terms of what drags users into session 2, not what gets a round of applause in session 1.
- Builder instinct out-shouts PM instinct unless you force structure. Without a framework that insists “name the single non-negotiable feature for AHA,” the louder voice wins. The question isn’t optional — it’s the compass.
- Every P0 feature has a concrete cost attached. OCR’s was ~2 weeks + infra decisions. When you write “must-have,” write the hours it costs next to it. A lot of P0 shrinks the moment the number is visible.
Satellite: Shaka (PRD) · Morgans (this post) · Pipeline: DISCOVERY — Observation Haki → Morgans