Status
Spec only. Canthus v1 ships a deterministic fuzzy-only suggestion pipeline. This page captures the contract a future embeddings reranker must satisfy. Treat the acceptance criteria as binding for any PR that turns the reranker on.
Why embeddings might be added later
Fuzzy matching is strong on titles that share tokens. It is weak on paraphrases (“walking the pup” vs “walk dog”) and on cross-lingual or compound activities. An on-device sentence encoder can rerank the top-K fuzzy candidates to recover those cases. The reranker is opt-in: the suggestion pipeline calls a Reranker interface whose default binding is the identity. Turning embeddings on is a one-line wiring change once the assets and storage are in place.
Constraints
- Fully offline. The model and tokenizer ship in the app binary.
- Deterministic. Same inputs produce the same vector.
- Bounded latency. P95 reranking of top-25 candidates under 80 ms on mid-range Android.
- No regression for the fuzzy-only path when the reranker is off.
Storage
Model identity
A model is identified by a single string of the form {architecture}-{revision}-{quant}, for example "all-minilm-l6-v2-r4-q8".
The active modelId is stored in shared_preferences under the key embeddingModelId. Missing or unrecognised values force a re-encode at next launch.
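A minimal sketch of the launch-time comparison, in plain Dart with no plugin dependencies. The names ModelId, parseModelId and needsReencode are hypothetical. The parsing rule is an assumption inferred from the example id: the last two hyphen-separated segments are taken as revision and quant, and everything before them as the architecture.

```dart
/// Hypothetical value type for a parsed model id; not part of the spec.
class ModelId {
  final String architecture;
  final String revision;
  final String quant;

  const ModelId(this.architecture, this.revision, this.quant);

  @override
  String toString() => '$architecture-$revision-$quant';
}

/// Parses `{architecture}-{revision}-{quant}`.
/// Assumption: the last two hyphen-separated segments are revision and quant;
/// the architecture is everything before them and may itself contain hyphens.
/// Returns null for input that does not fit that shape.
ModelId? parseModelId(String? raw) {
  if (raw == null) return null;
  final parts = raw.split('-');
  if (parts.length < 3 || parts.any((p) => p.isEmpty)) return null;
  final quant = parts.last;
  final revision = parts[parts.length - 2];
  final architecture = parts.sublist(0, parts.length - 2).join('-');
  return ModelId(architecture, revision, quant);
}

/// True when the stored id is missing, malformed, or differs from the
/// bundled id, i.e. when a re-encode should be scheduled.
bool needsReencode({required String bundledId, String? storedId}) {
  final stored = parseModelId(storedId);
  return stored == null || stored.toString() != bundledId;
}
```

In the app, storedId would come from the embeddingModelId key in shared_preferences; it is passed in here as a plain argument so the sketch stays self-contained.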
Cached embeddings
Add a Drift table, template_embeddings. Lookups use only rows whose model_id matches the active embeddingModelId. Vectors from older models are never mixed.
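The read rule can be illustrated with an in-memory stand-in for the table. This is not the Drift schema: EmbeddingRow and vectorsForActiveModel are hypothetical names, and the templateId and vector fields are assumed columns. Only model_id and encoded_at are named on this page.

```dart
import 'dart:typed_data';

/// In-memory stand-in for one cached-embedding row.
/// `modelId` and `encodedAt` mirror model_id and encoded_at from this page;
/// `templateId` and `vector` are assumed columns, not confirmed by the spec.
class EmbeddingRow {
  final int templateId;
  final String modelId;
  final Float32List vector;
  final DateTime encodedAt;

  EmbeddingRow(this.templateId, this.modelId, this.vector, this.encodedAt);
}

/// Returns templateId -> vector for rows of the active model only.
/// Rows written under any other model id are skipped, so vectors from
/// different models never end up in the same lookup map.
Map<int, Float32List> vectorsForActiveModel(
  Iterable<EmbeddingRow> rows,
  String activeModelId,
) {
  return {
    for (final row in rows)
      if (row.modelId == activeModelId) row.templateId: row.vector,
  };
}
```

A template that only has a row under an older model simply has no entry in the result, which matches treating such rows as absent.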
Bundled assets
Models live under app/assets/models/.
pubspec.yaml lists the directory under flutter.assets.
Upgrade path
On app launch:
- Read bundled modelId from config.json of the active asset directory.
- Read stored modelId from shared_preferences.
- If equal: do nothing.
- If different or missing: schedule a background re-encode job.
  - Encode every template_tasks row.
  - Insert new rows under the new modelId with encoded_at = now.
  - Once complete, atomically swap embeddingModelId to the new id.
  - Mark the old model’s rows as stale (a background sweep deletes them after one launch cycle).
- Until the swap completes, the reranker continues to read the old modelId. The fuzzy-only path always works as a fallback.
Rollback
- Old model rows are retained for at least one full launch cycle after a swap.
- A failed swap (encode error, schema mismatch) reverts embeddingModelId to the previous value.
- If a release ships with a regressed model, replacing the bundled asset directory in a hotfix triggers the same upgrade path. There is no separate “rollback flow”.
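The upgrade path and the failed-swap rule can be sketched together. This is an illustration under stated assumptions, not the implementation: EmbeddingStore is a hypothetical in-memory stand-in for shared_preferences plus the embeddings table, the Encoder signature is assumed, and template_tasks is modelled as a map from id to title. Running the job off the UI thread (for example in an isolate) is left out.

```dart
/// Hypothetical in-memory stand-in for shared_preferences plus the
/// embeddings table; a real implementation would use those stores.
class EmbeddingStore {
  String? embeddingModelId; // the stored, active model id
  final Map<String, Map<int, List<double>>> rowsByModel = {};
  final Set<String> staleModelIds = {};
}

/// Assumed encoder shape: one title in, one vector out.
typedef Encoder = Future<List<double>> Function(String title);

/// Launch-time check. Returns true only if a re-encode ran and the
/// active id was swapped to the bundled one.
Future<bool> runUpgrade({
  required EmbeddingStore store,
  required String bundledModelId,
  required Map<int, String> templateTasks, // templateId -> title (assumed)
  required Encoder encode,
}) async {
  final previous = store.embeddingModelId;
  if (previous == bundledModelId) return false; // equal: do nothing

  try {
    // Encode everything first. The active id is untouched during this
    // loop, so readers keep using the previous model's rows.
    final fresh = <int, List<double>>{};
    for (final task in templateTasks.entries) {
      fresh[task.key] = await encode(task.value);
    }
    store.rowsByModel[bundledModelId] = fresh;
    store.embeddingModelId = bundledModelId; // swap once every row exists
    if (previous != null) store.staleModelIds.add(previous); // swept later
    return true;
  } catch (_) {
    // Failed upgrade: nothing was written under the new id because rows
    // are inserted only after every encode succeeds. Keep the previous
    // id active and leave its rows in place.
    store.embeddingModelId = previous;
    return false;
  }
}
```

Because the swap is the last write on the success path, a failure at any earlier point leaves the store exactly as it was, and shipping an older asset directory goes through the same function with the older id as bundledModelId.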
Compatibility rules
- Vector dimensionality is part of the modelId. Cross-model arithmetic is forbidden.
- The reranker treats unknown modelId rows as absent.
- New template inserts at runtime are encoded lazily on first lookup if the row is missing for the active model.
- Stored embeddings are never serialized off-device.
- The reranker must compose with the fuzzy candidate generator; it never bypasses fuzzy filtering.
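The lazy-encode rule can be sketched as a small cache keyed by model id and template id. LazyEmbeddingCache, its methods and the synchronous EncodeFn are hypothetical; a real encoder would likely be asynchronous and the rows would live in the embeddings table.

```dart
/// Assumed synchronous encoder, to keep the sketch short.
typedef EncodeFn = List<double> Function(String title);

/// Hypothetical cache that encodes a template the first time it is looked
/// up under the active model and reuses the stored vector afterwards.
class LazyEmbeddingCache {
  LazyEmbeddingCache(this.activeModelId, this._encode);

  final String activeModelId;
  final EncodeFn _encode;

  /// Keyed by "modelId#templateId" so rows of different models never collide.
  final Map<String, List<double>> _rows = {};

  /// Counts encoder invocations; only here to make the behaviour observable.
  int encodeCalls = 0;

  String _key(String modelId, int templateId) => '$modelId#$templateId';

  /// Simulates a row that already exists, possibly under another model.
  void seed(String modelId, int templateId, List<double> vector) {
    _rows[_key(modelId, templateId)] = vector;
  }

  /// Returns the active-model vector, encoding it if the row is missing.
  /// A row stored under a different model id does not count as present.
  List<double> lookup(int templateId, String title) {
    return _rows.putIfAbsent(_key(activeModelId, templateId), () {
      encodeCalls++;
      return _encode(title);
    });
  }
}
```

A template seeded only under an older model id is still encoded on its first lookup, because the key includes the active model id.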
Reranker interface
The suggestion pipeline (features/tasks/domain/suggestion/cost_suggester.dart) exposes a Reranker interface. The default binding is IdentityReranker (returns input unchanged). The embedding reranker, when implemented, replaces this binding without touching call sites.
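One possible shape for the interface and its two bindings, as a sketch only. The rerank signature, the Candidate type, the cosine helper and EmbeddingReranker are all assumptions; only the names Reranker and IdentityReranker come from this page. Cosine similarity is used here as an illustrative scoring choice, not one the spec mandates.

```dart
import 'dart:math' as math;

/// Hypothetical candidate produced by the fuzzy generator.
class Candidate {
  final int templateId;
  final String title;
  const Candidate(this.templateId, this.title);
}

/// Assumed interface shape; the real signature may differ.
abstract class Reranker {
  List<Candidate> rerank(String query, List<Candidate> candidates);
}

/// Default binding: returns the fuzzy order unchanged.
class IdentityReranker implements Reranker {
  const IdentityReranker();

  @override
  List<Candidate> rerank(String query, List<Candidate> candidates) =>
      candidates;
}

/// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
double cosine(List<double> a, List<double> b) {
  if (a.length != b.length) throw ArgumentError('dimension mismatch');
  var dot = 0.0, na = 0.0, nb = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na == 0 || nb == 0) return 0.0;
  return dot / (math.sqrt(na) * math.sqrt(nb));
}

/// Reorders the fuzzy candidates by similarity to the query. It never adds
/// or removes candidates, so fuzzy filtering is not bypassed.
class EmbeddingReranker implements Reranker {
  EmbeddingReranker(this.encodeQuery, this.vectors);

  final List<double> Function(String query) encodeQuery; // assumed encoder
  final Map<int, List<double>> vectors; // active-model vectors only

  @override
  List<Candidate> rerank(String query, List<Candidate> candidates) {
    final q = encodeQuery(query);
    // Candidates without a vector score lowest and sort to the end.
    final scores = {
      for (final c in candidates)
        c.templateId: vectors.containsKey(c.templateId)
            ? cosine(q, vectors[c.templateId]!)
            : double.negativeInfinity,
    };
    final out = List<Candidate>.of(candidates);
    // List.sort is not guaranteed stable, so ties are broken by templateId
    // to keep the output deterministic.
    out.sort((a, b) {
      final byScore = scores[b.templateId]!.compareTo(scores[a.templateId]!);
      return byScore != 0 ? byScore : a.templateId.compareTo(b.templateId);
    });
    return out;
  }
}
```

Call sites depend only on Reranker, so switching from IdentityReranker to an embedding-backed binding is confined to wherever the binding is constructed.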
Performance budget
| Operation | Budget |
|---|---|
| Encode a single query | P95 under 30 ms on mid-range Android |
| Rerank top-25 candidates | P95 under 80 ms on mid-range Android |
| Cold-start re-encode of 500 templates | under 4 s, in the background |
| Binary size cost of bundled model | 25 MB or less (alarm above 25 MB; hard cap 50 MB) |
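A latency budget check could compute P95 from collected samples as below. This sketch assumes the nearest-rank percentile definition, which the spec does not specify; percentile and withinBudget are hypothetical helper names, and collecting the samples (for example with Stopwatch on a target device) is left to the test harness.

```dart
/// Nearest-rank percentile over latency samples in milliseconds.
/// Assumption: nearest-rank is an acceptable P95 definition for this budget.
double percentile(List<double> samplesMs, int p) {
  if (samplesMs.isEmpty) throw ArgumentError('no samples');
  if (p <= 0 || p > 100) throw ArgumentError('p must be in 1..100');
  final sorted = List<double>.of(samplesMs)..sort();
  // 1-based rank of the smallest sample with at least p% of samples at or
  // below it; integer arithmetic before the division avoids rounding drift.
  final rank = (p * sorted.length / 100).ceil();
  return sorted[rank - 1];
}

/// True when the P95 of the samples is strictly under the budget.
bool withinBudget(List<double> samplesMs, double budgetMs) =>
    percentile(samplesMs, 95) < budgetMs;
```

For the rerank row in the table, an integration test would gather timings for top-25 reranks and assert withinBudget(samples, 80).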
Acceptance criteria (ENG-39)
When the reranker is implemented, all of the following must hold:
- modelId format is {architecture}-{revision}-{quant} and stored in shared_preferences.
- template_embeddings Drift table exists with the schema in this doc.
- Schema migration is additive (no destructive change to template_tasks).
- Upgrade path runs in background and never blocks UI.
- Failed upgrade leaves the previous modelId active and the app continues to function.
- Rollback by re-bundling a previous model works without manual intervention.
- Fuzzy-only path is unchanged when the reranker is off.
- Latency budgets in this doc are enforced by an integration test.
- Vectors are never serialized off-device.
- Suggestions remain deterministic for a given (query, templateSet, modelId).