Purpose
When you add a task, Canthus suggests a starting relativeCost and durationMinutes. The pipeline must be deterministic, offline, fast, and never prescriptive. This page is the contract.
Vocabulary
| Term | Meaning |
|---|---|
| Query | The trimmed, normalized title you typed |
| Candidate | A TemplateTask row that survived candidate generation |
| Score | A number in [0, 1] representing match strength |
| Confidence | The score of the top-ranked candidate |
| Suggestion | A CostSuggestion value object returned to the UI |
Stages
- Normalize the query: lowercase, trim, ascii-fold, strip punctuation, collapse whitespace, split into tokens.
- Content reduction: drop a fixed English stopword list (a, the, had, did, went, …) and apply a conservative suffix stripper (groceries -> grocery, emails -> email, running -> run). The same reduction is applied to indexed template titles so query and target are compared on equal terms.
- Candidate generation by fuzzy similarity against pre-tokenized template titles.
- Rerank (optional, deferred): if an embedding reranker is plugged in, it reorders the top-K candidates.
- Threshold decision: pick prefill, suggest, or fallback path based on top score.
- Construct CostSuggestion: bundle template, computed relativeCost, durationMinutes, and confidence for the UI.
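The first two stages can be sketched in Python as follows. This is a minimal illustration, not the shipped implementation: the stopword set holds only the five examples named above (the full list is elided in the spec), and `strip_suffix` is a hypothetical rule set chosen to reproduce the three documented examples.

```python
import re
import unicodedata

# Partial list: only the examples named in the spec; the real list is longer.
STOPWORDS = {"a", "the", "had", "did", "went"}


def normalize(title: str) -> list[str]:
    """Lowercase, trim, ASCII-fold, strip punctuation, collapse whitespace, tokenize."""
    text = title.strip().lower()
    # ASCII-fold: decompose accented characters, then drop the non-ASCII marks.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # split() without arguments also collapses runs of whitespace.
    return text.split()


def strip_suffix(token: str) -> str:
    """Hypothetical conservative stripper covering the documented examples."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "y"  # groceries -> grocery
    if len(token) > 5 and token.endswith("ing"):
        stem = token[:-3]
        # Undo a doubled final consonant: running -> runn -> run
        if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
            stem = stem[:-1]
        return stem
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]  # emails -> email
    return token


def reduce_tokens(tokens: list[str]) -> list[str]:
    """Drop stopwords, then strip suffixes from the remaining tokens."""
    return [strip_suffix(t) for t in tokens if t not in STOPWORDS]
```

Running the same two functions over template titles at index time keeps query and target on equal terms, as the content-reduction stage requires.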
Candidate generation
The fuzzy score is a weighted combination of three normalized similarity functions (the base score), plus a token-containment floor that rescues short queries whose tokens are a strict subset of a longer template title:
| Function | Why it is included |
|---|---|
| tokenSortRatio | Robust to word reorder (walk dog vs dog walk) |
| jaroWinkler | Strong on short strings and shared prefixes |
| normalizedLevenshtein | Penalizes character-level edits |
| tokenContainment | Rescues short queries (tea, groceries) that are fully contained in a longer template title |
cost-suggestion-evaluation.mdx.
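One way to combine the four functions is sketched below. The weights and the floor value are hypothetical placeholders, since this page does not state them. `difflib.SequenceMatcher` stands in for the token-sort metric, and the snake_case function names are illustrative only.

```python
from difflib import SequenceMatcher

# Hypothetical values: the real weights and floor are not stated on this page.
WEIGHTS = (0.4, 0.3, 0.3)  # tokenSortRatio, jaroWinkler, normalizedLevenshtein
CONTAINMENT_FLOOR = 0.60


def token_sort_ratio(a: str, b: str) -> float:
    """Similarity after sorting tokens; difflib's ratio stands in for the real metric."""
    sa, sb = " ".join(sorted(a.split())), " ".join(sorted(b.split()))
    return SequenceMatcher(None, sa, sb).ratio()


def jaro_winkler(s: str, t: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i, ch in enumerate(s):
        if s_hit[i]:
            while not t_hit[k]:
                k += 1
            out_of_order += ch != t[k]
            k += 1
    m = matches
    jaro = (m / len(s) + m / len(t) + (m - out_of_order // 2) / m) / 3
    prefix = 0
    for cs, ct in zip(s[:4], t[:4]):
        if cs != ct:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)


def normalized_levenshtein(a: str, b: str) -> float:
    """1 - edit_distance / max_len, so identical strings score 1.0."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return 1.0 - prev[-1] / max(len(a), len(b))


def token_containment(query: str, title: str) -> float:
    """Fraction of query tokens that also appear in the title."""
    q, t = set(query.split()), set(title.split())
    return len(q & t) / len(q) if q else 0.0


def fuzzy_score(query: str, title: str) -> float:
    """Weighted base score, raised to the floor when the query is fully contained."""
    parts = (token_sort_ratio(query, title), jaro_winkler(query, title),
             normalized_levenshtein(query, title))
    base = sum(w * p for w, p in zip(WEIGHTS, parts))
    if token_containment(query, title) == 1.0:
        base = max(base, CONTAINMENT_FLOOR)
    return min(base, 1.0)
```

With these placeholder values, a one-token query such as "tea" against "make tea" is lifted to at least the floor even though its character-level similarity is low, which is the rescue behaviour the table describes.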
Ranking
- All templates are scored. For 500 entries this is well under the latency budget.
- Tie-break: stable lex order on templateId so identical scores produce a deterministic ranking.
- Candidates with fuzzyScore < 0.40 are dropped before threshold evaluation; this prunes obvious noise.
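The ranking rules above fit in a few lines. In this sketch the input shape, a list of `(template_id, score)` pairs, and the names `rank` and `MIN_FUZZY_SCORE` are assumptions made for illustration.

```python
MIN_FUZZY_SCORE = 0.40  # candidates below this are dropped as noise


def rank(scored: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Drop weak candidates, then sort by score (descending) and template id (ascending)."""
    kept = [(tid, score) for tid, score in scored if score >= MIN_FUZZY_SCORE]
    # Negating the score sorts high to low; the id breaks ties deterministically.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))
```

Because the sort key includes the template id, two runs over the same scores always return the same order regardless of input order.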
Thresholds
| Top score | Path | UX behaviour |
|---|---|---|
| >= 0.85 | Prefill | Use the top template’s netMET to compute relativeCost. Show the template name as “Looks like X”. User may dismiss or edit. |
| 0.60 to < 0.85 | Suggest | Show the top 3 candidates as chips. No prefill. User picks or dismisses. |
| < 0.60 | Fallback | No template hit. Ask the user for a 1-5 rating. Compute relativeCost from the rating. |
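The threshold table maps directly onto a small decision function. The sketch below assumes an empty candidate list is represented by `None` and treated as a fallback; the function and constant names are hypothetical.

```python
from typing import Optional

PREFILL_THRESHOLD = 0.85
SUGGEST_THRESHOLD = 0.60


def choose_path(top_score: Optional[float]) -> str:
    """Return 'prefill', 'suggest', or 'fallback' for the top-ranked score."""
    if top_score is None:  # assumption: no surviving candidates means fallback
        return "fallback"
    if top_score >= PREFILL_THRESHOLD:
        return "prefill"
    if top_score >= SUGGEST_THRESHOLD:
        return "suggest"
    return "fallback"
```

A score of exactly 0.85 takes the prefill path and exactly 0.60 takes the suggest path, matching the boundaries in the table.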
relativeCost derivation
Template path
For a template with activity-only MET value netMET:
This matches the task-costing spec. The -1.0 removes the resting metabolic baseline; the max(_, 0.1) floor keeps the cost above zero.
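The two operations named above, subtracting 1.0 and flooring at 0.1, can be illustrated as follows. This is only a sketch of that shape: it assumes the subtraction is applied to a gross MET value and that no further scaling is involved, and the task-costing spec remains the authority on the exact formula. The function and parameter names are hypothetical.

```python
RESTING_BASELINE = 1.0  # the -1.0 term: resting metabolic baseline
COST_FLOOR = 0.1        # the max(_, 0.1) floor


def template_relative_cost(gross_met: float) -> float:
    """Assumed shape: remove the resting baseline, then keep the result above zero."""
    return max(gross_met - RESTING_BASELINE, COST_FLOOR)
```

Under these assumptions a 3.5 MET activity yields 2.5, while anything at or below resting level is clamped to 0.1.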
Rating fallback path
When confidence is below the suggest threshold, the user picks a 1-5 rating. With the user’s current personalCoefficient:
The relative cost floor 0.1 applies here too.
durationMinutes derivation
| Source | Behaviour |
|---|---|
| Template hit with durationMinutes set | Use the template duration. |
| Template hit with no duration | Default 10. |
| Rating fallback | Default 10. |
Caching
- An LRU memo of size 64, keyed by normalized query, holds suggestions for the lifetime of the process.
- No persistent cache. The fuzzy index is cheap to rebuild, and keeping nothing on disk avoids cache-invalidation concerns when the template set changes.
Determinism
- All inputs are local: query plus template index.
- No randomness. No network. No timers.
- Same query against same template index always produces the same suggestion.
Latency budget
| Device class | Target |
|---|---|
| Mid-range Android | P95 < 50 ms per query |
| iPhone (last 4 generations) | P95 < 30 ms per query |
Safety framing rules
These rules are non-negotiable. They protect users from feeling judged or prescribed-to.
- Never auto-apply. A suggestion is a starting point. The user always confirms.
- Tentative copy. Use phrasing like “looks like”, “we’d guess”, “might be similar to”. Never “this costs X” or “you should rate it Y”.
- Editable everywhere. relativeCost, durationMinutes, and any rating are editable before submit and after creation (cost_override_sheet).
- No silent learning. A suggestion does not write data on its own. Only confirmed task creation persists.
- Reversible. Every prefill offers a clear “back” or “use my own” affordance.
- No ranking implications. Suggestion confidence is not exposed as a numerical score in the UI; it only drives presentation choice.
- No fallback shaming. When no match is found, copy must be neutral: “Tell us how heavy this feels”. Never “this isn’t in our list” or similar.
copy-system.mdx and the user contract.
“No good match” behaviour
When topScore < 0.60:
- The suggestion area collapses to a 1-5 rating row with neutral copy.
- The user can submit immediately after picking a rating; duration defaults to 10.
- The form does not block submission to wait for a “better” match.
Outputs
The pipeline returns a sealed CostSuggestion:
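As an illustration of what a sealed result type for the three paths could look like, here is a Python sketch using frozen dataclasses and a union. Every class and field name below is hypothetical; only the three paths, the bundled values (template, relativeCost, durationMinutes, confidence), and the default duration of 10 come from this page.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Prefill:
    """Top score >= 0.85: one template, with cost and duration ready to edit."""
    template_id: str
    relative_cost: float
    duration_minutes: int
    confidence: float


@dataclass(frozen=True)
class Suggest:
    """Top score from 0.60 up to 0.85: up to three templates shown as chips."""
    candidate_ids: tuple[str, ...]
    confidence: float


@dataclass(frozen=True)
class Fallback:
    """Top score < 0.60: ask for a 1-5 rating; duration defaults to 10."""
    duration_minutes: int = 10


# A closed set of variants plays the role of a sealed type in this sketch.
CostSuggestion = Union[Prefill, Suggest, Fallback]


def path_name(suggestion: CostSuggestion) -> str:
    """Map a suggestion variant to the path name used in the thresholds table."""
    if isinstance(suggestion, Prefill):
        return "prefill"
    if isinstance(suggestion, Suggest):
        return "suggest"
    return "fallback"
```

Keeping the variants closed lets the UI handle each presentation path explicitly, and confidence stays an internal field that drives presentation rather than a number shown to the user.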
Acceptance criteria mapping
| Criterion (ENG-91) | Where it lives |
|---|---|
| Pipeline spec defines stages and thresholds | This page (Stages, Thresholds) |
| Behaviour defined when no good match exists | This page (No good match behaviour) |
| Safety framing rules are explicit and enforced in UI | This page (Safety framing rules) plus widget tests |