Prepare project plan

2026-04-24 11:47:24 +02:00
parent 0c87978547
commit 4b838cfb99
13 changed files with 1815 additions and 7 deletions

@@ -0,0 +1,259 @@
# Architecture Research
**Domain:** Offline-first meal-planning app (KMP + Ktor, household-shared)
**Researched:** 2026-04-23
**Confidence:** HIGH (locked stack; standard patterns within it)
## System Overview
```
┌──────────────────────────────────────────────────────────────────┐
│ composeApp/ (Android · iOS · Desktop · Wasm) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ UI: Compose screens + NavHost (Jetpack Nav CMP) │ │
│ │ ViewModel (StateFlow) ──► Repository (reactive Flow) │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ SyncEngine (singleton) ◄──► SQLDelight (local) + Outbox │ │
│ │ │ │ │
│ │ │ AuthSession (AppAuth / ASWebAuth) │ │
│ │ ▼ │ │
│ │ Ktor Client (JWT bearer) ─────────────────┐ │ │
│ └────────────────────────────────────────────┼───────────────┘ │
└──────────────────────────────────────────────┼───────────────────┘
│ HTTPS
┌────────────────────────────────┼───────────┐
│ Authentik (OIDC IdP, homelab) │ JWKS │
└────────────────────────────────┴───────────┘
┌──────────────────────────────────────────────▼───────────────────┐
│ server/ (Ktor 3.x, same homelab) │
│ Auth (ktor-server-auth-jwt) ──► Routes /api/v1/* │
│ │ │ │
│ ▼ ▼ │
│ PrincipalResolver ──► Services ──► Exposed DSL ──► Postgres │
│ (Flyway) │
└──────────────────────────────────────────────────────────────────┘
shared/commonMain: domain models + API DTOs (client + server both depend)
```
## Component Responsibilities
| Component | Responsibility | Typical Implementation |
|-----------|----------------|------------------------|
| Screen (`@Composable`) | Render state, forward intents. No I/O. | `PlannerScreen(state, onAddMeal)`; consumes `collectAsStateWithLifecycle()` |
| ViewModel | Expose `StateFlow`; coordinate repo calls; zero Compose imports | Extends `ViewModel`, scoped via `koinViewModel()`, method-per-action |
| Repository | Single source of truth for one aggregate; hide local/remote split | Exposes `Flow<Domain>` from SQLDelight; write path goes through local DB + outbox |
| SyncEngine | Own outbox drain, pull loop, backoff, auth failure handling | App-scoped Koin singleton; one `CoroutineScope(SupervisorJob)`; started after auth |
| DataSource (local) | Thin SQLDelight wrapper, mapping rows ↔ domain | Per-table `Queries` injected; suspend + `asFlow().mapToList()` |
| DataSource (remote) | Typed Ktor calls for `/sync/push`, `/sync/pull`, catalog endpoints | `HttpClient` with `Auth { bearer { ... } }` + `ContentNegotiation(Json)` |
| AuthSession | Own tokens, refresh, sign-in/out; expose `StateFlow<AuthState>` | Platform-specific actual class (AppAuth / ASWebAuth) behind `expect` |
| Koin Module | Wire graph per layer (`appModule`, `dataModule`, `syncModule`, `authModule`) | Declared in `commonMain`; `startKoin` in `App()` + `MainViewController` |
| Ktor route | HTTP surface; validate DTO; call service; never touch DB directly | `Route.planRoutes()` under `authenticate("auth-jwt") { route("/api/v1") { ... } }` |
| Exposed table | Schema definition + column types; DSL queries via `transaction {}` | `object PlanEntries : Table("plan_entries")` — no DAO |
| Outbox | Durable queue of unsynced local writes keyed by aggregate+id | `sync_outbox` table in SQLDelight; `(op, table, pk, payload_json, attempts)` |
## Recommended Project Structure
```
composeApp/src/commonMain/kotlin/app/recipe/
├── app/ # App() composable, root nav, Koin bootstrap
├── navigation/ # @Serializable route classes + NavGraphBuilder extensions
├── ui/
│ ├── theme/ # Color, typography, Haze style tokens
│ ├── components/ # Reusable (GlassCard, MealSlotChip, ...)
│ └── screens/
│ ├── recipes/ # RecipeListScreen, RecipeDetailScreen, *ViewModel
│ ├── planner/ # PlannerScreen, DayColumn, *ViewModel
│ ├── pantry/
│ └── shopping/
├── data/
│ ├── local/ # SQLDelight driver factory (expect/actual), Queries wrappers
│ ├── remote/ # HttpClient factory, DTOs mirroring shared/, auth interceptor
│ ├── sync/ # SyncEngine, Outbox, pull scheduler, conflict policy
│ └── repository/ # PlanRepository, PantryRepository, CatalogRepository, ...
├── domain/ # Value types, enums (MealSlot), pure computations (shortfall, aggregation)
├── auth/ # AuthSession interface, token store, OIDC config
└── di/ # appModule, dataModule, syncModule, authModule
server/src/main/kotlin/app/recipe/server/
├── Application.kt # embeddedServer, install plugins, call moduleMain()
├── plugins/ # Auth, ContentNegotiation, CallLogging, StatusPages, CORS
├── auth/ # JWKS config, PrincipalResolver (sub → user → household)
├── routes/
│ ├── sync/ # push.kt, pull.kt
│ ├── catalog/ # recipes, ingredients, products (read-mostly)
│ ├── households/ # memberships, invites
│ └── health/
├── services/ # PlanService, SyncService — orchestrate transactions
├── db/
│ ├── tables/ # Exposed Table objects (no DAO)
│ ├── Mappers.kt # ResultRow → shared DTO
│ └── Database.kt # HikariCP + Flyway.migrate()
└── util/ # Clock (injectable), IdGen, Json
server/src/main/resources/db/migration/ # V1__init.sql, V2__plan_entries.sql, ...
shared/src/commonMain/kotlin/app/recipe/shared/ # Domain + DTOs (@Serializable) — no I/O deps
```
Rationale: groups by UI concern then data layer, matching the locked decision in PROJECT.md. `data/sync/` is a first-class folder because sync is the spine of the app. `domain/` holds pure logic so it can be unit-tested without Android/iOS runtime. Server mirrors the client's layered split (routes → services → db) so reasoning transfers.
## Architectural Patterns
### Pattern 1: Repository → reactive Flow → StateFlow in ViewModel
Repositories expose `Flow<Domain>` built from SQLDelight's `asFlow().mapToList()`. The ViewModel converts that cold flow into a hot `StateFlow` using `stateIn` with `WhileSubscribed(5_000)`, which keeps the upstream query alive for 5 s after the last collector leaves. Writes go through the repo, which writes to SQLDelight; the reactive query re-emits automatically. **Never** pre-fetch state with a suspend call in `init {}`; that races with collection.
```kotlin
class PlannerViewModel(private val repo: PlanRepository) : ViewModel() {
    // currentWeek is assumed to be supplied elsewhere (e.g. a constructor argument or clock helper).
    val state: StateFlow<PlannerState> = repo.observeWeek(currentWeek)
        .map(PlannerState::fromEntries)
        .stateIn(viewModelScope, SharingStarted.WhileSubscribed(5_000), PlannerState.Loading)

    // Writes go through the repository; the observed Flow re-emits on its own.
    fun onAddMeal(day: LocalDate, slot: MealSlot, recipeId: Uuid) =
        viewModelScope.launch { repo.add(day, slot, recipeId) }
}
```
### Pattern 2: Sync engine as a Koin singleton owning outbox + poll cycles
One long-lived `SyncEngine` bound in `syncModule` with a `SupervisorJob`-backed scope. It exposes `pushNow()`, `pullNow()`, `status: StateFlow<SyncStatus>`. Two loops: a push loop that drains `sync_outbox` with exponential backoff on 5xx/network errors, and a pull loop that calls `GET /sync/pull?since={lastCursor}` every 20-30 s while foregrounded. Repositories never talk to HTTP directly for household data; they enqueue outbox rows and trust the engine.
```kotlin
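class SyncEngine(private val api: SyncApi, private val local: LocalDb, private val clock: Clock) {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    // Wake-up signal for the push loop; extra nudges collapse into one buffered event.
    private val pushSignal = MutableSharedFlow<Unit>(extraBufferCapacity = 1, onBufferOverflow = BufferOverflow.DROP_OLDEST)
    fun start() { scope.launch { pushLoop() }; scope.launch { pullLoop() } }
    suspend fun nudge() = pushSignal.emit(Unit)
    private suspend fun pushLoop() { /* drain sync_outbox with backoff; body elided */ }
    private suspend fun pullLoop() { /* poll /sync/pull while foregrounded; body elided */ }
}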
```
Trade-off: single point of failure if the engine deadlocks, so all its work must be cancellable and idempotent (server-side push is keyed by `client_op_id`).
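
The backoff mentioned for the push loop could look roughly like the sketch below. The function name, base delay, and cap are assumptions chosen for illustration, not values from PROJECT.md; it uses plain doubling without jitter, which is probably enough for a two-person household.

```kotlin
import kotlin.math.min

// Hypothetical helper: name and constants are illustrative, not taken from the project.
// Returns the delay before the next push attempt, doubling per failed attempt up to a cap.
fun backoffDelayMillis(attempts: Int, baseMillis: Long = 1_000, capMillis: Long = 60_000): Long {
    require(attempts >= 0) { "attempts must be non-negative" }
    // Clamp the exponent so the shift cannot overflow a Long with the default base.
    val exponent = min(attempts, 20)
    return min(capMillis, baseMillis shl exponent)
}

fun main() {
    // Print the schedule for the first few attempts stored in sync_outbox.attempts.
    (0..7).forEach { println("attempt $it -> ${backoffDelayMillis(it)} ms") }
}
```

Feeding the outbox row's `attempts` column into such a function keeps the retry schedule durable across app restarts, since the counter lives in SQLDelight rather than in memory.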
### Pattern 3: Household-scope enforcement at three layers
Defence in depth: (a) **Client query filter** — every SQLDelight query for household-scoped tables includes `WHERE household_id = :hh`, sourced from `AuthSession.activeHouseholdId`; (b) **Server principal resolver** — a `PrincipalResolver` turns the JWT `sub` claim into `(userId, householdId)` via a cached lookup against `memberships`; routes receive an `AuthPrincipal` already carrying `householdId`; (c) **DB row ownership** — every household-scoped table has `household_id uuid NOT NULL` with an index, and every `UPDATE`/`DELETE` includes `AND household_id = ?`.
```kotlin
fun Route.planRoutes(svc: PlanService) = authenticate("auth-jwt") {
post("/api/v1/sync/push") {
val p = call.principal<AuthPrincipal>()!! // householdId baked in
val batch = call.receive<PushBatch>()
call.respond(svc.applyBatch(p.householdId, batch))
}
}
```
Never trust a `householdId` field inside a client payload — overwrite with the principal's.
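
One way to make that overwrite hard to forget is to re-stamp the batch before it reaches the service. The sketch below assumes simplified DTO shapes; the real `PushBatch` in `shared/` may look different, and `scopedTo` is a hypothetical helper name.

```kotlin
// Hypothetical DTO shapes for illustration only; the project's real PushBatch may differ.
data class PlanEntryDto(val id: String, val householdId: String, val recipeId: String)
data class PushBatch(val entries: List<PlanEntryDto>)

// Re-stamp every row with the household taken from the authenticated principal,
// discarding whatever value the client put in the payload.
fun PushBatch.scopedTo(principalHouseholdId: String): PushBatch =
    copy(entries = entries.map { it.copy(householdId = principalHouseholdId) })

fun main() {
    val crafted = PushBatch(listOf(PlanEntryDto("e1", "someone-elses-household", "r1")))
    println(crafted.scopedTo("hh-1"))
}
```

Calling something like `batch.scopedTo(p.householdId)` at the route boundary means services never see a client-chosen household at all.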
### Pattern 4: Catalog (read-mostly) vs Household (read-write, synced) split
Two cache + sync policies in one app. **Catalog** (recipes, ingredients, products) is pre-seeded server-side, pulled via versioned ETag (`GET /api/v1/catalog?etag=...`), cached in SQLDelight with a simple "replace all or diff by updated_at" refresh on app start + manual refresh. No outbox. **Household** (plan entries, pantry, shopping items) is LWW-synced with server-assigned `updated_at`, uses the outbox, and is reactively observed. Keep these in separate repositories and separate Koin modules so their refresh semantics don't leak into each other.
## Data Flow — Hero Write Path (Add Meal to Plan)
```
User taps "add meal"
PlannerScreen invokes onAddMeal(day, slot, recipeId)
PlannerViewModel.onAddMeal → viewModelScope.launch { repo.add(...) }
PlanRepository.add():
├─ SQLDelight transaction:
│ INSERT plan_entry (id=localUuid, household_id, day, slot, recipe_id,
│ updated_at=NULL /* server will stamp */, pending=1)
│ INSERT sync_outbox (op='upsert', table='plan_entry', pk=id,
│ payload_json, client_op_id, attempts=0)
└─ Flow<PlanEntries> re-emits → PlannerViewModel.state recomputes → UI updates
│ (optimistic; pending=1 may render a subtle marker)
SyncEngine.nudge() — push loop wakes
Ktor Client POST /api/v1/sync/push (Authorization: Bearer <jwt>)
Ktor Server: install(Authentication) { jwt("auth-jwt") { verifier(jwkProvider) } }
│ JWT validated against Authentik JWKS (cached, rotating)
PrincipalResolver: sub → userId → householdId (cached)
sync/push.kt → SyncService.applyBatch(householdId, batch)
Exposed transaction {
PlanEntries.upsert { it[id]=...; it[householdId]=...; it[updatedAt]=Clock.now() }
// server clock is authoritative
}
Response { applied: [{ id, client_op_id, updated_at: <server ts> }] }
Client: local tx {
UPDATE plan_entry SET updated_at = <server ts>, pending = 0 WHERE id = ?
DELETE FROM sync_outbox WHERE client_op_id = ?
}
Flow re-emits → pending marker vanishes
~~~ (later) partner's device ~~~
Pull loop: GET /api/v1/sync/pull?since=<lastCursor>
Server returns rows with updated_at > since, scoped to householdId
Client upserts rows in a single SQLDelight tx; advances cursor
Partner's PlannerViewModel StateFlow emits new state → their UI updates
```
## Anti-Patterns
### Anti-Pattern 1: Suspend fetch in `init {}` feeding a `MutableStateFlow`
```kotlin
// WRONG
init { viewModelScope.launch { _state.value = repo.getOnce() } }
```
Races with UI collection; loses SQLDelight's reactive updates; forces manual refresh after every write. **Instead:** build the `StateFlow` declaratively from `repo.observeX().stateIn(...)`.
### Anti-Pattern 2: Using Exposed DAO (active record) for new tables
Exposed's DAO API (`IntEntity`, `EntityClass`) looks convenient but leaks lazy-loading through transactions and fights JSONB/composite types. PROJECT.md already forbids it. **Instead:** use the DSL (`Table` objects + `transaction { Table.select { ... } }` + explicit `ResultRow → DTO` mappers). Predictable SQL, no session/transaction surprises.
### Anti-Pattern 3: Sharing SQLDelight transactions across coroutine contexts on iOS
SQLDelight's iOS driver (`NativeSqliteDriver`) uses thread-confined connections. Launching nested `withContext(Dispatchers.IO)` inside a `transaction { }` can throw `IllegalStateException` or silently serialize incorrectly. **Instead:** keep the entire transaction inside one coroutine, use SQLDelight's `transactionWithResult { }`, and do network/CPU work *outside* the tx. On iOS, the driver's own dispatcher handles threading.
### Anti-Pattern 4: Using device clock for `updated_at`
Phones have drifting clocks and timezone shenanigans; a device whose clock is 10 minutes fast will always "win" LWW. **Instead:** server stamps `updated_at` inside the push transaction (`Clock.System.now()` on the server, or `now()` in SQL). The client only stores what the server returns. Local-only edits carry `pending=1` until acknowledged.
### Anti-Pattern 5: Putting UI, HTTP, or DB types in `shared/commonMain`
PROJECT.md scopes `shared/` to domain models + DTOs. Dragging Ktor or SQLDelight into `shared/` pulls platform-specific deps into the server build graph and vice versa. **Instead:** client-only concerns live in `composeApp/`, server-only in `server/`, and `shared/` stays a pure-Kotlin library with `kotlinx.serialization` + `kotlinx.datetime` as its only non-stdlib deps.
## Build Order Implication
The layer that must exist first is **auth + a working Ktor skeleton that echoes an authenticated principal**, because every subsequent layer depends on having a real `householdId` to scope against. After that the unblock order is: (1) **sync engine foundation** — outbox table, empty push/pull endpoints, cursor persistence — so every feature slots into an already-synced path instead of being retrofitted; (2) **catalog read path** — lets the UI render recipes without any write-path complexity, proving HTTP + SQLDelight + Coil end-to-end on a trivial aggregate; (3) **household write path** — the planner as the first real outbox-backed aggregate, which flushes out LWW edge cases; (4) **UI chrome** — Haze-backed glass, navigation polish, theming — last, because decorating a working app is cheap while architecting around decoration is expensive. Skipping step 1 or 2 and jumping to the planner looks faster for a week and costs a month.
## Sources
- `/Users/rwilk/dev/repo/recipe/.planning/PROJECT.md` (authoritative stack + constraints)
- Training knowledge: Compose Multiplatform 1.7+, Jetpack Nav CMP port 2.9.x, SQLDelight 2.x coroutine extensions, Ktor 3.x auth-jwt + JWKS, Exposed DSL transaction semantics, Authentik OIDC discovery
- No web searches needed — patterns are standard within the locked stack
---
*Architecture research for: KMP + Ktor household meal planner*
*Researched: 2026-04-23*

@@ -0,0 +1,292 @@
# Pitfalls Research
**Domain:** Kotlin Multiplatform + Compose Multiplatform (iOS-primary), Ktor/Exposed/Postgres, OIDC, LWW delta sync
**Researched:** 2026-04-23
**Confidence:** HIGH for KMP/Ktor/Exposed gotchas; MEDIUM for Haze + Navigation-CMP specifics (behavior shifts across minor versions)
## Critical Pitfalls
### Pitfall 1: Kotlin/Native iOS GC thrashing and `objcDisposeOnMain` hangs
**What goes wrong:** On-device (especially iPhone XR/11) the app consumes 300-700 MB steadily and freezes for 1-2 s under ViewModel churn. Flamegraphs show GC threads at >100% CPU.
**Why:** The K/N memory manager dispatches Obj-C release to the main thread by default, serializing teardown behind UI frames. Compose/Koin graphs produce many bridged Obj-C references per navigation.
**Warning signs:** Frame hitches on tab switches; main-thread time in `objc_release` / `Kotlin_ObjCExport_releaseReservedObjectTail`; Instruments shows growing K/N heap.
**How to avoid:** Set `kotlin.native.binary.objcDisposeOnMain=false` and `kotlin.native.binary.gc=cms` in `gradle.properties` from day 1. Release Kotlin refs in `onDispose`; don't hold them in long-lived Swift closures.
**Phase:** UI chrome.
---
### Pitfall 2: Legacy `freeze()` / strict-mm ceremony in copy-pasted snippets
**What goes wrong:** Code from 2021-2022 tutorials adds `freeze()`, `@SharedImmutable`, `AtomicReference` from `kotlin.native.concurrent`, or `ensureNeverFrozen()`. Compiles on Kotlin 2.x but adds dead code and masks real bugs.
**Why:** The new memory manager removed the freeze paradigm entirely; `freeze()` is a no-op and deprecated.
**Warning signs:** Any of the above symbols appearing in snippets you're about to paste.
**How to avoid:** Reject pre-1.7.20 KMP code. Use `kotlinx.atomicfu` if you truly need atomics; StateFlow is already thread-safe.
**Phase:** Data.
---
### Pitfall 3: `ComposeUIViewController` state loss on iOS re-entry
**What goes wrong:** Backgrounding then returning resets scroll positions, selected tabs, half-filled forms. Koin-scoped ViewModels re-create.
**Why:** If the `UIViewController` is instantiated inside a SwiftUI `body`, each re-render builds a fresh composition. Compose state is owned by the controller's composition root.
**Warning signs:** State survives Android rotation but dies on iOS foreground-return; ViewModel `init` fires on backgrounded return.
**How to avoid:** Build the `UIViewController` **once** — store in `@StateObject` or a top-level property, not in a SwiftUI `body`. Use `rememberSaveable` for any UI state that must survive process death. Never nest multiple `ComposeUIViewController` wrappers.
**Phase:** UI chrome.
---
### Pitfall 4: SQLDelight iOS — missing migration files, in-memory vs file driver divergence
**What goes wrong:** JVM tests pass with in-memory driver; the iOS app crashes on launch with `no such column` after a schema change.
**Why:** `NativeSqliteDriver` persists a real file. Editing `.sq` without a numbered `.sqm` migration and a bumped schema `version` means SQLDelight only *verifies* the schema on open — on a device with an existing install, that check fails.
**Warning signs:** Works on fresh simulator install; breaks on physical device with prior install; Android OK, iOS fails.
**How to avoid:** Every schema change gets a numbered `Nm.sqm`. Enable `verifyMigrations = true` and `verifyDefinitions = true`. Add a dev-only "wipe DB" debug button during early development. Reinstall on device before any QA.
**Phase:** Data.
---
### Pitfall 5: Exposed `transaction {}` inside suspend functions → pool exhaustion
**What goes wrong:** Plain `transaction { ... }` in Ktor handlers. Under modest concurrency (~20 requests) the pool exhausts, p99 cliffs, and `IllegalStateException: Transaction is not currently active` appears.
**Why:** `transaction {}` is blocking and binds the transaction to the calling thread. In a coroutine it blocks event-loop threads; if the code suspends mid-transaction, resume lands on a different thread and loses the JDBC connection binding.
**Warning signs:** Connection pool always fully leased at low RPS; latency cliffs; "transaction not active" in logs.
**How to avoid:** Use `newSuspendedTransaction(Dispatchers.IO) { ... }` in suspend contexts. Pass the `Database` instance explicitly. No HTTP calls inside transactions. HikariCP pool size 8-10 is plenty for 5-10 users.
**Phase:** Data.
---
### Pitfall 6: Exposed DAO + JSONB footguns
**What goes wrong:** `IntEntity` + `jsonb<T>()` produces double-serialized JSON in Postgres (`"{\"key\":\"v\"}"`) or `SerializationException` on read.
**Why:** DAO integration with JSONB is thin; it's easy to store a pre-stringified value. DAO lazy-loads hide *when* the column is read, so failures manifest far from the cause.
**Warning signs:** Escaped JSON in `psql` output; serialization errors deep in read paths.
**How to avoid:** Use DSL only (already locked in PROJECT.md). For JSONB, define `jsonb("extras", Json.Default, MealExtras.serializer())` once; never stringify upstream. Round-trip integration test per JSONB column.
**Phase:** Data.
---
### Pitfall 7: Ktor JWT — audience, issuer, clock skew, JWKS cache
**What goes wrong:** 401s in production only, after a while, or after Authentik restart. Messages: "Token can't be used before...", "Claim 'aud' doesn't contain required audience", or silent 401s post key-rotation.
**Why:** Four defaults converge:
1. `ktor-server-auth-jwt` requires explicit `.withAudience()` / `.withIssuer()`.
2. Default clock leeway is **zero** — 2 s device drift rejects fresh tokens.
3. The commonly copied JWKS setup `.cached(10, 24, TimeUnit.HOURS)` keeps keys for up to a day, so a key rotation can stay invisible for hours.
4. Authentik's `aud` can be array or string depending on provider config.
**Warning signs:** 401 only in prod; 401 only on some devices; works briefly then fails; 401 after Authentik restart.
**How to avoid:** Configure `.withIssuer(issuer).withAudience(clientId).acceptLeeway(30)`. JWKS provider with `.cached(10, 15, MINUTES).rateLimited(10, 1, MINUTES)`. In Authentik, emit `aud` as a single client_id string. Integration test: wrong `aud` → 401.
**Phase:** Auth.
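
The settings above might be wired together roughly as follows. This is a sketch assuming `ktor-server-auth-jwt` plus the `jwks-rsa` library on the classpath; `configureJwtAuth` and its three parameters are hypothetical names, and the values should come from your own configuration.

```kotlin
import com.auth0.jwk.JwkProviderBuilder
import io.ktor.server.application.Application
import io.ktor.server.application.install
import io.ktor.server.auth.Authentication
import io.ktor.server.auth.jwt.JWTPrincipal
import io.ktor.server.auth.jwt.jwt
import java.net.URI
import java.util.concurrent.TimeUnit

// issuer, jwksUrl and clientId are placeholders supplied by deployment config (assumption).
fun Application.configureJwtAuth(issuer: String, jwksUrl: String, clientId: String) {
    val jwkProvider = JwkProviderBuilder(URI(jwksUrl).toURL())
        .cached(10, 15, TimeUnit.MINUTES)      // rotated keys become visible within 15 minutes
        .rateLimited(10, 1, TimeUnit.MINUTES)  // bound how often the JWKS endpoint is hit
        .build()

    install(Authentication) {
        jwt("auth-jwt") {
            verifier(jwkProvider, issuer) {
                withAudience(clientId) // reject tokens minted for another client
                acceptLeeway(30)       // tolerate 30 seconds of clock skew
            }
            validate { credential ->
                // Only accept tokens that carry a subject; household lookup happens later.
                if (credential.payload.subject.isNullOrBlank()) null
                else JWTPrincipal(credential.payload)
            }
        }
    }
}
```

Keeping issuer, audience, leeway, and cache policy in one function makes the wrong-`aud` and clock-skew integration tests easy to point at a single place.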
---
### Pitfall 8: OIDC redirect URI mismatch + missing PKCE
**What goes wrong:** "redirect_uri does not match" or consent loop on one platform; or login succeeds without PKCE and is interceptable.
**Why:** Native apps are *public* clients — no shippable secret, so Authentik requires PKCE. Redirect URIs must match byte-for-byte (trailing slash, case). iOS uses a custom URL scheme or Universal Link; Android uses an intent-filter. Debug and release builds can differ.
**Warning signs:** Works on Android, fails on iOS (or vice versa); Authentik logs show `invalid_grant`; no `code_challenge` in auth request; fails on release build only.
**How to avoid:** Authentik provider = "Public" + PKCE S256. Register both `recipe://callback` and `recipe://callback/`. AppAuth (Android) + ASWebAuthenticationSession (iOS) with `usePKCE = true`. Keep the redirect URI in one constant in `shared/commonMain`.
**Phase:** Auth.
---
### Pitfall 9: LWW trusting client clocks
**What goes wrong:** User A's phone clock is 90 s fast; A's edit beats B's real-time-later edit in LWW. B's change silently disappears.
**Why:** Client-assigned timestamps trust unverifiable clocks. Even NTP-synced devices drift; simulators can be minutes off.
**Warning signs:** "My edit vanished"; stable prior state reappears; most common with both household members editing the same meal.
**How to avoid:** Server assigns `updated_at` on every write (already in PROJECT.md — enforce it). Client sends only content + prior `updated_at` for optimistic concurrency. Server sets `updated_at = now()` in the transaction and returns it. Make timestamps strictly monotonic per row (e.g. `GREATEST(now(), old.updated_at + interval '1 microsecond')`) to avoid tie collisions.
**Phase:** Sync.
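
If the monotonic stamp is computed in Kotlin rather than in SQL, it could be sketched as below. `nextUpdatedAt` is a hypothetical helper that mirrors the `GREATEST(now(), old.updated_at + interval '1 microsecond')` expression; it assumes the JVM server and `java.time`.

```kotlin
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical helper mirroring GREATEST(now(), old.updated_at + 1 microsecond).
// previous is null for a row that is being inserted for the first time.
fun nextUpdatedAt(now: Instant, previous: Instant?): Instant {
    // Postgres timestamptz stores microseconds, so truncate to what will actually be persisted.
    val nowMicros = now.truncatedTo(ChronoUnit.MICROS)
    if (previous == null) return nowMicros
    val floor = previous.truncatedTo(ChronoUnit.MICROS).plus(1, ChronoUnit.MICROS)
    // Never go backwards or tie with the stored value, even if the server clock stalls or steps back.
    return if (nowMicros.isAfter(floor)) nowMicros else floor
}

fun main() {
    val stored = Instant.parse("2026-04-23T14:00:05Z")
    println(nextUpdatedAt(stored, stored)) // one microsecond after the stored value
}
```

Doing this inside the same transaction as the row update is what keeps the guarantee; computing it before the transaction starts would reintroduce a race between concurrent writers.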
---
### Pitfall 10: Soft-delete + recreate race
**What goes wrong:** Delete a meal entry, immediately re-add "the same" one. Depending on pull ordering, the new row is hidden by the tombstone, or the old row is resurrected with old fields.
**Why:** If `(plan_date, slot)` is treated as identity, tombstone/recreate races are inevitable on concurrent 2-user editing.
**Warning signs:** Undeleted items; deleted meals reappear on partner's device; duplicates in pantry.
**How to avoid:** Identity is always a fresh UUID per row, never `(date, slot)`. Tombstones carry their own `updated_at`. Pull returns tombstones and live rows; client applies in `updated_at` order. Per-client push outbox replays in local sequence order — never parallel. Integration test: two clients alternating delete/recreate, assert convergence.
**Phase:** Sync.
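
The client-side apply step described above can be illustrated with an in-memory stand-in for the local table. `PulledRow`, `applyPull`, and `visibleRows` are hypothetical names; a real implementation would run the same logic inside one SQLDelight transaction.

```kotlin
import java.time.Instant

// Hypothetical pulled-row shape: a tombstone is simply a row with deleted = true.
data class PulledRow(
    val id: String,
    val updatedAt: Instant,
    val deleted: Boolean,
    val recipeId: String? = null,
)

// Applies one pulled page to a local view keyed by row UUID, oldest change first.
fun applyPull(local: MutableMap<String, PulledRow>, page: List<PulledRow>) {
    for (row in page.sortedWith(compareBy<PulledRow>({ it.updatedAt }, { it.id }))) {
        val current = local[row.id]
        // Ignore anything that is not newer than what is already stored for this id.
        if (current != null && current.updatedAt >= row.updatedAt) continue
        // Tombstones are stored too, so a stale live row replayed later cannot resurrect them.
        local[row.id] = row
    }
}

// What the UI should observe: everything that is not tombstoned.
fun visibleRows(local: Map<String, PulledRow>): List<PulledRow> =
    local.values.filter { !it.deleted }
```

Because the recreated meal gets a fresh UUID, the tombstone for the old id and the live row for the new id never compete for the same key.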
---
### Pitfall 11: Pull-cursor edge cases — missed updates, same-timestamp ties
**What goes wrong:** Partner edits at 14:00:05; client's last pull cursor is `14:00:04.999`. If cursor semantics or timestamp precision are wrong, the change is skipped forever.
**Why:** Cursor semantics are subtle. Second-precision timestamps, `>=` instead of `>`, and ties among rows sharing a `updated_at` all cause skipped or replayed rows. Debounced push interleaved with pull can reorder writes.
**Warning signs:** Sporadic stale data that vanishes after pull-to-refresh; only reproduces near DB restarts or bulk imports; duplicates after manual refresh.
**How to avoid:** `updated_at` is `timestamptz` with microsecond precision and strictly monotonic. Cursor is `(updated_at, id)` lexicographic: `WHERE (updated_at, id) > (:since_ts, :since_id) ORDER BY updated_at, id LIMIT N`. Pause pull while a push is in flight. Never split the write and its timestamp notification across transactions.
**Phase:** Sync.
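
The lexicographic cursor can be sanity-checked without a database. The sketch below is an in-memory stand-in for the SQL query, with hypothetical `Cursor`, `Row`, and `pullPage` names; note that string ordering of ids here only approximates how Postgres orders `uuid` values.

```kotlin
import java.time.Instant

// Hypothetical cursor type: compares by updatedAt first, then by id as the tie-breaker.
data class Cursor(val updatedAt: Instant, val id: String) : Comparable<Cursor> {
    override fun compareTo(other: Cursor): Int =
        compareValuesBy(this, other, { it.updatedAt }, { it.id })
}

data class Row(val id: String, val updatedAt: Instant)

// In-memory equivalent of:
//   WHERE (updated_at, id) > (:since_ts, :since_id) ORDER BY updated_at, id LIMIT :limit
// Returns the page plus the cursor to send on the next pull.
fun pullPage(rows: List<Row>, since: Cursor?, limit: Int): Pair<List<Row>, Cursor?> {
    val page = rows
        .map { Cursor(it.updatedAt, it.id) to it }
        .filter { (cursor, _) -> since == null || cursor > since }
        .sortedBy { it.first }
        .take(limit)
    // An empty page leaves the cursor where it was.
    val next = page.lastOrNull()?.first ?: since
    return page.map { it.second } to next
}
```

With a timestamp-only cursor and strict `>`, rows that share a timestamp with the last row of a page would be skipped; the `id` tie-breaker is what lets a page boundary fall in the middle of such a group.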
---
### Pitfall 12: Haze on scroll + nested children tank older iPhones
**What goes wrong:** LazyColumn scrolling under a blurred top bar stutters badly on iPhone XR/11, dropping to ~30 fps. Nesting `hazeChild` inside a list item sitting in a `hazeSource` Scaffold makes it worse.
**Why:** iOS Haze uses Skiko `GraphicsLayer` for offscreen capture + re-blur each frame. Progressive blur adds ~25% cost. Older A-series chips without hardware-accelerated RenderEffect equivalents jank under this load.
**Warning signs:** Smooth on simulator/M-series, choppy on iPhone 11; FPS 40-50; Skiko render thread pegged in Instruments.
**How to avoid:** One `hazeSource` per screen, never nested. Limit blur to chrome (tab bar, nav bar, sheet headers), not scrolling content. Avoid progressive blur on iOS pre-iPhone 13. Test on the oldest target device in real hardware. Feature-flag the effect with a solid-translucent fallback.
**Phase:** UI chrome.
---
### Pitfall 13: Navigation-CMP tabs — `when`-switch kills per-tab back stack
**What goes wrong:** Tabs implemented as `when (tab) { 0 -> RecipesScreen()... }`. Tapping into a detail, switching tabs, and returning loses the detail. System back exits the app instead of unwinding the tab.
**Why:** A `when` switch destroys the non-current tab's Compose tree. Jetpack Navigation's multi-back-stack requires either each tab as a destination in a parent NavHost, or per-tab nested `NavHost` instances, with `popUpTo(saveState) + restoreState + launchSingleTop`.
**Warning signs:** Deep-links don't restore; back from a nested screen jumps tabs; ViewModels re-created on tab switches.
**How to avoid:** One top-level `NavHost`; `navigation(route = "recipesGraph", ...)` block per tab. Bottom bar navigates: `popUpTo(graph.findStartDestination().id) { saveState = true }; launchSingleTop = true; restoreState = true`. Scope `koinViewModel()` to the destination's `NavBackStackEntry`, not the parent graph. Wasm deep-links are deferred per PROJECT.md.
**Phase:** UI chrome.
---
### Pitfall 14: Polish locale — plurals and timestamp zones
**What goes wrong:** "added 2 godzina temu" (wrong plural form). Shopping items near midnight show on the wrong day across devices.
**Why:** Polish has four CLDR plural forms (one / few / many / other). Naive `if (n == 1)` handles at most two. Serializing `LocalDateTime` over the wire (instead of UTC `Instant`) produces zone/DST bugs.
**Warning signs:** Grammatically wrong Polish copy; yesterday's items shown as today's.
**How to avoid:** Use Compose Resources `<plurals>` with all four forms; call `pluralStringResource(...)` with the count as the quantity argument. Wire format: `Instant` UTC ISO-8601 only; display: `.toLocalDateTime(TimeZone.currentSystemDefault())`. Unit test plurals with count 0/1/2/5/22.
**Phase:** UI chrome (i18n foundation).
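
To see why a two-way `if (n == 1)` check falls short, the CLDR cardinal rule for Polish integers can be written out by hand. This is an illustration only; `polishPlural` and `hoursAgo` are hypothetical helpers, and production code should rely on the resource system's plural handling rather than this function.

```kotlin
enum class PluralCategory { ONE, FEW, MANY }

// CLDR cardinal rule for Polish, integer counts only.
// The fourth category ("other") applies to fractions and is left out of this sketch.
fun polishPlural(n: Int): PluralCategory {
    require(n >= 0) { "count must be non-negative" }
    val mod10 = n % 10
    val mod100 = n % 100
    return when {
        n == 1 -> PluralCategory.ONE
        // 2-4, 22-24, 32-34, ... but not 12-14
        mod10 in 2..4 && mod100 !in 12..14 -> PluralCategory.FEW
        else -> PluralCategory.MANY
    }
}

// Example copy for "n hours ago".
fun hoursAgo(n: Int): String = when (polishPlural(n)) {
    PluralCategory.ONE -> "$n godzinę temu"
    PluralCategory.FEW -> "$n godziny temu"
    PluralCategory.MANY -> "$n godzin temu"
}

fun main() {
    listOf(0, 1, 2, 5, 22).forEach { println(hoursAgo(it)) }
}
```

The suggested test counts 0/1/2/5/22 cover each integer branch: 0 and 5 land in `MANY`, 1 in `ONE`, and 2 and 22 in `FEW`.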
---
## Technical Debt Patterns
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|---|---|---|---|
| Ad-hoc `psql` DDL, skipping Flyway | Fast schema iteration | Dev/prod drift; can't rebuild from scratch | Pre-first-deploy only; squash into `V1__init.sql` before real data |
| Hardcoded OIDC issuer/client_id in `shared/commonMain` | Avoids build-config plumbing | Can't run against staging Authentik; Authentik change forces rebuild | v1 single-environment only |
| Plain `transaction {}` in admin endpoints | Simpler mental model | Mixing blocking + suspend patterns leaks; eventually every endpoint wants suspend | Admin-only, single-user endpoints |
| Free-form `meal_entry.extras` JSONB without schema | Evolve without migrations | No DB validation; orphan fields accumulate; hard to query | Until extras shape stabilizes; then promote hot fields to columns |
| No indices until queries are slow | Faster early dev | p99 cliffs during sync; adding indices under load is risky | Until first data import; then index every `(household_id, updated_at)` |
## Integration Gotchas
| Integration | Common Mistake | Correct Approach |
|---|---|---|
| Authentik OIDC | Confidential client type with secret shipped in binary | Public client + PKCE S256; never ship `client_secret` |
| Authentik OIDC | Leaving default signing alg; Ktor JWT expects RS256 | Configure RS256 explicitly; verify `kid` resolves via JWKS |
| Haze + Scaffold | `hazeSource` on Scaffold root + `hazeChild` on a sheet both capturing | `hazeSource` on scrollable content only; chrome uses `hazeChild` |
| App Store / TestFlight | ATS exception to reach homelab self-signed cert | Real cert via Let's Encrypt + Caddy/Traefik; never ship ATS exceptions |
| Postgres JSONB | `WHERE extras->>'k' = 'v'` with no GIN index | `CREATE INDEX ... USING GIN (extras jsonb_path_ops)` once access patterns emerge |
## Performance Traps
| Trap | Symptoms | Prevention | When It Breaks |
|---|---|---|---|
| Pull sync without pagination | First-sync-after-seed hangs seconds | Cursor-paginate `LIMIT 200 ORDER BY updated_at, id` | >500 rows in any scoped table |
| Coil full-res images in recipe grid | Memory spikes, laggy scroll | Explicit thumbnail `Size`; memory+disk cache | >30 images on screen |
| Compose recomposition of entire calendar per edit | Calendar flashes on slot change; scroll resets | Stable IDs per slot; hoist per-slot state; `derivedStateOf` for totals | Any calendar with >7 days visible |
| Haze over full scrolling region | Jank on iPhone XR/11 | Blur chrome only, not content; fallback for old devices | Pre-A13 silicon on 60 Hz panels |
## Security Mistakes
| Mistake | Risk | Prevention |
|---|---|---|
| Missing `WHERE household_id = :caller_household` on reads | Cross-household data leak | All scoped reads go through a `HouseholdScope` helper; review rule: no raw `selectAll()` on scoped tables |
| Trusting client-supplied `household_id` in request body | Tenancy bypass via crafted POST | Derive `household_id` from JWT `sub` → `memberships`; ignore body's value |
| Logging the `Authorization` header in Ktor `CallLogging` | Tokens leak to log files → account compromise | Custom log filter redacting `Authorization`; never `log.info(token)` |
| Storing OIDC refresh token in plain prefs | Local/backup exposure | `multiplatform-settings` with Keychain (iOS) / EncryptedSharedPreferences (Android) backends |
## "Looks Done But Isn't" Checklist
- [ ] **Auth:** Login works — verify token refresh runs before expiry (set Authentik access-token lifetime to 5 min in dev; watch for silent 401s)
- [ ] **Sync:** Pull works — verify tombstones propagate (delete on A, confirm gone on B after pull, not just after push)
- [ ] **Sync:** Offline writes survive app kill + relaunch + reconnect — not just a warm resume
- [ ] **Household isolation:** Log in as household B; hit every endpoint; assert zero household A rows returned
- [ ] **SQLDelight migrations:** Install prior release, launch once, upgrade in place; confirm no crash, no data loss
- [ ] **Polish plurals:** Open every screen with counts 0, 1, 2, 5, 22; verify grammar
- [ ] **Haze performance:** Test on oldest supported device (iPhone XR/11) scrolling a full screen; not just simulator
## Pitfall-to-Phase Mapping
| Pitfall | Prevention Phase | Verification |
|---|---|---|
| K/N GC thrash; `objcDisposeOnMain` | UI chrome (infra) | Gradle property set; Instruments shows no GC-main domination |
| Legacy `freeze()` ceremony | Data | Code search for `freeze(`, `@SharedImmutable` returns empty |
| UIViewController re-creation | UI chrome | State survives background/foreground cycle |
| SQLDelight missing migration | Data | Prior-build → new-build upgrade test on real device |
| Blocking Exposed transaction in suspend | Data | No `transaction {` in suspend paths; 50-concurrent-request load test with pool size 10 |
| DAO + JSONB | Data | No `exposed.dao.*` imports; per-JSONB-column round-trip test |
| JWT aud/iss/leeway/JWKS | Auth | Wrong-aud → 401; 30 s skew → 200; JWKS refreshes within 15 min |
| OIDC redirect URI / PKCE | Auth | Flow passes on iOS *and* Android; Authentik logs show `code_challenge` per request |
| LWW client-clock trust | Sync | All writes set `updated_at` server-side; clients never send it |
| Soft-delete recreate race | Sync | Two-client alternating delete/recreate converges |
| Pull-cursor edge cases | Sync | Cursor is `(updated_at, id)` lexicographic; same-timestamp test |
| Haze scroll jank | UI chrome | iPhone 11 real-device FPS >55 on recipe grid scroll |
| Nested NavHost / multi-back-stack | UI chrome | Tab switch preserves deep state; system back unwinds within tab |
| Polish plurals / timestamps | UI chrome | Plural unit tests pass; wire format is UTC-only |
| Household tenancy bypass | Auth + Sync | Cross-household read test asserts empty result sets |
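The JWT row's verification ("wrong-aud → 401; 30 s skew → 200") can be modelled as a pure function, which makes the leeway behaviour easy to unit-test. This is a sketch under stated assumptions: the real server would delegate these checks to ktor-server-auth-jwt, and the type names, issuer URL, audience value and 60-second leeway are hypothetical.

```kotlin
import java.time.Duration
import java.time.Instant

// Illustrative model of the claim checks only; not the library's API.
data class TokenClaims(
    val issuer: String,
    val audience: Set<String>,
    val notBefore: Instant,
    val expiresAt: Instant,
)

sealed interface Verdict {
    object Accepted : Verdict
    data class Rejected(val reason: String) : Verdict
}

/** Checks issuer, audience, and the validity window widened by [leeway]. */
fun validateClaims(
    claims: TokenClaims,
    expectedIssuer: String,
    expectedAudience: String,
    now: Instant,
    leeway: Duration,
): Verdict = when {
    claims.issuer != expectedIssuer -> Verdict.Rejected("issuer mismatch")
    expectedAudience !in claims.audience -> Verdict.Rejected("audience mismatch")
    now.isAfter(claims.expiresAt.plus(leeway)) -> Verdict.Rejected("expired")
    now.isBefore(claims.notBefore.minus(leeway)) -> Verdict.Rejected("not yet valid")
    else -> Verdict.Accepted
}
```

In this model a token that expired 30 seconds ago is accepted with a 60-second leeway and rejected with zero leeway, which is the behaviour the verification column asks a test to demonstrate.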
## Sources
- [Kotlin/Native memory management](https://kotlinlang.org/docs/native-memory-manager.html) (HIGH)
- [Compose Multiplatform for iOS Stable, 2025](https://www.kmpship.app/blog/compose-multiplatform-ios-stable-2025) (MEDIUM)
- [Haze 1.0 release notes — Chris Banes](https://chrisbanes.me/posts/haze-1.0/) (HIGH)
- [Haze Platforms documentation](https://chrisbanes.github.io/haze/latest/platforms/) (HIGH)
- [Navigation in Compose Multiplatform — JetBrains](https://kotlinlang.org/docs/multiplatform/compose-navigation.html) (HIGH)
- [Bottom Nav + Nested Navigation guide](https://saurabhjadhavblogs.com/jetpack-compose-bottom-navigation-nested-navigation-solved) (MEDIUM)
- [Exposed — Working with Transactions](https://www.jetbrains.com/help/exposed/transactions.html) (HIGH)
- [Exposed — JSON/JSONB types](https://www.jetbrains.com/help/exposed/json-and-jsonb-types.html) (HIGH)
- [Exposed — Breaking Changes](https://www.jetbrains.com/help/exposed/breaking-changes.html) (HIGH)
- Community-known K/N + KMP gotchas synthesized from training + surrounding sources (MEDIUM)
---
*Pitfalls research for: Kotlin Multiplatform recipe/meal-planning app with self-hosted Ktor + Postgres + Authentik backend*
*Researched: 2026-04-23*

View File

@@ -0,0 +1,159 @@
# Project Research Summary
**Project:** Recipe (working title) — household meal planner + pantry + shopping list
**Domain:** Mobile (iOS-primary) + self-hosted backend, offline-first collaborative app for a 2-person household
**Researched:** 2026-04-24
**Confidence:** HIGH
## Executive Summary
This is a household-scoped meal-planning app built as a Kotlin Multiplatform client (iOS-primary) against a self-hosted Ktor server, with offline-first operation and last-write-wins sync over HTTP polling. The core value is "my week is planned": the planner is the hero feature, and pantry and shopping exist to reinforce it. The user base is roughly 5-10 authenticated users across a handful of households. The tech stack was locked in a direct discussion rather than derived from research, so the research scope was narrowed to the two areas where novel value was expected: **architecture patterns within the locked stack** and **pitfalls specific to this library combination**.
The recommended approach centers on a **sync-engine-first** architecture: a single Koin-singleton `SyncEngine` owns the outbox, the pull cursor, and the push/pull cycles; repositories write only to SQLDelight + outbox, never to HTTP directly. Every feature in the app sits on top of this spine, which decouples UI/domain from transport concerns and makes offline-first a property of the system rather than a per-feature discipline. Household scope is enforced at **three layers** (client query filter, server `PrincipalResolver` deriving household from JWT `sub`, and a `household_id` column on every tenant-scoped table), because relying on any single layer is a common source of cross-tenant data leaks in multi-tenant apps.
The **highest-risk area** is sync correctness under concurrent household edits. LWW with device-clock timestamps silently loses data when clocks drift; the mitigation is server-assigned `updated_at` for every write, UUIDs (never composite natural keys like `(date, slot)`) as row identity, and a `(updated_at, id)` lexicographic pull cursor with microsecond precision to survive same-millisecond edits. Secondary risks: Kotlin/Native memory/GC on iPhone 12-era devices, Ktor JWT validation leeway and JWKS caching interactions with Authentik, and Haze blur over scrolling content on older iPhones.
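To make the cursor argument concrete, here is a minimal in-memory sketch of a `(updated_at, id)` lexicographic pull cursor. It assumes string ids and a server-assigned timestamp; `Row`, `Cursor`, `CURSOR_START` and `pullPage` are hypothetical names, and a real implementation would express the same comparison in SQL.

```kotlin
import java.time.Instant

// Hypothetical shapes for illustration; updatedAt is assumed to be server-assigned.
data class Row(val id: String, val updatedAt: Instant)

data class Cursor(val updatedAt: Instant, val id: String) : Comparable<Cursor> {
    // Lexicographic order: timestamp first, id as tie-breaker.
    override fun compareTo(other: Cursor): Int =
        compareValuesBy(this, other, { it.updatedAt }, { it.id })
}

val CURSOR_START = Cursor(Instant.EPOCH, "")

/** Returns up to [limit] rows strictly after [after], plus the cursor to resume from. */
fun pullPage(rows: List<Row>, after: Cursor, limit: Int): Pair<List<Row>, Cursor> {
    val page = rows
        .sortedWith(compareBy<Row>({ it.updatedAt }, { it.id }))
        .filter { Cursor(it.updatedAt, it.id) > after }
        .take(limit)
    // An empty page leaves the cursor where it was.
    val next = page.lastOrNull()?.let { Cursor(it.updatedAt, it.id) } ?: after
    return page to next
}
```

With three rows sharing one timestamp and a page size of two, a timestamp-only cursor using a strict `>` comparison would skip the third row on the next pull. The id tie-breaker lets the second page resume inside the same timestamp.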
## Key Findings
### Recommended Stack
Locked via direct discussion (see PROJECT.md § Key Decisions). No research-driven changes required.
**Core technologies:**
- Compose Multiplatform + Jetpack Navigation CMP port — iOS-primary UI, official JetBrains-recommended router
- Koin + SQLDelight + Ktor Client + kotlinx.serialization/datetime + Kermit + Coil 3 + Haze — canonical KMP client stack in 2026
- Ktor Server + Exposed (DSL, not DAO) + Postgres + Flyway + ktor-server-auth-jwt — canonical Kotlin backend
- Authentik OIDC — user's existing homelab identity provider
### Expected Features
Not researched (user explicitly opted to start catalog fresh rather than survey the market). Active v1 requirements captured directly in PROJECT.md § Requirements — four feature pillars: recipe catalog browse, meal planner, pantry, shopping list; plus auth, household sharing, and offline sync foundations.
### Architecture Approach
See `.planning/research/ARCHITECTURE.md` for the full treatment.
**Major components (top to bottom):**
1. **Compose UI + Navigation**: screens observe ViewModel state; navigation via Jetpack Nav CMP with nested NavHosts per tab for independent back stacks
2. **ViewModel layer**: StateFlow of immutable `UiState`, method-per-action pattern, scoped to `NavBackStackEntry` via `koinViewModel()`
3. **Repository layer**: domain-shaped API; reads return SQLDelight queries exposed as Flows (`.asFlow().mapToList(dispatcher)`); writes go to SQLDelight + outbox atomically
4. **SyncEngine (Koin singleton)**: drives outbox drain (push) and pull cursor (poll on foreground + pull-to-refresh + debounced-after-write); owns all HTTP sync traffic
5. **Local DataSources**: thin wrappers over SQLDelight generated queries; one driver per process, with threading handled correctly for the iOS `NativeSqliteDriver`
6. **Remote DataSources**: Ktor Client with JSON negotiation; catalog fetches use HTTP caching; sync endpoints are separate from catalog
7. **Server Ktor routes**: auth-gated via `authenticate("authentik")` (the `jwt("authentik")` provider installed in the `Authentication` plugin); every household-scoped handler routes through a `PrincipalResolver` that looks up membership once
8. **Server DB (Exposed + Postgres + Flyway)**: DSL-only, JSONB for meal-entry extras, `newSuspendedTransaction` for every coroutine-touching handler
### Critical Pitfalls
See `.planning/research/PITFALLS.md` for 14 critical pitfalls + anti-pattern tables. The five most load-bearing:
1. **Sync correctness under concurrent edits**: server-assigned `updated_at`, UUID row identity, lexicographic `(updated_at, id)` pull cursor. Any shortcut here causes silent data loss.
2. **Ktor JWT + Authentik integration**: audience, issuer, clock-skew leeway, and JWKS cache TTL are all configurable, and the defaults fail silently when clocks drift or keys rotate. Explicit configuration is mandatory.
3. **OIDC redirect URI + PKCE**: the redirect URI must match byte-for-byte; mobile clients are public clients, so PKCE is mandatory. A common cause of 400-series auth loops that are opaque without server logs.
4. **Household tenancy derivation**: `household_id` is always derived from the authenticated `sub`, never accepted from the request body. Trusting a client-supplied value is the most direct route to cross-tenant leaks.
5. **iOS infra hygiene**: `kotlin.native.binary.objcDisposeOnMain=false`, `gc=cms`, single `ComposeUIViewController` instance, Haze on chrome only (never over scrolling content). Cheap to bake in on day 1; painful to retrofit when iPhone XR/11 users complain about jank.
## Implications for Roadmap
The architecture research suggests an explicit **foundation-first** build order. The pitfalls research reinforces this by surfacing that most high-cost mistakes live in the foundation (sync engine, auth validation, household scope), not the feature layer. Suggested phase skeleton:
### Phase 1: Project infrastructure + module wiring
**Rationale:** One-time setup that blocks everything. Convention plugins, version catalog, Koin bootstrap, shared DTOs module, iOS target config with binary flags (`objcDisposeOnMain`, `gc=cms`), server Gradle setup, Flyway plumbing.
**Delivers:** A running but empty composeApp (iOS + Android) + running but unrouted Ktor server.
**Avoids:** Retrofit cost on iOS memory flags and Gradle conventions mid-project.
### Phase 2: Authentication foundation
**Rationale:** Nothing else can be built without authenticated principals. Blocks sync, household data, all CRUD.
**Delivers:** Client OIDC flow (AppAuth on Android, ASWebAuthenticationSession on iOS) to Authentik → access token stored. Server ktor-server-auth-jwt validates token against Authentik JWKS. Protected `/api/v1/me` endpoint returns user.
**Avoids:** Pitfalls 7 + 8 (JWT validation misconfig, redirect URI/PKCE errors).
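For reference when reading Authentik logs for `code_challenge`, the S256 PKCE transform from RFC 7636 is small enough to sketch. This is JVM-only Kotlin (it uses `java.security` and `java.util.Base64`), so it illustrates the transform and is not code for `commonMain`. OIDC client libraries such as AppAuth normally generate these values themselves, and the function names here are hypothetical.

```kotlin
import java.nio.charset.StandardCharsets
import java.security.MessageDigest
import java.security.SecureRandom
import java.util.Base64

// base64url without '=' padding, as RFC 7636 requires.
val pkceEncoder: Base64.Encoder = Base64.getUrlEncoder().withoutPadding()

/** Generates a high-entropy code verifier: 32 random bytes, base64url-encoded (43 chars). */
fun newCodeVerifier(random: SecureRandom = SecureRandom()): String {
    val bytes = ByteArray(32)
    random.nextBytes(bytes)
    return pkceEncoder.encodeToString(bytes)
}

/** code_challenge = BASE64URL(SHA-256(ASCII(code_verifier))) for method S256. */
fun codeChallengeS256(verifier: String): String {
    val digest = MessageDigest.getInstance("SHA-256")
        .digest(verifier.toByteArray(StandardCharsets.US_ASCII))
    return pkceEncoder.encodeToString(digest)
}
```

The client sends `code_challenge` with `code_challenge_method=S256` on the authorization request and the original verifier on the token request; the server recomputes the hash and compares.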
### Phase 3: Households + membership + server data foundation
**Rationale:** Every feature table needs `household_id`. Introducing this after feature tables exist is a migration nightmare.
**Delivers:** `users`, `households`, `memberships`, `invites` tables + Flyway migrations. `PrincipalResolver` that maps JWT `sub` → `household_id`. Endpoints for creating a household, generating invite codes, accepting invites. Client auth session now includes `household_id`.
**Avoids:** Tenant-scope leaks, cross-household data bugs.
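The tenancy rule for this phase can be sketched without Ktor or Exposed. The example assumes one household per user and replaces the `memberships` table with an in-memory map. All class names other than `PrincipalResolver` are hypothetical, and the pantry feature stands in for any household-scoped table.

```kotlin
// In-memory stand-ins; a real resolver would query the memberships table.
data class HouseholdPrincipal(val userSub: String, val householdId: String)

class PrincipalResolver(private val householdBySub: Map<String, String>) {
    /** Returns null when the subject belongs to no household (caller should answer 403). */
    fun resolve(jwtSub: String): HouseholdPrincipal? =
        householdBySub[jwtSub]?.let { HouseholdPrincipal(jwtSub, it) }
}

data class PantryItem(val id: String, val householdId: String, val name: String)

// The request body deliberately has no householdId field to trust.
data class CreatePantryItemRequest(val id: String, val name: String)

class PantryService {
    private val rows = mutableListOf<PantryItem>()

    /** The household comes from the resolved principal, never from the request. */
    fun create(principal: HouseholdPrincipal, req: CreatePantryItemRequest): PantryItem {
        val item = PantryItem(req.id, principal.householdId, req.name)
        rows += item
        return item
    }

    /** Every read is filtered by the caller's household. */
    fun list(principal: HouseholdPrincipal): List<PantryItem> =
        rows.filter { it.householdId == principal.householdId }
}
```

Because the request type has no household field, a handler cannot pass a client-chosen household through by accident. The cross-household read test then reduces to creating rows as one principal and asserting that another principal's list is empty.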
### Phase 4: Sync engine skeleton
**Rationale:** Second-hardest piece after auth. Must exist before any feature-specific data can be synced. Built on a trivial first table (e.g., a `notes` sentinel table or the `households` metadata).
**Delivers:** SyncEngine interface + implementation. Outbox schema in SQLDelight with `id`, `table`, `row_id`, `op`, `payload`, `created_at`. `/api/v1/sync/push` and `/api/v1/sync/pull` endpoints. Polling scheduler + pull-to-refresh + debounced after-write trigger. Cursor persistence. Mock table round-trips.
**Avoids:** Pitfalls 9, 10, 11 (LWW timestamp sources, delete-recreate races, cursor edge cases).
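A rough in-memory model of the outbox contract may help when planning this phase. It mirrors the listed columns (`id`, `table`, `row_id`, `op`, `payload`, `created_at`). The in-order drain that stops at the first failure is an assumption, since retry and ordering policy is still flagged for research. `LocalStore` and `drain` are hypothetical names.

```kotlin
enum class Op { UPSERT, DELETE }

data class OutboxEntry(
    val id: Long,
    val table: String,
    val rowId: String,
    val op: Op,
    val payload: String,
    val createdAt: Long,
)

// Stand-in for the local database; in the app both writes in each method
// would sit inside a single SQLDelight transaction.
class LocalStore {
    val rows = mutableMapOf<Pair<String, String>, String>() // (table, rowId) -> payload
    val outbox = ArrayDeque<OutboxEntry>()
    private var nextId = 1L

    fun upsert(table: String, rowId: String, payload: String, now: Long) {
        rows[table to rowId] = payload
        outbox.addLast(OutboxEntry(nextId++, table, rowId, Op.UPSERT, payload, now))
    }

    fun delete(table: String, rowId: String, now: Long) {
        rows.remove(table to rowId)
        outbox.addLast(OutboxEntry(nextId++, table, rowId, Op.DELETE, "", now))
    }
}

/** Pushes entries oldest-first; stops at the first failure so order is preserved. */
fun drain(store: LocalStore, push: (OutboxEntry) -> Boolean): Int {
    var sent = 0
    while (true) {
        val head = store.outbox.firstOrNull() ?: break
        if (!push(head)) break
        store.outbox.removeFirst()
        sent++
    }
    return sent
}
```

While offline, `push` returns `false` and nothing leaves the queue. Once the connection returns, the same entries drain in their original order, which is what the "survives app kill + relaunch + reconnect" check exercises when the queue is persisted.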
### Phase 5: Recipe catalog (read path)
**Rationale:** Read-mostly, simpler sync (no writes from client), teaches Exposed + SQLDelight + Ktor end-to-end. Seeds the rest of the app with real data to develop against.
**Delivers:** `recipes`, `ingredients`, `products` tables on server (Flyway migrations). Seed data mechanism (SQL fixtures or admin CLI). Catalog Ktor routes. Client-side catalog cache in SQLDelight with pull-only sync. RecipeListScreen + RecipeDetailScreen reading from local cache.
**Avoids:** Sync-strategy-by-accident (catalog uses different cache rules than household data).
### Phase 6: Meal planner (hero write path)
**Rationale:** Core value. Exercises the full write path: optimistic local write + outbox + sync.
**Delivers:** `plan_entry` table (server + client) with `household_id` + `updated_at` + `deleted_at`. Planner calendar UI. Add/remove/edit meal flows. Nutrition totals computed client-side from catalog + plan.
**Avoids:** Sync-correctness pitfalls (item 1 under Critical Pitfalls); by this point the foundation is already in place.
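As an illustration of "nutrition totals computed client-side from catalog + plan", the sketch below sums per-serving values from a catalog map over the plan entries that are not tombstoned. The field names (`kcal`, `proteinG`, `servings`) and the per-serving model are assumptions made for the example, not the project's schema.

```kotlin
// Hypothetical per-serving nutrition values from the recipe catalog.
data class Nutrition(val kcal: Int, val proteinG: Double) {
    operator fun plus(other: Nutrition) =
        Nutrition(kcal + other.kcal, proteinG + other.proteinG)
}

// deletedAt != null marks a tombstone (soft delete) that sync still propagates.
data class PlanEntry(
    val id: String,
    val recipeId: String,
    val servings: Int,
    val deletedAt: Long? = null,
)

/** Sums nutrition over live entries; entries with unknown recipes are skipped. */
fun dailyTotals(entries: List<PlanEntry>, catalog: Map<String, Nutrition>): Nutrition =
    entries.asSequence()
        .filter { it.deletedAt == null }
        .mapNotNull { entry ->
            catalog[entry.recipeId]?.let { n ->
                Nutrition(n.kcal * entry.servings, n.proteinG * entry.servings)
            }
        }
        .fold(Nutrition(0, 0.0)) { acc, n -> acc + n }
```

Filtering on `deletedAt` at read time keeps tombstones in the local table for sync while hiding them from totals and from the planner UI.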
### Phase 7: Pantry
**Rationale:** Second household-scoped feature. Reuses all of the plumbing from Phase 6. Validates that sync foundation generalizes.
### Phase 8: Shopping list + session log
**Rationale:** Computed view over pantry + plan + a small session table. Ties the three data sources together.
### Phase 9: UI chrome with Haze liquid-glass approximation
**Rationale:** Intentionally late. Earlier phases use boring default chrome to avoid blocking on design. Now swap in Haze-based nav and tab bars, glassy cards, dark mode polish. Measurable real-device perf can be validated against real data from Phase 6-8.
**Avoids:** Pitfall 12 (Haze perf regressions — easier to measure once data volume is realistic).
### Phase 10: Polish, Polish localization infra, iOS deployment
**Rationale:** Externalized strings with Polish copy, locale-aware date formatting (local display only — wire stays UTC), Bundle ID + privacy manifests, TestFlight distribution to partner.
### Phase 11 (optional, post-v1): Recipe authoring in-app
**Rationale:** Explicitly deferred in PROJECT.md. v1 seeds the catalog server-side instead (e.g., via migrations; the exact mechanism is settled in Phase 5).
### Phase Ordering Rationale
- **Auth → Households → SyncEngine → features**: each layer enables the next; skipping one saves a week up front and costs a month later (from ARCHITECTURE.md § Build Order).
- **Catalog (read) before planner (write)**: reads are simpler and catch sync-pull bugs in isolation before writes introduce push/outbox/conflict bugs.
- **UI chrome last**: Haze perf is measurable only with realistic data; design iteration shouldn't block data-layer correctness.
### Research Flags
Phases likely needing deeper phase-level research during planning:
- **Phase 2 (Auth):** Authentik-specific OIDC provider setup steps; AppAuth vs custom iOS wrapper tradeoffs; token refresh behavior
- **Phase 4 (SyncEngine):** Concrete cursor format, outbox schema ordering guarantees, retry/backoff policy
- **Phase 9 (UI chrome):** Current Haze perf benchmarks on CMP iOS; liquid-glass approximation design patterns
Phases with well-trodden paths (minimal phase-level research needed):
- **Phase 1 (Infra):** Convention plugins + version catalog is well-documented
- **Phase 5 (Catalog read):** Basic CRUD + cache pattern
- **Phases 6-8 (features):** Once foundation is in place, these follow the architecture patterns directly
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | Locked in direct discussion with tradeoff analysis per library |
| Features | HIGH (scope-bounded) | User defined v1 directly in PROJECT.md; no market research done, intentionally |
| Architecture | HIGH | Research agent produced concrete patterns specific to the locked stack |
| Pitfalls | HIGH | 14 specific pitfalls with library-level detail; covers all major risk areas |
**Overall confidence:** HIGH for entering the roadmap phase.
### Gaps to Address
- **Authentik-specific OIDC flow details** — the research documented WHAT to get right (JWT validation, PKCE, redirect URIs) but not Authentik's specific UI/config steps. Resolve during Phase 2 planning.
- **Mobile OIDC library choice for iOS** — PROJECT.md notes "ASWebAuthenticationSession wrapper" with no specific KMP wrapper library recommended. Resolve during Phase 2 planning.
- **Haze current CMP-iOS perf characteristics on iPhone 12-era hardware** — needs real-device measurement, not research. Covered by Phase 9.
- **Seed-data mechanism** — "SQL migrations, JSON fixtures, or CLI tool" listed as options in PROJECT.md § Constraints. Resolve during Phase 5 planning.
## Sources
### Primary (HIGH confidence)
- `.planning/PROJECT.md` — authoritative product scope + locked stack
- `.planning/research/ARCHITECTURE.md` — agent-researched, ~1900 words, 7 sections
- `.planning/research/PITFALLS.md` — agent-researched, ~2800 words, 14 critical pitfalls + tables
### Secondary
- Direct discussion transcript (April 2026) — tech-stack tradeoff conversation that led to PROJECT.md decisions
- [Navigation in Compose Multiplatform](https://kotlinlang.org/docs/multiplatform/compose-navigation.html)
- [Kotlin/Native memory management](https://kotlinlang.org/docs/native-memory-manager.html)
- [Haze 1.0 — Chris Banes](https://chrisbanes.me/posts/haze-1.0/)
- [Exposed — Transactions](https://www.jetbrains.com/help/exposed/transactions.html)
- [Exposed — JSON/JSONB types](https://www.jetbrains.com/help/exposed/json-and-jsonb-types.html)
---
*Research completed: 2026-04-24*
*Ready for roadmap: yes*