diff --git a/.gitignore b/.gitignore
index 726acab4..87b98195 100644
--- a/.gitignore
+++ b/.gitignore
@@ -189,3 +189,4 @@ apps/web/src/dataconnect-generated/
 AGENTS.md
 CLAUDE.md
 GEMINI.md
+TASKS.md
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 00000000..efc3ec16
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,9 @@
+# KROW Workforce Change Log
+
+| Date | Version | Change |
+|---|---|---|
+| 2026-02-24 | 0.1.0 | Confirmed dev owner access and current runtime baseline in `krow-workforce-dev`. |
+| 2026-02-24 | 0.1.1 | Added backend foundation implementation plan document. |
+| 2026-02-24 | 0.1.2 | Added API implementation contract and transition route aliases. |
+| 2026-02-24 | 0.1.3 | Added auth-first security policy with deferred role-map integration hooks. |
+| 2026-02-24 | 0.1.4 | Locked defaults for idempotency, validation, bucket split, model provider, and p95 objectives. |
diff --git a/docs/MILESTONES/M4/planning/m4-api-catalog.md b/docs/MILESTONES/M4/planning/m4-api-catalog.md
new file mode 100644
index 00000000..c8e02353
--- /dev/null
+++ b/docs/MILESTONES/M4/planning/m4-api-catalog.md
@@ -0,0 +1,228 @@
+# M4 API Catalog (Implementation Contract)
+
+Status: Draft
+Date: 2026-02-24
+Owner: Technical Lead
+Environment: dev
+
+## 1) Scope and purpose
+This file defines the backend endpoint contract for the M4 foundation build.
+
+## 2) Global API rules
+1. Canonical route groups:
+- `/core/*` for foundational integration routes
+- `/commands/*` for business-critical writes
+2. Foundation phase security model:
+- authenticated user required
+- role map enforcement deferred
+- policy hook required in handler design
+3. Standard error envelope:
+```json
+{
+  "code": "STRING_CODE",
+  "message": "Human readable message",
+  "details": {},
+  "requestId": "optional-request-id"
+}
+```
+4. Required request headers:
+- `Authorization: Bearer `
+- `X-Request-Id: ` (optional but recommended)
+5. Required response headers:
+- `X-Request-Id`
+6. Validation:
+- all input validated server-side
+- reject unknown/invalid fields
+7. Logging:
+- route
+- requestId
+- actorId
+- latencyMs
+- outcome
+8. Timeouts and retries:
+- command writes must be retry-safe
+- use idempotency keys for command write routes
+9. Idempotency storage:
+- store in Cloud SQL table
+- key scope: `userId + route + idempotencyKey`
+- key retention: 24 hours
+- repeated key returns original response payload
+
+## 3) Compatibility aliases (transition)
+1. `POST /uploadFile` -> `POST /core/upload-file`
+2. `POST /createSignedUrl` -> `POST /core/create-signed-url`
+3. `POST /invokeLLM` -> `POST /core/invoke-llm`
+
+## 4) Rate-limit baseline (initial)
+1. `/core/invoke-llm`: 60 requests per minute per user
+2. `/core/upload-file`: 30 requests per minute per user
+3. `/core/create-signed-url`: 120 requests per minute per user
+4. `/commands/*`: 60 requests per minute per user
+
+## 4.1 Timeout baseline (initial)
+1. `/core/invoke-llm`: 20-second hard timeout
+2. other `/core/*` routes: 10-second timeout
+3. `/commands/*` routes: 15-second timeout
+
+## 5) Core routes
+
+## 5.1 Upload file
+1. Method and route: `POST /core/upload-file`
+2. Auth: required
+3. Idempotency key: optional
+4. Request: multipart form data
+- `file` (required)
+- `category` (optional)
+- `visibility` (optional: `public` or `private`)
+5. Success `200`:
+```json
+{
+  "fileUri": "gs://bucket/path/file.ext",
+  "contentType": "application/pdf",
+  "size": 12345,
+  "bucket": "krow-uploads-private",
+  "path": "documents/staff/..."
+}
+```
+6. Errors:
+- `UNAUTHENTICATED`
+- `INVALID_FILE_TYPE`
+- `FILE_TOO_LARGE`
+- `UPLOAD_FAILED`
+
+## 5.2 Create signed URL
+1. Method and route: `POST /core/create-signed-url`
+2. Auth: required
+3. Idempotency key: optional
+4. Request:
+```json
+{
+  "fileUri": "gs://bucket/path/file.ext",
+  "expiresInSeconds": 300
+}
+```
+5. Success `200`:
+```json
+{
+  "signedUrl": "https://...",
+  "expiresAt": "2026-02-24T15:00:00Z"
+}
+```
+6. Errors:
+- `UNAUTHENTICATED`
+- `FORBIDDEN_FILE_ACCESS`
+- `INVALID_EXPIRES_IN`
+- `SIGN_URL_FAILED`
+
+## 5.3 Invoke model
+1. Method and route: `POST /core/invoke-llm`
+2. Auth: required
+3. Idempotency key: optional
+4. Request:
+```json
+{
+  "prompt": "...",
+  "responseJsonSchema": {},
+  "fileUrls": []
+}
+```
+5. Success `200`:
+```json
+{
+  "result": {},
+  "model": "provider/model-name",
+  "latencyMs": 980
+}
+```
+6. Errors:
+- `UNAUTHENTICATED`
+- `INVALID_SCHEMA`
+- `MODEL_TIMEOUT`
+- `MODEL_FAILED`
+7. Provider default:
+- Vertex AI Gemini
+
+## 5.4 Health check
+1. Method and route: `GET /healthz`
+2. Auth: optional (internal policy)
+3. Success `200`:
+```json
+{
+  "ok": true,
+  "service": "krow-backend",
+  "version": "commit-or-tag"
+}
+```
+
+## 5.5 Storage bucket policy defaults (dev)
+1. Public bucket: `krow-workforce-dev-public`
+2. Private bucket: `krow-workforce-dev-private`
+3. Private objects are never returned directly; only signed URLs are returned.
+
+## 6) Command routes (wave 1)
+
+## 6.1 Create order
+1. Method and route: `POST /commands/orders/create`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: create order + shifts + roles atomically
+5. Replaces:
+- `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx`
+- `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart`
+- `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart`
+
+## 6.2 Update order
+1. Method and route: `POST /commands/orders/{orderId}/update`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: policy-safe multi-entity order update
+5. Replaces:
+- `apps/web/src/features/operations/orders/EditOrder.tsx`
+- `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart`
+
+## 6.3 Cancel order
+1. Method and route: `POST /commands/orders/{orderId}/cancel`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: enforce cancellation policy and return explicit conflict code
+5. Replaces:
+- `apps/web/src/features/operations/orders/OrderDetail.tsx`
+
+## 6.4 Change shift status
+1. Method and route: `POST /commands/shifts/{shiftId}/change-status`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: enforce state transitions server-side
+5. Replaces:
+- `apps/web/src/features/operations/tasks/TaskBoard.tsx`
+
+## 6.5 Assign staff
+1. Method and route: `POST /commands/shifts/{shiftId}/assign-staff`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: assign + count update + conflict checks atomically
+5. Replaces:
+- `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx`
+
+## 6.6 Accept shift
+1. Method and route: `POST /commands/shifts/{shiftId}/accept`
+2. Auth: required
+3. Idempotency key: required
+4. Purpose: application + counters + rollback-safe behavior in one command
+5. Replaces:
+- `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart`
+
+## 7) Locked defaults before coding starts
+1. Idempotency keys are stored in Cloud SQL with 24-hour retention.
+2. Request validation library is `zod`.
+3. Validation schema location is `backend//src/contracts/`.
+4. Storage buckets are:
+- `krow-workforce-dev-public`
+- `krow-workforce-dev-private`
+5. Model provider is Vertex AI Gemini with a 20-second timeout for `/core/invoke-llm`.
+
+## 8) Target response-time objectives (p95)
+1. `/healthz` under 200ms
+2. `/core/create-signed-url` under 500ms
+3. `/commands/*` under 1500ms
+4. `/core/invoke-llm` under 15000ms
diff --git a/docs/MILESTONES/M4/planning/m4-backend-foundation-implementation-plan.md b/docs/MILESTONES/M4/planning/m4-backend-foundation-implementation-plan.md
new file mode 100644
index 00000000..31be4add
--- /dev/null
+++ b/docs/MILESTONES/M4/planning/m4-backend-foundation-implementation-plan.md
@@ -0,0 +1,269 @@
+# M4 Backend Foundation Implementation Plan (Dev First)
+
+Date: 2026-02-24
+Owner: Wilfred (Technical Lead)
+Primary environment: `krow-workforce-dev`
+
+## 1) Objective
+Build a secure, modular, and scalable backend foundation in `dev` without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.
+
+## 2) First-principles architecture rules
+1. Client apps are untrusted for business-critical writes.
+2. Backend is the enforcement layer for validation, permissions, and write orchestration.
+3. Multi-entity writes must be atomic, idempotent, and observable.
+4. Configuration and deployment must be reproducible by automation.
+5. Migration must be backward-compatible until each frontend flow is cut over.
+
+## 3) Pre-coding gates (must be true before implementation starts)
+
+## Gate A: Security boundary
+1. Frontend sends Firebase token only. No database credentials in client code.
+2. Every new backend endpoint validates Firebase token.
+3. Data Connect write access strategy is defined:
+- keep simple reads available to client
+- route high-risk writes through backend command endpoints
+4. Upload and signed URL paths are server-controlled.
+
+## Gate B: Contract standards
+1. Standard error envelope is frozen:
+```json
+{
+  "code": "STRING_CODE",
+  "message": "Human readable message",
+  "details": {},
+  "requestId": "optional-request-id"
+}
+```
+2. Request validation layer is chosen and centralized.
+3. Route naming strategy is frozen:
+- canonical routes under `/core` and `/commands`
+- compatibility aliases preserved during migration (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
+4. Validation standard is locked:
+- library: `zod`
+- schema location: `backend//src/contracts/` with `core/` and `commands/` subfolders
+
+## Gate C: Atomicity and reliability
+1. Command endpoints support idempotency keys for retry-safe writes.
+2. Multi-step write flows are wrapped in single backend transaction boundaries.
+3. Domain conflict codes are defined for expected business failures.
+4. Idempotency storage is locked:
+- store in Cloud SQL table
+- key scope: `userId + route + idempotencyKey`
+- retain records for 24 hours
+- repeated key returns original response
+
+## Gate D: Automation and operability
+1. Makefile is source of truth for backend setup and deploy in dev.
+2. Core deploy and smoke test commands exist before feature migration.
+3. Logging format and request tracing fields are standardized.
+
+## 4) Security baseline for foundation phase
+
+## 4.1 Authentication and authorization
+1. Foundation phase is authentication-first.
+2. Role-based access control is intentionally deferred.
+3. All handlers include a policy hook for future role checks (`can(action, resource, actor)`).
+
+## 4.2 Data access control model
+1. Client retains Data Connect reads required for existing screens.
+2. High-risk writes move behind `/commands/*` endpoints.
+3. Backend mediates write interactions with Data Connect and Cloud SQL.
+
+## 4.3 File and URL security
+1. Validate file type and size server-side.
+2. Separate public and private storage behavior.
+3. Signed URL creation checks ownership/prefix scope and expiry limits.
+4. Bucket policy split is locked:
+- `krow-workforce-dev-public`
+- `krow-workforce-dev-private`
+- private bucket access only through signed URL
+
+## 4.4 Model invocation safety
+1. Enforce schema-constrained output.
+2. Apply per-user rate limits and request timeout.
+3. Log model failures with safe redaction (no sensitive prompt leakage in logs).
+4. Model provider and timeout defaults are locked:
+- provider: Vertex AI Gemini
+- max route timeout: 20 seconds
+- timeout error code: `MODEL_TIMEOUT`
+
+## 4.5 Secrets and credentials
+1. Runtime secrets come from Secret Manager only.
+2. Service accounts use least-privilege roles.
+3. No secrets committed in repository files.
+
+## 5) Modularity baseline
+
+## 5.1 Backend module boundaries
+1. `core` module: upload, signed URL, model invocation, health.
+2. `commands` module: business writes and state transitions.
+3. `policy` module: validation and future role checks.
+4. `data` module: Data Connect adapters and transaction wrappers.
+5. `infra` module: logging, tracing, auth middleware, error mapping.
+
+## 5.2 Contract separation
+1. Keep API request/response schemas in one location.
+2. Keep domain errors in one registry file.
+3. Keep route declarations thin; business logic in services.
+
+## 5.3 Cloud runtime roles
+1. Cloud Run is the primary command and core API execution layer.
+2. Cloud Functions v2 is worker-only in this phase:
+- upload-related async handlers
+- notification jobs
+- model-related async helpers when needed
+
+## 6) Automation baseline
+
+## 6.1 Makefile requirements
+Add `makefiles/backend.mk` and wire it into root `Makefile` with at least:
+1. `make backend-enable-apis`
+2. `make backend-bootstrap-dev`
+3. `make backend-deploy-core`
+4. `make backend-deploy-commands`
+5. `make backend-deploy-workers`
+6. `make backend-smoke-core`
+7. `make backend-smoke-commands`
+8. `make backend-logs-core`
+
+## 6.2 CI requirements
+1. Backend lint
+2. Backend tests
+3. Build/package
+4. Smoke test against deployed dev route(s)
+5. Block merge on failed checks
+
+## 6.3 Session hygiene
+1. Update `TASKS.md` and `CHANGELOG.md` each working session.
+2. If a new service/API is added, Makefile target must be added in same change.
+
+## 7) Migration safety contract (no frontend breakage)
+1. Backend routes ship first.
+2. Frontend migration is per-feature wave, not big bang.
+3. Keep compatibility aliases until clients migrate.
+4. Keep existing Data Connect reads during foundation.
+5. For each migrated write flow:
+- before/after behavior checklist
+- rollback path
+- smoke verification
+
+## 8) Scope for foundation build
+1. Backend runtime/deploy foundation in dev.
+2. Core endpoints:
+- `POST /core/upload-file`
+- `POST /core/create-signed-url`
+- `POST /core/invoke-llm`
+- `GET /healthz`
+3. Compatibility aliases:
+- `POST /uploadFile`
+- `POST /createSignedUrl`
+- `POST /invokeLLM`
+4. Command layer scaffold for first migration routes.
+5. Initial migration of highest-risk write paths.
+
+## 9) Implementation phases
+
+## Phase 0: Baseline and contracts
+Deliverables:
+1. Freeze endpoint naming and compatibility aliases.
+2. Freeze error envelope and error code registry.
+3. Freeze auth middleware interface and policy hook interface.
+4. Publish route inventory from web/mobile direct writes.
+
+Exit criteria:
+1. No unresolved contract ambiguity.
+2. Team agrees on auth-first now and role-map-later approach.
+
+## Phase 1: Backend infra and automation
+Deliverables:
+1. `makefiles/backend.mk` with bootstrap, deploy, smoke, logs targets.
+2. Environment templates for backend runtime config.
+3. Secret Manager and service account setup automation.
+
+Exit criteria:
+1. A fresh machine can deploy core backend to dev via Make commands.
+
+## Phase 2: Core endpoint implementation
+Deliverables:
+1. `/core/upload-file`
+2. `/core/create-signed-url`
+3. `/core/invoke-llm`
+4. `/healthz`
+5. Compatibility aliases (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
+
+Exit criteria:
+1. API harness passes for core routes.
+2. Error, logging, and auth standards are enforced.
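The compatibility aliases listed in the Phase 2 deliverables can be sketched as a small lookup that rewrites a legacy path to its canonical `/core/*` route before dispatch. This is a minimal illustration, not the project's router: the names `ROUTE_ALIASES` and `resolveCanonicalRoute` are hypothetical, and only the three alias pairs come from the plan.

```typescript
// Hypothetical alias table: legacy path -> canonical path.
// The three pairs are taken from the plan; the table itself is an assumption.
const ROUTE_ALIASES: ReadonlyMap<string, string> = new Map([
  ["/uploadFile", "/core/upload-file"],
  ["/createSignedUrl", "/core/create-signed-url"],
  ["/invokeLLM", "/core/invoke-llm"],
]);

// Returns the canonical path for a legacy alias, or the input unchanged
// when the path is already canonical or is not an alias at all.
function resolveCanonicalRoute(path: string): string {
  return ROUTE_ALIASES.get(path) ?? path;
}

// Example: a legacy client call and a canonical call reach the same handler key.
const legacy = resolveCanonicalRoute("/invokeLLM");
const canonical = resolveCanonicalRoute("/core/invoke-llm");
console.log(legacy === canonical);
```

Keeping the aliases in one table would let them be removed in a single change once clients have migrated, which matches the plan's rule to keep aliases only until cutover.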
+
+## Phase 3: Command layer scaffold
+Deliverables:
+1. `/commands/orders/create`
+2. `/commands/orders/{orderId}/cancel`
+3. `/commands/orders/{orderId}/update`
+4. `/commands/shifts/{shiftId}/change-status`
+5. `/commands/shifts/{shiftId}/assign-staff`
+6. `/commands/shifts/{shiftId}/accept`
+
+Exit criteria:
+1. High-risk writes have backend command alternatives ready.
+
+## Phase 4: Wave 1 frontend migration
+Deliverables:
+1. Replace direct writes in selected web/mobile flows.
+2. Keep reads stable.
+3. Verify no regressions in non-migrated screens.
+
+Exit criteria:
+1. Migrated flows run through backend commands only.
+2. Rollback instructions validated.
+
+## Phase 5: Hardening and handoff
+Deliverables:
+1. Runbook for deploy, rollback, and smoke.
+2. Backend CI pipeline active.
+3. Wave 2 and wave 3 migration task list defined.
+
+Exit criteria:
+1. Foundation is reusable for staging/prod with environment changes only.
+
+## 10) Wave 1 migration inventory (real call sites)
+
+Web:
+1. `apps/web/src/features/operations/tasks/TaskBoard.tsx:100`
+2. `apps/web/src/features/operations/orders/OrderDetail.tsx:145`
+3. `apps/web/src/features/operations/orders/EditOrder.tsx:84`
+4. `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31`
+5. `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60`
+6. `apps/web/src/features/workforce/documents/DocumentVault.tsx:99`
+
+Mobile:
+1. `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232`
+2. `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195`
+3. `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68`
+4. `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446`
+5. `apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257`
+6. `apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51`
+
+## 11) Definition of done for foundation
+1. Core endpoints deployed in dev and validated.
+2. Command scaffolding in place for wave 1 writes.
+3. Auth-first protection active on all new routes.
+4. Idempotency + transaction model defined for command writes.
+5. Makefile and CI automation cover bootstrap/deploy/smoke paths.
+6. Frontend remains stable during migration.
+7. Role-map integration points are documented for next phase.
+
+## 12) Locked defaults (approved)
+1. Idempotency key storage strategy:
+- Cloud SQL table, 24-hour retention, keyed by `userId + route + idempotencyKey`.
+2. Validation library and schema location:
+- `zod` in `backend//src/contracts/` (`core/`, `commands/`).
+3. Storage bucket naming and split:
+- `krow-workforce-dev-public` and `krow-workforce-dev-private`.
+4. Model provider and timeout:
+- Vertex AI Gemini, 20-second max timeout.
+5. Target response-time objectives (p95):
+- `/healthz` under 200ms
+- `/core/create-signed-url` under 500ms
+- `/commands/*` under 1500ms
+- `/core/invoke-llm` under 15000ms
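The locked idempotency default (keyed by `userId + route + idempotencyKey`, 24-hour retention, repeated key returns the original response) can be illustrated with a small sketch. It makes several assumptions: an in-memory `Map` stands in for the Cloud SQL table, the handler is synchronous, and the names `InMemoryIdempotencyStore`, `StoredResponse`, and `execute` are hypothetical rather than part of the plan.

```typescript
// Hypothetical response shape; the plan only says the original response is replayed.
interface StoredResponse {
  status: number;
  body: unknown;
}

interface IdempotencyRecord {
  response: StoredResponse;
  expiresAtMs: number;
}

// 24-hour retention, per the locked defaults.
const RETENTION_MS = 24 * 60 * 60 * 1000;

// Assumption: an in-memory Map stands in for the Cloud SQL table.
class InMemoryIdempotencyStore {
  private readonly records = new Map<string, IdempotencyRecord>();

  // Key scope is userId + route + idempotencyKey. JSON-encoding the tuple
  // avoids collisions that naive string concatenation could produce.
  private static scopeKey(userId: string, route: string, idempotencyKey: string): string {
    return JSON.stringify([userId, route, idempotencyKey]);
  }

  // Runs `handler` at most once per scoped key within the retention window;
  // a repeated key returns the originally stored response instead.
  execute(
    userId: string,
    route: string,
    idempotencyKey: string,
    handler: () => StoredResponse,
    nowMs: number = Date.now(),
  ): StoredResponse {
    const key = InMemoryIdempotencyStore.scopeKey(userId, route, idempotencyKey);
    const existing = this.records.get(key);
    if (existing !== undefined && existing.expiresAtMs > nowMs) {
      return existing.response;
    }
    const response = handler();
    this.records.set(key, { response, expiresAtMs: nowMs + RETENTION_MS });
    return response;
  }
}
```

This sketch does not cover two requests with the same key arriving concurrently. A Cloud SQL implementation would likely rely on a unique constraint over the scoped key, but that detail is not specified in the plan.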