docs: lock backend foundation plan and tracking format

zouantchaw
2026-02-23 21:41:44 -05:00
parent f39809867d
commit e81eab1165
4 changed files with 507 additions and 0 deletions

1
.gitignore vendored

@@ -189,3 +189,4 @@ apps/web/src/dataconnect-generated/
AGENTS.md
CLAUDE.md
GEMINI.md
TASKS.md

9
CHANGELOG.md Normal file

@@ -0,0 +1,9 @@
# KROW Workforce Change Log
| Date | Version | Change |
|---|---|---|
| 2026-02-24 | 0.1.0 | Confirmed dev owner access and current runtime baseline in `krow-workforce-dev`. |
| 2026-02-24 | 0.1.1 | Added backend foundation implementation plan document. |
| 2026-02-24 | 0.1.2 | Added API implementation contract and transition route aliases. |
| 2026-02-24 | 0.1.3 | Added auth-first security policy with deferred role-map integration hooks. |
| 2026-02-24 | 0.1.4 | Locked defaults for idempotency, validation, bucket split, model provider, and p95 objectives. |


@@ -0,0 +1,228 @@
# M4 API Catalog (Implementation Contract)
Status: Draft
Date: 2026-02-24
Owner: Technical Lead
Environment: dev
## 1) Scope and purpose
This file defines the backend endpoint contract for the M4 foundation build.
## 2) Global API rules
1. Canonical route groups:
- `/core/*` for foundational integration routes
- `/commands/*` for business-critical writes
2. Foundation phase security model:
- authenticated user required
- role map enforcement deferred
- policy hook required in handler design
3. Standard error envelope:
```json
{
"code": "STRING_CODE",
"message": "Human readable message",
"details": {},
"requestId": "optional-request-id"
}
```
4. Request headers:
- `Authorization: Bearer <firebase-token>` (required)
- `X-Request-Id: <uuid>` (optional but recommended)
5. Required response headers:
- `X-Request-Id`
6. Validation:
- all input validated server-side
- reject unknown/invalid fields
7. Logging:
- route
- requestId
- actorId
- latencyMs
- outcome
8. Timeouts and retries:
- command writes must be retry-safe
- use idempotency keys for command write routes
9. Idempotency storage:
- store in Cloud SQL table
- key scope: `userId + route + idempotencyKey`
- key retention: 24 hours
- repeated key returns original response payload
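The idempotency rule in item 9 can be sketched as follows. This is a minimal illustration, not the planned implementation: an in-memory `Map` stands in for the Cloud SQL table, the handler is synchronous for brevity, and the names `IdempotencyStore`, `CommandResponse`, and `execute` are hypothetical.

```typescript
// In-memory stand-in for the Cloud SQL idempotency table (illustration only).
const RETENTION_MS = 24 * 60 * 60 * 1000; // 24-hour key retention

interface CommandResponse {
  status: number;
  body: unknown;
}

class IdempotencyStore {
  private readonly records = new Map<string, { response: CommandResponse; storedAtMs: number }>();

  // Runs `handler` at most once per scoped key within the retention window.
  execute(
    userId: string,
    route: string,
    idempotencyKey: string,
    handler: () => CommandResponse,
    nowMs: number = Date.now(),
  ): { response: CommandResponse; replayed: boolean } {
    // Key scope from the contract: userId + route + idempotencyKey.
    const scopedKey = JSON.stringify([userId, route, idempotencyKey]);
    const existing = this.records.get(scopedKey);
    if (existing !== undefined && nowMs - existing.storedAtMs < RETENTION_MS) {
      // Repeated key: return the original response payload without re-running the write.
      return { response: existing.response, replayed: true };
    }
    const response = handler();
    this.records.set(scopedKey, { response, storedAtMs: nowMs });
    return { response, replayed: false };
  }
}
```

A database-backed version would likely rely on a unique constraint over the scoped key so that two concurrent requests cannot both run the handler; the sketch above does not address concurrency.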
## 3) Compatibility aliases (transition)
1. `POST /uploadFile` -> `POST /core/upload-file`
2. `POST /createSignedUrl` -> `POST /core/create-signed-url`
3. `POST /invokeLLM` -> `POST /core/invoke-llm`
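One way to keep the aliases thin is to resolve them to the canonical route before dispatch, so each handler exists only once. The sketch below assumes a lookup by path; `resolveCanonicalRoute` is a hypothetical helper name.

```typescript
// Transition aliases mapped to their canonical routes, as listed above.
const COMPATIBILITY_ALIASES = new Map<string, string>([
  ["/uploadFile", "/core/upload-file"],
  ["/createSignedUrl", "/core/create-signed-url"],
  ["/invokeLLM", "/core/invoke-llm"],
]);

// Returns the canonical path for a POST alias; every other request passes through unchanged.
function resolveCanonicalRoute(method: string, path: string): string {
  if (method.toUpperCase() !== "POST") return path;
  return COMPATIBILITY_ALIASES.get(path) ?? path;
}
```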
## 4) Rate-limit baseline (initial)
1. `/core/invoke-llm`: 60 requests per minute per user
2. `/core/upload-file`: 30 requests per minute per user
3. `/core/create-signed-url`: 120 requests per minute per user
4. `/commands/*`: 60 requests per minute per user
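The per-user limits above could be enforced with a simple counter per user and route group. The sketch below is an assumption-heavy illustration: the catalog does not name an algorithm, so a fixed one-minute window is used here, `/commands/*` is treated as one shared budget per user, and an in-memory `Map` stands in for whatever shared store a multi-instance deployment would need.

```typescript
const WINDOW_MS = 60_000; // one-minute window

// Maps a route to its limit group and per-minute limit from the baseline above.
function limitRuleFor(route: string): { group: string; limit: number } | undefined {
  if (route === "/core/invoke-llm") return { group: route, limit: 60 };
  if (route === "/core/upload-file") return { group: route, limit: 30 };
  if (route === "/core/create-signed-url") return { group: route, limit: 120 };
  if (route.startsWith("/commands/")) return { group: "/commands/*", limit: 60 };
  return undefined; // no baseline defined for this route
}

class FixedWindowRateLimiter {
  private readonly counters = new Map<string, { windowStartMs: number; count: number }>();

  // Returns true when the request fits in the user's budget for the current window.
  allow(userId: string, route: string, nowMs: number = Date.now()): boolean {
    const rule = limitRuleFor(route);
    if (rule === undefined) return true;
    const key = JSON.stringify([userId, rule.group]);
    const windowStartMs = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
    const current = this.counters.get(key);
    if (current === undefined || current.windowStartMs !== windowStartMs) {
      this.counters.set(key, { windowStartMs, count: 1 });
      return true;
    }
    if (current.count >= rule.limit) return false;
    current.count += 1;
    return true;
  }
}
```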
## 4.1 Timeout baseline (initial)
1. `/core/invoke-llm`: 20-second hard timeout
2. other `/core/*` routes: 10-second timeout
3. `/commands/*` routes: 15-second timeout
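A minimal sketch of how the timeout baseline might be applied, assuming a lookup by route prefix and a promise wrapper. `timeoutMsFor` and `withTimeout` are hypothetical names, and the wrapper only stops waiting; it does not cancel the underlying work.

```typescript
// Timeout baseline from the list above, in milliseconds.
function timeoutMsFor(route: string): number | undefined {
  if (route === "/core/invoke-llm") return 20_000; // hard timeout
  if (route.startsWith("/core/")) return 10_000;
  if (route.startsWith("/commands/")) return 15_000;
  return undefined; // no baseline listed for this route
}

// Rejects if `work` has not settled within `ms`; always clears its timer.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A handler for `/core/invoke-llm` could then map the rejection to the `MODEL_TIMEOUT` error code.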
## 5) Core routes
## 5.1 Upload file
1. Method and route: `POST /core/upload-file`
2. Auth: required
3. Idempotency key: optional
4. Request: multipart form data
- `file` (required)
- `category` (optional)
- `visibility` (optional: `public` or `private`)
5. Success `200`:
```json
{
"fileUri": "gs://bucket/path/file.ext",
"contentType": "application/pdf",
"size": 12345,
"bucket": "krow-workforce-dev-private",
"path": "documents/staff/..."
}
```
6. Errors:
- `UNAUTHENTICATED`
- `INVALID_FILE_TYPE`
- `FILE_TOO_LARGE`
- `UPLOAD_FAILED`
## 5.2 Create signed URL
1. Method and route: `POST /core/create-signed-url`
2. Auth: required
3. Idempotency key: optional
4. Request:
```json
{
"fileUri": "gs://bucket/path/file.ext",
"expiresInSeconds": 300
}
```
5. Success `200`:
```json
{
"signedUrl": "https://...",
"expiresAt": "2026-02-24T15:00:00Z"
}
```
6. Errors:
- `UNAUTHENTICATED`
- `FORBIDDEN_FILE_ACCESS`
- `INVALID_EXPIRES_IN`
- `SIGN_URL_FAILED`
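The checks behind `INVALID_EXPIRES_IN` and `FORBIDDEN_FILE_ACCESS` might look like the sketch below. Several details are assumptions: the 900-second cap (`MAX_EXPIRES_IN_SECONDS`) is hypothetical because the contract does not state a limit, access is reduced to a bucket allow-list, and a malformed `fileUri` is mapped to `FORBIDDEN_FILE_ACCESS`. The actual signing call is out of scope here.

```typescript
const MAX_EXPIRES_IN_SECONDS = 900; // hypothetical cap; the contract does not state one

type SignedUrlCheck =
  | { ok: true; bucket: string; objectPath: string; expiresAt: string }
  | { ok: false; code: "INVALID_EXPIRES_IN" | "FORBIDDEN_FILE_ACCESS" };

// Validates the request and computes the expiry timestamp; does not sign anything.
function checkSignedUrlRequest(
  fileUri: string,
  expiresInSeconds: number,
  allowedBuckets: readonly string[],
  nowMs: number = Date.now(),
): SignedUrlCheck {
  const validExpiry =
    Number.isInteger(expiresInSeconds) &&
    expiresInSeconds > 0 &&
    expiresInSeconds <= MAX_EXPIRES_IN_SECONDS;
  if (!validExpiry) return { ok: false, code: "INVALID_EXPIRES_IN" };

  // Split gs://<bucket>/<object path> into its two parts.
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(fileUri);
  if (match === null) return { ok: false, code: "FORBIDDEN_FILE_ACCESS" };
  const [, bucket = "", objectPath = ""] = match;
  if (allowedBuckets.indexOf(bucket) === -1) return { ok: false, code: "FORBIDDEN_FILE_ACCESS" };

  const expiresAt = new Date(nowMs + expiresInSeconds * 1000).toISOString();
  return { ok: true, bucket, objectPath, expiresAt };
}
```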
## 5.3 Invoke model
1. Method and route: `POST /core/invoke-llm`
2. Auth: required
3. Idempotency key: optional
4. Request:
```json
{
"prompt": "...",
"responseJsonSchema": {},
"fileUrls": []
}
```
5. Success `200`:
```json
{
"result": {},
"model": "provider/model-name",
"latencyMs": 980
}
```
6. Errors:
- `UNAUTHENTICATED`
- `INVALID_SCHEMA`
- `MODEL_TIMEOUT`
- `MODEL_FAILED`
7. Provider default:
- Vertex AI Gemini
## 5.4 Health check
1. Method and route: `GET /healthz`
2. Auth: optional (internal policy)
3. Success `200`:
```json
{
"ok": true,
"service": "krow-backend",
"version": "commit-or-tag"
}
```
## 5.5 Storage bucket policy defaults (dev)
1. Public bucket: `krow-workforce-dev-public`
2. Private bucket: `krow-workforce-dev-private`
3. Private objects are never returned directly; only signed URLs are returned.
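As a small illustration of the bucket split, the upload handler could pick the bucket from the `visibility` field and flag when a file may only be exposed through a signed URL. Defaulting to `private` when `visibility` is omitted is an assumption, and both function names are hypothetical.

```typescript
type Visibility = "public" | "private";

// Dev bucket names from the policy defaults above.
const DEV_BUCKETS: Record<Visibility, string> = {
  public: "krow-workforce-dev-public",
  private: "krow-workforce-dev-private",
};

// Assumption: an upload without an explicit visibility is treated as private.
function targetBucket(visibility?: Visibility): string {
  return DEV_BUCKETS[visibility ?? "private"];
}

// True when the object lives in the private bucket and must only be exposed via a signed URL.
function requiresSignedUrl(fileUri: string): boolean {
  return fileUri.startsWith(`gs://${DEV_BUCKETS.private}/`);
}
```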
## 6) Command routes (wave 1)
## 6.1 Create order
1. Method and route: `POST /commands/orders/create`
2. Auth: required
3. Idempotency key: required
4. Purpose: create the order, its shifts, and their roles atomically
5. Replaces:
- `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx`
- `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart`
- `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart`
## 6.2 Update order
1. Method and route: `POST /commands/orders/{orderId}/update`
2. Auth: required
3. Idempotency key: required
4. Purpose: policy-safe multi-entity order update
5. Replaces:
- `apps/web/src/features/operations/orders/EditOrder.tsx`
- `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart`
## 6.3 Cancel order
1. Method and route: `POST /commands/orders/{orderId}/cancel`
2. Auth: required
3. Idempotency key: required
4. Purpose: enforce cancellation policy and return explicit conflict code
5. Replaces:
- `apps/web/src/features/operations/orders/OrderDetail.tsx`
## 6.4 Change shift status
1. Method and route: `POST /commands/shifts/{shiftId}/change-status`
2. Auth: required
3. Idempotency key: required
4. Purpose: enforce state transitions server-side
5. Replaces:
- `apps/web/src/features/operations/tasks/TaskBoard.tsx`
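Server-side enforcement of state transitions can be sketched as a lookup table plus a single check. The status names, the allowed transitions, and the `INVALID_SHIFT_TRANSITION` code below are all hypothetical placeholders; the real values would come from the domain model and the error code registry.

```typescript
// Hypothetical status set; the real values belong to the domain model.
type ShiftStatus = "open" | "assigned" | "in_progress" | "completed" | "cancelled";

// Hypothetical transition table: each status lists the statuses it may move to.
const ALLOWED_TRANSITIONS: Record<ShiftStatus, readonly ShiftStatus[]> = {
  open: ["assigned", "cancelled"],
  assigned: ["in_progress", "open", "cancelled"],
  in_progress: ["completed"],
  completed: [],
  cancelled: [],
};

type TransitionResult =
  | { ok: true; status: ShiftStatus }
  | { ok: false; error: { code: string; message: string; details: Record<string, unknown> } };

// Accepts the change only if the table allows it; otherwise returns an error envelope.
function changeShiftStatus(current: ShiftStatus, next: ShiftStatus): TransitionResult {
  if (ALLOWED_TRANSITIONS[current].indexOf(next) !== -1) {
    return { ok: true, status: next };
  }
  return {
    ok: false,
    error: {
      code: "INVALID_SHIFT_TRANSITION", // hypothetical conflict code
      message: `Cannot move shift from ${current} to ${next}`,
      details: { current, next },
    },
  };
}
```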
## 6.5 Assign staff
1. Method and route: `POST /commands/shifts/{shiftId}/assign-staff`
2. Auth: required
3. Idempotency key: required
4. Purpose: assign staff, update counts, and run conflict checks atomically
5. Replaces:
- `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx`
## 6.6 Accept shift
1. Method and route: `POST /commands/shifts/{shiftId}/accept`
2. Auth: required
3. Idempotency key: required
4. Purpose: handle the shift application, counter updates, and rollback-safe behavior in one command
5. Replaces:
- `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart`
## 7) Locked defaults before coding starts
1. Idempotency keys are stored in Cloud SQL with 24-hour retention.
2. Request validation library is `zod`.
3. Validation schema location is `backend/<service>/src/contracts/`.
4. Storage buckets are:
- `krow-workforce-dev-public`
- `krow-workforce-dev-private`
5. Model provider is Vertex AI Gemini with a 20-second timeout for `/core/invoke-llm`.
## 8) Target response-time objectives (p95)
1. `/healthz` under 200ms
2. `/core/create-signed-url` under 500ms
3. `/commands/*` under 1500ms
4. `/core/invoke-llm` under 15000ms
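To check a batch of measured latencies against these objectives, a smoke test could compute p95 directly. The sketch below assumes the nearest-rank definition of the percentile, which the contract does not specify; `p95` and `meetsTarget` are hypothetical helper names.

```typescript
// p95 targets from the list above, in milliseconds.
const P95_TARGETS_MS = new Map<string, number>([
  ["/healthz", 200],
  ["/core/create-signed-url", 500],
  ["/commands/*", 1500],
  ["/core/invoke-llm", 15000],
]);

// Nearest-rank 95th percentile: the smallest sample with at least 95% of samples at or below it.
function p95(latenciesMs: readonly number[]): number {
  if (latenciesMs.length === 0) throw new Error("no latency samples");
  const sorted = latenciesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1] as number;
}

// True when the measured p95 is strictly under the target for the route group.
function meetsTarget(routeGroup: string, latenciesMs: readonly number[]): boolean {
  const target = P95_TARGETS_MS.get(routeGroup);
  if (target === undefined) throw new Error(`no p95 target for ${routeGroup}`);
  return p95(latenciesMs) < target;
}
```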


@@ -0,0 +1,269 @@
# M4 Backend Foundation Implementation Plan (Dev First)
Date: 2026-02-24
Owner: Wilfred (Technical Lead)
Primary environment: `krow-workforce-dev`
## 1) Objective
Build a secure, modular, and scalable backend foundation in `dev` without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.
## 2) First-principles architecture rules
1. Client apps are untrusted for business-critical writes.
2. Backend is the enforcement layer for validation, permissions, and write orchestration.
3. Multi-entity writes must be atomic, idempotent, and observable.
4. Configuration and deployment must be reproducible by automation.
5. Migration must be backward-compatible until each frontend flow is cut over.
## 3) Pre-coding gates (must be true before implementation starts)
## Gate A: Security boundary
1. Frontend sends Firebase token only. No database credentials in client code.
2. Every new backend endpoint validates Firebase token.
3. Data Connect write access strategy is defined:
- keep simple reads available to client
- route high-risk writes through backend command endpoints
4. Upload and signed URL paths are server-controlled.
## Gate B: Contract standards
1. Standard error envelope is frozen:
```json
{
"code": "STRING_CODE",
"message": "Human readable message",
"details": {},
"requestId": "optional-request-id"
}
```
2. Request validation layer is chosen and centralized.
3. Route naming strategy is frozen:
- canonical routes under `/core` and `/commands`
- compatibility aliases preserved during migration (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
4. Validation standard is locked:
- library: `zod`
- schema location: `backend/<service>/src/contracts/` with `core/` and `commands/` subfolders
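The contract standards in this gate combine naturally: a centralized validator rejects unknown or invalid fields and reports failures in the frozen error envelope. In the codebase this would be a `zod` schema under `backend/<service>/src/contracts/`; the hand-rolled check below exists only so the sketch runs without dependencies. It uses the `/core/invoke-llm` request body, treats all three fields as required (an assumption), and uses a hypothetical `VALIDATION_FAILED` code.

```typescript
interface ErrorEnvelope {
  code: string;
  message: string;
  details: Record<string, unknown>;
  requestId?: string;
}

interface InvokeLlmRequest {
  prompt: string;
  responseJsonSchema: Record<string, unknown>;
  fileUrls: string[];
}

type Validation<T> = { ok: true; value: T } | { ok: false; error: ErrorEnvelope };

const INVOKE_LLM_FIELDS = ["prompt", "responseJsonSchema", "fileUrls"];

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function validateInvokeLlm(body: unknown, requestId?: string): Validation<InvokeLlmRequest> {
  // Builds the standard error envelope; VALIDATION_FAILED is a hypothetical code.
  const fail = (message: string, details: Record<string, unknown>): Validation<InvokeLlmRequest> => ({
    ok: false,
    error: { code: "VALIDATION_FAILED", message, details, ...(requestId === undefined ? {} : { requestId }) },
  });

  if (!isPlainObject(body)) return fail("Request body must be a JSON object", {});
  const unknownFields = Object.keys(body).filter((key) => INVOKE_LLM_FIELDS.indexOf(key) === -1);
  if (unknownFields.length > 0) return fail("Unknown fields are not allowed", { unknownFields });

  const prompt = body["prompt"];
  const responseJsonSchema = body["responseJsonSchema"];
  const fileUrls = body["fileUrls"];
  if (typeof prompt !== "string" || prompt.length === 0) {
    return fail("prompt must be a non-empty string", { field: "prompt" });
  }
  if (!isPlainObject(responseJsonSchema)) {
    return fail("responseJsonSchema must be an object", { field: "responseJsonSchema" });
  }
  if (!Array.isArray(fileUrls) || !fileUrls.every((url) => typeof url === "string")) {
    return fail("fileUrls must be an array of strings", { field: "fileUrls" });
  }
  return { ok: true, value: { prompt, responseJsonSchema, fileUrls: fileUrls as string[] } };
}
```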
## Gate C: Atomicity and reliability
1. Command endpoints support idempotency keys for retry-safe writes.
2. Multi-step write flows are wrapped in single backend transaction boundaries.
3. Domain conflict codes are defined for expected business failures.
4. Idempotency storage is locked:
- store in Cloud SQL table
- key scope: `userId + route + idempotencyKey`
- retain records for 24 hours
- repeated key returns original response
## Gate D: Automation and operability
1. The Makefile is the source of truth for backend setup and deploy in dev.
2. Core deploy and smoke test commands exist before feature migration.
3. Logging format and request tracing fields are standardized.
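A standardized log entry could carry the five fields named in the API catalog (route, requestId, actorId, latencyMs, outcome). The sketch below is one possible shape, not a fixed format: the `"success" | "error"` outcome values are an assumption, and the ID generator is injected so the example does not depend on a particular UUID library.

```typescript
interface RequestLogEntry {
  route: string;
  requestId: string;
  actorId: string | null; // null for unauthenticated requests
  latencyMs: number;
  outcome: "success" | "error"; // assumed value set
}

// Reuses the caller's X-Request-Id when present, otherwise generates one.
function resolveRequestId(incoming: string | undefined, generate: () => string): string {
  const trimmed = incoming === undefined ? "" : incoming.trim();
  return trimmed.length > 0 ? trimmed : generate();
}

// Builds one structured entry per request from timing and outcome data.
function buildLogEntry(input: {
  route: string;
  requestId: string;
  actorId: string | null;
  startedAtMs: number;
  finishedAtMs: number;
  failed: boolean;
}): RequestLogEntry {
  return {
    route: input.route,
    requestId: input.requestId,
    actorId: input.actorId,
    latencyMs: input.finishedAtMs - input.startedAtMs,
    outcome: input.failed ? "error" : "success",
  };
}
```

Serializing each entry with `JSON.stringify` gives one JSON object per line, which log tooling can typically index by field.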
## 4) Security baseline for foundation phase
## 4.1 Authentication and authorization
1. Foundation phase is authentication-first.
2. Role-based access control is intentionally deferred.
3. All handlers include a policy hook for future role checks (`can(action, resource, actor)`).
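The policy hook might be wired as sketched below: in the foundation phase `can` passes any authenticated actor, and a role-aware implementation can later replace it without changing handlers. The `can(action, resource, actor)` signature follows the text above; the `Actor` shape, the `guard` helper, and the `FORBIDDEN` code are assumptions.

```typescript
interface Actor {
  uid: string; // Firebase user ID taken from the verified token
}

type PolicyHook = (action: string, resource: string, actor: Actor | null) => boolean;

// Foundation-phase policy: any authenticated actor passes; role checks are deferred.
const can: PolicyHook = (_action, _resource, actor) => actor !== null && actor.uid.length > 0;

type GuardResult = { ok: true } | { ok: false; code: "UNAUTHENTICATED" | "FORBIDDEN" };

// Called at the top of every handler; `policy` is swappable for the later role map.
function guard(action: string, resource: string, actor: Actor | null, policy: PolicyHook = can): GuardResult {
  if (actor === null) return { ok: false, code: "UNAUTHENTICATED" };
  // FORBIDDEN is a hypothetical code for a denied policy decision.
  return policy(action, resource, actor) ? { ok: true } : { ok: false, code: "FORBIDDEN" };
}
```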
## 4.2 Data access control model
1. Client retains Data Connect reads required for existing screens.
2. High-risk writes move behind `/commands/*` endpoints.
3. Backend mediates write interactions with Data Connect and Cloud SQL.
## 4.3 File and URL security
1. Validate file type and size server-side.
2. Separate public and private storage behavior.
3. Signed URL creation checks ownership/prefix scope and expiry limits.
4. Bucket policy split is locked:
- `krow-workforce-dev-public`
- `krow-workforce-dev-private`
- private bucket access only through signed URLs
## 4.4 Model invocation safety
1. Enforce schema-constrained output.
2. Apply per-user rate limits and request timeout.
3. Log model failures with safe redaction (no sensitive prompt leakage in logs).
4. Model provider and timeout defaults are locked:
- provider: Vertex AI Gemini
- max route timeout: 20 seconds
- timeout error code: `MODEL_TIMEOUT`
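One way to log model failures without leaking prompts is to record only sizes and codes, never the prompt text or file URLs. The sketch below assumes that approach; the field names `promptChars` and `fileCount` and the function name are hypothetical.

```typescript
interface ModelFailureLog {
  requestId: string;
  actorId: string;
  code: "MODEL_TIMEOUT" | "MODEL_FAILED";
  latencyMs: number;
  promptChars: number; // length only; the prompt text itself is never logged
  fileCount: number; // count only; file URLs are never logged
}

// Builds a redacted failure record from the original request and failure context.
function toModelFailureLog(
  request: { prompt: string; fileUrls: readonly string[] },
  context: { requestId: string; actorId: string; code: "MODEL_TIMEOUT" | "MODEL_FAILED"; latencyMs: number },
): ModelFailureLog {
  return {
    requestId: context.requestId,
    actorId: context.actorId,
    code: context.code,
    latencyMs: context.latencyMs,
    promptChars: request.prompt.length,
    fileCount: request.fileUrls.length,
  };
}
```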
## 4.5 Secrets and credentials
1. Runtime secrets come from Secret Manager only.
2. Service accounts use least-privilege roles.
3. No secrets committed in repository files.
## 5) Modularity baseline
## 5.1 Backend module boundaries
1. `core` module: upload, signed URL, model invocation, health.
2. `commands` module: business writes and state transitions.
3. `policy` module: validation and future role checks.
4. `data` module: Data Connect adapters and transaction wrappers.
5. `infra` module: logging, tracing, auth middleware, error mapping.
## 5.2 Contract separation
1. Keep API request/response schemas in one location.
2. Keep domain errors in one registry file.
3. Keep route declarations thin; business logic in services.
## 5.3 Cloud runtime roles
1. Cloud Run is the primary command and core API execution layer.
2. Cloud Functions v2 is worker-only in this phase:
- upload-related async handlers
- notification jobs
- model-related async helpers when needed
## 6) Automation baseline
## 6.1 Makefile requirements
Add `makefiles/backend.mk` and wire it into root `Makefile` with at least:
1. `make backend-enable-apis`
2. `make backend-bootstrap-dev`
3. `make backend-deploy-core`
4. `make backend-deploy-commands`
5. `make backend-deploy-workers`
6. `make backend-smoke-core`
7. `make backend-smoke-commands`
8. `make backend-logs-core`
## 6.2 CI requirements
1. Backend lint
2. Backend tests
3. Build/package
4. Smoke test against deployed dev route(s)
5. Block merge on failed checks
## 6.3 Session hygiene
1. Update `TASKS.md` and `CHANGELOG.md` each working session.
2. If a new service/API is added, a Makefile target must be added in the same change.
## 7) Migration safety contract (no frontend breakage)
1. Backend routes ship first.
2. Frontend migration is per-feature wave, not big bang.
3. Keep compatibility aliases until clients migrate.
4. Keep existing Data Connect reads during foundation.
5. For each migrated write flow:
- before/after behavior checklist
- rollback path
- smoke verification
## 8) Scope for foundation build
1. Backend runtime/deploy foundation in dev.
2. Core endpoints:
- `POST /core/upload-file`
- `POST /core/create-signed-url`
- `POST /core/invoke-llm`
- `GET /healthz`
3. Compatibility aliases:
- `POST /uploadFile`
- `POST /createSignedUrl`
- `POST /invokeLLM`
4. Command layer scaffold for first migration routes.
5. Initial migration of highest-risk write paths.
## 9) Implementation phases
## Phase 0: Baseline and contracts
Deliverables:
1. Freeze endpoint naming and compatibility aliases.
2. Freeze error envelope and error code registry.
3. Freeze auth middleware interface and policy hook interface.
4. Publish route inventory from web/mobile direct writes.
Exit criteria:
1. No unresolved contract ambiguity.
2. Team agrees on the auth-first-now, role-map-later approach.
## Phase 1: Backend infra and automation
Deliverables:
1. `makefiles/backend.mk` with bootstrap, deploy, smoke, logs targets.
2. Environment templates for backend runtime config.
3. Secret Manager and service account setup automation.
Exit criteria:
1. A fresh machine can deploy core backend to dev via Make commands.
## Phase 2: Core endpoint implementation
Deliverables:
1. `/core/upload-file`
2. `/core/create-signed-url`
3. `/core/invoke-llm`
4. `/healthz`
5. Compatibility aliases (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
Exit criteria:
1. API harness passes for core routes.
2. Error, logging, and auth standards are enforced.
## Phase 3: Command layer scaffold
Deliverables:
1. `/commands/orders/create`
2. `/commands/orders/{orderId}/cancel`
3. `/commands/orders/{orderId}/update`
4. `/commands/shifts/{shiftId}/change-status`
5. `/commands/shifts/{shiftId}/assign-staff`
6. `/commands/shifts/{shiftId}/accept`
Exit criteria:
1. High-risk writes have backend command alternatives ready.
## Phase 4: Wave 1 frontend migration
Deliverables:
1. Replace direct writes in selected web/mobile flows.
2. Keep reads stable.
3. Verify no regressions in non-migrated screens.
Exit criteria:
1. Migrated flows run through backend commands only.
2. Rollback instructions validated.
## Phase 5: Hardening and handoff
Deliverables:
1. Runbook for deploy, rollback, and smoke.
2. Backend CI pipeline active.
3. Wave 2 and wave 3 migration task list defined.
Exit criteria:
1. Foundation is reusable for staging/prod with environment changes only.
## 10) Wave 1 migration inventory (real call sites)
Web:
1. `apps/web/src/features/operations/tasks/TaskBoard.tsx:100`
2. `apps/web/src/features/operations/orders/OrderDetail.tsx:145`
3. `apps/web/src/features/operations/orders/EditOrder.tsx:84`
4. `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31`
5. `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60`
6. `apps/web/src/features/workforce/documents/DocumentVault.tsx:99`
Mobile:
1. `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232`
2. `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195`
3. `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68`
4. `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446`
5. `apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257`
6. `apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51`
## 11) Definition of done for foundation
1. Core endpoints deployed in dev and validated.
2. Command scaffolding in place for wave 1 writes.
3. Auth-first protection active on all new routes.
4. Idempotency + transaction model defined for command writes.
5. Makefile and CI automation cover bootstrap/deploy/smoke paths.
6. Frontend remains stable during migration.
7. Role-map integration points are documented for next phase.
## 12) Locked defaults (approved)
1. Idempotency key storage strategy:
- Cloud SQL table, 24-hour retention, keyed by `userId + route + idempotencyKey`.
2. Validation library and schema location:
- `zod` in `backend/<service>/src/contracts/` (`core/`, `commands/`).
3. Storage bucket naming and split:
- `krow-workforce-dev-public` and `krow-workforce-dev-private`.
4. Model provider and timeout:
- Vertex AI Gemini, 20-second max timeout.
5. Target response-time objectives (p95):
- `/healthz` under 200ms
- `/core/create-signed-url` under 500ms
- `/commands/*` under 1500ms
- `/core/invoke-llm` under 15000ms