docs: lock backend foundation plan and tracking format
@@ -0,0 +1,269 @@
# M4 Backend Foundation Implementation Plan (Dev First)

Date: 2026-02-24
Owner: Wilfred (Technical Lead)
Primary environment: `krow-workforce-dev`

## 1) Objective
Build a secure, modular, and scalable backend foundation in `dev` without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.

## 2) First-principles architecture rules
1. Client apps are untrusted for business-critical writes.
2. Backend is the enforcement layer for validation, permissions, and write orchestration.
3. Multi-entity writes must be atomic, idempotent, and observable.
4. Configuration and deployment must be reproducible by automation.
5. Migration must be backward-compatible until each frontend flow is cut over.

## 3) Pre-coding gates (must be true before implementation starts)

## Gate A: Security boundary
1. Frontend sends Firebase token only. No database credentials in client code.
2. Every new backend endpoint validates Firebase token.
3. Data Connect write access strategy is defined:
   - keep simple reads available to client
   - route high-risk writes through backend command endpoints
4. Upload and signed URL paths are server-controlled.

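A token gate in front of every handler is one way to satisfy Gate A rule 2. The sketch below is an illustration under assumptions, not the project's middleware: `TokenVerifier`, `extractBearerToken`, `authenticate`, and the `UNAUTHENTICATED` code are hypothetical names, and the verifier is injected so the example runs without Firebase. In a real service the verifier would presumably wrap the Firebase Admin SDK's `verifyIdToken`.

```typescript
// Hypothetical types for illustration; the plan does not define them.
type VerifiedActor = { uid: string };
type TokenVerifier = (idToken: string) => Promise<VerifiedActor>;

type AuthResult =
  | { ok: true; actor: VerifiedActor }
  | { ok: false; status: number; body: { code: string; message: string; details: Record<string, unknown> } };

// Returns the token from a "Bearer <token>" header value, or null if absent or malformed.
function extractBearerToken(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Resolves to the verified actor, or to a 401 in the standard error envelope shape.
async function authenticate(header: string | undefined, verify: TokenVerifier): Promise<AuthResult> {
  const reject = (message: string): AuthResult => ({
    ok: false,
    status: 401,
    body: { code: "UNAUTHENTICATED", message, details: {} }, // assumed code name
  });

  const token = extractBearerToken(header);
  if (token === null) return reject("Missing or malformed Authorization header");
  try {
    return { ok: true, actor: await verify(token) };
  } catch {
    return reject("Invalid or expired token");
  }
}
```

Injecting the verifier keeps the gate testable with a fake and keeps the SDK dependency in one place.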
## Gate B: Contract standards
1. Standard error envelope is frozen:
   ```json
   {
     "code": "STRING_CODE",
     "message": "Human readable message",
     "details": {},
     "requestId": "optional-request-id"
   }
   ```
2. Request validation layer is chosen and centralized.
3. Route naming strategy is frozen:
   - canonical routes under `/core` and `/commands`
   - compatibility aliases preserved during migration (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
4. Validation standard is locked:
   - library: `zod`
   - schema location: `backend/<service>/src/contracts/` with `core/` and `commands/` subfolders

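The frozen envelope from Gate B can be expressed as a type plus one mapping function, so every handler emits the same shape. This is a minimal sketch under assumptions: `DomainError`, `isDomainError`, `toErrorEnvelope`, and the `INTERNAL_ERROR` fallback code are hypothetical names, not part of the plan.

```typescript
// Field names follow the frozen envelope; requestId is optional.
interface ErrorEnvelope {
  code: string;
  message: string;
  details: Record<string, unknown>;
  requestId?: string;
}

// Hypothetical internal error shape that services would throw or return.
interface DomainError {
  kind: "domain";
  code: string;
  message: string;
  status: number;
  details?: Record<string, unknown>;
}

function isDomainError(value: unknown): value is DomainError {
  return typeof value === "object" && value !== null && (value as { kind?: unknown }).kind === "domain";
}

// Maps any thrown value to an HTTP status plus envelope. Unknown errors are
// collapsed to a generic 500 so internal messages never reach the client.
function toErrorEnvelope(err: unknown, requestId?: string): { status: number; body: ErrorEnvelope } {
  const body: ErrorEnvelope = isDomainError(err)
    ? { code: err.code, message: err.message, details: err.details ?? {} }
    : { code: "INTERNAL_ERROR", message: "Unexpected error", details: {} };
  if (requestId !== undefined) body.requestId = requestId;
  return { status: isDomainError(err) ? err.status : 500, body };
}
```

Keeping the mapping in one function is what makes the envelope enforceable: handlers never build error bodies by hand.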
## Gate C: Atomicity and reliability
1. Command endpoints support idempotency keys for retry-safe writes.
2. Multi-step write flows are wrapped in a single backend transaction boundary.
3. Domain conflict codes are defined for expected business failures.
4. Idempotency storage is locked:
   - store in Cloud SQL table
   - key scope: `userId + route + idempotencyKey`
   - retain records for 24 hours
   - repeated key returns original response

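The idempotency rules in Gate C can be sketched as a wrapper around a command handler. This is an illustration only: an in-memory `Map` stands in for the Cloud SQL table, and `runIdempotent`, `scopeKey`, and the `replayed` flag are hypothetical names. A real table would also need a unique constraint on the scope so that concurrent retries cannot both execute the handler.

```typescript
const RETENTION_MS = 24 * 60 * 60 * 1000; // 24-hour retention, per the plan

interface CommandResponse { status: number; body: unknown }
interface IdempotencyScope { userId: string; route: string; idempotencyKey: string }

// In-memory stand-in for the Cloud SQL table (illustration only).
const records = new Map<string, { response: CommandResponse; storedAt: number }>();

// JSON-encode the tuple so delimiter characters inside a part cannot cause collisions.
function scopeKey(scope: IdempotencyScope): string {
  return JSON.stringify([scope.userId, scope.route, scope.idempotencyKey]);
}

// Runs the handler once per scope; a repeated key inside the retention window
// returns the stored response instead of running the write again.
function runIdempotent(
  scope: IdempotencyScope,
  handler: () => CommandResponse,
  now: number = Date.now(),
): CommandResponse & { replayed: boolean } {
  const key = scopeKey(scope);
  const existing = records.get(key);
  if (existing !== undefined && now - existing.storedAt < RETENTION_MS) {
    return { ...existing.response, replayed: true };
  }
  const response = handler();
  records.set(key, { response, storedAt: now });
  return { ...response, replayed: false };
}
```

Passing `now` as a parameter is a testing convenience; a production version would rely on database timestamps and a cleanup job for expired rows.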
## Gate D: Automation and operability
1. Makefile is source of truth for backend setup and deploy in dev.
2. Core deploy and smoke test commands exist before feature migration.
3. Logging format and request tracing fields are standardized.

## 4) Security baseline for foundation phase

## 4.1 Authentication and authorization
1. Foundation phase is authentication-first.
2. Role-based access control is intentionally deferred.
3. All handlers include a policy hook for future role checks (`can(action, resource, actor)`).

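The policy hook from 4.1 can stay trivial now and still give handlers a stable call site. The sketch below assumes hypothetical `Actor`, `Resource`, and `PolicyRule` types and a `policyRules` registry; only the `can(action, resource, actor)` signature comes from the plan.

```typescript
// Hypothetical shapes; the plan only fixes the can(action, resource, actor) signature.
type Actor = { uid: string };
type Resource = { type: string; id?: string };
type PolicyRule = (action: string, resource: Resource, actor: Actor) => boolean;

// Future role checks would register here; the list is empty in the foundation phase.
const policyRules: PolicyRule[] = [];

// Authentication-first: any verified actor passes until rules are added.
function can(action: string, resource: Resource, actor: Actor | null): boolean {
  if (actor === null) return false; // unauthenticated callers are always denied
  const authenticated: Actor = actor;
  return policyRules.every((rule) => rule(action, resource, authenticated));
}
```

Because handlers already call `can(...)`, introducing roles later means adding rules, not editing every route.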
## 4.2 Data access control model
1. Client retains Data Connect reads required for existing screens.
2. High-risk writes move behind `/commands/*` endpoints.
3. Backend mediates write interactions with Data Connect and Cloud SQL.

## 4.3 File and URL security
1. Validate file type and size server-side.
2. Separate public and private storage behavior.
3. Signed URL creation checks ownership/prefix scope and expiry limits.
4. Bucket policy split is locked:
   - `krow-workforce-dev-public`
   - `krow-workforce-dev-private`
   - private bucket access only through signed URL

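The ownership and expiry checks in 4.3 rule 3 might look like the following pre-flight validation, run before any URL is signed. Everything specific here is an assumption: the plan does not fix the object layout or the expiry cap, so the `users/<uid>/...` prefix, the 15-minute limit, and the result codes are placeholders.

```typescript
// Assumed values: the plan requires prefix scoping and an expiry limit but fixes neither.
const MAX_EXPIRY_SECONDS = 15 * 60;
const USER_PREFIX = "users";

type SignedUrlCheck = "OK" | "INVALID_PATH" | "FORBIDDEN_PATH" | "INVALID_EXPIRY";

function checkSignedUrlRequest(uid: string, objectPath: string, expiresInSeconds: number): SignedUrlCheck {
  const segments = objectPath.split("/");
  // Reject empty, ".", and ".." segments so the prefix check cannot be bypassed.
  if (segments.some((s) => s === "" || s === "." || s === "..")) return "INVALID_PATH";
  // Ownership: the object must live under users/<uid>/...
  if (segments.length < 3 || segments[0] !== USER_PREFIX || segments[1] !== uid) return "FORBIDDEN_PATH";
  // Expiry must be a positive whole number of seconds within the cap.
  if (!Number.isInteger(expiresInSeconds) || expiresInSeconds <= 0 || expiresInSeconds > MAX_EXPIRY_SECONDS) {
    return "INVALID_EXPIRY";
  }
  return "OK";
}
```

Only a request that returns `"OK"` would proceed to the storage client that actually signs the URL.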
## 4.4 Model invocation safety
1. Enforce schema-constrained output.
2. Apply per-user rate limits and request timeout.
3. Log model failures with safe redaction (no sensitive prompt leakage in logs).
4. Model provider and timeout defaults are locked:
   - provider: Vertex AI Gemini
   - max route timeout: 20 seconds
   - timeout error code: `MODEL_TIMEOUT`

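The 20-second ceiling and the `MODEL_TIMEOUT` code from 4.4 can be enforced with a generic promise wrapper. This is a sketch under assumptions: `withTimeout` is a hypothetical helper, and it only stops waiting; it does not cancel the underlying model request, which would need provider-level cancellation support.

```typescript
const MODEL_TIMEOUT_MS = 20 * 1000; // max route timeout from the plan

// Rejects with an Error carrying code "MODEL_TIMEOUT" if `work` has not settled
// within `ms`; otherwise settles exactly as `work` does.
function withTimeout<T>(work: Promise<T>, ms: number = MODEL_TIMEOUT_MS): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(Object.assign(new Error(`Model call exceeded ${ms} ms`), { code: "MODEL_TIMEOUT" }));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time: cancel the pending timeout
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A handler would wrap its model call, for example `withTimeout(callModel(prompt))` with a hypothetical `callModel`, and let the error mapper turn the `code` into the standard envelope.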
## 4.5 Secrets and credentials
1. Runtime secrets come from Secret Manager only.
2. Service accounts use least-privilege roles.
3. No secrets committed in repository files.

## 5) Modularity baseline

## 5.1 Backend module boundaries
1. `core` module: upload, signed URL, model invocation, health.
2. `commands` module: business writes and state transitions.
3. `policy` module: validation and future role checks.
4. `data` module: Data Connect adapters and transaction wrappers.
5. `infra` module: logging, tracing, auth middleware, error mapping.

## 5.2 Contract separation
1. Keep API request/response schemas in one location.
2. Keep domain errors in one registry file.
3. Keep route declarations thin; business logic in services.

## 5.3 Cloud runtime roles
1. Cloud Run is the primary command and core API execution layer.
2. Cloud Functions v2 is worker-only in this phase:
   - upload-related async handlers
   - notification jobs
   - model-related async helpers when needed

## 6) Automation baseline

## 6.1 Makefile requirements
Add `makefiles/backend.mk` and wire it into root `Makefile` with at least:
1. `make backend-enable-apis`
2. `make backend-bootstrap-dev`
3. `make backend-deploy-core`
4. `make backend-deploy-commands`
5. `make backend-deploy-workers`
6. `make backend-smoke-core`
7. `make backend-smoke-commands`
8. `make backend-logs-core`

## 6.2 CI requirements
1. Backend lint
2. Backend tests
3. Build/package
4. Smoke test against deployed dev route(s)
5. Block merge on failed checks

## 6.3 Session hygiene
1. Update `TASKS.md` and `CHANGELOG.md` each working session.
2. If a new service/API is added, Makefile target must be added in same change.

## 7) Migration safety contract (no frontend breakage)
1. Backend routes ship first.
2. Frontend migration is per-feature wave, not big bang.
3. Keep compatibility aliases until clients migrate.
4. Keep existing Data Connect reads during foundation.
5. For each migrated write flow:
   - before/after behavior checklist
   - rollback path
   - smoke verification

## 8) Scope for foundation build
1. Backend runtime/deploy foundation in dev.
2. Core endpoints:
   - `POST /core/upload-file`
   - `POST /core/create-signed-url`
   - `POST /core/invoke-llm`
   - `GET /healthz`
3. Compatibility aliases:
   - `POST /uploadFile`
   - `POST /createSignedUrl`
   - `POST /invokeLLM`
4. Command layer scaffold for first migration routes.
5. Initial migration of highest-risk write paths.

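One way to keep the compatibility aliases cheap is to resolve them to canonical routes before dispatch, so each handler exists once. The sketch below is an assumption about approach, not the planned implementation: `COMPAT_ALIASES` and `canonicalRoute` are hypothetical names, and the alias-to-route pairing is inferred from the matching names in the lists above.

```typescript
// Alias -> canonical route. Pairing inferred from the route names in the plan.
const COMPAT_ALIASES = new Map<string, string>([
  ["/uploadFile", "/core/upload-file"],
  ["/createSignedUrl", "/core/create-signed-url"],
  ["/invokeLLM", "/core/invoke-llm"],
]);

// Returns the canonical route for a legacy alias; any other path is returned unchanged.
function canonicalRoute(path: string): string {
  return COMPAT_ALIASES.get(path) ?? path;
}
```

When the last client has migrated, removing an alias is a one-line change to the map.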
## 9) Implementation phases

## Phase 0: Baseline and contracts
Deliverables:
1. Freeze endpoint naming and compatibility aliases.
2. Freeze error envelope and error code registry.
3. Freeze auth middleware interface and policy hook interface.
4. Publish route inventory from web/mobile direct writes.

Exit criteria:
1. No unresolved contract ambiguity.
2. Team agrees on auth-first now and role-map-later approach.

## Phase 1: Backend infra and automation
Deliverables:
1. `makefiles/backend.mk` with bootstrap, deploy, smoke, logs targets.
2. Environment templates for backend runtime config.
3. Secret Manager and service account setup automation.

Exit criteria:
1. A fresh machine can deploy core backend to dev via Make commands.

## Phase 2: Core endpoint implementation
Deliverables:
1. `/core/upload-file`
2. `/core/create-signed-url`
3. `/core/invoke-llm`
4. `/healthz`
5. Compatibility aliases (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)

Exit criteria:
1. API harness passes for core routes.
2. Error, logging, and auth standards are enforced.

## Phase 3: Command layer scaffold
Deliverables:
1. `/commands/orders/create`
2. `/commands/orders/{orderId}/cancel`
3. `/commands/orders/{orderId}/update`
4. `/commands/shifts/{shiftId}/change-status`
5. `/commands/shifts/{shiftId}/assign-staff`
6. `/commands/shifts/{shiftId}/accept`

Exit criteria:
1. High-risk writes have backend command alternatives ready.

## Phase 4: Wave 1 frontend migration
Deliverables:
1. Replace direct writes in selected web/mobile flows.
2. Keep reads stable.
3. Verify no regressions in non-migrated screens.

Exit criteria:
1. Migrated flows run through backend commands only.
2. Rollback instructions validated.

## Phase 5: Hardening and handoff
Deliverables:
1. Runbook for deploy, rollback, and smoke.
2. Backend CI pipeline active.
3. Wave 2 and wave 3 migration task list defined.

Exit criteria:
1. Foundation is reusable for staging/prod with environment changes only.

## 10) Wave 1 migration inventory (real call sites)

Web:
1. `apps/web/src/features/operations/tasks/TaskBoard.tsx:100`
2. `apps/web/src/features/operations/orders/OrderDetail.tsx:145`
3. `apps/web/src/features/operations/orders/EditOrder.tsx:84`
4. `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31`
5. `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60`
6. `apps/web/src/features/workforce/documents/DocumentVault.tsx:99`

Mobile:
1. `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232`
2. `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195`
3. `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68`
4. `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446`
5. `apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257`
6. `apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51`

## 11) Definition of done for foundation
1. Core endpoints deployed in dev and validated.
2. Command scaffolding in place for wave 1 writes.
3. Auth-first protection active on all new routes.
4. Idempotency + transaction model defined for command writes.
5. Makefile and CI automation cover bootstrap/deploy/smoke paths.
6. Frontend remains stable during migration.
7. Role-map integration points are documented for next phase.

## 12) Locked defaults (approved)
1. Idempotency key storage strategy:
   - Cloud SQL table, 24-hour retention, keyed by `userId + route + idempotencyKey`.
2. Validation library and schema location:
   - `zod` in `backend/<service>/src/contracts/` (`core/`, `commands/`).
3. Storage bucket naming and split:
   - `krow-workforce-dev-public` and `krow-workforce-dev-private`.
4. Model provider and timeout:
   - Vertex AI Gemini, 20-second max timeout.
5. Target response-time objectives (p95):
   - `/healthz` under 200ms
   - `/core/create-signed-url` under 500ms
   - `/commands/*` under 1500ms
   - `/core/invoke-llm` under 15000ms

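The p95 objectives above can be checked mechanically in a smoke test. The sketch below is one possible check, not a prescribed one: `p95`, `targetFor`, and `meetsObjective` are hypothetical helpers, the percentile uses the nearest-rank method (an assumption, since the plan does not specify one), and "under" is read as strictly less than the target.

```typescript
// Targets copied from the plan (p95, milliseconds). A trailing "/*" matches any sub-route.
const P95_TARGETS_MS: ReadonlyArray<{ route: string; targetMs: number }> = [
  { route: "/healthz", targetMs: 200 },
  { route: "/core/create-signed-url", targetMs: 500 },
  { route: "/commands/*", targetMs: 1500 },
  { route: "/core/invoke-llm", targetMs: 15000 },
];

// Nearest-rank 95th percentile; integer arithmetic avoids floating-point rank errors.
function p95(samples: number[]): number {
  if (samples.length === 0) throw new Error("p95 requires at least one sample");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of bounds");
  return value;
}

// Looks up the objective for a request path, honoring the "/commands/*" wildcard.
function targetFor(path: string): number | undefined {
  for (const { route, targetMs } of P95_TARGETS_MS) {
    const matches = route.endsWith("/*") ? path.startsWith(route.slice(0, -1)) : path === route;
    if (matches) return targetMs;
  }
  return undefined;
}

// True when the measured p95 latency is strictly under the route's objective.
function meetsObjective(path: string, latenciesMs: number[]): boolean {
  const target = targetFor(path);
  if (target === undefined) throw new Error(`No objective defined for ${path}`);
  return p95(latenciesMs) < target;
}
```

With 20 samples, two slow requests are enough to push the nearest-rank p95 onto a slow value, so small smoke-test sample sizes will be sensitive to outliers.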