# M4 Backend Foundation Implementation Plan (Dev First)

Date: 2026-02-24
Owner: Wilfred (Technical Lead)
Primary environment: `krow-workforce-dev`

## 1) Objective
Build a secure, modular, and scalable backend foundation in `dev` without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.

## 2) First-principles architecture rules
1. Client apps are untrusted for business-critical writes.
2. Backend is the enforcement layer for validation, permissions, and write orchestration.
3. Multi-entity writes must be atomic, idempotent, and observable.
4. Configuration and deployment must be reproducible by automation.
5. Migration must be backward-compatible until each frontend flow is cut over.

## 3) Pre-coding gates (must be true before implementation starts)

## Gate A: Security boundary
1. Frontend sends Firebase token only. No database credentials in client code.
2. Every new backend endpoint validates Firebase token.
3. Data Connect write access strategy is defined:
- keep simple reads available to client
- route high-risk writes through backend command endpoints
4. Upload and signed URL paths are server-controlled.

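As a rough sketch of rule 2 in Gate A, an auth middleware might first pull the bearer token out of the `Authorization` header and pass it to a verifier. The names `extractBearerToken`, `TokenVerifier`, `authenticate`, and the `UNAUTHENTICATED` code are hypothetical. The verifier is injected and synchronous here to keep the example dependency-free; a real one would presumably be asynchronous and wrap the Firebase Admin SDK.

```typescript
// Hypothetical helper names; the plan does not prescribe this shape.
type VerifiedActor = { uid: string };

// Injected verifier. Assumption: the real one wraps the Firebase Admin SDK
// and is async; a synchronous stub keeps this sketch self-contained.
type TokenVerifier = (token: string) => VerifiedActor | null;

// Returns the token from an "Authorization: Bearer <token>" header, or null.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

type AuthResult =
  | { ok: true; actor: VerifiedActor }
  | { ok: false; status: 401; code: string };

// Rejects requests with a missing, malformed, or unverifiable token.
function authenticate(header: string | undefined, verify: TokenVerifier): AuthResult {
  const token = extractBearerToken(header);
  if (!token) return { ok: false, status: 401, code: "UNAUTHENTICATED" };
  const actor = verify(token);
  if (!actor) return { ok: false, status: 401, code: "UNAUTHENTICATED" };
  return { ok: true, actor };
}
```

Keeping the verifier behind a small function type means handlers can be tested with a stub instead of real tokens.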
## Gate B: Contract standards
1. Standard error envelope is frozen:
```json
{
  "code": "STRING_CODE",
  "message": "Human readable message",
  "details": {},
  "requestId": "optional-request-id"
}
```
2. Request validation layer is chosen and centralized.
3. Route naming strategy is frozen:
- canonical routes under `/core` and `/commands`
- compatibility aliases preserved during migration (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
4. Validation standard is locked:
- library: `zod`
- schema location: `backend/<service>/src/contracts/` with `core/` and `commands/` subfolders

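The frozen error envelope from Gate B could be mirrored in code roughly as follows. This is a minimal sketch: `ErrorEnvelope`, `makeError`, and `isErrorEnvelope` are hypothetical names, and the hand-written guard stands in for the `zod` schema the plan calls for, purely so the example has no dependencies.

```typescript
// Mirrors the frozen envelope: code, message, details, optional requestId.
interface ErrorEnvelope {
  code: string;
  message: string;
  details: Record<string, unknown>;
  requestId?: string;
}

// Hypothetical helper: builds an envelope, defaulting details to {} and
// omitting requestId when none is supplied.
function makeError(
  code: string,
  message: string,
  details: Record<string, unknown> = {},
  requestId?: string,
): ErrorEnvelope {
  const envelope: ErrorEnvelope = { code, message, details };
  if (requestId !== undefined) envelope.requestId = requestId;
  return envelope;
}

// Hypothetical runtime guard, e.g. for contract tests on responses.
function isErrorEnvelope(value: unknown): value is ErrorEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["code"] === "string" &&
    typeof v["message"] === "string" &&
    typeof v["details"] === "object" &&
    v["details"] !== null &&
    (v["requestId"] === undefined || typeof v["requestId"] === "string")
  );
}
```

Funnelling every failure through one builder is one way to keep the envelope stable across `/core` and `/commands` routes.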
## Gate C: Atomicity and reliability
1. Command endpoints support idempotency keys for retry-safe writes.
2. Multi-step write flows are wrapped in a single backend transaction boundary.
3. Domain conflict codes are defined for expected business failures.
4. Idempotency storage is locked:
- store in Cloud SQL table
- key scope: `userId + route + idempotencyKey`
- retain records for 24 hours
- repeated key returns original response

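The idempotency rules in Gate C can be illustrated with an in-memory stand-in for the Cloud SQL table. This is a sketch under stated assumptions: `IdempotencyStore`, `execute`, and the `replayed` flag are hypothetical, the clock is passed in explicitly, and a real table would additionally need a uniqueness constraint on the scoped key to handle concurrent retries.

```typescript
const RETENTION_MS = 24 * 60 * 60 * 1000; // 24-hour retention from the plan

interface CommandResponse { status: number; body: unknown }
interface StoredResponse extends CommandResponse { storedAt: number }

// In-memory stand-in for the Cloud SQL table; names are hypothetical.
class IdempotencyStore {
  private records = new Map<string, StoredResponse>();

  // Key scope from the plan: userId + route + idempotencyKey.
  private static scope(userId: string, route: string, key: string): string {
    return JSON.stringify([userId, route, key]);
  }

  // Runs `handler` once per scoped key; within the retention window a
  // repeated key returns the original response without re-running it.
  execute(
    userId: string,
    route: string,
    key: string,
    now: number,
    handler: () => CommandResponse,
  ): CommandResponse & { replayed: boolean } {
    const scoped = IdempotencyStore.scope(userId, route, key);
    const existing = this.records.get(scoped);
    if (existing && now - existing.storedAt < RETENTION_MS) {
      return { status: existing.status, body: existing.body, replayed: true };
    }
    const result = handler();
    this.records.set(scoped, { ...result, storedAt: now });
    return { ...result, replayed: false };
  }
}
```

Because the scope includes the user and route, the same client-generated key sent by another user or to another command is treated as a fresh request.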
## Gate D: Automation and operability
1. Makefile is source of truth for backend setup and deploy in dev.
2. Core deploy and smoke test commands exist before feature migration.
3. Logging format and request tracing fields are standardized.

## 4) Security baseline for foundation phase

## 4.1 Authentication and authorization
1. Foundation phase is authentication-first.
2. Role-based access control is intentionally deferred.
3. All handlers include a policy hook for future role checks (`can(action, resource, actor)`).

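One possible shape for the `can(action, resource, actor)` hook in 4.1 is sketched below. Only the hook's argument order comes from the plan; the `Actor` and `Resource` types, the `guard` wrapper, and the `FORBIDDEN` code are assumptions made for illustration.

```typescript
// Hypothetical shapes; only can(action, resource, actor) comes from the plan.
interface Actor { uid: string; roles?: string[] }
interface Resource { type: string; id?: string }
type PolicyHook = (action: string, resource: Resource, actor: Actor | null) => boolean;

// Foundation-phase policy: any authenticated actor is allowed.
// Role checks would later replace this body without changing call sites.
const can: PolicyHook = (_action, _resource, actor) => actor !== null;

type Guarded<T> = { ok: true; value: T } | { ok: false; code: string };

// Handlers call the hook before doing any work.
function guard<T>(
  policy: PolicyHook,
  action: string,
  resource: Resource,
  actor: Actor | null,
  run: () => T,
): Guarded<T> {
  if (!policy(action, resource, actor)) return { ok: false, code: "FORBIDDEN" };
  return { ok: true, value: run() };
}
```

Passing the policy as a parameter keeps the deferred role-based checks a drop-in change rather than a handler rewrite.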
## 4.2 Data access control model
1. Client retains Data Connect reads required for existing screens.
2. High-risk writes move behind `/commands/*` endpoints.
3. Backend mediates write interactions with Data Connect and Cloud SQL.

## 4.3 File and URL security
1. Validate file type and size server-side.
2. Separate public and private storage behavior.
3. Signed URL creation checks ownership/prefix scope and expiry limits.
4. Bucket policy split is locked:
- `krow-workforce-dev-public`
- `krow-workforce-dev-private`
- private bucket access only through signed URL

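The ownership, prefix, and expiry checks in 4.3 might look like the sketch below. The `uploads/<uid>/` prefix layout, the 15-minute expiry cap, and all error codes are assumptions for illustration; the plan fixes only the bucket names and the requirement that these checks exist.

```typescript
// Assumed values: the plan does not fix a prefix layout or an expiry cap.
const MAX_EXPIRY_SECONDS = 15 * 60;
const PRIVATE_BUCKET = "krow-workforce-dev-private"; // bucket name from the plan

interface SignedUrlRequest {
  bucket: string;
  objectPath: string;
  expiresInSeconds: number;
}

type Check = { ok: true } | { ok: false; code: string };

// Validates a signed URL request before any URL is generated.
function checkSignedUrlRequest(req: SignedUrlRequest, actorUid: string): Check {
  if (req.bucket !== PRIVATE_BUCKET) return { ok: false, code: "BUCKET_NOT_ALLOWED" };
  // Reject traversal segments before comparing prefixes.
  if (req.objectPath.split("/").some((segment) => segment === "..")) {
    return { ok: false, code: "INVALID_PATH" };
  }
  // Hypothetical ownership rule: objects live under uploads/<uid>/.
  if (!req.objectPath.startsWith(`uploads/${actorUid}/`)) {
    return { ok: false, code: "FORBIDDEN_PREFIX" };
  }
  const ttl = req.expiresInSeconds;
  if (!Number.isInteger(ttl) || ttl <= 0 || ttl > MAX_EXPIRY_SECONDS) {
    return { ok: false, code: "INVALID_EXPIRY" };
  }
  return { ok: true };
}
```

Running these checks in the backend rather than trusting client-supplied paths is what keeps signed URL paths server-controlled.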
## 4.4 Model invocation safety
1. Enforce schema-constrained output.
2. Apply per-user rate limits and request timeout.
3. Log model failures with safe redaction (no sensitive prompt leakage in logs).
4. Model provider and timeout defaults are locked:
- provider: Vertex AI Gemini
- max route timeout: 20 seconds
- timeout error code: `MODEL_TIMEOUT`

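The 20-second limit and `MODEL_TIMEOUT` code from 4.4 could be enforced with a small wrapper around the model call. This is a sketch only: `ModelTimeoutError` and `withModelTimeout` are hypothetical names, and `call` stands in for whatever client function actually invokes the model.

```typescript
const MODEL_TIMEOUT_MS = 20000; // max route timeout from the plan

// Carries the locked error code so the error mapper can build the envelope.
class ModelTimeoutError extends Error {
  readonly code = "MODEL_TIMEOUT";
  constructor(timeoutMs: number) {
    super(`Model call exceeded ${timeoutMs}ms`);
    this.name = "ModelTimeoutError";
  }
}

// Rejects with ModelTimeoutError if `call` has not settled within timeoutMs.
// `call` is a placeholder for the real model invocation.
function withModelTimeout<T>(
  call: () => Promise<T>,
  timeoutMs: number = MODEL_TIMEOUT_MS,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new ModelTimeoutError(timeoutMs)), timeoutMs);
    call().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

Note that this only stops waiting; it does not cancel the underlying request, so a real implementation may also want to pass an abort signal to the model client if one is supported.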
## 4.5 Secrets and credentials
1. Runtime secrets come from Secret Manager only.
2. Service accounts use least-privilege roles.
3. No secrets committed in repository files.

## 5) Modularity baseline

## 5.1 Backend module boundaries
1. `core` module: upload, signed URL, model invocation, health.
2. `commands` module: business writes and state transitions.
3. `policy` module: validation and future role checks.
4. `data` module: Data Connect adapters and transaction wrappers.
5. `infra` module: logging, tracing, auth middleware, error mapping.

## 5.2 Contract separation
1. Keep API request/response schemas in one location.
2. Keep domain errors in one registry file.
3. Keep route declarations thin; business logic in services.

## 5.3 Cloud runtime roles
1. Cloud Run is the primary command and core API execution layer.
2. Cloud Functions v2 is worker-only in this phase:
- upload-related async handlers
- notification jobs
- model-related async helpers when needed

## 6) Automation baseline

## 6.1 Makefile requirements
Add `makefiles/backend.mk` and wire it into root `Makefile` with at least:
1. `make backend-enable-apis`
2. `make backend-bootstrap-dev`
3. `make backend-deploy-core`
4. `make backend-deploy-commands`
5. `make backend-deploy-workers`
6. `make backend-smoke-core`
7. `make backend-smoke-commands`
8. `make backend-logs-core`

## 6.2 CI requirements
1. Backend lint
2. Backend tests
3. Build/package
4. Smoke test against deployed dev route(s)
5. Block merge on failed checks

## 6.3 Session hygiene
1. Update `TASKS.md` and `CHANGELOG.md` each working session.
2. If a new service/API is added, a Makefile target must be added in the same change.

## 7) Migration safety contract (no frontend breakage)
1. Backend routes ship first.
2. Frontend migration is per-feature wave, not big bang.
3. Keep compatibility aliases until clients migrate.
4. Keep existing Data Connect reads during foundation.
5. For each migrated write flow:
- before/after behavior checklist
- rollback path
- smoke verification

## 8) Scope for foundation build
1. Backend runtime/deploy foundation in dev.
2. Core endpoints:
- `POST /core/upload-file`
- `POST /core/create-signed-url`
- `POST /core/invoke-llm`
- `GET /healthz`
3. Compatibility aliases:
- `POST /uploadFile`
- `POST /createSignedUrl`
- `POST /invokeLLM`
4. Command layer scaffold for first migration routes.
5. Initial migration of highest-risk write paths.

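One way to serve the compatibility aliases without duplicating handlers is to rewrite each legacy path to its canonical route before dispatch. The sketch below assumes the alias-to-canonical pairing implied by the route names; `resolveCanonicalPath` and the `viaAlias` flag are hypothetical.

```typescript
// Pairing assumed from the route names in the core and alias lists.
const COMPATIBILITY_ALIASES: Record<string, string> = {
  "/uploadFile": "/core/upload-file",
  "/createSignedUrl": "/core/create-signed-url",
  "/invokeLLM": "/core/invoke-llm",
};

// Hypothetical helper: maps a legacy path to its canonical route so both
// reach the same handler; unknown paths pass through unchanged.
function resolveCanonicalPath(path: string): { path: string; viaAlias: boolean } {
  const canonical = Object.prototype.hasOwnProperty.call(COMPATIBILITY_ALIASES, path)
    ? COMPATIBILITY_ALIASES[path]
    : undefined;
  return canonical !== undefined
    ? { path: canonical, viaAlias: true }
    : { path, viaAlias: false };
}
```

Logging `viaAlias` per request would show how much traffic still uses the legacy routes, which may help decide when an alias can be retired.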
## 9) Implementation phases

## Phase 0: Baseline and contracts
Deliverables:
1. Freeze endpoint naming and compatibility aliases.
2. Freeze error envelope and error code registry.
3. Freeze auth middleware interface and policy hook interface.
4. Publish route inventory from web/mobile direct writes.

Exit criteria:
1. No unresolved contract ambiguity.
2. Team agrees on the auth-first-now, role-map-later approach.

## Phase 1: Backend infra and automation
Deliverables:
1. `makefiles/backend.mk` with bootstrap, deploy, smoke, logs targets.
2. Environment templates for backend runtime config.
3. Secret Manager and service account setup automation.

Exit criteria:
1. A fresh machine can deploy core backend to dev via Make commands.

## Phase 2: Core endpoint implementation
Deliverables:
1. `/core/upload-file`
2. `/core/create-signed-url`
3. `/core/invoke-llm`
4. `/healthz`
5. Compatibility aliases (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)

Exit criteria:
1. API harness passes for core routes.
2. Error, logging, and auth standards are enforced.

## Phase 3: Command layer scaffold
Deliverables:
1. `/commands/orders/create`
2. `/commands/orders/{orderId}/cancel`
3. `/commands/orders/{orderId}/update`
4. `/commands/shifts/{shiftId}/change-status`
5. `/commands/shifts/{shiftId}/assign-staff`
6. `/commands/shifts/{shiftId}/accept`

Exit criteria:
1. High-risk writes have backend command alternatives ready.

## Phase 4: Wave 1 frontend migration
Deliverables:
1. Replace direct writes in selected web/mobile flows.
2. Keep reads stable.
3. Verify no regressions in non-migrated screens.

Exit criteria:
1. Migrated flows run through backend commands only.
2. Rollback instructions validated.

## Phase 5: Hardening and handoff
Deliverables:
1. Runbook for deploy, rollback, and smoke.
2. Backend CI pipeline active.
3. Wave 2 and wave 3 migration task list defined.

Exit criteria:
1. Foundation is reusable for staging/prod with environment changes only.

## 10) Wave 1 migration inventory (real call sites)

Web:
1. `apps/web/src/features/operations/tasks/TaskBoard.tsx:100`
2. `apps/web/src/features/operations/orders/OrderDetail.tsx:145`
3. `apps/web/src/features/operations/orders/EditOrder.tsx:84`
4. `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31`
5. `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60`
6. `apps/web/src/features/workforce/documents/DocumentVault.tsx:99`

Mobile:
1. `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232`
2. `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195`
3. `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68`
4. `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446`
5. `apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257`
6. `apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51`

## 11) Definition of done for foundation
1. Core endpoints deployed in dev and validated.
2. Command scaffolding in place for wave 1 writes.
3. Auth-first protection active on all new routes.
4. Idempotency + transaction model defined for command writes.
5. Makefile and CI automation cover bootstrap/deploy/smoke paths.
6. Frontend remains stable during migration.
7. Role-map integration points are documented for next phase.

## 12) Locked defaults (approved)
1. Idempotency key storage strategy:
- Cloud SQL table, 24-hour retention, keyed by `userId + route + idempotencyKey`.
2. Validation library and schema location:
- `zod` in `backend/<service>/src/contracts/` (`core/`, `commands/`).
3. Storage bucket naming and split:
- `krow-workforce-dev-public` and `krow-workforce-dev-private`.
4. Model provider and timeout:
- Vertex AI Gemini, 20-second max timeout.
5. Target response-time objectives (p95):
- `/healthz` under 200ms
- `/core/create-signed-url` under 500ms
- `/commands/*` under 1500ms
- `/core/invoke-llm` under 15000ms

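The p95 objectives could be checked in a smoke or load test along the lines of the sketch below. The budgets are copied from the list above; `percentile` and `meetsP95` are hypothetical helpers, and the nearest-rank definition of a percentile is an assumption, since the plan does not say how p95 is computed.

```typescript
// Budgets in milliseconds, copied from the p95 objectives.
const P95_BUDGET_MS: Record<string, number> = {
  "/healthz": 200,
  "/core/create-signed-url": 500,
  "/commands/*": 1500,
  "/core/invoke-llm": 15000,
};

// Nearest-rank percentile (assumed definition): the smallest sample such
// that at least p percent of all samples are at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] as number;
}

// True when the observed p95 is strictly under the route's budget.
function meetsP95(routeKey: string, latenciesMs: number[]): boolean {
  const budget = P95_BUDGET_MS[routeKey];
  if (budget === undefined) throw new Error(`no budget for ${routeKey}`);
  return percentile(latenciesMs, 95) < budget;
}
```

For example, `meetsP95("/commands/*", [1000, 1200, 1600])` is false under this definition, because with only three samples the nearest-rank p95 is the slowest one.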