M4 Backend Foundation Implementation Plan (Dev First)
Date: 2026-02-24
Owner: Wilfred (Technical Lead)
Primary environment: krow-workforce-dev
1) Objective
Build a secure, modular, and scalable backend foundation in dev without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.
2) First-principles architecture rules
- Client apps are untrusted for business-critical writes.
- Backend is the enforcement layer for validation, permissions, and write orchestration.
- Multi-entity writes must be atomic, idempotent, and observable.
- Configuration and deployment must be reproducible by automation.
- Migration must be backward-compatible until each frontend flow is cut over.
3) Pre-coding gates (must be true before implementation starts)
Gate A: Security boundary
- Frontend sends Firebase token only. No database credentials in client code.
- Every new backend endpoint validates Firebase token.
- Data Connect write access strategy is defined:
- keep simple reads available to client
- route high-risk writes through backend command endpoints
- Upload and signed URL paths are server-controlled.
Gate B: Contract standards
- Standard error envelope is frozen:
{
  "code": "STRING_CODE",
  "message": "Human readable message",
  "details": {},
  "requestId": "optional-request-id"
}
- Request validation layer is chosen and centralized.
- Route naming strategy is frozen:
- canonical routes under /core and /commands
- compatibility aliases preserved during migration (/uploadFile, /createSignedUrl, /invokeLLM)
- Validation standard is locked:
- library: zod
- schema location: backend/<service>/src/contracts/ with core/ and commands/ subfolders
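A minimal TypeScript sketch of the frozen error envelope from Gate B. The helper name toErrorEnvelope and the code VALIDATION_ERROR are hypothetical, chosen for illustration only; the plan does not define them.

```typescript
// Shape of the frozen error envelope (fields taken from Gate B).
interface ErrorEnvelope {
  code: string;
  message: string;
  details: Record<string, unknown>;
  requestId?: string;
}

// Hypothetical helper: builds an envelope and omits requestId when none is known.
function toErrorEnvelope(
  code: string,
  message: string,
  details: Record<string, unknown> = {},
  requestId?: string,
): ErrorEnvelope {
  const envelope: ErrorEnvelope = { code, message, details };
  if (requestId !== undefined) {
    envelope.requestId = requestId;
  }
  return envelope;
}

// Example: a validation failure reported with a trace id (values are made up).
const exampleEnvelope = toErrorEnvelope(
  "VALIDATION_ERROR",
  "orderId is required",
  { field: "orderId" },
  "req-123",
);
console.log(JSON.stringify(exampleEnvelope));
```

Building every error through one helper like this is one way to keep the envelope identical across routes; the real service may prefer to do it in the error-mapping middleware instead.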
Gate C: Atomicity and reliability
- Command endpoints support idempotency keys for retry-safe writes.
- Multi-step write flows are wrapped in a single backend transaction boundary.
- Domain conflict codes are defined for expected business failures.
- Idempotency storage is locked:
- store in Cloud SQL table
- key scope: userId + route + idempotencyKey
- retain records for 24 hours
- repeated key returns original response
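The idempotency rules above can be sketched as follows. This is an in-memory stand-in used only to show the key scope, the 24-hour retention, and the replay behavior; the locked design stores records in a Cloud SQL table, and the class and method names here are hypothetical.

```typescript
interface StoredResponse {
  status: number;
  body: unknown;
}

interface IdempotencyRecord {
  response: StoredResponse;
  expiresAt: number; // epoch milliseconds
}

// 24-hour retention, as locked in Gate C.
const RETENTION_MS = 24 * 60 * 60 * 1000;

// In-memory stand-in for the Cloud SQL idempotency table (illustration only).
class InMemoryIdempotencyStore {
  private records = new Map<string, IdempotencyRecord>();

  // Key scope: userId + route + idempotencyKey.
  private static scopeKey(userId: string, route: string, idempotencyKey: string): string {
    return JSON.stringify([userId, route, idempotencyKey]);
  }

  // Runs `handler` once per scoped key; a retry inside the retention
  // window gets the original response back without re-running the write.
  execute(
    userId: string,
    route: string,
    idempotencyKey: string,
    handler: () => StoredResponse,
    now: number = Date.now(),
  ): StoredResponse {
    const key = InMemoryIdempotencyStore.scopeKey(userId, route, idempotencyKey);
    const existing = this.records.get(key);
    if (existing !== undefined && existing.expiresAt > now) {
      return existing.response;
    }
    const response = handler();
    this.records.set(key, { response, expiresAt: now + RETENTION_MS });
    return response;
  }
}
```

In a real command endpoint the handler would be asynchronous, and the record would need to be written in the same transaction as the business write so that a crash between the two cannot produce a duplicate.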
Gate D: Automation and operability
- Makefile is source of truth for backend setup and deploy in dev.
- Core deploy and smoke test commands exist before feature migration.
- Logging format and request tracing fields are standardized.
4) Security baseline for foundation phase
4.1 Authentication and authorization
- Foundation phase is authentication-first.
- Role-based access control is intentionally deferred.
- All handlers include a policy hook for future role checks (can(action, resource, actor)).
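One possible shape for the policy hook, sketched under assumptions: the Actor and Resource types, the policyRules list, and the action strings are all hypothetical. Only the can(action, resource, actor) signature comes from this plan.

```typescript
// Hypothetical shapes; the plan only fixes the can(action, resource, actor) signature.
interface Actor {
  uid: string | null; // null when the Firebase token is missing or invalid
}

interface Resource {
  type: string;
  id?: string;
}

type PolicyRule = (action: string, resource: Resource, actor: Actor) => boolean;

// Empty during the foundation phase; role checks would be registered here later.
const policyRules: PolicyRule[] = [];

// Authentication-first policy: any authenticated actor passes,
// unless a registered rule rejects the request.
function can(action: string, resource: Resource, actor: Actor): boolean {
  if (actor.uid === null || actor.uid.length === 0) {
    return false;
  }
  return policyRules.every((rule) => rule(action, resource, actor));
}
```

Because handlers already call can(...) in the foundation phase, adding role checks later would mean registering rules rather than touching each handler.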
4.2 Data access control model
- Client retains Data Connect reads required for existing screens.
- High-risk writes move behind /commands/* endpoints.
- Backend mediates write interactions with Data Connect and Cloud SQL.
4.3 File and URL security
- Validate file type and size server-side.
- Separate public and private storage behavior.
- Signed URL creation checks ownership/prefix scope and expiry limits.
- Bucket policy split is locked:
- krow-workforce-dev-public
- krow-workforce-dev-private
- private bucket access only through signed URL
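The signed URL checks could look roughly like this. The per-user prefix layout (users/<uid>/), the 15-minute expiry ceiling, and the error codes are assumptions made for the sketch; the plan only states that ownership/prefix scope and expiry limits are checked.

```typescript
const PRIVATE_BUCKET = "krow-workforce-dev-private";

// Assumption: 15 minutes is a placeholder ceiling, not a value from this plan.
const MAX_EXPIRY_SECONDS = 15 * 60;

interface SignedUrlRequest {
  bucket: string;
  objectPath: string;
  expiresInSeconds: number;
}

type SignedUrlCheck = { ok: true } | { ok: false; code: string };

// Checks bucket, ownership prefix, and expiry before any URL is signed.
// Error codes are hypothetical examples.
function validateSignedUrlRequest(uid: string, req: SignedUrlRequest): SignedUrlCheck {
  if (req.bucket !== PRIVATE_BUCKET) {
    return { ok: false, code: "BUCKET_NOT_ALLOWED" };
  }
  // Reject traversal segments so the prefix check cannot be bypassed.
  if (req.objectPath.split("/").includes("..")) {
    return { ok: false, code: "INVALID_PATH" };
  }
  // Assumed layout: each user owns everything under users/<uid>/.
  if (!req.objectPath.startsWith(`users/${uid}/`)) {
    return { ok: false, code: "FORBIDDEN_PREFIX" };
  }
  const expiry = req.expiresInSeconds;
  if (!Number.isInteger(expiry) || expiry <= 0 || expiry > MAX_EXPIRY_SECONDS) {
    return { ok: false, code: "INVALID_EXPIRY" };
  }
  return { ok: true };
}
```

The uid argument is meant to come from the verified Firebase token, never from the request body.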
4.4 Model invocation safety
- Enforce schema-constrained output.
- Apply per-user rate limits and request timeout.
- Log model failures with safe redaction (no sensitive prompt leakage in logs).
- Model provider and timeout defaults are locked:
- provider: Vertex AI Gemini
- max route timeout: 20 seconds
- timeout error code: MODEL_TIMEOUT
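A minimal sketch of how the 20-second limit and the MODEL_TIMEOUT code might be enforced around a model call. The wrapper name withModelTimeout and the ModelTimeoutError class are hypothetical, and the model call itself is left abstract as any function returning a promise.

```typescript
// Max route timeout from section 4.4.
const MODEL_TIMEOUT_MS = 20000;

class ModelTimeoutError extends Error {
  readonly code = "MODEL_TIMEOUT";

  constructor(timeoutMs: number) {
    super(`Model call exceeded ${timeoutMs} ms`);
    this.name = "ModelTimeoutError";
  }
}

// Races the model call against a timer. Note that this only stops waiting;
// it does not cancel the underlying request.
async function withModelTimeout<T>(
  call: () => Promise<T>,
  timeoutMs: number = MODEL_TIMEOUT_MS,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new ModelTimeoutError(timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([call(), timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

An error-mapping layer could then translate any error whose code is MODEL_TIMEOUT into the standard error envelope.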
4.5 Secrets and credentials
- Runtime secrets come from Secret Manager only.
- Service accounts use least-privilege roles.
- No secrets committed in repository files.
5) Modularity baseline
5.1 Backend module boundaries
- core module: upload, signed URL, model invocation, health.
- commands module: business writes and state transitions.
- policy module: validation and future role checks.
- data module: Data Connect adapters and transaction wrappers.
- infra module: logging, tracing, auth middleware, error mapping.
5.2 Contract separation
- Keep API request/response schemas in one location.
- Keep domain errors in one registry file.
- Keep route declarations thin; business logic in services.
5.3 Cloud runtime roles
- Cloud Run is the primary command and core API execution layer.
- Cloud Functions v2 is worker-only in this phase:
- upload-related async handlers
- notification jobs
- model-related async helpers when needed
6) Automation baseline
6.1 Makefile requirements
Add makefiles/backend.mk and wire it into root Makefile with at least:
- make backend-enable-apis
- make backend-bootstrap-dev
- make backend-deploy-core
- make backend-deploy-commands
- make backend-deploy-workers
- make backend-smoke-core
- make backend-smoke-commands
- make backend-logs-core
6.2 CI requirements
- Backend lint
- Backend tests
- Build/package
- Smoke test against deployed dev route(s)
- Block merge on failed checks
6.3 Session hygiene
- Update TASKS.md and CHANGELOG.md each working session.
- If a new service/API is added, a Makefile target must be added in the same change.
7) Migration safety contract (no frontend breakage)
- Backend routes ship first.
- Frontend migration is per-feature wave, not big bang.
- Keep compatibility aliases until clients migrate.
- Keep existing Data Connect reads during foundation.
- For each migrated write flow:
- before/after behavior checklist
- rollback path
- smoke verification
8) Scope for foundation build
- Backend runtime/deploy foundation in dev.
- Core endpoints:
- POST /core/upload-file
- POST /core/create-signed-url
- POST /core/invoke-llm
- GET /healthz
- Compatibility aliases:
- POST /uploadFile
- POST /createSignedUrl
- POST /invokeLLM
- Command layer scaffold for first migration routes.
- Initial migration of highest-risk write paths.
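The compatibility aliases listed above can be kept as a single lookup so that both spellings reach the same handler. In this sketch the pairing of each alias with its canonical route is inferred from the matching names, and resolveCanonicalRoute is a hypothetical helper.

```typescript
// Alias -> canonical route. The pairing is assumed from the matching names.
const COMPATIBILITY_ALIASES = new Map<string, string>([
  ["/uploadFile", "/core/upload-file"],
  ["/createSignedUrl", "/core/create-signed-url"],
  ["/invokeLLM", "/core/invoke-llm"],
]);

// Returns the canonical route for a legacy alias, or the path unchanged.
function resolveCanonicalRoute(path: string): string {
  return COMPATIBILITY_ALIASES.get(path) ?? path;
}

console.log(resolveCanonicalRoute("/uploadFile"));
console.log(resolveCanonicalRoute("/healthz"));
```

Resolving aliases before routing means each handler is registered once, and retiring an alias after clients migrate is a one-line removal.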
9) Implementation phases
Phase 0: Baseline and contracts
Deliverables:
- Freeze endpoint naming and compatibility aliases.
- Freeze error envelope and error code registry.
- Freeze auth middleware interface and policy hook interface.
- Publish route inventory from web/mobile direct writes.
Exit criteria:
- No unresolved contract ambiguity.
- Team agrees on the auth-first-now, role-map-later approach.
Phase 1: Backend infra and automation
Deliverables:
- makefiles/backend.mk with bootstrap, deploy, smoke, logs targets.
- Environment templates for backend runtime config.
- Secret Manager and service account setup automation.
Exit criteria:
- A fresh machine can deploy core backend to dev via Make commands.
Phase 2: Core endpoint implementation
Deliverables:
- /core/upload-file
- /core/create-signed-url
- /core/invoke-llm
- /healthz
- Compatibility aliases (/uploadFile, /createSignedUrl, /invokeLLM)
Exit criteria:
- API harness passes for core routes.
- Error, logging, and auth standards are enforced.
Phase 3: Command layer scaffold
Deliverables:
- /commands/orders/create
- /commands/orders/{orderId}/cancel
- /commands/orders/{orderId}/update
- /commands/shifts/{shiftId}/change-status
- /commands/shifts/{shiftId}/assign-staff
- /commands/shifts/{shiftId}/accept
Exit criteria:
- High-risk writes have backend command alternatives ready.
Phase 4: Wave 1 frontend migration
Deliverables:
- Replace direct writes in selected web/mobile flows.
- Keep reads stable.
- Verify no regressions in non-migrated screens.
Exit criteria:
- Migrated flows run through backend commands only.
- Rollback instructions validated.
Phase 5: Hardening and handoff
Deliverables:
- Runbook for deploy, rollback, and smoke.
- Backend CI pipeline active.
- Wave 2 and wave 3 migration task list defined.
Exit criteria:
- Foundation is reusable for staging/prod with environment changes only.
10) Wave 1 migration inventory (real call sites)
Web:
- apps/web/src/features/operations/tasks/TaskBoard.tsx:100
- apps/web/src/features/operations/orders/OrderDetail.tsx:145
- apps/web/src/features/operations/orders/EditOrder.tsx:84
- apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31
- apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60
- apps/web/src/features/workforce/documents/DocumentVault.tsx:99
Mobile:
- apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232
- apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195
- apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68
- apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446
- apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257
- apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51
11) Definition of done for foundation
- Core endpoints deployed in dev and validated.
- Command scaffolding in place for wave 1 writes.
- Auth-first protection active on all new routes.
- Idempotency + transaction model defined for command writes.
- Makefile and CI automation cover bootstrap/deploy/smoke paths.
- Frontend remains stable during migration.
- Role-map integration points are documented for next phase.
12) Locked defaults (approved)
- Idempotency key storage strategy:
- Cloud SQL table, 24-hour retention, keyed by userId + route + idempotencyKey.
- Validation library and schema location:
- zod in backend/<service>/src/contracts/ (core/, commands/).
- Storage bucket naming and split:
- krow-workforce-dev-public and krow-workforce-dev-private.
- Model provider and timeout:
- Vertex AI Gemini, 20-second max timeout.
- Target response-time objectives (p95):
- /healthz under 200ms
- /core/create-signed-url under 500ms
- /commands/* under 1500ms
- /core/invoke-llm under 15000ms
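A smoke test or CI step could check the p95 objectives above along these lines. This sketch assumes latency samples are already collected in milliseconds and uses the nearest-rank percentile definition; the function names are hypothetical.

```typescript
// p95 targets from the locked defaults, in milliseconds.
// A trailing "/*" marks a prefix match.
const P95_TARGETS_MS: ReadonlyArray<{ pattern: string; targetMs: number }> = [
  { pattern: "/healthz", targetMs: 200 },
  { pattern: "/core/create-signed-url", targetMs: 500 },
  { pattern: "/commands/*", targetMs: 1500 },
  { pattern: "/core/invoke-llm", targetMs: 15000 },
];

// Finds the target for a route, or undefined when none is defined.
function p95TargetFor(route: string): number | undefined {
  for (const { pattern, targetMs } of P95_TARGETS_MS) {
    const matches = pattern.endsWith("/*")
      ? route.startsWith(pattern.slice(0, -1))
      : route === pattern;
    if (matches) return targetMs;
  }
  return undefined;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("percentile needs at least one sample");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("percentile rank out of range");
  return value;
}

// True when the observed p95 is strictly under the route's target.
function meetsP95Target(route: string, samplesMs: number[]): boolean {
  const targetMs = p95TargetFor(route);
  if (targetMs === undefined) throw new Error(`no p95 target for ${route}`);
  return percentile(samplesMs, 95) < targetMs;
}
```

With only a handful of samples the nearest-rank p95 is simply the slowest request, so a meaningful check needs a reasonably large sample per route.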