docs: lock backend foundation plan and tracking format
docs/MILESTONES/M4/planning/m4-api-catalog.md
@@ -0,0 +1,228 @@
# M4 API Catalog (Implementation Contract)

Status: Draft
Date: 2026-02-24
Owner: Technical Lead
Environment: dev

## 1) Scope and purpose
This file defines the backend endpoint contract for the M4 foundation build.

## 2) Global API rules
1. Canonical route groups:
   - `/core/*` for foundational integration routes
   - `/commands/*` for business-critical writes
2. Foundation phase security model:
   - authenticated user required
   - role map enforcement deferred
   - policy hook required in handler design
3. Standard error envelope:
   ```json
   {
     "code": "STRING_CODE",
     "message": "Human readable message",
     "details": {},
     "requestId": "optional-request-id"
   }
   ```
4. Request headers:
   - `Authorization: Bearer <firebase-token>` (required)
   - `X-Request-Id: <uuid>` (optional but recommended)
5. Required response headers:
   - `X-Request-Id`
6. Validation:
   - all input validated server-side
   - reject unknown/invalid fields
7. Logging:
   - route
   - requestId
   - actorId
   - latencyMs
   - outcome
8. Timeouts and retries:
   - command writes must be retry-safe
   - use idempotency keys for command write routes
9. Idempotency storage:
   - store in Cloud SQL table
   - key scope: `userId + route + idempotencyKey`
   - key retention: 24 hours
   - repeated key returns original response payload

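The error envelope and request-ID rules above can be sketched in TypeScript. This is a minimal illustration, not the project's implementation: `makeErrorEnvelope` and `resolveRequestId` are hypothetical helper names, and generating an ID when the caller omits `X-Request-Id` is an assumption made so the required response header can always be set.

```typescript
// Error envelope shape from rule 3; field names come from the catalog.
interface ErrorEnvelope {
  code: string;
  message: string;
  details: Record<string, unknown>;
  requestId?: string;
}

// Hypothetical helper: builds the envelope and omits requestId when absent.
function makeErrorEnvelope(
  code: string,
  message: string,
  details: Record<string, unknown> = {},
  requestId?: string,
): ErrorEnvelope {
  const envelope: ErrorEnvelope = { code, message, details };
  if (requestId !== undefined) {
    envelope.requestId = requestId;
  }
  return envelope;
}

// Hypothetical helper: reuse the caller's X-Request-Id when present, otherwise
// call the supplied generator (assumption) so the response header can be set.
function resolveRequestId(
  headers: Record<string, string | undefined>,
  generate: () => string,
): string {
  const incoming = headers["x-request-id"];
  return incoming !== undefined && incoming.trim() !== "" ? incoming : generate();
}
```

Passing the generator in as a parameter keeps the sketch free of runtime-specific UUID calls; a real handler would likely plug in whatever UUID source the service already uses.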
## 3) Compatibility aliases (transition)
1. `POST /uploadFile` -> `POST /core/upload-file`
2. `POST /createSignedUrl` -> `POST /core/create-signed-url`
3. `POST /invokeLLM` -> `POST /core/invoke-llm`

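One way to keep the aliases above thin is a lookup that rewrites the legacy path to its canonical route before dispatch. The sketch below assumes that approach; `canonicalPath` is a hypothetical helper, not a name from the codebase.

```typescript
// Alias -> canonical route, copied from the list above.
const ALIASES = new Map<string, string>([
  ["/uploadFile", "/core/upload-file"],
  ["/createSignedUrl", "/core/create-signed-url"],
  ["/invokeLLM", "/core/invoke-llm"],
]);

// Hypothetical helper: returns the canonical path for a POST alias, or the
// original path unchanged when no alias applies.
function canonicalPath(method: string, path: string): string {
  if (method.toUpperCase() !== "POST") {
    return path;
  }
  return ALIASES.get(path) ?? path;
}
```

Resolving aliases in one place means both route names share the same handler, validation, and logging, so removing an alias later is a one-line change.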
## 4) Rate-limit baseline (initial)
1. `/core/invoke-llm`: 60 requests per minute per user
2. `/core/upload-file`: 30 requests per minute per user
3. `/core/create-signed-url`: 120 requests per minute per user
4. `/commands/*`: 60 requests per minute per user

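The baseline above could be enforced with a simple per-user counter. The sketch below is illustrative only: the catalog does not name an algorithm, so the fixed one-minute window, the in-memory map, and the choice to share one budget across all `/commands/*` routes are assumptions, and `FixedWindowLimiter` is a hypothetical name.

```typescript
// Requests per minute per user, from the baseline above.
function limitPerMinute(path: string): number | undefined {
  if (path === "/core/invoke-llm") return 60;
  if (path === "/core/upload-file") return 30;
  if (path === "/core/create-signed-url") return 120;
  if (path.startsWith("/commands/")) return 60;
  return undefined; // no baseline is defined for other routes
}

// Fixed-window counter (assumed algorithm), kept in memory for illustration.
class FixedWindowLimiter {
  private readonly counts = new Map<string, { windowStart: number; count: number }>();

  // Returns true when the request fits in the current one-minute window.
  allow(userId: string, path: string, nowMs: number): boolean {
    const limit = limitPerMinute(path);
    if (limit === undefined) return true;
    // Assumption: all command routes share a single per-user budget.
    const bucket = path.startsWith("/commands/") ? "/commands/*" : path;
    const key = `${userId}|${bucket}`;
    const windowStart = Math.floor(nowMs / 60000) * 60000;
    const entry = this.counts.get(key);
    if (entry === undefined || entry.windowStart !== windowStart) {
      this.counts.set(key, { windowStart, count: 1 });
      return true;
    }
    if (entry.count >= limit) return false;
    entry.count += 1;
    return true;
  }
}
```

A multi-instance deployment would need shared state instead of a process-local map; the counter logic itself would stay the same.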
## 4.1 Timeout baseline (initial)
1. `/core/invoke-llm`: 20-second hard timeout
2. other `/core/*` routes: 10-second timeout
3. `/commands/*` routes: 15-second timeout

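The timeout table above depends on match order: the specific `/core/invoke-llm` rule has to be checked before the general `/core/*` rule. A small sketch, with `timeoutMsFor` and `isTimedOut` as hypothetical helper names:

```typescript
// Timeout budget in milliseconds, from the baseline above.
function timeoutMsFor(path: string): number | undefined {
  if (path === "/core/invoke-llm") return 20000; // specific rule first
  if (path.startsWith("/core/")) return 10000;
  if (path.startsWith("/commands/")) return 15000;
  return undefined; // e.g. /healthz has no stated timeout
}

// Hypothetical helper: true once a request has used up its budget.
function isTimedOut(path: string, startedAtMs: number, nowMs: number): boolean {
  const budget = timeoutMsFor(path);
  return budget !== undefined && nowMs - startedAtMs >= budget;
}
```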
## 5) Core routes

## 5.1 Upload file
1. Method and route: `POST /core/upload-file`
2. Auth: required
3. Idempotency key: optional
4. Request: multipart form data
   - `file` (required)
   - `category` (optional)
   - `visibility` (optional: `public` or `private`)
5. Success `200`:
   ```json
   {
     "fileUri": "gs://bucket/path/file.ext",
     "contentType": "application/pdf",
     "size": 12345,
     "bucket": "krow-workforce-dev-private",
     "path": "documents/staff/..."
   }
   ```
6. Errors:
   - `UNAUTHENTICATED`
   - `INVALID_FILE_TYPE`
   - `FILE_TOO_LARGE`
   - `UPLOAD_FAILED`

## 5.2 Create signed URL
1. Method and route: `POST /core/create-signed-url`
2. Auth: required
3. Idempotency key: optional
4. Request:
   ```json
   {
     "fileUri": "gs://bucket/path/file.ext",
     "expiresInSeconds": 300
   }
   ```
5. Success `200`:
   ```json
   {
     "signedUrl": "https://...",
     "expiresAt": "2026-02-24T15:00:00Z"
   }
   ```
6. Errors:
   - `UNAUTHENTICATED`
   - `FORBIDDEN_FILE_ACCESS`
   - `INVALID_EXPIRES_IN`
   - `SIGN_URL_FAILED`

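The request and response shapes above imply two small steps before signing: splitting the `gs://` URI into bucket and object path, and turning `expiresInSeconds` into the `expiresAt` timestamp. The sketch below assumes both; `parseGsUri` and `computeExpiresAt` are hypothetical helpers, and `maxSeconds` is an assumed parameter because the catalog names `INVALID_EXPIRES_IN` without stating the allowed range.

```typescript
interface GcsLocation {
  bucket: string;
  path: string;
}

// Splits "gs://bucket/path/file.ext" into bucket and object path.
// Returns null when the URI is malformed.
function parseGsUri(fileUri: string): GcsLocation | null {
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(fileUri);
  if (match === null) return null;
  const bucket = match[1];
  const path = match[2];
  if (bucket === undefined || path === undefined) return null;
  return { bucket, path };
}

type ExpiryResult =
  | { ok: true; expiresAt: string }
  | { ok: false; code: "INVALID_EXPIRES_IN" };

// Computes expiresAt in the second-precision UTC form used in the example
// response. maxSeconds is an assumed upper bound supplied by the caller.
function computeExpiresAt(now: Date, expiresInSeconds: number, maxSeconds: number): ExpiryResult {
  const valid =
    Number.isInteger(expiresInSeconds) && expiresInSeconds > 0 && expiresInSeconds <= maxSeconds;
  if (!valid) return { ok: false, code: "INVALID_EXPIRES_IN" };
  const expires = new Date(now.getTime() + expiresInSeconds * 1000);
  // toISOString() includes milliseconds; drop them to match the example format.
  return { ok: true, expiresAt: expires.toISOString().replace(/\.\d{3}Z$/, "Z") };
}
```

The parsed bucket and path are also what an ownership or prefix check for `FORBIDDEN_FILE_ACCESS` would inspect before any URL is signed.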
## 5.3 Invoke model
1. Method and route: `POST /core/invoke-llm`
2. Auth: required
3. Idempotency key: optional
4. Request:
   ```json
   {
     "prompt": "...",
     "responseJsonSchema": {},
     "fileUrls": []
   }
   ```
5. Success `200`:
   ```json
   {
     "result": {},
     "model": "provider/model-name",
     "latencyMs": 980
   }
   ```
6. Errors:
   - `UNAUTHENTICATED`
   - `INVALID_SCHEMA`
   - `MODEL_TIMEOUT`
   - `MODEL_FAILED`
7. Provider default:
   - Vertex AI Gemini

## 5.4 Health check
1. Method and route: `GET /healthz`
2. Auth: optional (internal policy)
3. Success `200`:
   ```json
   {
     "ok": true,
     "service": "krow-backend",
     "version": "commit-or-tag"
   }
   ```

## 5.5 Storage bucket policy defaults (dev)
1. Public bucket: `krow-workforce-dev-public`
2. Private bucket: `krow-workforce-dev-private`
3. Private objects are never returned directly; only signed URLs are returned.

## 6) Command routes (wave 1)

## 6.1 Create order
1. Method and route: `POST /commands/orders/create`
2. Auth: required
3. Idempotency key: required
4. Purpose: create the order, its shifts, and its roles atomically
5. Replaces direct writes in:
   - `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx`
   - `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart`
   - `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart`

## 6.2 Update order
1. Method and route: `POST /commands/orders/{orderId}/update`
2. Auth: required
3. Idempotency key: required
4. Purpose: policy-safe multi-entity order update
5. Replaces direct writes in:
   - `apps/web/src/features/operations/orders/EditOrder.tsx`
   - `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart`

## 6.3 Cancel order
1. Method and route: `POST /commands/orders/{orderId}/cancel`
2. Auth: required
3. Idempotency key: required
4. Purpose: enforce cancellation policy and return an explicit conflict code
5. Replaces direct writes in:
   - `apps/web/src/features/operations/orders/OrderDetail.tsx`

## 6.4 Change shift status
1. Method and route: `POST /commands/shifts/{shiftId}/change-status`
2. Auth: required
3. Idempotency key: required
4. Purpose: enforce state transitions server-side
5. Replaces direct writes in:
   - `apps/web/src/features/operations/tasks/TaskBoard.tsx`

## 6.5 Assign staff
1. Method and route: `POST /commands/shifts/{shiftId}/assign-staff`
2. Auth: required
3. Idempotency key: required
4. Purpose: assign staff, update counts, and run conflict checks atomically
5. Replaces direct writes in:
   - `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx`

## 6.6 Accept shift
1. Method and route: `POST /commands/shifts/{shiftId}/accept`
2. Auth: required
3. Idempotency key: required
4. Purpose: handle the application, counters, and rollback-safe behavior in one command
5. Replaces direct writes in:
   - `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart`

## 7) Locked defaults before coding starts
1. Idempotency keys are stored in Cloud SQL with 24-hour retention.
2. Request validation library is `zod`.
3. Validation schema location is `backend/<service>/src/contracts/`.
4. Storage buckets are:
   - `krow-workforce-dev-public`
   - `krow-workforce-dev-private`
5. Model provider is Vertex AI Gemini with a 20-second timeout for `/core/invoke-llm`.

## 8) Target response-time objectives (p95)
1. `/healthz` under 200ms
2. `/core/create-signed-url` under 500ms
3. `/commands/*` under 1500ms
4. `/core/invoke-llm` under 15000ms

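To check the p95 objectives above against measured latencies, a smoke test or log analysis needs a percentile calculation. The sketch below uses the nearest-rank method, which is an assumption because the catalog does not say how p95 is computed; `p95` and `meetsTarget` are hypothetical helper names.

```typescript
// Targets in milliseconds, copied from the list above.
const P95_TARGETS_MS = new Map<string, number>([
  ["/healthz", 200],
  ["/core/create-signed-url", 500],
  ["/commands/*", 1500],
  ["/core/invoke-llm", 15000],
]);

// Nearest-rank p95: the smallest sample such that at least 95% of all
// samples are less than or equal to it.
function p95(latenciesMs: number[]): number {
  if (latenciesMs.length === 0) {
    throw new Error("p95 requires at least one sample");
  }
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  // Integer arithmetic avoids floating-point error in the rank.
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// "under N ms" is read as strictly less than the target.
function meetsTarget(route: string, latenciesMs: number[]): boolean {
  const target = P95_TARGETS_MS.get(route);
  if (target === undefined) {
    throw new Error(`no p95 target defined for ${route}`);
  }
  return p95(latenciesMs) < target;
}
```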
@@ -0,0 +1,269 @@
# M4 Backend Foundation Implementation Plan (Dev First)

Date: 2026-02-24
Owner: Wilfred (Technical Lead)
Primary environment: `krow-workforce-dev`

## 1) Objective
Build a secure, modular, and scalable backend foundation in `dev` without breaking the current frontend while we migrate high-risk writes from direct Data Connect mutations to backend command endpoints.

## 2) First-principles architecture rules
1. Client apps are untrusted for business-critical writes.
2. Backend is the enforcement layer for validation, permissions, and write orchestration.
3. Multi-entity writes must be atomic, idempotent, and observable.
4. Configuration and deployment must be reproducible by automation.
5. Migration must be backward-compatible until each frontend flow is cut over.

## 3) Pre-coding gates (must be true before implementation starts)

## Gate A: Security boundary
1. Frontend sends Firebase token only. No database credentials in client code.
2. Every new backend endpoint validates Firebase token.
3. Data Connect write access strategy is defined:
   - keep simple reads available to client
   - route high-risk writes through backend command endpoints
4. Upload and signed URL paths are server-controlled.

## Gate B: Contract standards
1. Standard error envelope is frozen:
   ```json
   {
     "code": "STRING_CODE",
     "message": "Human readable message",
     "details": {},
     "requestId": "optional-request-id"
   }
   ```
2. Request validation layer is chosen and centralized.
3. Route naming strategy is frozen:
   - canonical routes under `/core` and `/commands`
   - compatibility aliases preserved during migration (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)
4. Validation standard is locked:
   - library: `zod`
   - schema location: `backend/<service>/src/contracts/` with `core/` and `commands/` subfolders

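The centralized validation layer in Gate B has to reject unknown fields as well as malformed ones. The plan locks `zod` for this; the dependency-free sketch below only illustrates the behavior a contract schema would enforce (zod's strict object mode would be the natural equivalent). It validates the `/core/create-signed-url` request body from the API catalog. `validateCreateSignedUrl` and the `VALIDATION_ERROR` code are hypothetical, and treating both fields as required is an assumption.

```typescript
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; code: string; message: string; details: Record<string, unknown> };

interface CreateSignedUrlRequest {
  fileUri: string;
  expiresInSeconds: number;
}

// Hand-rolled strict validator; a real contract would be a zod schema.
function validateCreateSignedUrl(body: unknown): ValidationResult<CreateSignedUrlRequest> {
  // "VALIDATION_ERROR" is a placeholder code, not one from the catalog.
  const fail = (message: string, details: Record<string, unknown> = {}) =>
    ({ ok: false as const, code: "VALIDATION_ERROR", message, details });

  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return fail("Body must be a JSON object");
  }
  const record = body as Record<string, unknown>;

  // Reject unknown fields instead of silently dropping them.
  const allowed = ["fileUri", "expiresInSeconds"];
  const unknownFields = Object.keys(record).filter((key) => !allowed.includes(key));
  if (unknownFields.length > 0) {
    return fail("Unknown fields in request", { unknownFields });
  }

  const { fileUri, expiresInSeconds } = record;
  if (typeof fileUri !== "string" || !fileUri.startsWith("gs://")) {
    return fail("fileUri must be a gs:// URI");
  }
  if (typeof expiresInSeconds !== "number" || !Number.isInteger(expiresInSeconds) || expiresInSeconds <= 0) {
    return fail("expiresInSeconds must be a positive integer");
  }
  return { ok: true, value: { fileUri, expiresInSeconds } };
}
```

The failure branch already carries `code`, `message`, and `details`, so it maps directly onto the frozen error envelope.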
## Gate C: Atomicity and reliability
1. Command endpoints support idempotency keys for retry-safe writes.
2. Multi-step write flows are wrapped in single backend transaction boundaries.
3. Domain conflict codes are defined for expected business failures.
4. Idempotency storage is locked:
   - store in Cloud SQL table
   - key scope: `userId + route + idempotencyKey`
   - retain records for 24 hours
   - repeated key returns original response

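The idempotency rules in Gate C can be illustrated with an in-memory stand-in for the Cloud SQL table. This is a sketch under stated assumptions: `InMemoryIdempotencyStore`, `scopeKey`, and `StoredResponse` are hypothetical names, the handler is synchronous for brevity, and concurrent requests with the same key are not handled (a real table would likely rely on a unique constraint over the scope columns).

```typescript
interface StoredResponse {
  status: number;
  body: unknown;
}

interface IdempotencyRecord {
  response: StoredResponse;
  storedAtMs: number;
}

// 24-hour retention, as locked in Gate C.
const RETENTION_MS = 24 * 60 * 60 * 1000;

// Key scope: userId + route + idempotencyKey. JSON encoding keeps the
// composite unambiguous even if a part contains a separator character.
function scopeKey(userId: string, route: string, idempotencyKey: string): string {
  return JSON.stringify([userId, route, idempotencyKey]);
}

class InMemoryIdempotencyStore {
  private readonly records = new Map<string, IdempotencyRecord>();

  // Runs `execute` once per scope key within the retention window and
  // replays the original response for repeated keys.
  run(
    userId: string,
    route: string,
    idempotencyKey: string,
    nowMs: number,
    execute: () => StoredResponse,
  ): StoredResponse {
    const key = scopeKey(userId, route, idempotencyKey);
    const existing = this.records.get(key);
    if (existing !== undefined && nowMs - existing.storedAtMs < RETENTION_MS) {
      return existing.response;
    }
    const response = execute();
    this.records.set(key, { response, storedAtMs: nowMs });
    return response;
  }
}
```

Because the user and route are part of the key, the same client-generated key reused on another route or by another user triggers a fresh execution rather than replaying an unrelated response.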
## Gate D: Automation and operability
1. Makefile is source of truth for backend setup and deploy in dev.
2. Core deploy and smoke test commands exist before feature migration.
3. Logging format and request tracing fields are standardized.

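A standardized log format for Gate D could be one JSON object per request carrying the tracing fields listed in the API catalog (route, requestId, actorId, latencyMs, outcome). The sketch below assumes JSON lines as the format; `buildLogEntry` and `toLogLine` are hypothetical helpers, and the outcome values shown are illustrative because the plan does not enumerate them.

```typescript
// Fields from the API catalog's logging rule.
interface RequestLogEntry {
  route: string;
  requestId: string;
  actorId: string | null; // null when the request was not authenticated
  latencyMs: number;
  outcome: string; // e.g. "success" or an error code (illustrative values)
}

// Hypothetical helper: derives latency from start and finish timestamps.
function buildLogEntry(
  route: string,
  requestId: string,
  actorId: string | null,
  startedAtMs: number,
  finishedAtMs: number,
  outcome: string,
): RequestLogEntry {
  return { route, requestId, actorId, latencyMs: finishedAtMs - startedAtMs, outcome };
}

// One JSON object per line (assumed format).
function toLogLine(entry: RequestLogEntry): string {
  return JSON.stringify(entry);
}
```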
## 4) Security baseline for foundation phase

## 4.1 Authentication and authorization
1. Foundation phase is authentication-first.
2. Role-based access control is intentionally deferred.
3. All handlers include a policy hook for future role checks (`can(action, resource, actor)`).

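The policy hook in 4.1 can be sketched so that the foundation phase allows any authenticated actor while leaving a seam for the later role map. Only the `can(action, resource, actor)` signature comes from the plan; the `Actor` and `Resource` shapes, the `guard` wrapper, and the `FORBIDDEN` code are assumptions for illustration.

```typescript
// Hypothetical shapes; the plan only fixes the hook's parameter order.
interface Actor {
  uid: string;
}

interface Resource {
  type: string;
  id?: string;
}

type PolicyHook = (action: string, resource: Resource, actor: Actor | null) => boolean;

// Foundation phase: any authenticated actor is allowed; role checks are deferred.
const can: PolicyHook = (_action, _resource, actor) => actor !== null;

// Hypothetical wrapper a handler could call before doing any work.
// Returns an error code, or null when the request may proceed.
function guard(
  action: string,
  resource: Resource,
  actor: Actor | null,
  policy: PolicyHook = can,
): string | null {
  if (actor === null) return "UNAUTHENTICATED";
  return policy(action, resource, actor) ? null : "FORBIDDEN"; // assumed code
}
```

Because handlers call `guard` rather than checking the token themselves, swapping in a role-aware policy later should not require touching each route.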
## 4.2 Data access control model
1. Client retains Data Connect reads required for existing screens.
2. High-risk writes move behind `/commands/*` endpoints.
3. Backend mediates write interactions with Data Connect and Cloud SQL.

## 4.3 File and URL security
1. Validate file type and size server-side.
2. Separate public and private storage behavior.
3. Signed URL creation checks ownership/prefix scope and expiry limits.
4. Bucket policy split is locked:
   - `krow-workforce-dev-public`
   - `krow-workforce-dev-private`
   - private bucket access only through signed URL

## 4.4 Model invocation safety
1. Enforce schema-constrained output.
2. Apply per-user rate limits and request timeout.
3. Log model failures with safe redaction (no sensitive prompt leakage in logs).
4. Model provider and timeout defaults are locked:
   - provider: Vertex AI Gemini
   - max route timeout: 20 seconds
   - timeout error code: `MODEL_TIMEOUT`

## 4.5 Secrets and credentials
1. Runtime secrets come from Secret Manager only.
2. Service accounts use least-privilege roles.
3. No secrets committed in repository files.

## 5) Modularity baseline

## 5.1 Backend module boundaries
1. `core` module: upload, signed URL, model invocation, health.
2. `commands` module: business writes and state transitions.
3. `policy` module: validation and future role checks.
4. `data` module: Data Connect adapters and transaction wrappers.
5. `infra` module: logging, tracing, auth middleware, error mapping.

## 5.2 Contract separation
1. Keep API request/response schemas in one location.
2. Keep domain errors in one registry file.
3. Keep route declarations thin; business logic in services.

## 5.3 Cloud runtime roles
1. Cloud Run is the primary command and core API execution layer.
2. Cloud Functions v2 is worker-only in this phase:
   - upload-related async handlers
   - notification jobs
   - model-related async helpers when needed

## 6) Automation baseline

## 6.1 Makefile requirements
Add `makefiles/backend.mk` and wire it into the root `Makefile` with at least:
1. `make backend-enable-apis`
2. `make backend-bootstrap-dev`
3. `make backend-deploy-core`
4. `make backend-deploy-commands`
5. `make backend-deploy-workers`
6. `make backend-smoke-core`
7. `make backend-smoke-commands`
8. `make backend-logs-core`

## 6.2 CI requirements
1. Backend lint
2. Backend tests
3. Build/package
4. Smoke test against deployed dev route(s)
5. Block merge on failed checks

## 6.3 Session hygiene
1. Update `TASKS.md` and `CHANGELOG.md` each working session.
2. If a new service/API is added, a Makefile target must be added in the same change.

## 7) Migration safety contract (no frontend breakage)
1. Backend routes ship first.
2. Frontend migration proceeds in per-feature waves, not as a big bang.
3. Keep compatibility aliases until clients migrate.
4. Keep existing Data Connect reads during foundation.
5. For each migrated write flow:
   - before/after behavior checklist
   - rollback path
   - smoke verification

## 8) Scope for foundation build
1. Backend runtime/deploy foundation in dev.
2. Core endpoints:
   - `POST /core/upload-file`
   - `POST /core/create-signed-url`
   - `POST /core/invoke-llm`
   - `GET /healthz`
3. Compatibility aliases:
   - `POST /uploadFile`
   - `POST /createSignedUrl`
   - `POST /invokeLLM`
4. Command layer scaffold for first migration routes.
5. Initial migration of highest-risk write paths.

## 9) Implementation phases

## Phase 0: Baseline and contracts
Deliverables:
1. Freeze endpoint naming and compatibility aliases.
2. Freeze error envelope and error code registry.
3. Freeze auth middleware interface and policy hook interface.
4. Publish route inventory from web/mobile direct writes.

Exit criteria:
1. No unresolved contract ambiguity.
2. Team agrees on the auth-first-now, role-map-later approach.

## Phase 1: Backend infra and automation
Deliverables:
1. `makefiles/backend.mk` with bootstrap, deploy, smoke, logs targets.
2. Environment templates for backend runtime config.
3. Secret Manager and service account setup automation.

Exit criteria:
1. A fresh machine can deploy core backend to dev via Make commands.

## Phase 2: Core endpoint implementation
Deliverables:
1. `/core/upload-file`
2. `/core/create-signed-url`
3. `/core/invoke-llm`
4. `/healthz`
5. Compatibility aliases (`/uploadFile`, `/createSignedUrl`, `/invokeLLM`)

Exit criteria:
1. API harness passes for core routes.
2. Error, logging, and auth standards are enforced.

## Phase 3: Command layer scaffold
Deliverables:
1. `/commands/orders/create`
2. `/commands/orders/{orderId}/cancel`
3. `/commands/orders/{orderId}/update`
4. `/commands/shifts/{shiftId}/change-status`
5. `/commands/shifts/{shiftId}/assign-staff`
6. `/commands/shifts/{shiftId}/accept`

Exit criteria:
1. High-risk writes have backend command alternatives ready.

## Phase 4: Wave 1 frontend migration
Deliverables:
1. Replace direct writes in selected web/mobile flows.
2. Keep reads stable.
3. Verify no regressions in non-migrated screens.

Exit criteria:
1. Migrated flows run through backend commands only.
2. Rollback instructions validated.

## Phase 5: Hardening and handoff
Deliverables:
1. Runbook for deploy, rollback, and smoke.
2. Backend CI pipeline active.
3. Wave 2 and wave 3 migration task list defined.

Exit criteria:
1. Foundation is reusable for staging/prod with environment changes only.

## 10) Wave 1 migration inventory (real call sites)

Web:
1. `apps/web/src/features/operations/tasks/TaskBoard.tsx:100`
2. `apps/web/src/features/operations/orders/OrderDetail.tsx:145`
3. `apps/web/src/features/operations/orders/EditOrder.tsx:84`
4. `apps/web/src/features/operations/orders/components/CreateOrderDialog.tsx:31`
5. `apps/web/src/features/operations/orders/components/AssignStaffModal.tsx:60`
6. `apps/web/src/features/workforce/documents/DocumentVault.tsx:99`

Mobile:
1. `apps/mobile/packages/features/client/home/lib/src/presentation/widgets/shift_order_form_sheet.dart:232`
2. `apps/mobile/packages/features/client/view_orders/lib/src/presentation/widgets/view_order_card.dart:1195`
3. `apps/mobile/packages/features/client/create_order/lib/src/data/repositories_impl/client_create_order_repository_impl.dart:68`
4. `apps/mobile/packages/features/staff/shifts/lib/src/data/repositories_impl/shifts_repository_impl.dart:446`
5. `apps/mobile/packages/features/client/authentication/lib/src/data/repositories_impl/auth_repository_impl.dart:257`
6. `apps/mobile/packages/features/staff/profile_sections/onboarding/profile_info/lib/src/data/repositories/personal_info_repository_impl.dart:51`

## 11) Definition of done for foundation
1. Core endpoints deployed in dev and validated.
2. Command scaffolding in place for wave 1 writes.
3. Auth-first protection active on all new routes.
4. Idempotency + transaction model defined for command writes.
5. Makefile and CI automation cover bootstrap/deploy/smoke paths.
6. Frontend remains stable during migration.
7. Role-map integration points are documented for next phase.

## 12) Locked defaults (approved)
1. Idempotency key storage strategy:
   - Cloud SQL table, 24-hour retention, keyed by `userId + route + idempotencyKey`.
2. Validation library and schema location:
   - `zod` in `backend/<service>/src/contracts/` (`core/`, `commands/`).
3. Storage bucket naming and split:
   - `krow-workforce-dev-public` and `krow-workforce-dev-private`.
4. Model provider and timeout:
   - Vertex AI Gemini, 20-second max timeout.
5. Target response-time objectives (p95):
   - `/healthz` under 200ms
   - `/core/create-signed-url` under 500ms
   - `/commands/*` under 1500ms
   - `/core/invoke-llm` under 15000ms