docs(m4): add verification architecture contract

2026-02-24 11:42:39 -05:00
parent 07dd6609d9
commit f2912a1c32
4 changed files with 216 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,3 +15,4 @@
 | 2026-02-24 | 0.1.10 | Hardened core APIs with signed URL ownership/expiry checks, object existence checks, and per-user LLM rate limiting. |
 | 2026-02-24 | 0.1.11 | Added frontend-ready core API guide and linked M4 API catalog to it as source of truth for consumption. |
 | 2026-02-24 | 0.1.12 | Reduced M4 API docs to core-only scope and removed command-route references until command implementation is complete. |
+| 2026-02-24 | 0.1.13 | Added verification architecture contract with endpoint design and workflow split for attire, government ID, and certification. |
--- a/docs/MILESTONES/M4/planning/m4-api-catalog.md
+++ b/docs/MILESTONES/M4/planning/m4-api-catalog.md
@@ -8,6 +8,10 @@ Environment: dev
 ## Frontend source of truth
 Use this file and `docs/MILESTONES/M4/planning/m4-core-api-frontend-guide.md` for core endpoint consumption.

+## Related next-slice contract
+Verification pipeline design (attire, government ID, certification):
+- `docs/MILESTONES/M4/planning/m4-verification-architecture-contract.md`
+
 ## 1) Scope and purpose
 This catalog defines the currently implemented core backend contract for M4.

--- a/docs/MILESTONES/M4/planning/m4-core-api-frontend-guide.md
+++ b/docs/MILESTONES/M4/planning/m4-core-api-frontend-guide.md
@@ -163,3 +163,5 @@ const data = await res.json();
 2. Aliases exist only for migration compatibility.
 3. `requestId` in responses should be logged client-side for debugging.
 4. For 429 on model route, retry with exponential backoff and respect `Retry-After`.
+5. Verification workflows (`attire`, `government_id`, `certification`) are defined in:
+   `docs/MILESTONES/M4/planning/m4-verification-architecture-contract.md`.
--- a/docs/MILESTONES/M4/planning/m4-verification-architecture-contract.md
+++ b/docs/MILESTONES/M4/planning/m4-verification-architecture-contract.md
@@ -0,0 +1,209 @@
+# M4 Verification Architecture Contract (Attire, Government ID, Certification)
+
+Status: Proposed (next implementation slice)  
+Date: 2026-02-24  
+Owner: Technical Lead
+
+## 1) Goal
+Define a single backend verification pipeline for:
+1. `attire`
+2. `government_id`
+3. `certification`
+
+This contract gives the team exact endpoint behavior, state flow, and ownership before coding.
+
+## 2) Principles
+1. Upload is evidence intake, not final verification.
+2. Verification runs asynchronously in backend workers.
+3. Model output is a signal, not legal truth.
+4. High-risk identity decisions require stronger validation and a human audit trail.
+5. Every decision is traceable (`who`, `what`, `when`, `why`).
+
+## 3) Verification types and policy
+
+### 3.1 Attire
+1. Primary check: vision model + rule checks.
+2. Typical output: `AUTO_PASS`, `AUTO_FAIL`, or `NEEDS_REVIEW`.
+3. Manual override is always allowed.
+
+### 3.2 Government ID
+1. Required path for mission-critical use: third-party identity verification provider.
+2. Model/OCR can pre-parse fields but does not replace identity verification.
+3. Final status should require either provider success or manual approval by an authorized reviewer.
+
+### 3.3 Certification
+1. Preferred path: verify against issuer API/registry when available.
+2. If no issuer API: OCR extraction + manual review.
+3. Keep evidence of the source used for validation.
+
+## 4) State model
+1. `PENDING`
+2. `PROCESSING`
+3. `AUTO_PASS`
+4. `AUTO_FAIL`
+5. `NEEDS_REVIEW`
+6. `APPROVED`
+7. `REJECTED`
+8. `ERROR`
+
+Rules:
+1. `AUTO_*`, `NEEDS_REVIEW`, and `ERROR` are machine outcomes.
+2. `APPROVED` and `REJECTED` are human outcomes.
+3. All transitions are append-only in audit events.
+
+## 5) API contract
+
+### 5.1 Create verification job
+1. Route: `POST /core/verifications`
+2. Auth: required
+3. Purpose: enqueue a verification job for a previously uploaded file.
+4. Request:
+```json
+{
+  "type": "attire",
+  "subjectType": "staff",
+  "subjectId": "staff_123",
+  "fileUri": "gs://krow-workforce-dev-private/uploads/<uid>/item.jpg",
+  "rules": {
+    "attireType": "shoes",
+    "expectedColor": "black"
+  },
+  "metadata": {
+    "shiftId": "shift_123"
+  }
+}
+```
+5. Success `202`:
+```json
+{
+  "verificationId": "ver_123",
+  "status": "PENDING",
+  "type": "attire",
+  "requestId": "uuid"
+}
+```
+
+### 5.2 Get verification status
+1. Route: `GET /core/verifications/{verificationId}`
+2. Auth: required
+3. Purpose: status polling from the frontend.
+4. Success `200`:
+```json
+{
+  "verificationId": "ver_123",
+  "type": "attire",
+  "status": "NEEDS_REVIEW",
+  "confidence": 0.62,
+  "reasons": ["Color uncertain"],
+  "extracted": {
+    "detectedType": "shoe",
+    "detectedColor": "dark"
+  },
+  "provider": {
+    "name": "vertex",
+    "reference": "job_abc"
+  },
+  "review": null,
+  "createdAt": "2026-02-24T15:00:00Z",
+  "updatedAt": "2026-02-24T15:00:04Z",
+  "requestId": "uuid"
+}
+```
+
+### 5.3 Review override
+1. Route: `POST /core/verifications/{verificationId}/review`
+2. Auth: required (a dedicated reviewer role comes later; for now, authentication only, with the reviewer ID logged explicitly)
+3. Purpose: record the final human decision and its audit reason.
+4. Request:
+```json
+{
+  "decision": "APPROVED",
+  "note": "Document matches required certification",
+  "reasonCode": "MANUAL_REVIEW"
+}
+```
+5. Success `200`:
+```json
+{
+  "verificationId": "ver_123",
+  "status": "APPROVED",
+  "review": {
+    "decision": "APPROVED",
+    "reviewedBy": "user_456",
+    "reviewedAt": "2026-02-24T15:02:00Z",
+    "note": "Document matches required certification",
+    "reasonCode": "MANUAL_REVIEW"
+  },
+  "requestId": "uuid"
+}
+```
+
+### 5.4 Retry verification job
+1. Route: `POST /core/verifications/{verificationId}/retry`
+2. Auth: required
+3. Purpose: rerun failed or updated checks.
+4. Success `202`: status resets to `PENDING`.
+
+## 6) Worker execution flow
+1. API validates payload and ownership of `fileUri`.
+2. API writes `verification_jobs` row with `PENDING`.
+3. Worker consumes job, marks `PROCESSING`.
+4. Worker selects processor by type:
+   - `attire` -> model + rule scorer
+   - `government_id` -> provider adapter (+ optional OCR pre-check)
+   - `certification` -> issuer API adapter or OCR adapter
+5. Worker writes machine outcome (`AUTO_PASS`, `AUTO_FAIL`, `NEEDS_REVIEW`, or `ERROR`).
+6. Frontend polls status route.
+7. Reviewer finalizes with `APPROVED` or `REJECTED` where needed.
+
+## 7) Data model (minimal)
+
+### 7.1 Table: `verification_jobs`
+1. `id` (pk)
+2. `type` (`attire|government_id|certification`)
+3. `subject_type`, `subject_id`
+4. `owner_uid`
+5. `file_uri`
+6. `status`
+7. `confidence` (nullable)
+8. `reasons_json`
+9. `extracted_json`
+10. `provider_name`, `provider_ref`
+11. `created_at`, `updated_at`
+
+### 7.2 Table: `verification_reviews`
+1. `id` (pk)
+2. `verification_id` (fk)
+3. `decision` (`APPROVED|REJECTED`)
+4. `reviewed_by`
+5. `note`
+6. `reason_code`
+7. `reviewed_at`
+
+### 7.3 Table: `verification_events`
+1. `id` (pk)
+2. `verification_id` (fk)
+3. `from_status`, `to_status`
+4. `actor_type` (`system|reviewer`)
+5. `actor_id`
+6. `details_json`
+7. `created_at`
+
+## 8) Security and compliance notes
+1. Restrict verification file paths to owner-owned upload prefixes.
+2. Never expose raw private bucket URLs directly.
+3. Keep third-party provider secrets in Secret Manager.
+4. Log request and decision IDs for every transition.
+5. For government ID, keep provider response reference and verification timestamp.
+
+## 9) Frontend integration pattern
+1. Upload file via existing `POST /core/upload-file`.
+2. Create verification job with returned `fileUri`.
+3. Poll `GET /core/verifications/{id}` until terminal state.
+4. Show machine status and confidence.
+5. For `NEEDS_REVIEW`, show pending-review UI state.
+
+## 10) Delivery split (recommended)
+1. Wave A (fast): attire verification pipeline end-to-end.
+2. Wave B: certification verification with issuer adapter + review.
+3. Wave C: government ID provider integration + reviewer flow hardening.
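
Note on section 9 of the contract (frontend integration pattern): the polling step could be sketched roughly as below. This is a minimal sketch, not part of the commit. The names `pollVerification`, `shouldKeepPolling`, and `FetchSnapshot`, and the default interval and attempt limit, are hypothetical. The contract does not define "terminal state", so the sketch assumes that only `PENDING` and `PROCESSING` mean "keep polling" and that `NEEDS_REVIEW` stops the loop so the UI can show the pending-review state.

```typescript
// Status values taken from section 4 of the contract.
type VerificationStatus =
  | "PENDING" | "PROCESSING" | "AUTO_PASS" | "AUTO_FAIL"
  | "NEEDS_REVIEW" | "APPROVED" | "REJECTED" | "ERROR";

// Subset of the GET /core/verifications/{verificationId} response body.
interface VerificationSnapshot {
  verificationId: string;
  status: VerificationStatus;
}

// Assumption (not stated in the contract): only these two statuses are in flight.
const IN_FLIGHT: ReadonlySet<VerificationStatus> =
  new Set<VerificationStatus>(["PENDING", "PROCESSING"]);

function shouldKeepPolling(status: VerificationStatus): boolean {
  return IN_FLIGHT.has(status);
}

// Hypothetical transport hook: the caller supplies the actual HTTP call,
// so auth headers and base URL stay out of this sketch.
type FetchSnapshot = (verificationId: string) => Promise<VerificationSnapshot>;

async function pollVerification(
  verificationId: string,
  fetchSnapshot: FetchSnapshot,
  intervalMs = 2000, // hypothetical default
  maxAttempts = 30,  // hypothetical default
): Promise<VerificationSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await fetchSnapshot(verificationId);
    if (!shouldKeepPolling(snapshot.status)) {
      return snapshot;
    }
    // Wait between polls, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(
    `Verification ${verificationId} still in flight after ${maxAttempts} polls`,
  );
}
```

A caller would pass a `fetchSnapshot` that issues the authenticated `GET /core/verifications/{verificationId}` request and parses the JSON body. Keeping the transport injectable makes the loop easy to exercise without a backend, and the attempt cap avoids polling forever if a job never leaves `PROCESSING`.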