# M4 Target Schema Blueprint (Command-Ready)
Status: Draft for team alignment
Date: 2026-02-25
Owner: Technical Lead
## 1) Goal
Define the target database shape we want **before** command-backend implementation, so critical flows are atomic, secure, and scalable.
## 1.1 Stakeholder and tenancy model
This product should be designed as a **multi-tenant platform**.
1. Tenant:
- One staffing company account (example: Legendary Event Staffing and Entertainment).
2. Business:
- A customer/client account owned by a tenant.
3. User:
- A human identity (auth account) that can belong to one or more tenants.
4. Staff:
- A workforce profile linked to a user identity and tenant-scoped operations.
Practical meaning:
1. The same platform can serve multiple staffing companies safely.
2. Data isolation is by `tenant_id`, not only by business/vendor IDs.
3. Not every record starts as a fully active user:
- invite-first or pending-onboarding records are valid,
- they are bound to `user_id` once activation completes.
4. Business-side users and vendor-side users are partitioned with dedicated membership tables, rather than a single `userId` owner field.
```mermaid
flowchart LR
U["User identity"] --> M["Tenant membership"]
M --> T["Tenant staffing company"]
T --> B["Business client"]
T --> V["Vendor partner"]
B --> O["Orders and shifts"]
V --> O
```
## 1.2 Stakeholder wheel mapping (current baseline)
The stakeholder labels from the customer workshop map to schema as follows:
1. Buyer (Procurements):
- Buyer users inside a business/client account.
- Schema anchor: `users` + `tenant_memberships` + `business_memberships`.
2. Enterprises (Operator):
- Tenant operator/admin users running staffing operations.
- Schema anchor: `tenants`, `tenant_memberships`, `role_bindings`.
3. Sectors (Execution):
- Operational segments or business units executing events.
- Schema anchor: `teams`, `team_hubs`, `team_hud_departments`, `roles`.
4. Approved Vendor:
- Supplier companies approved to fulfill staffing demand.
- Schema anchor: `vendors`, `vendor_memberships`, `workforce`, `vendor_rates`, `vendor_benefit_plans`.
5. Workforce:
- Individual workers/staff and their assignments.
- Schema anchor: `staffs`, `staff_roles`, `applications`, `assignments`, `certificates`, `staff_documents`.
6. Partner:
- External integration or service partner (future).
- Schema anchor: `stakeholder_profiles` extension path + scoped role bindings.
Rule:
1. Start with baseline stakeholders above.
2. Add new stakeholders via extension tables and role bindings, not by changing core scheduling and finance tables.
## 1.3 Future stakeholder expansion model
To add stakeholders later without breaking core schema:
1. Add `stakeholder_types` (registry).
2. Add `stakeholder_profiles` (`tenant_id`, `type`, `status`, `metadata`).
3. Add `stakeholder_links` (relationship graph across stakeholders).
4. Bind permissions through `role_bindings` with scope (`tenant`, `team`, `hub`, `business`, or specific resource).
## 1.4 Roadmap CSV evidence snapshot
Evidence source:
1. `docs/MILESTONES/M4/planning/m4-roadmap-csv-schema-reconciliation.md`
What the exports confirmed:
1. The product is multi-party and multi-tenant by design (client, operator, vendor, workforce, procurement, partner, dashboard).
2. Attendance and offense enforcement are core business workflows, not side features.
3. Finance requires more than invoices (payment runs, remittance, status history, dispute/audit trace).
4. Compliance requires asynchronous verification and requirement templates by tenant/business/role.
## 2) First-principles rules
1. Every critical write must be server-mediated and transactional.
2. Tenant boundaries must be explicit in data and queries.
3. Money and rates must use exact numeric types, not floating point.
4. Data needed for constraints should be relational, not hidden in JSON blobs.
5. Every high-risk state transition must be auditable and replayable.
## 3) Current anti-patterns we are removing
1. Direct client mutation of core entities.
2. Broad `USER`-auth CRUD without strict tenant scoping.
3. Financial values as `Float`.
4. Core workflow state embedded in generic `Any/jsonb` fields.
5. Missing uniqueness/index constraints on high-traffic paths.
## 4) Target modular schema
## 4.1 Identity and Access
Tables:
1. `users` (source identity, profile, auth linkage)
2. `tenant_memberships` (new; membership + base access per tenant)
3. `business_memberships` (new; user access to business account scope)
4. `vendor_memberships` (new; user access to vendor account scope)
5. `team_members` (membership + scope per team)
6. `roles` (new)
7. `permissions` (new)
8. `role_bindings` (new; who has which role in which scope)
Rules:
1. Unique tenant membership: `(tenant_id, user_id)`.
2. Unique business membership: `(business_id, user_id)`.
3. Unique vendor membership: `(vendor_id, user_id)`.
4. Unique team membership: `(team_id, user_id)`.
5. Access checks resolve through tenant membership first, then business/vendor/team scope.
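
A minimal sketch of rule 5, assuming in-memory sets in place of the real membership tables. `MembershipStore`, `can_access_business`, the `business_tenant` lookup, and all IDs are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MembershipStore:
    """In-memory stand-in for tenant_memberships and business_memberships."""
    # Tuples mirror the unique keys (tenant_id, user_id) and (business_id, user_id).
    tenant_memberships: set = field(default_factory=set)
    business_memberships: set = field(default_factory=set)
    # Hypothetical lookup: which tenant owns each business.
    business_tenant: dict = field(default_factory=dict)


def can_access_business(store, user_id, tenant_id, business_id):
    # Step 1: the user must be a member of the tenant.
    if (tenant_id, user_id) not in store.tenant_memberships:
        return False
    # Step 2: the business must belong to that tenant.
    if store.business_tenant.get(business_id) != tenant_id:
        return False
    # Step 3: the user must hold a membership in the business itself.
    return (business_id, user_id) in store.business_memberships


store = MembershipStore()
store.tenant_memberships.add(("tenant-a", "user-1"))
store.business_tenant["biz-1"] = "tenant-a"
store.business_memberships.add(("biz-1", "user-1"))
```

The same shape would apply to vendor and team scope: the tenant check always runs first, so a business membership alone never grants access across tenants.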
## 4.2 Organization and Tenant
Tables:
1. `tenants` (new canonical boundary: business/vendor ownership root)
2. `businesses`
3. `vendors`
4. `teams`
5. `team_hubs`
6. `hubs`
Rules:
1. Every command-critical row references `tenant_id`.
2. All list queries must include a tenant predicate.
3. Business and vendor routes must enforce membership scope before data access.
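
One way to make rule 2 hard to forget is a query helper that refuses to run without a tenant. The sketch below uses an in-memory SQLite table as a stand-in for the real database; `list_orders` and the column set are assumptions, not the actual connector.

```python
import sqlite3


def list_orders(conn, tenant_id, status=None):
    """List orders for one tenant; the tenant predicate is not optional."""
    if not tenant_id:
        raise ValueError("tenant_id is required for list queries")
    sql = "SELECT id, status FROM orders WHERE tenant_id = ?"
    params = [tenant_id]
    if status is not None:
        sql += " AND status = ?"
        params.append(status)
    sql += " ORDER BY id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id TEXT PRIMARY KEY, tenant_id TEXT NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("o1", "tenant-a", "OPEN"), ("o2", "tenant-b", "OPEN"), ("o3", "tenant-a", "FILLED")],
)
```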
## 4.8 RBAC rollout strategy (deferred enforcement)
RBAC should be introduced in phases and **not enforced everywhere immediately**.
Phase A: Auth-first (now)
1. Require valid auth token.
2. Resolve tenant context.
3. Allow current work to continue while logging actor + tenant + action.
Phase B: Shadow RBAC
1. Evaluate permissions (`allow`/`deny`) in backend.
2. Log decisions but do not block most requests yet.
3. Start with warnings and dashboards for denied actions.
Phase C: Enforced RBAC on command writes
1. Enforce RBAC on `/commands/*` only.
2. Keep low-risk read flows in transition mode.
Phase D: Enforced RBAC on high-risk reads
1. Enforce tenant and role checks on sensitive read connectors.
2. Remove remaining broad user-level access.
```mermaid
flowchart LR
A["Auth only"] --> B["Shadow RBAC logging"]
B --> C["Enforce RBAC on command writes"]
C --> D["Enforce RBAC on sensitive reads"]
```
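
The phased rollout above can be sketched as a single decision function. This is an illustration under assumptions: `Phase`, `authorize`, the in-memory `decision_log`, and the `/queries/invoices` route are hypothetical; only the `/commands/*` prefix comes from this document.

```python
from enum import Enum


class Phase(Enum):
    AUTH_ONLY = "A"
    SHADOW = "B"
    ENFORCE_COMMANDS = "C"
    ENFORCE_SENSITIVE_READS = "D"


decision_log = []  # stand-in for a real log sink or dashboard feed


def authorize(phase, actor, tenant_id, route, has_permission, sensitive_read=False):
    """Return True if the request may proceed; always log actor + tenant + action."""
    entry = {"actor": actor, "tenant_id": tenant_id, "route": route}
    if phase is Phase.AUTH_ONLY:
        # Phase A: no permission evaluation yet, only logging.
        decision_log.append({**entry, "decision": None, "enforced": False})
        return True
    is_command = route.startswith("/commands/")
    enforced = (
        is_command and phase in (Phase.ENFORCE_COMMANDS, Phase.ENFORCE_SENSITIVE_READS)
    ) or (sensitive_read and phase is Phase.ENFORCE_SENSITIVE_READS)
    decision = "allow" if has_permission else "deny"
    decision_log.append({**entry, "decision": decision, "enforced": enforced})
    # In shadow mode a deny is logged but the request still proceeds.
    return has_permission or not enforced
```

Keeping the phase as an input means moving from Phase B to Phase C is a configuration change rather than a code change in each route.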
## 4.3 Scheduling and Orders
Tables:
1. `orders`
2. `order_schedule_rules` (new; replaces schedule JSON fields)
3. `shifts`
4. `shift_roles`
5. `shift_role_requirements` (optional extension for policy rules)
6. `shift_managers` (new; replaces `managers: [Any!]`)
Rules:
1. No denormalized `assignedStaff` or `shifts` JSON in `orders`.
2. Time constraints: `start_time < end_time`.
3. Capacity constraints: `assigned <= count`, `filled <= workers_needed`.
4. Canonical status names (single spelling across schema).
## 4.4 Staffing and Matching
Tables:
1. `staffs`
2. `staff_roles`
3. `workforce`
4. `applications`
5. `assignments`
6. `staff_reviews`
7. `staff_favorites`
Rules:
1. One active workforce relation per `(vendor_id, staff_id)`.
2. One application per `(shift_id, role_id, staff_id)` unless versioned intentionally.
3. Assignment state transitions only through command APIs.
4. Business quality signals are relational:
- `staff_reviews` stores rating and review text from businesses,
- `staff_favorites` stores reusable staffing preferences,
- aggregate rating is materialized on `staffs`.
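
Rule 3 implies an explicit transition table that the command API consults before any write. The sketch below is illustrative only: the status names and allowed edges are assumptions, since this document does not define the canonical assignment states.

```python
# Hypothetical assignment states and edges; the real set must come from the
# canonical status list agreed for the schema.
ALLOWED_TRANSITIONS = {
    "PENDING": {"CONFIRMED", "CANCELLED"},
    "CONFIRMED": {"CHECKED_IN", "CANCELLED", "NO_SHOW"},
    "CHECKED_IN": {"COMPLETED"},
    "COMPLETED": set(),
    "CANCELLED": set(),
    "NO_SHOW": set(),
}


class InvalidTransition(Exception):
    """Raised when a command asks for a state change that is not allowed."""


def transition(current, target):
    """Return the new state, or raise if the edge is not in the table."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target
```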
## 4.5 Compliance and Verification
Tables:
1. `documents`
2. `staff_documents`
3. `certificates`
4. `verification_jobs`
5. `verification_reviews`
6. `verification_events`
Rules:
1. Verification is asynchronous, and verification events are append-only.
2. Manual review is explicit and tracked.
3. Government ID and certification provider references are persisted.
## 4.6 Financial and Payout
Tables:
1. `invoices`
2. `invoice_templates`
3. `recent_payments`
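4. `accounts` (refactor to tokenized provider references)
5. `invoice_line_items` (new; settlement detail per invoice)
6. `payment_runs` (new)
7. `payment_allocations` (new; links payment runs to invoices)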
Rules:
1. Replace monetary `Float` with exact numeric (`DECIMAL(12,2)` or integer cents).
2. Do not expose raw account/routing values in query connectors.
3. Add one-primary-account constraint per owner.
## 4.7 Audit and Reliability
Tables:
1. `domain_events` (new)
2. `idempotency_keys` (already started in command API SQL)
3. `activity_logs`
Rules:
1. Every command write emits a domain event.
2. Idempotency scope: `(actor_uid, route, idempotency_key)`.
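
A minimal sketch of the idempotency scope in rule 2, using an in-memory SQLite table. The `response_json` column, the `run_once` helper, and the route string are assumptions; the existing command API SQL may store different fields.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE idempotency_keys (
        actor_uid TEXT NOT NULL,
        route TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        response_json TEXT NOT NULL,
        PRIMARY KEY (actor_uid, route, idempotency_key)
    )
    """
)


def run_once(conn, actor_uid, route, key, handler):
    """Run handler at most once per (actor_uid, route, key); replay afterwards."""
    row = conn.execute(
        "SELECT response_json FROM idempotency_keys "
        "WHERE actor_uid = ? AND route = ? AND idempotency_key = ?",
        (actor_uid, route, key),
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # replay the stored response
    with conn:  # commits on success, rolls back on exception
        response = handler()
        conn.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?, ?)",
            (actor_uid, route, key, json.dumps(response)),
        )
    return response


calls = []


def create_order():
    calls.append(1)
    return {"order_id": f"order-{len(calls)}"}


first = run_once(conn, "uid-1", "/commands/orders/create", "key-123", create_order)
second = run_once(conn, "uid-1", "/commands/orders/create", "key-123", create_order)
```

Under concurrency, the primary key is what ultimately rejects a duplicate insert; the initial lookup is only a fast path.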
## 4.9 Attendance, Timesheets, and Offense Governance
Tables:
1. `clock_points` (approved tap and geo validation points per business or venue)
2. `attendance_events` (append-only: clock-in/out, source, NFC, geo, correction metadata)
3. `attendance_sessions` (derived work session per assignment)
4. `timesheets` (approval-ready payroll snapshot)
5. `timesheet_adjustments` (manual edits with reason and actor)
6. `offense_policies` (tenant/business scoped policy set)
7. `offense_rules` (threshold ladder and consequence)
8. `offense_events` (actual violation events)
9. `enforcement_actions` (warning, suspension, disable, block)
Rules:
1. Attendance corrections are additive events, not destructive overwrites.
2. NFC and geo validation happens against `clock_points`, not hardcoded client logic.
3. Rejected attendance attempts are still logged as events for audit.
4. Offense consequences are computed from policy + history and persisted as explicit actions.
5. Manual overrides require actor, reason, and timestamp in audit trail.
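
Rule 4 and the threshold ladder in `offense_rules` can be illustrated with a small pure function. The trigger types and the specific ladder below are made-up sample data; the consequence names reuse the examples listed for `enforcement_actions`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffenseRule:
    # Mirrors the unique key (policy_id, trigger_type, threshold_count), minus policy_id.
    trigger_type: str
    threshold_count: int
    consequence: str


# Hypothetical ladder for one policy.
RULES = [
    OffenseRule("NO_SHOW", 1, "warning"),
    OffenseRule("NO_SHOW", 2, "suspension"),
    OffenseRule("NO_SHOW", 3, "block"),
]


def consequence_for(rules, trigger_type, prior_count):
    """Pick the consequence for a new offense, given the count of earlier ones.

    The new event brings the total to prior_count + 1; the rule with the
    highest threshold that the total has reached applies.
    """
    total = prior_count + 1
    reached = [
        r for r in rules
        if r.trigger_type == trigger_type and r.threshold_count <= total
    ]
    if not reached:
        return None
    return max(reached, key=lambda r: r.threshold_count).consequence
```

The returned consequence would then be persisted as a row in `enforcement_actions` rather than recomputed on every read.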
## 4.10 Stakeholder Network Extensibility
Tables:
1. `stakeholder_types` (buyer, operator, vendor, workforce, partner, future types)
2. `stakeholder_profiles` (tenant-scoped typed profile)
3. `stakeholder_links` (explicit relationship graph between profiles)
Rules:
1. New stakeholder categories are added by data, not by schema rewrites to core workflow tables.
2. Permission scope resolves through role bindings plus stakeholder links where needed.
3. Scheduling and finance records remain stable while stakeholder topology evolves.
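
To illustrate rule 1, the sketch below adds a new stakeholder category by inserting a registry row, with no change to table definitions. It runs on in-memory SQLite; the `code` column, the `add_profile` helper, and the `auditor` type are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE stakeholder_types (code TEXT PRIMARY KEY);
    CREATE TABLE stakeholder_profiles (
        id INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        type TEXT NOT NULL REFERENCES stakeholder_types(code),
        status TEXT NOT NULL,
        metadata TEXT NOT NULL DEFAULT '{}'
    );
    INSERT INTO stakeholder_types (code)
    VALUES ('buyer'), ('operator'), ('vendor'), ('workforce'), ('partner');
    """
)


def add_profile(conn, tenant_id, type_code):
    """Insert a tenant-scoped profile; unknown types fail the foreign key."""
    with conn:
        cur = conn.execute(
            "INSERT INTO stakeholder_profiles (tenant_id, type, status) VALUES (?, ?, 'active')",
            (tenant_id, type_code),
        )
    return cur.lastrowid


# An unregistered type is rejected by the registry foreign key.
try:
    add_profile(conn, "tenant-a", "auditor")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Registering the type is a data change only; the insert then succeeds.
conn.execute("INSERT INTO stakeholder_types (code) VALUES ('auditor')")
profile_id = add_profile(conn, "tenant-a", "auditor")
```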
## 5) Target core model (conceptual)
```mermaid
erDiagram
TENANT ||--o{ BUSINESS : owns
TENANT ||--o{ VENDOR : owns
TENANT ||--o{ TEAM : owns
TEAM ||--o{ TEAM_MEMBER : has
USER ||--o{ TEAM_MEMBER : belongs_to
USER ||--o{ BUSINESS_MEMBERSHIP : belongs_to
USER ||--o{ VENDOR_MEMBERSHIP : belongs_to
BUSINESS ||--o{ BUSINESS_MEMBERSHIP : has
VENDOR ||--o{ VENDOR_MEMBERSHIP : has
BUSINESS ||--o{ ORDER : requests
VENDOR ||--o{ ORDER : fulfills
ORDER ||--o{ ORDER_SCHEDULE_RULE : has
ORDER ||--o{ SHIFT : expands_to
SHIFT ||--o{ SHIFT_ROLE : requires
SHIFT ||--o{ SHIFT_MANAGER : has
USER ||--o{ STAFF : identity
STAFF ||--o{ STAFF_ROLE : skills
VENDOR ||--o{ WORKFORCE : contracts
STAFF ||--o{ WORKFORCE : linked
SHIFT_ROLE ||--o{ APPLICATION : receives
STAFF ||--o{ APPLICATION : applies
SHIFT_ROLE ||--o{ ASSIGNMENT : allocates
WORKFORCE ||--o{ ASSIGNMENT : executes
ASSIGNMENT ||--o{ ATTENDANCE_EVENT : emits
ASSIGNMENT ||--o{ TIMESHEET : settles
OFFENSE_POLICY ||--o{ OFFENSE_RULE : defines
ASSIGNMENT ||--o{ OFFENSE_EVENT : may_trigger
OFFENSE_EVENT ||--o{ ENFORCEMENT_ACTION : causes
STAFF ||--o{ CERTIFICATE : has
STAFF ||--o{ STAFF_DOCUMENT : uploads
DOCUMENT ||--o{ STAFF_DOCUMENT : references
STAFF ||--o{ VERIFICATION_JOB : subject
VERIFICATION_JOB ||--o{ VERIFICATION_REVIEW : reviewed_by
VERIFICATION_JOB ||--o{ VERIFICATION_EVENT : logs
ORDER ||--o{ INVOICE : billed_by
INVOICE ||--o{ RECENT_PAYMENT : settles
TENANT ||--o{ ACCOUNT_TOKEN_REF : payout_method
INVOICE ||--o{ INVOICE_LINE_ITEM : details
PAYMENT_RUN ||--o{ PAYMENT_ALLOCATION : allocates
INVOICE ||--o{ PAYMENT_ALLOCATION : receives
PAYMENT_RUN ||--o{ REMITTANCE_DOCUMENT : publishes
ORDER ||--o{ DOMAIN_EVENT : emits
SHIFT ||--o{ DOMAIN_EVENT : emits
ASSIGNMENT ||--o{ DOMAIN_EVENT : emits
STAKEHOLDER_TYPE ||--o{ STAKEHOLDER_PROFILE : classifies
STAKEHOLDER_PROFILE ||--o{ STAKEHOLDER_LINK : relates
```
## 6) Command write boundary on this schema
```mermaid
flowchart LR
A["Frontend app"] --> B["Command API"]
B --> C["Policy + validation"]
C --> D["Single database transaction"]
D --> E["orders, shifts, shift_roles, applications, assignments"]
D --> F["domain_events + idempotency_keys"]
E --> G["Read models and reports"]
```
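
The "single database transaction" step above can be sketched as follows: the assignment row, the domain event, and the capacity counter either all commit or all roll back. This uses in-memory SQLite with a reduced, assumed column set; `assign_worker` and the `assignment.created` event name are hypothetical, and idempotency handling is omitted for brevity.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shift_roles (
        id TEXT PRIMARY KEY,
        "count" INTEGER NOT NULL,
        assigned INTEGER NOT NULL DEFAULT 0,
        CHECK (assigned >= 0 AND assigned <= "count")
    );
    CREATE TABLE assignments (
        id INTEGER PRIMARY KEY,
        shift_role_id TEXT NOT NULL REFERENCES shift_roles(id),
        workforce_id TEXT NOT NULL
    );
    CREATE TABLE domain_events (
        id INTEGER PRIMARY KEY,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
    INSERT INTO shift_roles (id, "count") VALUES ('sr-1', 1);
    """
)


def assign_worker(conn, shift_role_id, workforce_id):
    """Write assignment + event + counter atomically; False if a constraint rejects it."""
    try:
        with conn:  # one transaction: commit on success, rollback on any error
            conn.execute(
                "INSERT INTO assignments (shift_role_id, workforce_id) VALUES (?, ?)",
                (shift_role_id, workforce_id),
            )
            conn.execute(
                "INSERT INTO domain_events (event_type, payload) VALUES (?, ?)",
                (
                    "assignment.created",
                    json.dumps({"shift_role_id": shift_role_id, "workforce_id": workforce_id}),
                ),
            )
            # The CHECK constraint fails here when the role is already full.
            conn.execute(
                "UPDATE shift_roles SET assigned = assigned + 1 WHERE id = ?",
                (shift_role_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

When the second worker is rejected, neither an orphan assignment nor a misleading domain event is left behind.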
## 7) Minimum constraints and indexes to add before command build
## 7.1 Constraints
1. `shift_roles`: check `assigned >= 0 AND assigned <= count`.
2. `shifts`: check `start_time < end_time`.
3. `applications`: unique `(shift_id, role_id, staff_id)`.
4. `workforce`: unique active `(vendor_id, staff_id)`.
5. `team_members`: unique `(team_id, user_id)`.
6. `accounts` (or token ref table): unique primary per owner.
7. `attendance_events`: unique idempotency tuple (for example `(assignment_id, source_event_id)`).
8. `offense_rules`: unique `(policy_id, trigger_type, threshold_count)`.
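
A few of these constraints, sketched against in-memory SQLite to show how they reject bad writes. Column sets are reduced and the `ACTIVE` status value is an assumption; the production DDL would be written for the real database engine. The "unique active" rule is expressed here as a partial unique index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shifts (
        id TEXT PRIMARY KEY,
        start_time TEXT NOT NULL,  -- ISO-8601 UTC strings compare in time order
        end_time TEXT NOT NULL,
        CHECK (start_time < end_time)
    );
    CREATE TABLE applications (
        shift_id TEXT NOT NULL,
        role_id TEXT NOT NULL,
        staff_id TEXT NOT NULL,
        UNIQUE (shift_id, role_id, staff_id)
    );
    CREATE TABLE workforce (
        vendor_id TEXT NOT NULL,
        staff_id TEXT NOT NULL,
        status TEXT NOT NULL
    );
    -- Only one ACTIVE row per (vendor_id, staff_id); inactive history rows are unrestricted.
    CREATE UNIQUE INDEX workforce_one_active
        ON workforce (vendor_id, staff_id) WHERE status = 'ACTIVE';
    """
)


def violates(conn, sql, params):
    """Return True if the statement is rejected by a constraint."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True
```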
## 7.2 Indexes
1. `orders (tenant_id, status, date)`.
2. `shifts (order_id, date, status)`.
3. `shift_roles (shift_id, role_id, start_time)`.
4. `applications (shift_id, role_id, status, created_at)`.
5. `assignments (workforce_id, shift_id, role_id, status)`.
6. `verification_jobs (subject_id, type, status, created_at)`.
7. `invoices (business_id, vendor_id, status, due_date)`.
8. `attendance_events (assignment_id, event_time, event_type)`.
9. `offense_events (staff_id, occurred_at, offense_type, status)`.
10. `invoice_line_items (invoice_id, line_type, created_at)`.
## 8) Data type normalization
1. Monetary: `Float -> DECIMAL(12,2)` (or integer cents).
2. Generic JSON fields in core scheduling: split into relational tables.
3. Timestamps: store UTC and enforce server-generated creation/update fields.
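
For item 1, a minimal sketch of converting legacy float values to integer cents. The helper names and the half-up rounding mode are assumptions; the actual rounding policy should be confirmed with finance before any backfill.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_cents(amount):
    """Convert a legacy float (or numeric string) to integer cents."""
    # Going through str() uses the float's shortest repr (19.99 -> '19.99')
    # instead of its full binary expansion.
    quantized = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return int(quantized * 100)


def format_cents(cents):
    """Render integer cents as a two-decimal string for display."""
    return f"{Decimal(cents) / 100:.2f}"


# Float arithmetic drifts; integer cents do not.
float_sum_matches = (0.1 + 0.2 == 0.3)
cents_sum_matches = (to_cents(0.1) + to_cents(0.2) == to_cents(0.3))
```

Here `float_sum_matches` is `False` while `cents_sum_matches` is `True`, which is the reason for rule 3 in section 2.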
## 9) Security boundary in schema/connectors
1. Remove broad list queries for sensitive entities unless tenant-scoped.
2. Strip sensitive fields from connector query payloads (bank/routing).
3. Keep high-risk mutations behind command API; Data Connect remains read-first for client.
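
Item 2 is safer as an allow-list than a deny-list, so newly added columns stay hidden by default. The field names below (`provider_token_ref`, `last4`, `routing_number`, and so on) are hypothetical, not the actual `accounts` columns.

```python
# Hypothetical allow-list of account fields that a read connector may return.
PUBLIC_ACCOUNT_FIELDS = ("id", "owner_id", "provider_token_ref", "last4", "is_primary")


def to_connector_payload(account_row):
    """Project an account row onto the allow-listed fields only."""
    return {k: account_row[k] for k in PUBLIC_ACCOUNT_FIELDS if k in account_row}


row = {
    "id": "acc-1",
    "owner_id": "staff-9",
    "provider_token_ref": "tok_abc",
    "last4": "6789",
    "is_primary": True,
    "account_number": "000123456789",
    "routing_number": "110000000",
}
payload = to_connector_payload(row)
```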
## 10) Migration phases (schema-first)
```mermaid
flowchart TD
P0["Phase 0: Safety patch
- lock sensitive fields
- enforce tenant-scoped queries
- freeze new direct write connectors"] --> P1["Phase 1: Core constraints
- add unique/check constraints
- add indexes
- normalize money types"]
P1 --> P2["Phase 2: Tenant and RBAC base tables
- add tenants and tenant_memberships
- add roles permissions role_bindings
- run RBAC in shadow mode"]
P2 --> P3["Phase 3: Scheduling normalization
- remove order JSON workflow fields
- add order_schedule_rules and shift_managers
- add attendance and offense base tables"]
P3 --> P4["Phase 4: Command rollout
- command writes on hardened schema
- emit domain events + idempotency
- enforce RBAC for command routes
- add finance settlement tables for payment runs and remittance"]
P4 --> P5["Phase 5: Read migration + cleanup
- migrate frontend reads as needed
- enforce RBAC for sensitive reads
- retire deprecated connectors"]
```
## 11) Definition of ready for command backend
1. P0 and P1 complete in `dev`.
2. Tenant scoping verified in connector tests.
3. Sensitive field exposure removed.
4. Core transaction invariants enforced by schema constraints.
5. Command API contracts mapped to new normalized tables.
6. RBAC is in shadow mode with decision logs in place (not hard-blocking yet).
7. Attendance and offense tables are ready for policy-driven command routes.
8. Finance settlement tables (`invoice_line_items`, `payment_runs`, `payment_allocations`) are available.
## 12) Full current model relationship map (all models)
```mermaid
flowchart LR
Account["Account"]
ActivityLog["ActivityLog"]
Application["Application"]
Assignment["Assignment"]
AttireOption["AttireOption"]
BenefitsData["BenefitsData"]
Business["Business"]
Category["Category"]
Certificate["Certificate"]
ClientFeedback["ClientFeedback"]
Conversation["Conversation"]
Course["Course"]
CustomRateCard["CustomRateCard"]
Document["Document"]
EmergencyContact["EmergencyContact"]
FaqData["FaqData"]
Hub["Hub"]
Invoice["Invoice"]
InvoiceTemplate["InvoiceTemplate"]
Level["Level"]
MemberTask["MemberTask"]
Message["Message"]
Order["Order"]
RecentPayment["RecentPayment"]
Role["Role"]
RoleCategory["RoleCategory"]
Shift["Shift"]
ShiftRole["ShiftRole"]
Staff["Staff"]
StaffAvailability["StaffAvailability"]
StaffAvailabilityStats["StaffAvailabilityStats"]
StaffCourse["StaffCourse"]
StaffDocument["StaffDocument"]
StaffRole["StaffRole"]
Task["Task"]
TaskComment["TaskComment"]
TaxForm["TaxForm"]
Team["Team"]
TeamHub["TeamHub"]
TeamHudDepartment["TeamHudDepartment"]
TeamMember["TeamMember"]
User["User"]
UserConversation["UserConversation"]
Vendor["Vendor"]
VendorBenefitPlan["VendorBenefitPlan"]
VendorRate["VendorRate"]
Workforce["Workforce"]
Application --> Shift
Application --> ShiftRole
Application --> Staff
Assignment --> ShiftRole
Assignment --> Workforce
BenefitsData --> Staff
BenefitsData --> VendorBenefitPlan
Certificate --> Staff
ClientFeedback --> Business
ClientFeedback --> Vendor
Course --> Category
Invoice --> Business
Invoice --> Order
Invoice --> Vendor
InvoiceTemplate --> Business
InvoiceTemplate --> Order
InvoiceTemplate --> Vendor
MemberTask --> Task
MemberTask --> TeamMember
Message --> User
Order --> Business
Order --> TeamHub
Order --> Vendor
RecentPayment --> Application
RecentPayment --> Invoice
Shift --> Order
ShiftRole --> Role
ShiftRole --> Shift
StaffAvailability --> Staff
StaffAvailabilityStats --> Staff
StaffDocument --> Document
StaffRole --> Role
StaffRole --> Staff
TaskComment --> TeamMember
TeamHub --> Team
TeamHudDepartment --> TeamHub
TeamMember --> Team
TeamMember --> TeamHub
TeamMember --> User
UserConversation --> Conversation
UserConversation --> User
VendorBenefitPlan --> Vendor
VendorRate --> Vendor
Workforce --> Staff
Workforce --> Vendor
```