Files
Krow-workspace/docs/RELEASE/HOTFIX_PROCESS.md
Achintha Isuru 085445e730 feat: add comprehensive release process documentation and version file references
- Introduced RELEASE_VISUAL_GUIDE.md for a visual overview of the release pipeline, including development, staging, and production phases.
- Created RELEASE_WORKFLOW.md detailing step-by-step release procedures for single and multi-product releases, including hotfix processes.
- Added VERSION_FILES_REFERENCE.md to outline all necessary version file updates for each product during releases, ensuring consistency and completeness.
2026-03-05 10:49:09 -05:00

6.4 KiB

Hotfix Process

For Emergency Production Fixes


🚨 When to Hotfix

Use hotfix when:

  • Critical bug in production affecting users
  • Data loss or security vulnerability
  • Service unavailable or major feature broken
  • Customer-blocking issue

Don't use hotfix for:

  • Minor bugs (can wait for next release)
  • Feature requests
  • Nice-to-have improvements
  • Styling issues

🔄 Hotfix Process

Step 1: Assess & Declare Emergency

Issue: [Brief description]
Severity: CRITICAL / HIGH / MEDIUM
Product: [Staff Mobile / Client Mobile / Web / Backend]
Environment: Production
Impact: [How many users affected]

Once severity confirmed → Start hotfix immediately.


Step 2: Create Hotfix Branch

# From production tag
git checkout -b hotfix/krow-withus-worker-mobile-v0.1.1 \
  krow-withus-worker-mobile/prod-v0.1.0

# Verify you're on the right tag
git log -1 --oneline

Format: hotfix/<product>-v<PATCH+1>


Step 3: Fix the Bug

# Make your fix
# Edit files, test locally

# Commit with clear message
git commit -m "fix: [issue description]

HOTFIX for production
Issue: [what happened]
Solution: [what was fixed]
Tested: [how was it tested locally]"

Keep it minimal:

  • Only fix the specific bug
  • Don't refactor or optimize
  • Don't add new features

Step 4: Update Version

Update PATCH version only (0.1.0 → 0.1.1):

For Mobile (apps/mobile/apps/*/pubspec.yaml):

# Old
version: 0.1.0+5

# New
version: 0.1.1+6  # Only PATCH changed

For Web (apps/web/package.json):

"version": "0.1.1"

For Backend (backend/*/package.json):

"version": "0.1.1"

Step 5: Update CHANGELOG

Add entry to top of appropriate CHANGELOG:

| 2026-03-05 | 0.1.1 | HOTFIX: [Issue fixed] |

(previous entries below...)

Step 6: Code Review (Expedited)

# Push hotfix branch
git push origin hotfix/krow-withus-worker-mobile-v0.1.1

# Create PR on GitHub with URGENT label
gh pr create --title "HOTFIX: [Issue description]" \
  --body "**URGENT PRODUCTION FIX**

Issue: [What was broken]
Impact: [Users affected]
Solution: [What was fixed]
Testing: [Local verification]"

Get approval within 15 minutes if possible.


Step 7: Merge to Main

# Review complete - merge
git checkout main
git pull origin main
git merge --ff-only hotfix/krow-withus-worker-mobile-v0.1.1
git push origin main

Step 8: Create Production Tag

# Create tag from main
git tag -a krow-withus-worker-mobile/prod-v0.1.1 \
  -m "HOTFIX: [Issue fixed]"

git push origin krow-withus-worker-mobile/prod-v0.1.1

Step 9: Deploy to Production

# Follow your deployment procedure
# Higher priority than normal releases

./scripts/deploy-mobile-production.sh krow-withus-worker-mobile/prod-v0.1.1

Deployment time: Within 30 minutes of approval


Step 10: Verify & Monitor

# Smoke tests
- App launches
- Core features work
- No new errors

# Monitor for 2 hours
- Watch error logs
- Check user reports
- Verify fix worked

Step 11: Communicate

Immediately after deployment:

🚨 PRODUCTION HOTFIX DEPLOYED

Product: Worker Mobile
Version: 0.1.1
Issue: [Fixed issue]
Impact: [Resolved for X users]
Status: ✅ Deployed & verified

No user action required.
Service restored to normal.

24 hours later:

✅ HOTFIX STATUS UPDATE

Production hotfix v0.1.1 deployed 24 hours ago.
Zero errors reported post-deployment.
System stable.

Thank you for your patience!

⏱️ Timeline

T-0:    Issue detected & reported
T+5min: Severity assessed, hotfix branch created
T+15:   Fix implemented, code review started
T+30:   Approved & merged, tag created
T+45:   Deployed to production
T+60:   Smoke tests pass, monitoring enabled
T+120:  Declare emergency resolved, communicate
T+1day: Follow-up communication

Total time: 2-4 hours from detection to resolution


🚫 Common Mistakes to Avoid

Don't:

  • Skip code review (even in emergency)
  • Add multiple unrelated fixes in one hotfix
  • Forget to update version number
  • Forget CHANGELOG entry
  • Deploy without testing
  • Forget to communicate with users

Do:

  • Keep hotfix minimal and focused
  • Test every fix locally first
  • Get at least one approval
  • Update all version files
  • Deploy immediately after approval
  • Monitor actively for 2+ hours

📋 Hotfix Checklist

Copy for each emergency:

Hotfix: [Product] v[Old Version] → v[New Version]

□ Severity assessed & documented
□ Branch created from production tag
□ Bug fixed & tested locally
□ Version number updated (PATCH only)
□ CHANGELOG entry added
□ Commit message clear
□ Code review requested (marked URGENT)
□ Approval obtained
□ Merged to main
□ Production tag created
□ Tag pushed to remote
□ Deployed to production
□ Smoke tests passed
□ Error logs monitored (2+ hours)
□ Users notified
□ GitHub Release created
□ Incident documented

Total Time: ___ minutes

🔍 Post-Incident

After emergency is resolved:

  1. Document what happened

    • Root cause analysis
    • Why it wasn't caught before
    • What testing was missed
  2. Schedule postmortem (within 24 hours)

    • Review what went wrong
    • Discuss prevention
    • Update processes if needed
  3. Plan prevention

    • Add test coverage
    • Update CI/CD checks
    • Improve monitoring
  4. Communicate findings

    • Share with team
    • Update documentation
    • Prevent recurrence

📞 Emergency Contacts

When issue detected:

  1. Notify:

    • Release Engineer
    • DevOps
    • Product Owner
    • Affected Team
  2. Communication Channel:

    • Slack: #emergency-releases
    • Time-sensitive decisions on call
  3. Decision Maker:

    • Product Owner approves rollback vs hotfix
    • Release Engineer executes
    • DevOps monitors infrastructure


Last Updated: 2026-03-05
Severity Levels:

  • 🔴 CRITICAL: Service down, data loss, security (< 1 hour)
  • 🟠 HIGH: Major feature broken, workaround available (< 4 hours)
  • 🟡 MEDIUM: Minor feature affected (next release OK)