Encryption Key Rotation Procedure
Overview
The HIPAA Security Rule does not prescribe a specific key rotation interval, but regular rotation is a widely accepted security practice that supports its encryption safeguards. This procedure keeps encrypted PHI protected through controlled key lifecycle management.
- Frequency: Quarterly (every 90 days)
- Scope: Application-level encryption keys (AES-256-GCM)
- Downtime: Zero (rotation is online, re-encryption is a background job)
Rotation Schedule
Calendar
- Q1: February 1
- Q2: May 1
- Q3: August 1
- Q4: November 1
Automated Reminders
GitHub Actions Workflow: .github/workflows/key-rotation-reminder.yml
name: Key Rotation Reminder
on:
  schedule:
    # 1st of February, May, August, and November at 09:00 UTC
    - cron: '0 9 1 2,5,8,11 *'
  workflow_dispatch: # Allow manual trigger for testing
jobs:
  reminder:
    runs-on: ubuntu-latest
    permissions:
      issues: write # Needed by the issue-creation step when the default token is read-only
    steps:
      - name: Send Slack Notification
        uses: slackapi/[email protected]
        with:
          payload: |
            {
              "text": "🔐 QUARTERLY KEY ROTATION DUE",
              "blocks": [
                {
                  "type": "header",
                  "text": {
                    "type": "plain_text",
                    "text": "🔐 Encryption Key Rotation Required"
                  }
                },
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Action Required:* Rotate encryption keys for HIPAA compliance.\n\n*Procedure:* docs/reference/key-rotation.md\n*Tables affected:* patient_persons.phone_encrypted, patient_persons.emergency_contact_phone_encrypted, organization_integrations.api_key_encrypted"
                  }
                },
                {
                  "type": "section",
                  "fields": [
                    {
                      "type": "mrkdwn",
                      "text": "*Deadline:* Within 7 days"
                    },
                    {
                      "type": "mrkdwn",
                      "text": "*Assignee:* @security-team"
                    }
                  ]
                },
                {
                  "type": "actions",
                  "elements": [
                    {
                      "type": "button",
                      "text": {
                        "type": "plain_text",
                        "text": "View Procedure"
                      },
                      "url": "https://github.com/restartix/restartix-platform/blob/main/docs/reference/key-rotation.md"
                    }
                  ]
                }
              ]
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_SECURITY_WEBHOOK }}
      - name: Create GitHub Issue
        uses: actions/github-script@v7
        with:
          script: |
            // Await the call so a failed request fails the step
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: '🔐 Q' + Math.ceil((new Date().getMonth() + 1) / 3) + ' Encryption Key Rotation',
              body: `## Quarterly Key Rotation Due

            **Deadline:** ${new Date(Date.now() + 7*24*60*60*1000).toISOString().split('T')[0]}
            **Procedure:** [docs/reference/key-rotation.md](https://github.com/restartix/restartix-platform/blob/main/docs/reference/key-rotation.md)

            **Pre-rotation checklist:**
            - [ ] Verify backups are current
            - [ ] Generate new encryption key
            - [ ] Add new key to AWS Secrets Manager
            - [ ] Deploy with multi-version support
            - [ ] Run re-encryption job
            - [ ] Verify all data re-encrypted
            - [ ] Remove old key after 24 hours
            - [ ] Document in audit log

            **Tables to re-encrypt:**
            - \`patient_persons.phone_encrypted\`
            - \`patient_persons.emergency_contact_phone_encrypted\`
            - \`organization_integrations.api_key_encrypted\`

            cc: @security-team @devops
            `,
              labels: ['security', 'compliance', 'hipaa']
            })
Alternative: Google Calendar
If GitHub Actions is not preferred, set up recurring Google Calendar events:
- Event: "Encryption Key Rotation - Q1/Q2/Q3/Q4"
- Recurrence: Quarterly on first day of Feb/May/Aug/Nov
- Reminders: 7 days before, 3 days before, day of
- Attendees: security team, DevOps lead, compliance officer
Pre-Rotation Checklist
Execute 24 hours before rotation:
[ ] Verify backups are current
# Check the most recent RDS snapshot
# (--max-records accepts a minimum of 20 for this command, so sort the results instead)
aws rds describe-db-snapshots \
  --db-instance-identifier restartix-prod \
  --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].[DBSnapshotIdentifier,SnapshotCreateTime]'
# Should be within last 24 hours
[ ] Count encrypted records
-- How many records need re-encryption?
SELECT
  (SELECT COUNT(*) FROM patient_persons WHERE phone_encrypted IS NOT NULL) AS patient_phones,
  (SELECT COUNT(*) FROM patient_persons WHERE emergency_contact_phone_encrypted IS NOT NULL) AS emergency_phones,
  (SELECT COUNT(*) FROM organization_integrations) AS org_api_keys;
-- Estimate time: ~500 records/sec
-- 10,000 records = ~20 seconds
-- 100,000 records = ~3 minutes
[ ] Test new key generation
# Generate test key
openssl rand -hex 32
# Should output 64 hex characters
[ ] Verify re-encryption tool is ready
# Build re-encryption tool
cd cmd/tools/reencrypt
go build -o reencrypt
./reencrypt --help
[ ] Communication plan
- Notify team in Slack #engineering
- No user-facing downtime expected
- If issues arise, rollback plan ready (see below)
Rotation Procedure
Step 1: Generate New Key
# Generate 32-byte (256-bit) AES key
openssl rand -hex 32 > new-encryption-key.txt
# Verify it's 64 hex characters
# (strip the trailing newline first, otherwise wc reports 65)
tr -d '\n' < new-encryption-key.txt | wc -c
# Should output: 64
# SECURITY: Never commit this file to git
# Add to .gitignore if not already there
echo "new-encryption-key.txt" >> .gitignore
Store securely:
- Copy to password manager (1Password, LastPass, etc.)
- Label: "RestartiX Encryption Key V2 - Feb 2026"
- Share with: Security team only
Step 2: Add New Key to AWS Secrets Manager
# Option A: AWS CLI
# Note: update-secret replaces the entire secret value. If restartix-prod/env holds
# other variables, include them in the JSON below as well.
aws secretsmanager update-secret \
  --secret-id restartix-prod/env \
  --secret-string '{"ENCRYPTION_KEY_V1":"<old_key>","ENCRYPTION_KEY_V2":"'"$(cat new-encryption-key.txt)"'","ENCRYPTION_CURRENT_VERSION":"2"}'
# Verify the secret was updated (lists key names only, so key values stay off the terminal)
aws secretsmanager get-secret-value \
  --secret-id restartix-prod/env \
  --query SecretString --output text | jq 'keys'
# Expected key names present:
# ENCRYPTION_CURRENT_VERSION
# ENCRYPTION_KEY_V1
# ENCRYPTION_KEY_V2
Option B: AWS Console
- Go to the AWS Secrets Manager console
- Select the restartix-prod/env secret
- Click "Retrieve secret value" then "Edit"
- Add or update the following key/value pairs:
  - ENCRYPTION_KEY_V2= (paste 64-char key)
  - ENCRYPTION_CURRENT_VERSION=2
- Keep ENCRYPTION_KEY_V1 unchanged
- Click "Save"
For more details on the AWS setup, see AWS Infrastructure.
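To illustrate how the three fields in the secret fit together, the sketch below parses a secret of that shape and checks that the current version has a matching key. It assumes every value in the secret is a JSON string; `parseKeyConfig` is a hypothetical function and is not the project's `config.Load`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

const keyPrefix = "ENCRYPTION_KEY_V"

// parseKeyConfig reads ENCRYPTION_KEY_V<n> and ENCRYPTION_CURRENT_VERSION
// from a secret JSON document. It returns the hex keys indexed by version
// and the current version, and fails if the current version has no key.
func parseKeyConfig(secretJSON string) (map[byte]string, byte, error) {
	var fields map[string]string
	if err := json.Unmarshal([]byte(secretJSON), &fields); err != nil {
		return nil, 0, fmt.Errorf("parse secret: %w", err)
	}
	keys := make(map[byte]string)
	for name, value := range fields {
		if !strings.HasPrefix(name, keyPrefix) {
			continue // Unrelated variable stored in the same secret.
		}
		version, err := strconv.ParseUint(strings.TrimPrefix(name, keyPrefix), 10, 8)
		if err != nil {
			return nil, 0, fmt.Errorf("bad key name %q: %w", name, err)
		}
		keys[byte(version)] = value
	}
	current, err := strconv.ParseUint(fields["ENCRYPTION_CURRENT_VERSION"], 10, 8)
	if err != nil {
		return nil, 0, fmt.Errorf("bad ENCRYPTION_CURRENT_VERSION: %w", err)
	}
	if _, ok := keys[byte(current)]; !ok {
		return nil, 0, fmt.Errorf("no key for current version %d", current)
	}
	return keys, byte(current), nil
}

func main() {
	secret := `{"ENCRYPTION_KEY_V1":"aa","ENCRYPTION_KEY_V2":"bb","ENCRYPTION_CURRENT_VERSION":"2"}`
	keys, current, err := parseKeyConfig(secret)
	fmt.Println(len(keys), current, err) // 2 2 <nil>
}
```

A check like this catches the most damaging mistake in this step: setting the current version to a value whose key was never added.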
Step 3: Deploy with Multi-Version Support
# Trigger deployment via GitHub Actions (push to main)
git push origin main
# Verify deployment
curl https://api.restartix.com/health | jq .
# Check logs for successful startup via CloudWatch
aws logs tail /ecs/restartix-core-api --since 10m --follow
What happens:
- New encryptions use V2 key
- Old encrypted data still readable with V1 key
- Zero downtime, gradual rollout
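The behavior listed above depends on each ciphertext recording which key version produced it. The following is a minimal sketch of that idea and not the project's actual `crypto.VersionedEncryptor`, whose source is not shown here. It assumes a layout of one version byte, then the GCM nonce, then the ciphertext, which matches the `get_byte(..., 0)` checks used later in verification.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// VersionedEncryptor is a hypothetical stand-in for crypto.VersionedEncryptor.
// Assumed layout: [1-byte key version][12-byte nonce][ciphertext + GCM tag].
type VersionedEncryptor struct {
	aeads   map[byte]cipher.AEAD
	current byte
}

// NewVersionedEncryptor builds one AES-256-GCM cipher per key version.
func NewVersionedEncryptor(keys map[byte][]byte, current byte) (*VersionedEncryptor, error) {
	aeads := make(map[byte]cipher.AEAD, len(keys))
	for version, key := range keys {
		if len(key) != 32 {
			return nil, fmt.Errorf("key v%d is %d bytes, want 32", version, len(key))
		}
		block, err := aes.NewCipher(key)
		if err != nil {
			return nil, fmt.Errorf("key v%d: %w", version, err)
		}
		aead, err := cipher.NewGCM(block)
		if err != nil {
			return nil, fmt.Errorf("key v%d: %w", version, err)
		}
		aeads[version] = aead
	}
	if _, ok := aeads[current]; !ok {
		return nil, fmt.Errorf("no key for current version %d", current)
	}
	return &VersionedEncryptor{aeads: aeads, current: current}, nil
}

// Encrypt always uses the current key and prefixes the output with its version.
func (e *VersionedEncryptor) Encrypt(plaintext []byte) ([]byte, error) {
	aead := e.aeads[e.current]
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	out := append([]byte{e.current}, nonce...)
	return aead.Seal(out, nonce, plaintext, nil), nil
}

// Decrypt picks the key from the version byte, so old data stays readable.
func (e *VersionedEncryptor) Decrypt(data []byte) ([]byte, error) {
	if len(data) == 0 {
		return nil, errors.New("empty ciphertext")
	}
	aead, ok := e.aeads[data[0]]
	if !ok {
		return nil, fmt.Errorf("no key for version %d", data[0])
	}
	if len(data) < 1+aead.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce := data[1 : 1+aead.NonceSize()]
	return aead.Open(nil, nonce, data[1+aead.NonceSize():], nil)
}

// NeedsReEncrypt reports whether data was written with a non-current key.
func (e *VersionedEncryptor) NeedsReEncrypt(data []byte) bool {
	return len(data) > 0 && data[0] != e.current
}

func main() {
	k1, k2 := make([]byte, 32), make([]byte, 32)
	rand.Read(k1)
	rand.Read(k2)

	before, _ := NewVersionedEncryptor(map[byte][]byte{1: k1}, 1)
	oldCT, _ := before.Encrypt([]byte("+15550100"))

	after, _ := NewVersionedEncryptor(map[byte][]byte{1: k1, 2: k2}, 2)
	plaintext, err := after.Decrypt(oldCT)
	fmt.Println(string(plaintext), err, after.NeedsReEncrypt(oldCT)) // +15550100 <nil> true

	newCT, _ := after.Encrypt(plaintext)
	fmt.Println(newCT[0], after.NeedsReEncrypt(newCT)) // 2 false
}
```

Because decryption selects the key from the stored version byte, switching `ENCRYPTION_CURRENT_VERSION` only changes which key new writes use.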
Step 4: Run Re-Encryption Job
Option A: ECS Task (Recommended)
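# Run as a one-off ECS task
# Fargate tasks use awsvpc networking, so run-task also needs
# --network-configuration with the subnets and security groups for this environment
aws ecs run-task \
  --cluster restartix-prod \
  --task-definition restartix-reencrypt \
  --launch-type FARGATE \
  --overrides '{"containerOverrides":[{"name":"reencrypt","command":["go","run","cmd/tools/reencrypt/main.go"]}]}'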
# Monitor output via CloudWatch Logs
aws logs tail /ecs/restartix-reencrypt --follow
Option B: Local Execution (Staging Only)
# Set DATABASE_URL to target database
export DATABASE_URL="postgres://..."
export ENCRYPTION_KEY_V1="<old_key>"
export ENCRYPTION_KEY_V2="<new_key>"
export ENCRYPTION_CURRENT_VERSION=2
# Run re-encryption
go run cmd/tools/reencrypt/main.go
# Expected output:
# [INFO] Starting re-encryption job
# [INFO] Found 1,234 patient_persons with encrypted phones
# [INFO] Found 890 patient_persons with encrypted emergency contact phones
# [INFO] Found 56 org integrations with encrypted API keys
# [INFO] Re-encrypting patient_persons.phone_encrypted... (batch 1/13)
# [INFO] Re-encrypted 100 patient_persons (8.1%)
# [INFO] Re-encrypted 200 patient_persons (16.2%)
# ...
# [INFO] Re-encryption complete: 1,234 patient phones, 890 emergency phones, 56 org integrations
# [INFO] Elapsed time: 2.4 seconds
Re-Encryption Tool: cmd/tools/reencrypt/main.go
package main

import (
	"context"
	"fmt"
	"log/slog"
	"os"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"

	"restartix-api/internal/config"
	"restartix-api/internal/pkg/crypto"
)

func main() {
	ctx := context.Background()
	start := time.Now()

	// Load config
	cfg, err := config.Load()
	if err != nil {
		slog.Error("failed to load config", "error", err)
		os.Exit(1)
	}

	// Connect to database
	db, err := pgxpool.New(ctx, cfg.DatabaseURL)
	if err != nil {
		slog.Error("failed to connect to database", "error", err)
		os.Exit(1)
	}
	defer db.Close()

	// Initialize encryptor with both old and new keys.
	// This assumes config.Load returns the keys already decoded to raw 32-byte
	// values; a 64-character hex string converted with []byte() would be
	// 64 bytes, which is not a valid AES-256 key.
	encryptor, err := crypto.NewVersionedEncryptor(
		map[byte][]byte{
			1: []byte(cfg.EncryptionKeyV1),
			2: []byte(cfg.EncryptionKeyV2),
		},
		byte(cfg.EncryptionCurrentVersion),
	)
	if err != nil {
		slog.Error("failed to initialize encryptor", "error", err)
		os.Exit(1)
	}

	slog.Info("starting re-encryption job")

	// Re-encrypt patient_persons.phone_encrypted
	if err := reencryptPatientPhones(ctx, db, encryptor); err != nil {
		slog.Error("failed to re-encrypt patient phones", "error", err)
		os.Exit(1)
	}

	// Re-encrypt patient_persons.emergency_contact_phone_encrypted
	if err := reencryptEmergencyPhones(ctx, db, encryptor); err != nil {
		slog.Error("failed to re-encrypt emergency contact phones", "error", err)
		os.Exit(1)
	}

	// Re-encrypt organization_integrations.api_key_encrypted
	if err := reencryptOrgAPIKeys(ctx, db, encryptor); err != nil {
		slog.Error("failed to re-encrypt org API keys", "error", err)
		os.Exit(1)
	}

	slog.Info("re-encryption job completed successfully", "elapsed", time.Since(start).String())
}
func reencryptPatientPhones(ctx context.Context, db *pgxpool.Pool, enc *crypto.VersionedEncryptor) error {
	// Count total records
	var total int
	if err := db.QueryRow(ctx, "SELECT COUNT(*) FROM patient_persons WHERE phone_encrypted IS NOT NULL").Scan(&total); err != nil {
		return fmt.Errorf("count records: %w", err)
	}
	slog.Info("re-encrypting patient phones", "total", total)

	const batchSize = 100
	offset := 0
	processed := 0   // Rows examined so far, used for progress reporting
	reencrypted := 0 // Rows rewritten with the current key
	failed := 0      // Rows skipped because they could not be decrypted

	for {
		// Fetch batch
		rows, err := db.Query(ctx, `
			SELECT id, phone_encrypted
			FROM patient_persons
			WHERE phone_encrypted IS NOT NULL
			ORDER BY id
			LIMIT $1 OFFSET $2
		`, batchSize, offset)
		if err != nil {
			return fmt.Errorf("fetch batch: %w", err)
		}

		batch := make([]struct {
			ID             int64
			PhoneEncrypted []byte
		}, 0, batchSize)
		for rows.Next() {
			var record struct {
				ID             int64
				PhoneEncrypted []byte
			}
			if err := rows.Scan(&record.ID, &record.PhoneEncrypted); err != nil {
				rows.Close()
				return fmt.Errorf("scan row: %w", err)
			}
			batch = append(batch, record)
		}
		rows.Close()
		// Surface errors that ended iteration early
		if err := rows.Err(); err != nil {
			return fmt.Errorf("iterate rows: %w", err)
		}

		if len(batch) == 0 {
			break // Done
		}

		// Re-encrypt each record
		for _, record := range batch {
			// Check if already using current version
			if !enc.NeedsReEncrypt(record.PhoneEncrypted) {
				continue // Skip, already using current key
			}

			// Decrypt with old key
			plaintext, err := enc.Decrypt(record.PhoneEncrypted)
			if err != nil {
				slog.Warn("failed to decrypt patient phone", "id", record.ID, "error", err)
				failed++
				continue // Skip this record, log and continue
			}

			// Encrypt with new key
			ciphertext, err := enc.Encrypt(plaintext)
			if err != nil {
				return fmt.Errorf("encrypt patient_person %d: %w", record.ID, err)
			}

			// Update database
			_, err = db.Exec(ctx, `
				UPDATE patient_persons
				SET phone_encrypted = $1, updated_at = NOW()
				WHERE id = $2
			`, ciphertext, record.ID)
			if err != nil {
				return fmt.Errorf("update patient_person %d: %w", record.ID, err)
			}
			reencrypted++
		}

		offset += batchSize
		// Report rows actually seen so the percentage never exceeds 100
		processed += len(batch)
		pct := float64(processed) / float64(total) * 100
		slog.Info("re-encryption progress", "done", processed, "total", total, "pct", fmt.Sprintf("%.1f%%", pct))
	}

	slog.Info("re-encrypted patient phones", "count", reencrypted, "failed", failed)
	return nil
}
func reencryptEmergencyPhones(ctx context.Context, db *pgxpool.Pool, enc *crypto.VersionedEncryptor) error {
	// Same logic as reencryptPatientPhones, but for patient_persons.emergency_contact_phone_encrypted
	// ... (implementation similar to above)
	return nil
}

func reencryptOrgAPIKeys(ctx context.Context, db *pgxpool.Pool, enc *crypto.VersionedEncryptor) error {
	// Same logic as reencryptPatientPhones, but for organization_integrations.api_key_encrypted
	// ... (implementation similar to above)
	return nil
}
Step 5: Verify Re-Encryption
-- Check patient_persons table (phone)
SELECT COUNT(*) AS old_key_count
FROM patient_persons
WHERE phone_encrypted IS NOT NULL
AND get_byte(phone_encrypted, 0) != 2; -- 2 = current key version
-- Expected: 0 (all records using new key)
-- Check patient_persons table (emergency contact phone)
SELECT COUNT(*) AS old_key_count
FROM patient_persons
WHERE emergency_contact_phone_encrypted IS NOT NULL
AND get_byte(emergency_contact_phone_encrypted, 0) != 2;
-- Expected: 0
-- Check organization_integrations table
SELECT COUNT(*) AS old_key_count
FROM organization_integrations
WHERE get_byte(api_key_encrypted, 0) != 2;
-- Expected: 0
-- Double-check total encrypted records
SELECT COUNT(*) AS total_encrypted_phones FROM patient_persons WHERE phone_encrypted IS NOT NULL;
SELECT COUNT(*) AS total_encrypted_emergency_phones FROM patient_persons WHERE emergency_contact_phone_encrypted IS NOT NULL;
SELECT COUNT(*) AS total_encrypted_keys FROM organization_integrations;
-- Compare with counts from Step 4
Test decryption:
# Use health check endpoint which decrypts a test record
curl https://api.restartix.com/health | jq '.checks.encryption'
# Expected:
# {
# "status": "healthy",
# "message": "Encryption working with key version 2"
# }
Step 6: Wait 24 Hours (Safety Period)
Why: Ensure no issues with new key before removing old key.
Monitor during this period:
- Error logs for decryption failures
- Sentry/Datadog for encryption-related errors
- User reports (none expected)
# Check CloudWatch Logs for encryption errors
aws logs filter-log-events \
--log-group-name /ecs/restartix-core-api \
--filter-pattern "?encrypt ?decrypt ?ERROR" \
--start-time $(date -d '24 hours ago' +%s000)
# Should see no errors
If issues detected:
- DO NOT REMOVE OLD KEY
- Investigate and fix issue
- Re-run re-encryption if needed
- Extend safety period to 48 hours
Step 7: Remove Old Key
After 24 hours with no issues:
# Remove old key from AWS Secrets Manager
# Update the secret to remove ENCRYPTION_KEY_V1
# Note: update-secret replaces the entire secret value. If restartix-prod/env holds
# other variables, include them in the JSON below as well.
aws secretsmanager update-secret \
  --secret-id restartix-prod/env \
  --secret-string '{"ENCRYPTION_KEY_V2":"<new_key>","ENCRYPTION_CURRENT_VERSION":"2"}'
# Verify only V2 remains (lists key names only, so key values stay off the terminal)
aws secretsmanager get-secret-value \
  --secret-id restartix-prod/env \
  --query SecretString --output text | jq 'keys'
# Expected key names present (ENCRYPTION_KEY_V1 must be absent):
# ENCRYPTION_CURRENT_VERSION
# ENCRYPTION_KEY_V2
# Trigger redeploy
git push origin main
Verify deployment:
# Health check should still pass
curl https://api.restartix.com/health | jq .
# Logs should show no decryption errors
aws logs tail /ecs/restartix-core-api --since 10m
Step 8: Audit Log Entry
-- Record rotation in audit_log
INSERT INTO audit_log (
organization_id,
user_id,
action,
entity_type,
entity_id,
changes,
created_at,
action_context
) VALUES (
NULL, -- System-level action
(SELECT id FROM users WHERE role = 'superadmin' AND email = '[email protected]' LIMIT 1),
'KEY_ROTATION',
'encryption_key',
2, -- New key version
jsonb_build_object(
'old_version', 1,
'new_version', 2,
'rotated_at', NOW(),
'records_reencrypted', (
SELECT COUNT(*) FROM patient_persons WHERE phone_encrypted IS NOT NULL
) + (
SELECT COUNT(*) FROM patient_persons WHERE emergency_contact_phone_encrypted IS NOT NULL
) + (
SELECT COUNT(*) FROM organization_integrations
),
'rotation_type', 'quarterly_scheduled'
),
NOW(),
'compliance_maintenance'
);
-- Verify entry
SELECT * FROM audit_log
WHERE action = 'KEY_ROTATION'
ORDER BY created_at DESC
LIMIT 1;
Step 9: Update Documentation
- [ ] Update this document with last rotation date (see bottom of file)
- [ ] Update HIPAA compliance checklist in 04-auth-and-security.md
- [ ] Close GitHub issue created by automation
- [ ] Notify team in Slack: "✅ Q[1/2/3/4] key rotation completed"
Step 10: Securely Destroy Old Key
After 30 days:
# Remove from password manager
# 1Password: Move to Archive, then delete after 30 days
# LastPass: Delete permanently
# Shred any local key file left over from the rotation (if you kept it locally)
shred -vfz -n 10 new-encryption-key.txt
# Clear shell history
history -c
Emergency Key Rotation
Trigger scenarios:
- Suspected key compromise (e.g., accidentally committed to git)
- Former employee with access
- Security audit finding
- Breach notification
Immediate actions:
Execute rotation ASAP (follow Steps 1-7 above, but do not wait 24 hours)
Incident response:
-- Log as security incident
INSERT INTO audit_log (
  organization_id,
  user_id,
  action,
  entity_type,
  changes,
  created_at,
  action_context
) VALUES (
  NULL,
  (SELECT id FROM users WHERE role = 'superadmin' LIMIT 1),
  'EMERGENCY_KEY_ROTATION',
  'encryption_key',
  jsonb_build_object(
    'reason', 'suspected_key_compromise',
    'incident_id', 'INC-2026-001',
    'rotated_at', NOW()
  ),
  NOW(),
  'security_incident'
);
Review audit logs:
-- Check for suspicious access in last 30 days
SELECT
  created_at,
  user_id,
  action,
  entity_type,
  entity_id,
  ip_address
FROM audit_log
WHERE created_at > NOW() - INTERVAL '30 days'
  AND action IN ('CREATE', 'UPDATE')
  AND entity_type IN ('patient', 'organization_integration')
ORDER BY created_at DESC;
Notify stakeholders:
- Compliance officer
- Legal team (if breach notification required)
- Executive team
- Affected customers (if data was accessed)
Post-incident review:
- How was key compromised?
- What controls failed?
- How to prevent recurrence?
- Update security policies
Rollback Plan
If re-encryption fails or causes issues:
Rollback Step 1: Revert to Old Key
# Set current version back to V1 in AWS Secrets Manager
aws secretsmanager update-secret \
--secret-id restartix-prod/env \
--secret-string '{"ENCRYPTION_KEY_V1":"<old_key>","ENCRYPTION_KEY_V2":"<new_key>","ENCRYPTION_CURRENT_VERSION":"1"}'
# Deploy
git push origin main
# Verify health
curl https://api.restartix.com/health
Result: New encryptions use old key V1, existing V2 data still readable (both keys available).
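After a rollback, the database can hold a mix of V1 and V2 ciphertexts, so both keys must stay configured until re-encryption succeeds. The sketch below shows one way to check which referenced key versions are unavailable, assuming the first byte of each ciphertext is its key version (as the Step 5 queries do); `missingVersions` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// missingVersions returns, in ascending order, the key versions referenced by
// the first byte of any ciphertext that are not present in available.
func missingVersions(ciphertexts [][]byte, available map[byte]bool) []byte {
	seen := map[byte]bool{}
	for _, ct := range ciphertexts {
		if len(ct) > 0 && !available[ct[0]] {
			seen[ct[0]] = true
		}
	}
	missing := make([]byte, 0, len(seen))
	for version := range seen {
		missing = append(missing, version)
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing
}

func main() {
	// Mixed state after a rollback: one V1 row and two V2 rows.
	rows := [][]byte{{1, 0xAA}, {2, 0xBB}, {2, 0xCC}}

	fmt.Println(missingVersions(rows, map[byte]bool{1: true, 2: true})) // []
	fmt.Println(missingVersions(rows, map[byte]bool{2: true}))          // [1]
}
```

An empty result means every stored ciphertext can still be decrypted with the configured keys; any other result lists versions whose keys must not be removed yet.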
Rollback Step 2: Investigate Issue
# Check error logs via CloudWatch
aws logs filter-log-events \
--log-group-name /ecs/restartix-core-api \
--filter-pattern "ERROR" \
--start-time $(date -d '1 hour ago' +%s000)
# Test decryption manually
psql $DATABASE_URL -c "SELECT id, phone_encrypted FROM patient_persons WHERE phone_encrypted IS NOT NULL LIMIT 1;"
Rollback Step 3: Re-attempt Rotation
- Fix issue identified in investigation
- Re-run re-encryption job
- Verify success before removing old key
Compliance Documentation
HIPAA Requirements Met
| Requirement | Implementation |
|---|---|
| §164.312(a)(2)(iv) Encryption and decryption | AES-256-GCM for phone numbers and API keys |
| §164.308(a)(3)(i) Workforce security | Keys stored in AWS Secrets Manager, accessible only to security team |
| §164.308(a)(8) Evaluation | Quarterly rotation ensures ongoing protection |
| §164.312(b) Audit controls | All rotations logged in audit_log table |
Audit Trail
Query last 5 rotations:
SELECT
created_at,
changes->>'old_version' AS old_version,
changes->>'new_version' AS new_version,
changes->>'records_reencrypted' AS records_affected,
changes->>'rotation_type' AS type
FROM audit_log
WHERE action = 'KEY_ROTATION'
ORDER BY created_at DESC
LIMIT 5;
Rotation History
| Date | Old Version | New Version | Records Re-encrypted | Notes |
|---|---|---|---|---|
| 2026-02-01 | 1 | 2 | 1,234 patients + 56 orgs | Initial quarterly rotation |
| TBD | TBD | TBD | TBD | Q2 2026 rotation |
Update this table after each rotation.
Contact & Escalation
Responsible Team: Security Team
Primary Contact: [email protected]
Escalation: CTO
Slack Channels:
- #security: Security team coordination
- #engineering: General announcements
- #incidents: Emergency rotations only
On-Call: PagerDuty rotation (security team)