# Segment Performance & Scaling

## The Problem
Real-time segment evaluation on every data change works at small scale but becomes expensive as organizations grow.
Worst-case scenario:

```text
1 form save
→ 15 segments reference that form template
→ each segment has 5 rules across 3 data sources
→ 15 × 5 = 75 subqueries per save
→ multiply by concurrent form saves (10 specialists × 5 patients/hour = 50 saves/hour)
→ ~3,750 subqueries per hour at modest scale
```

At 5,000+ patients with 20+ segments, this becomes a bottleneck.
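Multiplying out the scenario's numbers makes the fan-out concrete (all inputs are taken from the scenario above; this is just the arithmetic):

```go
package main

import "fmt"

func main() {
	// Numbers from the worst-case scenario above.
	segmentsPerTemplate := 15 // segments referencing the saved form template
	rulesPerSegment := 5
	specialists := 10
	savesPerSpecialistPerHour := 5

	subqueriesPerSave := segmentsPerTemplate * rulesPerSegment
	savesPerHour := specialists * savesPerSpecialistPerHour
	subqueriesPerHour := subqueriesPerSave * savesPerHour

	fmt.Println(subqueriesPerSave) // 75
	fmt.Println(savesPerHour)      // 50
	fmt.Println(subqueriesPerHour) // 3750
}
```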
## Solution: Tiered Evaluation Strategy
Instead of pure real-time or pure batch, use a three-tier model:
### Tier 1: Synchronous (Inline) — Fast Path
Target: Simple segments that can evaluate in < 50ms
Criteria:
- ≤ 3 rules
- Single data source (form-only OR profile-only OR appointments-only)
- No nested groups
Behavior:
- Evaluates inline during form save/profile update
- Runs within the same database transaction
- Blocks the request briefly but keeps segment membership fresh
Implementation:

```go
func (s *segmentService) evaluateInline(ctx context.Context, tx pgx.Tx, patientID int64, templateID int64) error {
	// Find segments eligible for inline evaluation
	segments, err := s.findSimpleSegments(ctx, tx, templateID)
	if err != nil {
		return err
	}

	for _, seg := range segments {
		matched, err := s.evaluateRules(ctx, tx, seg.Rules, patientID)
		if err != nil {
			// Log but don't fail the form save; fall back to async evaluation.
			s.logger.Error("inline segment eval failed", "segment_id", seg.ID, "error", err)
			s.enqueueAsync(ctx, seg.ID, patientID)
			continue
		}
		s.updateMembership(ctx, tx, seg.ID, patientID, matched)
	}
	return nil
}
```

SLO: < 50ms added latency to form save (p99)
When to use: High-value segments that must be real-time (e.g., "Critical condition alerts")
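For reference, `updateMembership` (called in the code above) could plausibly be a single upsert/delete pair (hypothetical SQL; assumes a unique constraint on `(segment_id, patient_id)`):

```sql
-- On match: upsert the membership row and refresh matched_at.
INSERT INTO segment_members (segment_id, patient_id, organization_id, matched_at)
VALUES ($1, $2, $3, NOW())
ON CONFLICT (segment_id, patient_id) DO UPDATE SET matched_at = NOW();

-- On non-match: remove the row if present.
DELETE FROM segment_members WHERE segment_id = $1 AND patient_id = $2;
```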
### Tier 2: Asynchronous — Complex Segments
Target: Complex segments that take longer to evaluate
Criteria (any of):
- \> 3 rules
- Multi-source rules (form + profile + appointments)
- Nested groups
Behavior:
- Enqueued after form save completes
- Processed by background worker
- Segment membership updated within seconds
Implementation:

```go
type SegmentEvalJob struct {
	OrganizationID int64   `json:"organization_id"`
	PatientID      int64   `json:"patient_id"`
	SegmentIDs     []int64 `json:"segment_ids"` // empty = evaluate all
	TriggerSource  string  `json:"trigger"`     // "form_save", "profile_update", "manual"
	TemplateID     *int64  `json:"template_id"` // for targeted evaluation
}

func (w *segmentWorker) processEvalJob(ctx context.Context, job SegmentEvalJob) error {
	segments, err := w.segmentRepo.FindByOrg(ctx, job.OrganizationID)
	if err != nil {
		return err
	}

	// Filter to relevant segments
	if job.TemplateID != nil {
		segments = filterByTemplateReference(segments, *job.TemplateID)
	}
	if len(job.SegmentIDs) > 0 {
		segments = filterByIDs(segments, job.SegmentIDs)
	}

	for _, seg := range segments {
		matched, err := w.evaluator.Evaluate(ctx, seg.Rules, job.PatientID)
		if err != nil {
			w.logger.Error("async segment eval failed",
				"segment_id", seg.ID,
				"patient_id", job.PatientID,
				"error", err,
			)
			continue
		}
		w.membershipRepo.Upsert(ctx, seg.ID, job.PatientID, job.OrganizationID, matched)
	}
	return nil
}
```

Queue implementation: PostgreSQL LISTEN/NOTIFY for simplicity. Can migrate to Redis/NATS later if throughput demands.
SLO: Segment membership updated within 5 seconds of form save
Worker scaling: Start with 5 worker goroutines per instance. Scale horizontally by adding more service instances.
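One way to realize that queue on PostgreSQL (hypothetical schema): a durable jobs table is the source of truth, NOTIFY is used purely as a wakeup signal, and FOR UPDATE SKIP LOCKED lets workers claim jobs without colliding:

```sql
CREATE TABLE segment_eval_jobs (
    id          BIGSERIAL PRIMARY KEY,
    payload     JSONB NOT NULL,                        -- serialized SegmentEvalJob
    enqueued_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    locked_at   TIMESTAMPTZ                            -- NULL = available
);

-- Producer: persist the job, then wake sleeping workers.
INSERT INTO segment_eval_jobs (payload) VALUES ('{"patient_id": 42}');
NOTIFY segment_eval;

-- Worker (inside a transaction): claim one job.
SELECT id, payload
FROM segment_eval_jobs
WHERE locked_at IS NULL
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
```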
### Tier 3: Batch Rebuild — Full Re-evaluation

Target: Admin-triggered rebuilds or rule changes

Behavior:
- Rebuilds ALL members for a segment from scratch
- Clears existing `segment_members`, re-evaluates rules against all patients
- More efficient than per-patient evaluation at scale
Implementation:

```go
func (s *segmentService) Evaluate(ctx context.Context, segmentID int64) error {
	segment, err := s.repo.Get(ctx, segmentID)
	if err != nil {
		return err
	}

	// Build a single SQL query from all rules.
	// NOTE: placeholders inside the generated query must be numbered
	// from $3 onward, since $1/$2 are taken by segmentID and org below.
	query, args := s.buildBulkQuery(segment.Rules, segment.MatchMode, segment.OrganizationID)

	return s.db.WithTx(ctx, func(tx pgx.Tx) error {
		// Clear old members
		_, err := tx.Exec(ctx, "DELETE FROM segment_members WHERE segment_id = $1", segmentID)
		if err != nil {
			return err
		}

		// Insert new members
		_, err = tx.Exec(ctx, fmt.Sprintf(`
			INSERT INTO segment_members (segment_id, patient_id, organization_id, matched_at)
			SELECT $1, patient_id, $2, NOW()
			FROM (%s) AS matched_patients
		`, query), append([]any{segmentID, segment.OrganizationID}, args...)...)
		return err
	})
}
```

SLO: Full rebuild of 10,000-patient segment completes in < 30 seconds
When triggered:
- Admin manually requests rebuild (POST /v1/segments/{id}/evaluate)
- Segment rules are updated (automatic)
- Scheduled nightly rebuild for critical segments (optional)
## Tier Decision Logic
When a data change occurs (form save, profile update), determine which tier to use:
```go
func (s *segmentService) evaluateAfterFormSave(ctx context.Context, tx pgx.Tx, patientID int64, templateID int64) error {
	segments, err := s.findSegmentsByTemplate(ctx, templateID)
	if err != nil {
		return err
	}

	var asyncSegments []Segment
	for _, seg := range segments {
		if !s.isEligibleForInline(seg) {
			asyncSegments = append(asyncSegments, seg)
		}
	}

	// Tier 1: evaluate simple segments inline, within the form-save
	// transaction. evaluateInline re-derives the eligible segments
	// for this template itself.
	if err := s.evaluateInline(ctx, tx, patientID, templateID); err != nil {
		return err
	}

	// Tier 2: enqueue async evaluation for the complex segments.
	if len(asyncSegments) > 0 {
		s.enqueueAsync(ctx, SegmentEvalJob{
			PatientID:     patientID,
			SegmentIDs:    extractIDs(asyncSegments),
			TriggerSource: "form_save",
			TemplateID:    &templateID,
		})
	}
	return nil
}

func (s *segmentService) isEligibleForInline(seg Segment) bool {
	if len(seg.Rules) > 3 {
		return false
	}
	if hasNestedGroups(seg.Rules) {
		return false
	}
	if isMultiSource(seg.Rules) {
		return false
	}
	return true
}
```

## Scaling to 10,000+ Patients
### Database Optimization

Critical indexes:

```sql
-- Form lookup for segment evaluation (latest completed/signed per template)
CREATE INDEX idx_forms_segment_lookup
    ON forms(user_id, form_template_id, status, updated_at DESC)
    WHERE status IN ('completed', 'signed');

-- Appointment aggregates for segment rules
CREATE INDEX idx_appointments_segment_count
    ON appointments(user_id, organization_id, status);

-- Custom field values lookup
CREATE INDEX idx_custom_values_lookup
    ON custom_field_values(entity_type, entity_id, custom_field_id);

-- GIN index for JSONB form values
CREATE INDEX idx_forms_values ON forms USING GIN (values jsonb_path_ops);
```

Query plan validation:
Run EXPLAIN ANALYZE on worst-case segment queries:

```sql
EXPLAIN ANALYZE
SELECT p.id
FROM patients p
WHERE p.organization_id = 1
  -- Form rule (JSONB query)
  AND EXISTS (
    SELECT 1 FROM forms f
    WHERE f.patient_person_id = p.patient_person_id
      AND f.organization_id = 1
      AND f.form_template_id = 5
      AND f.status IN ('completed', 'signed')
      AND f.values->>'pain_level' = 'Big pain'
    ORDER BY f.updated_at DESC
    LIMIT 1
  )
  -- Profile rule (custom_field_values lookup)
  AND EXISTS (
    SELECT 1 FROM custom_field_values cfv
    JOIN custom_fields cf ON cf.id = cfv.custom_field_id
    WHERE cfv.entity_type = 'patient' AND cfv.entity_id = p.id
      AND cf.key = 'city' AND cfv.value = 'Bucharest'
  )
  -- Appointment aggregate
  AND (
    SELECT COUNT(*) FROM appointments a
    WHERE a.patient_person_id = p.patient_person_id
      AND a.organization_id = 1
      AND a.status = 'done'
  ) >= 2;
```

Expected: Index scans on all paths. If sequential scans appear, add targeted indexes or move the segment to the async tier.
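One caveat: the `values->>'pain_level' = '...'` predicate in the form rule cannot use a `jsonb_path_ops` GIN index; that operator class only accelerates containment (`@>`) queries. If the plan shows a sequential scan on forms, the rule can be rewritten as a containment test:

```sql
-- Containment form: eligible for the GIN (jsonb_path_ops) index on forms.values
AND f.values @> '{"pain_level": "Big pain"}'::jsonb
```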
### Worker Pool Sizing

Default configuration:
- 5 worker goroutines per service instance
- Each worker processes one SegmentEvalJob at a time
- Workers poll a shared queue (PostgreSQL table or Redis list)
Scaling strategy:

```text
Small orgs (< 1,000 patients):
→ 1 service instance, 5 workers
→ All segments can be inline (Tier 1)

Medium orgs (1,000 - 5,000 patients):
→ 2 service instances, 10 workers total
→ Simple segments inline, complex async

Large orgs (5,000 - 20,000 patients):
→ 4 service instances, 20 workers total
→ Most segments async, critical ones inline

Enterprise (20,000+ patients):
→ 8+ service instances, 40+ workers
→ Dedicated segment evaluation service
→ Consider materialized views for hot segments
```

Auto-scaling trigger: Queue depth > 50 pending jobs for > 2 minutes
### Caching Strategy

`segment_members` is the cache: a materialized view of rule evaluation results.

Cache invalidation:
- Tier 1: Updated inline during form save (< 50ms)
- Tier 2: Updated async (< 5s)
- Tier 3: Fully rebuilt (< 30s)

Freshness guarantee:
- The `matched_at` timestamp shows when membership was last confirmed
- No additional cache layer needed (avoid cache-on-cache complexity)

For guaranteed-fresh reads:

`GET /v1/segments/{id}/members?fresh=true`

Forces a Tier 3 re-evaluation before returning members. Rate-limited to 1 request per minute per segment.
## Performance Benchmarks

### Target SLOs
| Operation | Target | Measurement |
|---|---|---|
| Inline eval (Tier 1) | < 50ms added | p99 latency added to form save |
| Async eval (Tier 2) | < 5s end-to-end | Time from form save to segment_members update |
| Bulk rebuild (Tier 3) | < 30s for 10K patients | Wall clock for POST /segments/{id}/evaluate |
| Members query | < 100ms | GET /segments/{id}/members response time |
| Patient segments query | < 50ms | GET /patients/{id}/segments response time |
### Load Testing
Scenario 1: Concurrent form saves

```text
10 specialists × 5 patients/hour × 8 hours = 400 form saves/day
Each form save triggers 5 segments (3 inline, 2 async)

Expected:
- Inline eval: 3 × 400 = 1,200 evaluations, < 50ms each
- Async eval: 2 × 400 = 800 jobs queued, processed within 5s
- No queue backlog
```

Scenario 2: Full segment rebuild
```text
Segment with 4 complex rules, 10,000 patients

Expected:
- SQL query generation: < 10ms
- Query execution: < 25s
- Transaction commit: < 5s
- Total: < 30s
```

Scenario 3: Fresh member fetch

```text
Admin requests fresh members for segment with 5,000 matched patients

Expected:
- Tier 3 rebuild: < 25s
- Pagination query: < 100ms
- Total: < 26s
```

## Monitoring & Alerts
### Metrics to Track

```go
// Prometheus metrics
var (
	segmentEvalDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "segment_eval_duration_seconds",
			Buckets: []float64{0.01, 0.05, 0.1, 0.5, 1, 5, 10, 30},
		},
		// NOTE: segment_id is a potentially high-cardinality label;
		// cap segments per org (see Capacity Planning) or drop it at scale.
		[]string{"tier", "segment_id"},
	)

	segmentQueueDepth = prometheus.NewGauge(
		prometheus.GaugeOpts{
			Name: "segment_eval_queue_depth",
		},
	)

	segmentEvalErrors = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "segment_eval_errors_total",
		},
		[]string{"tier", "error_type"},
	)
)
```

### Alert Thresholds
| Alert | Threshold | Action |
|---|---|---|
| Inline eval exceeds SLO | p99 > 50ms for 5 min | Move affected segments to async tier |
| Async queue depth high | > 100 pending jobs | Scale worker goroutines or service instances |
| Async eval exceeds SLO | p99 > 10s for 5 min | Investigate query plans, add indexes |
| Tier 3 timeout | Any rebuild > 60s | Investigate query plans, consider partitioning |
| Eval error rate high | > 5% error rate for 5 min | Investigate logs, check database health |
## Optimization Techniques

### 1. Query Pushdown
Instead of evaluating each patient individually, push logic to SQL:
Before (per-patient evaluation):

```go
for _, patient := range patients {
	matched := evaluateRules(ctx, rules, patient.ID)
	if matched {
		insertMember(ctx, segmentID, patient.ID)
	}
}
// 10,000 patients = 10,000 queries
```

After (bulk query):

```sql
INSERT INTO segment_members (segment_id, patient_id, organization_id)
SELECT $1, p.id, $2
FROM patients p
WHERE <all_rules_as_sql>
-- 10,000 patients = 1 query
```

Speedup: 100x - 1000x for large segments
### 2. Targeted Evaluation
When a form save triggers segment evaluation, only evaluate segments that reference that form template:
```go
func (s *segmentService) findSegmentsByTemplate(ctx context.Context, templateID int64) ([]Segment, error) {
	return s.repo.FindByRuleFilter(ctx, func(rules []RuleNode) bool {
		return hasTemplateReference(rules, templateID)
	})
}

func hasTemplateReference(rules []RuleNode, templateID int64) bool {
	for _, rule := range rules {
		if rule.Source == "form" && rule.TemplateID != nil && *rule.TemplateID == templateID {
			return true
		}
		if rule.Group {
			if hasTemplateReference(rule.Rules, templateID) {
				return true
			}
		}
	}
	return false
}
```

Impact: A form save that references template 5 only evaluates segments with form rules for template 5, not all segments.
### 3. Short-Circuit Evaluation

For `match_mode: "any"`, stop evaluating as soon as one rule matches:
```go
func (e *ruleEvaluator) evaluateGroup(ctx context.Context, mode string, rules []RuleNode, patientID int64) (bool, error) {
	for _, rule := range rules {
		matched, err := e.evaluateNode(ctx, rule, patientID)
		if err != nil {
			return false, err
		}
		if mode == "any" && matched {
			return true, nil // Short-circuit OR
		}
		if mode == "all" && !matched {
			return false, nil // Short-circuit AND
		}
	}
	// "all" with no early exit means every rule matched;
	// "any" with no early exit means none did.
	return mode == "all", nil
}
```

Impact: Segments with `match_mode: "any"` can return after the first matching rule, often halving the number of rule queries.
### 4. Denormalization (Future Optimization)
For very hot segments (evaluated > 1000x/day), consider denormalizing:
Option A: Materialized View

```sql
CREATE MATERIALIZED VIEW hot_segment_1_members AS
SELECT p.id AS patient_id
FROM patients p
WHERE <all_rules_as_sql>;

CREATE UNIQUE INDEX ON hot_segment_1_members(patient_id);
```

Refresh strategy: Refresh every 5 minutes via cron job or trigger on data changes.
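If the refresh should not block readers, the unique index above also enables a concurrent refresh:

```sql
-- CONCURRENTLY avoids locking readers during the refresh;
-- it requires a unique index on the materialized view.
REFRESH MATERIALIZED VIEW CONCURRENTLY hot_segment_1_members;
```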
Option B: Dedicated Column

```sql
ALTER TABLE patients ADD COLUMN in_segment_1 BOOLEAN DEFAULT FALSE;

CREATE INDEX idx_patients_segment_1 ON patients(in_segment_1) WHERE in_segment_1 = TRUE;
```

Update the column on every data change (inline or async).
Trade-off: Faster reads, more complex writes. Only use for < 5 critical segments per org.
## Capacity Planning

### Single-Instance Limits
| Resource | Limit | Mitigation |
|---|---|---|
| Database connections | 100 max | Use connection pooling (pgBouncer) |
| Worker goroutines | 20 max | Scale horizontally (add instances) |
| Queue depth | 500 max | Auto-scale workers, rate-limit form submissions |
| Segment count | 100 per org | Soft limit enforced by UI, no technical constraint |
### Scaling Triggers
| Metric | Threshold | Action |
|---|---|---|
| CPU usage | > 70% sustained | Add service instance |
| Queue depth | > 100 for > 5 min | Add worker goroutines or instance |
| Database query time | p99 > 500ms | Add indexes, optimize queries, consider read replica |
| Eval error rate | > 1% | Investigate failing segments, add monitoring |
## Future Optimizations

### 1. Incremental Evaluation
Instead of re-evaluating all rules, only re-evaluate rules affected by the data change:
```text
Form save for template 5 → only re-evaluate rules with template_id = 5
Profile update for custom_field_id 10 (city) → only re-evaluate rules referencing custom_field_id = 10
```

Challenge: Requires tracking which rules reference which data sources (doable via a metadata index).
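That metadata index could be as simple as an in-memory map from data-source references to the segments whose rules use them (the `RuleRef` shape here is hypothetical):

```go
package main

import "fmt"

// RuleRef identifies a data source a rule depends on (hypothetical shape).
type RuleRef struct {
	Source string // "form" or "profile"
	ID     int64  // template_id or custom_field_id
}

// ruleIndex maps each data-source reference to the segment IDs whose rules use it.
type ruleIndex map[RuleRef][]int64

// add registers all of a segment's rule references in the index.
func (idx ruleIndex) add(segmentID int64, refs []RuleRef) {
	for _, r := range refs {
		idx[r] = append(idx[r], segmentID)
	}
}

// affected returns the segments to re-evaluate for a given data change.
func (idx ruleIndex) affected(ref RuleRef) []int64 {
	return idx[ref]
}

func main() {
	idx := ruleIndex{}
	idx.add(7, []RuleRef{{Source: "form", ID: 5}})
	idx.add(9, []RuleRef{{Source: "form", ID: 5}, {Source: "profile", ID: 10}})

	fmt.Println(idx.affected(RuleRef{Source: "form", ID: 5}))     // [7 9]
	fmt.Println(idx.affected(RuleRef{Source: "profile", ID: 10})) // [9]
}
```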
### 2. Parallel Evaluation
Evaluate multiple segments in parallel for a single patient:
```go
var wg sync.WaitGroup
results := make(chan EvalResult, len(segments))

for _, seg := range segments {
	wg.Add(1)
	go func(s Segment) {
		defer wg.Done()
		matched, err := evaluator.Evaluate(ctx, s.Rules, patientID)
		results <- EvalResult{SegmentID: s.ID, Matched: matched, Error: err}
	}(seg)
}

wg.Wait()
close(results)

// Drain the buffered channel and apply membership updates.
for res := range results {
	applyResult(res) // e.g. upsert membership, log res.Error
}
```

Impact: 2-3x speedup for patients with > 10 segments.
### 3. Segment Dependency Graph

If segments reference each other (future feature), build a DAG and evaluate in topological order:

```text
Segment A depends on Segment B
→ Evaluate B first, cache result, use in A's evaluation
```

Not needed in v1 (segments don't reference each other yet).
## Summary
| Tier | When | SLO | Use Case |
|---|---|---|---|
| Tier 1 (Inline) | Simple segments (≤3 rules, single source) | < 50ms | Critical real-time alerts |
| Tier 2 (Async) | Complex segments (multi-source, >3 rules) | < 5s | General-purpose segmentation |
| Tier 3 (Batch) | Full rebuilds, rule changes | < 30s | Admin operations, nightly jobs |
Key principle: Segment evaluation should never block data writes. Stale segment membership is acceptable; data loss is not.