Feature Flags in SaaS: Release Management Best Practices
Feature flags (also called feature toggles) let you deploy code to production with features turned off, then enable them selectively by user, tier, or percentage. They decouple deployment from release.
Done right, feature flags reduce risk, enable A/B testing, and let you gate features by subscription tier without code changes. Done wrong, they become technical debt that slows every deployment.
The Four Types of Feature Flags
1. Release Flags
Purpose: Ship incomplete features to production without exposing them to users.
Lifecycle: Short-lived — removed once the feature is fully rolled out.
Example: You're building a new dashboard. The code ships behind a flag for 2 weeks while the team finishes it. Once ready, flip to 100% and delete the flag.
2. Experiment Flags (A/B Testing)
Purpose: Test variations to measure impact on conversion or engagement.
Lifecycle: Medium-lived — removed after choosing a winner.
Example: Test two checkout flows. Track conversion for 2 weeks, pick the winner, remove the flag.
3. Operations Flags (Kill Switches)
Purpose: Instantly disable a feature or external dependency if it breaks.
Lifecycle: Long-lived — remain in place for critical dependencies.
Example: Payment provider goes down. Flip flag to disable checkout, show maintenance message. Flip back when healthy.
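A kill switch can be as small as one guarded branch. The sketch below is a minimal illustration, not a definitive implementation: the in-memory FlagStore, isEnabled, and handleCheckout are hypothetical names, and a real system would read flag state from a flag service rather than a Map.

```typescript
// Hypothetical in-memory flag store; a real system would query a flag service.
type FlagStore = Map<string, boolean>;

interface CheckoutResult {
  status: "ok" | "maintenance";
  message: string;
}

// A missing flag falls back to the default, so an unset kill switch
// never disables checkout by accident.
function isEnabled(flags: FlagStore, name: string, defaultValue = true): boolean {
  return flags.get(name) ?? defaultValue;
}

// `charge` stands in for the call to the payment provider.
function handleCheckout(flags: FlagStore, charge: () => string): CheckoutResult {
  if (!isEnabled(flags, "ops_stripe_payment_gate")) {
    // Flag flipped off: skip the provider entirely and show a maintenance message.
    return { status: "maintenance", message: "Checkout is temporarily unavailable." };
  }
  return { status: "ok", message: charge() };
}
```

Defaulting to enabled is a design choice for this example. For a dependency that is unsafe to call when flag state is unknown, the opposite default may be the better fit.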
4. Permission Flags (Tier Gating)
Purpose: Control access by subscription tier without code changes.
Lifecycle: Permanent — stay as long as pricing model exists.
Example: Advanced analytics gated behind plan === 'enterprise'. Auto-enables on upgrade via billing sync.
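One way to express tier gating is a lookup from permission flag to the lowest plan that unlocks it. This is a sketch under assumptions: the plan names other than enterprise, the flag names, and the hasAccess helper are all hypothetical.

```typescript
// Assumed plan ladder; only "enterprise" appears in the article.
type Plan = "free" | "pro" | "enterprise";

const PLAN_RANK: Record<Plan, number> = { free: 0, pro: 1, enterprise: 2 };

// Hypothetical entitlement table: lowest plan that unlocks each permission flag.
const MIN_PLAN = new Map<string, Plan>([
  ["perm_advanced_analytics", "enterprise"],
  ["perm_csv_export", "pro"],
]);

function hasAccess(plan: Plan, flag: string): boolean {
  const required = MIN_PLAN.get(flag);
  // Unknown flags are denied rather than silently granted.
  if (required === undefined) return false;
  return PLAN_RANK[plan] >= PLAN_RANK[required];
}
```

Because access is derived from the user's current plan on each check, an upgrade recorded by the billing sync takes effect on the next evaluation without a code change.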
Progressive Rollouts: The 5-25-50-100 Pattern
Never release to 100% at once. Use progressive rollouts:
- 5% for 24 hours — monitor error rates, latency, support tickets
- 25% for 48 hours — expand if metrics hold
- 50% for 48 hours — catch issues before they hit majority
- 100% — full rollout
This reduces incidents by 70-90% vs. big-bang releases. If something breaks at 5%, you've protected 95% of users.
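Progressive rollouts need stable bucketing: a user who saw the feature at 5% should still see it at 25%. A common way to get that is to hash the flag name and user ID into a bucket from 0 to 99. The sketch below uses a hand-written FNV-1a hash as an assumption. Managed flag services use their own hashing schemes, and the function names here are hypothetical.

```typescript
// FNV-1a 32-bit hash over UTF-16 code units: small, dependency-free,
// and stable across processes, which is all bucketing needs.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a (flag, user) pair to a stable bucket in [0, 100).
// Including the flag name keeps the same users from always going first.
function bucket(flag: string, userId: string): number {
  return fnv1a(`${flag}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the current percentage.
function inRollout(flag: string, userId: string, percent: number): boolean {
  return bucket(flag, userId) < percent;
}
```

Moving from 5 to 25 to 50 to 100 only raises the threshold, so each stage is a superset of the previous one and no user flips back and forth between stages.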
Server-Side vs Client-Side Evaluation
Server-side: Server checks flag state, decides what code to execute. Use for subscription checks, rate limiting, payment logic. User never sees flag logic.
Client-side: JavaScript checks flag, renders different UI. Use for UI experiments, non-sensitive features. Faster but exposes flag logic to browser.
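The two modes can coexist: the server evaluates every flag, then sends the browser only the resolved values that are safe to expose. A minimal sketch, assuming a hypothetical rule list (RULES), a clientVisible marker, and made-up rule logic:

```typescript
interface User {
  id: string;
  plan: string;
}

interface FlagRule {
  name: string;
  clientVisible: boolean; // may the resolved value be sent to the browser?
  evaluate: (user: User) => boolean;
}

// Hypothetical rules; in practice these would come from a flag service.
const RULES: FlagRule[] = [
  { name: "perm_advanced_analytics", clientVisible: false, evaluate: (u) => u.plan === "enterprise" },
  { name: "feat_new_checkout", clientVisible: true, evaluate: (u) => u.plan !== "free" },
];

// Server-side check: the rule logic never leaves the server.
function evaluateOnServer(user: User, flagName: string): boolean {
  for (const rule of RULES) {
    if (rule.name === flagName) return rule.evaluate(user);
  }
  return false; // unknown flags are treated as off
}

// Payload for the browser: resolved booleans only, and only for client-visible flags.
function clientFlags(user: User): Record<string, boolean> {
  const resolved: Record<string, boolean> = {};
  for (const rule of RULES) {
    if (rule.clientVisible) resolved[rule.name] = rule.evaluate(user);
  }
  return resolved;
}
```

The browser receives true or false for UI flags but never the targeting rules, and sensitive flags are left out of the payload entirely.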
Governance: Avoiding Flag Debt
Naming Conventions
Bad: new_feature_v2, test123
Good: feat_new_checkout, exp_onboarding_flow_v2, ops_stripe_payment_gate
Prefix by type: feat_, exp_, ops_, perm_
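The prefix convention is easy to enforce with a pattern check at flag-creation time. The sketch below assumes lowercase snake_case after the prefix, which the article implies but does not state. The helper names are hypothetical.

```typescript
// Allowed type prefixes, followed by lowercase snake_case segments (assumed style).
const FLAG_NAME_PATTERN = /^(feat|exp|ops|perm)_[a-z0-9]+(_[a-z0-9]+)*$/;

function isValidFlagName(name: string): boolean {
  return FLAG_NAME_PATTERN.test(name);
}

// Return the names that break the convention, for use in a lint step.
function invalidFlagNames(names: string[]): string[] {
  return names.filter((name) => !isValidFlagName(name));
}
```

With this pattern, the good examples above (feat_new_checkout, exp_onboarding_flow_v2, ops_stripe_payment_gate) pass, while new_feature_v2 and test123 are rejected for lacking a type prefix.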
Ownership
Every flag must have an owner responsible for removing it. Reassign during offboarding to prevent orphaned flags.
Cap Active Flags at 50-100
More than 100 active flags creates real cognitive load. Run quarterly audits: delete flags untouched for 30+ days, move permanent flags into an entitlement service, and remove concluded experiments.
CI/CD Integration
Add a CI check that fails if a flag has been active for more than 90 days without an update, is referenced in code but undefined, or is marked for deletion but still has code references.
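The three CI conditions can be sketched as one audit function. Everything here is an assumption for illustration: the FlagDefinition shape, the auditFlags name, and the idea that a separate step has already collected flag definitions and the flag names referenced in code.

```typescript
interface FlagDefinition {
  name: string;
  lastUpdated: Date;
  markedForDeletion: boolean;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns one message per violation; an empty array means the check passes.
function auditFlags(
  definitions: FlagDefinition[],
  referencedInCode: string[],
  now: Date,
  maxAgeDays = 90,
): string[] {
  const problems: string[] = [];
  const definedNames = definitions.map((d) => d.name);

  for (const def of definitions) {
    // Condition 1: active too long without an update.
    const ageDays = (now.getTime() - def.lastUpdated.getTime()) / MS_PER_DAY;
    if (ageDays > maxAgeDays) {
      problems.push(`${def.name}: no update for more than ${maxAgeDays} days`);
    }
    // Condition 3: marked for deletion but still referenced in code.
    if (def.markedForDeletion && referencedInCode.indexOf(def.name) !== -1) {
      problems.push(`${def.name}: marked for deletion but still referenced in code`);
    }
  }

  // Condition 2: referenced in code but never defined.
  for (const name of referencedInCode) {
    if (definedNames.indexOf(name) === -1) {
      problems.push(`${name}: referenced in code but not defined`);
    }
  }
  return problems;
}
```

A CI step would print the returned messages and exit non-zero when the list is not empty.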
Monitoring and Alerting
Alert on:
- Flag evaluation latency >100ms — flag service issue
- Sudden evaluation spike — loop or infinite retry
- KPI drops correlated with flag changes — roll back immediately
Integrate flag state into error tracking (Sentry) and analytics (Mixpanel) to correlate issues with rollouts.
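The latency alert can be approximated by timing each evaluation and reporting the slow ones. This is a sketch only: timedEvaluate and its onSlow callback are hypothetical, and in practice the callback would forward to whatever alerting or metrics pipeline is in place. The clock is injectable so the behavior can be tested without real delays.

```typescript
// Wrap a flag evaluation, measure how long it took, and report
// evaluations slower than the threshold (100 ms by default).
function timedEvaluate(
  flagName: string,
  evaluate: () => boolean,
  onSlow: (flagName: string, elapsedMs: number) => void,
  now: () => number = Date.now,
  thresholdMs = 100,
): boolean {
  const start = now();
  const value = evaluate();
  const elapsedMs = now() - start;
  if (elapsedMs > thresholdMs) {
    onSlow(flagName, elapsedMs);
  }
  // The evaluation result is returned unchanged; timing is a side channel.
  return value;
}
```

Counting calls inside the same wrapper would be one way to detect the sudden evaluation spikes mentioned above.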
Common Mistakes
- Nested flags: if (flagA && flagB) creates exponential complexity, since n interacting flags give up to 2^n states to test
- No rollback plan: Define trigger conditions (error rate >1%, latency >500ms)
- Using flags for config: API URLs, DB creds belong in env vars, not flags
- No analytics events: Fire event on every evaluation to measure adoption
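Rollback triggers work best when written down as data rather than left to judgment during an incident. A minimal sketch using the thresholds mentioned above (error rate above 1%, latency above 500 ms). The type names and the rollbackReasons helper are hypothetical, and which latency percentile to measure is left as an assumption for the caller.

```typescript
interface RolloutMetrics {
  errorRate: number; // fraction of failed requests, e.g. 0.02 for 2%
  latencyMs: number; // whichever latency percentile the team tracks (assumption)
}

interface RollbackTriggers {
  maxErrorRate: number;
  maxLatencyMs: number;
}

const DEFAULT_TRIGGERS: RollbackTriggers = { maxErrorRate: 0.01, maxLatencyMs: 500 };

// Returns the reasons to roll back; an empty array means the rollout may continue.
function rollbackReasons(
  metrics: RolloutMetrics,
  triggers: RollbackTriggers = DEFAULT_TRIGGERS,
): string[] {
  const reasons: string[] = [];
  if (metrics.errorRate > triggers.maxErrorRate) {
    reasons.push("error rate above threshold");
  }
  if (metrics.latencyMs > triggers.maxLatencyMs) {
    reasons.push("latency above threshold");
  }
  return reasons;
}
```

Any non-empty result would be the signal to flip the flag off for the current cohort.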
Feature Flag Tools
Self-hosted: Unleash (open-source, enterprise), Flagsmith (lightweight)
Managed: LaunchDarkly (enterprise standard), PostHog (flags + analytics), Statsig (generous free tier)
For startups, PostHog or Statsig. For enterprise, LaunchDarkly.
For billing tier-gating, see Stripe Subscription Setup. For scaling, see Scaling SaaS from MVP to Enterprise.
FAQs
What is a feature flag in SaaS?
A conditional toggle in code that enables/disables features at runtime without deploying new code. Used for progressive rollouts, A/B testing, tier gating, and kill switches.
What's the difference between feature flags and environment variables?
Feature flags control features dynamically (per user, tier, percentage). Environment variables configure infrastructure (API keys, DB URLs) set at deploy time.
How long should a feature flag live?
Release flags: days to weeks. Experiment flags: weeks to months. Kill switches: permanent. Permission flags: permanent. Anything active >90 days without updates is debt.
Should I use server-side or client-side feature flags?
Server-side for security-sensitive logic (tier checks, payments, rate limiting). Client-side for UI experiments. If in doubt, use server-side—more secure.