Most ops dashboards have 40 metrics. Ops leaders look at maybe six. The other 34 exist because someone requested them in 2022 and no one has the heart to remove them.
This article is about the six. The KPIs that operations leaders at $30M-$500M companies actually use to make decisions, with formulas, benchmark ranges from APQC, Gartner, ITIL, and SHRM, and notes on when each is useful versus when it turns into a vanity number. If you read one section, read the one near the end on picking a starter KPI set. It names the five KPIs to track if you could only track five.
What KPIs matter most for operations leaders?
Operations KPIs fall into seven categories: throughput, quality, utilization, cost, speed, reliability, and people. Each answers a different question about how work flows through the business. A good dashboard picks two to three metrics per category, not forty. The table below shows the core metrics per category with formulas and benchmark ranges for growth-stage companies. Treat it as a menu.
| Category | Core KPI | Formula | Benchmark (mid-market) | Source |
|---|---|---|---|---|
| Throughput | Cycle time | End time - start time (active work only) | Workflow-specific; target 20-40% below lead time | APQC |
| Throughput | Work in progress (WIP) | Count of items currently in flight | 1-2x daily throughput rate | Lean |
| Quality | First-pass yield | Units right the first time / total units | 90%+ for knowledge work, 95%+ for manufacturing | Six Sigma / APQC |
| Quality | Error rate | Errors / total outputs | Under 2% for core processes | ISM |
| Utilization | Capacity utilization | Actual output / potential output | 70-85% (sustained) | Federal Reserve / Gartner |
| Utilization | Billable utilization | Billable hours / available hours | 65-80% for services firms | SPI Research |
| Cost | Cost per transaction | Total process cost / transaction count | Varies; track trend over absolute number | APQC |
| Cost | OpEx ratio | Operating expenses / revenue | 60-80% depending on industry | McKinsey |
| Speed | Time-to-resolution | Ticket close time - open time | Under 24 hours for Tier 1 | ITIL |
| Speed | Time-to-onboard | First fully productive day - start date | 30-90 days for knowledge roles | SHRM |
| Reliability | SLA attainment | SLAs met / SLAs committed | 95%+ for critical systems | ITIL / Gartner |
| Reliability | MTTR | Total downtime / number of incidents | Under 5 hours for critical systems | ITIL |
| People | eNPS | % promoters - % detractors | 10-30 acceptable, 30+ strong | Culture Amp |
| People | Regrettable attrition | Regretted leavers / headcount | Under 5% annually | SHRM |
What are the most important throughput KPIs?
Throughput KPIs measure how much work moves through a process and how fast. The four that matter are cycle time, lead time, work in progress, and throughput rate. No single one tells the full story on its own.
Cycle time
Cycle time is the active time a unit of work spends being worked on, not counting queue and wait states. Formula: end timestamp minus start timestamp, counting only active handling. APQC tracks cycle time across most process frameworks, with top-performer targets typically 40-60% faster than median.
Cycle time is the most underused KPI we see. Most teams track lead time and stop there. If cycle time is low but lead time is high, the problem is queues and handoffs, not capacity. The common mistake is treating cycle time as an individual productivity metric. It's a property of the process, not the person.
Lead time
Lead time is the total elapsed time from work entering the system until completion, including every wait state. Formula: completion timestamp minus request timestamp. Lead time is what the customer experiences. Cycle time is what the operator experiences. APQC data shows order-to-cash lead times of 2.8 days for top performers versus over 7 days for bottom quartile. Averaging lead time across mixed work types destroys the signal.
Work in progress (WIP)
WIP counts items currently in flight. Formula: open items at a point in time. Lean treats WIP as the most diagnostic throughput metric. Little's Law ties the three together: average WIP equals throughput rate times average cycle time. Keep WIP between 1x and 2x daily throughput. A queue of 400 open tickets isn't demand. It's a broken intake.
Throughput rate
Throughput rate measures completed units per time period. Formula: units completed / time window. The simplest and most abused metric. Teams raise throughput by cutting corners on quality, which is why it should never be tracked without first-pass yield next to it.
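To make the formulas concrete, here is a minimal sketch of how the four throughput metrics, plus the Little's Law check, could be computed from a ticket export. The field names (`created_at`, `started_at`, `closed_at`) and the numbers are illustrative assumptions, not any particular tool's schema.

```python
from datetime import datetime, timedelta

# Illustrative ticket export; field names are assumptions, not a specific tool's schema.
tickets = [
    {"created_at": datetime(2025, 3, 3, 9, 0),
     "started_at": datetime(2025, 3, 4, 10, 0),
     "closed_at":  datetime(2025, 3, 4, 15, 0)},
    {"created_at": datetime(2025, 3, 3, 11, 0),
     "started_at": datetime(2025, 3, 5, 9, 0),
     "closed_at":  datetime(2025, 3, 5, 16, 0)},
]

def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600

# Cycle time: active handling only (started -> closed).
avg_cycle_h = sum(hours(t["closed_at"] - t["started_at"]) for t in tickets) / len(tickets)
# Lead time: everything the requester waits through (created -> closed).
avg_lead_h = sum(hours(t["closed_at"] - t["created_at"]) for t in tickets) / len(tickets)

# Throughput rate: completed units per day over the observed window.
window_days = 3
throughput_per_day = len(tickets) / window_days

# Little's Law sanity check: average WIP ~= throughput rate x average cycle time.
expected_wip = throughput_per_day * (avg_cycle_h / 24)

print(f"cycle {avg_cycle_h:.1f}h, lead {avg_lead_h:.1f}h, "
      f"throughput {throughput_per_day:.2f}/day, expected WIP {expected_wip:.2f}")
```

If measured WIP sits far above what Little's Law predicts, the extra items are sitting in queues, which points back at intake and handoffs rather than capacity.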
What quality KPIs should operations track?
Quality KPIs measure how often outputs meet standards the first time and how often they need rework. The four core quality metrics are first-pass yield, error rate, rework ratio, and defect density.
First-pass yield (FPY)
First-pass yield is the percentage of units that complete a process correctly without any rework. Formula: units passing the first time / total units. Six Sigma treats FPY as the single most useful quality metric because it captures rework cost in one number. A process at 80% FPY is losing 20% of capacity to fixing its own mistakes. APQC benchmarks show top performers at 95%+ on knowledge work and 98%+ on manufacturing steps. Below 85% usually signals a systemic gap: unclear acceptance criteria, missing input validation, or inadequate training.
Error rate
Error rate is the percentage of outputs containing a defect, regardless of whether it was caught. Formula: outputs with errors / total outputs. The Institute for Supply Management (ISM) uses error rate in supplier scorecards, with 2% treated as the threshold for routine transactions. It matters most in regulated workflows where individual errors carry external cost.
Rework ratio
Rework ratio measures the share of output requiring rework after initial completion. Formula: rework hours / total production hours. The direct economic complement to FPY. At 80% FPY, rework can consume 30-50% of team capacity. Teams that have lived with 20% rework stop seeing it. The capacity is recoverable once visible.
Defect density
Defect density measures defects per unit of output, used mostly in software and manufacturing. Formula: defects / units produced. Most useful for comparing processes or teams over time, not as an absolute target.
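Here is a short sketch of all four quality metrics computed over one batch of work items. The record layout and figures are hypothetical; swap in whatever your quality log actually captures.

```python
# Hypothetical batch of completed work items; fields and values are illustrative.
items = [
    {"passed_first_time": True,  "defects": 0, "rework_hours": 0.0, "total_hours": 2.0},
    {"passed_first_time": False, "defects": 2, "rework_hours": 1.5, "total_hours": 3.5},
    {"passed_first_time": True,  "defects": 0, "rework_hours": 0.0, "total_hours": 2.2},
    {"passed_first_time": False, "defects": 1, "rework_hours": 0.8, "total_hours": 2.8},
]

total = len(items)

# First-pass yield: share of units right the first time.
fpy = sum(i["passed_first_time"] for i in items) / total

# Error rate: share of outputs containing at least one defect, caught or not.
error_rate = sum(i["defects"] > 0 for i in items) / total

# Rework ratio: rework hours as a share of all production hours.
rework_ratio = sum(i["rework_hours"] for i in items) / sum(i["total_hours"] for i in items)

# Defect density: defects per unit of output.
defect_density = sum(i["defects"] for i in items) / total

print(f"FPY {fpy:.0%}, error rate {error_rate:.0%}, "
      f"rework ratio {rework_ratio:.0%}, defect density {defect_density:.2f}")
```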
How do you measure utilization correctly?
Utilization KPIs measure how fully capacity is being used. The three most relevant for operations are capacity utilization, billable utilization, and meeting load. Utilization is the category most prone to misuse because the intuitive goal (maximize it) is the wrong goal.
Capacity utilization
Capacity utilization is the share of maximum sustainable output a system is currently producing. Formula: actual output / potential output. The Federal Reserve reports U.S. manufacturing capacity utilization at 75.6% in late 2024, 2.6 points below the historical average. Gartner's server utilization guidance sits in the same range: 70-85% for optimal performance. The target is not 100%. Sustained utilization above 90% predicts team burnout and quality collapse. Below 60% usually signals overcapacity.
Billable utilization
Billable utilization is the percentage of working hours billable to clients, used primarily in services firms. Formula: billable hours / available hours. Services Performance Insight (SPI) benchmarks target 70-80%, with 75% as the sustainable midpoint. Treating 90% as success means people are billing every waking hour, which destroys training, internal projects, and retention.
Meeting load
Meeting load measures the share of the workweek consumed by meetings. Formula: meeting hours / available hours. Microsoft's 2023 Work Trend Index found managers at high-performing companies spend about 30% of their week in meetings; at low-performing companies the figure climbs past 50%. Above 40% is a structural problem.
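Here is how the three utilization ratios might look side by side, with made-up weekly figures standing in for real ones.

```python
# Illustrative weekly numbers for one eight-person ops team; all figures are invented.
potential_output_units = 500      # sustainable weekly capacity
actual_output_units = 390

available_hours = 8 * 40          # 8 people x 40 hours
billable_hours = 230
meeting_hours = 96

capacity_utilization = actual_output_units / potential_output_units   # target 70-85% sustained
billable_utilization = billable_hours / available_hours               # target roughly 70-80%
meeting_load = meeting_hours / available_hours                        # above 40% is structural

print(f"capacity {capacity_utilization:.0%}, billable {billable_utilization:.0%}, "
      f"meetings {meeting_load:.0%}")
```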
What cost KPIs do operations leaders use?
Cost KPIs measure how efficiently operations convert spend into output. The three that move the needle are cost-per-transaction, cost-per-output, and OpEx ratio.
Cost per transaction
Cost per transaction is the total cost to complete one unit of a defined process. Formula: total process cost / transaction volume. APQC benchmarks show wide variance by industry, so the useful signal is almost always the trend, not the absolute number. A cost per transaction rising 20% over four quarters on flat volume points to a scaling problem. Comparing across businesses without normalizing scope is meaningless. One company's invoice cost includes dispute handling, another's doesn't.
Cost per output
Cost per output is the flexible version: total cost divided by a meaningful output unit (deal closed, case resolved, product shipped). When a transaction isn't the right denominator, use output.
OpEx ratio
Operating expense ratio measures operating expenses as a percentage of revenue. Formula: OpEx / revenue. McKinsey operational benchmarks place healthy ratios between 60% and 80% by industry, with SaaS trending lower and manufacturing higher. A rising OpEx ratio while revenue grows signals operational scale problems before they show up in margin compression. For connecting these cost signals to underlying process friction, see how to measure operational friction across departments.
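A small sketch of tracking cost per transaction and OpEx ratio as trends rather than point values; the quarterly figures below are invented for illustration.

```python
# Hypothetical quarterly figures; the point is the trend, not the absolute numbers.
quarters = [
    {"name": "Q1", "process_cost": 120_000, "transactions": 15_000, "opex": 2_900_000, "revenue": 4_100_000},
    {"name": "Q2", "process_cost": 131_000, "transactions": 15_200, "opex": 3_050_000, "revenue": 4_300_000},
    {"name": "Q3", "process_cost": 146_000, "transactions": 15_100, "opex": 3_240_000, "revenue": 4_450_000},
]

for q in quarters:
    cost_per_txn = q["process_cost"] / q["transactions"]
    opex_ratio = q["opex"] / q["revenue"]
    print(f'{q["name"]}: cost/transaction ${cost_per_txn:.2f}, OpEx ratio {opex_ratio:.0%}')

# Cost per transaction rising on flat volume is the scaling warning described above.
```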
Which speed KPIs correlate with revenue?
Speed KPIs measure how quickly customer-facing or revenue-impacting processes close out. The three that correlate most directly with revenue are time-to-close, time-to-resolution, and time-to-onboard.
Time-to-close
Time-to-close is the average elapsed time from deal creation to signed contract. Formula: close date minus creation date. Shorter time-to-close correlates with higher close rate because stalled deals lose momentum and die. Gartner's 2025 Sales Operations Benchmark Report shows top-quartile B2B companies closing deals 30-40% faster than median. One warning: shortening time-to-close by disqualifying marginal deals early improves the number while killing pipeline.
Time-to-resolution
Time-to-resolution measures the average time to close a customer issue. Formula: close time minus open time. ITIL and Gartner guidance converges on under 24 hours for Tier 1 and under 72 hours for Tier 2, with the caveat that resolution quality matters more than resolution time. A quick close with a recurring ticket behind it is worse than a slower first-time-right resolution.
Time-to-onboard
Time-to-onboard measures elapsed days from employee start date to full productivity. Formula: first-full-output day minus start date. SHRM research pegs average time-to-onboard at 30-90 days for knowledge roles. Cutting this in half through structured onboarding and systems access automation is one of the highest-impact operational improvements a growing company can make.
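One way the three speed metrics could be computed from timestamps; the record shapes below are assumptions for illustration, not a CRM or HRIS schema.

```python
from datetime import date, datetime

# Illustrative records; field names and dates are made up.
deals = [
    {"created": date(2025, 1, 6),  "closed": date(2025, 2, 18)},
    {"created": date(2025, 1, 20), "closed": date(2025, 3, 3)},
]
tickets = [
    {"opened": datetime(2025, 2, 1, 9, 0),  "resolved": datetime(2025, 2, 1, 17, 30)},
    {"opened": datetime(2025, 2, 2, 14, 0), "resolved": datetime(2025, 2, 4, 10, 0)},
]
hires = [
    {"start": date(2025, 1, 13), "fully_productive": date(2025, 3, 10)},
]

# Time-to-close in days, time-to-resolution in hours, time-to-onboard in days.
time_to_close = sum((d["closed"] - d["created"]).days for d in deals) / len(deals)
time_to_resolution = sum((t["resolved"] - t["opened"]).total_seconds() / 3600
                         for t in tickets) / len(tickets)
time_to_onboard = sum((h["fully_productive"] - h["start"]).days for h in hires) / len(hires)

print(f"time-to-close {time_to_close:.0f} days, "
      f"time-to-resolution {time_to_resolution:.0f} hours, "
      f"time-to-onboard {time_to_onboard:.0f} days")
```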
What reliability KPIs matter?
Reliability KPIs measure how consistently systems and processes perform to committed levels. The three core metrics come from ITIL and site reliability practice: SLA attainment, MTBF, and MTTR.
SLA attainment
SLA attainment is the percentage of service-level commitments met over a period. Formula: SLAs met / SLAs committed. ITIL treats 95%+ as the minimum for customer-facing critical systems, with 99%+ expected on tier-one commitments. Gartner's 2025 IT Key Metrics Data places median enterprise SLA attainment in the mid-90s. If you're hitting 100% every month, the SLAs are too loose to mean anything.
Mean time between failures (MTBF)
MTBF measures the average operating time between system failures. Formula: total operating time / number of failures. World-class reliability targets 99%+ availability on critical systems, which requires MTBF roughly a hundred times MTTR, since availability equals MTBF / (MTBF + MTTR). Atlassian's incident management benchmarks note MTBF becomes meaningful once you have at least 10 incidents of a given type. Below that sample size, variance dominates.
Mean time to resolution (MTTR)
MTTR measures the average time to restore service after failure. Formula: total downtime / number of incidents. ITIL guidance targets MTTR under 5 hours for reliable system performance. MTBF and MTTR always go together. High MTBF with high MTTR means failures are rare but brutal, which is a different problem than frequent short failures.
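A sketch tying the three reliability metrics together for one system over one month. The incident data is invented, and as noted above, an MTBF figure from only a handful of incidents is indicative at best.

```python
# Hypothetical month of incident data for one critical system.
operating_hours = 720                      # 30 days
incident_downtime_hours = [2.5, 1.0, 4.5]  # downtime per incident
slas_committed = 40
slas_met = 38

sla_attainment = slas_met / slas_committed                      # target 95%+ on critical systems
total_downtime = sum(incident_downtime_hours)
mttr = total_downtime / len(incident_downtime_hours)            # target under 5 hours
mtbf = (operating_hours - total_downtime) / len(incident_downtime_hours)
availability = mtbf / (mtbf + mttr)                             # 99%+ for critical systems

print(f"SLA {sla_attainment:.0%}, MTTR {mttr:.1f}h, MTBF {mtbf:.0f}h, "
      f"availability {availability:.2%}")
```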
Goodhart's law and the KPI trap
When a measure becomes a target, it stops being a good measure. Goodhart's law is the reason KPI dashboards quietly fail over two or three quarters. We've watched it happen. Hospitals reduce length-of-stay by discharging patients early and readmission rates spike. Sales teams hit call-count targets by making throwaway calls. Support teams close tickets fast by closing them wrong. The fix is tracking paired metrics: throughput with quality, speed with SLA attainment, utilization with attrition. Any single KPI can be gamed. A pair of KPIs pointed at the same underlying outcome is much harder to fake.
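One way to operationalize the pairing idea, sketched with invented weekly snapshots: flag any period where the target metric improves while its paired quality metric degrades.

```python
# Hypothetical weekly snapshots pairing throughput with first-pass yield.
weeks = [
    {"week": "W1", "throughput": 120, "fpy": 0.93},
    {"week": "W2", "throughput": 135, "fpy": 0.91},
    {"week": "W3", "throughput": 155, "fpy": 0.84},   # output up, quality down
]

for prev, curr in zip(weeks, weeks[1:]):
    throughput_up = curr["throughput"] > prev["throughput"]
    quality_down = curr["fpy"] < prev["fpy"] - 0.02    # illustrative tolerance
    if throughput_up and quality_down:
        print(f'{curr["week"]}: throughput rose but FPY fell to '
              f'{curr["fpy"]:.0%} -- check for corner-cutting')
```

The same pattern works for speed paired with SLA attainment, or utilization paired with regrettable attrition.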
What people KPIs should operations include?
People KPIs measure the health of the team doing the work. The three most predictive are employee engagement, eNPS, and regrettable attrition. Ops leaders who skip these measure machines and ignore the humans running them.
Employee engagement
Employee engagement measures the share of employees actively engaged at work, typically through surveys. SHRM's 2025 benchmarking places global engagement between 20% and 35%, meaning roughly two-thirds or more of most workforces are neutral or disengaged. Gallup estimates the cost at $8.9 trillion annually in global productivity loss (2023). Engagement correlates more strongly with process quality than with output volume. Disengaged teams ship more errors and higher rework ratios.
Employee net promoter score (eNPS)
eNPS asks one question: on a 0-10 scale, how likely are you to recommend working here? Formula: % promoters (9-10) minus % detractors (0-6). Culture Amp reports the January 2025 global average at 17, with 10-30 acceptable and 30+ strong. Below 0 predicts attrition 6-12 months out. Running eNPS annually wastes it. Quarterly or monthly cadence makes it useful.
Regrettable attrition
Regrettable attrition measures the rate at which high-performing employees voluntarily leave. Formula: regretted voluntary leavers / average headcount. Different from total turnover, which includes low performers leaving and terminations. SHRM guidance places healthy regrettable attrition under 5% annually for knowledge-work organizations. Losing a poor fit is not the same problem as losing a top performer.
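A minimal sketch of the eNPS and regrettable-attrition calculations; the survey scores and headcount figures are invented.

```python
# Hypothetical 0-10 survey responses and annual headcount figures.
scores = [9, 10, 8, 7, 6, 9, 3, 10, 8, 5]
regretted_leavers = 2
average_headcount = 60

# eNPS: promoters score 9-10, detractors score 0-6; passives (7-8) are excluded.
promoters = sum(s >= 9 for s in scores) / len(scores)
detractors = sum(s <= 6 for s in scores) / len(scores)
enps = (promoters - detractors) * 100          # reported as a whole number, not a percentage

# Regrettable attrition: regretted voluntary leavers over average headcount.
regrettable_attrition = regretted_leavers / average_headcount   # keep under ~5% annually

print(f"eNPS {enps:.0f}, regrettable attrition {regrettable_attrition:.1%}")
```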
How do you pick a starter KPI set?
Most operations teams start with too many KPIs. If you could only track five, pick one from each of these categories and ignore the rest for your first 90 days.
The Starter Five
1. Cycle time on your most painful workflow. Pick the process causing the most internal complaints. Measure how long it actually takes. That number will change how you run the next six months.
2. First-pass yield on the same workflow. Cycle time without quality is a lie. If FPY is 75%, your cycle time only describes 75% of the work. The other 25% comes back.
3. Capacity utilization across your ops team. Not maximum productivity. Actual utilization. Target 70-85% sustained. If you're at 95%, you have a staffing problem disguised as a productivity metric.
4. SLA attainment on your top three customer commitments. These are the promises that affect churn and reputation. If you're missing them, nothing else matters.
5. Regrettable attrition, tracked quarterly. The team that ships the work has to stay. If top performers are leaving, fix that before optimizing anything else.
Five metrics. Everything else is supporting measurement that lives in a deeper dashboard for when the starter five point to a problem. For how to build that deeper dashboard without overloading it, see building a data strategy for operations.
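If it helps to make the starter set concrete, here is one illustrative way to write it down as a simple config. The targets echo the benchmarks above; the structure itself is just a sketch, not a prescribed schema.

```python
# An illustrative starter-five definition: metric, scope, target, review cadence.
STARTER_FIVE = [
    {"kpi": "cycle_time",            "scope": "most painful workflow",      "target": "trend down",  "cadence": "weekly"},
    {"kpi": "first_pass_yield",      "scope": "same workflow",              "target": ">= 0.90",     "cadence": "weekly"},
    {"kpi": "capacity_utilization",  "scope": "ops team",                   "target": "0.70-0.85",   "cadence": "weekly"},
    {"kpi": "sla_attainment",        "scope": "top 3 customer commitments", "target": ">= 0.95",     "cadence": "weekly"},
    {"kpi": "regrettable_attrition", "scope": "whole team",                 "target": "< 0.05/year", "cadence": "quarterly"},
]
```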
How often should operations KPIs be reviewed?
Operations KPIs need three review cadences. Real-time alerting catches anything that breaches an SLA or threshold. It lives in the system, not in a meeting. Weekly reviews run during ops stand-ups and cover cycle time, WIP, first-pass yield, and SLA attainment. Monthly reviews focus on trend metrics like OpEx ratio, regrettable attrition, and cost per transaction, which move too slowly to review weekly.
Quarterly reviews are for KPI hygiene itself. Which metrics are we tracking that nobody looks at? Which decisions are we making without a supporting metric? Before that review, revisit the most common operational bottlenecks in growing companies to verify your KPIs still map to where friction lives.
What's the difference between a KPI and a vanity metric?
A KPI drives decisions. A vanity metric describes activity. The test: if the number moves, does someone do something different? If the answer is "look at it in the next meeting," it's vanity.
Common vanity metrics in ops dashboards: tickets processed (activity, not value), hours logged (time, not output), meetings attended (presence, not decisions), reports distributed (output, not outcome), projects initiated (start, not completion). Each has a legitimate version: tickets resolved right the first time, billable hours, decisions per meeting, reports actually used, projects completed. The vanity version is a proxy that's easier to measure and easier to game.
Aberdeen Group research on manufacturing performance management found that Best-in-Class companies distinguish themselves not by tracking more metrics but by acting on the ones they track. A dashboard with six KPIs driving weekly decisions beats one with forty that get reviewed quarterly.
Key takeaways
Operations KPIs fall into seven categories: throughput, quality, utilization, cost, speed, reliability, and people. A good dashboard picks two to three per category, not forty overall. Pair every throughput metric with a quality metric. Cycle time without first-pass yield is misleading. Throughput without error rate is dangerous.
Utilization targets cluster at 70-85% sustained across Federal Reserve, Gartner, and SPI benchmarks. Speed KPIs correlate with revenue. Reliability KPIs correlate with retention. Engagement and regrettable attrition predict process quality more reliably than most output metrics.
Goodhart's law is the core risk: any single metric is gameable once it's a target. The defense is paired metrics pointing at the same outcome. If you're starting fresh, measure five: cycle time, first-pass yield, capacity utilization, SLA attainment, and regrettable attrition. Get those right before adding a sixth. Once stable, a structured operations audit can identify the next layer worth tracking.
Next step
Ready to go AI-native?
Schedule 30 minutes with our team. We’ll explore where AI can drive the most value in your business.
Get in Touch