KPIs for your operational processes

May 26, 2020

Bernat Fages

If you are managing an Operations team engaging in routine manual workflows, whether that is in the context of a Human in the Loop system or a simple fully manual process, you are probably wondering how you can improve your team's performance. To do that, you need to define metrics that reflect your team's performance. Good metrics will help you easily understand where your team is at and how they're trending.

Understanding your team's current performance is clearly important, but insight into performance trends matters just as much, because it lets you anticipate future issues. Ultimately, you want both an indicator of present performance and a leading indicator of future performance.

Defining Performance

But what does Performance really mean? Performance might mean different things to different people. For us, performance refers to three key dimensions of any operation:

  1. Cost. What is the cost of dealing with each single case?
  2. Latency. How fast are we reacting to each new case?
  3. Quality. How well is each case handled?

Let's examine each of these components below.


Cost

Each case sent to a process team will take a certain amount of time to be fulfilled. Because there is a definite economic cost to employing an Operations agent, the time it takes to fulfil a given task directly translates to a certain economic cost. By understanding how long it takes to complete each task in a manual workflow you can obtain a KPI representative of economic cost.

There are two reasons why you want to track and improve your cost KPI:

  1. Reduce the absolute cost of running a process.
  2. Increase your team's capacity.

If your goal is to increase the number of cases your team can handle and/or reduce the share of your budget allocated to a specific initiative, you definitely want to track cost.

A typical measure of operational cost is Average Handling Time (AHT). AHT represents how long it takes an agent to fulfil a task, on average.

At the very least, you should have an understanding of your global AHT over time. We recommend measuring it more granularly too, at the workflow and agent level. This will surface more concrete and actionable insights on where to focus to improve cost.
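As a rough illustration of the three levels of granularity, here is a minimal Python sketch. The task log, its field order, the agent names and the hourly rate are all hypothetical, chosen only to make the arithmetic easy to follow; your own system will store this data differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical task log: (workflow, agent, handling_time_seconds).
# Names and values are illustrative, not real data.
TASKS = [
    ("kyc", "alice", 120),
    ("kyc", "bob", 180),
    ("moderation", "alice", 30),
    ("moderation", "bob", 50),
]


def aht_by(tasks, key_index):
    """Average handling time in seconds, grouped by the given column."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task[key_index]].append(task[2])
    return {key: mean(times) for key, times in groups.items()}


def cost_per_case(aht_seconds, hourly_rate):
    """Convert an AHT into a cost per case, given an assumed hourly agent cost."""
    return aht_seconds / 3600 * hourly_rate


global_aht = mean(task[2] for task in TASKS)  # across the whole operation
aht_per_workflow = aht_by(TASKS, 0)           # grouped by workflow
aht_per_agent = aht_by(TASKS, 1)              # grouped by agent
```

With this sample data the global AHT is 95 seconds, which at an assumed hourly cost of 36 works out to 0.95 per case. In practice you would also bucket tasks by day or week to see how AHT trends over time.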


Latency

Latency describes how swiftly tasks are being handled. Whether your operation deals with identity verification, moderation, customer service, model validation or KYC, to name a few examples, you're very likely to be bound by an SLA that requires you to act fast on every newly created case.

The standard here is to measure Turnaround Time (TAT), which is defined as the time between creation and resolution of a task. Measuring both the mean and median values should already give you a fair picture of your team's reactivity. More rigorous operations will probably want to track 90th, 95th or 99th percentile TATs, particularly those dealing with tight SLAs.

TAT should at least be measured globally across the entire operation, and secondarily per workflow.
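The TAT statistics mentioned above can be sketched in a few lines of Python. The timestamps, the 30-minute SLA target and the nearest-rank percentile definition are assumptions made for illustration; other percentile definitions interpolate between values and will give slightly different numbers on small samples.

```python
import math
from datetime import datetime, timedelta
from statistics import mean, median

# Hypothetical task log of (created_at, resolved_at) pairs; values are illustrative.
BASE = datetime(2020, 5, 26, 9, 0)
RESOLUTION_MINUTES = [5, 7, 8, 10, 12, 15, 20, 30, 45, 120]
TASKS = [(BASE, BASE + timedelta(minutes=m)) for m in RESOLUTION_MINUTES]

SLA_MINUTES = 30  # assumed SLA target, purely for illustration


def turnaround_minutes(created_at, resolved_at):
    """TAT: time between creation and resolution of a task, in minutes."""
    return (resolved_at - created_at).total_seconds() / 60


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


tats = [turnaround_minutes(created, resolved) for created, resolved in TASKS]
summary = {
    "mean": mean(tats),
    "median": median(tats),
    "p90": percentile(tats, 90),
    "p95": percentile(tats, 95),
    "p99": percentile(tats, 99),
    "sla_breach_rate": sum(t > SLA_MINUTES for t in tats) / len(tats),
}
```

In this sample the median is 13.5 minutes while the mean is 27.2, pulled up by a single 120-minute case. That gap is the reason to look at the median and the high percentiles alongside the mean when you are held to an SLA.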


Quality

If you simply optimise for reducing cost and latency, you will most likely compromise on the quality of your team's work. That's because speed is almost always at odds with thoughtfulness and accuracy. Quality is the other side of the same coin.

Consequently, holding your team accountable to a certain quality standard and ensuring this standard is being monitored becomes crucial. You don't want your cost reduction efforts to impact your operation's quality more than you'd be willing to accept, and vice versa.

There are two main approaches to measuring quality, depending on whether you have subject matter experts (SMEs) available to assess quality on an ongoing basis:

  1. QA: SMEs evaluate a subset of completed tasks.
  2. Consensus: more than one rep handles the same task in parallel, and you assess how often their outcomes agree. It's a cheaper alternative to QA, used when SMEs are unavailable or too expensive.

Other known techniques such as Golden Set or Ground Truth fall into the QA category.

The metric that captures the level of agreement between workers is typically called Consensus or Consistency, while on the QA (benchmark) side we tend to call it Accuracy. These metrics should be computed globally, by workflow and, if possible, per agent.
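Both quality metrics reduce to simple ratios. The Python sketch below assumes one possible definition of each: consensus as the share of tasks where every rep reached the same outcome, and accuracy as the share of QA-sampled tasks where the agent's outcome matches the SME's verdict. The task ids, outcome labels and data are hypothetical.

```python
# Hypothetical consensus data: task id -> outcomes from reps who handled it in parallel.
CONSENSUS_TASKS = {
    "t1": ["approve", "approve"],
    "t2": ["approve", "reject"],
    "t3": ["reject", "reject"],
    "t4": ["approve", "approve"],
}

# Hypothetical QA sample: (agent_outcome, sme_outcome) for each reviewed task.
QA_SAMPLE = [
    ("approve", "approve"),
    ("reject", "approve"),
    ("reject", "reject"),
    ("approve", "approve"),
    ("approve", "approve"),
]


def consensus_rate(outcomes_by_task):
    """Share of tasks where all reps reached the same outcome."""
    agreed = sum(1 for outcomes in outcomes_by_task.values() if len(set(outcomes)) == 1)
    return agreed / len(outcomes_by_task)


def qa_accuracy(sample):
    """Share of reviewed tasks where the agent's outcome matches the SME's."""
    correct = sum(1 for agent_outcome, sme_outcome in sample if agent_outcome == sme_outcome)
    return correct / len(sample)


consensus = consensus_rate(CONSENSUS_TASKS)
accuracy = qa_accuracy(QA_SAMPLE)
```

Here consensus comes out at 0.75 and accuracy at 0.8. Filtering the inputs by workflow or by agent before calling these functions gives the more granular views.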

Wrapping up

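Below is a summary of what you should be measuring:

  1. Cost. Average Handling Time (AHT), globally over time, and per workflow and per agent.
  2. Latency. Turnaround Time (TAT), as mean and median plus 90th, 95th or 99th percentiles for tight SLAs, globally and per workflow.
  3. Quality. Accuracy (via QA) or Consensus, globally, per workflow and, if possible, per agent.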

Investing in these measurements and acting on them will benefit your organization in many ways. However, we know that building measurement on top of existing systems can be a daunting task, and securing Engineering resources can be tricky.

If you are working with a BPO (Business Process Outsourcing) provider, having visibility over these metrics will allow you to set quality, latency and cost-based incentives. This will help you align your outsourcing partners' interests with yours.

Human Lambdas provides out-of-the-box support for Measurement at all levels of an operation, and we are constantly working towards helping managers improve their understanding of their operations. If you are interested in trying out the Human Lambdas platform, we'll be happy to onboard you and help you become more data-driven.