Software development is an "invisible" task
Outsiders find it difficult to appreciate the consequences of what they ask software developers to do. This is largely due to a lack of data and a limited understanding of how software developers actually operate.
Software ≠ Coding
There are common misunderstandings about what software engineers do, and more specifically, what makes a good software engineer. Some of the most impactful work software engineers do has nothing to do with the lines of code they write. To truly begin measuring engineer productivity, you have to understand their work and what 'successful outcomes' mean for software engineers.
Many engineering leaders, many opinions
From number of bugs to on-time delivery, this article shows how the approaches of each team can differ drastically. It's clear that software is misunderstood and that the market is attempting to quantify the complexity of software engineering with basic metrics. The real takeaway is that any software analytics company that focuses too narrowly on any particular metric is missing the big picture. Each team is different. Each engineer is different. Attempting to compare them using the same metrics inherently misses the mark.
Outcomes over outputs
is the typical statement made when evaluating which metrics to measure. The widely adopted book Accelerate specifies four KPIs that emphasize outcomes over outputs: Lead Time, Deployment Frequency, Change Fail Percentage, and Mean Time to Restore. These can work well at a high level, and they are important to track, but they give almost no visibility into how the team is actually working, how productive it is, or what steps you can take to improve.
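To make these four KPIs concrete, here is a minimal sketch of how they could be computed from deployment records. The record layout and numbers are illustrative assumptions, not a real data source:

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: (commit_time, deploy_time, caused_failure, restored_time).
# The layout and values are illustrative assumptions, not a real data source.
deploys = [
    (datetime(2023, 1, 2, 9), datetime(2023, 1, 3, 9), False, None),
    (datetime(2023, 1, 4, 9), datetime(2023, 1, 4, 15), True, datetime(2023, 1, 4, 17)),
    (datetime(2023, 1, 5, 9), datetime(2023, 1, 6, 9), False, None),
]

# Lead Time: average time from commit to deployment.
lead_time = sum((dep - com for com, dep, _, _ in deploys), timedelta()) / len(deploys)

# Deployment Frequency: deployments per day over the observed window.
window_days = max((deploys[-1][1] - deploys[0][1]).days, 1)
deploy_frequency = len(deploys) / window_days

# Change Fail Percentage: share of deployments that caused a failure.
change_fail_pct = 100 * sum(failed for _, _, failed, _ in deploys) / len(deploys)

# Mean Time to Restore: average time from a failing deploy to restoration.
restores = [rest - dep for _, dep, failed, rest in deploys if failed]
mttr = sum(restores, timedelta()) / len(restores)
```

Note that these numbers describe delivery outcomes only: a slow Lead Time tells you nothing about where the time went, which is exactly the visibility gap described above.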
9 Common Software Metrics
Additional high-level metrics to track team progress. Again, these are good practice for understanding improvement but give no visibility into how productive the team is or where it can improve.
Measurements and their shortcomings
From lines of code and revenue/cost savings to velocity, this article maps some of the attempts made at measuring engineering productivity and their shortcomings. Can individual performance be measured? Is it better to measure at the team or organization level? Or is it better to evaluate opportunities to improve? "Measure things that matter," and those things can differ across organizations. Use data to learn and improve, not to compare across teams or individuals.
History of measurements
This is an opinion piece that does a decent job of recounting the history of measurements and why they fail to truly measure engineering productivity. While we agree with the author when he states "there still doesn’t exist a reliable, objective metric of developer productivity", we strongly disagree with his end thesis that measurement is a lost cause. Funny thing is, he suggests focusing on "measuring anything that impedes progress, or the progress of delivering value to the customer" rather than attempting to measure and compare individual productivity with simplistic metrics. Enter Haystack.
Measuring individual performance is not possible
Many failed attempts have been made to measure this inherently complex function. We tend to agree, but the premise of the article is measuring productivity for individual engineers, which implies that individuals can be measured and weighed against each other in a quantitative manner. The problem is that each team, and each individual on the team, is a different person with different preferences, skills, and experiences. Comparing them with the same metrics is flawed.
It's interesting to note that the author's suggestions for ways to measure IC productivity "subjectively" can in fact be deduced using data. While we agree that engineer productivity is difficult to measure, we believe it's being looked at from the wrong perspective. The patterns that affect developers' productivity certainly can be measured, especially when you evaluate trends. This isn't to say that Jimmy should crunch out a feature every 2 days, but rather that "Jimmy's slowest features tend to be in areas where there is limited team-wide knowledge".
Each developer and team is a 'snowflake'
We can easily measure all kinds of things about software components, individuals, teams, and projects, but the metrics we choose will have no basis for comparison outside of their own scope. Otherwise you're comparing apples to oranges. The only valid comparison in all of these cases is relative to individual history, as an indicator of progress toward a goal, ideally a business goal.
'Successful outcomes' for software engineering are largely contextual.
'Success' for an engineer, team, or project can change over time and is largely situational. For example, imagine a team that has a huge spike in the number of features completed this sprint. Is that good? They did more than the other team, so at face value you can call it 'good'. But in reality no changes were made to process, automation, or the team itself, which indicates the team increased its bandwidth beyond its means in an unsustainable way. The only way to know that this is in fact detrimental to long-term performance is to analyze that team's history and determine that this is an anomaly; you'll see an immediate decrease in performance in the coming weeks. Having a single metric, in this case number of features completed, as an indicator of success will slowly drive teams into the unhealthy pattern of biting off more work than they can chew. This leads to an increase in defects, a decrease in knowledge sharing (as engineers focus solely on crunching out features), and eventually burnout. A 'successful' outcome one day becomes a nightmare the next.
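One way to make "relative to the team's own history" operational is a simple anomaly check against past sprints. This is a hedged sketch with made-up numbers, assuming nothing more than a list of features completed per sprint:

```python
from statistics import mean, stdev

# Made-up history for one team: features completed in each past sprint.
history = [8, 7, 9, 8, 7, 8, 9, 8]
this_sprint = 18  # the sudden spike described above

mu, sigma = mean(history), stdev(history)
z_score = (this_sprint - mu) / sigma

# A large deviation from the team's own baseline is a prompt to investigate
# (did process, automation, or staffing actually change?), not a result to celebrate.
is_anomaly = abs(z_score) > 3
```

The threshold of 3 standard deviations is an arbitrary illustration; the point is that the spike is only meaningful against this team's own baseline, never against another team's.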
Situational metrics over traditional KPIs
The nature of knowledge work, specifically the output of knowledge work, cannot be quantified at all times. In this case, traditional KPIs fail to accurately measure productivity and incentivize the wrong outputs (i.e. lines of code). 'There is convincing evidence that it is beneficial to setup situationally relevant metrics ... using a more sophisticated framework ... to quantify productivity from different dimensions will conduct a more balanced and fair result.'
Traditional measures of productivity don't work for knowledge workers
The traditional measure of productivity is calculated as "output per hour" of work. Knowledge workers are those who “think for a living,” making productivity challenging to measure, since outputs, and how to calculate them, vary widely.
Metrics are useless or evil
"Many programmers think that metrics are useless or evil, especially if they are used by management to evaluate and compare programmer productivity." Metrics like lines of code, number of coding hours, number of commits, etc. are overly simplistic measures of engineering productivity.
6 Factors with highest impact on productivity
From the review of more than 800 individual research papers and 35 meta analyses, the six factors that had the highest statistical association with the performance of teams involved in knowledge work are:
Having global metrics to measure engineers, teams, and projects violates the core cultural factors with the highest impact on software engineering productivity. Engineers know that no one metric should define their success. By using any metric to stack-rank and measure their 'productivity', you actively undermine many of the highest-impact factors of productivity, such as trust, perceived support from managers, and clarity of top-level goals. You can see how this approach has a negative impact on culture and thus on productivity.
Engineers and teams should be considered independently
Comparisons can only be made relative to their history and should not be used as a KPI. Considering each engineer and team independently allows engineering leaders to see trends and evaluate the key drivers of productivity as well as the main impediments. Metrics should be used to measure changes in productivity and to surface impediments to productivity; using metrics as KPIs results in over-simplification and incentivizes the wrong behavior. For example, if an engineer's KPI is Cycle Time, what happens to their code quality or their incentive to learn new skills when they're focused solely on speed? Always look at trends and tradeoffs, never at one metric. Faster delivery is great, but at what cost? Tradeoffs should always be considered.
Baseline metrics for engineers, teams, and projects
Using baseline metrics to spot bottlenecks, issues, and opportunities to improve is better because they are crafted for each engineer and team rather than relying on global, simplified metrics such as lines of code or number of commits. "Successful" outcomes can also be highlighted at this level. For example, an engineer being on-boarded onto a new project may have a slower Cycle Time, but their 'successful outcomes' have changed: increasing their domain knowledge would indicate success, rather than a team-wide Cycle Time metric. Always look at trends and trade-offs. Another example is a spike in Throughput. If an engineer gets 3x as much done this sprint, is that good? To know, you have to look across several factors. What happens to the quality of code when throughput increases? What happens to their long-term productivity? Was this good for the engineer's productivity?
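As a sketch of what a per-engineer baseline could look like in practice, the snippet below compares an engineer's latest Cycle Time to a rolling baseline built from their own history. The data, window size, and threshold are illustrative assumptions:

```python
def rolling_baseline(values, window=4):
    """Mean of the trailing `window` observations."""
    tail = values[-window:]
    return sum(tail) / len(tail)

# Hypothetical Cycle Times (days) for one engineer, most recent last.
cycle_times = [2.0, 2.5, 2.2, 2.3, 2.1, 4.8]

baseline = rolling_baseline(cycle_times[:-1])  # baseline from their own history
latest = cycle_times[-1]
relative_change = (latest - baseline) / baseline

# A large slowdown against their own baseline prompts a question ("is this
# on-boarding onto a new domain?") rather than a ranking against teammates.
needs_context = relative_change > 0.5
```

Because the comparison is against the engineer's own trailing history, a slow week during on-boarding reads as a change in context to ask about, not as underperformance against a global metric.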
Core productivity drivers and blockers
With baseline metrics for each engineer and team, we can use pattern recognition to identify the core drivers of, and impediments to, productivity. By focusing not on team-wide metrics and KPIs but on changes in productivity and opportunities to improve, we can preserve the cultural factors with the highest impact on productivity. This allows engineering leaders to introduce data without sacrificing culture: for example, using data to determine the impact of distracting meetings, technical debt, and overwork enables data-driven decision making while keeping culture intact.
Attempting to measure all engineers and teams against the same yardstick takes away from the actual goal. By analyzing each engineer and team independently, you can begin to truly understand the core drivers and blockers of productivity at each level. You cannot do this when you measure the entire team against the same metrics; you not only get a watered-down view of what impacts productivity, you in fact incentivize the wrong behaviors.
By considering each engineer and team independently, we can begin to measure and understand what impacts productivity at each level. These factors change over time, so although no single metric can measure engineer productivity, we can identify the factors that impact it. This allows us to pinpoint anomalies and measure trends without the need for an arbitrary metric to measure against.
The correct approach to utilizing data in engineering teams must take into account the subtleties of engineering culture. Engineering is incredibly complex, and no one metric defines a software engineer's productivity. Attempting to stack-rank or measure against simplified metrics and KPIs like lines of code or number of commits inherently hurts culture, because engineers know these metrics are not true indicators of success. The correct approach is to use data to pinpoint bottlenecks, issues, and trends at the individual and team level. This allows engineering leaders to measure, improve, and make decisions with data while maintaining engineering culture.