Proven Software Engineering Metrics
At Haystack Analytics, we only provide software engineering metrics that are proven to help teams - either through our own validation or empirical scientific evaluation.
For example, our North Star metrics include the Four Key Metrics. Rigorous appraisal of these Four Key Metrics (in research led by Dr Nicole Forsgren) has shown that higher performers are 2x more likely to meet their commercial goals (productivity, profitability, market share, number of customers) and their non-commercial goals (quantity of products or services, operating efficiency, customer satisfaction, quality of products or services, and achieving organisational or mission goals). Indeed, companies which do well under these DevOps metrics show 50% higher market cap growth over 3 years. The mechanics of how metrics drive organisational performance are also well understood, and have been rationalised by industry leaders like Martin Fowler.
Harmful Software Engineering Metrics
Unfortunately, many of our competitors don't apply the same level of rigour to validating their metrics. Indeed, in many cases unvalidated metrics may prove harmful and actively damage your organisation.
Both Pluralsight Flow (previously known as GitPrime) and WayDev contain a code quality metric called "churn", defined as "code which is deleted or rewritten shortly after being written (less than 3 weeks old)". This metric is then used to compute an "efficiency" metric, assumed to be "the percentage of all contributed code which is productive work" for a given engineer. Not only have we seen no evidence that such a metric will actually help your organisation; it appears to fly in the face of the evidence we have on the importance of lowering Cycle Time.
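For illustration, a churn-and-efficiency calculation of the kind described above might look something like the following sketch. The data shape and the three-week threshold are our own reading of the description, not either vendor's actual implementation:

```python
from datetime import timedelta

# The "less than 3 weeks old" window from the churn definition above
CHURN_WINDOW = timedelta(weeks=3)

def churn_and_efficiency(changes):
    """changes: list of (lines_written, age_of_deleted_code) pairs.

    Each entry is a chunk of contributed lines; age_of_deleted_code is a
    timedelta if the chunk deletes/rewrites earlier code, or None if it
    is brand-new code. (This data shape is an assumption for illustration.)
    """
    total = 0
    churned = 0
    for lines, deleted_age in changes:
        total += lines
        if deleted_age is not None and deleted_age < CHURN_WINDOW:
            churned += lines
    churn_pct = churned / total if total else 0.0
    # "The percentage of all contributed code which is productive work"
    efficiency = 1.0 - churn_pct
    return churn_pct, efficiency
```

Note what this scheme implies: a team that ships an experiment, learns from customer feedback, and removes it within days is scored as *less* efficient than one that leaves the failed experiment in place.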
Companies with lower Cycle Time (time from development to code in production) are able to rapidly test ideas and gain quick customer feedback, resulting in a product that is more likely to satisfy customers. If you test something and your customers don't like it - you should be able to remove it without being punished for increasing some meaningless "churn" metric.
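As a rough sketch of how Cycle Time can be measured - assuming you can pair each change's first commit timestamp with its production deployment timestamp (the pairing and data shape here are our assumption):

```python
from datetime import datetime, timedelta
from statistics import median

def median_cycle_time(changes):
    """changes: list of (first_commit_at, deployed_at) datetime pairs,
    one per change shipped to production in the period of interest.

    Returns the median time from development to code in production.
    """
    return median(deployed - committed for committed, deployed in changes)
```

The median (rather than the mean) keeps one long-running change from masking how quickly the team usually ships.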
Other competitors of ours, like LinearB, place a particular focus on comparing individual engineers against each other. These comparisons also seem contrary to the evidence on the effectiveness of such management practices. In Project Oxygen, Google evaluated manager performance ratings and manager feedback from Google’s annual employee survey - they found that great managers are excellent coaches, but they do not micromanage.
Google also studied team performance in Project Aristotle, evaluating a total of 180 teams (including 115 project teams in engineering) using hundreds of double-blind interviews and quantitative data sources. Google's researchers found that the individual performance of team members is not significantly connected with team effectiveness, but psychological safety was essential. We have covered both the Project Oxygen and Project Aristotle research in a recent episode of our Engineering Insights podcast.
Finally, we see a lot of data being pulled solely from project management tools (like JIRA), without looking at what's happening in the Version Control System (like Git). Not only is it easier to manipulate metrics measured through a project management tool (as there's a further level of abstraction from what's going on in the source code), but simply through everyday differences in practice the data may not accurately reflect what's actually going on. We therefore advise against using a project management tool as your sole source of data for engineering metrics.
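One way to see the drift is to cross-check the two sources: compare when tickets were marked done against when the corresponding code was actually merged. The sketch below assumes a convention of referencing ticket IDs in commit messages; the data shapes and the data_drift helper are hypothetical:

```python
from datetime import datetime, timedelta

def data_drift(tickets, merges, tolerance=timedelta(days=2)):
    """tickets: {ticket_id: marked_done_at} from the project management tool.
    merges: {ticket_id: merged_at} recovered from the VCS, e.g. by scanning
    commit messages for ticket IDs (an assumed team convention).

    Returns ticket IDs whose project-management status diverges from what
    the version control history shows.
    """
    drifted = []
    for ticket_id, done_at in tickets.items():
        merged_at = merges.get(ticket_id)
        if merged_at is None or abs(merged_at - done_at) > tolerance:
            drifted.append(ticket_id)
    return drifted
```

Even a simple check like this will surface tickets closed long before (or long after) the code actually landed - exactly the gap that makes a project management tool unreliable as a sole data source.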
Due Diligence on DevOps Metrics
When adopting a new engineering metrics tool, be careful that you're not buying snake oil: ask for evidence that what's being measured will actually help your engineering organisation drive the growth of your business. Your leading indicators should be shown to move your North Star metrics, and your North Star metrics must ultimately drive your organisational goals.