The Benefits of Hindsight: A Metrics-Driven Approach to Coaching

In a previous post, I described using a measure-question-learn-improve cycle to drive improvements to development processes. In this post, I assert that this cycle can also help people understand their own opportunities for improvement and growth. It can be a powerful coaching tool — when used correctly.


At Indeed, we’ve developed an internal web app called Hindsight that rolls up measurements of each individual’s work, making their contributions more transparent to them and their manager.

[Figure: a Hindsight card]

Each individual has a Hindsight card that shows their activity over time (quarter by quarter). Many of the numbers come from Jira, such as issues resolved, reported, commented on, etc. Others come from other SDLC tools. All numbers are clickable so that you can dive down into the details.
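
Hindsight’s internals are beyond the scope of this post, but to make the idea concrete, a single card number could be approximated with an IQL query against the jiraactions dataset described elsewhere in this series. This is only a sketch: the actor field and the username are hypothetical.

from jiraactions 2018-07-01 2018-10-01
where actor = 'jack' AND
      prevstatus != 'Resolved' AND
      status = 'Resolved'
select distinct(issuekey) /* issues jack resolved last quarter */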

When we introduced Hindsight, we worried about the Number Six Principle and Goodhart’s Law (explained in the earlier post). To protect against these negative effects, we constantly emphasize two guidelines:

  • Hindsight is a starting point for discussion. It can’t tell the whole story, but it can surface trends and phenomena that are worth digging into.
  • There are no targets. There’s no notion of a “reasonable number” for a given role and level, because that would quickly become a target. We even avoid analyzing medians/averages for the metrics included.

Hindsight in action: How’s your quality?

To see how Hindsight fits into the measure-question-learn-improve cycle, consider this example: Suppose my card shows that for last quarter I resolved 100 issues and had 30 issues reopened during testing. As my manager, you might be tempted to conclude, “Jack is really productive, but he ships a lot of buggy code and should pay more attention to quality.”

But remember: the metrics are only a starting point for discussion. You need to ask questions and dig into the data. When you read through the 30 reopened issues, you discover that only 10 of them were actual bugs, all relatively minor. Now the story changes. In fact, your investigation might drive insight into how the team can improve its communication around testing.

Measure, question, learn, improve

In this five-part series, I’ve explored how metrics help us improve the way we work at Indeed. Every engineering organization can and should use data to drive important conversations. Whether you use Imhotep, spreadsheets, or other tools, it’s worth doing. Start by measuring everything you can. Then question your measurements, learn, and repeat. You’ll soon find yourself in a rewarding cycle of continuous improvement.




Cross-posted on Medium.


What’s Up, ASF? Using Imhotep to Understand Project Activity

As I described in an earlier post, we built Imhotep as a data analytics platform for the rapid exploration and analysis of large time-series datasets. In the previous post, I showed how an Imhotep dataset based on Atlassian Jira can drive improvements to the development process.

We’re continually searching for new ways to collect metrics. Examining actions in Jira, the tool we use for tracking our development process, seemed like a natural fit for gaining process insights. We decided to find a way to convert Jira issue history for a large set of projects into an Imhotep dataset of actions, organized by time.

The open source Jira Actions Imhotep Builder transforms issue activity in a Jira instance into an Imhotep dataset. Each document in the resulting dataset corresponds to a single action on a Jira issue, such as creation, edit, transition, or comment.

The builder queries the Jira REST API for each issue in the specified time range, then deconstructs each issue into a series of actions. The actions are written to TSV (tab-separated values) files, which are uploaded to an Imhotep dataset.
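
For illustration, here is a hypothetical row from one of those TSV files, representing a single status transition. The real builder’s column set may differ; these fields are inferred from the queries shown later in this series, and timeinstate is the number of seconds the issue spent in the previous status.

time                 project  issuekey   issuetype  action  prevstatus       status    timeinstate
2018-10-01 12:34:56  HIVE     HIVE-1234  Bug        update  Patch Available  Resolved  367200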

Using that builder, we created a dataset of activity on projects in the Apache Software Foundation (from their Jira instance). We hope Apache projects take advantage of the dataset to gain insights about ways they can improve processes for their developer and user communities.

[Figure: Jira Imhotep Builder and Apache data]

Diving into the ASF Jira data

We created an Imhotep dataset of ASF Jira data from January 1, 2016 through the present. As of October 17, 2018, the dataset:

  • contains nearly 3.4 million Jira actions, including 230,298 issue creations, 1.8 million edits, and 1.3 million comments
  • requires only 274MB on disk, or about 81 bytes per action

The dataset, called apachejira, is available on the public Imhotep demo cluster at https://imhotep.indeed.tech/iql/.

Note: For optimal performance, view the IQL webapp on a desktop computer with a fast internet connection.

Using the apachejira dataset, we can answer many questions about what’s happening in ASF projects, such as the following examples.

Who reported the most bugs in ASF projects from July-September?
Beam JIRA Bot, with Sebb (presumably an actual person) in the #2 position:

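A query along these lines reproduces that chart. It’s only a sketch: the action and reporter field names are assumptions, since the apachejira schema isn’t listed here.

from apachejira 2018-07-01 2018-10-01
where issuetype = 'Bug' AND
      action = 'create'
group by reporter[10]
select count() /* bug creations per reporter, top 10 */

Swapping group by reporter[10] for group by project[10] produces the per-project view in the next question.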

Which projects have the most bugs reported from July-September?
Ignite edges out Ambari for the top spot, with 401 bugs reported.

[Figure: projects with the most bugs reported]

The next two questions explore some differences in project workflows.

How many distinct status values exist in the most active projects?
Five of the top ten projects have 6 distinct statuses, and the other five have 5 distinct statuses. For example, Apache Beam has 5, and Apache Hive has 6.

[Figure: distinct statuses in the most active projects]
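
One way to get those counts, as a sketch (ranking the top ten projects by overall action count is an assumption about what “most active” means):

from apachejira 2016-01-01 2018-10-17
group by project[10]
select count(), distinct(status) /* activity and distinct status count per project */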

How do the statuses used by Apache Beam and Apache Hive compare to one another?
Hive uses the Patch Available state; Beam doesn’t. It turns out that about 11% of Apache Jira projects take advantage of this state.

[Figure: status comparison, Beam vs. Hive]
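
A two-level grouping makes this kind of comparison easy (a sketch; the BEAM and HIVE project keys are assumptions):

from apachejira 2016-01-01 2018-10-17
where project in ('BEAM','HIVE')
group by project, status
select count() /* action counts for each status in each project */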

Which projects had the most contributors changing issue status to Patch Available in 2018?
Hadoop ecosystem projects (Hive, HDFS, Hadoop Common, YARN, HBase, and Hadoop Distributed Data Store) claim six of the top 10 spots.

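In IQL, that question might look like the following sketch, assuming an actor field records who performed each action:

from apachejira 2018-01-01 2018-10-17
where status = 'Patch Available'
group by project[10]
select distinct(actor) /* contributors who set Patch Available, per project */

Adding project = 'HIVE' to the where clause and grouping by actor[10] instead gives the per-person view in the next question.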

Who contributed to (set status to Patch Available in) Apache Hive in 2018?
The top 10 contributors contributed to 578 issues in 2018.

[Figure: top Hive contributors]

How long does it take to get a patch accepted in the 20 most active projects?
Hadoop Distributed Data Store is the fastest, with an average of 102 hours between the Patch Available and Resolved states.

[Figure: patch acceptance time by project]
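
An average like that can be computed as a ratio of aggregates (a sketch; it assumes the transition out of Patch Available lands directly in Resolved):

from apachejira 2018-01-01 2018-10-17
where prevstatus = 'Patch Available' AND
      status = 'Resolved'
group by project[20]
select count(), timeinstate/count()/3600 /* transitions and average hours to acceptance */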

The average for Kafka is really high, but it turns out that about 28 outliers, with resolutions of Not A Problem, Auto Closed, Duplicate, Won’t Fix, and Won’t Do, inflated it.

That might or might not be a problem for the community. Either way, digging into numbers like these can raise interesting questions.
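
For example, grouping Kafka’s transitions out of Patch Available by their eventual resolution would isolate those outliers (a sketch; it assumes each action document carries a resolution field):

from apachejira 2018-01-01 2018-10-17
where project = 'KAFKA' AND
      prevstatus = 'Patch Available' AND
      status != 'Patch Available'
group by resolution
select count(), timeinstate/count()/3600 /* issue count and average hours per resolution */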

These are a small sample of the questions you can ask the apachejira dataset. If you find any other good ones, share them with us on Twitter.

Creating and analyzing your own Jira datasets

We’ve made the Jira Actions Imhotep Builder available as open source. We hope you will use it to build your own Jira-based Imhotep datasets. This builder is the first one we’ve published, and we’ve also listed it in a new Imhotep Builder Directory.

If you have an idea for a new builder, or need help getting started with Imhotep, open an issue in the GitHub repository or reach out on Twitter.

In the next post in this series, I describe Hindsight, an internal tool we use to make individual contributors’ work visible and drive coaching insights.




Cross-posted on Medium.


Metrics-Driven Process Improvement: A Case Study

In the previous post, I described how we use a measure-question-learn-improve cycle to refine development processes. To reiterate:

  1. Measure everything we possibly can.
  2. Learn by asking questions and exploring the data we’ve collected.
  3. Use our learnings to try to improve.
  4. Measure continuously to confirm improvement.

At Indeed, we get as much data as we can into Imhotep — everything that happens in our products, but also everything that happens in the development process. Process-oriented Imhotep datasets at Indeed include Git commits, Jira issue updates, production deploys, wiki edits, and more.

Let’s take a look at how we applied the measure-question-learn-improve cycle to a problem at Indeed: translation verification.

[Figure: translation process]

How long are translations in “Pending Verification”?

Our international team believed it was taking way too long to verify string translations in our products. We track translation work in Jira, and we track Jira issue updates in an Imhotep dataset. So we started asking questions of our measurements that might help us understand this problem.

This Imhotep query gives us a day-by-day grouping of time spent in the Pending Verification state. It includes only Translation issues (in a sample project called LOREM) that moved out of that state:

from jiraactions 2017-01-08 2017-04-02
where issuetype = 'Translation' AND
      prevstatus = 'Pending Verification' AND
      status != 'Pending Verification' AND
      project = 'LOREM'
group by time(1d)
select timeinstate/86400 /* days pending */ 

We ask Imhotep to graph that metric cumulatively over time. We see that for the given 3-month period, Translation issues spent a total of ~233 days in Pending Verification.

[Figure: cumulative days in Pending Verification]

That sounds like a lot of time, but it’s important to ask more questions! Be skeptical of the answers you’re getting, whether they support your hypothesis or not.

  • Can we dig into the data to better understand it?
  • What other information do we need to interpret it?
  • What are the sources of noise?
  • Do we need to iterate on the measurement itself before it is generally useful?

In this example, what if only a few issues dominated this total? Let’s tweak our query to look at how many issues are contributing to this time.

from jiraactions 2017-01-08 2017-04-02
where issuetype = 'Translation' AND
      prevstatus = 'Pending Verification' AND
      status != 'Pending Verification' AND
      project = 'LOREM'
group by time(1d)
select distinct(issuekey) /* number of issues */

[Figure: number of issues leaving Pending Verification per day]

Our translators shared with our engineers their perspective on the root cause. Translation changes had to wait for a full build before they would show up in a testing environment. Translators would move on to other work and return to verify later. We paid the cost of these context switches in a slower rate of translation.

The spikes we see in the graph above show that delay. Each time a set of changes reached a testing environment, a number of verification events closely followed. The visualized data confirms the inefficiency described by the people actually doing the work.

When we switch that graph to cumulative, we see that translators verified 278 issues in the time period. That is probably a large enough sample to validate the hypothesis.

[Figure: cumulative number of issues]

These are just a few examples of questions we can quickly and iteratively ask using Imhotep. When we have good measurements and we ask good questions, we learn. And based on our learnings, we can try to improve.

Translation verification: There is a better way

If a translation change could go straight to a testing environment as soon as it was submitted, we would eliminate the inefficiency described above. In fact, a couple of engineers at Indeed figured out a way to deploy translations separate from code. They started to try that incrementally on a project-by-project basis. This capability enabled translators to verify issues minutes after completing their changes.

After a period of time, we were able to compare two similar projects. The IPSUM project used the new translation deployment mechanism, while the LOREM project used the old method.

To illustrate the benefits of the new mechanism, it’s worth comparing the worst-case scenarios. This query lets us see the 90th percentile time in Pending Verification for just those two projects.

from jiraactions 2016-09-15 2017-02-28
where issuetype = 'Translation' AND
      prevstatus = 'Pending Verification' AND
      status != 'Pending Verification'
group by project in ('LOREM','IPSUM')
select percentile(timeinstate, 90)

[Figure: comparison of 90th percentile translation verification times]

The new process does look faster, with a 90th percentile of 1.8 days, compared to 12 days for the project using the old mechanism.

After digging into the data, asking more questions, and further verifying, we decided to move more projects onto the new system and keep measuring the impact.

Using Imhotep to understand your process

In order to track Jira activity with Imhotep, we wrote a tool that we run daily to extract the data from Jira into an Imhotep dataset. We’ve open sourced this tool, and you can find it in the Imhotep Builder Directory. In the next post, I describe using that builder to analyze Apache Software Foundation projects.




Cross-posted on Medium.
