Open-Source Interactive Data Analytics with Imhotep

We are excited to announce the open-source availability of Imhotep, the interactive data analytics platform that powers data-driven decision making at Indeed. When we test changes to our applications and services, whether to our user interface or our backend algorithms, we measure how those changes affect job seekers. We built Imhotep to allow our engineering and product organizations to focus on key metrics at scale.

Key features

The Imhotep platform and tools allow you to:

  • Perform fast, interactive, ad hoc queries and aggregate results for large datasets
  • Combine results from multiple time-series datasets
  • Build your own data tools for analysis, monitoring, reporting, and automated data processing on top of the Imhotep platform

At its core, Imhotep is a distributed inverted index on time-series data that runs across a cluster of servers. We’ve made it easy to set up an Imhotep cluster on Amazon Web Services (AWS). Once you’ve set up your cluster, you can upload your data and then interactively query that data using IQL, the Imhotep Query Language. The IQL web client enables you to answer all sorts of questions about your data, and iterate quickly on those questions to get to important insights.
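To make the phrase "inverted index on time-series data" concrete, here is a minimal single-process sketch in Python. It is not Imhotep's implementation: the class name, the one-hour shard size, and the method names are all assumptions chosen for illustration. In a real cluster the time shards would be spread across servers, whereas here they sit in one dictionary.

```python
from collections import defaultdict
from datetime import datetime


class TimeShardedInvertedIndex:
    """Toy inverted index (hypothetical, for illustration only).

    Documents are grouped into one-hour shards. Within each shard,
    a (field, term) pair maps to the set of document ids containing it.
    """

    def __init__(self):
        # shard start time -> (field, term) -> set of doc ids
        self.shards = defaultdict(lambda: defaultdict(set))

    def add(self, timestamp, doc_id, fields):
        # Truncate the timestamp to the hour to pick the shard.
        shard_key = timestamp.replace(minute=0, second=0, microsecond=0)
        for field, term in fields.items():
            self.shards[shard_key][(field, term)].add(doc_id)

    def count(self, start, end, field, term):
        """Count documents with field == term in shards whose start time
        falls in [start, end). Assumes start and end are hour-aligned."""
        total = 0
        for shard_key, postings in self.shards.items():
            if start <= shard_key < end:
                total += len(postings.get((field, term), ()))
        return total


index = TimeShardedInvertedIndex()
index.add(datetime(2014, 11, 5, 9, 15), 1, {"country": "us", "query": "architecture"})
index.add(datetime(2014, 11, 5, 9, 40), 2, {"country": "us", "query": "nurse"})
index.add(datetime(2014, 11, 5, 10, 5), 3, {"country": "gb", "query": "architecture"})

us_count = index.count(datetime(2014, 11, 5, 9), datetime(2014, 11, 5, 10), "country", "us")
arch_count = index.count(datetime(2014, 11, 5, 9), datetime(2014, 11, 5, 11), "query", "architecture")
```

Because each term already points at its matching documents, a count over a time range only has to visit the shards in that range and read the size of one posting set per shard, which is what makes this layout a natural fit for interactive aggregate queries.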

For example, at Indeed, we use Imhotep to answer these and many more questions about how people around the world are using our job search engine:

  • How many unique job search queries were performed on a specific day in a specific country?
  • What are the top 50 queries in a specific country? How many times did job seekers click on a search result for each of those queries?
  • Which job titles have the highest click-through rate for the query “Architecture” in the US? Which titles have the lowest click-through rate?
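The questions above boil down to grouping, counting, and dividing. The following plain-Python sketch shows those aggregations on a handful of made-up rows. It is not IQL and does not use any Imhotep API; the row layout (one row per search result shown, with a clicked flag) and all function names are assumptions for illustration.

```python
from collections import Counter

# Made-up sample rows: (country, query, job title shown, clicked?)
LOG = [
    ("us", "architecture", "Software Architect", True),
    ("us", "architecture", "Landscape Architect", False),
    ("us", "architecture", "Software Architect", False),
    ("us", "nurse", "Registered Nurse", True),
    ("gb", "architecture", "Architect", True),
]


def unique_queries(log, country):
    """Number of distinct queries seen in one country."""
    return len({query for c, query, _, _ in log if c == country})


def top_queries(log, country, n):
    """Top n queries in a country as (query, rows, clicks) tuples."""
    rows, clicks = Counter(), Counter()
    for c, query, _, clicked in log:
        if c == country:
            rows[query] += 1
            clicks[query] += int(clicked)
    return [(query, count, clicks[query]) for query, count in rows.most_common(n)]


def ctr_by_title(log, country, query):
    """Click-through rate per job title for one query, highest first."""
    shown, clicked_counts = Counter(), Counter()
    for c, q, title, clicked in log:
        if c == country and q == query:
            shown[title] += 1
            clicked_counts[title] += int(clicked)
    rates = [(title, clicked_counts[title] / shown[title]) for title in shown]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)
```

For example, `ctr_by_title(LOG, "us", "architecture")` ranks titles from highest to lowest click-through rate, so the first and last entries answer the "highest" and "lowest" versions of the question in one pass.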

Getting started with Imhotep

You can use our tools to configure your Imhotep cluster on AWS. These setup tools require that you have an AWS account, two S3 buckets for data storage, and your time-series data in TSV or CSV format for uploading into the system.

To learn more, read our Imhotep documentation. If you need help, you can ask questions in our Q&A forum for Imhotep.

To learn more about how we use Imhotep for analytics at Indeed, check out the video and slides of our tech talk from April 2014: Large-Scale Interactive Analytics with Imhotep. If you’re in Austin, join us for our upcoming Imhotep workshop on November 5, 2014.


UPDATE 11/18/2014: Slides and video from the workshop are now available.
