Indeed University: Building Data-Driven Products

July 2015 marked the kickoff of Indeed University (IU), our inaugural 12-week summer program to teach Indeed’s development culture to new Indeedians. Over 50 new Indeed software engineers from our Tokyo, San Francisco, Seattle, and Austin offices took part in the program, held at our Austin headquarters.

Indeed University onboarded new hires by acquainting them with Indeed’s culture, technology, and software development philosophy. Our three main goals were:

  • Accelerate the onboarding of new college grads
  • Provide leadership opportunities for current employees
  • Prototype new ideas for Indeed’s business

With these goals in mind, we set out to test a new means of onboarding Indeed’s software engineers.

Onboarding

We like to pair new hires with mentors, who help with anything from dev environment setup to version control to A/B test definition. If a new hire has a question about something beyond their mentor’s expertise, the mentor helps them find the right expert. We encourage every new engineer to get a change into production in their first week, and the mentor supports them as they navigate that learning experience.

From Fall 2014 through Spring 2015, Indeed hired 51 new graduates from Computer Science and Engineering programs in the United States and Japan. With this growth, we could not sustain an onboarding process based on 1:1 mentorship.

But Indeed University did not lose the emphasis on taking ownership and getting code into production. Within the first two weeks of the program, every participant had completed their “gong project,” in which they defined and implemented an A/B test on Indeed. Then they dove into a hands-on curriculum, attended informative talks, and formed teams to build new products.

IUers collaborating in their main work space

Gong project

We take pride in our ability to test everything from simple changes to new experiences. For IU, we embraced this philosophy and offered a blank canvas for our new hires to implement an A/B test in their first week.

The A/B tests aimed to help job seekers find jobs faster by emphasizing helpful filters, such as salary.

Screenshot of an Indeed University A/B test that encouraged job seekers to filter by salary

It’s an Indeed tradition for an engineer to ring the gong after their first line of code goes live on Indeed. At IU, everyone gathered around when a test was live. The IU participant (IUer) explained what A/B test they implemented and then rang the gong. Usually with great gusto.

Ringing the gong

Designing and implementing these tests gave these new Indeedians an opportunity to bring fresh perspectives to Indeed’s core products. What better way for a new hire to learn about data-driven development than to form a hypothesis, design a test, and analyze the results? IUers got first-hand experience with our culture of data-driven development. It was exciting to see new ideas come out of these A/B tests.

Curriculum

After completing the gong project, IUers launched into the curriculum – a self-paced course designed by seasoned Indeed engineers. Exercises followed each section, making the curriculum more hands-on.

The curriculum began with engineering basics and covered the tools and technologies we use at Indeed to build webapps and services. Then it covered logging (using logrepo) and data analysis (using Imhotep).

Talks

IU talks introduced our product development philosophy and some interesting market opportunities. We wanted participants to think about new products Indeed could build.

Paul D’Arcy presents on How People Look for Jobs (Photo: Hannah Spellman)

Tech talks and product deep dives provided more specific details about infrastructure and technology. Unlike typical startups, the IU teams had immediate access to Indeed data and users, as well as tools for deployment, testing, and authentication.

Social events

Social activities helped IUers get to know one another. They developed lasting connections before many of them headed off to Seattle, San Francisco, and Tokyo.

We planned Friday happy hours that were complete with freshly baked cookies from Tiff’s Treats, local brews, a slew of board games, and Wii U. We also took our new hires to our favorite local restaurants, including Franklin Barbecue, Torchy’s Tacos, and Michi Ramen. We went to a Round Rock Express baseball game. We also did standup paddleboarding, go-karting, and paintball. Last, but not least, we visited the Austin Panic Room.

Standup Paddle Board on Town Lake in Austin

Enjoying Franklin Barbecue

Developing new leaders

Twelve emerging leaders from the engineering, product, and online marketing organizations joined Indeed University to teach best practices and our style of iterative, data-driven development. These leads gained experience by directly managing 4-5 new hires. They also each advised up to 3 product teams.

The experience gave the leads their first taste of engineering management, including weekly 1-on-1s and quarterly evaluations. The 1:4 ratio allowed leads to develop relationships with the new hires. IUers talked with their leads about product and technology challenges as well as more mundane concerns like finding an apartment.

As product team advisors, leads challenged themselves to teach rather than tell. They encouraged teams to plan product iterations, prioritize issues, design tests, and analyze results.

In other words, leads taught participants how to think like Indeed engineers. They encouraged teams to be independent but unafraid to ask questions, to take risks and use data to measure outcomes, and to take ownership of their products.

New products

We encourage all engineers to imagine product changes that focus on our mission: helping people get jobs. IUers brainstormed new product ideas for Indeed. We held three brainstorming sessions. After each session, groups pitched solutions to problems they had identified.

Brainstorming wrapped up with a final round of pitches to Indeed’s senior leadership team, Indeed University leads, and other interested Indeedians.

Brainstorming in the IU lounge

Following brainstorming, IUers formed teams based on their interests and spent the remaining 9 weeks building working products. They served as their own product managers, team leads, designers, marketers, and testers.

At Indeed, we believe we must explore as many product ideas as we can, as quickly as we can. IU immersed participants in this culture of engineering velocity.

Before product development began, every team researched their market. Teams created Google Surveys, called Indeed employers, and ventured to local shopping malls to speak with retail managers and employees directly.

These conversations challenged some assumptions, helped drive initial product direction, and caused one early product pivot.

We challenged each team to build the minimum viable product (MVP) that would allow them to validate their idea. How would they demonstrate value? One word: data. Teams needed to collect data in order to confirm their product’s value.

Indeed University teams had a few “unfair advantages.” For one, they had Indeed’s data at their fingertips. They could use this data to do preliminary research or to tailor their design to a specific audience. Second, each team had a generous online marketing budget for traffic acquisition. Teams tested showing ads on Google, Facebook, and Indeed job search, to see where they could best connect with potential users.

Teams presented their work at weekly product reviews with Indeed executives and IU leads. In these weekly reviews, teams answered the following questions:

  • What did you do this week?
  • What data did you collect?
  • What did you learn from this data?
  • What are you doing based on the data?

In these meetings, we encouraged teams to start small, validate with data, and iterate. They learned that building successful software is about more than software architecture.

The product reviews encouraged discussion of A/B test results and opportunities for testing assumptions. Many teams experienced high bounce rates on their landing pages, and they ran A/B tests to test their hypotheses about these bouncing users. One team tested a new call to action on their landing page. Another tested delaying the sign-in requirement, allowing job seekers to experience the product before creating an account.

In the course of three months of Indeed University, the new hires built eleven new products:

  • A search engine for college students to find jobs
  • A data trends exploration tool for HR professionals
  • A site that helps high school students choose a college major based on jobs that interest them
  • A product that uses social sentiment analysis to give job seekers unfiltered comments about company reputation
  • A hiring tool that helps employers manage active candidate statuses
  • A product that asks job seekers a series of questions in order to match them to jobs based on their interests
  • An application that allows employees to track progress toward their goals
  • A mobile app for finding local retail and food service jobs
  • An automated phone screening solution to help employers efficiently evaluate candidates
  • A gig marketplace for job seekers to find small jobs in their area
  • A site that lets people visually explore career paths based on a current or desired position

Graduation

At the Indeed University Graduation Party, each team introduced the product they built and presented their learnings. Five products continued beyond IU for product validation through further development, testing, and iteration.

An IU team talks about what they learned at the Graduation Party

After IU, the participants joined their new teams in Tokyo, Seattle, San Francisco, and Austin. They take with them the connections, skills, and knowledge they gained during the program. They are on their way to having a real impact at Indeed.

Whether you are a student or an industry veteran, we are looking for talented engineers to help with our mission of helping people get jobs. To learn more, check out our open positions.


IndeedEng San Francisco: 600% Growth in 2015

Indeed San Francisco is growing! It’s been a great year and we’ve enjoyed a lot of growth — starting the year with 5 people and ending with over 30!

A mixer with the San Francisco and San Mateo offices

The Indeed San Francisco office is primarily focused on product development in these areas:

  • My Jobs provides job seekers with tools to more effectively organize and manage their search for a new job. The My Jobs team uses modern web technologies like ReactJS and ES6 to create an engaging user experience across desktop and mobile.
  • The Candidate Quality team builds models to predict how good a fit a job seeker is for an opening. These models help employers quickly assess candidates who apply to their jobs. We also use these models to ensure that our job search products are helping people find jobs that they’re qualified for.
  • The Notifications team works on Indeed’s Job Alerts as well as our internal systems and infrastructure for high-volume email and mobile push notifications. These systems generate and deliver almost a billion emails to job seekers every week.
  • New product teams use rapid prototyping and iteration to investigate product ideas. The teams identify opportunities for new products, build and launch working implementations, and collect and analyze data to validate their ideas while iterating on new features. We’re always looking for new ways to connect job seekers with their ideal jobs.

Just like all teams at Indeed, the teams in San Francisco are small, work full-stack, deploy multiple times per week, and use our powerful experimentation and analytics tools — Proctor and Imhotep — to continuously test ideas and improve our products.

This year we moved to a larger space on Market St, had two hackathons (one at a cabin in the Sierra Nevadas), took a trip to the beach in Santa Cruz, and had a holiday party on the Embarcadero.

View from our office

The site of our recent hackathon in the Sierra Nevadas

As we grow, Indeed San Francisco will take on more projects to help people get jobs. If you like that mission and want to help us tackle some really interesting problems, contact Ed Delgado to start the discussion about our open positions!


Luck, Latitude, or Lemons? How Indeed Locates for Low Latency

Indeed likes being fast. Similar to published studies (here and here), our internal numbers validate the benefits of speed. It makes sense: a snappy site allows job seekers to achieve their goals with less frustration and wasted time.

Application processing time, however, is only part of the story, and in many cases it is not even the most significant delay. The network time – getting the request from the browser and then the data back again – is often the biggest time sink.

How do you minimize network time?

Engineers use all sorts of tricks and libraries to compress content and load things asynchronously. At some point, however, the laws of physics sneak in, and you just need to get your data center and your users communicating faster.

Sometimes, your product runs in a single data center, and the physical proximity of that data center is valuable. In this case, moving is not an option. Perhaps you can do some caching or use a CDN for static resources. For those who are less tied to a physical location, or, like Indeed, run their site out of multiple data centers, a different data center location may be the key. But how do you choose where to go? The straightforward methods are:

Word of Mouth. The price is good and you’ve talked to other customers from the data center. They seem satisfied. The list of Internet carriers the data center provides seems comprehensive. It’s probably a good fit for your users … if you’re lucky.

Location. You have a lot of American users on the East Coast. Getting a data center close to them, say in the New York area, should help make things faster for the East Coast.

Prepare to be disappointed.

These aren’t bad reasons to pick a data center, but the Internet isn’t based on geography – it’s based on peering points, politics, and price. If it’s cheaper for your customer’s ISP to route New York traffic through New Jersey because they have dedicated fiber to a facility they own, they’re probably going to do that, regardless of how physically close your data center is to the person accessing your site. The Internet’s “series of tubes” doesn’t always connect where you’d think.

What we did

In October of 2012, Indeed faced a similar quandary. We had a few data centers spread out across the U.S., but the West Coast facility was almost full, and the provider warned that they were going to have a hard time with our predicted growth. The Operations team was eager to look at alternate data centers, but we also didn’t want to make things slower for the West Coast users. So we set up test servers in a few data centers. We pinged the test servers from as many places as we could, comparing the results to the ping times of the original data center. This wasn’t a terrible approach, but it also didn’t mimic the job seeker’s experience.

Meanwhile, other departments were thinking about the problem too. A casual hallway conversation with an engineering manager snowballed into the method we use today. It was important to use real user requests to test possible new locations. After all, what better measure could there be of how users perceive a data center than those same users?

After a few rounds of discussion, and some Dev and Ops time, we came up with the Fruits Test, named for the fruit-based hostnames of our test servers. Utilizing this technique, we estimated that the proposed new data center would shave an average of 30 milliseconds off of the response time for most of our West Coast job seekers. We validated this number once we migrated our entire footprint to the new facilities.

How it works

First, we assess a potential data center for eligibility. It doesn’t make sense to run a test against an environment that’s unsuitable because of space or cost. After clearing that hurdle, we set up a lightweight Linux system with a web server. This web server has a single virtual host named after a fruit, such as lemon.indeed.com. We set up the virtual host to serve ten static JavaScript files, named 0.js, 1.js, etc., up to 9.js.

Once the server is ready, we set up a test matrix in Proctor, our open-sourced A/B testing framework. We assign a fruit and a percentage to each test bucket. Then, each request to the site is randomly assigned to one of the test buckets based on the percentages. Each fruit corresponds to a data center being tested (whether new or existing). We publish the test matrix to Production, and then the fun begins!
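The percentage-based assignment can be sketched roughly as follows. This is not Proctor's API or its actual allocation mechanism; `pickFruit`, the shape of the `buckets` array, and the example percentages are hypothetical, and the random source is injected so the behavior can be checked deterministically.

```javascript
// Hypothetical test matrix: each bucket pairs a fruit with a share of traffic.
// "inactive" stands for requests that do not run the fruits test at all.
const buckets = [
  { fruit: "inactive", percent: 90 },
  { fruit: "lemon", percent: 5 },
  { fruit: "quince", percent: 5 },
];

// Pick a bucket for one request. `random` must return a number in [0, 1).
function pickFruit(buckets, random = Math.random) {
  const total = buckets.reduce((sum, b) => sum + b.percent, 0);
  if (total <= 0) {
    throw new Error("bucket percentages must sum to a positive number");
  }
  // Walk the buckets, subtracting each share until the draw falls inside one.
  let threshold = random() * total;
  for (const bucket of buckets) {
    if (threshold < bucket.percent) return bucket.fruit;
    threshold -= bucket.percent;
  }
  // Only reachable through floating-point rounding; fall back to the last bucket.
  return buckets[buckets.length - 1].fruit;
}
```

Calling `pickFruit(buckets)` once per request would send roughly 5% of requests to each fruit in this made-up matrix and leave the rest untouched.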

Figure 1: Fruits test requests, responses, and logging

Legend

  1. The site instructs the client to perform the fruits test.
  2. The 0.js request and response call dcDnsCallback.
  3. dcDnsCallback sends the latency of the 0.js request to the site.
  4. The [1-9].js request and response call dcPingCallback.
  5. dcPingCallback sends the latency of the [1-9].js request to the site.

Requests in the test bucket receive JavaScript instructing their browser to start a timer and load the 0.js file from their selected fruit site. This file includes a blank comment and an instruction to call the dcDnsCallback function. On lemon.indeed.com, it passes in "l" to indicate the test fruit:

/*

*/
dcDnsCallback("l");

dcDnsCallback then stops the previous timer, and sends a request to indeed.com, which triggers a log event with the recorded request latency.

The dcDnsCallback function serves two purposes. Since the user’s system may not have the fruit hostname’s IP address in its DNS cache, we can get an idea of how long it takes to do a DNS lookup and a single request round trip. Then, subsequent requests to that fruit host within this session won’t have DNS lookup time as a significant variable, making those timing results more precise.

After the dcDnsCallback invocation, the test selects one of the 9 remaining static JavaScript files at random and repeats the same process: start the timer, get the file, run the function in the file. These files look something like this:

/*
3firaei1udgihufif5ly7zbsqyz59ghisb13u1j26tkffr7h67ppywg12lfkg7ortt5t3xoq5
*/
dcPingCallback("l");

These 9 files (1.js through 9.js) are basically the same as 0.js, but call a dcPingCallback function instead, and contain a comment whose length makes the overall response bulk up to a predefined size. The smallest, 1.js, is just 26 bytes, and 9.js comes in at a hefty 50 kilobytes. Having different sized files helps us suss out areas where latency may be low, but available bandwidth is limited enough that getting larger files takes a disproportionately long time. It can also identify areas where bandwidth is plentiful enough that the initial TCP connection setup is the most time-consuming aspect of the transaction.
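Files padded to a target size could be generated with something like the sketch below. The function name `buildPingFile`, the padding alphabet, and the exact newline layout are assumptions for illustration, not the script Indeed uses; the padding is random characters to mirror the sample above.

```javascript
// Build the body of a padded test file such as 5.js.
// `fruitCode` is the short code passed to the callback (e.g. "l" for lemon);
// `targetBytes` is the desired total size. Output is ASCII-only (assuming an
// ASCII fruitCode), so string length equals byte length.
function buildPingFile(fruitCode, targetBytes, random = Math.random) {
  // The alphabet has no "*" or "/", so padding can never close the comment.
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  const head = "/*\n";
  const tail = `\n*/\ndcPingCallback(${JSON.stringify(fruitCode)});\n`;
  const padding = targetBytes - head.length - tail.length;
  if (padding < 0) {
    throw new RangeError(
      `targetBytes must be at least ${head.length + tail.length}`
    );
  }
  let comment = "";
  for (let i = 0; i < padding; i++) {
    comment += alphabet[Math.floor(random() * alphabet.length)];
  }
  return head + comment + tail;
}
```

For example, `buildPingFile("l", 1024)` returns a 1,024-character script whose last statement is `dcPingCallback("l");`.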

Once the dcPingCallback function is executed, the timer is stopped and the information about which fruit, which JavaScript file, and how long the operation took is sent to Indeed to be logged. These requests are all placed at the end of the browser’s page rendering and executed asynchronously to minimize the impact of the test on the user’s experience.
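The two-phase timing flow described above might look roughly like this on the client. Only the callback names `dcDnsCallback` and `dcPingCallback` come from the test itself; `createFruitsTest`, the injected `now`, `loadScript`, and `report` helpers, the sample shape, and the `https` scheme are all hypothetical. Injecting those helpers keeps the sketch runnable outside a browser.

```javascript
// A sketch of the client-side flow with injected dependencies:
//   now()           -> current time in milliseconds
//   loadScript(url) -> starts an asynchronous fetch of a script
//   report(sample)  -> sends one measurement back to the site for logging
//   random()        -> number in [0, 1)
function createFruitsTest({ host, now, loadScript, report, random = Math.random }) {
  let startedAt = 0;
  let currentFile = "";

  // Start the timer and request one file from the fruit host.
  function start(file) {
    currentFile = file;
    startedAt = now();
    loadScript(`https://${host}/${file}`);
  }

  // Stop the timer and report which fruit and file were measured.
  function finish(fruitCode) {
    report({ fruit: fruitCode, file: currentFile, latencyMs: now() - startedAt });
  }

  return {
    // Phase 1: request 0.js, which also pays for the DNS lookup.
    run() {
      start("0.js");
    },
    // Called by 0.js: record the first round trip, then time one of 1.js-9.js.
    dcDnsCallback(fruitCode) {
      finish(fruitCode);
      start(`${1 + Math.floor(random() * 9)}.js`);
    },
    // Called by the padded file: record the second measurement.
    dcPingCallback(fruitCode) {
      finish(fruitCode);
    },
  };
}
```

In a browser, `loadScript` would typically append an async script element, and the two callbacks would be exposed as globals so the downloaded files can call them; that wiring is omitted here.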

On indeed.com, the logging endpoint receives this data and records it, along with the source IP address and the site the user is on. We then write the information to a specially formatted logstore that Indeed calls the LogRepo – mysterious name, I know.

After collecting the LogRepo logs, we build indexes from them using Imhotep, which allows for easy querying and graphing. Depending on the nature of the test, we usually let the fruits test run for a couple of weeks, collecting hundreds of thousands or even millions of samples from real job seekers that we can use to make a more informed decision. When the test has run its course, we just turn off the Proctor test and shut down the fruit test server. That’s it! No additional infrastructure changes needed.
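To give a feel for the kind of comparison the collected samples support, here is a minimal sketch that groups samples by fruit and file and computes a median latency for each group. It is plain JavaScript, not an Imhotep query, and the `{ fruit, file, latencyMs }` sample shape and the function names are assumptions.

```javascript
// Median of a non-empty array of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group samples by fruit and file, then report the count and median latency.
// Each sample is assumed to look like { fruit, file, latencyMs }.
function summarize(samples) {
  const groups = new Map();
  for (const { fruit, file, latencyMs } of samples) {
    const key = `${fruit}/${file}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(latencyMs);
  }
  const summary = {};
  for (const [key, latencies] of groups) {
    summary[key] = { count: latencies.length, medianMs: median(latencies) };
  }
  return summary;
}
```

Medians are used here because a handful of very slow requests can distort an average; other percentiles would work the same way.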

One of the nice things about this approach is that it is flexible for other types of tests. Sure, we mainly use it for testing new data center locations, but when you boil it down to its essentials (fruit jam!), all the test does is download a set amount of data from a random sampling of users and tell you how long it took. Interpreting the results is up to the test designer.

Rather than testing data centers, you could test two different caching technologies, or the performance difference between different versions of web or app servers, or the geographic distribution of an Anycast/BGP IP (we’ve done that last one before). As long as the sample is large and diverse enough, it makes for a valid comparison, measured from the perspective of the best people to ask: your users.

That’s nice, but why “Fruits Test”?

When we were discussing unique names to represent potential and current data center locations, we wanted names that were:

  • easily identifiable to Operations
  • a little bit obscure to users, but not too mysterious
  • not meaningful for the business

As a placeholder while designing things, we used fruits since it was fairly easy to come up with different fruits for letters of the alphabet. Over the course of the design the names became endearing and they stuck. Now I relish opening up tickets to enable tests for jujube, quince (my favorite), and elderberry!

Now what?

Now that we have a pile of data, we graph the heck out of it! But more about that in Part 2 of the Fruits Test series.
