The Evolving Language of Data Science 

…or Grokking the Bokeh of Scarse Meaning Increasement

“You keep using that word. I do not think it means what you think it means.” — Dr. Inigo Montoya


I’m a technical writer at Indeed. One of the many great things about my job is that I get to work with smart people every day. A fair amount of that work involves translating between them. They will all be speaking English, but still might not understand each other. This is a natural consequence of how knowledge advances in general, and how English develops in particular. 

As disciplines evolve, alternate meanings and new words develop to match. That can extend to creating new phrases to name the disciplines themselves (for example, what is a data scientist?). English’s adoption of such new words and meanings has always been pragmatic. Other Western languages have more formal approval processes, such as French’s Académie française and German’s reliance on a single prestigious dictionary. The closest to formal authorities for correct English are popular dictionaries such as the Oxford English Dictionary, the American Heritage Dictionary, and Merriam-Webster. None of them reign supreme.

This informal adoption of new words and meanings can lead to entire conversations in which people don’t realize they’re discussing different things. For example, consider another recently adopted word: “bokeh.” This started as a term in the dialect of professional photography for the aesthetically pleasing blurred look that a shallow depth of field can give a picture. “Bokeh” is also the name of a specific Python data visualization package. So “bokeh” may already be headed for a new meaning within the realm of data science.

As a further example of the fluid nature of English, “bokeh” comes from the Japanese word boke (暈け or ボケ). In its original form it meant “intentional blurring,” as well as sometimes “mental haze,” i.e., confusion.

 

Two rows of flowers that become blurry in the distance; a representation of bokeh in photography.

Bokeh of flowers by Sergei Akulich on Unsplash

 

A montage of various visualizations related to the data science product Bokeh.

Data science bokeh — https://bokeh.pydata.org/

The clouded meaning of “data”

A data scientist told me that when she hears “the data” she tends to think of a large amount of information, a set large enough to be comprehensive. She was surprised to see another team’s presentation of  “the data” turn out to be a small table inside a spreadsheet that listed a few numbers. 

This term can also cause confusion between technical fields. Data scientists often interpret “data” as quantitative, while UX researchers interpret “data” as qualitative.

Exploring evolving language with Ngram Viewer

A product science colleague introduced me to the Google Books Ngram Viewer. It’s a search engine that shows how often a word or phrase occurs in the mass of print books Google has scanned. Google’s collection contains most books published in English from AD 1500 to 2008.

I entered some new words that I had come across, and screened out occurrences that weren’t relevant, such as place or person names and abbreviations. I also set the search to start from 1800. Medieval data science could be interesting, but I expect it to be “scarse.” (That’s not a typo.)

Features

When I first came across this newer meaning of “features,” I wasn’t even aware that it had changed. From previous work with software development and UX, I took “features” to mean “aspects of a product that a user will hopefully find useful.” But in data science, a “feature” corresponds to a covariate in a model. In less technical English, it’s a measurable property or characteristic of the phenomenon being observed.

This dual meaning led me to a fair amount of head-scratching when I was documenting an internal data science application. The application had software features for defining and manipulating data features. 
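To make the data science sense concrete, here’s a minimal, invented sketch in Python. Each column in the small table below is a “feature” in the modeling sense (a measurable property of each observation), which is a very different thing from a feature in a product release note. The column names are purely illustrative and not from any real dataset.

import pandas as pd

# Each row is an observation; each column is a "feature" in the data
# science sense: a measurable property a model can learn from.
job_postings = pd.DataFrame({
    "title_length":    [42, 17, 63, 28],  # characters in the job title
    "salary_listed":   [1, 0, 1, 0],      # whether a salary is shown
    "days_since_post": [3, 30, 1, 12],    # age of the posting in days
})

# A model consumes this feature matrix (often called X), while a
# "feature" of the application itself might be the button that lets
# a user define these columns in the first place.
X = job_postings.values
print(X.shape)  # (4, 3): 4 observations, 3 features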

The following Ngram graph indicates this emerging meaning for “feature” by tracking the emergence of a related phrase, “model feature.” 

 

A line chart showing the rise in the use of the term "model feature" from 1800 to 2000.

The usage of the term “model feature” peaks sometime in the 1990s.

 

Diving into Ngram’s specific citations, the earliest mention I can find that’s near this meaning is in 1954. Interestingly, it’s from a book on management science:

Screenshot from Google Books summary of "Management Science"

The next use that seems exact turns up in 1969, in the Digest Record from the Association for Computing Machinery, the Society for Industrial and Applied Mathematics, and the Institute of Electrical and Electronics Engineers. Leaving aside the intervening comma, the example is so dead-on that I wonder if we’re looking at nearly the exact moment this new meaning was fully born:

Screenshot of Google Books summary for "The Digest Record"

To grok

“Grok” is an example of English going so far as to steal words from languages that don’t even exist. Robert A. Heinlein coined the word in his 1961 science fiction classic Stranger in a Strange Land. In the novel, the Martian word “grok” literally means “drink” and metaphorically means “to understand something so completely that you and it are one.”

 

A line chart showing the trend for the use of the term "grok" from 1800 to 2000.

Usage of the term “grok” increases starting in the 1960s.

 

As with many other bits of science fiction and fantasy, computer programming culture absorbed the term. The Jargon File from 1983 shares an early definition and example:

GROK (grahk) verb.
  To understand, usually in a global sense. Especially, to understand
all the implications and consequences of making a change. Example:
“JONL is the only one who groks the MACLISP compiler.”

Since then, computer jargon has absorbed “grok” and applied it in many different ways. One immediate example is the source code search and cross-reference engine OpenGrok, which, according to its own description, helps users “grok (profoundly understand) source code” and “is developed in the open.”

Salt

Salt is an example of a common word that has gone through two steps of technical change. First it gained a meaning relating to information security, and then an additional one in data science. 

As a verb and a noun, “salt” originally meant what it sounds like: adding the substance chemically known as NaCl to food for flavoring and preservation. It gained what is perhaps its better-known technical meaning in information security, where adding “salt” to password hashing makes the stored password hashes more difficult to crack. In the word’s further and more recent permutation in data science, “salt” and “resalt” mean to partly randomize the results of an experiment by shuffling them. The following Ngram graph tracks the evolving associations of “salt” over time.

This was hard to parse out, and required diving deeply into Ngram’s options. I ended up graphing how often “salt” modifies the words “food,” “password,” or “data.” Google stopped scanning in new books in 2008; you can see the barest beginning of this new usage in 2007.

 

A line chart representing the use of salt with regard to food, passwords, and data.

From 2000 to 2008, salt in the context of food is most used, followed by salt in the information security sense.
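To make the two technical senses of “salt” concrete, here’s a minimal Python sketch using only the standard library. The password half shows the information security meaning; the experiment half is one common interpretation of salting and resalting bucket assignments, with function names I’ve made up for illustration.

import hashlib
import os

# Information security sense: a random salt makes identical passwords
# hash to different values, which defeats precomputed lookup tables.
def hash_password(password):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

# Experimentation sense (one common interpretation): hashing a user ID
# together with a salt assigns users to buckets deterministically.
# Changing the salt ("resalting") reshuffles those assignments.
def assign_bucket(user_id, salt, n_buckets=2):
    h = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(h, 16) % n_buckets

print(assign_bucket("user-123", salt="experiment-v1"))
print(assign_bucket("user-123", salt="experiment-v2"))  # may change after a resalt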

 

Pickling

Traditionally, “pickling” refers to another way to treat food, this one almost entirely for preservation. In Python, “pickling” refers to object serialization via the pickle module. Data scientists have found increasing use for this term, in ways too recent to show up in Ngram.
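As a minimal illustration of the Python sense of the word, here’s how an object (a small dictionary standing in for a trained model) can be pickled to disk and loaded back. The filename and contents are arbitrary examples.

import pickle

# Any Python object (here, a stand-in for a trained model) can be
# serialized ("pickled") to bytes and written to disk...
model = {"weights": [0.2, 0.8], "trained_on": "2019-06-01"}

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and later deserialized ("unpickled") back into an equivalent object.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored == model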

The bleeding edge of language?

Here are some words that may just be in the sprouting stage of wider usage.

Scarse

This came from an accidental jumble of words in a meeting, and has remained in use since. It describes situations where data is both scarce (there’s not a lot of it) and sparse (even when there is some, it’s pretty thin). 

This meaning for “scarse” doesn’t appear in the Ngram graph. So it appears we’re seeing mutation and evolution in word form in the wild. Will it take root and prosper, continuing to evolve? Only time will tell.

Increasement

“We should look for the source of that error message increasement.”

I’ve observed this word once in the wild: from me. “Increasement” came to me in a meeting, as a word for the amount of an increase over time. I had never used the word before. It just seemed like a word that could exist. It had a meaning similar to other words, and fit those other words’ rules of word construction.

In the context where I used it, its meaning isn’t exactly the same as “increment.” An increment refers to a specific numeric increase. One wouldn’t refer, for example, to a growing number of users as an increment. You might, however, refer to it as an increasement.

Searching for increasement in Ngram revealed that this word previously existed but fell out of common usage, as shown on the following graph.

 

A line chart representing the usage of the term "increasement" from 1800 to 2000.

The word “increasement” tapers in usage, with its peak in the 1810s, followed by a gradual decline.

 

Previous examples:

Book: The Fathers of the English Church

Paul was, that he should return again to these Philippians, and abide, and continue amongst them, and that to their profit; both to the increasement of their faith


Book: The Harleian miscellany; or, A collection of … pamphlets and tracts … in the late earl of Oxford’s library

….when she saw the man grown settled and staid, gave him an assistance, and advanced him to the treasurership, where he made amends to his house, for his mis-spent time, both in the increasement of his estate and honour…

Perhaps it’s time for “increasement” to be rebooted into common use?

Bottom line

Language will keep evolving for as long as we keep using it. Words in general, English words in particular, and words in English technical dialects above all are in a constant state of flux, just like the many fields of knowledge they describe.

So if you’re in a technical discussion and others’ responses aren’t quite what you expect, consider re-examining the technical phrases you’re using. 

The people you’re talking with might grok those words quite differently.

 

The Evolving Language of Data Science—cross-posted on Medium.

Recognize Class Imbalance with Baselines and Better Metrics

In my first machine learning course as an undergrad, I built a recommender system. Using a dataset from a social music website, I created a model to predict whether a given user would like a given artist. I was thrilled when initial experiments showed that for 99% of the points in my dataset, I gave the correct rating – I was wrong only 1% of the time!

When I proudly shared the results with my professor, he revealed that I wasn’t, in fact, a machine learning prodigy. I’d made a mistake called the base rate fallacy. The dataset I used exhibited a high degree of class imbalance. In other words, for 99% of the pairs between user and artist, the user did not like the artist. This makes sense: there are many, many musicians in the world, and it’s unlikely that one person has even heard of half of them (let alone actually enjoys them).

A flock of birds on telephone wires. While most of the birds are grouped together on the right, one is perched alone on the left.

When we’re unprepared for it, class imbalance introduces problems by producing misleading metrics. The undergrad version of me ran face-first into this problem: accuracy alone tells us almost nothing. A trivial model that predicts that no users like any artists can achieve 99% accuracy, but it’s completely worthless. Using accuracy as a metric assumes that all errors are equally costly; this is frequently not the case.

Consider a medical example. If we incorrectly classify a tumor as malignant and request further screening, the cost of that error is worry for the patient and time for the hospital workers. By contrast, if we incorrectly state that a tumor is benign when it is in fact malignant, the patient may die.

Examine the distribution of classes

Moving beyond accuracy, there are a number of metrics to consider for an imbalanced problem. Knowing the distribution of classes is the first line of defense. As a rule of thumb, Prati, Batista, and Silva find that class imbalance doesn’t significantly harm performance when the minority class makes up 10% or more of the dataset. If your dataset is more imbalanced than that, pay special attention.

I recommend starting with an incredibly simple model: pick the most frequent class. scikit-learn implements this in the DummyClassifier. Had I done this with my music recommendation project, I would quickly have noticed that my fancy model wasn’t really learning anything.
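Here’s a minimal sketch of that baseline check, using a synthetic imbalanced dataset rather than my original music data. The point is the comparison: if a real model barely beats the dummy, accuracy isn’t telling us much.

from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced problem: roughly 99% negative class.
X, y = make_classification(n_samples=20_000, weights=[0.99], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Baseline: always predict the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("baseline accuracy:", baseline.score(X_test, y_test))  # about 0.99
print("model accuracy:   ", model.score(X_test, y_test))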

Evaluate the cost

In an ideal world, we could calculate the exact costs of a false negative and a false positive. When evaluating our models, we could multiply those costs by the false negative and false positive rates to come up with a number that describes the cost of our model. Unfortunately, these costs are often unknown in the real world, and improving the false positive rate usually harms the true positive rate.
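If we did know (or could reasonably estimate) those costs, the bookkeeping itself is simple. The numbers below are hypothetical, just to show the shape of the calculation:

# Hypothetical per-error costs: a missed malignant tumor (false negative)
# is far more costly than an unnecessary follow-up screening (false positive).
cost_fp = 50      # cost of one false positive
cost_fn = 5_000   # cost of one false negative

# Hypothetical error rates for some evaluated model.
fp_rate = 0.08
fn_rate = 0.02

# A single number summarizing how costly the model's mistakes are.
# Tightening one rate usually loosens the other, which is the tradeoff
# the ROC curve below visualizes.
model_cost = cost_fp * fp_rate + cost_fn * fn_rate
print(model_cost)  # 104.0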

To visualize this tradeoff, we can use an ROC curve. Most classifiers can output a probability of membership in a given class. If we choose a threshold (50%, for example), we can declare that all points with probability above the threshold belong to the positive class. Varying the threshold from low to high produces different classifications of the points, each with its own true positive and false positive rates. Plotting the false positive rate on the x-axis and the true positive rate on the y-axis gives us an ROC curve.

As an example, I trained a classifier on the yeast3 dataset from KEEL and created an ROC curve:

An ROC curve for LogisticRegression on yeast3. The curve hugs both the leftmost and upper borders of the graph.

After training the model, the ROC curve hugs the upper-left corner of the plot, indicating a high true positive rate at a low false positive rate.

While we could certainly write the code to draw an ROC curve, the yellowbrick library has this capability built in (and it’s compatible with scikit-learn models). These curves can suggest where to set the threshold for our model. Further, we can use the area under them to compare multiple models (though there are times when this isn’t a good metric).
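Here’s a minimal sketch of that approach with yellowbrick. I’m substituting a synthetic imbalanced dataset for yeast3, so the exact curve will differ from the one above.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from yellowbrick.classifier import ROCAUC

# Synthetic imbalanced dataset standing in for yeast3.
X, y = make_classification(n_samples=5_000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# ROCAUC wraps a scikit-learn estimator and draws ROC curves from the
# fitted model's predicted probabilities.
visualizer = ROCAUC(LogisticRegression(max_iter=1000))
visualizer.fit(X_train, y_train)   # fit the wrapped model
visualizer.score(X_test, y_test)   # compute ROC/AUC on the test split
visualizer.show()                  # render the plot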

The next time you’re working on a machine learning problem, consider the distribution of the target variable. A huge first step towards solving class imbalance is recognizing the problem. By using better metrics and visualizations, we can start to talk about imbalanced problems much more clearly.

More on class imbalance

In my upcoming talk at ODSC West, I’ll dive deeper into the causes of class imbalance. I’ll also explore different ways to address the problem. I hope to see you in October!


Class Imbalance—cross-posted on Medium.

You Probably Have Missing Data

Here’s a Guide on When to Care

Scrabble tiles seemingly spelling out the word "MISSING," though the two "I" tiles are missing.

Strategies to address missing data¹

At Indeed, our mission is to help people get jobs. Searching for a job can be stressful, which is one reason why Indeed is always looking for ways to make the process easier and our products better. Surveys provide us with an ongoing measure of people’s feelings about Indeed’s products and the job search experience.

We realize that when someone is looking for a job (or has just landed one), answering a survey is the last thing they want to do. This means that a lot of the survey data that Indeed collects ends up incomplete. To properly analyze user satisfaction and similar surveys, we need to account for potential patterns of missingness to ensure we draw correct conclusions.

I’d like to discuss identifying and handling missing data. I’m inspired by my training in the University of Michigan’s Program in Survey Methods. I’ve also wanted to apply the theories about data sets that I learned in academia to Indeed’s terabyte-sized data.

I recently worked on a project that dealt with missing data. I learned a lot from the analysis. Walking through this process can show how Indeed collects survey data, illustrate the difference between non-response rate and non-response bias, and provide examples of why “randomness” in non-response bias is a good thing.

One quick note: While the examples in this blog post reference Indeed, all data in this blog post are entirely bogus and made up by the author (aka me).

Measuring hires at Indeed

If you have ever canceled a job alert from Indeed, you might have seen this survey:

Survey asking users why they cancelled their job alerts. The options available are: "I found a job on Indeed," "I found a job elsewhere," or "Other."

The purpose of this survey is to determine whether a job seeker is canceling their job alert because they found a job. This information helps us improve our products and enables us to celebrate the success stories of job seekers.

One challenge with this survey is that only a subset of job seekers completes it. From a user perspective this makes sense — people who unsubscribe from an email notification probably don’t want to spend time answering a survey. This means that we end up with a substantial amount of missing data, especially regarding a key question: did they unsubscribe because they got a job?

Non-response rate vs non-response bias

When discussing missing data, people often conflate response rate with non-response bias. When response rate is further conflated with data quality, people might assume that a higher response rate means higher-quality survey responses. This is not necessarily the case.

For the following job alert cancellation survey results, you’ll note that 13.8% did not respond.

Survey results with a "no response" rate of 13.84%.

Detailed description of chart.

This chart shows survey response results, including the total count of responses and each option’s percentage of the whole. According to this chart:

  • 49,109 survey participants (33.99%) selected “I found a job on Indeed”
  • 43,256 survey participants (29.94%) selected “I found a job elsewhere”
  • 32,105 survey participants (22.22%) selected “Other”
  • 20,000 users (13.84%) did not respond to the survey

 

Does a non-response rate of 13.8% say something about the quality of responses in the survey?

The short answer is no. While this might initially sound counterintuitive, stay with me! Imagine that Indeed revised the job alert cancellation survey to include a “prefer not to say” option.

 

A comparison of the original and revised surveys. The only difference is the inclusion of the “Prefer not to say” option.

 

After collecting data for a few weeks, we would then see that only 5.8% of job seekers didn’t respond to the revised survey.

A comparison of the original survey results and revised survey results. Originally, the “no response” rate was 13.84%.

Detailed description of charts.

In the original survey response results:

  • 49,109 survey participants (33.99%) selected “I found a job on Indeed”
  • 43,256 survey participants (29.94%) selected “I found a job elsewhere”
  • 32,105 survey participants (22.22%) selected “Other”
  • 20,000 users (13.84%) did not respond to the survey

With the addition of the new “Prefer not to say” survey option, the results change to the following:

  • 49,109 survey participants (14.31%) selected “I found a job on Indeed”
  • 43,256 survey participants (12.60%) selected “I found a job elsewhere”
  • 32,105 survey participants (9.35%) selected “Other”
  • 20,000 users (5.83%) did not respond to the survey
  • 198,772 survey participants (57.91%) selected “Prefer not to say”

 

Does this mean an increase in useful data? Before you start celebrating an 8-percentage-point decrease in non-response, take a closer look at the response distribution. You’ll notice that a whopping 58% of job seekers selected “prefer not to say”!

Typically, we treat the response option of “prefer not to say” as missing data. We don’t know if job seekers selected “prefer not to say” because they are in the process of finalizing an offer, or for some other reason, such as concern that their current employer might find out they have a competing offer. Either way, there is potential for response bias.

Response bias refers to a tendency to select a certain response (e.g., “prefer not to say”) due to social desirability or another pressure. Non-response bias, also known as participation bias, arises when people decline to respond to the survey at all because of social desirability or another pressure.

The example above shows response bias, because respondents may have selected “prefer not to say” due to the sensitive nature of the question. If the respondents hadn’t completed the survey at all due to the nature of the survey, we would have non-response bias.

This illustrates that the non-response rate (i.e., the percentage of people who didn’t respond to your survey) is not, by itself, an indicator of data quality.

For example, if your most recent survey has a 61% response rate while past surveys had a response rate of 80–90%, there’s probably enough rationale to look into potential problems associated with non-response rate. However, if your recent survey has a 4% response rate and past surveys had a response rate of 3–5%, it’s unlikely that there’s a non-response issue with your specific survey. Instead, perhaps your team’s strategy in how surveys are sent (e.g., collecting survey data by landlines versus mobile phones) or how participants are identified for your study (e.g., using outdated and/or incorrect contact information) is leading to low response rates overall.

Whether you have a 3% or a 61% response rate, a high response rate is not synonymous with low response bias. As we saw with the revised survey, even when the response rate was high, over half of the respondents still selected “prefer not to say,” a response that isn’t usable for data analysis.

In addition to paying attention to the number of people who responded to your survey, you also need to check the distribution of responses to each question in your survey. Simple frequency statistics are a great way to notice oddities and potential biases in your data.
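As a minimal sketch (with an invented DataFrame and column name), a one-line frequency table is often all it takes to spot this kind of skew:

import pandas as pd

# Invented survey responses; None marks people who never answered at all.
responses = pd.DataFrame({
    "cancel_reason": [
        "I found a job on Indeed", "Prefer not to say", "Prefer not to say",
        "I found a job elsewhere", None, "Other", "Prefer not to say",
    ]
})

# dropna=False keeps true non-response visible alongside "Prefer not to say".
print(responses["cancel_reason"].value_counts(dropna=False, normalize=True))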

Missing at random vs missing not at random

Non-response bias can be summarized as whether the missing data are random or non-random. Each possibility has different implications for analysis.

Worst-case scenario: MNAR

The worst-case scenario for missing data is if it’s missing not at random (MNAR). In these cases, the missingness can be correlated with at least one variable, and the missingness is likely due to the survey question itself (for example, because the question is sensitive). This indicates potential problems with the survey design.

For example, let’s say we ran a chi-square test on the job alert cancellation survey to examine the relationship between survey response (no response vs. responded) and current employment status (employed vs. unemployed). We might see the following findings:

A comparison of observed and expected survey results for employment status.

Detailed description of charts.

In the observed survey response results:

  • 9,623 users who did not respond were unemployed
  • 2,572 users who did not respond were employed
  • 46,731 users who did respond were unemployed
  • 13,474 users who did respond were employed

In the expected survey response results:

  • 9,856.69 users who did not respond were unemployed
  • 3,118.31 users who did not respond were employed
  • 46,009.31 users who did respond were unemployed
  • 14,555.69 users who did respond were employed

 

The above findings show a statistically significant relationship between responding to the job alert cancellation survey and the job seeker’s current employment status, with a test statistic and p-value of χ²(1) = 9.70 and p = 0.0018, respectively. Thus, significantly more unemployed job seekers responded to the job alert cancellation survey than would be expected at random.

This is an example of “missing not at random” because the survey question itself might have influenced how people chose to respond. Job seekers who are currently unemployed might be more inclined to respond to the job alert cancellation survey, because finding a job after a period of joblessness is a huge deal.
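For readers who want to run this kind of check themselves, here’s a minimal sketch using SciPy and the observed counts from the list above. With the default continuity correction for a 2×2 table, it reproduces a statistic of roughly χ²(1) = 9.70 and p ≈ 0.0018, and it also returns the expected counts under independence.

from scipy.stats import chi2_contingency

# Observed counts from the job alert cancellation example above:
# rows = (did not respond, responded), columns = (unemployed, employed).
observed = [
    [9_623, 2_572],
    [46_731, 13_474],
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")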

Best-case scenario: MAR

The best-case scenario for missing data is if it’s missing at random (MAR). In these cases, missingness can be correlated with at least one variable, and the missingness is not due to the survey question itself.

You might be thinking that I’m intentionally using jargon to confuse you…and I am! Just kidding: MNAR and MAR are commonly used among survey methodologists when discussing missing data. They live in the same world of jargon as Type I and Type II errors, and mediation and moderation.

For example, let’s imagine that we ran a chi-square test on the job alert cancellation survey results to examine the relationship between survey response (no response vs. responded) and the job seeker’s device type (desktop vs. mobile). This gives us a test statistic and p-value of χ²(1) = 75.57 and p < 0.0001, respectively. We might see the following findings:

A comparison of observed and expected results for device usage. The total number of desktop participants is 22,353 vs. 37,218.57, respectively.

Detailed description of charts.

In the observed survey response results:

  • 6,980 users who did not respond were mobile users
  • 5,995 users who did not respond were desktop users
  • 38,212 users who responded were mobile users
  • 22,353 users who responded were desktop users

In the expected survey response results:

  • 5,001.57 users who did not respond were mobile users
  • 7,973.43 users who did not respond were desktop users
  • 23,346.43 users who responded were mobile users
  • 37,218.57 users who responded were desktop users

 

 

The above findings show a statistically significant relationship between job seekers responding to the job alert cancellation survey and those job seekers’ devices. Significantly fewer job seekers on desktop computers responded to the job alert cancellation survey than expected.

However, we might also know from previous experience that more job seekers search for jobs on mobile devices than desktops. In that case, the missingness is likely attributable to device popularity and not to the survey question itself.

Additional scenarios for missing data

Additional scenarios for missing data include cases where the data are missing completely at random, and cases where the data are missing by design.

Missing completely at random (MCAR) refers to cases where the missingness is uncorrelated with all other variables. This type of missingness is typically impossible to validate for large and complex data sets like those found in web analytics. With large data sets, especially rapidly growing ones, the chance of finding some spurious but significant correlation is almost 100%.

Missing by design refers to cases where the missingness is intentional. For example, imagine a product change where job seekers are only presented with the job alert cancellation survey if they applied for a job on Indeed in the past 30 days. In this scenario, job seekers who haven’t applied for jobs in the past 30 days will never see the survey. Data will thus be missing by design based on the number of applies.

The challenge of addressing missing data

A core challenge of missing data is determining why it’s missing, and in particular which missingness mechanism you’re dealing with: MNAR or MAR. While it’s fairly easy to check for significant differences in the distribution of missing data, a p-value and confidence interval will not tell you why the data is missing.

Determining whether data are MNAR or MAR is a daunting task and relies heavily on assumptions. In the MAR example above, we assumed that the missingness was because users were more inclined to use the mobile version of Indeed than the desktop version. However, we only know this pattern exists because we’ve talked with people who noticed similar patterns in users preferring mobile over desktop. Without that knowledge we could very easily have misinterpreted the pattern.

Thankfully, there are strategies you can use to diagnose whether your data are MAR or MNAR.

To start, you can ask yourself:

Does the question ask people to reveal sensitive or socially undesirable behavior?

If it does, be aware that asking people to reveal sensitive information is more likely to cause your data to be MNAR rather than MAR. It might be possible to reduce the impact of such survey design by reassuring confidentiality and using other strategies to gain the trust of respondents.

If the question does not ask people to reveal sensitive information but you’re still concerned the missing data might be MNAR (the bad one), you can try other strategies. If you have longitudinal data from the respondents, you can check whether the non-response pattern you observe is consistent with previous responses at other time points. If the pattern replicates, you can at least say that your observations are not unusual.

Of course, just because the non-response pattern replicates doesn’t mean you’re in the clear to declare your data MAR rather than MNAR. If, for example, you’re asking people to report socially undesirable behavior, you’d likely see the same MNAR pattern over time.

If you don’t have access to longitudinal data, a second solution is to talk with people on your team or in your organization, or to look at papers from related research, to see if anyone else has observed similar patterns of non-response. Another Research 2.0 solution might be to crowdsource by reaching out to colleagues on Slack and other social media. There you might discover whether the non-response pattern you’re observing is typical or atypical.

This relatively simple yes/no logic isn’t perfect, but using the strategies above is still better than a head-in-the-sand “missing data never matters” approach.

Missing data isn’t always the end of the world

Not all missing data is inherently tied to response bias. It can be missing by design, missing completely at random (MCAR), missing not at random (MNAR), or missing at random (MAR). In the job alert cancellation survey, we saw how the survey design might lead to different scenarios of missingness.

Are you a data scientist or data aficionado who is also a critical thinker? If so, remember to take a deep dive into your missing data.

Suggested reading

De Leeuw, E. D. (2001). Reducing missing data in surveys: An overview of methods. Quality and Quantity, 35(2), 147–160. A concise article on missing data and response bias.

Kish, L. (1997). Survey sampling. Although this book is a bit dense, it’s a go-to resource for learning more about sampling bias.

Tourangeau, R., & Yan, T. (2007). Sensitive questions in surveys. Psychological bulletin, 133(5), 859.

About the author

For a bit of background about myself, I’m a University of Michigan Ph.D. graduate (Go Blue!) who recently transitioned to industry as a Quantitative UX Researcher at Indeed.

Feel free to message me if you want to chat about my transition from academia to industry or if you just want to muse about missing data 😉

Interested in joining Indeed? Check out our available opportunities.


[1] It’s worth acknowledging that the topic of non-response bias is an enormous field. Several textbooks and many dissertations are available on this topic. For a deeper understanding of the field, check out my suggested reading section above. This is designed to be an easy resource you can reference when you are dealing with missing data.


Missing Data—cross-posted on Medium.