Indeed Reads: October 2017

The Indeed Engineering blog lets us write about the technologies we get to work with every day. We talk about what works, what sometimes doesn’t work, and how we’re scaling our technologies as a fast-growing company. We participate in industry-wide conversations through this blog. We also read a lot of other tech blogs.

An open office lounge area with multiple people sitting down and using laptops or mobile phones.

Indeedians at the Austin Tech Campus

Here are a few interesting pieces recommended by Indeed engineers:

Engineering Personas

by Casey Rosenthal

We know that sales and marketing research often includes user personas for building software. This article looks at five engineering personas that also can help you with hiring decisions. These personas aren’t intended to define people, but they provide insight into the kinds of behaviors and characteristics that would most benefit your team.

Read post

 

Gray Failure

by Adrian Colyer

Working from the paper “Gray failure: the Achilles’ heel of cloud-scale systems” by Huang et al., Colyer explores the idea that breakdowns and performance anomalies in cloud-scale systems are often caused by gray failures: subtle failures that different components of a system perceive differently. Huang et al. describe these failures: “One entity is negatively affected by the failure and another entity does not perceive the failure.” Colyer takes this observation and asserts that when we fail, we should fail properly.

Read post

 

How To: S3 Multi-Region Replication (And Why You Should Care)

by Jessica Lucci

Indeed’s Jessica Lucci provides step-by-step instructions to create a replica set by combining Amazon’s S3, SNS, SQS and Lambda technologies. As Amazon only provides two-region bucket replication, deploying your own replica set allows you to replicate data across as many regions as you need. A custom replica set also allows you to customize replication behavior via your Lambda functions.
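
As a rough illustration of what a Lambda function in such a replica set has to work out, here is a minimal sketch in Python. It assumes one possible wiring, which is not necessarily the one in the post: S3 publishes object-created events to an SNS topic, the topic fans out to SQS queues, and the function receives the SQS message body. The bucket names, regions, and function names are hypothetical, and the actual copy call is left out.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical replica buckets keyed by region; not taken from the post.
REPLICA_BUCKETS = {
    "us-west-2": "example-data-us-west-2",
    "eu-west-1": "example-data-eu-west-1",
}


def objects_from_sqs_body(body):
    """Yield (bucket, key) pairs from an SQS message body.

    Assumes the body is an SNS notification envelope whose "Message"
    field is an S3 event notification serialized as JSON.
    """
    s3_event = json.loads(json.loads(body)["Message"])
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def plan_copies(body):
    """List the copies needed so every other replica bucket gets the object."""
    return [
        {"source_bucket": bucket, "key": key,
         "dest_region": region, "dest_bucket": dest}
        for bucket, key in objects_from_sqs_body(body)
        for region, dest in REPLICA_BUCKETS.items()
        if dest != bucket  # never copy an object onto itself
    ]
```

Keeping the planning step separate from the copy step is one way to make the replication behavior easy to customize and to test without AWS credentials.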

Read post

 

The Pitfalls of A/B Testing in Social Networks

by Brenton McMenamin

A/B testing can be the perfect tool for most testing situations. This method helps determine whether a feature positively affects user behavior in a given test group. But if the purpose of your site is to get users to connect with as many other users as possible, like on the dating site OkCupid, keeping the feature away from your control group can lead to unreliable results. Tests for user-to-user features, such as chat or video, need interaction with as many people as possible to verify effectiveness. When A/B testing involves social interactions, it’s highly likely you’ll need to rethink and restructure your experiment methodology.

Read post

 

How to Do Code Reviews Like a Human (Part One)

by Michael Lynch

Code reviews are more than a technical process used to help identify bugs; they’re also social interactions that, when done mindfully, can improve the process itself. Delivering critical feedback is rarely easy, and when the feedback is focused on a technical subject, our instinct can often be to provide technical responses without considering the human interaction involved. But a human wrote the code that’s being reviewed, and the best way to iterate with that human is to follow guidelines that help you to be constructive and avoid miscommunications.

Read post

 

Floating Point Visually Explained

by Fabien Sanglard

Floating-point arithmetic is one of the trickier concepts in computer programming. This datatype has been called both esoteric and essential. To better understand floating-point numbers, Sanglard proposes visualizing the sections of the mathematical notation of a floating-point number (sign, exponent, and mantissa). Then, instead of an exponent, Sanglard proposes thinking of a window between two consecutive powers of two, and instead of a mantissa, thinking of an offset within that window. Redefining the components of a floating-point number in a more visual way provides a clearer path to explaining how this datatype works.
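
The window-and-offset view can be tried out in a few lines of Python. The sketch below is our own illustration, not code from Sanglard's post. It assumes IEEE 754 single precision (1 sign bit, 8 exponent bits, 23 mantissa bits) and handles only normal numbers. The names `WindowView`, `window_view`, and `rebuild` are hypothetical.

```python
import struct
from typing import NamedTuple


class WindowView(NamedTuple):
    sign: int            # 0 for positive, 1 for negative
    window_start: float  # lower power of two, 2**(E - 127)
    window_end: float    # the next power of two
    offset: int          # mantissa: bucket index in the window, 0 .. 2**23 - 1


def window_view(x):
    """Split a number, stored as a 32-bit float, into sign, window, and offset."""
    # Pack as single precision, then reinterpret the 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent in (0, 255):
        raise ValueError("zero, subnormal, infinite, and NaN values are out of scope")
    start = 2.0 ** (exponent - 127)
    return WindowView(sign, start, 2.0 * start, mantissa)


def rebuild(view):
    """Recover the value: window start plus the offset's share of the window."""
    width = view.window_end - view.window_start
    magnitude = view.window_start + (view.offset / 2**23) * width
    return -magnitude if view.sign else magnitude
```

For example, `window_view(6.5)` places the value in the window from 4.0 to 8.0 with offset 5242880, which is 62.5% of the way through the window's 2**23 buckets.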

Read post

Indeed at PyData Seattle 2017

Indeed was a proud sponsor of PyData Seattle 2017, an international conference promoting the use of open source data analysis tools for the Python community, such as Pandas, Matplotlib, IPython and Project Jupyter.

Indeed Data Scientists presented two tutorials at the conference:

  • Using Pandas for analyzing structured time series data
  • Using open source natural language processing (NLP) libraries for analyzing unstructured text

This post introduces their presentations and includes links to videos and tutorials for you to try the exercises yourself.

Joe McCarthy illustrated how to use tools in the Pandas data analysis library to investigate unevenly spaced time series data from The Simpsons. This type of data analysis focuses more on the intervals between events than on the frequency of events occurring within regularly spaced intervals. At Indeed, one example of such a task is estimating how long it takes a recruiter to review a resume (or profile), based on the gaps in timestamps of initial profile disposition events.

Joe’s tutorial focused on a collection of data about episodes, characters, locations and scripts from The Simpsons. This collection is one of many data sets available at data.world.
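
To make the interval-based idea concrete, here is a minimal sketch of a gap-based estimate like the resume review example above. It uses only the Python standard library rather than the Pandas tools from the tutorial. The timestamps are invented, and the 10-minute cutoff for ignoring breaks is an arbitrary assumption.

```python
from datetime import datetime
from statistics import median


def gaps_in_seconds(timestamps):
    """Return the gaps between consecutive events, in seconds."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def typical_gap(timestamps, max_gap_seconds=600.0):
    """Median gap, ignoring gaps long enough to look like a break."""
    gaps = [g for g in gaps_in_seconds(timestamps) if g <= max_gap_seconds]
    return median(gaps) if gaps else None


# Invented event times for one reviewer on one day.
events = [datetime.fromisoformat(t) for t in (
    "2017-07-05T09:00:00",
    "2017-07-05T09:00:40",
    "2017-07-05T09:01:50",
    "2017-07-05T09:02:20",
    "2017-07-05T13:30:00",  # first event after a long break
    "2017-07-05T13:31:00",
)]

print(typical_gap(events))  # prints 50.0
```

The median is used here instead of the mean so that a few unusually long or short gaps do not dominate the estimate.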

Video 1. D’oh! Unevenly spaced time series analysis of The Simpsons in Pandas

Alex Thomas demonstrated how to use open source NLP tools such as the Natural Language Toolkit (NLTK) and word_cloud for vocabulary analysis of job descriptions. His tutorial covered basic NLP techniques such as tokenization, stemming and lemmatization in the context of analyzing job descriptions posted on Indeed. Other techniques include the use of stop words, multi-word phrases (n-grams) and the TF-IDF statistic for estimating the relevance of documents.

Alex highlighted challenges in processing text and some interesting and often-unanticipated problems in interpreting the results of applying each of these techniques.
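
Several of these techniques can be sketched with the Python standard library alone. The example below is an illustration and not code from the tutorial: it covers tokenization, stop words, bigrams, and one common TF-IDF variant (term frequency times the natural log of the inverse document frequency). Stemming and lemmatization are left out because they need a library such as NLTK. The stop-word list and the function names are made up for this sketch.

```python
import math
import re
from collections import Counter

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase, split into word-like tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bigrams(tokens):
    """Join each pair of neighboring tokens into a two-word phrase."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tf_idf(documents):
    """Return one {term: score} dict per document."""
    term_counts = [Counter(tokenize(doc)) for doc in documents]
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(documents)
    scores = []
    for counts in term_counts:
        total = sum(counts.values())
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

Note that `bigrams(tokenize("Python developer with experience"))` includes the phrase "developer experience", which never appears in the original text because a stop word was removed between the two words. This is a small example of the kind of surprise that can arise when techniques are combined.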

Video 2. Vocabulary Analysis of Job Descriptions

Exercises and Jupyter Notebooks for both Indeed tutorials are on GitHub at pydata-simpsons and pydata-vocab-analysis. For more PyData conference presentations, check out their YouTube channel.

Indeed at Litmus Live 2017: How to Run a Successful Email Workshop

Lindsay Brothers

Indeed is proud to announce that Lindsay Brothers will be speaking at Litmus Live in Boston on August 3, 2017. Lindsay is a Product Manager at Indeed. Her team sends billions of job alert emails every month to job seekers around the world.

As the world’s #1 job site, Indeed communicates with job seekers around the globe. A unified email strategy allows us to effectively understand how, when, and why we should email job seekers. To develop this strategy, we built an email planning workshop to share ideas and come to consensus quickly. During this workshop, we created a job seeker’s Bill of Rights and brought users onsite for feedback and validation. Lindsay’s session will cover workshop details and offer takeaways for anyone developing a similar email strategy.

Litmus Live brings together email marketers for two days of real-world advice, best practices, and key takeaways. Free from product pitches and hype, Litmus Live is all about content: teaching designers, developers, marketers, and strategists how to create emails that look great, perform well, and engage audiences.

If you’re at Litmus Live Boston this year, join Lindsay to learn more about Indeed!

Speakers converse on stage at the Litmus Live Email Design Conference

Litmus Live: The Email Design Conference

Indeed is hiring talented Sales, Product, Marketing and Engineering minds from Toronto to Tokyo and beyond. Find out more about opportunities to work at one of our 24 global offices.