Indeed + Hacktoberfest 2020: By The Numbers

logo for Hacktoberfest 2020

Indeed + Hacktoberfest 2020 is in the books! We’re thrilled to share our results.

External focus

As a Hacktoberfest Community Partner, we engaged directly with the external community.

  • 1 external landing page
  • 1 case study
  • 6 supported open source projects tagged with the ‘hacktoberfest’ label
  • 11 virtual office hours
  • 437 commits into our supported repos

Internal focus

To build on our strong base of internal contributors, we focused on flexibility.

  • 29 virtual study halls hosted in 4 time zones
  • 11 open source ambassadors with weekly check-ins
  • 65 new open source participants
  • 100 total Hacktoberfest participants
  • 2,229 activities—pull requests opened, issues filed, comments posted, and code reviews conducted
  • 328 submitted pull requests 

Our focus is on open source sustainability. To help us understand what this means, we use the oceanic ecosystem as a model.

The ocean requires clean water, all sizes of fish, reefs for congregating, critters to clean up, and plankton to let bigger animals thrive. Similarly, our open source ecosystem is varied and interrelated. We support projects of all sizes, host events where contributors can congregate, and emphasize cleanup work that helps projects thrive. Our objective with this mindful approach: release open source projects that benefit interested adopters and contributors.

September was Prep-tember

September was a busy month of preparation. Indeed’s Open Source Program Office (OSPO) identified six projects to work with and promote during Hacktoberfest. Our qualifying criteria for the projects: Indeed actively uses the project and at least one of the project’s maintainers is an Indeed employee.

Our program office shared the Hacktoberfest guidelines with project maintainers. We asked them to tag repos and issues within their projects that they wanted help with. We requested a three-day turnaround time for responding to comments and newly opened pull requests (PRs). We then worked with maintainers to schedule and publicize open office hours. Manager buy-in was crucial, so we worked with maintainers and their managers to dedicate time to Hacktoberfest during the work week.

Engaging with the external community

Office hours and timely PR merges helped us make sure that the experience of Hacktoberfest participants was positive.

The maintainers scheduled multiple office hours. These were times during which anyone, Indeed employee or not, could join a video call and ask project-specific questions. Our program office coordinated the publicity through the Hacktoberfest Event Board, Indeed’s Hacktoberfest landing page, and each project’s readme.md page.

Expanding our internal reach

Virtual study halls—internal office hours that were not project specific—allowed us to help as many Indeed employees as possible. Instead of standing meeting times, the program office and our open source ambassadors hosted these events on an as-needed basis, resulting in more than one study hall every workday in October.

We invited mentors and mentees to our new mentorship program. We paired people by time zone and by their experience with open source, which ranged from brand new, to needing help finding issues, to needing guidance on closing a “reach” issue that would expand their technical capabilities.

The study hall events and mentorship programs were great. It felt like there was an involved community and lots of support and encouragement throughout the month. —Technical Business Analyst

Leveraging Indeed’s open source projects

For Hacktoberfest, we leveraged our existing tools to share open issues in projects that Indeed depends on. First, we used Mariner (open sourced by our OSPO in 2019) to identify beginner-friendly issues recently opened in open source projects. For 2020, we open sourced Mariner Issue Collector, a version of Mariner that runs as a GitHub Action. Since August 2019, we’ve been using Mariner output to produce a weekly internal blog post highlighting contribution opportunities for everyone at Indeed.

We generated the list of Indeed employees who participated in Hacktoberfest using Starfish (open sourced by our program office in 2019). We used Starfish because it gives us accurate contributions over a period of time, no matter the date on which we receive a GitHub ID. We also use Starfish to compile the list of employees who are eligible to vote in our FOSS Contributor Fund.

Encouraging open source sustainability

We’re happy with the great results from Hacktoberfest 2020. We can only reach our open source sustainability goals if we create and maintain a habit of using and contributing to open source projects. Events like Hacktoberfest help us motivate and inspire Indeedians to get involved in supporting the open source software they use every day. One way we measure program success is by counting the number of people who contribute to open source on two or more days throughout the quarter. We refer to this metric as active recurring participants (ARPs). Compared to previous months, we saw an increase of over 200% in ARPs in October.
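
To make the ARP definition concrete, here is a minimal sketch in Java that counts people who contributed on two or more distinct days within a reporting period. The `Contribution` class, the sample handles, and the dates are hypothetical and exist only for illustration; this is not how Starfish or our reporting is actually implemented.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ArpCounter {

    /** Hypothetical shape of one contribution event (a PR, issue, comment, or review). */
    static final class Contribution {
        final String contributor;
        final LocalDate day;

        Contribution(String contributor, LocalDate day) {
            this.contributor = contributor;
            this.day = day;
        }
    }

    /** Counts contributors who were active on at least two distinct days in [start, end]. */
    static long countActiveRecurringParticipants(
            List<Contribution> contributions, LocalDate start, LocalDate end) {
        Map<String, Set<LocalDate>> daysByContributor = new HashMap<>();
        for (Contribution c : contributions) {
            if (c.day.isBefore(start) || c.day.isAfter(end)) {
                continue; // outside the reporting period
            }
            daysByContributor
                    .computeIfAbsent(c.contributor, k -> new HashSet<>())
                    .add(c.day);
        }
        return daysByContributor.values().stream()
                .filter(days -> days.size() >= 2)
                .count();
    }

    public static void main(String[] args) {
        List<Contribution> sample = List.of(
                new Contribution("alice", LocalDate.of(2020, 10, 1)),
                new Contribution("alice", LocalDate.of(2020, 10, 15)),
                new Contribution("bob", LocalDate.of(2020, 10, 2)),
                new Contribution("bob", LocalDate.of(2020, 10, 2)),   // same day, counted once
                new Contribution("carol", LocalDate.of(2020, 9, 30)), // before the period starts
                new Contribution("carol", LocalDate.of(2020, 11, 3)));
        long arps = countActiveRecurringParticipants(
                sample, LocalDate.of(2020, 10, 1), LocalDate.of(2020, 12, 31));
        System.out.println("ARPs: " + arps); // prints "ARPs: 1" (only alice qualifies)
    }
}
```

Counting distinct days rather than raw activity means that several pull requests opened on the same day count once, which is what separates a recurring habit from a single burst of activity.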

I’ve had it as a goal for, let’s be honest, years to commit to OSS [open source]. And I went from 0 to 5 PRs this week. So thanks for the motivation and support to finally get me committing to open source. —Senior Quality Assurance Automation Engineer

To build on our Hacktoberfest 2020 momentum, we’re continuing to post open issues in projects that Indeed depends on. We’ve surveyed our 100 participants so that we can meet Indeed employees where they are.

We believe the best time to start to contribute to open source is now. And the best open source ecosystem is sustainable.


Indeed + Hacktoberfest 2020—cross-posted on Medium.

k8dash: Indeed’s Open Source Kubernetes Dashboard

So you’ve got your first Kubernetes (also known as k8s) cluster up and running. Congratulations! Now, how do you operate the thing? Deployments, replica sets, stateful sets, pods, ingress, oh my! Getting Kubernetes running can feel like a big enough challenge in and of itself, but does day two of operations need to be just as much of a challenge?

Kubernetes is an amazing but complex system. The learning curve can be steep. Plus, the standard Kubernetes dashboard has limited features. Another option is kubectl, which is extremely powerful but also a power user tool. Even if you become a kubectl wizard, can you expect everyone in your organization to do the same? And with kubectl it’s difficult to gain visibility into the general health and performance of the entire cluster all at once.

Enter k8dash—pronounced Kate Dash (/kāt,daSH/)—Indeed’s open source Kubernetes dashboard.

k8dash deployment dashboard

Since k8dash’s release in March of 2019, it’s received over 625 GitHub stars and been downloaded from Docker Hub over 1 million times. k8dash is a key component of Kubernetes operations for many organizations.

In May of 2020, the Indeed Engineering organization adopted the k8dash project. We’re excited about the visibility this brings to the project.

Benefits of managing your Kubernetes cluster with k8dash

Here are a few of the benefits of k8dash.

Quick installation

Low operational complexity is a core tenet of the k8dash project. As such, you can install k8dash quickly with a couple dozen lines of YAML. k8dash runs only a single service. No databases or caches are required. This extends to AuthN/AuthZ via OIDC. If you’re already using OIDC to secure your cluster, k8dash makes extending this to your dashboards easy: configure 2-3 environment variables and you’re up and running. No special authenticating proxies or other complicated configurations are required.

Cluster visualization and management

k8dash helps you understand the current status of all of your cluster’s moving parts: namespaces, nodes, pods, deployments. Real-time charts show poorly performing resources. The intuitive interface removes much of the behind-the-scenes complexity and helps flatten your Kubernetes learning curve.

You can manage your cluster components via the dashboard and leverage k8dash’s YAML editor to edit resources. k8dash uses the Kubernetes API and provides context-aware API docs. With k8dash you can view pod logs and even SSH into a running pod via a terminal in your browser, with a single click.

k8dash also integrates with Metrics Server, letting you visualize CPU/RAM usage. This visualization helps you understand how well your services are running. As Kubernetes simplifies the complexity of running hundreds or even thousands of microservices across an abstract compute pool, it brings the promise of improved resource utilization through bin packing. However, for many organizations this promise goes unrealized because it can be difficult to know which services are over- or under-provisioned. k8dash’s intuitive UI takes the guesswork out of understanding how well services are provisioned.

k8dash visualization of CPU/RAM usage for easier display of the provision of services

Real-time dashboard

Because k8dash is a real-time dashboard, you don’t need to refresh pages to see the current state of your cluster. Instead, you can watch charts, graphs, and tables update in real time as you roll out a deployment. You can watch as nodes are added to and removed from your cluster, and know as soon as new nodes are fully available. Or, simply monitor a stream of Kubernetes cluster-wide events as they happen.

Because k8dash is mobile optimized, you can monitor—and even modify—your cluster on the go. If you’re getting paged about a troublesome pod just as your movie is about to start, with k8dash you can restart the pod directly from your phone!

The k8dash project: How to contribute

k8dash is made up of a lightweight server and a client, and we’re always looking for core contributors.

The server—built in Node.js and weighing in at ~150 LOC—is predominantly a proxy for the front end to the Kubernetes API server. The k8dash server makes heavy use of the following npm packages:

  • express (web server)
  • http-proxy-middleware (proxies requests to Kubernetes API)
  • @kubernetes/client-node (official Kubernetes npm module, used to discover the Kubernetes API location)
  • openid-client (fantastic npm module for OIDC)

The client is a React.js application (using create-react-app) with minimal additional dependencies.

If you would like to contribute, see the list of issues in GitHub.


About the author

Eric Herbrandson is a staff software engineer and member of the site reliability engineering team at Indeed. Eric has used orchestration frameworks, including ECS, Heroku, Docker Swarm, HashiCorp’s Nomad, DC/OS (Marathon/Mesos), and Kubernetes. While finding Kubernetes to be the clear winner in the orchestration space, Eric recognized that existing visualization options were lacking compared to other frameworks. In an effort to better understand the Kubernetes API and to create a solution that contained all of the features he needed, Eric developed k8dash over a three-week period.

k8dash—cross-posted on Medium.

Jackson: A Growing User Base Presents New Challenges

Jackson is a mature and feature-rich open source project that we use, support, and contribute to here at Indeed. In my previous post, I introduced Jackson’s core competency as a JSON library for Java. I went on to describe the additional data formats, data types, and JVM languages Jackson supports. In this post, I present the challenges resulting from Jackson’s expansion, from my perspective as Jackson’s creator and primary maintainer. I’ll also share our plans to address these challenges as a community.

A running river surrounded by lush trees at Cape Flattery in Washington.

“The other side of Cape Flattery” by Tatu Saloranta

Growing pains

Over its 13 years, the Jackson project added a lot of new functionality with the help of a large community of contributors. Users report issues and developers contribute fixes, new features, and even new modules. For example, the Java 8 date/time datatype module—which supports Java 8 date/time types, java.time.*, specified in JSR-310—was developed concurrently with the specification itself by a member of the community.

Because of its ever-expanding functionality and user base, Jackson is experiencing growing pains and could use help with:

  • Improving the documentation
  • Enhancing the project’s website and branding
  • Managing and prioritizing large-scale changes and features
  • Refining the details of reported issues
  • Testing the compatibility of new Jackson versions against downstream dependencies

Documentation’s structure and organization

The availability, accessibility, and freshness of documentation is always a challenge for widely used libraries and frameworks. Jackson is no exception. It may even suffer from more documentation debt than other libraries with a comparable user base.

The problem is not so much a lack of documentation per se. A lot of content exists already:

  • 3rd-party tutorials
  • Jackson project repos
  • StackOverflow

Jackson also has a Javadoc reference that documents most classes of interest. However, Jackson is a vast library. The Javadoc reference can seem overwhelming, especially to new users. And the Javadoc reference does not include usage examples.

Our first priority is creating How-To guides that walk library consumers through the most typical use cases, such as:

  • How to convert a data structure (for example, JSON) to match Java objects.
  • How to work with other data formats, such as XML.
  • How to use Jackson within frameworks, such as Spring/Spring Boot, DropWizard, JAX-RS/Jersey, Lombok, and Immutables.

We also want to improve the Javadoc reference. Some classes and methods have no descriptions, while descriptions for others are incomplete. In addition to the auto-generated Javadoc reference, we publish manually created wiki pages that detail the most used features and objects. We’d like to find ways to auto-update these wiki pages.

To implement these modifications, we need the following documentation structure, tooling, and process changes:

  • A superstructure for adding inline content and links to external content
  • Access permissions for contributors to add and correct documentation
  • A documentation feedback mechanism to help focus improvement and maintenance efforts
  • Documentation templates to make it easier to add information

Project website and branding

Even though the project is 13 years old and widely used, the main project home page still has the stock GitHub project look. This repo contains a lot of helpful information about Jackson, but it doesn’t have a distinct style, brand, or logo.

For Hacktoberfest 2020, we created an issue for a Jackson logo design. Examples we like:

Managing large-scale changes

We appreciate the healthy stream of code contributions—mostly to fix reported issues, but for new features as well.

The most valuable type of contribution for the Jackson project has historically been user bug reports filed as GitHub issues. We get a dozen issues each week, leading to fixes and quality improvements. Without these reports, Jackson would not be half as good as it is. The new feature requests we receive regularly are another valuable source of improvements. New feature requests also help us prioritize new development work.

Better management of larger changes and features is an area for improvement. These efforts take a long time to execute and benefit from careful planning and collaboration. While it’s possible to handle bigger changes under a single issue, this can get unwieldy quickly.

The Jackson project could use help with setting the priority of new feature requests.

  • Which issues should we focus on improving?
  • Can we gauge the consensus of the whole community instead of relying on a small number of active and vocal participants? Doing so would help keep “squeaky wheel” problems from getting undue priority.
  • What’s the best way to obtain early feedback on API and implementation plans? Existing mailing lists, issue trackers, and chat all have their challenges.

To address the need for more community feedback, I introduced the idea of “mini-specifications” called Jackson STrategic Enhancement Plan: JSTEP. Similar to KIPs in Kafka, JSTEPs are intended to foster collaboration. So far this has had limited success, partly due to the limitations of the tool used: GitHub wiki pages do not have the same feature set as GitHub issues or Google documents.

I also started keeping a personal but public todo or work-in-progress (WIP) list—Jackson WIP—as a GitHub wiki page. I wanted a low-maintenance mechanism for me to track and share my own near-term plans.

Refining reported issues, collaboratively

As mentioned above, bug reports have historically been the most useful type of feedback. But while useful, managing reported issues and new feature requests takes a lot of effort. This has become especially challenging now that the data formats, data types, JVM languages, and user base have grown.

Basic issue triage has become time consuming and includes the following steps:

  • Validating the reported issues.
  • Determining if the reported issues resulted from documentation that is missing or unclear.
  • Asking for more information.

The triage time takes away from the limited time we have to work on functionality and documentation. It can also slow down responses to issue reports, which in turn leads to a poor reporter experience.

We would like to find ways to train and include volunteer issue triagers. They can help with refining issues in a timely manner and improving user engagement by:

  • Adding labels
  • Asking questions to get more information
  • Adding findings of their own
  • Getting module and domain experts engaged where applicable

Starting with jackson-databind issues, we improved the workflow and refined GitHub issues by creating:

  • Issue templates to help users submit better issue reports
  • The “to-evaluate” label—set automatically for new issues
  • The “most-wanted” label, which indicates issues that have consistently come up on mailing lists or been up-voted using the GitHub thumbs-up emoji

If these changes are successful, we’ll propagate them to other project repos. Similarly, we can make other incremental changes along these lines.
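
The labeling rules above could be sketched roughly as follows. This is an illustration only: the `Issue` class, the `evaluated` flag, and the thumbs-up threshold of 5 are my assumptions for the example, not the actual automation used in the jackson-databind repo, and mailing-list mentions are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

class IssueLabeler {

    /** Hypothetical, simplified view of a GitHub issue. */
    static final class Issue {
        final int number;
        final int thumbsUp;      // count of thumbs-up reactions
        final boolean evaluated; // true once a triager has looked at the issue

        Issue(int number, int thumbsUp, boolean evaluated) {
            this.number = number;
            this.thumbsUp = thumbsUp;
            this.evaluated = evaluated;
        }
    }

    /** Assumed cut-off for "most-wanted"; a real project would tune this. */
    static final int MOST_WANTED_THRESHOLD = 5;

    /** Returns the labels this sketch would apply to the given issue. */
    static List<String> labelsFor(Issue issue) {
        List<String> labels = new ArrayList<>();
        if (!issue.evaluated) {
            labels.add("to-evaluate"); // new issues start here until triaged
        }
        if (issue.thumbsUp >= MOST_WANTED_THRESHOLD) {
            labels.add("most-wanted"); // heavily up-voted by the community
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(labelsFor(new Issue(1, 0, false))); // [to-evaluate]
        System.out.println(labelsFor(new Issue(2, 7, true)));  // [most-wanted]
        System.out.println(labelsFor(new Issue(3, 5, false))); // [to-evaluate, most-wanted]
    }
}
```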

Compatibility testing of new versions with dependencies

The unit test coverage in the Jackson project’s code is reasonable, and coverage for core components is quite good. There’s even a nascent effort to write more inter-component functional tests in the jackson-integration-tests repo. However, there is currently a sizable gap in testing the compatibility between the minor Jackson version that is under development and downstream dependencies, such as frameworks like Spring Boot.

One challenge is the difference between the development lifecycles of Jackson and the libraries and frameworks that use Jackson. Existing libraries and frameworks depend on the stable Jackson minor version, currently 2.11, and test everything only against that version. Their maintainers report any issues they find, and we may fix those issues in a patch release, like 2.11.1.

Simultaneously, the Jackson project is developing a new minor version, 2.12, which will be released and published as version 2.12.0. Working on versions 2.11.1 and 2.12 concurrently should not be an issue. We use the standard semantic versioning scheme for the Jackson public API, which should guard against creating new issues.
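
As a refresher on what the scheme promises, the following sketch classifies the change between two `major.minor.patch` version strings. The `SemVerChange` class and its methods are hypothetical and not part of Jackson, and qualifiers such as `-SNAPSHOT` are simply dropped before comparison.

```java
class SemVerChange {

    enum Kind { MAJOR, MINOR, PATCH, NONE }

    /** Parses "2.11.1" or "2.12.0-SNAPSHOT" into {major, minor, patch}. */
    static int[] parse(String version) {
        String core = version.split("-", 2)[0]; // drop qualifiers such as -SNAPSHOT
        String[] parts = core.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.patch: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    /** Reports the most significant version component that differs. */
    static Kind classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return Kind.MAJOR;
        if (a[1] != b[1]) return Kind.MINOR;
        if (a[2] != b[2]) return Kind.PATCH;
        return Kind.NONE;
    }

    public static void main(String[] args) {
        System.out.println(classify("2.11.0", "2.11.1"));          // PATCH
        System.out.println(classify("2.11.2", "2.12.0-SNAPSHOT")); // MINOR
    }
}
```

Under semantic versioning, only a MAJOR change is allowed to break the public API; a MINOR change may add functionality, and a PATCH change should contain only backward-compatible fixes.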

In practice, standard semantic versioning cannot protect against all actual issues. Semantic versioning is used for Jackson’s public API, but internal APIs across core components do not have the same level of backwards compatibility. In addition, while Jackson maintainers focus on compatibility of the published and documented API, users tend to rely on the API’s actual observed behavior rather than its intended behavior. Sometimes what maintainers consider a bug or unexpected behavior, users consider an actual feature or the API working as expected.

That said, we can catch and address such issues during the development cycle of the new version if we test downstream systems using the under-development SNAPSHOT version of Jackson. At the time of writing this article, the current version of Spring Boot ships with a dependency on Jackson 2.11.2, the latest patch version. We can create an alternative set of Spring Boot unit and integration tests that instead rely on Jackson 2.12.0-SNAPSHOT. The snapshot version is published regularly, and the Jackson project provides automatic snapshot builds.

Such cross-project integration tests would benefit all involved projects. They would prevent the release of some (or perhaps most) regressions, reducing the need for patch versions. Over time they should also significantly improve compatibility across Jackson versions, closing the gap between expected and actual behavior of the API.

How to get involved

Jackson’s growing user base and features have presented us with documentation, branding, issue triaging, issue prioritization, and code testing challenges. I’ve shared some of the steps we’ve taken to address them. You can help us by implementing some of these solutions or suggesting alternatives. Together we can lay a foundation for Jackson’s future growth. For contribution opportunities, visit Jackson Hacktoberfest 2020. To find out more, join us on Gitter chat or sign up for the jackson-dev mailing list.


About the author

Tatu Saloranta is a staff software engineer at Indeed, leading the team that is integrating the next-generation continuous deployment system. Outside of Indeed, he is best known for his open source activities. Under the handle @cowtowncoder, he has authored many popular open source Java libraries, such as Jackson, Woodstox, lzf compression codec, and Java ClassMate. For a complete list see FasterXML Org.

Jackson’s Growing User Base—cross-posted on Medium.