All posts categorized in: Big Data

Vectorized VByte Decoding: High Performance Vector Instructions

Posted on March 9, 2015 by Jeff Plaisance

Data-driven organizations like Indeed need great tools. We built Imhotep, our interactive data analytics platform (released last year), to manage the parallel execution of queries. To balance memory efficiency and performance in Imhotep, we developed a technique called vectorized variable-byte (VByte) decoding. VByte with differential decoding: Many applications use VByte and differential encoding to compress […]


Memory Mapping with util-mmap

Posted on February 25, 2015 by Preetha Appan

We are excited to highlight the open-source availability of util-mmap, a memory mapping library for Java. It provides an efficient mechanism for accessing large files. Our analytics platform Imhotep (released last year) uses it for managing data access. Why use memory mapping? Our backend services handle large data sets, like LSM trees and Lucene indexes. […]


Serving over 1 billion documents per day with Docstore v2

Posted on October 24, 2013 by Julie Scully

[Editor’s note: This post is the second installment of a two-part piece accompanying our first @IndeedEng talk.] The number of job searches on Indeed grew at an extremely rapid rate during our first 6 years. We made multiple improvements to our document serving architecture to keep pace with that growing load. A core focus at […]




