Estimate Set Cardinality Using Druid and DataSketches on Petabytes of Data

DevconTLV X Conference, Tuesday, November 15, 2016, 16:30

At eXelate we need to present to our clients the number of unique users who meet a given criterion. The condition is typically a set-theoretic expression over a stream of events for a given time range. Historically, we used Elasticsearch to answer these types of questions, but we encountered major scaling issues. In this presentation we will detail the journey of researching, benchmarking, and productionizing a new technology, Druid with DataSketches, to overcome the limitations we were facing.

Yakir is an architect at eXelate, the leading DMP and data exchange company.

His fields of interest are big data solutions and large-scale machine learning.

