
Hyper Log Log

Uploaded by

damrudadakijay

Advanced Algorithms

Class: T.Y. B.Tech.

An adaptation of various ebooks and online resources for educational purposes
Unit III

Probabilistic Data Structures: LogLog and HyperLogLog

Probabilistic Data Structure

• Problem: Devise an approximation algorithm that finds the number of distinct users connected to your website.

• Design: Reduce the memory footprint

• Applications: Redis, Facebook, Google

Solution 1

• Brute Force:
  • Store connection data
  • Select distinct userid from connection where status="Active"

• Performance:
  • Space complexity O(n)
  • Time complexity O(n)
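As a rough illustration of the brute-force approach, the sketch below scans stored connection records and collects the distinct user ids whose status is "Active". The `Connection` record and the sample data are hypothetical, introduced only for this example; the slide itself only gives the query in SQL-like form.

```python
from typing import Iterable, NamedTuple


class Connection(NamedTuple):
    """Hypothetical connection record (not from the slides)."""
    userid: str
    status: str


def count_distinct_active(connections: Iterable[Connection]) -> int:
    """Exact distinct count: O(n) time for the scan, O(n) space for the set."""
    seen = set()
    for conn in connections:
        if conn.status == "Active":
            seen.add(conn.userid)
    return len(seen)


connections = [
    Connection("alice", "Active"),
    Connection("bob", "Active"),
    Connection("alice", "Active"),   # duplicate user, counted once
    Connection("carol", "Closed"),   # not active, ignored
]
print(count_distinct_active(connections))  # 2
```

The memory used grows with the number of distinct users, which is the cost the later slides try to avoid.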

Solution 2

• Performance:
  • Space complexity O(n)
  • Time complexity O(1)

• Problem: Distributed environment

HyperLogLog

• Design: Do we really need a fully accurate count? Let's approximate the count using probability.

• Solution: HyperLogLog

HyperLogLog

• Solution: HyperLogLog



HyperLogLog

• Solution: HyperLogLog

• Drawbacks:
  • It only estimates powers of two
  • Too dependent on luck
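Both drawbacks can be seen in a small sketch of a single-counter estimator. This assumes the usual construction, which the surviving slide text does not spell out: hash each user id, remember the longest run of leading zero bits `k` seen so far, and report `2 ** k`. The SHA-1 based `hash32` helper and the 32-bit width are illustrative choices, not taken from the slides.

```python
import hashlib

HASH_BITS = 32  # assumed hash width for this sketch


def hash32(item: str) -> int:
    """Illustrative 32-bit hash: the first 4 bytes of SHA-1."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def leading_zeros(value: int, width: int) -> int:
    """Number of leading zero bits of value in a fixed-width word."""
    return width - value.bit_length()


def naive_estimate(items) -> int:
    """Estimate the distinct count as 2 ** (longest run of leading zeros)."""
    max_zeros = 0
    for item in items:
        max_zeros = max(max_zeros, leading_zeros(hash32(item), HASH_BITS))
    return 2 ** max_zeros


estimate = naive_estimate(f"user{i}" for i in range(1000))
print(estimate)  # always some power of two, whatever the true count is
```

Because the result is `2 ** k`, it can never land between two powers of two, and one item whose hash happens to start with many zeros raises the estimate for the whole stream.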

HyperLogLog

• Solution: HyperLogLog (use bucketing)
  • Use the first 2 bits of each hash to choose a bucket
  • At the end, each bucket will have its own count
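The bucketing step can be sketched as follows, assuming a 32-bit hash whose first 2 bits select one of 4 buckets and whose remaining 30 bits supply the leading-zero count. The `hash32` helper, the bit widths, and the function names are assumptions made for this example.

```python
import hashlib
from typing import Iterable, List, Tuple

HASH_BITS = 32                   # assumed hash width
BUCKET_BITS = 2                  # "first 2 bits" from the slide
NUM_BUCKETS = 1 << BUCKET_BITS   # 4 buckets


def hash32(item: str) -> int:
    """Illustrative 32-bit hash: the first 4 bytes of SHA-1."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def bucket_and_zeros(h: int) -> Tuple[int, int]:
    """Split a hash into (bucket index, leading zeros of the remaining bits)."""
    rest_bits = HASH_BITS - BUCKET_BITS
    bucket = h >> rest_bits                 # top 2 bits
    rest = h & ((1 << rest_bits) - 1)       # low 30 bits
    return bucket, rest_bits - rest.bit_length()


def fill_buckets(items: Iterable[str]) -> List[int]:
    """Each bucket keeps the largest leading-zero count it has seen."""
    buckets = [0] * NUM_BUCKETS
    for item in items:
        bucket, zeros = bucket_and_zeros(hash32(item))
        buckets[bucket] = max(buckets[bucket], zeros)
    return buckets


print(fill_buckets(f"user{i}" for i in range(1000)))  # four per-bucket counts
```

With several independent buckets, one lucky hash only inflates a single bucket rather than the whole estimate.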

HyperLogLog

• Solution: HyperLogLog

HyperLogLog
• The final estimate will have a bias
• Multiply by a constant of about 0.79 to remove it
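A minimal sketch of the bias-corrected estimate, assuming the slide's 0.79 is the rounded constant 0.79402 commonly quoted for the LogLog estimator when each bucket stores a leading-zero count. The formula used here, constant × number of buckets × 2 raised to the average bucket value, is the standard LogLog form and is an assumption about what the lost slide image showed. The constant is an asymptotic value, so it is only approximate for a handful of buckets.

```python
from typing import Sequence

LOGLOG_CONSTANT = 0.79402  # assumed full value of the slide's "0.79"


def loglog_estimate(buckets: Sequence[int]) -> float:
    """Bias-corrected LogLog-style estimate from per-bucket zero counts."""
    m = len(buckets)
    mean_zeros = sum(buckets) / m
    return LOGLOG_CONSTANT * m * 2 ** mean_zeros


# Four buckets that each saw at most 3 leading zeros:
# 0.79402 * 4 * 2**3, which is about 25.4
print(loglog_estimate([3, 3, 3, 3]))
```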

HyperLogLog
• This formula is too sensitive to outliers in the mean
• So, use the harmonic mean
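To see why the harmonic mean helps, the sketch below compares two ways of combining per-bucket values when one bucket is an outlier. It assumes each bucket stores a leading-zero count `z` and contributes a per-bucket estimate of `2 ** z`. The bias constant is left as a caller-supplied parameter because its value depends on the number of buckets and on how the bucket values are defined; the function names are hypothetical.

```python
from typing import Sequence


def averaged_exponent_estimate(buckets: Sequence[int]) -> float:
    """LogLog style: 2 raised to the arithmetic mean of the bucket values."""
    return 2 ** (sum(buckets) / len(buckets))


def harmonic_mean_estimate(buckets: Sequence[int]) -> float:
    """HyperLogLog style: harmonic mean of the per-bucket estimates 2 ** z."""
    return len(buckets) / sum(2.0 ** -z for z in buckets)


def raw_estimate(buckets: Sequence[int], bias_constant: float) -> float:
    """Scale the harmonic mean by the bucket count and a bias constant."""
    return bias_constant * len(buckets) * harmonic_mean_estimate(buckets)


buckets = [3, 3, 3, 10]  # one bucket got a "lucky" hash
print(averaged_exponent_estimate(buckets))  # about 26.9
print(harmonic_mean_estimate(buckets))      # about 10.6
```

Three of the four buckets suggest a per-bucket value of 8. The harmonic mean stays close to that, while the averaged exponent is pulled far upward by the single outlier.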

HyperLogLog

• Solution: HyperLogLog
