Distinct Elements Problem

Post on 07-Feb-2016

TRANSCRIPT

  • Ariel Rosenfeld

  • Input: a stream of m integers i_1, i_2, ..., i_m, each in {1, ..., n}.

    Output: the number of distinct elements in the stream.

    Example: count the number of distinct IP addresses you encounter.

  • Bit vector of size n (mark 1 when encountered).

    Keep all m integers and answer naively.

    Sort and count.

    O(min{n, m log m})
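
The two exact approaches above can be sketched as follows. This is a minimal illustration, not the presenter's code; the function names are hypothetical, and a `bytearray` (one byte per flag) stands in for a true bit vector.

```python
def count_distinct_bitvector(stream, n):
    """Exact count using one flag per possible value (values assumed in 1..n)."""
    seen = bytearray(n + 1)  # index 0 unused; a real bit vector would pack 8 flags per byte
    count = 0
    for x in stream:
        if not seen[x]:
            seen[x] = 1  # mark 1 when encountered
            count += 1
    return count


def count_distinct_sorted(stream):
    """Exact count by keeping all m items, sorting them, and counting value changes."""
    items = sorted(stream)
    count = 0
    previous = None
    for x in items:
        if count == 0 or x != previous:
            count += 1
            previous = x
    return count
```

Both functions return the same answer; they differ in whether memory scales with the universe size n or with the stream length m.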

  • A deterministic exact algorithm is impossible using o(n) bits.

    A deterministic approximation algorithm for this problem providing a (1 ± 1/1000)-approximation using o(n) bits is also impossible.

  • Pick a random hash function h : [n] -> [0, 1].

    Calculate z = min over all i in the stream of h(i).

    Output 1/z - 1.

  • Equal integers get the same hash value.

    We will show that the output is a good approximation.

  • This is idealized for 2 reasons:

    1. We don't have perfect precision.

    2. We need at least n bits to remember the randomness associated with every i.

    Let's ignore this for now.
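
A minimal sketch of the idealized estimator, under stated assumptions: the truly random hash function is simulated by memoizing one uniform draw per distinct integer in a dictionary, and floating-point numbers stand in for perfect precision. The function name and the `rng` parameter are hypothetical.

```python
import random


def idealized_fm(stream, rng):
    """Idealized estimator: output 1/z - 1, where z is the smallest hash value seen."""
    hash_values = {}  # memoized "random function": equal integers get equal values
    z = 1.0
    for i in stream:
        if i not in hash_values:
            hash_values[i] = rng.random()  # uniform draw from [0, 1)
        z = min(z, hash_values[i])
    # Assumes z > 0; an exact 0.0 draw is possible in principle but vanishingly rare.
    return 1.0 / z - 1.0
```

The dictionary stores one value per distinct element, which is exactly the memory cost the idealization sets aside. Repeated elements never change the output, because they reuse the stored value.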

  • S = {j_1, ..., j_t} (unique elements in the stream)

    h(j_1), ..., h(j_t) = X_1, ..., X_t are independent random variables from Unif[0, 1]

    Z = min_i X_i

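  • For each X_i ~ Unif[0, 1]: density p(x) = 1 for 0 <= x <= 1 (and 0 otherwise); CDF F(x) = x for 0 <= x <= 1, with F(x) = 0 for x < 0 and F(x) = 1 for x > 1.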

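  • E[Z] = 1/(t + 1).

    E[Z^2] = 2/((t + 1)(t + 2)).

    (HW)

    We get a bounded variance: Var[Z] = E[Z^2] - (E[Z])^2 = t/((t + 1)^2 (t + 2)) < (E[Z])^2.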
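
The moments of Z can be sanity-checked numerically. Below is a minimal simulation sketch that compares against the standard values E[Z] = 1/(t + 1) and Var[Z] = t/((t + 1)^2 (t + 2)) for the minimum of t independent uniforms. The function name and parameters are hypothetical.

```python
import random


def simulate_min_moments(t, trials, seed=0):
    """Estimate the mean and variance of Z = min of t independent Unif[0, 1) draws."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        z = min(rng.random() for _ in range(t))
        total += z
        total_sq += z * z
    mean = total / trials
    variance = total_sq / trials - mean * mean
    return mean, variance


mean, variance = simulate_min_moments(t=9, trials=20000)
print(mean, 1 / (9 + 1))               # empirical mean vs 1/(t + 1)
print(variance, 9 / (10 ** 2 * 11))    # empirical variance vs t/((t + 1)^2 (t + 2))
```

The variance stays below (E[Z])^2, so the standard deviation of Z is of the same order as its mean. This is why a single copy of the estimator is too noisy on its own.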

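  • Average the minima of q independent copies: as q increases, the variance of the average shrinks by a factor of q, so by Chebyshev's inequality we get a better approximation.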
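
A sketch of the averaging step, assuming q denotes the number of independent copies of the idealized estimator whose minima are averaged before inverting. As in the idealized setting, each copy simulates a truly random hash function with a memoized dictionary. The function name is hypothetical.

```python
import random


def averaged_fm(stream, q, seed=0):
    """Average the minima of q independent idealized copies, then output 1/z_bar - 1."""
    rng = random.Random(seed)
    tables = [{} for _ in range(q)]  # one memoized random hash function per copy
    minima = [1.0] * q
    for i in stream:
        for c in range(q):
            if i not in tables[c]:
                tables[c][i] = rng.random()
            minima[c] = min(minima[c], tables[c][i])
    z_bar = sum(minima) / q  # variance is 1/q times that of a single minimum
    return 1.0 / z_bar - 1.0


stream = list(range(200)) * 2  # 200 distinct elements, each seen twice
print(averaged_fm(stream, q=400))
```

With q = 400 the relative standard deviation of the averaged minimum is roughly 1/sqrt(q) = 5%, so the printed estimate should land near 200.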

  • We want a hash function that doesn't need n bits or more to represent.

    So we will use k-wise independent hash families H: each function in H can be represented using a small number of bits (log |H|). (Covered in lecture.)

  • An example: set q > k a prime power, and define H_{poly,k} to be the set of all polynomials of degree at most (k - 1) in F_q[x]. H_{poly,k} is a k-wise independent family.

    Size: q^k. Needs: k log q bits.
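
A minimal sketch of the polynomial family, restricted to the special case where q is a prime p so that arithmetic in F_p is plain modular arithmetic (a general prime power would need a finite-field implementation). The names `eval_poly` and `sample_poly_hash` are hypothetical.

```python
import random


def eval_poly(coeffs, x, p):
    """Evaluate coeffs[0] + coeffs[1]*x + ... + coeffs[k-1]*x^(k-1) mod p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def sample_poly_hash(k, p, rng):
    """Draw a uniformly random polynomial of degree at most k - 1 over F_p (p prime)."""
    coeffs = [rng.randrange(p) for _ in range(k)]  # k coefficients, about k log p bits

    def h(x):
        return eval_poly(coeffs, x, p)

    return h


h = sample_poly_hash(k=4, p=101, rng=random.Random(1))
print([h(x) for x in range(5)])  # hash values in {0, ..., 100}
```

Only the k coefficients are stored, which matches the k log q bit count. Dividing h(x) by p would give an approximate value in [0, 1) if the hash is to feed a minimum-based estimator; that scaling is an assumption of this sketch, not something stated in the slides.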
