Algorithm:Streaming

From Algorithm Lab

Jump to: navigation, search

Finding Frequent Elements

  • Frequent element set K (|K| < 1/θ (0<θ<1)) can be computed with O(1/θ) memory and O(1) operations per occurrence in the worst case
  • usage : DDoS attack for detecting IP address, search query analysis
  • ruby sourcecode
 queries = ["test", "test", "more", "less", "hoge", "hoge", "less", "test"]
 theta = 2
 k = Array.new
 count = Hash.new
 queries.each { |val|
   if k.include?(val) then
     count[val] = count[val] + 1
   else
     k.push(val)
     count[val] = 1
   end
   if k.size > theta then
     k.delete_if { |v|
       (count[v] - 1) == 0
     }
   end
 }
 p k 

output :  ["test", "hoge"]

Karp, R. M.; Papadimitriou, C. H.; Shenker, S. (2003), "A simple algorithm for finding frequent elements in streams and bags", ACM Transactions on Database Systems 28 (1): 51–55 [1]

Personal tools