Sunday, May 25, 2008

[Reading] Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

This paper presents a new method for mining frequent patterns. The main technique is a new data structure, FP tree, for the compact representation of the database. A FP tree can compactly represent all frequent patterns (no information loss) and enable fast query (by traversing the tree and using some range pointers).

There are always some detailed examples in describing the data structure and mining methods, usually appear before the proofs of the lemmas. This makes the paper very easy to understand. Also all pseudo codes are provides and thus it should be simple to implement the algorithm. Even the FP tree always greatly compresses the database. the authors still provide a projection method to reduce the required memory.

Because this method makes no approximation to the problem, the results are identical to the ground truth, and obtained much faster than the previous methods. In summary, this is a very exciting work !

No comments: