Significant Pattern Mining (Westfall-Young Light)

 

Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt
Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing (SIGKDD 2015)

 

Summary

In this project, we developed an approach that improves the statistical power of significant pattern mining by using permutation testing.

Significant pattern mining algorithms must deal with a vast search space, often containing billions or even trillions of candidate patterns. However, these patterns are often heavily interrelated, resulting in pronounced statistical redundancies. Existing approaches either (1) ignore these redundancies, leading to overly conservative significance thresholds and a loss of statistical power, or (2) are computationally demanding in both runtime and memory usage, which limits their applicability to small datasets.

Here, we proposed a novel, fast and memory-efficient permutation testing algorithm for significant pattern mining that overcomes both limitations.
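To make the idea of permutation testing concrete, the sketch below shows a naive Westfall-Young style procedure on a tiny, explicitly enumerated set of binary patterns. It is not the fast and memory-efficient algorithm from the paper: it recomputes every p-value for every permutation, which is exactly the cost the paper aims to avoid. The function names (`fisher_pvalue`, `westfall_young_threshold`), the toy data, and the choice of a two-sided Fisher exact test are assumptions made for illustration only.

```python
import math
import random


def fisher_pvalue(a, x, n1, n):
    """Two-sided Fisher exact p-value for a 2x2 contingency table.

    a:  samples that contain the pattern and belong to the positive class
    x:  samples that contain the pattern
    n1: size of the positive class
    n:  total number of samples
    """
    denom = math.comb(n, x)

    def prob(k):
        # Hypergeometric probability of seeing k positives among the x occurrences.
        return math.comb(n1, k) * math.comb(n - n1, x - k) / denom

    lo, hi = max(0, x - (n - n1)), min(x, n1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at most as likely as the observed one.
    tail = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


def westfall_young_threshold(patterns, labels, alpha=0.05, n_perm=1000, seed=0):
    """Naive permutation-based significance threshold (illustrative sketch).

    For each random permutation of the class labels, record the smallest
    p-value over all patterns. The returned threshold is the largest recorded
    value whose empirical family-wise error rate does not exceed alpha
    (0.0 if no such value exists).
    """
    rng = random.Random(seed)
    n, n1 = len(labels), sum(labels)
    supports = [sum(p) for p in patterns]
    perm = list(labels)
    min_pvalues = []
    for _ in range(n_perm):
        rng.shuffle(perm)
        min_pvalues.append(
            min(
                fisher_pvalue(sum(o & y for o, y in zip(p, perm)), x, n1, n)
                for p, x in zip(patterns, supports)
            )
        )
    threshold = 0.0
    for v in sorted(set(min_pvalues)):
        if sum(m <= v for m in min_pvalues) / n_perm <= alpha:
            threshold = v
        else:
            break
    return threshold


# Toy data (made up): 12 samples, 6 per class; the first two patterns are
# near-duplicates, mimicking the redundancy between related patterns.
labels = [1] * 6 + [0] * 6
patterns = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]
print("corrected threshold:", westfall_young_threshold(patterns, labels))
```

Because the minimum p-value is taken jointly over all patterns under each permutation, strongly correlated patterns tend to reach their extremes together, so the resulting threshold reflects the dependence between patterns rather than treating every test as independent.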

Code

A beta version of the code is available in our GitHub repository.

Publication

Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt
Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2015), 2015, 725-734
Online  |  ETH Research Collection  |  Project page  |  GitHub

    

Contact felipe.llinares@bsse.ethz.ch for questions regarding usage or reporting bugs.
