Patterns that Matter

News and updates

  • 01.11.2016 Our paper titled Towards Data Driven Process Control in Manufacturing Car Body Parts, with Bas van Stein, Hao Wang, Stephan Purr, Sebastian Kreissl, Josef Meinhardt, and Thomas Bäck, got accepted at IEEE CSCI-ISBD 2016.
  • 11.10.2016 Our paper titled Local Subspace-Based Outlier Detection using Global Neighbourhoods, with Bas van Stein and Thomas Bäck, got accepted at IEEE BigData 2016. Congratulations Bas!
  • 27.09.2016 Our paper titled Evolving the Structure of Evolution Strategies, with Sander van Rijn, Hao Wang, and Thomas Bäck, got accepted at IEEE SSCI 2016. Congratulations Sander!
  • 01.09.2016 NEW JOB! I am now an assistant professor of Data Science at the Leiden Institute of Advanced Computer Science (LIACS).
  • 28.06.2016 Our paper titled Expect the Unexpected - On the Significance of Subgroups, with Antti Ukkonen, got accepted at DS 2016. Update: Slides now available!
  • 01.06.2016 Hugo Proença has started as a PhD student in the SAPPAO project, in collaboration with IIT Roorkee and GE Aviation. He will work on pattern mining for flight data. Welcome Hugo!
  • 30.05.2016 Our paper titled Simultaneous discovery of cancer subtypes and subtype features by molecular data integration, with Thanh Le Van et al., got accepted at Bioinformatics. Congratulations Thanh!
  • 01.05.2016 Sander van Rijn has started as a PhD student in the DAMIOSO project, in collaboration with Honda Research. He will work on simulation data mining. Welcome Sander!
  • 10.03.2016 IDEA 2016, our (full-day!) workshop on Interactive Data Exploration and Analytics, got accepted at KDD 2016.

I am an assistant professor of Data Science at the Leiden Institute of Advanced Computer Science (LIACS) at Leiden University, where I participate in the Leiden Data Science research programme. My main interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and ultimately novel knowledge?

The approach I take is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter strongly depends on the data and task at hand, hence defining the problem is one of the key challenges of exploratory data mining. I often use pattern-based modelling techniques, for which information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful. I am also interested in interactive data mining, i.e., involving humans in the loop.

Finally, I find it very interesting to do fundamental data mining for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation). There is no better way to show the potential of exploratory data mining than by demonstrating that patterns matter.



Activities

  • Training: "Statistics". Ministry of Infrastructure and the Environment, October 24, The Hague, the Netherlands.
  • Invited talk: "Expect the Unexpected - On the Significance of Subgroups". SSDM workshop @ ECML PKDD 2016. September 19, Riva del Garda, Italy.
  • Workshop and Tutorial Co-Chair: ECML PKDD 2016.
  • Invited talk: "Big Data: Hit or Hype?". ECMA Congress 2016. September 15, Antibes, France.
  • Co-Chair: IDEA 2016, workshop on Interactive Data Exploration and Analytics at KDD 2016.
  • Invited talk: "Big Data: Hype of Hit?". Verenigingscongres De Nederlandse Associatie (DNA). June 3, Noordwijkerhout.
  • Guest lecture: "Big Data: Opportunity or Risk?". Honours College Social and Behavioural Sciences. March 7, Leiden.
  • Guest lecture: "MDL for Pattern Mining". Advanced Data Mining at Antwerp University. March 1, Antwerp.
  • Guest lecture: "Information Theoretic Methods in Data Mining". SIKS course on Foundations of Data Science; Data and Process mining. December 14, Utrecht.
  • Invited talk: "Patterns that Matter". Science meets Business café. December 10, Leiden.


Selected recent publications

2016
van Stein, B., van Leeuwen, M., Wang, H., Purr, S., Kreissl, S., Meinhardt, J. & Bäck, T. Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), 2016.
van Stein, B., van Leeuwen, M. & Bäck, T. Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), 2016.
van Rijn, S., Wang, H., van Leeuwen, M. & Bäck, T. Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), 2016.
Le Van, T., van Leeuwen, M., Fierro, A.C., De Maeyer, D., Van den Eynden, J., Verbeke, L., De Raedt, L., Marchal, K. & Nijssen, S. Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. In: Bioinformatics, vol.32(17), 2016.
van Leeuwen, M., De Bie, T., Spyropoulou, E. & Mesnage, C. Subjective Interestingness of Subgraph Patterns. In: Machine Learning, vol.105(1), 2016.
Chau, D.H., Vreeken, J., van Leeuwen, M., Shahaf, D. & Faloutsos, C. (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA 2016), 2016.
van Leeuwen, M. & Ukkonen, A. Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), 2016.
van Leeuwen, M. & Galbrun, E. Association Discovery in Two-View Data (extended abstract). In: TKDE Poster Track of ICDE 2016, 2016.
Copmans, D., Meinl, T., Dietz, C., van Leeuwen, M., Ortmann, J., Berthold, M.R. & de Witte, P.A.M. A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. In: Journal of Biomolecular Screening, vol.21(5), 2016.
2015
van Leeuwen, M. & Galbrun, E. Association Discovery in Two-View Data. In: Transactions on Knowledge and Data Engineering, vol.27(12), 2015.
Fromont, E., De Bie, T. & van Leeuwen, M. (eds) Advances in Intelligent Data Analysis XIV (proceedings of IDA 2015), LNCS 9385, Springer, 2015.
Aksehirli, E., Nijssen, S., van Leeuwen, M. & Goethals, B. Finding Subspace Clusters using Ranked Neighborhoods. In: Workshop proceedings of ICDM 2015 (HDM workshop), 2015.
Chau, D.H., Vreeken, J., van Leeuwen, M., Shahaf, D. & Faloutsos, C. (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA 2015), 2015.
van Leeuwen, M. & Cardinaels, L. VIPER - Visual Pattern Explorer. Demo paper at: ECML PKDD 2015, 2015.
Paramonov, S., van Leeuwen, M., Denecker, M. & De Raedt, L. An exercise in declarative modeling for relational query mining. In: Proceedings of the 25th International Conference On Inductive Logic Programming (ILP'15), 2015.
Le Van, T., van Leeuwen, M., Nijssen, S. & De Raedt, L. Rank Matrix Factorisation. In: Proceedings of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), 2015.
van Leeuwen, M. & Ukkonen, A. Same bang, fewer bucks: efficient discovery of the cost-influence skyline. In: Proceedings of the SIAM Conference on Data Mining 2015 (SDM'15), 2015.