Wouter Duivesteijn

Hi, I'm Wouter, and I'm incurably curious. I am an Assistant Professor in Data Mining at the Technische Universiteit Eindhoven. My research revolves around Exceptional Model Mining (EMM): a local pattern mining method that seeks subgroups (subsets of the dataset) that are interesting, meaning that they satisfy two conditions. On the one hand, they must be interpretable: we must be able to describe each subgroup succinctly, so that the knowledge it represents becomes actionable. On the other hand, they must be exceptional: they must display some kind of behavior that sets them apart from the overall population. The scientific challenges revolve around how to search for subgroups efficiently, and how to express exceptional behavior such that the subgroups we find are meaningful.

CV (last updated: September 08, 2023)


Work-life rebalancing

Soon* I will be on paternity leave, which means I will be out of the office for two months straight. Before and after the paternity leave, I will be on parental leave every Friday. So my working week will be Monday-Thursday, except during my paternity leave, when I am out of the office altogether. Please keep these things in mind when getting in touch. You may of course send me emails at any time you want, as long as we agree that I will not be reading them while I'm out of the office.

*It is impossible to predict exactly when a baby will arrive, so I cannot be more precise. I expect my paternity leave to cover roughly all of July and August. I'll update this when I know more details.


Latest publications

You can find my full publication list sorted by category here, and sorted by year here.

How to Cheat the Page Limit

I am one of the Proceedings Chairs of ECMLPKDD 2024. I am reprising this role from previous editions: for the 2022 edition, we wrote a report on our findings, updating our 2020 paper on How to Cheat the Page Limit.

Sound of Science #16

Flemish comedian and science fanatic Lieven Scheire presents a podcast at TU/e, called Sound of Science. In this podcast, he discusses the latest scientific discoveries and the role of technology in society with researchers and students. In Episode 16, he and I talked about my work (in Dutch). NLP researchers are invited to use this episode as a training set to let their algorithms distinguish a Vlaams from a Rotterdams accent in the Dutch language.

Gelfilter dataset

In addition to a new data mining method, our IDA 2020 paper also introduces a dataset, publicly available for research purposes, free of charge. It is a run-to-failure time series library, where the primary task is to predict the Remaining Useful Life of a gel filter in a chemical plant. You can find more details and the dataset itself here.