Wouter Duivesteijn
Hi, I'm Wouter, and I'm incurably curious.
I am an Assistant Professor in Data Mining at the
Technische Universiteit Eindhoven.
My research revolves around Exceptional Model Mining (EMM): a local pattern
mining method that seeks subsets of the dataset, called subgroups, that are
interesting. A subgroup is interesting if it satisfies two conditions.
On the one hand, it must be interpretable: we must be able to describe the subgroup succinctly, so that the knowledge it represents becomes actionable. On the other hand, it must be exceptional: it must display some kind of behavior that sets it apart from the overall population. The scientific challenges revolve around how to search for subgroups efficiently, and how to express exceptional behavior such that the subgroups we find are meaningful.
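To make the two conditions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation used in my papers: the toy housing data, the attribute names, and the helper functions `pearson`, `covers`, `quality`, and `exceptional_subgroups` are all invented for this example. Interpretability is modeled by restricting descriptions to short conjunctions of attribute = value conditions; exceptionality is modeled by how far the correlation between two targets inside the subgroup deviates from the correlation in the whole dataset. The search is a brute-force enumeration, which only works on tiny data; realistic settings call for heuristic search strategies.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: two descriptors (region, type) and two targets (size, price).
DATA = [
    {"region": "north", "type": "flat",  "size": 50,  "price": 100},
    {"region": "north", "type": "house", "size": 60,  "price": 120},
    {"region": "north", "type": "flat",  "size": 70,  "price": 140},
    {"region": "north", "type": "house", "size": 80,  "price": 160},
    {"region": "north", "type": "flat",  "size": 90,  "price": 180},
    {"region": "north", "type": "house", "size": 100, "price": 200},
    {"region": "south", "type": "flat",  "size": 50,  "price": 150},
    {"region": "south", "type": "flat",  "size": 70,  "price": 130},
    {"region": "south", "type": "house", "size": 90,  "price": 110},
]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def covers(description, record):
    """A description is a conjunction of (attribute, value) conditions."""
    return all(record[attr] == value for attr, value in description)


def quality(description, data, x, y, min_size=3):
    """Exceptionality: absolute difference between the x-y correlation in the
    subgroup and in the whole dataset. Subgroups that are too small score 0."""
    subgroup = [r for r in data if covers(description, r)]
    if len(subgroup) < min_size:
        return 0.0
    rho_all = pearson([r[x] for r in data], [r[y] for r in data])
    rho_sub = pearson([r[x] for r in subgroup], [r[y] for r in subgroup])
    return abs(rho_sub - rho_all)


def exceptional_subgroups(data, descriptors, x, y, max_conditions=2, top_k=3):
    """Enumerate all descriptions with at most max_conditions conditions
    (interpretability) and rank them by quality (exceptionality)."""
    conditions = sorted({(a, r[a]) for r in data for a in descriptors})
    candidates = []
    for depth in range(1, max_conditions + 1):
        for desc in combinations(conditions, depth):
            # Skip descriptions that constrain the same attribute twice.
            if len({attr for attr, _ in desc}) < depth:
                continue
            candidates.append((quality(desc, data, x, y), desc))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]


if __name__ == "__main__":
    for score, desc in exceptional_subgroups(DATA, ["region", "type"], "size", "price"):
        print(" AND ".join(f"{a} = {v}" for a, v in desc), round(score, 3))
```

In this toy data, price rises with size overall, but falls with size in the south. The single-condition description region = south therefore ranks first: it is both short enough to act on and clearly different from the population.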
CV (last updated: March 27, 2023)
Now Hiring!
I have an opening for a fully
funded 5-year PhD TA position on the topic of Efficient Algorithms
for Exceptional Model Mining on Time-Varying Data. You can read more about the
research project in this document.
How to Cheat the Page Limit: the 2022 Update
I am one of the Proceedings Chairs of ECML PKDD 2022. In this role, we wrote
a report on our findings, updating our paper on
How to Cheat the Page Limit.
Sound of Science #16
Flemish comedian and science fanatic Lieven Scheire presents a podcast at TU/e,
called Sound of Science. In this podcast, he discusses the latest scientific discoveries
and the role of technology in society with researchers and students. In
Episode 16,
he and I talked about my work (in Dutch). NLP researchers are invited to
use this episode as a training set to let their algorithms distinguish a
Flemish (Vlaams) accent from a Rotterdam (Rotterdams) accent in Dutch.
Latest publications
- R.F.A. Verhaegh, J.J.E. Kiezebrink, F. Nusteling, A.W.A. Rio,
M.B. Bendiscek, W. Duivesteijn, R.M. Schouten: A Clustering-inspired
Quality Measure for Exceptional Preferences Mining - Design Choices and
Consequences. In: Proceedings of the 25th International Conference on
Discovery Science (DS 2022), pp. 429-444.
- J.F. van der Haar, S.C. Nagelkerken, I.G. Smit, K. van Straaten, J.A. Tack,
R.M. Schouten, W. Duivesteijn: Efficient Subgroup Discovery Through
Auto-Encoding. In: Proceedings of the 20th International Symposium
on Intelligent Data Analysis (IDA 2022), pp. 327-340.
- R.M. Schouten, W. Duivesteijn, M. Pechenizkiy: Exceptional Model Mining
for Repeated Cross-Sectional Data (EMM-RCS). In: Proceedings of
the 2022 SIAM International Conference on Data Mining
(SDM 2022), pp. 585-593.
A substantially longer version appeared on Figshare.
- R.M. Schouten, M.L.P. Bueno, W. Duivesteijn, M. Pechenizkiy:
Mining Sequences with Exceptional
Transition Behaviour of Varying Order using Quality Measures based
on Information-Theoretic Scoring Functions. In: Data Mining
and Knowledge Discovery 36(1), pp. 379-413, 2022.
You can find my full publication list
sorted by category here, and
sorted by year here.
Gelfilter dataset
In addition to a new data mining method, our IDA 2020 paper also introduces a
dataset, publicly available for research
purposes free of charge. It is a run-to-failure time series library,
where the primary task is to predict the Remaining Useful Life of a gel
filter in a chemical plant. You can find more details and the dataset itself
here.
Contact information
Conferences
Miscellaneous