Wouter Duivesteijn

Hi, I'm Wouter, and I'm incurably curious. Currently, I am an Assistant Professor in Data Mining at the Technische Universiteit Eindhoven. Previously, I was employed as a Postdoctoraal Bursaal on the FORSIED project, at the Universiteit Gent, after a brief visit to the University of Bristol to work on the same project as a Research Associate. Before that, I worked as a Wissenschaftlicher Mitarbeiter in the Collaborative Research Center SFB 876 at the Technische Universität Dortmund. Even earlier, I was employed as a PhD candidate in the Data Mining group of LIACS, Leiden University.

On this website you can find:
Contact information
Research and publications
Miscellaneous


Contact information

Name: Wouter Duivesteijn
Postal address: Data Mining group, W&I, TU/e
P.O. Box 513
5600 MB Eindhoven
the Netherlands
Visiting address: MetaForum 7.145
Tel.: +31 40 247 4008
E-mail: w.duivesteijn@tue.nl (please do not send me MS Word documents. I have no problem with you using Word, but if you do, please export your document to PDF before sending it to me.)

CV (last updated: May 2, 2017)


Research and publications

During my PhD candidacy, I worked on the Exceptional Model Mining (EMM) project, funded by NWO. EMM is a framework that can be seen as a generalisation of Subgroup Discovery (SD). Both SD and EMM attempt to find small portions of the data where the observed behaviour is notably different from that of the database as a whole. In SD, 'behaviour' is traditionally interpreted in terms of the distribution of a single nominal variable. EMM, on the other hand, seeks subgroups for which a fitted local model is surprisingly different from a global model. In this approach, 'behaviour' is described by a number of attributes, and fitting a model captures multivariate dependencies between these attributes.
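To make the idea concrete, here is a minimal sketch in Python, assuming a toy dataset and a very simple model class: the correlation between two target attributes. The quality of a candidate subgroup is taken to be the absolute difference between the correlation inside the subgroup and the correlation on the whole dataset. The attribute names (`region`, `x`, `y`), the candidate descriptions, and the function names are invented for this illustration; this is not the implementation used in the publications listed below.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined here; treat it as "no relation"
    return cov / (var_x * var_y) ** 0.5


def correlation_quality(records, condition, x="x", y="y"):
    """Absolute difference between the subgroup and the global correlation."""
    subgroup = [r for r in records if condition(r)]
    if len(subgroup) < 2:
        return 0.0  # too small to fit a local model
    global_r = pearson([r[x] for r in records], [r[y] for r in records])
    local_r = pearson([r[x] for r in subgroup], [r[y] for r in subgroup])
    return abs(local_r - global_r)


# Toy data (hypothetical): x and y are the targets, region is a descriptor.
records = [
    {"region": "north", "x": 1, "y": 2},
    {"region": "north", "x": 2, "y": 4},
    {"region": "north", "x": 3, "y": 6},
    {"region": "south", "x": 1, "y": 6},
    {"region": "south", "x": 2, "y": 4},
    {"region": "south", "x": 3, "y": 2},
]

# Candidate subgroup descriptions: conditions on single attributes.
candidates = {
    "region = north": lambda r: r["region"] == "north",
    "region = south": lambda r: r["region"] == "south",
    "x >= 2": lambda r: r["x"] >= 2,
}

# Rank the candidates by how exceptional their local model is.
ranked = sorted(
    candidates,
    key=lambda d: correlation_quality(records, candidates[d]),
    reverse=True,
)
for description in ranked:
    print(description, correlation_quality(records, candidates[description]))
```

On this toy data, x and y are uncorrelated globally, while each region on its own shows a perfect (positive or negative) relation, so the two region-based subgroups rank above the one defined on x. A real EMM run would search a much larger space of descriptions and would typically also account for subgroup size.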

On September 17, 2013, I publicly defended my dissertation, titled "Exceptional Model Mining", together with the accompanying theorems. The dissertation you currently find on this website has twelve small errors corrected relative to the version Leiden University offers, which is also available in book form. A list of errata can be found here. The dissertation was awarded the C.J. Kokprijs for the best dissertation of 2013 at the Faculty of Science of Leiden University. The prize has been awarded since 1971; I am the first computer scientist to win it.

Currently, my research interests revolve around SD and EMM in a somewhat wider scope. On the one hand, I am interested in combining concepts from Subgroup Discovery and ROC Analysis. On the other hand, I am interested in SD and EMM on streams. Initially, this can take the relatively simple form of extracting a meaningful flat table representation of features from the stream, and observing the results that can be obtained by running out-of-the-box SD and EMM algorithms on that table. Eventually, though, I strive to develop a full-fledged algorithm that mines for interesting subgroups directly on the data stream.
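As a rough sketch of that first, simple step, the Python snippet below turns a numeric stream into a flat table by summarising non-overlapping windows. The window size, the chosen summary features, and the function name `windowed_rows` are assumptions made for this illustration only; which features are actually meaningful depends on the stream at hand.

```python
from statistics import mean


def windowed_rows(stream, size):
    """Yield one flat-table row (a dict) per full, non-overlapping window."""
    window = []
    for value in stream:
        window.append(value)
        if len(window) == size:
            yield {
                "mean": mean(window),
                "min": min(window),
                "max": max(window),
                "trend": window[-1] - window[0],  # last value minus first value
            }
            window = []
    # An incomplete trailing window is dropped in this sketch.


# Hypothetical stream of sensor readings.
table = list(windowed_rows([1, 2, 3, 10, 20, 30, 5], size=3))
for row in table:
    print(row)
```

Each row of the resulting table can then be fed to an off-the-shelf SD or EMM algorithm, with some of the columns acting as descriptors and others as targets.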

List of publications

Journal publications
  1. L. Downar, W. Duivesteijn: Exceptionally Monotone Models - the Rank Correlation Model Class for Exceptional Model Mining. In: Knowledge and Information Systems 51 (2), pp. 369-394, 2017.
    Impact factor: 1.702 (2015).
  2. C. Pölitz, W. Duivesteijn, K. Morik: Interpretable Domain Adaptation via Optimization over the Stiefel Manifold. In: Machine Learning 104 (2-3), pp. 315-336, 2016.
    Impact factor: 1.719 (2015).
  3. W. Duivesteijn: Correction to Jin-Ting Zhang's "Approximate and Asymptotic Distributions of Chi-Squared-Type Mixtures with Applications". In: Journal of the American Statistical Association 111 (515), pp. 1370-1371, 2016.
    Impact factor: 1.979 (2015).
  4. W. Duivesteijn, A.J. Feelders, A. Knobbe: Exceptional Model Mining - Supervised Descriptive Local Pattern Mining with Complex Target Concepts. In: Data Mining and Knowledge Discovery 30 (1), pp. 47-98, 2016.
    Impact factor: 2.714 (2015).
  5. R.M. Konijn, W. Duivesteijn, M. Meeng, A. Knobbe: Cost-based Quality Measures in Subgroup Discovery. In: Journal of Intelligent Information Systems 45 (3), pp. 337-355, 2015.
    Impact factor: 1.000.
  6. P. Lohuis, S. Faraj-Hakim, W. Duivesteijn, A. Knobbe, A.-J. Tasman: Benefits of a Short, Practical Questionnaire to Measure Subjective Perception of Nasal Appearance after Aesthetic Rhinoplasty. In: Plastic and Reconstructive Surgery 132 (6), pp. 913e-923e, 2013.
    Impact factor: 3.328.
  7. P.J.F.M. Lohuis, S. Hakim, A. Knobbe, W. Duivesteijn, G.M. Bran: Split hump technique for reduction of the overprojected nasal dorsum - a statistical analysis on subjective body image in relation to nasal appearance and nasal patency in 97 aesthetic rhinoplasty patients. In: Archives of Facial Plastic Surgery 14 (5), pp. 346-353, 2012.
    Impact factor: 1.463.
  8. S. Hakim, A. Knobbe, W. Duivesteijn, P.J.F.M. Lohuis: Results of a screening questionnaire measuring physical perception of patients undergoing esthetic rhinoplasty: a statistical analysis. In: Nederlands Tijdschrift voor Keel-Neus-Oorheelkunde (Dutch Journal for Otorhinolaryngology) (2), p. 100, 2010.
    Impact factor: 0.
Conference publications
  1. W. Duivesteijn, T. Farzami, T. Putman, E. Peer, H.J.P. Weerts, J.N. Adegeest, G. Foks, M. Pechenizkiy: Have It Both Ways - from A/B Testing to A&B Testing with Exceptional Model Mining. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017), to appear.
  2. C. Rebelo de Sá, W. Duivesteijn, C. Soares, A. Knobbe: Exceptional Preferences Mining. In: Proceedings of the 19th International Conference on Discovery Science (DS 2016), pp. 3-18, 2016.
    Acceptance rate: 0.5000 (30 out of 60).
  3. L. Downar, W. Duivesteijn: Exceptionally Monotone Models - the Rank Correlation Model Class for Exceptional Model Mining. In: Proceedings of the 15th IEEE International Conference on Data Mining (ICDM 2015), pp. 111-120, 2015.
    Acceptance rate: 0.0843 (68 out of 807). Including short papers: 0.1821 (147 out of 807).
    A substantially longer version appeared as a Bachelor Thesis at the TU Dortmund.
  4. W. Duivesteijn, J. Thaele: Understanding Where Your Classifier Does (Not) Work. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2015) (III), pp. 250-253, 2015.
    Acceptance rate: 0.4828 (14 out of 29).
  5. W. Duivesteijn, J. Thaele: Understanding Where Your Classifier Does (Not) Work - the SCaPE Model Class for EMM. In: Proceedings of the 14th IEEE International Conference on Data Mining (ICDM 2014), pp. 809-814, 2014.
    Acceptance rate: 0.1953 (142 out of 727).
    A substantially longer version appeared as a Technical Report of the SFB 876 at the TU Dortmund.
  6. J. Witteveen, W. Duivesteijn, A. Knobbe, P. Grünwald: RealKRIMP - Finding Hyperintervals that Compress with MDL for Real-Valued Data. In: Proceedings of the 13th International Symposium on Intelligent Data Analysis (IDA 2014), pp. 368-379, 2014.
    Acceptance rate: 0.4800 (36 out of 75).
    A substantially longer version appeared as a Bachelor thesis at the Universiteit Leiden.
  7. M. Meeng, W. Duivesteijn, A. Knobbe: ROCsearch - An ROC-guided Search Strategy for Subgroup Discovery. In: Proceedings of the 2014 SIAM International Conference on Data Mining (SDM 2014), pp. 704-712, 2014.
    Acceptance rate: 0.3084 (120 out of 389).
  8. W. Duivesteijn, A. Knobbe: Exceptional Model Mining - Describing Deviations in Datasets. In: Proceedings of the 22nd Belgian-Dutch Conference on Machine Learning (BENELEARN 2013), p. 86, 2013.
    Acceptance rate: 0.8919 (33 out of 37).
  9. R.M. Konijn, W. Duivesteijn, W. Kowalczyk, A. Knobbe: Discovering Local Subgroups, with an Application to Fraud Detection. In: Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2013), pp. 1-12, 2013.
    Acceptance rate: 0.1134 (39 out of 344). Including short presentations: 0.2849 (98 out of 344).
  10. W. Duivesteijn, E. Loza Mencía, J. Fürnkranz, A. Knobbe: Multi-label LeGo - Enhancing Multi-label Classifiers with Local Patterns. In: Proceedings of the 11th International Symposium on Intelligent Data Analysis (IDA 2012), pp. 114-125, 2012.
    Acceptance rate: 0.2250 (18 out of 80). Including poster presentations: 0.4375 (35 out of 80).
    A substantially longer version appeared as a Technical Report of the TU Darmstadt, TUD-KE-2012-02.
  11. G. Ribeiro, W. Duivesteijn, C. Soares, A. Knobbe: Multilayer Perceptron for Label Ranking. In: Proceedings of the 22nd International Conference on Artificial Neural Networks (ICANN 2012), pp. 25-32, 2012.
    Acceptance rate: 0.6559 (162 out of 247).
  12. W. Duivesteijn, A. Feelders, A. Knobbe: Different Slopes for Different Folks - Mining for Exceptional Regression Models with Cook's Distance. In: Proceedings of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2012), pp. 868-876, 2012.
    Acceptance rate: 0.1762 (133 out of 755).
  13. W. Duivesteijn, A. Knobbe: Exploiting False Discoveries - Statistical Validation of Patterns and Quality Measures in Subgroup Discovery. In: Proceedings of the 11th IEEE International Conference on Data Mining (ICDM 2011), pp. 151-160, 2011.
    Acceptance rate: 0.1285 (101 out of 786). Including short papers: 0.1883 (148 out of 786).
  14. W. Duivesteijn, A. Knobbe, A. Feelders, M. van Leeuwen: Subgroup Discovery meets Bayesian networks - an Exceptional Model Mining approach. In: Proceedings of the 10th IEEE International Conference on Data Mining (ICDM 2010), pp. 158-167, 2010.
    Acceptance rate: 0.0903 (72 out of 797). Including short papers: 0.1945 (155 out of 797).
  15. W. Duivesteijn, A. Feelders: Nearest Neighbour Classification with Monotonicity Constraints. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008) (I), pp. 301-316, 2008.
    Acceptance rate: 0.1919 (100 out of 521).
Workshop publications
  1. S. van der Zon, O. Zeev Ben Mordehai, T. Vrijdag, W. van Ipenburg, J. Veldsink, W. Duivesteijn, M. Pechenizkiy: BoostEMM - Transparent Boosting using Exceptional Model Mining. In: Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), to appear.
  2. W. Duivesteijn, M. Meeng, A. Knobbe: ROCsearch in a Wider Context - A ROC-Guided Search Strategy for Subgroup Discovery and Beyond. In: Proceedings of the First International Workshop on Learning over Multiple Contexts (LMCE 2014), 2014.
  3. W. Duivesteijn: A Short Survey of Exceptional Model Mining - Exploring Unusual Interactions Between Multiple Targets. In: Proceedings of the 2014 International Workshop on Multi-Target Prediction (MTP 2014), 2014.
  4. M. Meeng, W. Duivesteijn, A. Knobbe: ROCsearch - An ROC-guided Search Strategy for Subgroup Discovery. In: Proceedings of the 2014 Workshop on Knowledge Discovery, Data Mining and Machine Learning (KDML 2014), p. 180, 2014.
  5. R.M. Konijn, W. Duivesteijn, M. Meeng, A. Knobbe: Cost-based Quality Measures in Subgroup Discovery. In: Proceedings of the 3rd Quality Issues, Measures of Interestingness, and Evaluation of data mining models workshop (QIMIE 2013), PAKDD Workshops, pp. 404-415, 2013.

Other stuff

I like to travel around the world.

Some thoughts on Dutch vowels.