HaCDAIS 2011:
The 2nd International Workshop on
Handling Concept Drift
in Adaptive Information Systems

WELCOME

The 2nd International Workshop on Handling Concept Drift in Adaptive Information Systems will take place in Vancouver, Canada on December 11th, 2011. It is organized in conjunction with the 11th IEEE International Conference on Data Mining (IEEE ICDM 2011). The 1st HaCDAIS workshop was held in conjunction with ECML/PKDD 2010 in Barcelona, Catalonia, Spain.

The objective of the workshop is to provide a forum for discussing recent advances in handling concept drift in adaptive information systems, and to offer researchers and practitioners an opportunity to identify promising new research directions.

WORKSHOP FORMAT

The workshop will take place on December 11th, from 8:30 till 16:00. Besides regular sessions consisting of presentations of selected peer-reviewed papers, the programme will feature two invited talks from industry, by Wei Fan and by Petr Kadlec.

In the closing session, we will lead an open discussion aimed at foreseeing the future of concept drift research and identifying immediate opportunities for collaboration.

INVITED SPEAKERS

Wei Fan "Streaming Models and Systems for Smarter Transportation Systems: Challenges and Solutions"

Abstract: Building smart transportation systems using sensor technologies incurs various challenges in data collection, system design, data management, modeling, optimization, etc. We focus on how to leverage streaming systems, languages, platforms, hypergraph models, heterogeneous source modeling, transfer learning, formulations, graph-based optimization, model deployment, etc. We discuss short- and long-term visions, review implemented systems, and present the various technical challenges encountered. In addition, we introduce two engagements that are in place.

Short Bio: Wei Fan's (www.weifan.info) research interests are in risk analysis, high performance computing, skewed distribution, cost-sensitive learning, streams, ensembles, graph mining, feature construction, feature selection, sample selection bias, transfer learning, novel applications and commercial systems. He is particularly interested in simple, unconventional, but effective methods to solve difficult problems. He is on the editorial board of ACM TKDD. His thesis work on intrusion detection has been licensed by a start-up company. His co-teamed submission that uses Random Decision Tree (RDT) won the ICDM'08 Contest championship. His co-authored paper in ICDM'06 that uses RDT won the best application paper award. The open source code of RDT is available from www.dice4dm.com. His co-authored paper in KDD'97 on a distributed learning system won the runner-up best application paper award. He received the 2010 IBM Outstanding Technical Achievement Award for his contribution to building Infosphere Streams.

Petr Kadlec "To adapt, or not to adapt: adapt but with caution"

Abstract: This talk will focus on the adaptation of predictive and explanatory computational models in industrial applications, drawing on case studies from the chemical industry, where there is a great demand for such models. Due to frequent changes in the production plants and in the data acquisition, the models require frequent adjustments. Automated adaptation strategies are therefore very useful for ensuring the continuous operation of the models at stable quality levels. However, despite the strong demand for adaptation, there are also situations when adaptation can be counterproductive and considerable caution is necessary to maintain the validity of the models.

Short Bio: Petr Kadlec is a researcher at the Quality Management Group of Evonik Industries, Germany. His primary responsibility is the development and implementation of predictive and descriptive models for the chemical plants of Evonik Industries. In the industrial environment, the main challenges he has to deal with are the dynamic nature of the plants and the changing quality of the measured data. Petr's research interests include predictive modelling using statistical and machine learning techniques, pre-processing of real-life industrial data, adaptive systems, ensemble methods, and meta learning, to name a few. He received his Ph.D. in Computer Science from Bournemouth University, UK. He earned a Master's degree in Computer Science and a Bachelor's degree in Electrical Engineering from the Vienna University of Technology, Austria.

ACCEPTED PAPERS AND TENTATIVE SCHEDULE

Session 1: 8:30 - 10:00
8:30 - 8:35 Introduction to the workshop from the organizers
8:35 - 8:52 Unifying Change - Towards a Framework for Detecting the Unexpected
Iris Ada and Michael R. Berthold
slides
8:55 - 9:18 Change Mining of Customer Profiles based on Transactional Data
Edward Apeh and Bogdan Gabrys
slides
9:21 - 9:38 Drift Detection using Uncertainty Distribution Divergence
Patrick Lindstrom, Brian Mac Namee, and Sarah Jane Delany
slides
9:41 - 9:58 Pool and Precision Based Stream Classification: A new ensemble algorithm on data stream classification using recurring concept detection
Mohammad Javad Hosseini, Zahra Ahmadi, and Hamid Beigy
slides
10:00 - 10:30 Coffee break and discussions
Session 2: 10:30 - 12:20
10:30 - 11:15 Invited Talk: To adapt, or not to adapt: adapt but with caution
Petr Kadlec
slides
11:17 - 11:34 Detecting Mean Changes in Data Streams
Murad Badarna and Ran Wolff
slides
11:37 - 12:00 What's your current stress level? Detection of stress patterns from GSR sensor data
Jorn Bakker, Mykola Pechenizkiy, and Natalia Sidorova
slides
12:03 - 12:20 Interpretability of Sudden Concept Drift in Medical Informatics Domain
Gregor Stiglic and Peter Kokol
slides
12:30 - 14:00 Lunch break and discussions
Session 3: 14:00 - 15:45
14:00 - 14:45 Invited Talk: Streaming Models and Systems for Smarter Transportation Systems: Challenges and Solutions
Wei Fan
14:47 - 15:06 Classification in Presence of Drift and Latency
Georg Krempl and Vera Hofer
slides
15:09 - 15:26 Interpretable, Online Soft-Sensors for Process Control
Mark Eastwood and Petr Kadlec
slides
15:29 - 15:45 Discussion and closing

CALL FOR PAPERS

In the real world, data is often non-stationary. In predictive analytics, machine learning and data mining, the phenomenon of unexpected change in underlying data over time is known as concept drift. Changes in underlying data might occur due to changing personal interests, changes in population, or adversary activities, or they can be attributed to the complex nature of the environment.

When there is a shift in data, the predictions might become less accurate as time passes, or opportunities to improve the accuracy might be missed. Thus the learning models need to be adaptive to the changes.

The problem of concept drift is of increasing importance to machine learning and data mining as more and more data is organized in the form of data streams rather than static databases, and it is rather unusual that concepts and data distributions stay stable over a long period of time. It is not surprising that the problem of concept drift has been studied in several research communities including but not limited to machine learning and data mining, data streams, information retrieval, and recommender systems. Different approaches for detecting and handling concept drift have been proposed in the literature, and many of them have already proved their potential in a wide range of application domains, e.g. fraud detection, adaptive system control, user modeling, information retrieval, text mining, biomedicine.

TOPICS OF INTEREST

In this workshop, we aim to attract researchers with an interest in handling concept drift and recurring contexts in adaptive information systems. Although we have emphasized the application aspects of handling concept drift, we are open to any original work in this area.
A non-exhaustive list of topics includes:

  • Classification and clustering on data streams and evolving data
  • Change and novelty detection in online, semi-online and offline settings
  • Adaptive ensembles
  • Adaptive sampling and instance selection
  • Incremental learning and model adaptivity
  • Delayed labeling in data streams
  • Dynamic feature selection
  • Handling local and complex concept drift
  • Qualitative and quantitative evaluation of concept drift handling performance
  • Reoccurring contexts and context-aware approaches
  • Application-specific and domain driven approaches within the areas of information retrieval, recommender systems, pattern recognition, user modeling, decision support and adaptive (information) systems
  • Case studies and application examples dealing with drifting data

Please note that we encourage prospective contributors to submit full papers (8 pages) and short papers (5 pages).

IMPORTANT DATES

August 5, 2011 (hard) Submission due (for both full and short papers)
September 20, 2011 Notification of acceptance
October 11, 2011 Final papers due
December 11, 2011 Workshop day

SUBMISSION PROCEDURE

Paper submissions should strictly follow the IEEE 2-column format, which is the same as the camera-ready format (see the IEEE Computer Society Press Proceedings Author Guidelines). Workshop submissions do not need to be double-blind. We recommend submissions of 8 pages for full papers and 5 pages for short papers. Submissions up to 10 pages are allowed.

All papers should be submitted through the ICDM Workshop Submission Site. At the time of submission, the papers must not be under review or accepted for publication elsewhere. Submission implies the willingness of at least one of the authors to register and present the paper.

All papers will be reviewed by the Program Committee based on technical quality, relevance to data mining, originality, significance, and clarity.
All accepted workshop papers will appear in the ICDM workshop proceedings published by the IEEE Computer Society Press. In addition, authors of accepted workshop papers may be invited to publish extended versions in a special issue of a journal.

WORKSHOP CHAIRS

Latifur Khan University of Texas, Dallas, USA
Mykola Pechenizkiy Eindhoven University of Technology, the Netherlands
Indrė Žliobaitė Bournemouth University, UK

PROGRAMME COMMITTEE

Charu Agrawal IBM T.J. Watson Research, USA
Albert Bifet University of Waikato, New Zealand
Sarah Jane Delany Digital Media Centre, Ireland
Anton Dries Universitat Pompeu Fabra, Spain
Wei Fan IBM T.J. Watson Research, USA
Bogdan Gabrys Bournemouth University, UK
João Gama University of Porto, Portugal
Jing Gao University of Illinois, Urbana-Champaign, USA
Vivekanand Gopalkrishnan Nanyang Technological University, Singapore
Geoff Holmes University of Waikato, New Zealand
Ioannis Katakis University of Cyprus, Cyprus
Ludmila Kuncheva Bangor University, UK
Matthijs van Leeuwen Universiteit Utrecht, the Netherlands
Mohammad Masud University of Texas, Dallas, USA
Ernestina Menasalvas Universidad Politecnica de Madrid, Spain
Leandro Minku The University of Birmingham, UK
Bernhard Pfahringer University of Waikato, New Zealand
Robi Polikar Rowan University, USA
Grigorios Tsoumakas Aristotle University, Thessaloniki, Greece
Alexey Tsymbal Siemens AG, Germany

For further questions, please contact the organizers at hacdais2011@gmail.com