TU/e

2II26. Web Analytics

(Fall 2012-2013, Semester A, Quartile 1)


Last update: 31 Aug 2012. If you notice any outdated or likely wrong information on this webpage, please e-mail 2II26.Teachers@gmail.com

Announcements:
  • The first class will be held at 8.45 on Tuesday 4 September 2012, in Auditorium 2. We look forward to seeing you there!
  • Adversitement BV, a web analytics company, will give invited lectures and tutorials!
  • This course is meant for the 2nd year bachelor Web Science, Software Science and Technische Informatica programs. Those who have already taken 2II15 Data mining and knowledge systems may also take this Web Analytics course, but should make sure that they have a non-overlapping set of homeworks.
  • Course and examination information and registration are available on OWInfo.
Responsible lecturer: Mykola Pechenizkiy
Co-lecturers and instructors: Jeroen De Knijf, Julia Kiseleva, Guido Budziak, Erik Tromp.
Contacting teachers:
via e-mail:
  • Send all correspondence to 2II26.Teachers@gmail.com with a meaningful subject; it is fine to start with Hi, Hello or Dear FirstName.
  • Please do not send requests to our personal e-mail addresses. There is also no need to cc the teachers' personal e-mail addresses.
  • We will try to answer all your requests as soon as possible. However, if you have not received a reply within 3 working days, please do not hesitate to resend your request.
in person:
  • option 1: please do not hesitate to approach the teachers during the lecture breaks (Tuesdays 08:45-10:30 in AUD 2, Wednesdays 13:45-15:30 in AUD 16) and during the tutorial sessions (Tuesdays 10:45-12:30 and Thursdays 15:45-17:30 in PAV L10);
  • option 2: on Mondays 10.00 - 12.00 we have office hours in MF7.099 dedicated to educational activities; a list of available time slots hangs by the door;
  • option 3: if you cannot make it during the lecture breaks or our dedicated office hours, please send a meeting request to 2II26.Teachers@gmail.com indicating your availability for the corresponding period;
Modes of study and evaluation:
  • 15 face-to-face lectures
  • 15 face-to-face tutorials/instructions (all students are in the same instructions group)
  • 6 homeworks (only passed assignments count towards the final grade)
  • Self-study of the literature
  • Trial exam
  • Question answering sessions
  • Real exam

Final grade:

  • 50% homeworks and 50% written exam
Course Materials:
  • There is no single textbook that covers the topics you will study in this course. However, for most of the lectures the main course book is MMDS: Mining of Massive Datasets (by Rajaraman, Ullman, Leskovec), accessible online from here. For each covered topic, the corresponding book chapter(s) or other reading will be provided.
  • Handouts, reading materials and guidelines will be available via the Sakai Learning Management System. Please register using your TU/e login, and join 2II26.
  • Homework descriptions and homework submission are also handled via Sakai.

Course Syllabus:

Please note that this schedule is indicative and changes may be possible as the course progresses.

Date, Time, and Room | Lecture Title and Contents
4 Sep 2012
Tuesday
08:45-10:30
AUD 2
Lecture 1: Introduction to the course
  • Motivation and historical perspective on the development of web analytics
  • Overview of the covered topics and connections to other related courses
  • Practicalities
4 Sep 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 1: Introduction to the instructions, exercises, tutorials, and homeworks
  • Overview of the homeworks and corresponding deadlines
  • What to expect from the instructions and tutorials
5 Sep 2012
Wednesday
13:45-15:30
AUD 16
Lecture 2: Web analytics at e-Business scale
  • Framework for mapping business needs to web analytics tasks
  • Examples of success stories and currently missed opportunities
6 Sep 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 2: Web Analytics at e-Business scale
  • Data collection architecture
  • Introduction to OLAP, Web data exploration and reporting
  • Introduction to Splunk
11 Sep 2012
Tuesday
08:45-10:30
AUD 2
Lecture 3: Computational advertising
  • Display and search advertising
  • Ad Auctions
  • Conversion attribution
11 Sep 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 3: Computational advertising
  • Bidding strategies
  • Introduction to Google AdWords and O2MC
  • Simulation tool
12 Sep 2012
Wednesday
13:45-15:30
AUD 16
Lecture 4: Knowledge discovery from web data
  • Major computing paradigms
  • Typical problem formulations
13 Sep 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 4: Introduction to DM tools.
18 Sep 2012
Tuesday
08:45-10:30
AUD 2
Lecture 5: Predictive modeling. Classification
  • Generative and discriminative models
  • Classification vs. regression vs. ranking vs. prediction
  • Active learning and semi-supervised learning
18 Sep 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 5: Data preprocessing
  • Different representations (vector space)
  • Dimensionality reduction
  • Sampling
  • Discretization
19 Sep 2012
Wednesday
13:45-15:30
AUD 16
Lecture 6: Predictive modeling. Evaluation.
  • Remainder from lecture 5
  • Cross-validation vs. prequential evaluation
  • Cost-sensitive classification
20 Sep 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 6: Cost-sensitive classification
  • Translate business problem to data mining problem
  • User profiling
  • Training and evaluating models
25 Sep 2012
Tuesday
08:45-10:30
AUD 2
Lecture 7: Descriptive modeling. Clustering.
  • Partitional clustering (kMeans and DBSCAN)
  • Hierarchical clustering (AHC)
  • Evaluation of clustering (statistical vs. utility-based)
25 Sep 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 7: Clustering
  • User segmentation with clustering
  • Subgroup discovery
26 Sep 2012
Wednesday
13:45-15:30
AUD 16
Lecture 8: Pattern mining.
  • Frequent itemset mining
  • Frequent sequence mining
  • Subgroup discovery
27 Sep 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 8: Pattern mining
  • Association rule mining
  • Query auto-correction
  • Uplift modeling
2 Oct 2012
Tuesday
08:45-10:30
AUD 2
Lecture 9: Recommendation strategies.
  • Content-based filtering
  • Collaborative filtering
  • Hybrid strategies
2 Oct 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 9: Recommender systems
  • Exercises: text processing, TF-IDF, cosine similarity for content-based filtering
  • Collaborative filtering and hybridization ideas
  • Problems of biased data, explore/exploit and normalization
3 Oct 2012
Wednesday
13:45-15:30
AUD 16
Lecture 10: Web analytics at Web scale
  • Properties of large-scale networks (degree, diameter, centrality, clustering)
  • PageRank and HITS
  • Web spam, truth finding
4 Oct 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 10: Computing properties of social networks
9 Oct 2012
Tuesday
08:45-10:30
AUD 2
Lecture 11: Information propagation in networks
  • Influence propagation, viral marketing
  • Acceptance behavior, general contagion model
9 Oct 2012
Tuesday
10:45-12:30
PAV L10
Lecture 12: Modeling evolution of large networks (instead of the tutorial)
  • How networks grow
  • Random graphs, rich-gets-richer, community-guided attachment, Kronecker graphs
10 Oct 2012
Wednesday
13:45-15:30
AUD 16
Instructions/Tutorial 11: SNA, Sentiment analysis (instead of the lecture)
  • How to collect, store and manage social media data
  • How to perform sentiment analysis and summarize the results
11 Oct 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 12: SNA: Dutch elections case
  • Social network analytics
16 Oct 2012
Tuesday
08:45-10:30
AUD 2
Lecture 13: Heterogeneous networks
  • Analytics (top influencing nodes, ambassadors, etc.)
  • Mining (clustering, classification, prediction), MetaPath
  • Construction (information extraction)
16 Oct 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 13: SNA: Dutch elections case
  • Mining heterogeneous networks
17 Oct 2012
Wednesday
13:45-15:30
AUD 16
Lecture 14: Web as experimentation platform
  • Theory and practice of A/B and multivariate testing
  • Connections to data mining
  • Connections to multi-armed bandits
18 Oct 2012
Thursday
15:45-17:30
PAV L10
Instructions/Tutorial 14: Web as experimentation platform
  • Case studies on web as experimentation platform.
  • Reflection on SNA: Dutch elections homework.
23 Oct 2012
Tuesday
08:45-10:30
AUD 2
Lecture 15: Summary of the course
  • Question-answering session
  • Overview of topics not covered in the course
  • Future of Web analytics
23 Oct 2012
Tuesday
10:45-12:30
PAV L10
Instructions/Tutorial 15: Trial exam
  • It is not mandatory to take the trial exam
24 Oct 2012
Wednesday
13:45-15:30
AUD 16
QA session: Question-answering on the trial exam
25 Oct 2012
Thursday
15:45-17:30
PAV L10
QA session: Question-answering on the homeworks
1 Nov 2012
Thursday
14:00-17:00
Place t.b.a.
FINAL EXAM
  • Do not forget to register on OWInfo.
  • The results will be available on 15 Nov 2012.
  • You can come and check your results on 19 Nov 2012, 10.00-12.00.


Handouts and course materials will be available via Sakai or another Learning Management System.

Remarks:

  • MMDS: Mining of Massive Datasets (by Rajaraman, Ullman, Leskovec) accessible online from here.
  • IDM: Introduction to Data Mining book (by Tan, Steinbach, Kumar), chapters accessible online from here.
  • NCM: Networks, Crowds, and Markets: Reasoning About a Highly Connected World (by Easley and Kleinberg), accessible online from here.
  • Please note that this schedule is indicative and some changes may still be possible.