COMP9318 (Data Warehousing and Data Mining) 2009s1

1. Announcement

2. Tutorials

2.1. Tutorial Schedule

Tutorial Week Contents Solution
tut1 Week 6 Data Warehousing and Data Preprocessing sol
tut2 Week 8 Clustering sol
tut3 Week 10 Classification sol
tut4 Week 12 Association Rule Mining sol

3. Quizzes

Specification Topic(s) Deadline
Quiz 1 (q1 FAQ) (q1 results) Data Warehousing and OLAP 6 May, 2009

4. Written Assignments

All assignments are individual assignments.

Specification Topic(s) Deadline Solution
Ass1 (ass1 FAQ) misc 4 Jun 2009 sol

5. Programming Projects

All projects are individual projects.

Specification Topic(s) Deadline
Proj1 (proj1 FAQ) Clustering Search Engine 5 Jun 2009

It is possible for you to propose your own course project. Please contact the Lecturer-in-Charge to discuss this option.

6. About the Course

6.1. Detailed Course Introduction

See here.

6.2. Staff

Name Role Telephone Email
Dr. Wei Wang Lecturer-in-Charge 9385 7162 cs9318@cse
Yifei Lu Tutor 9385 7225 yifeil@cse
Jianbin Qin Tutor 9385 7205 jqin@cse
Juanjuan Wang Tutor wangj@cse

6.3. Textbook and Reference Books

Ref Role Book
[HK00] Textbook Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber. Morgan Kaufmann Publishers, August 2000. ISBN: 1-55860-489-8
[WF00] Reference Book Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Ian H. Witten and Eibe Frank. Morgan Kaufmann, 2000. ISBN: 1-55860-552-5

Errata of the textbook [HK00]: here.

6.4. Software

Software Comment
Pentaho Mondrian OLAP Server
Pentaho Kettle ETL toolkit
Weka Data mining toolkit

9318 Mondrian Server: http://snare09.cse.unsw.edu.au:8080/mondrian-embedded/index.html

6.5. Lecture Time

Day Time Location
Thu 1800 -- 2100 Civil Engineering G1 (K-H20-G1)

6.6. Consultation Time

Day Time Location
Mon 1600 -- 1700 K17 507

7. Syllabus

Week Contents Reading Tut/Quiz/Ass/Proj
1 Course Introduction + Introduction Chap 1
2 Data Warehousing and OLAP + BUC (updated 9 Apr: fixed the missing (1,1,2) tuple and added two more slides) Chap 2 + the BUC paper
3 Data Warehousing and OLAP + MDX tutorial + Address Data Cleansing Chap 2 + 9318 Mondrian Server
4 Edit Distance + Approx String Join Chap 3 + Pentaho Kettle
5 Data Pre-processing + Similarity Join Chap 3
B
6 IR Preliminaries + Clustering Chap 8 tut1
7 Clustering + Hierarchical Clustering Chap 8
8 Clustering + DBSCAN + Classification Chap 8 + 7 tut2
9 Classification + Text Classification (optional) Chap 8 + 7
10 Association Rule Mining + FP-tree Chap 6 tut3
11 Association Rule Mining + TiVo + WWW 2008 paper Chap 6 + Reading material
12 Review tut4

Reading list:


UNSW (The University of New South Wales) CRICOS Provider Number: 00098G