COMSM0204: Advanced Topics in Machine Learning and Data Mining
Resources
- WEKA
- MLC++
- Clementine
- Intelligent Miner
- videolectures.net
- The department's Learning page
Course outline
For 2009/10, Advanced Topics in Machine Learning has been scheduled as a 'fat unit' over weeks 17 and 18. The course will be structured around a group 'data-mining challenge' (to be announced), supported by additional lectures on advanced techniques and applications in machine learning. This is a very intensive course that will require you to work consistently over a three-week period. The timetable is given below, followed by the key dates for the coursework.
| Week | Date and Time | Speaker | Lecture |
|---|---|---|---|
| 17 | 22 Feb - 10:30 to 11:00 | James Marshall | Course introduction and orientation |
| 17 | 23 Feb - 11:00 to 13:00 | Peter Flach | Introduction to Exabyte Informatics |
| 17 | 24 Feb - 11:00 to 13:00 | Nello Cristianini | Pattern Matching |
| 17 | 25 Feb - 11:00 to 13:00 | Nello Cristianini | Pattern Matching |
| 18 | 3 Mar - 14:00 to 16:00 | Tilo Burghardt | Animal Biometrics |
| 18 | 4 Mar - 15:00 to 17:00 | Students | Group presentations |
Coursework
The coursework is to develop a solution to the Netflix Challenge, which is to predict how much people are likely to like particular movies, based on their profile. The coursework will be undertaken intensively by groups over a three-week period.
50 percent of the marks will be allocated based on the quality of the results produced by the team, weighted by the novelty of the approach. Reproduction of existing algorithms is allowed, as long as the source is correctly cited, but this will score lower for novelty than an approach that generates results of similar quality without directly re-implementing a previously proposed solution to the problem. To keep things interesting, there will be a competitive element to the performance marks, with some proportion depending on ranked performance against the other groups.
The remaining marks will be divided between the technical group report (25 percent) and the presentation (25 percent). At final submission, group members will evaluate each other's contributions, and this will be used to assign group marks to individual members.
A mirror of the dataset is available locally.
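As a starting point for the rating-prediction task described above, a simple baseline can be sketched as follows. This is a minimal illustration rather than a recommended solution. It assumes ratings arrive as (user, movie, rating) triples on a 1 to 5 scale, and that predictions are scored by root-mean-square error, the metric used by the original Netflix Prize. The scoring used for this coursework is not stated here. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_baseline(ratings):
    """Fit global mean, per-movie bias and per-user bias.

    ratings: list of (user, movie, rating) tuples (assumed format).
    """
    global_mean = sum(r for _, _, r in ratings) / len(ratings)

    # Movie bias: average deviation of a movie's ratings from the global mean.
    movie_resid = defaultdict(list)
    for _, movie, r in ratings:
        movie_resid[movie].append(r - global_mean)
    movie_bias = {m: sum(v) / len(v) for m, v in movie_resid.items()}

    # User bias: average deviation left over once the movie bias is removed.
    user_resid = defaultdict(list)
    for user, movie, r in ratings:
        user_resid[user].append(r - global_mean - movie_bias[movie])
    user_bias = {u: sum(v) / len(v) for u, v in user_resid.items()}

    return global_mean, movie_bias, user_bias


def predict(model, user, movie):
    """Predict a rating; unseen users or movies contribute zero bias."""
    global_mean, movie_bias, user_bias = model
    raw = global_mean + movie_bias.get(movie, 0.0) + user_bias.get(user, 0.0)
    # Clamp to the assumed 1-5 rating scale.
    return min(5.0, max(1.0, raw))


def rmse(model, ratings):
    """Root-mean-square error of the model on a list of rating triples."""
    squared_error = sum((predict(model, u, m) - r) ** 2 for u, m, r in ratings)
    return math.sqrt(squared_error / len(ratings))


# Toy data, invented purely for illustration.
train = [
    ("alice", "m1", 5), ("alice", "m2", 3),
    ("bob", "m1", 4), ("bob", "m2", 2),
    ("carol", "m1", 3),
]
held_out = [("carol", "m2", 2)]

model = fit_baseline(train)
print("held-out RMSE:", rmse(model, held_out))
```

A baseline of this kind gives a reference score against which more novel approaches can be compared. Evaluating on ratings held out from training, as in the last two lines, avoids overestimating performance.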
Key dates:
- Monday 22nd February 10:30 - Introduction to course, announcement of data-mining challenge, assignment into project groups
- Thursday 4th March 14:00 to 16:00 - Presentations by groups on methodology and results for data-mining challenge
- Friday 12th March - Submission of group reports on data-mining challenge

