L0 Overview

Data Mining

Flavius Frasincar

Contents

• Your Teacher
• This Course
• Evaluation
• Book

Your Teacher

• Flavius Frasincar, [email protected]
• PhD completed at the Eindhoven University of Technology (TU/e) in June 2005:
– Title of the thesis: “Hypermedia Presentation Generation for Semantic Web Information Systems” (thesis available from http://alexandria.tue.nl/extra2/200511530.pdf)
– 2004-2005: assistant professor at TU/e
• Since August 2005: assistant professor at Erasmus University Rotterdam (EUR)
• I am originally from Romania

Romania

Computer Science Minor

• I am the coordinator of the (Advanced) Computer Science Minor
• Courses:
– Introduction to Programming [broadening minor] or Advanced
Programming (minor) [deepening minor] (4 ECTS)
– Databases (4 ECTS)
– Data Mining (4 ECTS)
– Topics in Business Intelligence (3 ECTS): compulsory only for students following the 15 ECTS variant of the minor (non-ESE)

Computer Science Minor

• “Successful participation in this minor requires a significant ability to deal with abstract concepts. In addition, a good mathematical background (algebra, calculus and statistics) is desired.” (from https://www.eur.nl/en/minor/computer-science)
• This minor is a lot of hard work, but:
– You will learn many Computer Science topics
– You will learn many useful Computer Science skills (e.g., programming, querying, designing, and modeling), which are highly valued by future employers (especially in industry)
– If you like mathematics, the minor is a lot of fun!

This Course

• The course Data Mining (FEB53020) covers the major principles and techniques used in Data Mining
• At the end of the course you should know:
– What is data mining?
– What are data types, data quality, and data preprocessing?
– What are data similarity and data dissimilarity?
– What are data classification techniques?
– How to evaluate data classification techniques?
– What are data clustering techniques?
– How to evaluate data clustering techniques?
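To make the classification questions above concrete, here is a minimal sketch, not taken from the course material: a 1-nearest-neighbour classifier evaluated by accuracy on a held-out test set. The toy data, the class names, and the function names are all made up for illustration, and the course may use different techniques and evaluation measures.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_labels[best_index]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical toy data: two numeric attributes per record, two classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_labels = ["low", "low", "high", "high"]

# Held-out records used only for evaluation, never for training.
test_points = [(1.2, 1.4), (5.5, 5.0), (3.0, 3.0)]
test_labels = ["low", "high", "high"]

predictions = [
    nearest_neighbor_predict(train_points, train_labels, p) for p in test_points
]
test_accuracy = accuracy(test_labels, predictions)
print(predictions, test_accuracy)
```

The third test record lies between the two groups and is misclassified, so the accuracy on this toy test set is 2/3.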

Topics

• Data Types, Data Quality, and Data Preprocessing
• Data Similarity and Data Dissimilarity
• Data Classification Techniques
• Data Classification Evaluation
• Data Clustering Techniques
• Data Clustering Evaluation
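As a small preview of the similarity and dissimilarity topic, the sketch below shows three commonly used measures in plain Python: Euclidean distance (a dissimilarity), cosine similarity, and Jaccard similarity for sets. The function names and example values are illustrative assumptions, and the lectures may cover other measures as well.

```python
import math


def euclidean_distance(x, y):
    """Dissimilarity: straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Similarity: cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def jaccard_similarity(a, b):
    """Similarity: shared items divided by all distinct items of two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


print(euclidean_distance((3, 4), (0, 0)))              # distance from the origin
print(cosine_similarity((1, 2), (2, 4)))               # same direction
print(cosine_similarity((1, 0), (0, 1)))               # orthogonal vectors
print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))
```

A distance grows as records become less alike, while a similarity grows as they become more alike; which one is appropriate depends on the data type (numeric vectors versus sets, for example).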

Lectures

• Week 1 (Tuesday, 30 August 2022)
• Week 2 (Tuesday, 06 September 2022)
• Week 3 (Tuesday, 13 September 2022)
• Week 4 (Tuesday, 20 September 2022)
• Week 5 (Tuesday, 27 September 2022)
• Week 6 (Tuesday, 04 October 2022)
• Week 7 (Tuesday, 11 October 2022)

Group Meetings

• Take place at my office, ET-44, after the lectures: 10 minutes per group, starting at 15:10, ordered by group ID
• Evaluation of the progress of the work
• Questions regarding lectures
• Workload: 4 ECTS × 28 hours = 112 hours (112 hours / (4 hours/day) = 28 days!)
• Presentations (PowerPoint) each week (on your laptops)
• Presentations are compulsory!
• There are no group meetings on 30 August 2022 (today)

Evaluation

• Assignments:
– Report (assignments after each lecture)
• For the assignments, groups of 5 people should be formed:
– Every team member should contribute equally to the assignment
• The groups need to be formed on 30 August 2022 (today)
• You can sign up using https://docs.google.com/spreadsheets/d/1OSuB-H2aSfKZigW_-AKN3xHuX_0HivP45htDTIY3mx0/edit#gid=0 (if you do not have a group, sign up for the first free slot and coordinate with the rest of the group members by email)
• Evaluation based on:
– Report
– Written examination

Evaluation

• Tuesday 11 October 2022: final presentation
• Thursday 13 October 2022: reports sent to me by email in PDF
• Tuesday 18 October 2022, 09:30-11:30: written examination
• Mark = individual input in the work resulting in the report + quality of the report [2 points] + written examination [8 points]

Report

• Group ID
• Authors
• Assignment 1:
– Exercise 1: formulation + solution
– Exercise 2: formulation + solution
– …

• Assignment 2:
– …

Book

• Title: Introduction to Data Mining
• Authors: Pang-Ning Tan, Michael Steinbach, and Vipin Kumar
• Publisher: Pearson Education
• Year: 2005
• ISBN: 978-0321420527
• 1st Edition

Tools

• WEKA: GUI
• MATLAB: Statistics and Machine Learning Toolbox
• R: caret, nnet, randomForest, etc.
• Python: scikit-learn, TensorFlow, PyTorch, etc.
• Java: WEKA, deeplearning4j, Mahout, MALLET, etc.

• Many videos and tutorials online!
