DM Lecture 1 Introduction and Policies

This document provides information about a CSCS 455 course on Data Mining and Warehousing taught by Faizad Ullah. It introduces the instructor and his background, describes the course as theoretical and hands-on with programming assignments and a project, and outlines the grading policy which includes assignments, exams, quizzes, and a project. It also lists various course policies around attendance, submissions, and academic integrity. Finally, it discusses course materials that will be provided and how students can contact the instructor.

Uploaded by Faizad Ullah
Copyright: © All Rights Reserved

CSCS 455 – Data Mining and Warehousing
Faizad Ullah

About Me
• Faizad Ullah
• Ph.D. student at LUMS
• Specialization
  • Natural Language Processing (NLP)
  • Machine Learning
  • Data Science
• Contributions
  • Medical Image Analysis
  • Graph Analysis
  • Text Analytics of Low-Resource Languages

Course Description
• Theoretical (3-credit-hour course)
• Emphasis on hands-on work
• *3-5 Assignments
  • Programming assignments
• *One Project
• Programming Environment
  • Python (PyTorch, TensorFlow, Colab)

*Vivas will be conducted for the assignments and the project.


Grading
• Point distribution

Component                              Weight
Class Attendance and Participation     5%
Quizzes (5-8)                          20%
Assignments (3-5)                      20%
Midterm                                20%
Final Term                             25%
Project (or Additional Assignments)    10%
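
To see how the weights above combine, here is a minimal Python sketch. It assumes each component is first expressed as a percentage score (0-100) and that the final grade is the weighted sum of those scores. The function name `final_grade` and the sample scores are hypothetical and are not part of the course policy.

```python
# Component weights from the Grading slide (they sum to 100%).
WEIGHTS = {
    "attendance": 5,
    "quizzes": 20,
    "assignments": 20,
    "midterm": 20,
    "final": 25,
    "project": 10,
}


def final_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps each component name to a percentage in [0, 100].
    Assumption: every component is scored as a percentage before weighting.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student scores, for illustration only.
example_scores = {
    "attendance": 100,
    "quizzes": 80,
    "assignments": 90,
    "midterm": 70,
    "final": 60,
    "project": 100,
}
print(final_grade(example_scores))
```

With these made-up scores, the weak final exam result pulls the grade down the most because that component carries the largest weight (25%).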

Policies
• Quizzes
  • Quizzes may be announced or unannounced (50% of quizzes will be unannounced).
  • Announcements will be made during class and on the slides, so check the slides regularly if you miss lectures. No announcements will be sent via email.
• Sharing
  • Copying is not allowed for assignments. Discussion is encouraged; however, you must submit your own work.
  • Violators will be reported to the Disciplinary Committee or face mark-reduction penalties.
• Plagiarism
  • Do NOT pass off someone else’s work as your own!
  • Write in your own words and cite the reference if you use someone else’s material.

Policies (2)
• Submission Policy
  • Submissions are due on the day and at the time specified.
  • Late submissions incur a deduction of 50% of the obtained marks per day (i.e., a submission that is 2 days late receives zero credit).
• Attendance Policy
  • You are advised to attend all lectures.
  • It is the students’ responsibility to recover any information or announcements posted during a lecture from which they were absent.
• Classroom Behavior
  • Maintain classroom decorum by remaining quiet and attentive.
  • Asking questions is encouraged.
  • You are not allowed to use laptops, mobile phones, etc., during class.
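
The late-submission rule in the Submission Policy can be written as a short calculation. The sketch below is only an illustration: the function name `marks_after_late_penalty` is hypothetical, and it assumes lateness is counted in whole days, since the slide does not say how a partial day is treated.

```python
def marks_after_late_penalty(obtained, days_late):
    """Apply a 50%-per-day deduction to the obtained marks.

    Assumptions (not stated on the slide): `days_late` is a whole number
    of days, and any partial day has already been rounded by the caller.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # Each late day removes half of the obtained marks; never go below zero.
    remaining_fraction = max(0.0, 1.0 - 0.5 * days_late)
    return obtained * remaining_fraction


print(marks_after_late_penalty(80, 0))  # on time
print(marks_after_late_penalty(80, 1))  # one day late
print(marks_after_late_penalty(80, 2))  # two days late
```

Under this reading, a submission that earned 80 marks keeps 40 if it is one day late and nothing from the second day onward, which matches the "2 days late gets zero credit" example in the policy.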

Policies (3)
• Retakes
  • No retakes for quizzes, assignments, exams, or projects.
  • In case of a medical emergency or other unavoidable circumstances, give notice beforehand and seek formal approval. You need to share medical reports for the departmental record.
  • Do not wait until the final exam to seek approval for retakes.

Course Material
• All course material (i.e., books, class handouts, reading assignments) will be shared on Moodle
• Textbooks
  • Data Mining: Concepts and Techniques, 3rd ed., by Jiawei Han, Micheline Kamber, Jian Pei
  • Pattern Classification, 2nd ed., by Richard O. Duda, Peter E. Hart, David G. Stork
• Reference Book
  • Mining of Massive Datasets, by Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman

Contact
• How to contact me?
  • E-mail: Will share soon
  • Office:
  • Office Hours: Mentioned on office door

Most Important

Don’t be afraid of giving wrong answers!

Let’s start our course journey…

Data is Everywhere
• There has been enormous data growth in both commercial and scientific databases due to advances in data generation and collection technologies.
• Gather whatever data you can, whenever and wherever possible.

[Figure panels: E-Commerce; Traffic Patterns; Sensor Networks; Social Networking (Twitter)]

Why Data Mining – Commercial Viewpoint
• The Explosive Growth of Data
  • Lots of data is being collected and warehoused
    • Web data
      • Google has petabytes of web data
      • Facebook has billions of active users
    • Purchases at department/grocery stores, e-commerce
      • Amazon handles millions of visits/day
    • Bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
  • Provide better, customized services for an edge (e.g., in Customer Relationship Management)

We are drowning in data, but starving for knowledge!

• Data mining: automated analysis of massive data sets


Why Data Mining – Scientific Viewpoint
• Data collected and stored at enormous speeds
  • Remote sensors on a satellite
    • NASA EOSDIS archives petabytes of earth science data per year
  • Telescopes scanning the skies
    • Sky survey data
  • High-throughput biological data
  • Scientific simulations
    • Terabytes of data generated in a few hours
• Data mining helps scientists
  • in automated analysis of massive datasets
  • in hypothesis formation

[Figure panels: Sky Survey Data; fMRI Data from Brain; Gene Expression Data; Surface Temperature of Earth]

What is Data Mining?
• Many Definitions!
  • Non-trivial extraction of implicit, previously unknown, and potentially useful information from data
  • Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

• Alternative Names:
  • Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
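
As a toy illustration of the second definition above (automatic analysis of data to discover meaningful patterns), the following Python sketch counts which pairs of items are frequently bought together. It is a simple pair-counting exercise, not an algorithm taken from these slides; the transactions, the `frequent_pairs` helper, and the `min_count` threshold are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def frequent_pairs(baskets, min_count):
    """Count how often each item pair co-occurs and keep the frequent ones."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair has one canonical (alphabetical) ordering.
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


print(frequent_pairs(transactions, min_count=3))
```

The pairs that survive the threshold are not visible from any single transaction; they only emerge when the whole collection is scanned, which is the sense of "implicit, previously unknown" in the first definition.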

Data Mining: Confluence of Multiple Disciplines

[Diagram: Data Mining at the center, surrounded by Machine Learning, Pattern Recognition, Statistics, Applications, Visualization, Algorithm, Database Technology, and High-Performance Computing]
