Mod01 Intro Datamining

Uploaded by

Zameer Qasim
Copyright
© All Rights Reserved
Class starts @

1
Agenda
in Databases I

• Housekeeping

• Lecture 1 :
• Intro to Data Mining

• Python Jupyter Notebook / Anaconda

2
Definitions

◼ Data
► Representations of Facts

◼ Information
► Data with “Relevance and Importance”
► Any datum (and/or data) that changes the probability
distribution (chances) of a relevant outcome.
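
The second definition can be made concrete with Bayes' rule. The sketch below is only an illustration: the loan-default scenario and every number in it are assumed, not taken from the lecture. A datum counts as information when it moves the probability of the outcome; a datum that is equally likely either way leaves the probability unchanged.

```python
def posterior(prior, p_datum_given_outcome, p_datum_given_no_outcome):
    """Bayes' rule: probability of the outcome after observing the datum."""
    evidence = (p_datum_given_outcome * prior
                + p_datum_given_no_outcome * (1.0 - prior))
    return p_datum_given_outcome * prior / evidence


# Assumed prior: 10% of loans default.
prior_default = 0.10

# Assumed datum A, "missed a payment": far more likely among defaulters.
informative = posterior(prior_default, 0.60, 0.05)

# Assumed datum B, "applied on a weekday": equally likely in both groups.
uninformative = posterior(prior_default, 0.30, 0.30)

print(f"prior:              {prior_default:.3f}")
print(f"after datum A:      {informative:.3f}")    # probability shifts
print(f"after datum B:      {uninformative:.3f}")  # probability stays at the prior
```

Datum A shifts the chance of default well above the prior, so it is information in the sense defined above; datum B is data but not information.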

3
Definitions
◼ Knowledge
► Ability to use information to act (or not), in order to achieve
objectives.
► The ability to understand and explain the relationship between
different phenomena (usually as a rule)

◼ Wisdom
► Ability to synthesize information and knowledge, to create a
framework for optimal actions.

◼ Intelligence
► The ability to apply knowledge
4
What are Data, Information, Knowledge, & Wisdom?

5
Support Systems In a Typical Organization

Social Computing
Crowd Sourcing

Data Mining
& ML & AI

Data Warehouse, MIS

OLTP

Necessity is the mother of invention

6
Evolution of Technology
◼ 1960s:
◼ Data collection, database creation, IMS and network DBMS
◼ 1970s:
◼ Relational data model, relational DBMS implementation
◼ 1980s:
◼ RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
◼ Application-oriented DBMS (spatial, scientific, engineering, etc.)
◼ 1990s:
◼ Data mining, data warehousing, multimedia databases, and Web databases
◼ 2000s:
◼ ML, stream data management and mining, global information systems,
deep learning, natural language processing, computer vision
◼ 2020s:
◼ AI, generative models
7
Gartner Hype Cycle

Time

8
What is ML & Knowledge Discovery?
ML & KD Mean Different Things to Different Professionals
▪ Management: Potentially money making tools
▪ Computer Scientists: A new Knowledge Discovery breakthrough - NOT STATISTICS
▪ Statisticians: Not statistically significantly new - a computerized statistician
▪ Electrical Engineers: Another application of Information Theory and Entropy
▪ Neuroscientists: Neurocomputer - a computer model of the human brain
▪ Mathematicians: Some weighted average of a bunch of numbers

9
How to Get Information Out of “Big” Data
New Data Warehouse Architectures

10
How to Get Knowledge Out of “Big” Data

There is a need for a new generation of techniques with the ability to
intelligently and automatically assist humans in analyzing ‘mountains’ of
data for nuggets of useful knowledge (and not just information).

This has led to an emerging field:

Data Mining, ML & Knowledge Discovery (ML & KD)

11
DM vs. ML vs. AI vs. DL
◼ Data Mining:
► Finding patterns in data to explain some phenomenon (e.g., loan default).

◼ Machine Learning:
► Enabling machines to "learn" from data.

◼ Artificial Intelligence:
► Creating ways for machines to mimic human behavior (e.g., Deep Blue).

◼ Deep Learning:
► Creating machines that mimic the workings of the human brain (e.g., image
recognition).

◼ Source: LinkedIn

12
ML & Knowledge Discovery
• Underlying Disciplines
Biology, Neurology, Psychology, Statistics, Computer Science,
Engineering

• Artificial Intelligence (AI)
Integrates the “Underlying Disciplines” for solving various
types of problems

• Techniques
– Symbolic: Rule-Based Systems (RBS), Case-Based
Reasoning (CBR), Fuzzy Logic (FL)
– Connectionist: Artificial Neural Networks (ANN)
– Inductive (ML): C4.5, CART
– Evolutionary: Genetic Algorithms (GA)
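
The inductive techniques listed above build decision trees by choosing splits that make the resulting groups purer. The sketch below shows the two impurity measures usually associated with them: entropy-based information gain (the basis of the C4.5 family, which additionally normalizes it into a gain ratio, not shown here) and Gini impurity (used by CART). The labels and the two candidate splits are assumed toy data, not from the lecture.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the groups."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Assumed toy data: eight loans, half of which defaulted.
parent = ["default"] * 4 + ["repaid"] * 4

# Hypothetical split 1: each branch is mostly, but not entirely, one class.
weak_split = [["default"] * 3 + ["repaid"], ["repaid"] * 3 + ["default"]]

# Hypothetical split 2: each branch contains a single class.
perfect_split = [["default"] * 4, ["repaid"] * 4]

print("parent entropy:", entropy(parent))
print("parent gini:   ", gini(parent))
print("gain (weak):   ", round(information_gain(parent, weak_split), 4))
print("gain (perfect):", information_gain(parent, perfect_split))
```

A tree learner would prefer the second split because it removes all uncertainty about the class.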
13
What is ML & Knowledge Discovery?

The non-trivial process of identifying valid, novel, potentially
useful, and ultimately understandable patterns in data.

-- Fayyad, Piatetsky-Shapiro, Smyth (1996)

• process: knowledge discovery is iterative; as you uncover
“nuggets” in the data, you learn to ask better questions
• valid: generalize to the future
• novel: not something we already know
• useful: actionable, can be used for a task
• understandable: process leads to human insight
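
One common way to check the "valid" criterion is to learn a pattern on one part of the data and measure it on data that was held out. The sketch below is a minimal illustration under assumptions: the debt-ratio records, the single-threshold rule, and the helper names are all hypothetical.

```python
def accuracy(rows, threshold):
    """Share of rows where the rule 'x >= threshold means default' is right."""
    correct = sum((x >= threshold) == defaulted for x, defaulted in rows)
    return correct / len(rows)


def learn_threshold(rows):
    """Pick the observed value that classifies the training rows best."""
    candidates = sorted({x for x, _ in rows})
    return max(candidates, key=lambda t: accuracy(rows, t))


# Assumed toy records: (debt ratio, defaulted?)
train = [(0.10, False), (0.20, False), (0.35, False),
         (0.50, True), (0.60, True), (0.80, True)]
holdout = [(0.15, False), (0.30, False), (0.45, True),
           (0.55, False), (0.70, True), (0.90, True)]

threshold = learn_threshold(train)
train_acc = accuracy(train, threshold)
holdout_acc = accuracy(holdout, threshold)

print("learned threshold:", threshold)
print("training accuracy:", train_acc)
print("holdout accuracy: ", round(holdout_acc, 3))
```

The rule looks perfect on the data it was learned from and noticeably worse on the held-out records, which is why validity is judged on unseen data.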

14
What is Machine Learning
& Knowledge Discovery?

The New York Times:

Machine Learning has entered a golden age, whether
being used to set ad prices, find new drugs more quickly
or fine-tune financial models. Companies as diverse as
Google, Pfizer, Merck, Bank of America, the
InterContinental Hotels Group and Shell use it.

15
Architecture: Typical ML System

Components, in the order shown in the diagram:

◼ Graphical user interface
◼ Pattern evaluation
◼ Data mining engine
◼ Knowledge base
◼ Database or data warehouse server
◼ Data cleaning, data integration, and filtering
◼ Databases and data warehouse

16
DM & KD Process: End-to-End Solution

▪ Pose a Profound Question
▪ Identify Relevant Data
▪ Access the Data
▪ Clean the Data
▪ Transform & Integrate the Data
▪ Mine/Discover Knowledge
▪ Make Intelligent Decisions
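
The steps above can be sketched end to end on a toy problem. Everything in this example is assumed for illustration: the question ("do high debt-to-income loans default more often?"), the two in-memory sources, the field names, and the 0.5 cutoff.

```python
# Identify and access the data: two hypothetical in-memory sources.
customers = [
    {"id": 1, "income": 50000},
    {"id": 2, "income": None},   # missing value, removed during cleaning
    {"id": 3, "income": 40000},
    {"id": 4, "income": 80000},
]
loans = [
    {"customer_id": 1, "amount": 30000, "defaulted": True},
    {"customer_id": 2, "amount": 10000, "defaulted": False},
    {"customer_id": 3, "amount": 8000, "defaulted": False},
    {"customer_id": 4, "amount": 60000, "defaulted": True},
]


def clean(rows):
    """Clean: drop customers whose income is missing."""
    return [r for r in rows if r["income"] is not None]


def integrate(customers, loans):
    """Transform & integrate: join loans to customers, add debt-to-income."""
    income_by_id = {c["id"]: c["income"] for c in customers}
    return [
        {**loan,
         "debt_to_income": loan["amount"] / income_by_id[loan["customer_id"]]}
        for loan in loans
        if loan["customer_id"] in income_by_id
    ]


def mine(rows, cutoff=0.5):
    """Mine: compare default rates above and below the assumed cutoff."""
    def rate(group):
        return sum(r["defaulted"] for r in group) / len(group) if group else 0.0

    high = [r for r in rows if r["debt_to_income"] >= cutoff]
    low = [r for r in rows if r["debt_to_income"] < cutoff]
    return {"high": rate(high), "low": rate(low)}


rows = integrate(clean(customers), loans)
pattern = mine(rows)

# Decide: act only if the mined pattern supports the action.
if pattern["high"] > pattern["low"]:
    decision = "review high-ratio applications"
else:
    decision = "no change"

print(pattern, "->", decision)
```

In practice each step is iterated many times; a toy run like this only shows how the output of one stage feeds the next.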

17
Intelligence Chiefs Testify At Senate Hearing

◼ https://www.youtube.com/watch?v=7OVVbrTP18g (40 minutes)

18
