Topic 1a - Introduction To Data Mining
Ts. Dr. Tuan Norhafizah Tuan Zakaria
Objectives
Data
• Basic facts such as names, numbers or characters that come in different forms (like text or images).
• Raw facts with no meaning on their own. Example: 140 mmHg.
Wisdom
• Applied knowledge. Example: Pak Kassim needs to take care of his food intake and body health.
What is Data?
• Table 1 - a sample of data with five (5) variables, where the last column indicates the outcome
of that sample.
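The layout described for Table 1 (several variables per sample, with the last column as the outcome) can be sketched in code. This is only an illustration: the column names and every value below are invented and are not the contents of Table 1.

```python
# Hypothetical sample: four input variables plus one outcome column.
# All names and values are invented for illustration only.
columns = ["age", "systolic_mmHg", "bmi", "smoker", "outcome"]
rows = [
    [45, 140, 27.5, "yes", "at_risk"],
    [30, 118, 22.0, "no", "healthy"],
    [52, 135, 30.1, "no", "at_risk"],
]


def split_features_outcome(rows):
    """Separate the input variables from the last (outcome) column."""
    features = [row[:-1] for row in rows]
    outcomes = [row[-1] for row in rows]
    return features, outcomes


features, outcomes = split_features_outcome(rows)
print(len(columns), "variables; outcomes:", outcomes)
```

Keeping the outcome apart from the other variables is the usual first step before any mining method is applied to a table like this.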
Alternative names:
• Knowledge discovery (mining) in databases (KDD),
knowledge extraction, data/pattern analysis, data
archeology, data dredging, information harvesting
• Today, data availability has grown massively, from terabytes to yottabytes; data is everywhere.
• “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much
information is now created every 2 days” – Eric Schmidt, Executive Chairman of Google
• “Information is the oil of the 21st century, and analytics is the combustion engine.” – Peter Sondergaard, Gartner Research.
• Source of data?
https://www.techentice.com/the-data-veracity-big-data/
From Data Mining to Big Data Mining: Examples
Big data mining
• refers to the collection of data mining or extraction techniques performed on large volumes of data, i.e. big data.
Goal
• to discover insights from social media platforms (Instagram, Twitter, Facebook) with thousands of postings.
• Classifying youth emotions based on Twitter data
• Sentiment analysis on reviews of Proton cars in Malaysia using Facebook postings
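To give a feel for what sentiment analysis on postings involves, here is a minimal word-list sketch. It is a toy under stated assumptions, not the method used in the studies named above: the word lists `POSITIVE` and `NEGATIVE`, the function `sentiment`, and the sample postings are all invented for illustration.

```python
import re

# Hypothetical word lists; real studies use far larger lexicons or trained models.
POSITIVE = {"good", "great", "love", "comfortable"}
NEGATIVE = {"bad", "poor", "hate", "noisy"}


def sentiment(post):
    """Label a post by counting positive and negative words."""
    words = re.findall(r"[a-z']+", post.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented postings, for illustration only.
posts = [
    "Love the new model, very comfortable ride",
    "Poor service and a noisy engine",
    "Picked up the car today",
]
labels = [sentiment(p) for p in posts]
print(labels)
```

With thousands of postings, the same labelling step is applied to each post and the labels are then counted or compared across groups.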
Conclusions
Data mining finds relationships (that exist within the dataset) and makes predictions.
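The two ideas in this conclusion, finding a relationship and making a prediction, can be sketched with a straight-line fit. This is a minimal illustration assuming a single numeric input and a linear relationship; the helper names and the data points are hypothetical, and real data mining covers many other techniques.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation: strength of the linear relationship between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Invented data points, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

r = correlation(xs, ys)             # relationship found in the dataset
slope, intercept = fit_line(xs, ys)
predicted = slope * 6 + intercept   # prediction for an unseen input, x = 6
print(r, slope, intercept, predicted)
```

The correlation summarises the relationship already present in the data, while the fitted line is what allows a value to be predicted for an input that was never observed.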
References
1. Pang-Ning Tan, Michael Steinbach & Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
2. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann,
2012.
3. Che D., Safran M., Peng Z. (2013) From Big Data to Big Data Mining: Challenges, Issues, and Opportunities.
In: Hong B., Meng X., Chen L., Winiwarter W., Song W. (eds) Database Systems for Advanced Applications.
DASFAA 2013. Lecture Notes in Computer Science, vol 7827. Springer, Berlin, Heidelberg.
https://doi.org/10.1007/978-3-642-40270-8_1
4. Razak, Z. I., & Mutalib, S. (2018). Web Mining In Classifying Youth Emotions. Malaysian Journal of
Computing, 3(1), 1-11.
5. Wah, Y. B., Abdullah, N., Abdul-Rahman, S., & Tan, M. L. P. (2018). Text Mining and Sentiment Analysis on
Reviews of Proton Cars in Malaysia. Malaysian Journal of Science, 37(2), 137-153.