Mastering Data Mining with Python – Find patterns hidden in your data
By Megan Squire
About this ebook
- Dive deeper into data mining with Python – don’t be complacent, sharpen your skills!
- From the most common elements of data mining to cutting-edge techniques, we’ve got you covered for any data-related challenge
- Become a more fluent and confident Python data analyst, in full control of Python's extensive range of libraries
This book is for data scientists who are already familiar with some basic data mining techniques such as SQL and machine learning, and who are comfortable with Python. If you are ready to learn some more advanced techniques in data mining in order to become a data mining expert, this is the book for you!
Megan Squire
Megan Squire is deputy director for analytics at the Southern Poverty Law Center.
Table of Contents
Mastering Data Mining with Python – Find patterns hidden in your data
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Expanding Your Data Mining Toolbox
What is data mining?
How do we do data mining?
The Fayyad et al. KDD process
The Han et al. KDD process
The CRISP-DM process
The Six Steps process
Which data mining methodology is the best?
What are the techniques used in data mining?
What techniques are we going to use in this book?
How do we set up our data mining work environment?
Summary
2. Association Rule Mining
What are frequent itemsets?
The diapers and beer urban legend
Frequent itemset mining basics
Towards association rules
Support
Confidence
Association rules
An example with data
Added value – fixing a flaw in the plan
Methods for finding frequent itemsets
A project – discovering association rules in software project tags
Summary
3. Entity Matching
What is entity matching?
Merging data
Merging datasets vertically
Merging datasets horizontally
Techniques for matching
Attribute-based similarity matching
Be careful of pairwise comparisons
Leverage rare values
Methods for matching attributes
Range-based or distance from target
String edit distance
Hamming distance
Levenshtein distance
Soundex
Leveraging disjoint sets
Context-based similarity matching
Machine learning-based entity matching
Evaluation of entity matching techniques
Efficiency – how long does it take to do the matching?
Effectiveness – how accurate are the matches that we generate?
Usefulness – how practical is the matching procedure to use?
Entity matching project
Difficulties with matching software projects
Two examples
Matching on project names
Matching on people names
Matching on URLs
Matching on topics and description keywords
The dataset
The code
The results
How many entity matches did we find?
How good are the pairs we found?
Summary
4. Network Analysis
What is a network?
Measuring a network
Degree of a network
Diameter of a network
Walks, paths, and trails in a network
Components of a network
Centrality of a network
Closeness centrality
Degree centrality
Betweenness centrality
Other measures of centrality
Representing graph data
Adjacency matrix
Edge lists and adjacency lists
Differences between graph data structures
Importing data into a graph structure
Adjacency list format
Edge list format
GEXF and GraphML
GDF
Python pickle
JSON
JSON node and link series
JSON trees
Pajek format
A real project
Exploring the data
Generating the network files
Understanding our data as a network
Generating simple network metrics
Playing with the parameters of a network
Analyzing subgraphs
Analyzing cliques and centrality in the subgraphs
Looking for change over time
Summary
5. Sentiment Analysis in Text
What is sentiment analysis?
The basics of sentiment analysis
The structure of an opinion
Document-level and sentence-level analysis
Important features of opinions
Sentiment analysis algorithms
General-purpose data collections
Hu and Liu's sentiment analysis lexicon
SentiWordNet
Vader sentiment
Sentiment mining application
Motivating the project
Data preparation
Data analysis of chat messages
Data analysis of e-mail messages
Summary
6. Named Entity Recognition in Text
Why look for named entities?
Techniques for named entity recognition
Tagging parts of speech
Classes of named entities
Building and evaluating NER systems
NER and partial matches
Handling partial matches
Named entity recognition project
A simple NER tool
Apache Board meeting minutes
Django IRC chat
GnuIRC summaries
LKML e-mails
Summary
7. Automatic Text Summarization
What is automatic text summarization?
Tools for text summarization
Naive text summarization using NLTK
Text summarization using Gensim
Text summarization using Sumy
Sumy's Luhn summarizer
Sumy's TextRank summarizer
Sumy's LSA summarizer
Sumy's Edmundson summarizer
Summary
8. Topic Modeling in Text
What is topic modeling?
Latent Dirichlet Allocation
Gensim for topic modeling
Understanding Gensim LDA topics
Understanding Gensim LDA passes
Applying a Gensim LDA model to new documents
Serializing Gensim LDA objects
Serializing a dictionary
Serializing a corpus
Serializing a model
Gensim LDA for a larger project
Summary
9. Mining for Data Anomalies
What are data anomalies?
Missing data
Locating missing data
Zero values
Fixing missing data
Ignore the problem rows
Fix the problem manually
Use a fabricated value
Use a central measure
Use Last Observation Carried Forward
Use a similar value
Use the most likely value
Data errors
Truncated fields
Data type and character set errors
Logic or semantic errors
Outliers
Visual mining for outliers
Statistical detection of outliers
Detecting outliers with modified z-scores
Detecting outliers by combining statistics and visual mining
Detecting outliers with machine learning
Summary
Index
Mastering Data Mining with Python – Find patterns hidden in your data
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: August 2016
Production reference: 1240816
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-995-0
www.packtpub.com
Credits
Author
Megan Squire
Reviewers
Sanjeev Jaiswal
Ron Mitsugo Zacharski
Commissioning Editor
Veena Pagare
Acquisition Editor
Lester Frias
Content Development Editor
Mamata Walkar
Technical Editor
Naveenkumar Jain
Copy Editors
Safis Editing
Sneha Singh
Project Coordinator
Shweta H Birwatkar
Proofreader
Safis Editing
Indexer
Pratik Shirodkar
Graphics
Kirk D'Penha
Production Coordinator
Shantanu N. Zagade
Cover Work
Shantanu N. Zagade
About the Author
Megan Squire is a professor of computing sciences at Elon University.
Her primary research interest is in collecting, cleaning, and analyzing data about how free and open source software is made. She is one of the leaders of the FLOSSmole.org, FLOSSdata.org, and FLOSSpapers.org projects.
About the Reviewers
Sanjeev Jaiswal is a computer graduate with 7 years of industrial experience. His work involves Perl, Python, and GNU/Linux. He is currently working on projects involving penetration testing, source code review, and security design and implementations.
He is very much interested in web and cloud security. He is also learning NodeJS and cloud security.
Sanjeev loves teaching engineering students and IT professionals. He has been teaching for the last 8 years in his free time. In 2010 he founded Alien Coders (http://www.aliencoders.org), based on the principle of learning through sharing, for computer science students and IT professionals; it became a huge hit in India among engineering students.
You can follow him on Facebook at http://www.facebook.com/aliencoders, on Twitter at @aliencoders, and on GitHub at https://github.com/jassics.
Sanjeev wrote Instant PageSpeed Optimization and co-authored Learning Django Web Development for Packt Publishing. He has reviewed more than 5 books for Packt and looks forward to more such opportunities.
Ron Mitsugo Zacharski is a computational linguist working in the areas of information extraction and machine learning (zacharski.org). He has a BFA in music from the University of Wisconsin at Milwaukee and a PhD in computer science from the University of Minnesota, and he completed a post doctorate in linguistics at the University of Edinburgh. He authored the free online book A Programmer's Guide to Data Mining: The Ancient Art of the Numerati (www.guidetodatamining.com) and co-edited The Grammar-Pragmatics Interface: Essays in Honor of Jeanette K. Gundel, published by John Benjamins. For the majority of his academic life, he has focused on multilingual natural language processing, particularly with lesser-studied languages. Dr. Zacharski is a Zen monk in the Sōtō School lineage of Soyu Matsuoka. He lives in New Mexico.
www.PacktPub.com
eBooks, discount offers, and more
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Preface
Over the past decade, cheaper data storage, faster hardware, and impressive advances in algorithms have combined to pave the way for a rapid ascendance of data science as one of the most important opportunities in computing. While the term data science can include everything from cleaning data and storing data to visualizing it in graphs and charts, the area that has made the most significant gain is the invention of intelligent and sophisticated algorithms for analyzing data. Using computers to find the interesting patterns buried within massive amounts of data is called data mining, an area that encompasses elements of database systems, statistics, and machine learning.
Right now there are dozens of great data mining and machine learning books available for software developers to get up to date on all these advances in the field. What most of these books have in common is that they all cover a small set of tried-and-true methods for finding patterns in data: classification, clustering, decision trees, and regression. Of course, all of these are critically important methods for any data miner to know and they are popular because they can be effective. But these same few techniques are not the whole story. Data mining is a rich field encompassing many dozens of techniques to uncover patterns and make predictions. A true master of data mining should have many tools in her toolbox, not just a few. Thus, the mission of this book, Mastering Data Mining with Python, is to introduce some of the lesser-known data mining concepts that are typically only covered in academic textbooks.
This book uses the Python programming language and a project-based approach to introduce diverse and often overlooked data mining concepts, such as association rules, entity matching, network analysis, text mining, and anomaly detection. Each chapter thoroughly illustrates the basics of one particular data mining technique, provides alternatives for evaluating its effectiveness, and then implements the technique using real-world data.
Our focus on real-world data is another feature of this book that sets it apart from many other data mining books. The true test of whether we have mastered a concept is whether we can apply a method to a new, unknown problem. In our case, this means applying each data mining method to a new problem area or a new data set. The emphasis on real data also means that our results may not always be as clean and tidy as results that come from a canned, example data set. For this reason, each chapter includes a discussion of how to critically evaluate the method. Do the results make sense? What do the results mean? How can the results be improved?
So, in many ways, this book picks up where some of the other data mining books leave off. If you want to round out your growing data mining toolbox with a set of interesting but often overlooked techniques, then read on to learn the specific topics we will cover and how they will be applied in each chapter.
What this book covers
Chapter 1, Expanding Your Data Mining Toolbox, gives an introduction to the field of data mining. In this chapter we pay special attention to how data mining relates to similar topics, such as machine learning and data science. We also review many different data mining methodologies, and talk about their various strengths and weaknesses. This foundational knowledge is important as we transition into the remaining chapters of the book, which are much more technique-oriented and focus on the application of specific data mining tools.
Chapter 2, Association Rule Mining, introduces our first data mining tool: mining for co-occurring sets of items, sometimes called frequent itemsets. We extend our understanding of frequent itemset mining to include mining for association rules, and we learn how to evaluate whether the rules we have found are helpful or not. To put our knowledge into practice, at the end of the chapter we implement a small project wherein we find association rules in the keywords chosen to describe a large set of software projects.
Chapter 3, Entity Matching, focuses on finding matching pairs of data elements that may look slightly different but are actually the same. We learn how to determine whether two items are actually the same thing by using the attributes of the data. At the end of the chapter, we implement an entity matching project where we learn to find the software projects that have moved from one hosting service to another, even after changing their names and other important attributes.
Chapter 4, Network Analysis, is a tour through the basics of network or graph analysis, as used to describe the relationships between various interconnected groups of entities. We investigate the various types of network and learn how to describe and measure them. Then we put our learning into practice to describe how a network of software developers has changed over time.
Chapter 5, Sentiment Analysis in Text, is the first of four text mining chapters in this book. This chapter serves as an introduction to the growing field of sentiment, or mood, analysis in text. After comparing various approaches to sentiment mining and learning how to evaluate the results, we practice using a machine learning classifier to determine the sentiment of a set of software developer chat logs and e-mail logs.
Chapter 6, Named Entity Recognition in Text, is about finding proper nouns and proper names in text. We spend some time learning why this task is useful, and why finding named entities can sometimes be more difficult than it sounds. At the end of the chapter we implement a named entity recognition system on several different types of real-world text data including e-mail, chat logs, and board meeting minutes. Along the way we apply different techniques for quantifying the success or failure of our results.
Chapter 7, Automatic Text Summarization, presents several strategies for automatically creating condensed summaries of text. This chapter emphasizes extractive summarization tools, which are designed to find the most important sentences in a text sample. To this end, we experiment with three different tools, testing the summarization methods and learning how they differ. Following the introduction of each tool, we attempt to summarize a common set of text documents and compare the results.
Chapter 8, Topic Modeling in Text, shows how to use software tools to reveal what topics or concepts are present in a given text. Can we train a computer program to infer the themes that are present in large amounts of text? In a series of experiments, we learn how to use common topic modeling libraries to reveal the topics present in software developer e-mails, and how those topics change over time.
Chapter 9, Mining for Data Anomalies, is where we learn how to use data mining and statistical techniques to improve our own data mining process. While all of the other chapters in this book deal with finding different types of patterns in data, here we focus on finding data that is anomalous or that does not match a particular pattern. Whether it is because the data is empty, missing, or just plain weird, this chapter presents strategies for finding or fixing this type of data so that the rest of your data can be mined more effectively.
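To make the frequent itemset and association rule idea from the Chapter 2 description a little more concrete, here is a minimal sketch in plain Python. It is not the book's project code: the toy tag data, the MIN_SUPPORT_PCT threshold, and the helper names support and confidence are all assumptions made up for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each set holds the tags attached to one project.
baskets = [
    {"python", "web", "database"},
    {"python", "web"},
    {"python", "database"},
    {"java", "web"},
    {"python", "web", "testing"},
]

MIN_SUPPORT_PCT = 40  # assumed threshold, as a percentage of all baskets


def support(itemset, baskets):
    """Percentage of baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return 100.0 * hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Support of the combined itemset divided by support of the antecedent.

    Assumes the antecedent occurs in at least one basket.
    """
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


# Keep only the pairs of tags that meet the minimum support threshold.
items = sorted(set().union(*baskets))
frequent_pairs = {
    pair: support(set(pair), baskets)
    for pair in combinations(items, 2)
    if support(set(pair), baskets) >= MIN_SUPPORT_PCT
}

print(frequent_pairs)
print(confidence({"database"}, {"python"}, baskets))  # rule: database -> python
print(confidence({"python"}, {"database"}, baskets))  # rule: python -> database
```

In this toy data, the rule from database to python has a higher confidence than the reverse rule, which hints at why a rule and its mirror image have to be evaluated separately.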
What you need for this book
To complete the projects in this book, you will need Python 3.5 or higher. I recommend using Anaconda Python, but any Python distribution will do as long as it is updated and contains the following packages: NumPy, Matplotlib, NetworkX, PyMySQL, Gensim, and NLTK. In Chapter 1, Expanding Your Data Mining Toolbox, we will walk through an easy installation of Python and all these libraries, and each time a library is used later in the book, we will install it or upgrade it together.
Because data mining is obviously data-centric, and because the data sets we are working with are sometimes large or require some type of persistent data storage, I chose to implement some of the data mining algorithms alongside a relational database system. I chose MySQL for accomplishing this since it is an established, easy-to-download and install piece of infrastructure. The chapters where MySQL comes into play are in working with the memory-intensive algorithms in Chapter 2, Association Rule Mining, and Chapter 3, Entity Matching. I also use MySQL for some of the examples in Chapter 9, Mining for Data Anomalies, but it is possible to go through that chapter without MySQL.
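Before starting, it can help to confirm that the interpreter and the packages listed above are available. The following is a small sketch rather than part of the book's code bundle: the function name check_environment is hypothetical, and the lowercase import names are assumed to correspond to the packages named above.

```python
import importlib.util
import sys

# Import names assumed for NumPy, Matplotlib, NetworkX, PyMySQL, Gensim, and NLTK.
REQUIRED = ["numpy", "matplotlib", "networkx", "pymysql", "gensim", "nltk"]


def check_environment(required=REQUIRED, min_version=(3, 5)):
    """Return (python_ok, missing) without importing the packages themselves."""
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in required if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can then be installed with the distribution's own package manager, for example conda in an Anaconda installation.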
Who this book is for
If you picked up a book on mastering data mining, you are probably familiar with the basics of data analysis and you have likely experimented with machine learning techniques such as regression, decision trees, classification, and cluster analysis. If you have intermediate experience with Python, understand basic relational database terminology, have some exposure to basic statistics, and can understand the rudiments of how supervised and unsupervised machine learning techniques work, then you are ready for this book. Let's build on what you already know to learn some more exotic, unusual strategies for mining your data!
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: We can include other contexts through the use of the include directive.
A block of code is set as follows:
MINSUPPORTPCT = 5
allSingletonTags = []
allDoubletonTags = set()
doubletonSet = set()
Any command-line input or output is written as follows:
conda install pymysql
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: Clicking the Next button moves you to the next screen.
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/megansquire/masteringDM. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Chapter 1. Expanding Your Data Mining Toolbox
When faced with sensory information, human beings naturally want to find patterns to explain, differentiate, categorize, and predict. This process of looking for patterns all around us is a fundamental human activity, and the human brain is quite good at it. With this skill, our ancient ancestors became better at hunting, gathering, cooking, and organizing. It is no wonder that pattern recognition and pattern prediction were some of the first tasks humans set out to computerize, and this desire continues in earnest today. Depending on the goals of a given project, finding patterns in data using computers nowadays involves database systems, artificial intelligence, statistics, information retrieval, computer vision, and any number of other subfields of computer science, information systems, mathematics, or business, just to name a few. No matter what we call this activity – knowledge discovery in databases, data mining, data science – its primary mission is always to find interesting patterns.
Despite this humble-sounding mission, data mining has existed for long enough and has built up enough variation in how it is implemented that it has now become a large and complicated field to master. We can think of a