Python Data Structures and Algorithms
About this ebook
- A step by step guide, which will provide you with a thorough discussion on the analysis and design of fundamental Python data structures.
- Get a better understanding of advanced Python concepts such as big-o notation, dynamic programming, and functional data structures.
- Explore illustrations to present data structures and algorithms, as well as their analysis, in a clear, visual manner.
The book will appeal to Python developers. A basic knowledge of Python is expected.
Python Data Structures and Algorithms - Benjamin Baka
Python Data Structures and Algorithms
Improve application performance with graphs, stacks, and queues
Benjamin Baka
BIRMINGHAM - MUMBAI
Python Data Structures and Algorithms
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: May 2017
Production reference: 1260517
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-735-5
www.packtpub.com
Credits
About the Author
Benjamin Baka works as a software developer and has over 10 years' experience in programming. He is a graduate of Kwame Nkrumah University of Science and Technology and a member of the Linux Accra User Group. Notable in his language toolset are C, C++, Java, Python, and Ruby. He has a huge interest in algorithms and finds them a good intellectual exercise.
He is a technology strategist and software engineer at mPedigree Network, weaving together a dizzying array of technologies to combat counterfeiting and empower consumers in Ghana, Nigeria, and Kenya, to name a few.
In his spare time, he enjoys playing the bass guitar and listening to silence. You can find him on his blog.
Many thanks to the team at Packt who have played a major part in bringing this book to light. I would also like to thank David Julian, the reviewer on this book, for all the assistance he extended through diverse means in preparing this book.
I am forever indebted to Lorenzo E. Danielson and Guido Sohne for their immense help in ways I can never repay.
About the Reviewer
David Julian has over 30 years of experience as an IT educator and consultant.
He has worked on a diverse range of projects, including assisting with the design of a machine learning system used to optimize agricultural crop production in controlled environments and numerous backend web development and data analysis projects.
He has authored the book Designing Machine Learning Systems with Python and worked as a technical reviewer on Sebastian Raschka’s book Python Machine Learning, both by Packt Publishing.
www.PacktPub.com
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Customer Feedback
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1786467356.
If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Table of Contents
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
Python Objects, Types, and Expressions
Understanding data structures and algorithms
Python for data
The Python environment
Variables and expressions
Variable scope
Flow control and iteration
Overview of data types and objects
Strings
Lists
Functions as first class objects
Higher order functions
Recursive functions
Generators and co-routines
Classes and object programming
Special methods
Inheritance
Data encapsulation and properties
Summary
Python Data Types and Structures
Operations and expressions
Boolean operations
Comparison and Arithmetic operators
Membership, identity, and logical operations
Built-in data types
None type
Numeric Types
Representation error
Sequences
Tuples
Dictionaries
Sorting dictionaries
Dictionaries for text analysis
Sets
Immutable sets
Modules for data structures and algorithms
Collections
Deques
ChainMaps
Counter objects
Ordered dictionaries
defaultdict
Named tuples
Arrays
Summary
Principles of Algorithm Design
Algorithm design paradigms
Recursion and backtracking
Backtracking
Divide and conquer - long multiplication
Can we do better? A recursive approach
Runtime analysis
Asymptotic analysis
Big O notation
Composing complexity classes
Omega notation (Ω)
Theta notation (ϴ)
Amortized analysis
Summary
Lists and Pointer Structures
Arrays
Pointer structures
Nodes
Finding endpoints
Node
Other node types
Singly linked lists
Singly linked list class
Append operation
A faster append operation
Getting the size of the list
Improving list traversal
Deleting nodes
List search
Clearing a list
Doubly linked lists
A doubly linked list node
Doubly linked list
Append operation
Delete operation
List search
Circular lists
Appending elements
Deleting an element
Iterating through a circular list
Summary
Stacks and Queues
Stacks
Stack implementation
Push operation
Pop operation
Peek
Bracket-matching application
Queues
List-based queue
Enqueue operation
Dequeue operation
Stack-based queue
Enqueue operation
Dequeue operation
Node-based queue
Queue class
Enqueue operation
Dequeue operation
Application of queues
Media player queue
Summary
Trees
Terminology
Tree nodes
Binary trees
Binary search trees
Binary search tree implementation
Binary search tree operations
Finding the minimum and maximum nodes
Inserting nodes
Deleting nodes
Searching the tree
Tree traversal
Depth-first traversal
In-order traversal and infix notation
Pre-order traversal and prefix notation
Post-order traversal and postfix notation
Breadth-first traversal
Benefits of a binary search tree
Expression trees
Parsing a reverse Polish expression
Balancing trees
Heaps
Summary
Hashing and Symbol Tables
Hashing
Perfect hashing functions
Hash table
Putting elements
Getting elements
Testing the hash table
Using [] with the hash table
Non-string keys
Growing a hash table
Open addressing
Chaining
Symbol tables
Summary
Graphs and Other Algorithms
Graphs
Directed and undirected graphs
Weighted graphs
Graph representation
Adjacency list
Adjacency matrix
Graph traversal
Breadth-first search
Depth-first search
Other useful graph methods
Priority queues and heaps
Inserting
Pop
Testing the heap
Selection algorithms
Summary
Searching
Linear Search
Unordered linear search
Ordered linear search
Binary search
Interpolation search
Choosing a search algorithm
Summary
Sorting
Sorting algorithms
Bubble sort
Insertion sort
Selection sort
Quick sort
List partitioning
Pivot selection
Implementation
Heap sort
Summary
Selection Algorithms
Selection by sorting
Randomized selection
Quick select
Partition step
Deterministic selection
Pivot selection
Median of medians
Partitioning step
Summary
Design Techniques and Strategies
Classification of algorithms
Classification by implementation
Recursion
Logical
Serial or parallel
Deterministic versus nondeterministic algorithms
Classification by complexity
Complexity curves
Classification by design
Divide and conquer
Dynamic programming
Greedy algorithms
Technical implementation
Dynamic programming
Memoization
Tabulation
The Fibonacci series
The Memoization technique
The tabulation technique
Divide and conquer
Divide
Conquer
Merge
Merge sort
Greedy algorithms
Coin-counting problem
Dijkstra's shortest path algorithm
Complexity classes
P versus NP
NP-Hard
NP-Complete
Summary
Implementations, Applications, and Tools
Tools of the trade
Data preprocessing
Why process raw data?
Missing data
Feature scaling
Min-max scalar
Standard scalar
Binarizing data
Machine learning
Types of machine learning
Hello classifier
A supervised learning example
Gathering data
Bag of words
Prediction
An unsupervised learning example
K-means algorithm
Prediction
Data visualization
Bar chart
Multiple bar charts
Box plot
Pie chart
Bubble chart
Summary
Preface
A knowledge of data structures and the algorithms that bring them to life is the key to building successful data applications. With this knowledge, we have a powerful way to unlock the secrets buried in large amounts of data. This skill is becoming more important in a data-saturated world, where the amount of data being produced dwarfs our ability to analyze it. In this book, you will learn the essential Python data structures and the most common algorithms. This book will provide basic knowledge of Python and an insight into the exciting world of data algorithms. We will look at algorithms that provide solutions to the most common problems in data analysis, including sorting and searching data, as well as being able to extract important statistics from data. With this easy-to-read book, you will learn how to create complex data structures such as linked lists, stacks, and queues, as well as sorting algorithms such as bubble sort and insertion sort. You will learn the common techniques and structures used in tasks such as preprocessing, modeling, and transforming data. We will also discuss how to organize your code in a manageable, consistent, and extendable way. You will learn how to build components that are easy to understand, debug, and use in different applications.
The importance of a good understanding of data structures and algorithms cannot be overemphasized. It is an important tool for understanding new problems and finding elegant solutions to them. By gaining a deeper understanding of algorithms and data structures, you may find uses for them in many more ways than originally intended. You will develop a consideration for the code you write and how it affects memory use and CPU cycles, to say the least. Code will not be written for the sake of it, but rather with a mindset to do more using minimal resources. When programs that have been thoroughly analyzed and scrutinized are used in a real-life setting, the performance is a delight to experience. Sloppy code is always a recipe for poor performance. Whether you like algorithms purely as an intellectual exercise or as a source of inspiration in solving a problem, studying them is a worthwhile pursuit.
The Python language has further opened the door for many professionals and students to come to appreciate programming. The language is fun to work with and concise in its description of problems. We leverage the language's mass appeal to examine a number of widely studied and standardized data structures and algorithms.
The book begins with a concise tour of the Python programming language. As such, it is not required that you know Python before picking up this book.
What this book covers
Chapter 1, Python Objects, Types, and Expressions, introduces you to the basic types and objects of Python. We will give an overview of the language features, execution environment, and programming styles. We will also review the common programming techniques and language functionality.
Chapter 2, Python Data Types and Structures, explains each of the five numeric and five sequence data types, as well as one mapping and two set data types, and examines the operations and expressions applicable to each type. We will also give examples of typical use cases.
Chapter 3, Principles of Algorithm Design, covers how we can build additional structures with specific capabilities using the existing Python data structures. In general, the data structures we create need to conform to a number of principles. These principles include robustness, adaptability, reusability, and separating the structure from a function. We look at the role iteration plays and introduce recursive data structures.
Chapter 4, Lists and Pointer Structures, covers linked lists, which are one of the most common data structures and are often used to implement other structures, such as stacks and queues. In this chapter, we describe their operation and implementation. We compare their behavior to arrays and discuss the relative advantages and disadvantages of each.
Chapter 5, Stacks and Queues, discusses the behavior and demonstrates some implementations of these linear data structures. We give examples of typical applications.
Chapter 6, Trees, will look at how to implement a binary tree. Trees form the basis of many of the most important advanced data structures. We will examine how to traverse trees and retrieve and insert values. We will also look at how to create structures such as heaps.
Chapter 7, Hashing and Symbol Tables, describes symbol tables, gives some typical implementations, and discusses various applications. We will look at the process of hashing, give an implementation of a hash table, and discuss the various design considerations.
Chapter 8, Graphs and Other Algorithms, looks at some of the more specialized structures, including graphs and spatial structures. Representing data as a set of nodes and edges is convenient in a number of applications, and from this, we can create structures such as directed and undirected graphs. We will also introduce some other structures and concepts such as priority queues, heaps, and selection algorithms.
Chapter 9, Searching, discusses the most common searching algorithms and gives examples of their use for various data structures. Searching a data structure is a fundamental task and there are a number of approaches.
Chapter 10, Sorting, looks at the most common approaches to sorting. This will include bubble sort, insertion sort, and selection sort.
Chapter 11, Selection Algorithms, covers algorithms that involve finding statistics, such as the minimum, maximum, or median elements in a list. There are a number of approaches and one of the most common approaches is to first apply a sort operation. Other approaches include partition and linear selection.
Chapter 12, Design Techniques and Strategies, relates to how we look for solutions for similar problems when we are trying to solve a new problem. Understanding how we can classify algorithms and the types of problem that they most naturally solve is a key aspect of algorithm design. There are many ways in which we can classify algorithms, but the most useful classifications tend to revolve around either the implementation method or the design method.
Chapter 13, Implementations, Applications, and Tools, discusses a variety of real-world applications. These include data analysis, machine learning, prediction, and visualization. In addition, there are libraries and tools that make our work with algorithms more productive and enjoyable.
What you need for this book
The code in this book will require you to run Python 2.7.x or higher. Python's default interactive environment can also be used to run the snippets of code. In order to use other third-party libraries, pip should be installed on your system.
Who this book is for
This book will appeal to Python developers. Basic knowledge of Python is preferred but is not a requirement. No previous knowledge of computer concepts is assumed. Most of the concepts are explained with everyday scenarios to make them easy to understand.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: This repetitive construct could be a simple while loop or any other kind of loop.
A block of code is set as follows:
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
Any command-line input or output is written as follows:
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: Clicking the Next button moves you to the next screen.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
1. Log in or register on our website using your e-mail address and password.
2. Hover the mouse pointer over the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book.
7. Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Python-Data-Structures-and-Algorithms. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at [email protected] with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
Python Objects, Types, and Expressions
Python is the language of choice for many advanced data tasks for a very good reason. Python is one of the easiest advanced programming languages to learn. Intuitive structures and semantics mean that for people who are not computer scientists, but maybe biologists, statisticians, or the directors of a start-up, Python is a straightforward way to perform a wide variety of data tasks. It is not just a scripting language, but a full-featured object-oriented programming language.
Python has many useful data structures and algorithms built into the language. Also, because Python is an object-based language, it is relatively easy to create custom data objects. In this book, we will examine both Python internal libraries,