Modern Graph Theory Algorithms with Python: Harness the power of graph algorithms and real-world network applications using Python
Modern Graph Theory Algorithms with Python - Colleen M. Farrelly
Modern Graph Theory Algorithms with Python
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
The authors acknowledge the use of cutting-edge AI (NightCafe’s Stable Diffusion algorithms) for the figures illustrated in this book. It’s important to note that the content itself has been crafted by the authors and edited by a professional publishing team.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Ali Abidi
Publishing Product Manager: Yasir Khan
Book Project Manager: Hemangi Lotlikar
Senior Editor: Tazeen Shaikh
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Proofreader: Tazeen Shaikh
Indexer: Subalakshmi Govindhan
Production Designer: Jyoti Kadam
DevRel Marketing Coordinator: Nivedita Singh
First published: June 2024
Production reference: 1230524
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80512-789-5
www.packtpub.com
Many thanks and much gratitude to Peter Schnable, for his encouragement and guidance during my journey into science and mathematics; for long discussions about herpetology, conservation, and ethics; and for inspiring me to share my knowledge with others and to forge my own path in STEM for social good.
- Colleen Molloy Farrelly
To my beloved mother, Ngoy Justine, whose unwavering love and encouragement fueled my passion for knowledge. Throughout the writing of these pages, her memory has been a guiding light, inspiring me to pursue excellence and share the joy of discovery. Though she is no longer with us, her spirit lives on in the words and sentiments expressed within these chapters. With profound gratitude and love, I dedicate this work to the woman whose influence continues to shape my journey.
To Meda, Divine, and Abigael.
- Franck Kalala Mutombo
Foreword
As the CTO/CIO of life sciences as well as automotive, energy, and high-tech industry-focused companies, I have continuously been challenged with creating meaningful data insights from a great variety of unstructured and semi-structured data sources. A decade ago, my advanced analytics and machine learning journey experienced the biggest push forward when I benefited from one of the most impactful introductions in my career. Ever since I got to work with Colleen Farrelly on our ontology-focused data science requirements when building out a genomics diagnostics platform, my approach to AI has been tremendously elevated. Subsequently, I got to work on Colleen Farrelly’s exceptional book, The Shape of Data, which intrigued me deeply with her unique delivery of how to combine geometry and machine learning-based algorithms and supporting data structures to create powerful topological representations of complex data problems.
In this equally captivating sequel, Modern Graph Theory Algorithms with Python, Colleen Farrelly and Franck Kalala Mutombo take you to the next level of unleashing the potential of network science by diving deep into specific graph theory approaches to solve a great variety of industry problems ranging from ecological to financial, spatial and temporal sales, and clinical data challenges. It is easy to see why readers and data science practitioners of any level will find each chapter profoundly valuable based on the supporting Python library and code examples along with the underlying mathematical explanations.
With each turn of a page, I found myself wanting to reinforce the presented learnings by applying well-defined examples to my own business requirements. Being a keen user of graph databases, I very much enjoyed the hands-on practical discussions on specific technologies such as in Chapter 12 to illustrate the relative ease of Python integrations and built-in capabilities of open source solutions that can be leveraged in today’s growing powerful set of available tools.
This comprehensive book provides just-enough and just-in-time fundamental concepts to enable data scientists and software engineers to greatly elevate their machine learning techniques with directly applicable well-structured scenarios. Each of the consistent problem-solution breakdowns includes key considerations for the required data wrangling, transformation, and modeling aspects.
Furthermore, this book provides a convincing lead into the relevance of new frontiers such as quantum network science algorithms, neural network architectures as graphs, hierarchical networks, and hypergraphs. With a concise and easy-to-follow thought process, this book provides you with the important context of how to reduce the large volumes of parameterization required for large language models and address the critical aspect of metadata management via hypergraph databases, for example.
The use of graph theory today is highly relevant to every industry and science domain. Whether the challenge is to provide predictive modeling or simulations or the optimization of business operations or clinical outcomes and many more requirements, this book is an indispensable guide to mastering the complexities of these critical real-world challenges. With all the key insights and GitHub repository examples at your fingertips, you will be transformed instantly into a subject matter expert. The Modern Graph Theory Algorithms with Python exploration is a must-read and thoroughly enjoyable book.
Michael Giske
Chairman of Inomo Technologies and Global CIO of B-ON
Contributors
About the authors
Colleen Molloy Farrelly is a chief mathematician, data scientist, and researcher who has expertise in applying math to the biological, medical, social, and physical sciences. She has also authored the book, The Shape of Data. She has mentored, coauthored papers, and worked with people across Latin America, Africa, Europe, and Asia.
She is based in Miami, Florida in the US and holds a master’s in biostatistics from the University of Miami. She is passionate about educational initiatives in the developing world and speaks at conferences such as Women in Data Science, IEEE conferences, PyData, and Applied Machine Learning Days.
I want to thank the people who have supported me over the years, especially early on, including John and Nancy Farrelly, Peter Schnable, the Warmus family, Mr. and Mrs. De Jong, the Mayor families, Justin and Christy Moeller, Luke Robinson, and many professors and colleagues throughout my career.
To all my students and those who will come after me who motivate me to teach and share my knowledge, you can use math and science to change the world for the better.
Franck Kalala Mutombo is a professor of mathematics at Lubumbashi University and former academic director of AIMS-Senegal. He previously worked in a research position at the University of Strathclyde and AIMS-South Africa in a joint appointment with the University of Cape Town. He holds a PhD in mathematical sciences from the University of Strathclyde, Glasgow, Scotland. His current research considers the impact of network structure on long-range interactions applied to epidemics, diffusion, object clustering, differential geometry of manifolds, finite element methods for PDEs, and data science. Currently, he teaches at the University of Lubumbashi and across the AIMS Network.
I express gratitude to my supportive network throughout my journey. I’m thankful for friends and professors who’ve contributed to my career. I am indebted to the countless students across Africa and those who will succeed them. Their enthusiasm and curiosity serve as constant reminders of the profound impact that mathematics and science can have on shaping a better world. It is with gratitude that I embrace the opportunity to teach and share knowledge, fostering a community of learners committed to leveraging the transformative potential of these disciplines for the greater good.
About the reviewer
Casey Moffatt, with a master’s in applied mathematics and a double bachelor’s in pure mathematics and philosophy, specializes in graph theory, optimization, and computer science. He is proficient in Python and various essential software for graph data science, machine learning, and algorithm development. He is eager to push boundaries in mathematics and computer science. He would like to thank Packt Publishing and contributors for enabling projects like this and to the countless individuals behind open source technologies.
Table of Contents
Preface
Part 1: Introduction to Graphs and Networks with Examples
1
What is a Network?
Technical requirements
Introduction to graph theory and networks
Formal definitions
Creating networks in Python
Random graphs
Examples of real-world social networks
Other type of networks
Advanced use cases of network science
Summary
References
2
Wrangling Data into Networks with NetworkX and igraph
Technical requirements
Introduction to different data sources
Social interaction data
Spatial data
Temporal data
Biological networks
Other types of data
Wrangling data into networks with igraph
Social network examples with NetworkX
Summary
References
Part 2: Spatial Data Applications
3
Demographic Data
Technical requirements
Introduction to demography
Demographic factors
Geographic factors
Homophily in networks
Francophone Africa music spread
AIMS Cameroon student network epidemic model
Summary
References
4
Transportation Data
Technical requirements
Introduction to transportation problems
Paths between stores
Fuel costs
Time to deliver goods
Navigational hazards
Shortest path applications
Traveling salesman problem
Max-flow min-cut algorithm
Summary
References
5
Ecological Data
Technical requirements
Introduction to ecological data
Exploring methods to track animal populations across geographies
Exploring methods to capture plant distributions and diseases
Spectral graph tools
Clustering ecological populations using spectral graph tools
Spectral clustering on text notes
Summary
References
Part 3: Temporal Data Applications
6
Stock Market Data
Technical requirements
Introduction to temporal data
Stock market applications
Introduction to centrality metrics
Application of centrality metrics across time slices
Extending network metrics for time series analytics
Summary
References
7
Goods Prices/Sales Data
Technical requirements
An introduction to spatiotemporal data
The Burkina Faso market dataset
Store sales data
Analyzing our spatiotemporal datasets
Summary
References
8
Dynamic Social Networks
Technical requirements
Social networks that change over time
Friendship networks
Triadic closure
A deeper dive into spreading on networks
Dynamic network introduction
SIR models, Part Two
Factors influencing spread
Example with evolving wildlife interaction datasets
Crocodile network
Heron network
Summary
References
Part 4: Advanced Applications
9
Machine Learning for Networks
Technical requirements
Introduction to friendship networks and friendship relational datasets
Friendship network introduction
Friendship demographic and school factor dataset
ML on networks
Clustering based on student factors
Clustering based on student factors and network metrics
Spectral clustering on the friendship network
DL on networks
GNN introduction
Example GNN classifying the Karate Network dataset
Summary
References
10
Pathway Mining
Technical requirements
Introduction to Bayesian networks and causal pathways
Bayes’ Theorem
Causal pathways
Bayesian networks
Educational pathway example
Outcomes in education
Course sequences
Antecedents to success
Analyzing course sequencing to find optimal student pathways to graduation
Introduction to a dataset
bnlearn analysis
Structural equation models
Summary
References
11
Mapping Language Families – an Ontological Approach
Technical requirements
What is an ontology?
Introduction to ontologies
Representing information as an ontology
Language families
Language drift and relationships
Nilo-Saharan languages
Mapping language families
Summary
References
12
Graph Databases
Introduction to graph databases
What is a graph database?
What can you represent in a graph database?
Querying and modifying data in Neo4j
Basic query example
More complicated query examples
Summary
References
13
Putting It All Together
Technical requirements
Introduction to the problem
Ebola spread in the Democratic Republic of Congo – 2018-2020 outbreak
Geography and logistics
Introduction to GEEs
Mathematics of GEEs
Our problem and GEE formulation
Data transformation
Python wrangling
GEE input
Data modeling
Running the GEE in Python
Summary
References
14
New Frontiers
Quantum network science algorithms
Graph coloring algorithms
Max flow/min cut
Neural network architectures as graphs
Deep learning layers and connections
Analyzing architectures
Hierarchical networks
Higher-order structures and network data
An example using gene families
Hypergraphs
Displaying information
Metadata
Summary
References
Index
Other Books You May Enjoy
Preface
Hello there! Network science combines the power of analytics with the deep theoretical tools of graph theory to solve difficult problems in data analytics. This empowers researchers and industry engineers/data scientists to analyze data at scale and reframe intractable analytics problems to produce powerful insights into problems and predictions about system behaviors, including biological, physical, and social systems of interest.
There are many important applications of network science today, including these:
Social network data
Spatial data
Time series data
Spatiotemporal data
More advanced data structures, such as ontologies or hypergraphs
This book gives a brief overview of social network applications and focuses on the cutting edge of network science applications to areas of data science, such as transportation logistics, conservation, public health, linguistics, and education. By the end of your journey, you’ll be able to frame your own data problem within the framework of network science to derive insights and tackle difficult problems in your field.
We will provide the necessary mathematical background as we dive into practical examples and code related to our work in academia and industry over the past decades, including work on predicting Ebola outbreaks, forecasting food price volatility, modeling genetic and linguistic relationships, and mining social networks for insights into social tie formation. As the world faces food shortages, public health crises, economic inequality, supply chain breakdowns, and environmental crises, network science will play an important role in big data analytics for social good.
Who this book is for
This book is for you if you are working with data. To get the most out of the book, you should have some familiarity with Python, particularly the pandas and numpy packages. In addition, some familiarity with data analytics is assumed, though the network science tools and problems we tackle are built from scratch for readers without a background in those problems or methods.
Network science has a rich history in many scientific disciplines, including epidemiology, biomedical engineering, sociology, genetics, environmental science, particle physics, computer science, and economics. Its foundations in graph theory influence research in many areas of pure and applied mathematics as well. Anyone in the fields of science, technology, engineering, and mathematics can benefit from network science’s toolset and approach to problem-solving.
What this book covers
Chapter 1, What Is a Network?, introduces the theoretical concept of a network and provides several examples of networks in real-world applications, including work with random graphs. We’ll also get started with Python’s igraph and NetworkX packages.
Chapter 2, Wrangling Data into Networks with NetworkX and igraph, builds on Chapter 1 by providing three examples of real-world data that can be formulated as network data and showing how to convert data into network form in Python. We’ll introduce problems involving spatial data, temporal data, and spatiotemporal data and explore how network science can solve these problems by converting the data into network form.
Chapter 3, Demographic Data, explores two real-world projects using demography data from the developing world to understand network structures and capacity for information/infectious disease spread. We’ll consider the demographic characteristics and network properties of a friend group to see how both types of information can influence disease spread.
Chapter 4, Transportation Data, provides a real-world example of a transportation network and introduces tools related to minimum paths and network flow. We’ll consider optimal routing and the shortest paths to destinations, including multistop pathways from one location to another.
Chapter 5, Ecological Data, shows a real-world example of an ecological network and introduces spectral graph theory tools, including spectral clustering and graph Laplacians.
Chapter 6, Stock Market Data, examines a real-world example of stock market data analysis with network tools, including edge-based centrality measures of volatility. We’ll mine data for tipping points, heralding either a period of market growth or market crash.
Chapter 7, Goods Prices/Sales Data, provides two real-world examples of commerce data analysis over both space and time with tools previously covered in time series and spatial data applications. We’ll examine sales and pricing trends across time and space to better understand consumer behavior and the impacts of pricing changes across time and space.
Chapter 8, Dynamic Social Networks, introduces a real-world example of social network datasets evolving over time and analyzes their vulnerability to spreading processes, such as epidemics and misinformation flow. We’ll consider factors influencing ecological social networks’ vulnerability to spread of disease.
Chapter 9, Machine Learning for Networks, presents a comprehensive description of network-based machine learning and deep learning, including examples with supervised, unsupervised, and semi-supervised learning to understand disease risks within social networks.
Chapter 10, Pathway Mining, introduces Bayesian networks and mining for causal pathways using an educational data example, where we’ll see how course sequencing and performance influence student outcomes.
Chapter 11, Mapping Language Families – an Ontological Approach, covers ontologies and maps between ontologies using a linguistic data example from the Nilo-Saharan language family and its lexicon variations.
Chapter 12, Graph Databases, introduces graph databases with Neo4j, including data from prior chapters and how to query Neo4j with graph tools introduced in prior chapters and Neo4j’s query language. We’ll see how graph databases and network science tools create synergy in data science, as well as efficient data storage solutions.
Chapter 13, Putting It All Together, ties together material from previous chapters into a final project, analyzing spatiotemporal network data and demographic data from Ituri and North Kivu provinces with generalized estimating equations to understand the evolution of the 2019 Ebola epidemic.
Chapter 14, New Frontiers, introduces quantum graph algorithms, graph theory for neural network optimization, hierarchical networks, and hypergraphs.
To get the most out of this book
We provide Python scripts and assume some knowledge of basic Python analytics packages (such as NumPy and scikit-learn) and Python syntax. We assume some knowledge of basic analytics tasks such as summary statistics and working with different types of data in Python with either numpy or pandas. Scripts are written for each chapter, with later scripts often depending on earlier scripts in the chapter to build knowledge. Other concepts in Python and in analytics will be introduced conceptually and then with Python code examples.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You are encouraged to try out the code examples in this book on your real-world data science projects. If you want to delve deeper into graph algorithms and network science, we encourage you to look at the latest research papers on network science topics. Google Scholar and arXiv are two good references for network science methods and application papers.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Modern-Graph-Theory-Algorithms-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: This script shows that average subgraph centrality varies between the two subfamily trees, with Greenberg’s average subgraph centrality of 2.478 and Dimmendaal’s average subgraph centrality of 3.276.
A block of code is set as follows:
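# Compare subgraph centrality of language families
# (G and G2 are assumed to be NetworkX graphs of the two subfamily trees)
import networkx as nx
import numpy as np

gs = nx.subgraph_centrality(G)
print(np.mean(np.array(list(gs.values()))))
gs2 = nx.subgraph_centrality(G2)
print(np.mean(np.array(list(gs2.values()))))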
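The sample above relies on two graphs, G and G2, that are built elsewhere in the book. A minimal self-contained sketch of the same comparison follows, assuming NetworkX and NumPy are installed. The two toy graphs and the helper name mean_subgraph_centrality are hypothetical stand-ins chosen for illustration, not the book's language-family data or code.

```python
import networkx as nx
import numpy as np

# Hypothetical stand-ins for the two subfamily trees (not the book's data)
G = nx.path_graph(4)    # a chain: 0-1-2-3
G2 = nx.star_graph(3)   # a hub (node 0) connected to three leaves


def mean_subgraph_centrality(graph):
    """Average subgraph centrality over all nodes of an undirected graph."""
    scores = nx.subgraph_centrality(graph)  # dict: node -> centrality score
    return float(np.mean(list(scores.values())))


mean_G = mean_subgraph_centrality(G)
mean_G2 = mean_subgraph_centrality(G2)
print(mean_G)
print(mean_G2)
```

Wrapping the calculation in a small function avoids repeating the dictionary-to-array conversion for each graph, which is one possible way to keep the comparison readable when more than two graphs are involved.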
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: When you hover over the Movie DBMS label on the right-hand side of the screen, you’ll see a Start button that launches the connection to this database. Click on Start.
Tips or important notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you’ve read Modern Graph Theory Algorithms with Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below
https://packt.link/free-ebook/9781805127895
Submit your proof of purchase
That’s it! We’ll send your free PDF and other benefits to your email directly
Part 1: Introduction to Graphs and Networks with Examples
This part of the book builds