ODSC Machine Learning Guide V1.1
Now is the time to ask and answer these questions. The power of the
tools and applications built from the research in the field is clear; we
must now consider the world we wish to shape using these tools. They
are excellent magnifying glasses, capable of revealing intricate patterns
in the information all around us. They can help us better understand
the world that we have and more effectively build the world that we
want. But like all tools, they must be handled carefully, or we risk
harming ourselves and others. We risk further reinforcing the walls we
find in our society today, built during a less informed and less
analytical era.
In this report, you will find the 10 most popular talks from the 2018
Open Data Science Conferences and the 10 most popular blogs from
OpenDataScience.com in 2018. I hope you take the information and
inspiration herein and use it to further your journey into machine
learning. Please share it widely with anyone you know who wants to be
a part of the community of data scientists and engineers that will help
shape the future of data science and machine learning.

Katherine Gorman
Executive Producer and Co-host of Talking Machines &
Executive Producer, Collective Next

Contents
Editor’s note
Top blogs
Top ODSC talks
From the experts
What’s next
10 Top Blogs | Machine Learning Anthology | 2018
SCIENCE BLOGS

An Introduction to Reinforcement Learning Concepts, Diego Arenas
This recap of the reinforcement learning workshop from ODSC London 2018
will give you the key takeaways you need to implement the framework
yourself.
Read it here.

Understanding the Three Primary Types of Gradient Descent, Daniel Gutierrez
Gain insights into the three primary types of gradient descent, the most
commonly used optimization method deployed in machine learning and deep
learning algorithms.
Read it here.

Client-side Web Development and Machine Learning, Caspar Wylie
See how and why machine learning and web development are beginning to
collaborate rather successfully, such as through the collaboration with
JavaScript.
Read it here.

Gradient Boosting and XGBoost, Daniel Gutierrez
Learn the basics behind gradient boosting for statistical learning and
the popular XGBoost implementation, the most recent evolution of gradient
boosting. See more about how XGBoost became king of the hill for data
scientists desiring accurate predictions.
Read it here.

Check out this tutorial to get a general outline of how different popular
machine learning algorithms work, plus some code recipes you can use
if you want to experiment.
Read it here.

What Overfitting is and How to Fix It, Spencer Norris
Overfitting is an interesting problem with fascinating solutions embedded
in the very structure of the algorithms you’re using. Here, we break down
what overfitting is and how we can provide an antidote to it in the real
world.
Read it here.

How to Define a Machine Learning Problem Like a Detective, Spencer Norris
What steps do you need to take to identify the best machine learning
problems to ask? Learn how to define a machine learning problem, but
through the eyes of a detective.
Read it here.

Three Popular Clustering Methods and When to Use Each, Spencer Norris
Unsupervised machine learning can be very powerful in its own right, and
clustering is by far the most common expression of this group of problems.
Check out this quick rundown of three of the most popular clustering
approaches and what situations each is best suited to.
Read it here.

Crash Course: Pool-Based Sampling in Active Learning, Spencer Norris
Active learning is a class of machine learning problems where labeled data
isn’t available for supervised algorithms. Read more about one of the most
common active learning problems in the field: pool-based sampling.
Read it here.
10 Top Talks
CONFERENCES

Out of 300+ talks from ODSC conferences in 2018, here are the 10
top-rated sessions covering machine learning.

Andreas Mueller
Introduction to Machine Learning & Intermediate Machine Learning with
scikit-learn
Start with the basics of machine learning before you dive into the tools,
applications, and advanced functions!
Watch part one here.
Watch part two here.

Crissman Loomis

Keith Santarelli
Making the Most of Your Time Series: Signal Processing for Machine
Learning Applications
Learn more about common machine learning signal processing techniques on
time series data. This talk discusses a number of common tools in signal
processing and shows how they can be implemented in various Python
packages, including tools to remove “noise” to find underlying trends.
Watch here.

Bernard Marr
business author and strategic advisor to companies and governments. Gain
insight into applied AI and machine learning for your organization.
Watch here.

Jon Peck
OS for AI: How Serverless Computing Enables the Next Gen of ML
This talk examines the need for and implementations of an “Operating
System for AI”: a common interface to use and combine algorithms and a
general architecture for serverless machine learning that is
discoverable, versioned, scalable, and shareable.
Watch here.

Randy Olson
The Past, Present, and Future of Automated ML
In this talk, Randy Olson draws from his AutoML research to discuss the
benefits of AutoML and highlights some promising future directions of the
field.
Watch here.

Jeffrey Yau
Multivariate Time Series Forecasting Using Statistical and ML Models
This lecture discusses the formulation of Vector Autoregressive (VAR)
Models, one of the most important classes of multivariate time series
statistical models, and neural network-based techniques.
Watch here.
Founder, Data Moves Me LLC

Almost all ML models are based on statistics. No one, including your
customers and employees, likes to be treated as a statistic, which is
what your ML processes will tend to do. Plan to mitigate that from the
beginning.
Adam Breindel
Independent Consultant and Instructor, ML/AI and Data Engineering

To be successful, machine learning adopters must enable a flexible
infrastructure and be agile. It is critical to experiment and accept
failure in exchange for quicker learning and less money spent going after
projects that aren’t successful.
Kate Strachnyi
Data Visualization Specialist; Host of Humans of Data Science;
Author of “The Disruptors: Data Science Leaders and Journey to Data Scientist”

2018 was the year that machine learning, and by extension deep learning,
took many industries by storm. Machine learning has touched just about
every problem domain, bringing accelerated diversity in the types of
problems being solved.
Daniel Gutierrez
Data Science Consultant

Avoid learning everything at once. Content from the ML/data science
community can make you feel like you’ve got to know everything, like
yesterday. Resist this. Don’t pick up a textbook and read it cover to
cover. If you’re interested in getting started, spend time thinking about
one question from one subject area that you’re interested in. Find some
data, clean it, explore it, and use it to answer your questions. You
won’t know how. Research what you need to know and figure it out, step by
step.
Brandon Dey
Technical Team Lead, Data Science Global Marketing, Fisher Investments

Conferences in 5 countries
70,000 subscriptions to our weekly newsletter
#1 blog on @ODSC Medium: Three Popular Clustering Methods and When to Use
Each, Spencer Norris
Learn.AI course with the HIGHEST ENROLLMENT: Machine Learning and NLP for
Detecting Fake News
Watch here

field, I’ve learned so much about various frameworks and languages, the
unique ways that different industries are using AI, and of course, how
strong the open data science community is in sharing its knowledge.

There are definitely a few topics in particular that I’m paying extra
attention to heading into 2019. Apache Spark’s MLlib is a frequent topic
of conversation when I speak with data science experts, largely thanks to
MLlib’s remarkable scaling ability in implementing multiple ML
algorithms. At this rate, I can see the scikit-learn library for Python
becoming a common job requirement for ML experts; it’s appearing in
social feeds, featured blogs, and job listings that cross my desk more
and more. The TensorFlow framework is popping up in countless ML
tutorials, so I’m hoping people stay active with it, even with newer
libraries releasing frequently. This should be a no-brainer, but start
looking into automated machine learning if you haven’t already, as it
will help to increase the pace at which you can create more complex ML
processes and algorithms. Let the machines work for you!

Data scientists need to make 2019 a strong year to address the common
“black box” problem. As Daniel Gutierrez, a data science consultant and
frequent author for OpenDataScience.com put it, “I

to appear, developers need to be careful with their applications and to
avoid leaving entryways for malicious attacks, and to be well-prepared to
defend against potential adversarial attacks. It’s better to spend the
time building up your defenses rather than going through the headache of
resolving issues from hackers and malware.

Whether you use the information provided in this anthology to learn a new
tool, framework, or you’re now more invested in security and defense, I
hope that all of the videos, blogs, and insights from the experts in this
anthology prove useful for you. Whether you’re new to data science and
machine learning or a seasoned vet, already working in applied data
science or academia, or you’re just a fan of new technology, the open
data science community is a good place to share knowledge and expand
understanding of the most exciting topics in applied data science.

Alex Landa, ODSC content manager
BE A PART OF THE ODSC COMMUNITY
There are many ways you can engage with the Open Data Science Community
today!

Webinars
We offer free webinars several times a month, covering a variety of
topics. Follow this page to learn more about upcoming webinars.

ODSC Events
East 2019: April 30-May 3
India 2019: August 7-10
West 2019: October 29-November 1
Europe 2019: November 19-22

Meetups
We hold meetups in 37 cities around the world, designed to convene data
scientists for education, networking, and even a little fun. See upcoming
events here.

Become a Speaker
Are you a technical or business expert in the world of data science and
AI? Consider speaking at one of our events! Each event has its own
speaker submission page:
ODSC East 2019
ODSC Europe 2019
And more coming soon!