ODSC Machine Learning Guide V1.1

This document provides a summary of the 10 most popular machine learning blogs from OpenDataScience.com in 2018. It briefly describes each blog post, including introductions to reinforcement learning concepts, understanding gradient descent, client-side web development with machine learning, gradient boosting and XGBoost algorithms, and more. The blogs cover a variety of machine learning topics from introductions to specific algorithms to discussions of overfitting and active learning.

Uploaded by

Nishant Tripathi

Machine Learning Guide
The Open Data Community’s Top 20 Resources

FOREWORD
2018 was another year of intense growth - and attention - for the fields
of machine learning and artificial intelligence. As technologies and
techniques improve across industry and academia, this space draws
more investment, but also more scrutiny.

As a result, in 2018 we saw the community of practitioners paying more
attention to the impacts of their work and research on the wider world.
All corners of the field saw a focus on technical approaches to resolving
bias, moving away from operational black boxes, and taking up the
ideas of transparency and explainability.

Now is the time to ask and answer these questions. The power of the
tools and applications built from the research in the field is clear; we
must now consider the world we wish to shape using these tools. They
are excellent magnifying glasses, capable of revealing intricate patterns
in the information all around us. They have the ability to help us better
understand the world that we have, and more effectively build the
world that we want. But like all tools, they must be handled carefully,
or we risk harming ourselves and others. We risk further reinforcing
the walls we find in our society today, walls built during a less informed
and less analytical era.

The only way to combat this is to widen the conversation, to equip
anyone who wants to learn with an understanding of the fundamental
principles and technical knowledge of the field. Only by this sharing of
knowledge, this open exchange of ideas, will we be able to make sure
these tools truly benefit everyone. The work that was done in 2018 and
that will be done in 2019 will define not just the machine learning
community, but how the world uses these technologies in our day-to-day
lives.

In this report, you will find the 10 most popular talks from the 2018
Open Data Science Conferences and the 10 most popular blogs from
OpenDataScience.com in 2018. I hope you take the information and
inspiration herein and use it to further your journey into machine
learning. Please share it widely with anyone you know who wants to be
a part of the community of data scientists and engineers that will help
shape the future of data science and machine learning.

Katherine Gorman
Executive Producer and Co-host of Talking Machines &
Executive Producer, Collective Next

Contents
Top Blogs
Top ODSC Talks
From the Experts
Editor’s Note
What’s Next

Top Blogs | Page 4 Machine Learning Anthology | 2018

TOP OPEN DATA SCIENCE BLOGS

In 2018 alone, we published nearly 400 articles on data science. Machine
learning was a common and well-received topic in our community. Here
are the 10 most-read machine learning blogs.

An Introduction to Reinforcement Learning Concepts, Diego Arenas
This recap of the reinforcement learning workshop from ODSC London 2018
will give you the key takeaways you need to implement the framework
yourself.
Read it here.

Understanding the Three Primary Types of Gradient Descent, Daniel Gutierrez
Gain insights into the three primary types of gradient descent, the most
commonly used optimization method deployed in machine learning and deep
learning algorithms.
Read it here.

Client-side Web Development and Machine Learning, Caspar Wylie
See how and why machine learning and web development are beginning to
collaborate rather successfully, such as through the collaboration with
JavaScript.
Read it here.

Gradient Boosting and XGBoost, Daniel Gutierrez
Learn the basics behind gradient boosting for statistical learning and the
popular XGBoost implementation - the most recent evolution of gradient
boosting. See more about how XGBoost became king of the hill for data
scientists desiring accurate predictions.
Read it here.

Machine Learning for Beginners - a How-to Guide, Spencer Norris
Check out this tutorial to get a general outline of how different popular
machine learning algorithms work, plus some code recipes you can use
if you want to experiment.
Read it here.

Efficient, Simplistic Training Pipelines for GANs in the Cloud with
Paperspace, Spencer Norris
Generative adversarial networks are making waves in the world of machine
learning. Learn about a package that wraps PyTorch implementations for ten
different types of GANs in an easy-to-use interface.
Read it here.

What Overfitting is and How to Fix It, Spencer Norris
Overfitting is an interesting problem with fascinating solutions embedded in
the very structure of the algorithms you’re using. Here, we break down what
overfitting is and how we can provide an antidote to it in the real world.
Read it here.

How to Define a Machine Learning Problem Like a Detective, Spencer Norris
What steps do you need to take to identify the best machine learning
problems to ask? Learn how to define a machine learning problem, but
through the eyes of a detective.
Read it here.

Three Popular Clustering Methods and When to Use Each, Spencer Norris
Unsupervised machine learning can be very powerful in its own right, and
clustering is by far the most common expression of this group of problems.
Check out this quick rundown of three of the most popular clustering
approaches and what situations each is best suited to.
Read it here.

Crash Course: Pool-Based Sampling in Active Learning, Spencer Norris
Active learning is a class of machine learning problems where labeled data
isn’t available for supervised algorithms. Read more about one of the most
common active learning problems in the field: pool-based sampling.
Read it here.

TOP TALKS FROM ODSC CONFERENCES

Out of 300+ talks from ODSC conferences in 2018, here are the 10 top-rated
sessions covering machine learning.

Andreas Mueller
Introduction to Machine Learning & Intermediate Machine Learning with
scikit-learn
Start with the basics of machine learning before you dive into the tools,
applications, and advanced functions!
Watch part one here.
Watch part two here.

Crissman Loomis
Machine Learning in Chainer Python
When choosing a framework for working on neural networks, it is important to
choose a framework that is flexible and allows for customization. Chainer is a
neural network framework written almost entirely in Python. Gain the knowledge
you need to get started with Chainer, from data formatting and augmentation to
reinforcement learning and more.
Watch here.

Keith Santarelli
Making the Most of Your Time Series: Signal Processing for Machine Learning
applications
Learn more about common machine learning signal processing techniques on
time series data. This talk discusses a number of common tools in signal
processing and shows how they can be implemented in various Python
packages, including tools to remove “noise” to find underlying trends.
Watch here.

Jared Lander
Machine Learning in R Parts I & II
This two-part course focuses on the available methods for implementing
machine learning algorithms in R and examines some of the underlying theory
behind the curtain.
Watch part one here.
Watch part two here.

Bernard Marr
Artificial Intelligence and Machine Learning in Practice
This ODSC Europe keynote features Bernard Marr, internationally best-selling
business author and strategic advisor to companies and governments. Gain
insight into applied AI and machine learning for your organization.
Watch here.

Jon Peck
OS for AI: How Serverless Computing Enables the Next Gen of ML
This talk examines the need for and implementations of an “Operating System
for AI” - a common interface to use and combine algorithms and a general
architecture for serverless machine learning that is discoverable, versioned,
scalable, and shareable.
Watch here.

Randy Olson
The Past, Present, and Future of Automated ML
In this talk, Randy Olson draws from his AutoML research to discuss the benefits
of AutoML and highlights some promising future directions of the field.
Watch here.

Yuriy Guts
Target Leakage in Machine Learning
Target leakage is one of the most difficult problems in developing real-world
machine learning models. Hear more about real-life examples of data leakage at
different stages of the data science project lifecycle, and discuss various
countermeasures and best practices for model validation.
Watch here.

Jeffrey Yau
Multivariate Time Series Forecasting Using Statistical and ML Models
This lecture discusses the formulation of Vector Autoregressive (VAR) Models,
one of the most important classes of multivariate time series statistical
models, and neural network-based techniques.
Watch here.

Kirk Borne
A Tour of Machine Learning Algorithms: The Usual Suspects in Some Unusual
Applications
Walk through use cases for several machine learning algorithms to see how to
implement them in interesting and unexpected ways. Topics include predictive
modeling and anomaly discovery.
Watch here.

FROM THE EXPERTS:
What do leading data scientists think about the state of machine learning?

My hope is that new practitioners will start to have a clearer idea of how to
break into the field. Currently there is no standardization across different
types of data science roles and titles. In addition, since the field is
multidisciplinary, those trying to learn have a difficult time prioritizing
what they should learn and don’t feel confident in starting to apply to jobs
because the learning could theoretically go on forever. My message to those
people is to start applying so that they can receive feedback from the job
market to help them in their pursuit.
Kristen Kehrer
Founder, Data Moves Me LLC

Almost all ML models are based on statistics. No one, including your customers
and employees, likes to be treated as a statistic, which is what your ML
processes will tend to do. Plan to mitigate that from the beginning.
Adam Breindel
Independent Consultant and Instructor, ML/AI and Data Engineering

To be successful, machine learning adopters must enable a flexible
infrastructure and be agile. It is critical to experiment and accept failure in
exchange for quicker learning and less money spent going after projects that
aren’t successful.
Kate Strachnyi
Data Visualization Specialist; Host of Humans of Data Science;
Author of “The Disruptors: Data Science Leaders and Journey to Data Scientist”

2018 was the year that machine learning, and by extension deep learning, have
taken many industries by storm. Machine learning has touched just about every
problem domain, bringing accelerated diversity in the types of problems being
solved.
Daniel Gutierrez
Data Science Consultant

Avoid learning everything at once. Content from the ML/data science community
can make you feel like you’ve got to know everything, like yesterday. Resist
this. Don’t pick up a textbook and read it cover to cover. If you’re interested
in getting started, spend time thinking about one question from one subject
area that you’re interested in. Find some data, clean it, explore it, and use
it to answer your questions. You won’t know how. Research what you need to
know and figure it out, step by step.
Brandon Dey
Technical Team Lead, Data Science Global Marketing, Fisher Investments

Open Data Science HIGHLIGHTS
Conferences in 5 countries
70,000 subscriptions to our weekly newsletter
#1 blog on @ODSC Medium: Three Popular Clustering Methods and When to Use
Each, Spencer Norris
Learn.AI course with the HIGHEST ENROLLMENT: Machine Learning and NLP for
Detecting Fake News
Watch here

Letter from the Editor

One common thread I’ve seen at every stop in my career is the ever-increasing
focus on using data to make informed decisions. Now, perched in a place to see
day-to-day developments in the field, I’ve learned so much about various
frameworks and languages, the unique ways that different industries are using
AI, and of course, how strong the open data science community is in sharing its
knowledge.

There are definitely a few topics in particular that I’m paying extra attention
to heading into 2019. Apache Spark’s MLlib is a frequent topic of conversation
when I speak with data science experts, largely thanks to MLlib’s remarkable
scaling ability in implementing multiple ML algorithms. At this rate, I can see
the scikit-learn library for Python becoming a common job requirement for ML
experts - it’s appearing in social feeds, featured blogs, and job listings that
cross my desk more and more. The TensorFlow framework is popping up in
countless ML tutorials, so I’m hoping people stay active with it, even with
newer libraries releasing frequently. This should be a no-brainer, but start
looking into automated machine learning if you haven’t already, as it will help
to increase the pace at which you can create more complex ML processes and
algorithms. Let the machines work for you!

Data scientists need to make 2019 a strong year to address the common “black
box” problem. As Daniel Gutierrez, a data science consultant and frequent
author for OpenDataScience.com, put it, “I hope the trend of ‘explainability’ or
‘interpretability’ of AI will continue to be seen as critical to the continued
acceptance of the technology.”

Since machine learning may open up more chances for vulnerabilities to appear,
developers need to be careful with their applications, avoid leaving entryways
for malicious attacks, and be well-prepared to defend against potential
adversarial attacks. It’s better to spend the time building up your defenses
rather than going through the headache of resolving issues from hackers and
malware.

Whether you use the information provided in this anthology to learn a new tool
or framework, or you’re now more invested in security and defense, I hope that
all of the videos, blogs, and insights from the experts in this anthology prove
useful for you. Whether you’re new to data science and machine learning or a
seasoned vet, already working in applied data science or academia, or you’re
just a fan of new technology, the open data science community is a good place
to share knowledge and expand understanding of the most exciting topics in
applied data science.

Alex Landa, ODSC content manager

BE A PART OF THE ODSC COMMUNITY
There are many ways you can engage with the Open Data Science
Community today!

ODSC Events
East 2019: April 30-May 3
India 2019: August 7-10
West 2019: October 29-November 1
Europe 2019: November 19-22

Meetups
We hold meetups in 37 cities around the world, designed to convene data
scientists for education, networking, and even a little fun. See upcoming
events here.

Weekly Newsletter
Don’t miss any future articles on data science and machine learning!
Sign up for our weekly newsletter and get tutorials, insights, and the
latest news sent to you directly.

Webinars
We offer free webinars several times a month, covering a variety of topics.
Follow this page to learn more about upcoming webinars.

Become a Speaker
Are you a technical or business expert in the world of data science
and AI? Consider speaking at one of our events! Each event has its own
speaker submission page:
ODSC East 2019
ODSC Europe 2019
And more coming soon!

Partner with ODSC
We also offer opportunities for partnerships! Have your product,
service, or research seen by thousands of data scientists at an
event. Learn more here.
