Module 3
• Text can be used to make predictions, but we first have to convert text to
“features”
• Sentiment, spelling, number of words used, etc.
• We can pre-process text to prepare it for analysis
• Correct issues with white space, extra spaces, punctuation, etc.
• Then identify what it is about the text that might be important for making
predictions in the future
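The pre-processing and feature steps above can be sketched in a few lines of Python. This is only an illustration: the function names and the three features chosen (word count, exclamation marks, average word length) are assumptions, not features prescribed by the slides.

```python
import re
import string

def preprocess(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def extract_features(text: str) -> dict:
    """Turn raw text into a few simple numeric features (illustrative choice)."""
    words = preprocess(text).split()
    return {
        "num_words": len(words),
        # Counted on the raw text, before punctuation is removed.
        "num_exclamations": text.count("!"),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
    }

# Made-up review text, used only to exercise the functions.
review = "  Great   product!!  Works as described.  "
print(preprocess(review))
print(extract_features(review))
```

The resulting dictionary is the kind of "feature" row a prediction model could consume in place of the raw text.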
Text Features
Example: A word
• Which words appearing in different texts predict outcomes?
• Example: Online review
• Look at particular words that have meaning for making predictions about
product sales, repeat purchases, whether or not a review is “helpful”
• Could be sentiment words (positive/negative) or words about the
product
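As a rough sketch of word-level features for the online review example: the word lists, the reviews, and the 0/1 outcome labels below are all made up for illustration, and the helper names are hypothetical.

```python
# Hypothetical word lists, chosen only for illustration.
POSITIVE_WORDS = {"great", "love", "excellent", "reliable"}
NEGATIVE_WORDS = {"broken", "poor", "refund", "disappointed"}

def tokenize(review: str) -> list:
    """Split on whitespace, trim surrounding punctuation, and lowercase."""
    return [t.strip(".,!?").lower() for t in review.split()]

def word_features(review: str) -> dict:
    """Count how many positive and negative words a review contains."""
    tokens = tokenize(review)
    return {
        "positive_count": sum(t in POSITIVE_WORDS for t in tokens),
        "negative_count": sum(t in NEGATIVE_WORDS for t in tokens),
    }

def outcome_rate_for_word(word: str, labeled_reviews: list):
    """Share of reviews containing `word` whose outcome label is 1."""
    outcomes = [label for text, label in labeled_reviews if word in tokenize(text)]
    return sum(outcomes) / len(outcomes) if outcomes else None

# Made-up reviews; label 1 could mean "repeat purchase" or "marked helpful".
reviews = [
    ("Great battery life, love it!", 1),
    ("Arrived broken, asked for a refund.", 0),
    ("Excellent value and reliable.", 1),
    ("Poor packaging but great screen.", 0),
]

for text, label in reviews:
    print(word_features(text), "outcome:", label)
print("outcome rate for 'great':", outcome_rate_for_word("great", reviews))
```

With real data, the same idea scales up: each candidate word becomes a column, and a model learns which columns carry predictive weight.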
Text Features
• These features extracted from the text allow us to predict outcomes (e.g.
sentiment predicts buying behavior)
• Deep learning gives us even more flexibility
• Can combine the text content in richer and more meaningful ways
Natural Language Processing
• Generative adversarial networks (GANs) are used to generate artificial content
that is increasingly hard to tell apart from real content
• Use two networks “competing” with one another
• A generative network (the generator) produces new content
• Another network, the discriminator, tries to tell whether a given piece of
content is real or was produced by the generator
Generator and Discriminator Networks
• Over time, the generator will learn what it needs to do to create content that is
harder and harder for the discriminator to identify as being fake content
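The generator/discriminator loop can be illustrated with a deliberately tiny, one-dimensional toy. Everything here is an assumption made for the sketch: the "real" data are numbers drawn around 4.0, the generator is a linear function of noise, the discriminator is a logistic function, and gradients are written out by hand. Real GANs use deep networks and a framework; a toy this simple may oscillate rather than settle.

```python
import math
import random

def sigmoid(s: float) -> float:
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=200, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # assumed "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = (sum((d - 1) * x for d, x in zip(d_real, real))
                  + sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (sum(d - 1 for d in d_real) + sum(d_fake)) / batch
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: push D(fake) toward 1, i.e. try to fool the discriminator.
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_a = sum((d - 1) * w * z for d, z in zip(d_fake, noise)) / batch
        grad_b = sum((d - 1) * w for d in d_fake) / batch
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, w, c

a, b, w, c = train_toy_gan()
print(f"generator: G(z) = {a:.2f}*z + {b:.2f}")
```

The two alternating updates are the point of the sketch: the discriminator is trained to separate real from generated samples, and the generator is then nudged in whatever direction makes its samples look more "real" to the current discriminator.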
GAN Example: Real vs. Fake Faces
Generative Adversarial Networks (GANs)
Practices and tools used to build, test, and deploy code to production
https://docs.gitlab.com/ee/ci/introduction/
Machine Learning Workflow
• Code is not the only source of change: the data might change, and the model
itself might change as it re-trains
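One way to make the three sources of change visible is to record a fingerprint of each alongside every run. The sketch below is an assumption about how that could look, not a description of any particular ML Ops tool; the function names, version string, and sample values are hypothetical.

```python
import hashlib
import json

def fingerprint(obj) -> str:
    """Short, stable hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

def snapshot(code_version: str, training_rows: list, model_params: dict) -> dict:
    """Record what the code, the data, and the model looked like for one run."""
    return {
        "code": code_version,
        "data": fingerprint(training_rows),
        "model": fingerprint(model_params),
    }

def changed_parts(old: dict, new: dict) -> list:
    """List which of code, data, and model differ between two snapshots."""
    return [key for key in ("code", "data", "model") if old[key] != new[key]]

# Hypothetical runs: same code, but new data arrived and the model re-trained.
before = snapshot("v1.0", [[1, 0], [2, 1]], {"weights": [0.5, -0.2]})
after = snapshot("v1.0", [[1, 0], [2, 1], [3, 1]], {"weights": [0.7, -0.1]})
print(changed_parts(before, after))  # ['data', 'model']
```

Here the code version is unchanged, yet the system behaves differently, which is the situation a code-only deployment pipeline would miss.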
Existing ML Ops Tools
[Diagram: Product(s), AI Systems, and Users, labeled Original and New; strategies numbered from 1 (Start with a non-AI product that generates data) to 5 (Rethink the need for data)]
Solving the Chicken & Egg Problem in AI Entrepreneurship
1. Start with a Non-AI Product that Generates Data
• Create a non-AI service that solves customer problems and generates data in
the process
• This data can then be used to train an AI system that enhances the existing
service or creates a related service
• Example: at Lemonade, AI now handles the “first notice of loss” for 96% of
claims and manages full claim resolution without human involvement in 1/3 of cases
https://www.lemonade.com/blog/the-sixth-sense/
https://www.sec.gov/Archives/edgar/data/1691421/000104746920003846/a2241899zs-1a.htm
2. Partner With An Organization That Has Data
• Example: Stanford Medicine + Google
• Combine patient data with Google’s cloud and AI
capabilities to answer important questions in healthcare
• Use alarm data to distinguish “false alarms from
real ones” in hospitalized patients’ monitors
https://med.stanford.edu/news/all-news/2016/08/stanford-medicine-google-team-up-to-harness-power-of-data-science.html
3. Crowdsource the (Labeled) Data You Need
https://medium.com/swlh/ai-labeling-crowdsourcing-platforms-630adbc79c40
CAPTCHA Image: https://chrome.google.com/webstore/detail/buster-captcha-solver-for/mpbjkejclgfgadiemmefgebjfooflfhl
4. Make Use of Public Data (and Pre-Trained Models)
https://www.cnbc.com/2020/03/03/bluedot-used-artificial-intelligence-to-predict-coronavirus-spread.html
5. Rethink the Need for Data
https://medium.com/curai-tech/the-science-of-assisting-medical-diagnosis-from-expert-systems-to-machine-learned-models-cc2ef0b03098
Reinforcement Learning
• AI systems do not begin with large training datasets, but learn by taking
actions and observing the results
• Google’s AlphaGo was trained on a large dataset, but its successor,
AlphaZero, learned through reinforcement learning alone, and AlphaZero still beat
AlphaGo (which itself beat world champion Lee Sedol)
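Learning "by taking actions and observing the results" can be shown with a very small example. The sketch below is an epsilon-greedy multi-armed bandit, a much simpler setting than the one AlphaZero works in; the win rates, step count, and function name are assumptions made for illustration.

```python
import random

def run_bandit(true_win_rates, steps=2000, epsilon=0.1, seed=0):
    """Learn which action pays off best with no training data up front."""
    rng = random.Random(seed)
    n = len(true_win_rates)
    counts = [0] * n        # how often each action has been tried
    estimates = [0.0] * n   # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                            # explore
        else:
            action = max(range(n), key=lambda i: estimates[i])   # exploit
        # Observe the result of the chosen action (reward is 1 or 0).
        reward = 1.0 if rng.random() < true_win_rates[action] else 0.0
        counts[action] += 1
        # Incremental update of the average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical win rates, hidden from the learner.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
print("estimated win rates:", [round(e, 2) for e in estimates])
print("times each action was tried:", counts)
```

The learner starts with no dataset at all; its estimates come entirely from the rewards it observes after acting.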
Expert Systems