MLBP Joining the IBM Data Science Community!
We’re excited to announce that the Machine Learning Blueprint is joining the IBM Data Science Community! We’ve always strived to source high-quality content from across the web and put deep thought into our curations. However, delivering this every week is not easy, so after two years of publishing, we decided to take a pause.
Through our work with the IBM Community we identified an opportunity to join forces to grow a community of machine learning practitioners, and we were thrilled at the prospect. Their mission was clear: provide a place for data scientists to interact with other experts, share support and insights, and start dialogue around relevant topics. This aligns with our priorities.
We encourage you to check out the IBM Data Science Community, where you'll find:
- Archives of all past MLBP issues, fully searchable and tagged
- Tutorials, courses, demos, how-to guides, videos, contests, campaigns, in-person events, webinars, podcasts, AMAs and technical articles where you can network and grow your skills in data science
- Thriving discussion forums with over 1,000 posts a month and a rapidly growing population of 114,000 members across the community platform
- A community leaderboard and badging program recognizing users for their engagement and contributions
To continue receiving the newsletter:
Click to Join the IBM Data Science Community and Continue to Receive the Newsletter.
After today, MLBP will send out reminders of the full content available there.
It’s our sincerest intention to maintain the quality and consistency that Machine Learning Blueprint subscribers are accustomed to, and to grow this community into the premier outlet for all things machine learning. We’re looking forward to this journey, and we thank you for joining us on it.
- The Machine Learning Blueprint Editors
Spotlight Articles
An AI App that “Undressed” Women Shows How DeepFakes Harm the Most Vulnerable
Founders of the “DeepNude” app announced this month that it would officially be taken offline. The app used Generative Adversarial Networks (GANs) to conduct image translation and modification, allowing users to generate a fake image of a woman with her clothing removed from an original (clothed) image. Vice reported that the app was trained to specifically target women, evidently generating an image of a female body when provided images of men.
Machine Learning Blueprint's Take
While not the first instance of deepfakes being intimately misused against women, the DeepNude app is a startling example of how widely the technology can be abused. As with other audio and visual deepfake manipulation, this represents an especially concerning example of the risks inherent in generative modeling. Practitioners have a responsibility to lead open conversations about standards as well as methods of using machine learning to fight these potential abuses (see next spotlight), as the technology will no doubt improve, making it more difficult to detect such fakes with the human eye.
Detecting Photoshopped Images in Adobe
UC Berkeley and Adobe team up to create a deep-learning-based approach (ResNet architecture) for detecting images of human faces that have been altered with the advanced Photoshop tool Face Aware Liquify. It can detect with 95% accuracy which images have been altered, and identify the specific areas and methods used. Humans scored 53%, slightly better than guessing. Code and paper here
Machine Learning Blueprint's Take
While Photoshop may not be a traditional machine learning approach, it certainly has dual-use considerations, though perhaps not at scale. This matters because we have an example of a toolmaker releasing both a tool and the decoder ring to detect when it’s being used; this sets a healthy precedent (albeit a little late). Will we see more detection techniques for uncovering transformations made with other tools?
Boston Dynamics Robots Learn to Fight Back
You may have noticed some of the abuse the Boston Dynamics robots have received during past training and demonstrations, meant to foster resilience to environmental factors. Due to a change in their loss function, the robots have exhibited some strange behavior to achieve their tasks that could be chalked up to fighting back. There’s no saying it’s malicious at this point, but a robot might not have a notion of that. If you’ve read this far, spoiler alert: it’s a parody, and a clever one at that.
Learning Machine Learning
Artist’s Tutorial to Using GANs
An end-to-end guide to creating art with a CycleGAN that covers the deep learning mechanics, tips for artistic tuning, data selection (arguably one of the most important aspects of homing in on a style), and even the computation setup and workflow.
Machine Learning Blueprint's Take
This tutorial might be a little advanced for the non-computationally inclined, so help your friends get set up. It’ll be a democratizing moment for this technology when we see more accessible toolkits available. Adobe could integrate technology like this into its Photoshop workflows, but this could bring up more dual-use concerns.
Modern Deep Learning Techniques Applied to Natural Language Processing
A guide to all the SOTA methods, their lead-up technologies and some of the fundamentals employed for an array of NLP tasks. It covers all the papers and provides references to code libraries where available.
Techniques to Curb Overfitting in Self-Driving RC-Cars
A case study in how training data matters more than complex model architectures. These hackathon participants found that tried-and-true optimal control methods were outperforming their deep RL methods because the models learned the race background instead of how to follow the course lines! This breakdown outlines all the methods they applied and how each did or didn’t improve race performance. In the end, it turns out that using a style-transfer GAN improved performance by making the course lines pop visually, allowing the model to ignore background images.
Introducing TensorFlow Privacy: Learning with Differential Privacy for Training Data
Differential privacy in machine learning means that the model does not learn or remember details about any particular data point, or user, during training; without it, a highly parameterized neural network is capable of exactly that kind of memorization. Learn how to use the new features in TensorFlow Privacy out of the box to protect your users and stay ahead of the data privacy curve.
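For intuition, here is a minimal sketch of the gradient step behind differentially private SGD, the general technique that libraries like TensorFlow Privacy build on: clip each example's gradient, sum, add Gaussian noise, then average. This is plain Python, not the TensorFlow Privacy API; the function names and the `max_norm` and `noise_multiplier` parameters are our own illustrative choices.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip every per-example gradient, sum, add Gaussian noise, average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise is scaled to the clipping bound, i.e. the most any single
    # example can contribute to the sum.
    stddev = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical gradients for a batch of two examples.
grads = [[3.0, 4.0], [0.3, 0.4]]
update = private_mean_gradient(
    grads, max_norm=1.0, noise_multiplier=1.1, rng=random.Random(0)
)
```

Clipping bounds how much any one example can move the model, and the noise masks whatever influence remains. A real training run would also track the cumulative privacy budget, which this sketch leaves out.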
Targeted Dropout - Finding Efficient Subnetworks in Over-Parameterized Models
Machine Learning News
An Open Source Toolkit for Debugging and Monitoring Neural Network Training
Microsoft open-sources a library for real-time DNN training monitoring with visualizations in Jupyter Notebooks. By treating all objects as streams and implementing lazy logging, you can observe almost any number of variables since they're only observed, not stored. You can transform or combine streams to make more meaningful observations, or opt to store certain ones. There are also a number of pre- and post-training task helpers, such as architecture visualization, layer stats, and dimensionality reduction visualizations for dataset exploration.
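To make the lazy-logging idea concrete, here is a small sketch of a stream that only evaluates a value when someone is observing it, and that can be transformed into a derived stream. The `Stream` class and its methods are hypothetical and written for illustration; they are not the library's actual API.

```python
class Stream:
    """A value stream that computes nothing unless it is being observed."""

    def __init__(self):
        self._callbacks = []
        self._derived = []  # (stream, transform) pairs created by map()

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def map(self, fn):
        """Return a new stream carrying fn(value) for each value written."""
        derived = Stream()
        self._derived.append((derived, fn))
        return derived

    def _observed(self):
        return bool(self._callbacks) or any(
            d._observed() for d, _ in self._derived
        )

    def write(self, make_value):
        """make_value is a zero-argument callable, called only if observed."""
        if not self._observed():
            return
        value = make_value()
        for callback in self._callbacks:
            callback(value)
        for derived, fn in self._derived:
            derived.write(lambda fn=fn: fn(value))


evaluations = []


def expensive_loss():
    evaluations.append(1)  # record that the computation actually ran
    return 0.25


loss = Stream()
loss.write(expensive_loss)  # nobody is observing, so nothing is computed

seen = []
loss.map(lambda x: x * 100).subscribe(seen.append)
loss.write(expensive_loss)  # now observed: computed once and forwarded
```

Because values pass through to observers and are never kept by the stream itself, the cost of instrumenting a variable is close to zero until someone actually looks at it.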
Machine Learning has Been Used to Automatically Translate Long-Lost Languages
Use of a constrained word2vec-type model is proving useful in helping linguists decode ancient languages, particularly those with limited corpora. They’re also able to use machine translation, leveraging the fact that language evolution is slow; they can use one ancient language and its structures to help decode another.
Using either unsupervised or supervised methods, Facebook Research introduces a way to search a corpus of code with normal search-engine-style queries. The unsupervised method extracts key tokens from method snippets and embeds them with fastText to create document vectors, so that similar code snippets are mapped close together. Queries are embedded the same way, and a FAISS search finds the code snippets whose vectors have a high cosine similarity to the query. The supervised approach is implemented differently and requires a corpus of queries and accurate answers (think Stack Overflow); unsurprisingly, it performs better. So far it does not seem like there is a published library for this. When it comes to ML on source code in general, check out this awesome-list.
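The unsupervised pipeline can be sketched in a few lines. This toy version is an assumption-laden stand-in, not Facebook's implementation: token counts replace fastText embeddings, a brute-force scan replaces the FAISS index, and the snippet corpus and function names are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split code or a query into lowercase word tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)  # split camelCase
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def embed(text):
    """Toy document vector: token counts (stand-in for fastText vectors)."""
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, snippets):
    """Rank snippets by cosine similarity to the query, best match first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(code))) for name, code in snippets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus of method snippets.
SNIPPETS = {
    "read_file": "def read_file(path): return open(path).read()",
    "sort_list": "def sort_list(items): return sorted(items)",
    "http_get": "def http_get(url): return urllib.request.urlopen(url).read()",
}

results = search("how to sort a list of items", SNIPPETS)
best_match = results[0][0]
```

Learned embeddings matter because they let a query match code that shares no literal tokens with it, which plain token counts cannot do.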
Apple Core ML 3 Release Details
A huge update that adds countless new models, neural network layers, varying model precision, and a new model definition format using protobuf. iOS apps leveraging ML can seriously step up their game.
Using Deep Learning to Curb Checkout Theft
Intel & Baidu Release on AI Training Processor - Nervana
Interesting Research
Language, Trees, and Geometry in Neural Networks; A Visualization Technique to Understand BERT.
Weight Agnostic Neural Networks
Please pass this along to your family and friends, and join the IBM Data Science Community by following the link below:
Click to Join the IBM Data Science Community and Continue to Receive the Newsletter.