Artificial Intelligence | 0 articles | Tech News, Tutorials & Expert Insights

article-image-sherin-thomas-explains-how-to-build-a-pipeline-in-pytorch-for-deep-learning-workflows

09 May 2019

8 min read

Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows

09 May 2019

A typical deep learning workflow starts with ideation and research around a problem statement, where the architectural design and model decisions come into play. Following this, the theoretical model is experimented using prototypes. This includes trying out different models or techniques, such as skip connection, or making decisions on what not to try out. PyTorch was started as a research framework by a Facebook intern, and now it has grown to be used as a research or prototype framework and to write an efficient model with serving modules. The PyTorch deep learning workflow is fairly equivalent to the workflow implemented by almost everyone in the industry, even for highly sophisticated implementations, with slight variations. In this article, we explain the core of ideation and planning, design and experimentation of the PyTorch deep learning workflow. This article is an excerpt from the book PyTorch Deep Learning Hands-On by Sherin Thomas and Sudhanshi Passi. This book attempts to provide an entirely practical introduction to PyTorch. This PyTorch publication has numerous examples and dynamic AI applications and demonstrates the simplicity and efficiency of the PyTorch approach to machine intelligence and deep learning. Ideation and planning Usually, in an organization, the product team comes up with a problem statement for the engineering team, to know whether they can solve it or not. This is the start of the ideation phase. However, in academia, this could be the decision phase where candidates have to find a problem for their thesis. In the ideation phase, engineers brainstorm and find the theoretical implementations that could potentially solve the problem. In addition to converting the problem statement to a theoretical solution, the ideation phase is where we decide what the data types are and what dataset we should use to build the proof of concept (POC) of the minimum viable product (MVP). Also, this is the stage where the team decides which framework to go with by analyzing the behavior of the problem statement, available implementations, available pretrained models, and so on. This stage is very common in the industry, and I have come across numerous examples where a well-planned ideation phase helped the team to roll out a reliable product on time, while a non-planned ideation phase destroyed the whole product creation. Design and experimentation The crucial part of design and experimentation lies in the dataset and the preprocessing of the dataset. For any data science project, the major timeshare is spent on data cleaning and preprocessing. Deep learning is no exception from this. Data preprocessing is one of the vital parts of building a deep learning pipeline. Usually, for a neural network to process, real-world datasets are not cleaned or formatted. Conversion to floats or integers, normalization and so on, is required before further processing. Building a data processing pipeline is also a non-trivial task, which consists of writing a lot of boilerplate code. For making it much easier, dataset builders and DataLoader pipeline packages are built into the core of PyTorch. The dataset and DataLoader classes Different types of deep learning problems require different types of datasets, and each of them might require different types of preprocessing depending on the neural network architecture we use. This is one of the core problems in deep learning pipeline building. Although the community has made the datasets for different tasks available for free, writing a preprocessing script is almost always painful. PyTorch solves this problem by giving abstract classes to write custom datasets and data loaders. The example given here is a simple dataset class to load the fizzbuzz dataset, but extending this to handle any type of dataset is fairly straightforward. PyTorch's official documentation uses a similar approach to preprocess an image dataset before passing that to a complex convolutional neural network (CNN) architecture. A dataset class in PyTorch is a high-level abstraction that handles almost everything required by the data loaders. The custom dataset class defined by the user needs to override the __len__ and __getitem__ functions of the parent class, where __len__ is being used by the data loaders to determine the length of the dataset and __getitem__ is being used by the data loaders to get the item. The __getitem__ function expects the user to pass the index as an argument and get the item that resides on that index: from dataclasses import dataclassfrom torch.utils.data import Dataset, DataLoader@dataclass(eq=False)class FizBuzDataset(Dataset): input_size: int start: int = 0 end: int = 1000 def encoder(self,num): ret = [int(i) for i in '{0:b}'.format(num)] return[0] * (self.input_size - len(ret)) + ret def __getitem__(self, idx): x = self.encoder(idx) if idx % 15 == 0: y = [1,0,0,0] elif idx % 5 ==0: y = [0,1,0,0] elif idx % 3 == 0: y = [0,0,1,0] else: y = [0,0,0,1] return x,y def __len__(self): return self.end - self.start The implementation of a custom dataset uses brand new dataclasses from Python 3.7. dataclasses help to eliminate boilerplate code for Python magic functions, such as __init__, using dynamic code generation. This needs the code to be type-hinted and that's what the first three lines inside the class are for. You can read more about dataclasses in the official documentation of Python (https://docs.python.org/3/library/dataclasses.html). The __len__ function returns the difference between the end and start values passed to the class. In the fizzbuzz dataset, the data is generated by the program. The implementation of data generation is inside the __getitem__ function, where the class instance generates the data based on the index passed by DataLoader. PyTorch made the class abstraction as generic as possible such that the user can define what the data loader should return for each id. In this particular case, the class instance returns input and output for each index, where, input, x is the binary-encoder version of the index itself and output is the one-hot encoded output with four states. The four states represent whether the next number is a multiple of three (fizz), or a multiple of five (buzz), or a multiple of both three and five (fizzbuzz), or not a multiple of either three or five. Note: For Python newbies, the way the dataset works can be understood by looking first for the loop that loops over the integers, starting from zero to the length of the dataset (the length is returned by the __len__ function when len(object) is called). The following snippet shows the simple loop: dataset = FizBuzDataset()for i in range(len(dataset)): x, y = dataset[i]dataloader = DataLoader(dataset, batch_size=10, shuffle=True, num_workers=4)for batch in dataloader: print(batch) The DataLoader class accepts a dataset class that is inherited from torch.utils.data.Dataset. DataLoader accepts dataset and does non-trivial operations such as mini-batching, multithreading, shuffling, and so on, to fetch the data from the dataset. It accepts a dataset instance from the user and uses the sampler strategy to sample data as mini-batches. The num_worker argument decides how many parallel threads should be operating to fetch the data. This helps to avoid a CPU bottleneck so that the CPU can catch up with the GPU's parallel operations. Data loaders allow users to specify whether to use pinned CUDA memory or not, which copies the data tensors to CUDA's pinned memory before returning it to the user. Using pinned memory is the key to fast data transfers between devices, since the data is loaded into the pinned memory by the data loader itself, which is done by multiple cores of the CPU anyway. Most often, especially while prototyping, custom datasets might not be available for developers and in such cases, they have to rely on existing open datasets. The good thing about working on open datasets is that most of them are free from licensing burdens, and thousands of people have already tried preprocessing them, so the community will help out. PyTorch came up with utility packages for all three types of datasets with pretrained models, preprocessed datasets, and utility functions to work with these datasets. This article is about how to build a basic pipeline for deep learning development. The system we defined here is a very common/general approach that is followed by different sorts of companies, with slight changes. The benefit of starting with a generic workflow like this is that you can build a really complex workflow as your team/project grows on top of it. Build deep learning workflows and take deep learning models from prototyping to production with PyTorch Deep Learning Hands-On written by Sherin Thomas and Sudhanshu Passi. F8 PyTorch announcements: PyTorch 1.1 releases with new AI tools, open sourcing BoTorch and Ax, and more Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs Top 10 deep learning frameworks

0
0
44721

article-image-paper-in-two-minutes-attention-is-all-you-need

Sugandha Lahoti

05 Apr 2018

4 min read

Paper in Two minutes: Attention Is All You Need

Sugandha Lahoti

05 Apr 2018

4 min read

A paper on a new simple network architecture, the Transformer, based solely on attention mechanisms The NIPS 2017 accepted paper, Attention Is All You Need, introduces Transformer, a model architecture relying entirely on an attention mechanism to draw global dependencies between input and output. This paper is authored by professionals from the Google research team including Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. The Transformer – Attention is all you need What problem is the paper attempting to solve? Recurrent neural networks (RNN), long short-term memory networks(LSTM) and gated RNNs are the popularly approaches used for Sequence Modelling tasks such as machine translation and language modeling. However, RNN/CNN handle sequences word-by-word in a sequential fashion. This sequentiality is an obstacle toward parallelization of the process. Moreover, when such sequences are too long, the model is prone to forgetting the content of distant positions in sequence or mix it with following positions’ content. Recent works have achieved significant improvements in computational efficiency and model performance through factorization tricks and conditional computation. But they are not enough to eliminate the fundamental constraint of sequential computation. Attention mechanisms are one of the solutions to overcome the problem of model forgetting. This is because they allow dependency modelling without considering their distance in the input or output sequences. Due to this feature, they have become an integral part of sequence modeling and transduction models. However, in most cases attention mechanisms are used in conjunction with a recurrent network. Paper summary The Transformer proposed in this paper is a model architecture which relies entirely on an attention mechanism to draw global dependencies between input and output. The Transformer allows for significantly more parallelization and tremendously improves translation quality after being trained for as little as twelve hours on eight P100 GPUs. Neural sequence transduction models generally have an encoder-decoder structure. The encoder maps an input sequence of symbol representations to a sequence of continuous representations. The decoder then generates an output sequence of symbols, one element at a time. The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and decoder. The authors are motivated to use self-attention because of three criteria. One is that the total computational complexity per layer. Another is the amount of computation that can be parallelized, as measured by the minimum number of sequential operations required. The third is the path length between long-range dependencies in the network. The Transformer uses two different types of attention functions: Scaled Dot-Product Attention, computes the attention function on a set of queries simultaneously, packed together into a matrix. Multi-head attention, allows the model to jointly attend to information from different representation subspaces at different positions. A self-attention layer connects all positions with a constant number of sequentially executed operations, whereas a recurrent layer requires O(n) sequential operations. In terms of computational complexity, self-attention layers are faster than recurrent layers when the sequence length is smaller than the representation dimensionality, which is often the case with machine translations. Key Takeaways This work introduces Transformer, a novel sequence transduction model based entirely on attention mechanism. It replaces the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers for translation tasks. On both WMT 2014 English-to-German and WMT 2014 English-to-French translation tasks, the model achieves a new state of the art. In the former task the model outperforms all previously reported ensembles. Future Goals Transformer has only been applied to transduction model tasks as of yet. In the near future, the authors plan to use it for other problems involving input and output modalities other than text. They plan to apply attention mechanisms to efficiently handle large inputs and outputs such as images, audio and video. The Transformer architecture from this paper has gained major traction since its release because of major improvements in translation quality and other NLP tasks. Recently, the NLP research group at Harvard have released a post which presents an annotated version of the paper in the form of a line-by-line implementation. It is accompanied with 400 lines of library code, written in PyTorch in the form of a notebook, accessible from github or on Google Colab with free GPUs.

0
0
36534

article-image-startup-focus-sea-machines-winning-contracts-for-autonomous-marine-systems-from-ai-trends

Matthew Emerick

15 Oct 2020

8 min read

Startup Focus: Sea Machines Winning Contracts for Autonomous Marine Systems from AI Trends

Matthew Emerick

15 Oct 2020

8 min read

By AI Trends Staff The ability to add automation to an existing marine vessel to make it autonomous is here today and is being proven by a Boston company. Sea Machines builds autonomous vessel software and systems for the marine industry. Founded in 2015, the company recently raised $15 million in a Series B round, making it total raised $27.5 million since 2017. Founder and CEO Michael G. Johnson, a licensed marine engineer, recently took the time to answer via email some questions AI Trends poses to selected startups. Describe your team, the key people Sea Machines is led by a team of mariners, engineers, coders and autonomy scientists. The company today has a crew of 30 people based in Boston; Hamburg, Germany; and Esbjerg, Denmark. Sea Machines is also hiring for a variety of positions, which can be viewed at sea-machines.com/careers. Michael Johnson, Founder and CEO, Sea Machines What business problem are you trying to solve? The global maritime industry is responsible for billions in economic output and is a major driver of jobs and commerce. Despite the sector’s success and endurance, it faces significant challenges that can negatively impact operator safety, performance and profitability. Sea Machines is solving many of these challenges by developing technologies that are helping the marine industry transition into a new era of task-driven, computer-guided vessel operations. How does your solution address the problem? Autonomous systems solve for these challenges in several ways: Autonomous grid and waypoint following capabilities relieve mariners from manually executing planned routes. Today’s autonomous systems uniquely execute with human-like behavior, intelligently factoring in environmental and sea conditions (including wave height, pitch, heave and roll); change speeds between waypoints; and actively detect obstacles for collision avoidance purposes. Autonomous marine systems also enable optionally manned or autonomous-assist (reduced crew) modes that can reduce mission delays and maximize effort. This is an important feature for anyone performing time-sensitive operations, such as on-water search-and-rescues or other urgent missions. Autonomous marine systems offer obstacle detection and collision avoidance capabilities that keep people and assets safe and out of harm’s way. These advanced technologies are much more reliable and accurate than the human eye, especially in times of low light or in poor sea conditions. Because today’s systems enable remote-helm control and remote payload management, there is a reduced need for mariners (such as marine fire or spill response crews) to physically man a vessel in a dangerous environment. A remote-helm control beltpack also improves visibility by enabling mariners to step outside of the wheelhouse to whatever location provides the best vantage point when performing tight maneuvers, dockings and other precision operations. Autonomous marine systems enable situational awareness with multiple cameras and sensors streaming live over a 4G connection. This real-time data allows shoreside or at-sea operators a full view of an autonomous vessel’s environment, threats and opportunities. Minimally manned vessels can autonomously collaborate to cover more ground with less resources required, creating a force-multiplier effect. A single shoreside operator can command multiple autonomous boats with full situational awareness. These areas of value overlap for all sectors but for the government and military sector, new on-water capabilities and unmanned vessels are a leading driver. By contrast, the commercial sector is looking for increased productivity, efficiency, and predictable operations. Our systems meet all of these needs. Our technology is designed to be installed on new vessels as well as existing vessels. Sea Machines’ ability to upgrade existing fleets greatly reduces the time and cost to leverage the value of our autonomous systems. How are you getting to the market? Is there competition? Sea Machines has an established dealer program to support the company’s global sales across key commercial marine markets. The program includes many strategic partners who are enabled to sell, install and service the company’s line of intelligent command and control systems for workboats. To date, Sea Machines dealers are located across the US and Canada, in Europe, in Singapore and UAE. We have competition for autonomous marine systems, but our products are the only ones that are retrofit ready, not requiring new vessels to be built. Do you have any users or customers? Yes we have achieved significant sales traction since launching our SM series of products in 2018. Just since the summer, Sea Machines has been awarded several significant contracts and partnerships: The first allowed us to begin serving the survey vessel market with the first announced collaboration with DEEP BV in the Netherlands. DEEP’s vessel outfitted with the SM300 entered survey service very recently. Next, we partnered with Castine-based Maine Maritime Academy (MMA) and representatives of the U.S. Maritime Administration (MARAD)’s Maritime Environmental and Technical Assistance (META) Program to bring valuable, hands-on education about autonomous marine systems into the MMA curriculum. Then we recently announced a partnership with shipbuilder Metal Shark Boats, of Jeanerette, Louisiana, to supply the U.S. Coast Guard (USCG)’s Research and Development Center (RDC) with a new Sharktech 29 Defiant vessel for the purposes of testing and evaluating the capabilities of available autonomous vessel technology. USCG demonstrations are happening now (through November 5) off the coast of Hawaii. Finally, just this month, we announced that the U.S. Department of Defense (DOD)’s Defense Innovation Unit (DIU) awarded us with a multi-year Other Transaction (OT) agreement. The primary purpose of the agreement is to initiate a prototype that will enable commercial ocean-service barges as autonomous Forward Arming and Refueling Point (FARP) units for an Amphibious Maritime Projection Platform (AMPP). Specifically, Sea Machines will engineer, build and demonstrate ready-to-deploy system kits that enable autonomous, self-propelled operation of opportunistically available barges to land and replenish military aircraft. In the second half of 2020 we are also commencing onboard collaborations with some crew-transfer vessel (CTV) operators serving the wind farm industry. How is the company funded? The company recently completed a successful Series B round, which provided $15M in funds, with a total amount raised of $27.5M since 2017. The most recent funds we were able to raise are going to significantly impact Sea Machines, and therefore the maritime and marine industries as a whole. The funds will be put to use to further strengthen our technical development team as well as build out our next level of systems manufacturing and scale our operations group to support customer deployments. We will also be investing in some supporting technologies to speed our course to full dock-to-dock, over-the-horizon autonomy. The purpose of our technology is to optimize vessel operations with increased performance, productivity, predictability and ultimately safety. In closing, we’d like to add that the marine industries are a critically significant component of the global economy and it’s up to us to keep it strong and relevant. Along with people, processes and capital, pressing the bounds of technology is a key driver. The world is being revolutionized by intelligent and autonomous self-piloting technology and today we find ourselves just beyond the starting line of a busy road to broad adoption through all marine sectors. If Sea Machines continues to chart the course with forward-looking pertinence, then you will see us rise up to become one of the most significant companies and brands serving the industry in the 21st century. Any anecdotes/stories? This month we released software version 1.7 on our SM300. That’s seven significant updates in just over 18 months, each one providing increased technical hardening and new features for specific workboat sectors. Another interesting story is about our Series B funding, which, due to the pandemic, we raised virtually. Because of where we are as a company, we have been proving our ability to retool the marine industry with our technology, and therefore we are delivering confidence to investors. We were forced to conduct the entire process by video conference, which may have increased overall efficiency of the raise as these rounds traditionally require thousands if not tens of thousands of miles of travel for face-to-face meetings, diligence, and handshakes. Remote pitches also proved to be an advantage because it allowed us to showcase our technology in a more direct way. We did online demos where we had our team remotely connected to our vessels off Boston Harbor. We were able to get the investors into the captain’s chair, as if they were remotely commanding a vessel in real-world operations. Finally, in January, we announced the receipt of ABS and USCG approval for our SM200 wireless helm and control systems on a major class of U.S.-flag articulated tug-barges (ATBs), the first unit has been installed and is in operation, and we look forward to announcing details around it. We will be taking the SM200 forward into the type-approval process. Learn more at Sea Machines.

0
0
30479

article-image-youtube-promises-to-reduce-recommendations-of-conspiracy-theory-ex-googler-explains-why-this-is-a-historic-victory

Sugandha Lahoti

12 Feb 2019

4 min read

Youtube promises to reduce recommendations of ‘conspiracy theory’. Ex-googler explains why this is a 'historic victory'

Sugandha Lahoti

12 Feb 2019

4 min read

Talks of AI algorithms causing harms including addiction, radicalization. political abuse and conspiracies, disgusting kids videos and the danger of AI propaganda are all around. Last month, YouTube announced an update regarding YouTube recommendations aiming to reduce the recommendations of videos that promote misinformation ( eg: conspiracy videos, false claims about historical events, flat earth videos, etc). In a historical move, Youtube changed its Artificial Intelligence algorithm instead of favoring another solution, which may have cost them fewer resources, time, and money. Last Friday, an ex-googler who helped build the YouTube algorithm, Guillaume Chaslot, appreciated this change in AI, calling it “a great victory” which will help thousands of viewers from falling down the rabbit hole of misinformation and false conspiracy theories. In a twitter thread, he presented his views as someone who has had experience working on Youtube’s AI. Recently, there has been a trend in Youtube promoting conspiracy videos such as ‘Flat Earth theories’. In a blog post, Guillaume Chaslot explains, “Flat Earth is not a ’small bug’. It reveals that there is a structural problem in Google’s AIs and they exploit weaknesses of the most vulnerable people, to make them believe the darnedest things.” Youtube realized this problem and has made amends to its algorithm. “It’s just another step in an ongoing process, but it reflects our commitment and sense of responsibility to improve the recommendations experience on YouTube. To be clear, this will only affect recommendations of what videos to watch, not whether a video is available on YouTube. As always, people can still access all videos that comply with our Community Guidelines”, states the YouTube team in a blog post. Chaslot appreciated this fact in his twitter thread saying that although Youtube had the option to ‘make people spend more time on round earth videos’, they chose the hard way by tweaking their AI algorithm. AI algorithms also often get biased by tiny groups of hyperactive users. As Chaslot notes, people who spend their lives on YouTube affect recommendations more. The content they watch gets more views, which leads to Youtubers noticing and creating more of it, making people spend even more time on that content. This is because YouTube optimizes for things you might watch, not things you might like. As a hacker news user observed, “The problem was that pathological/excessive users were overly skewing the recommendations algorithms. These users tend to watch things that might be unhealthy in various ways, which then tend to get over-promoted and lead to the creation of more content in that vein. Not a good cycle to encourage.” The new change in Youtube’s AI makes use of machine learning along with human evaluators and experts from all over the United States to train these machine learning systems responsible for generating recommendations. Evaluators are trained using public guidelines and offer their input on the quality of a video. Currently, the change is applied only to a small set of videos in the US as the machine learning systems are not very accurate currently. The new update will roll out in different countries once the systems become more efficient. However, there is another problem lurking around which is probably even bigger than conspiracy videos. This is the addiction to spending more and more time online. AI engines used in major social platforms, including but not limited to YouTube, Netflix, Facebook all want people to spend as much time as possible. A hacker news user commented, “This is just addiction peddling. Nothing more. I think we have no idea how much damage this is doing to us. It’s as if someone invented cocaine for the first time and we have no social norms or legal framework to confront it.” Nevertheless, Youtube updating it’s AI engine was taken generally positively by Netizens. As Chaslot, concluded on his Twitter thread, “YouTube's announcement is a great victory which will save thousands. It's only the beginning of a more humane technology. Technology that empowers all of us, instead of deceiving the most vulnerable.” Now it is on Youtube’s part how they will strike a balance between maintaining a platform for free speech and living up to their responsibility to users. Is the YouTube algorithm’s promoting of #AlternativeFacts like Flat Earth having a real-world impact? YouTube to reduce recommendations of ‘conspiracy theory’ videos that misinform users in the US. YouTube bans dangerous pranks and challenges Is YouTube’s AI Algorithm evil?

0
0
30303

Sugandha Lahoti

21 Mar 2018

5 min read

What is Meta Learning?

Sugandha Lahoti

21 Mar 2018

5 min read

Meta Learning, an original concept of cognitive psychology, is now applied to machine learning techniques. If we go by the social psychology definition, meta learning is the state of being aware of and taking control of one's own learning. Similar concepts, when applied to the machine learning theory states that a meta learning algorithm uses prior experience to change certain aspects of an algorithm, such that the modified algorithm is better than the original algorithm. To explain in simple terms, meta-learning is how the algorithm learns how to learn. Meta Learning: Making a versatile AI agent Current AI Systems excel at mastering a single skill, playing Go, holding human-like conversations, predicting a disaster, etc. However, now that AI and machine learning is possibly being integrated in everyday tasks, we need a single AI system to solve a variety of problems. Currently, a Go Player, will not be able to navigate the roads or find new places. Or an AI navigation controller won’t be able to hold a perfect human-like conversation. What machine learning algorithms need to do is develop versatility – the capability of doing many different things. Versatility is achieved by intelligent amalgamation of Meta Learning along with related techniques such as reinforcement learning (finding suitable actions to maximize a reward), transfer learning (re-purposing a trained model for a specific task on a second related task), and active learning (learning algorithm chooses the data it wants to learn from). Such different learning techniques provides an AI agent with the brains to do multiple tasks without the need to learn every new task from scratch. Thereby making it capable of adapting intelligently to a wide variety of new, unseen situations. Apart from creating versatile agents, recent researches also focus on using meta learning for hyperparameter and neural network optimization, fast reinforcement learning, finding good network architectures and for specific cases such as few-shot image recognition. Using Meta Learning, AI agents learn how to learn new tasks by reusing prior experience, rather than examining each new task in isolation. Various approaches to Meta Learning algorithms A wide variety of approaches come under the umbrella of Meta-Learning. Let's have a quick glance at these algorithms and techniques: Algorithm Learning (selection) Algorithm selection or learning, selects learning algorithms on the basis of characteristics of the instance. For example, you have a set of ML algos (Random Forest, SVM, DNN), data sets as the instances and the error rate as the cost metric. Now, the goal of Algorithm Selection is to predict which machine learning algorithm will have a small error on each data set. Hyper-parameter Optimization Many machine learning algorithms have numerous hyper-parameters that can be optimized. The choice of selecting these hyper-parameters for learning algorithms determines how well the algorithm learns. A recent paper, "Evolving Deep Neural Networks", provides a meta learning algorithm for optimizing deep learning architectures through evolution. Ensemble Methods Ensemble methods usually combine several models or approaches to achieve better predictive performance. There are 3 basic types – Bagging, Boosting, and Stacked Generalization. In Bagging, each model runs independently and then aggregates the outputs at the end without preference to any model. Boosting refers to a group of algorithms that utilize weighted averages to make weak learners into stronger learners. Boosting is all about “teamwork”. Stacked generalization, has a layered architecture. Each set of base-classifiers is trained on a dataset. Successive layers receive as input the predictions of the immediately preceding layer and the output is passed on to the next layer. A single classifier at the topmost level produces the final prediction. Dynamic bias selection In Dynamic Bias selection, we adjust the bias of the learning algorithm dynamically to suit the new problem instance. The performance of a base learner can trigger the need to explore additional hypothesis spaces, normally through small variations of the current hypothesis space. The bias selection can either be a form of data variation or a time-dependent feature. Inductive Transfer Inductive transfer describes learning using previous knowledge from related tasks. This is done by transferring meta-knowledge across domains or tasks; a process known as inductive transfer. The goal here is to incorporate the meta-knowledge into the new learning task rather than matching meta-features with a meta-knowledge base. Adding Enhancements to Meta Learning algorithms Supervised meta-learning: When the meta-learner is trained with supervised learning. In supervised learning we have both input and output variables and the algorithm learns the mapping function from the input to the output. RL meta-learning: This algorithm talks about using standard deep RL techniques to train a recurrent neural network in such a way that the recurrent network can then implement its own Reinforcement learning procedure. Model-agnostic meta-learning: MAML trains over a wide range of tasks, for a representation that can be quickly adapted to a new task, via a few gradient steps. The meta-learner seeks an initialization that is not only useful for adapting to various problems, but also can be adapted quickly. The ultimate goal of any meta learning algorithm and its variations is to be fully self-referential. This means it can automatically inspect and improve every part of its own code. A regenerative meta learning algorithm, on the lines of how a lizard regenerates its limbs, would not only blur the distinction between the different variations as described above but will also lead to better future performance and versatility of machine learning algorithms.

0
0
28374

article-image-reinforcement-learning-works

Pravin Dhandre

14 Nov 2017

5 min read

How Reinforcement Learning works

Pravin Dhandre

14 Nov 2017

5 min read

[box type="note" align="" class="" width=""]This article is an excerpt from a book by Rodolfo Bonnin titled Machine Learning for Developers.[/box] Reinforcement learning is a field that has resurfaced recently, and it has become more popular in the fields of control, finding the solutions to games and situational problems, where a number of steps have to be implemented to solve a problem. A formal definition of reinforcement learning is as follows: "Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment.” (Kaelbling et al. 1996). In order to have a reference frame for the type of problem we want to solve, we will start by going back to a mathematical concept developed in the 1950s, called the Markov decision process. Markov decision process Before explaining reinforcement learning techniques, we will explain the type of problem we will attack with them. When talking about reinforcement learning, we want to optimize the problem of a Markov decision process. It consists of a mathematical model that aids decision making in situations where the outcomes are in part random, and in part under the control of an agent. The main elements of this model are an Agent, an Environment, and a State, as shown in the following diagram: Simpliﬁed scheme of a reinforcement learning process The agent can perform certain actions (such as moving the paddle left or right). These actions can sometimes result in a reward rt, which can be positive or negative (such as an increase or decrease in the score). Actions change the environment and can lead to a new state st+1, where the agent can perform another action at+1. The set of states, actions, and rewards, together with the rules for transitioning from one state to another, make up a Markov decision process. Decision elements To understand the problem, let's situate ourselves in the problem solving environment and look at the main elements: The set of states The action to take is to go from one place to another The reward function is the value represented by the edge The policy is the way to complete the task A discount factor, which determines the importance of future rewards The main difference with traditional forms of supervised and unsupervised learning is the time taken to calculate the reward, which in reinforcement learning is not instantaneous; it comes after a set of steps. Thus, the next state depends on the current state and the decision maker's action, and the state is not dependent on all the previous states (it doesn't have memory), thus it complies with the Markov property. Since this is a Markov decision process, the probability of state st+1 depends only on the current state st and action at: Unrolled reinforcement mechanism The goal of the whole process is to generate a policy P, that maximizes rewards. The training samples are tuples, <s, a, r>. Optimizing the Markov process Reinforcement learning is an iterative interaction between an agent and the environment. The following occurs at each timestep: The process is in a state and the decision-maker may choose any action that is available in that state The process responds at the next timestep by randomly moving into a new state and giving the decision-maker a corresponding reward The probability that the process moves into its new state is influenced by the chosen action in the form of a state transition function Basic RL techniques: Q-learning One of the most well-known reinforcement learning techniques, and the one we will be implementing in our example, is Q-learning. Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. Q-learning tries to maximize the value of the Q-function that represents the maximum discounted future reward when we perform action a in state s. Once we know the Q-function, the optimal action a in state s is the one with the highest Q- value. We can then define a policy π(s), that gives us the optimal action in any state, expressed as follows: We can define the Q-function for a transition point (st, at, rt, st+1) in terms of the Q-function at the next point (st+1, at+1, rt+1, st+2), similar to what we did with the total discounted future reward. This equation is known as the Bellman equation for Q-learning: In practice, we can think of the Q-function as a lookup table (called a Q-table) where the states (denoted by s) are rows and the actions (denoted by a) are columns, and the elements (denoted by Q(s, a)) are the rewards that you get if you are in the state given by the row and take the action given by the column. The best action to take at any state is the one with the highest reward: initialize Q-table Q observe initial state s while (! game_finished): select and perform action a get reward r advance to state s' Q(s, a) = Q(s, a) + α(r + γ max_a' Q(s', a') - Q(s, a)) s = s' You will realize that the algorithm is basically doing stochastic gradient descent on the Bellman equation, backpropagating the reward through the state space (or episode) and averaging over many trials (or epochs). Here, α is the learning rate that determines how much of the difference between the previous Q-value and the discounted new maximum Q- value should be incorporated. We can represent this process with the following flowchart: We have successfully reviewed Q-Learning, one of the most important and innovative architecture of reinforcement learning that have appeared in recent. Every day, such reinforcement models are applied in innovative ways, whether to generate feasible new elements from a selection of previously known classes or even to win against professional players in strategy games. If you enjoyed this excerpt from the book Machine learning for developers, check out the book below.

0
0
27404

article-image-india-engages-in-a-national-initiative-to-support-its-ai-industry-from-ai-trends

Matthew Emerick

08 Oct 2020

5 min read

India Engages in a National Initiative to Support Its AI Industry from AI Trends

Matthew Emerick

08 Oct 2020

5 min read

By AI Trends Staff The government of India is engaged in an initiative on AI that aims to promote the industry, which a recent IDC report maintains is growing at over a 30% annual clip. India’s Artificial Intelligence spending will grow from $300.7 million in 2019 to $880.5 million in 2023 at a compound annual growth rate (CAGR) of 30.8 per cent, states IDC’s Worldwide Artificial Intelligence Spending Guide. Rishu Sharma, Principal Analyst, Cloud and AI at IDC in India Enterprises are relying on AI to maintain business continuity, transform how businesses operate, and gain competitive advantage. “COVID-19 is pushing the boundaries of organizations’ AI lens. Businesses are considering investments in intelligent solutions to tackle issues associated with business continuity, labor shortages, and workspace monitoring. Organizations are now realizing that their business plans must be closely aligned with their AI strategies,” stated Rishu Sharma, Principal Analyst, Cloud and AI at IDC in India, in an IDC press release. In other report highlights: Enterprises rely on AI to maintain business continuity, transform how businesses operate and gain competitive advantage. Almost 20% of enterprises are still devising AI strategies to explore new businesses and ventures; Half of India enterprises plan to increase their AI spending in 2020; Data trustworthiness and difficulty in selecting the right algorithm, are among the top challenges that hold organizations back from implementing AI technology. “The variety of industry-specific tech solutions supported by emerging technologies like IoT and Robotics are getting powered by complex AI algorithms,” stated Ashutosh Bisht, Senior Research Manager for IDC’s Customer Insights and Analysis group. “With the fast adoption of cloud technologies in India, more than 60% of AI Applications will be migrated to the cloud by 2024.” As per IDC’s 2020 COVID-19 Impact Survey, half of Indian enterprises plan to increase their AI spending this year. However, data trustworthiness and difficulty in selecting the right algorithm, are among top challenges that hold organizations back from implementing AI technology, according to IDC. Prime Minister Speaking at RAISE 2020 Global Summit Indian Prime Minister Nrendra Modi was to address a virtual summit on AI this week (October 5) in India. Called RAISE 2020, for Responsible AI for Social Empowerment, the summit is planned as a global meeting to exchange ideas and chart a course for using AI for social transformation, inclusion and empowerment in areas like healthcare, agriculture, education and smart mobility, according to an account from the South Asian news agency ANI. Indian AI startups will be showcasing their offerings as part of the AI Solution Challenge, a government effort to support tech entrepreneurs and startups by providing exposure, recognition and guidance. India’s strengths that position it well to become an AI lead include its healthy startup ecosystem, home to elite science and technology institutions, a robust digital infrastructure and millions of STEM graduates each year, the release indicated. Prime Minister Modi was to articulate an “AI for All” strategy, intent on building a model for the world on how to responsibly direct AI for social empowerment, the release stated. Government Has Launched AI Portal The Indian government earlier this year launched the National AI Portal, as a collaboration of the National Association of Software and Service Companies (Nasscom) and the National e-Governance Division of the Ministry of Electronics and Information Technology (MeitY). The portal’s objective is to function as a platform for AI-related advancements in India, with sharing of resources in articles, investment funding news for AI startups, and AI education resources in India. The portal will also distribute documents, case studies and research reports, and describe new job roles related to AI. Named IndiaAI, the site’s education focus aims to help professionals and students learn about and find work in the field of AI. Free and paid AI courses are available on subjects of Machine Learning, Data Visualization, and Cybersecurity, provided by educational institutions including IIT Bombay, third party content providers including SkillUp and edX, or private companies like IBM. The AI education program is open to students in classes 8-12 across thousands of schools in India. Some Skeptical of India’s Ability to Unlock AI’s Potential Skepticism about India’s ability to capitalize on its opportunities in AI is being voiced in some quarters. “The country is still miles away from unlocking the true value of AI in both the government and the private sector,” stated an account from CXOToday.com. India lags behind the top five geographies for private sector investment in AI, the account stated. The US is far ahead, with investments worth $18 billion, followed by Europe ($2.6 billion) and Israel ($1.8 billion). Only a few large companies are investing in AI R&D, being “risk averse.” Startups are having difficulty finding capital. Most vital is the need for the government and the private sectors to work hand-in-hand, particularly on investment in AI R&D. Sanjay Gupta, Country Head & VP, Google India, has stated that close collaboration between the private and public sector, and a focus on collective expertise and energies on the most pressing problems of today, will go a long way towards achieving the vision of a socially empowered, inclusive, and digitally transformed India, where AI has a big role to play. Read the source articles in an IDC press release, from the South Asian news agency ANI and CXOToday.com.

0
0
27027

article-image-data-scientist-sexiest-role-21st-century

Aarthi Kumaraswamy

08 Nov 2017

6 min read

Data Scientist: The sexiest role of the 21st century

Aarthi Kumaraswamy

08 Nov 2017

6 min read

"Information is the oil of the 21st century, and analytics is the combustion engine." -Peter Sondergaard, Gartner Research By 2018, it is estimated that companies will spend $114 billion on big data-related projects, an increase of roughly 300%, compared to 2013 (https://www.capgemini-consulting.com/resource-file-access/resource/pdf/big_dat a_pov_03-02-15.pdf). Much of this increase in expenditure is due to how much data is being created and how we are better able to store such data by leveraging distributed filesystems such as Hadoop. However, collecting the data is only half the battle; the other half involves data extraction, transformation, and loading into a computation system, which leverages the power of modern computers to apply various mathematical methods in order to learn more about data and patterns and extract useful information to make relevant decisions. The entire data workflow has been boosted in the last few years by not only increasing the computation power and providing easily accessible and scalable cloud services (for example, Amazon AWS, Microsoft Azure, and Heroku) but also by a number of tools and libraries that help to easily manage, control, and scale infrastructure and build applications. Such a growth in the computation power also helps to process larger amounts of data and to apply algorithms that were impossible to apply earlier. Finally, various computation- expensive statistical or machine learning algorithms have started to help extract nuggets of information from data. Finding a uniform definition of data science is akin to tasting wine and comparing flavor profiles among friends—everyone has their own definition and no one description is more accurate than the other. At its core, however, data science is the art of asking intelligent questions about data and receiving intelligent answers that matter to key stakeholders. Unfortunately, the opposite also holds true—ask lousy questions of the data and get lousy answers! Therefore, careful formulation of the question is the key for extracting valuable insights from your data. For this reason, companies are now hiring data scientists to help formulate and ask these questions. At first, it's easy to paint a stereotypical picture of what a typical data scientist looks like: t- shirt, sweatpants, thick-rimmed glasses, and debugging a chunk of code in IntelliJ... you get the idea. Aesthetics aside, what are some of the traits of a data scientist? One of our favorite posters describing this role is shown here in the following diagram: Math, statistics, and general knowledge of computer science is given, but one pitfall that we see among practitioners has to do with understanding the business problem, which goes back to asking intelligent questions of the data. It cannot be emphasized enough: asking more intelligent questions of the data is a function of the data scientist's understanding of the business problem and the limitations of the data; without this fundamental understanding, even the most intelligent algorithm would be unable to come to solid conclusions based on a wobbly foundation. A day in the life of a data scientist This will probably come as a shock to some of you—being a data scientist is more than reading academic papers, researching new tools, and model building until the wee hours of the morning, fueled on espresso; in fact, this is only a small percentage of the time that a data scientist gets to truly play (the espresso part however is 100% true for everyone)! Most part of the day, however, is spent in meetings, gaining a better understanding of the business problem(s), crunching the data to learn its limitations (take heart, this book will expose you to a ton of different feature engineering or feature extractions tasks), and how best to present the findings to non data-sciencey people. This is where the true sausage making process takes place, and the best data scientists are the ones who relish in this process because they are gaining more understanding of the requirements and benchmarks for success. In fact, we could literally write a whole new book describing this process from top-to-tail! So, what (and who) is involved in asking questions about data? Sometimes, it is process of saving data into a relational database and running SQL queries to find insights into data: "for the millions of users that bought this particular product, what are the top 3 OTHER products also bought?" Other times, the question is more complex, such as, "Given the review of a movie, is this a positive or negative review?" This book is mainly focused on complex questions, like the latter. Answering these types of questions is where businesses really get the most impact from their big data projects and is also where we see a proliferation of emerging technologies that look to make this Q and A system easier, with more functionality. Some of the most popular, open source frameworks that look to help answer data questions include R, Python, Julia, and Octave, all of which perform reasonably well with small (X < 100 GB) datasets. At this point, it's worth stopping and pointing out a clear distinction between big versus small data. Our general rule of thumb in the office goes as follows: If you can open your dataset using Excel, you are working with small data. Working with big data What happens when the dataset in question is so vast that it cannot fit into the memory of a single computer and must be distributed across a number of nodes in a large computing cluster? Can't we just rewrite some R code, for example, and extend it to account for more than a single-node computation? If only things were that simple! There are many reasons why the scaling of algorithms to more machines is difficult. Imagine a simple example of a file containing a list of names: B D X A D A We would like to compute the number of occurrences of individual words in the file. If the file fits into a single machine, you can easily compute the number of occurrences by using a combination of the Unix tools, sort and uniq: bash> sort file | uniq -c The output is as shown ahead: 2 A 1 B 1 D 1 X However, if the file is huge and distributed over multiple machines, it is necessary to adopt a slightly different computation strategy. For example, compute the number of occurrences of individual words for every part of the file that fits into the memory and merge the results together. Hence, even simple tasks, such as counting the occurrences of names, in a distributed environment can become more complicated. The above is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava. If you would like to learn how to solve the above problem and other cool machine learning tasks a data scientist carries out such as the following, check out the book. Use Spark streams to cluster tweets online Run the PageRank algorithm to compute user influence Perform complex manipulation of DataFrames using Spark Define Spark pipelines to compose individual data transformations Utilize generated models for off-line/on-line prediction

0
0
26776

article-image-baidu-releases-ezdl-a-platform-that-lets-you-build-ai-and-machine-learning-models-without-any-coding-knowledge

Melisha Dsouza

03 Sep 2018

3 min read

Baidu releases EZDL - a platform that lets you build AI and machine learning models without any coding knowledge

Melisha Dsouza

03 Sep 2018

3 min read

Chinese internet giant Baidu released ‘EZDL’ on September 1. EZDL allows businesses to create and deploy AI and machine learning models without any prior coding skills. With a simple drag-and-drop interface, it takes only four steps to train a deep learning model that’s built specifically for a business’ needs. This is particularly good news for small and medium sized businesses for whom leveraging artificial intelligence might ordinarily prove challenging. Youping Yu, general manager of Baidu’s AI ecosystem division, claims that EZDL will allow everyone to access AI “in the most convenient and equitable way”. How does EZDL work? EZDL focuses on three important aspects of machine learning: image classification, sound classification, and object detection. One of the most notable features about EZDL is the small size of the training data sets required to create artificial intelligence models. For image classification and object recognition, it requires just 20 to 100 images per label. For sound classification, it needs only 50 audio files at the most. The training can be completed in just 15 minutes in some cases, or a maximum of one hour for more complex models. After a model has been trained, the algorithm can be downloaded as a SDK or uploaded into a public or private cloud platform. The algorithms created support a range of operating systems, including Android and iOS. Baidu also claims an accuracy of more than 90 percent in two-thirds of the models it creates. How EZDL is already being used by businesses Baidu has demonstrated many use cases for EZDL. For example: A home decorating website called ‘Idcool’ uses EZDL to train systems that automatically identify the design and style of a room with 90 percent accuracy. An unnamed medical institution is using EZDL to develop a detection model for blood testing. A security monitoring firm used it to make a sound-detecting algorithm that can recognize “abnormal” audio patterns that might signal a break-in. Baidu is clearly making its mark in the AI race. This latest release follows the launch of its Baidu Brain platform for enterprises two years ago. Baidu Brain is already used by more than 600,000 developers. Another AI service launched by the company is its conversational DuerOS digital assistant, which is installed on more than 100 million devices. As if all that weren't enough, Baidu has also been developing hardware for artificial intelligence systems in the form of its Kunlun chip, designed for edge computing and data center processing - it’s slated for launch later this year. Baidu will demo EZDL at TechCrunch Disrupt SF, September 5th to 7th at Moscone West, 800 Howard St., San Francisco. For more on EZDL visit the Baidu's website for the project. Read next Baidu Apollo autonomous driving vehicles gets machine learning based auto-calibration system Baidu announces ClariNet, a neural network for text-to-speech synthesis

0
0
25906

article-image-dr-brandon-explains-decision-trees-jon

Aarthi Kumaraswamy

08 Nov 2017

3 min read

Dr.Brandon explains Decision Trees to Jon

Aarthi Kumaraswamy

08 Nov 2017

3 min read

[box type="shadow" align="" class="" width=""]Dr. Brandon: Hello and welcome to the third episode of 'Date with Data Science'. Today we talk about decision trees in machine learning. Jon: Decisions are hard enough to make. Now you want me to grow a decision tree. Next, you'll say there are decision jungles too! Dr. Brandon: It might come as a surprise to you, Jon, but decision trees can help you make decisions easier. Imagine you are in a restaurant and you are given a menu card. A decision tree can help you decide if you want to have a burger, pizza, fries or a pie, for instance. And yes, there are decision jungles, but they are called random forests. We will talk about them another time. Jon: You know Bran, I have never been very good at making decisions. But with food, it is easy. It's ALWAYS all you can have. Dr. Brandon: Well, my mistake. Let's take another example. You go to the doctor's after your binge eating at the restaurant with stomach complaints. A decision tree can help your doctor decide if you have a problem and then to choose a treatment option based on what your symptoms are. Jon: Really!? Tell me more. Dr. Brandon: Alright. The following excerpt introduces decision trees from the book Apache Spark 2.x Machine Learning Cookbook by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, and Shuen Mei. To know how to implement them in Spark read this article. [/box] Decision trees are one of the oldest and more widely used methods of machine learning in commerce. What makes them popular is not only their ability to deal with more complex partitioning and segmentation (they are more flexible than linear models) but also their ability to explain how we arrived at a solution and as to "why" the outcome is predicated or classified as a class/label. A quick way to think about the decision tree algorithm is as a smart partitioning algorithm that tries to minimize a loss function (for example, L2 or least square) as it partitions the ranges to come up with a segmented space which are best-fitted decision boundaries to the data. The algorithm gets more sophisticated through the application of sampling the data and trying a combination of features to assemble a more complex ensemble model in which each learner (partial sample or feature combination) gets to vote toward the final outcome. The following figure depicts a simplified version in which a simple binary tree (stumping) is trained to classify the data into segments belonging to two different colors (for example, healthy patient/sick patient). The figure depicts a simple algorithm that just breaks the x/y feature space to one-half every time it establishes a decision boundary (hence classifying) while minimizing the number of errors (for example, a L2 least square measure): The following figure provides a corresponding tree so we can visualize the algorithm (in this case, a simple divide and conquer) against the proposed segmentation space. What makes decision tree algorithms popular is their ability to show their classification result in a language that can easily be communicated to a business user without much math: If you liked the above excerpt, please be sure to check out Apache Spark 2.0 Machine Learning Cookbook it is originally from to learn how to implement deep learning using Spark and many more useful techniques on implementing machine learning solutions with the MLlib library in Apache Spark 2.0.

0
0
25692

Savia Lobo

05 Feb 2018

6 min read

AutoML : Developments and where is it heading to

Savia Lobo

05 Feb 2018

6 min read

With the growing demand in ML applications, there is also a demand for machine learning tasks such as data preprocessing, optimizing model hyperparameters and so on to be easily handled by non-experts. This is because, these tasks were repetitive and due to the complexity were considered to be handled only by ML experts. To support this cause and to maintain off-the-shelf quality of machine learning methods without expert knowledge, Google came out with a project named AutoML, an approach that automates designing of ML models. You could also refer to our article on Automated Machine Learning (AutoML) for a clear understanding on how AutoML functions. Trying AutoML on smaller datasets AutoML brought in altogether new dimensions within machine learning workflows where repetitive tasks performed by human experts could be taken over by machines. When Google started off with AutoML, they applied the AutoML approach onto two smaller datasets in DL namely, CIFAR-10 and Penn Treebank to test them on image recognition and language modeling tasks respectively. The result was, AutoML approach could design models that were at par with the ones designed by the ML experts. Also, on comparing the designs drafted by humans and AutoML, it was seen that the machine-suggested architecture included new elements. These elements were later known to alleviate gradient vanishing/exploding issues, which concludes that the machines provided a new architecture which could be more useful for multiple tasks. Also, the machine designed architecture has many channels so that the gradients could flow backwards. This could help explain why LSTM RNNs work better than standard RNNs. Trying AutoML on larger datasets After a success in small scale datasets, Google tested AutoML on large scale datasets such as ImageNet and COCO object detection dataset. Testing AutoML on these was a challenge because of their higher orders of magnitude, and also because simply applying AutoML directly to ImageNet would require many months of training the AutoML method. In order to apply AutoML to large scale datasets, some alterations were made within the AutoML approach for it to be more tractable to large scale datasets. The changes include: Redesigning the search space so that AutoML could find the best layer which can then be stacked many times in a flexible manner to create a final network. Carry out architecture search on CIFAR-10 dataset and transfer the best learned architecture to ImageNet image classification and COCO object detection datasets. Thus, AutoML could find out two best layers i.e normal cell and reduction cell, which when combined resulted into a novel architecture called as “NASNet”. These two work well with CIFAR-10, and also ImageNet and COCO object detection. NASNet was seen to have a prediction accuracy of 82.7% on the validation, as stated by Google. Such an accuracy surpassed all previous inception models built by Google. Further, the learned features from the ImageNet classification were transferred to carry out object detection tasks using the COCO dataset. The learned features combined with a faster R-CNN resulted into a state-of-the-art predictive performance on the COCO object detection task in both the largest as well as mobile-optimized models. Google suspected that these image features learned by ImageNet and COCO can be reused for various other computer vision applications. Hence, Google open-sourced NASNet for inference on image classification and for object detection in the Slim and Object Detection TensorFlow repositories. Towards Cloud AutoML: Automated Machine learning platform for everyone Cloud AutoML has been Google’s latest buzz for its customers as it makes AI available for everyone. Using Google’s advanced techniques such as learning2learn and transfer learning, Cloud AutoML helps businesses having limited ML expertise, to start building their own high-quality custom models. Thus, Cloud AutoML benefits AI experts by improving their productivity and explore new fields in AI. The experts can also aid less-skilled engineers to build powerful systems. Companies such as Disney and Urban Outfitters are using AutoML for making search and shopping on their websites more relevant. With AutoML going on cloud, Google released its first Cloud AutoML product, Cloud AutoML Vision, an Image Recognition tool that enables fast and easy to build custom ML models. This tool has a drag-and-drop interface that allows one to easily upload images, train and manage the models, and then deploy those trained models directly on Google Cloud. When used to classify popular public datasets like ImageNet and CIFAR, Cloud AutoML Vision has shown state-of-the-art results. These results included fewer misclassifications than the generic ML APIs results. Here are some highlights on Cloud AutoML vision: It is built on Google’s leading image recognition approaches, along with transfer learning and neural architecture search technologies. Hence, one can expect an accurate model even if the business has a limited expertise in ML. One can build a simple model in minutes or a full, production-ready model in a day in order to pilot AI-enabled application. AutoML Vision has a simple graphical UI using which one can easily specify data. It later turns the data into a high quality model customized for one’s specific needs. Starting off with Images, Google plans to roll out Cloud AutoML tools and services for text and audio too. However, Google isn’t the only one in the race; other competitors including AWS and Microsoft are also bringing in tools such as Amazon’s SageMaker and Microsoft’s service for customizing Image recognition model, to aid developers with automating machine learning. Some other automated tools include: Auto-sklearn: An automated project that aids scikit-learn project--package of common machine learning functions--to choose the right estimator function. The Auto-sklearn includes a generic estimator function that conducts analysis to determine the best algorithm and set of hyperparameters for a given Scikit-learn job. Auto-WEKA : An inspiration from the Auto-sklearn is for machine learners using Java programming language and the Weka ML package. Auto-WEKA uses a fully automated approach to select a learning algorithm and sets its hyperparameters, unlike previous methods which used to address this in isolation. H2o Driverless AI : This uses a web-based UI and is specifically designed for business users who want to gain insights from data but do not want to get into the intricacies of machine learning algorithms. This tool allows users to choose one or multiple target variables in the dataset that needs a solution, and the system provides the answer. The results are in the form of interactive charts, explained with annotations in plain English. Currently, Google’s AutoML is leading them. It would be exciting to see how Google scales an automated ML environment exactly the same as traditional ML. Not only Google, but also other businesses are contributing to the movement towards adopting an automated machine learning ecosystem. We saw some tools joining the automation league and can expect more tools to join them. Also, these tools could go on cloud in future for an extended availability for non-experts, similar to the AutoML cloud by Google. With machine learning going automated, we can expect more and more systems to move a step closer to widening the scope for AI.

0
0
25232

article-image-tensorflow-1-10-rc0-released

Amey Varangaonkar

24 Jul 2018

2 min read

Tensorflow 1.10 RC0 released

Amey Varangaonkar

24 Jul 2018

2 min read

Continuing the recent trend of rapid updates introducing significant fixes and new features, Google have released the first release candidate for Tensorflow 1.10. TensorFlow 1.10 RC0 brings some improvements in model training and evaluation, and also how Tensorflow runs in a local environment. This is Tensorflow’s fifth update release in just over a month, which includes two major version updates, the previous one being Tensorflow 1.9 What’s new in Tensorflow 1.10 RC0? The tf.contrib.distributions module will be deprecated in this version. This module is primarily used to work with statistical distributions Upgrade to NCCL 2.2 will be mandatory in order to perform GPU computing with this version of Tensorflow, for added performance and efficiency. Model training speed can now be optimized by improving the communication between the model and the Tensorflow resources. For this, the RunConfig function has been updated in this version. The Tensorflow development team also announced support for Bazel - a popular build and testing automation software - and deprecated support for cmake starting with Tensorflow 1.11. This version also incorporated some bug fixes and performance improvements to the tf.data, tf.estimator and other related modules. To get full details on the features list of this release candidate, you can check out Tensorflow’s official release page on Github. No news on Tensorflow 2.0 yet Many developers were expecting the next major release of Tensorflow, Tensorflow 2.0, to be released in late July or August. However, the announcement of this release candidate and the mention of the next version update (1.11) means they will have to wait for some more time before they get to know more about the next breakthrough release. Read more Why Twitter (finally!) migrated to Tensorflow Python, Tensorflow, Excel and more – Data professionals reveal their top tools Can a production ready Pytorch 1.0 give TensorFlow a tough time?

0
0
25162

article-image-openai-lp-a-new-capped-profit-company-to-accelerate-agi-research-and-attract-top-ai-talent

Fatema Patrawala

12 Mar 2019

3 min read

OpenAI LP, a new “capped-profit” company to accelerate AGI research and attract top AI talent

Fatema Patrawala

12 Mar 2019

3 min read

A move that has surprised many, OpenAI yesterday announced the creation of a new for-profit company to balance its huge expenditures into compute and AI talents. Sam Altman, the former president of Y Combinator who stepped down last week, has been named CEO of the new “capped-profit” company, OpenAI LP. But some worry that this move may result in making the innovative company no different from the other AI startups out there. With the OpenAI LP their mission is to ensure that artificial general intelligence (AGI) benefits all of humanity, primarily by attempting to build safe AGI and share the benefits with the world. OpenAI mentions on their blog that “returns for our first round of investors are capped at 100x their investment (commensurate with the risks in front of us), and we expect this multiple to be lower for future rounds as we make further progress.” Any returns beyond the cap amount will revert to OpenAI. OpenAI LP’s primary obligation is to advance the aims of the OpenAI Charter. All investors and employees sign agreements that OpenAI LP’s obligation to the Charter always comes first, even at the expense of some or all of their financial stake. But the major reason behind the new for-profit subsidiary can be explicitly put up as OpenAI in need of more money. The company anticipates to spend billions of dollars in building large-scale cloud compute, attracting and retaining talented people, and developing AI supercomputers in the coming years. The cash burn rate of a top AI research company is staggering. Consider OpenAI’s recent OpenAI Five project — a set of coordinated AI bots trained to compete against human professionals in the video game Dota 2. OpenAI rented 128,000 CPU cores and 256 GPUs at approximately US$2500 per hour for the time-consuming process of training and fine-tuning its OpenAI Five models. Additionally consider the skyrocketing cost of retaining top AI talents. A New York Times story revealed that OpenAI paid its Chief Scientist Ilya Sutskever more than US$1.9 million in 2016. The company currently employs some 100 pricey talents for developing its AI capabilities, safety, and policies. OpenAI LP will be governed by the original OpenAI Board. Only a few on the Board of Directors are allowed to hold financial stakes, and those who do not will be able to vote on decisions if the financial interests are seen to conflict with OpenAI’s mission. People have linked the new for-profit company with OpenAI’s recent controversial decision to withhold the code and training dataset for their language model GPT-2, ostensibly due concerns they might be used for malicious purposes such as generating fake news. A tweet from a software engineer suggested an ulterior motive: “I now see why you didn’t release the fully trained model of #gpt2”. OpenAI Chairman and CTO Greg Brockman shot back: “Nope. We aren’t going to commercialize GPT-2.” OpenAI aims to forge a sustainable path towards long-term AI development. And it also plans to strike a balance between benefiting humanity and turning a profit. A big part of OpenAI’s appeal to top AI talents is it's not-for-profit character — will OpenAI LP mar that? And can OpenAI really strike a balance between benefiting humanity and turning a profit? Whether the for-profit shift will accelerate OpenAI’s mission or prove a detrimental detour remains to be seen, but the journey ahead is bound to be challenging. OpenAI’s new versatile AI model, GPT-2 can efficiently write convincing fake news from just a few words

0
0
25018

article-image-gender-bias-in-the-driving-systems-of-ai-autonomous-cars-from-ai-trends

Matthew Emerick

08 Oct 2020

17 min read

Gender Bias In the Driving Systems of AI Autonomous Cars from AI Trends

Matthew Emerick

08 Oct 2020

17 min read

0
0
23560

article-image-data-governance-in-operations-needed-to-ensure-clean-data-for-ai-projects-from-ai-trends

Matthew Emerick

15 Oct 2020

5 min read

Data Governance in Operations Needed to Ensure Clean Data for AI Projects from AI Trends

Matthew Emerick

15 Oct 2020

5 min read

By AI Trends Staff Data governance in data-driven organizations is a set of practices and guidelines that define where responsibility for data quality lives. The guidelines support the operation’s business model, especially if AI and machine learning applications are at work. Data governance is an operations issue, existing between strategy and the daily management of operations, suggests a recent account in the MIT Sloan Management Review. “Data governance should be a bridge that translates a strategic vision acknowledging the importance of data for the organization and codifying it into practices and guidelines that support operations, ensuring that products and services are delivered to customers,” stated author Gregory Vial is an assistant professor of IT at HEC Montréal. To prevent data governance from being limited to a plan that nobody reads, “governing” data needs to be a verb and not a noun phrase as in “data governance.” Vial writes, “The difference is subtle but ties back to placing governance between strategy and operations — because these activities bridge and evolve in step with both.” Gregory Vial, assistant professor of IT at HEC Montréal An overall framework for data governance was proposed by Vijay Khatri and Carol V. Brown in a piece in Communications of the ACM published in 2010. The two suggested the strategy is based on five dimensions that represent a combination of structural, operational and relational mechanisms. The five dimensions are: Principles at the foundation of the framework that relate to the role of data as an asset for the organization; Quality to define the requirements for data to be usable and the mechanisms in place to assess that those requirements are met; Metadata to define the semantics crucial for interpreting and using data — for example, those found in a data catalog that data scientists use to work with large data sets hosted on a data lake. Accessibility to establish the requirements related to gaining access to data, including security requirements and risk mitigation procedures; Life cycle to support the production, retention, and disposal of data on the basis of organization and/or legal requirements. “Governing data is not easy, but it is well worth the effort,” stated Vial. “Not only does it help an organization keep up with the changing legal and ethical landscape of data production and use; it also helps safeguard a precious strategic asset while supporting digital innovation.” Master Data Management Seen as a Path to Clean Data Governance Once the organization commits to data quality, what’s the best way to get there? Naturally entrepreneurs are in position to step forward with suggestions. Some of them are around master data management (MDM), a discipline where business and IT work together to ensure the accuracy and consistency of the enterprise’s master data assets. Organizations starting down the path with AI and machine learning may be tempted to clean the data that feeds a specific application project, a costly approach in the long run suggests one expert. “A better, more sustainable way is to continuously cure the data quality issues by using a capable data management technology. This will result in your training data sets becoming rationalized production data with the same master data foundation,” suggests Bill O’Kane, author of a recent account from tdwi.org on master data management. Formerly an analyst with Gartner, O’Kane is now the VP and MDM strategist at Profisee, a firm offering an MDM solution. If the data feeding into the AI system is not unique, accurate, consistent and time, the models will not produce reliable results and are likely to lead to unwanted business outcomes. These could include different decisions being made on two customer records thought to represent different people, but in fact describe the same person. Or, recommending a product to a customer that was previously returned or generated a complaint. Perceptilabs Tries to Get in the Head of the Machine Learning Scientist Getting inside the head of a machine learning scientist might be helpful in understanding how a highly trained expert builds and trains complex mathematical models. “This is a complex time-consuming process, involving thousands of lines of code,” writes Martin Isaksson, co-founder and CEO of Perceptilabs, in a recent account in VentureBeat. Perceptilabs offers a product to help automation the building of machine learning models, what it calls a “GUI for TensorFlow.”. Martin Isaksson, co-founder and CEO, Perceptilabs “As AI and ML took hold and the experience levels of AI practitioners diversified, efforts to democratize ML materialized into a rich set of open source frameworks like TensorFlow and datasets. Advanced knowledge is still required for many of these offerings, and experts are still relied upon to code end-to-end ML solutions,” Isaksson wrote.. AutoML tools have emerged to help adjust parameters and train machine learning models so that they are deployable. Perceptilabs is adding a visual modeler to the mix. The company designed its tool as a visual API on top of TensorFlow, which it acknowledges as the most popular ML framework. The approach gives developers access to the low-level TensorFlow API and the ability to pull in other Python modules. It also gives users transparency into how the model is architected and a view into how it performs. Read the source articles in the MIT Sloan Management Review, Communications of the ACM, tdwi.org and VentureBeat.

0
0
22819

Tech News - Artificial Intelligence

Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows

Paper in Two minutes: Attention Is All You Need

Startup Focus: Sea Machines Winning Contracts for Autonomous Marine Systems from AI Trends

Youtube promises to reduce recommendations of ‘conspiracy theory’. Ex-googler explains why this is a 'historic victory'

What is Meta Learning?

How Reinforcement Learning works

India Engages in a National Initiative to Support Its AI Industry from AI Trends

Data Scientist: The sexiest role of the 21st century

Baidu releases EZDL - a platform that lets you build AI and machine learning models without any coding knowledge

Dr.Brandon explains Decision Trees to Jon

Trending Topics

AutoML : Developments and where is it heading to

Tensorflow 1.10 RC0 released

OpenAI LP, a new “capped-profit” company to accelerate AGI research and attract top AI talent

Gender Bias In the Driving Systems of AI Autonomous Cars from AI Trends

Data Governance in Operations Needed to Ensure Clean Data for AI Projects from AI Trends

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access