14.2 Machine Learning and Deep Learning

Learning Objective

Differentiate among supervised, semi-supervised, unsupervised, reinforcement, and deep learning.

Machine learning (ML) is an application of artificial intelligence that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use those data to learn for themselves.

First we discuss how machine learning differs from traditional computer programming and expert systems. We then discuss problems inherent to developing ML systems. We close this section with a discussion of several types of machine learning: supervised, semi-supervised, unsupervised, and reinforcement.

Traditional Programming versus Machine Learning

Fundamentally, traditional programming is a structured combination of data and a computer algorithm (computer program) that produces answers. In supervised machine learning, developers train the system with labelled input data and the expected output results. After the system is trained, developers feed it with unlabelled input data and examine the accuracy of the output data. Let’s look at an example of the difference between traditional programming and supervised machine learning.

Traditional Programming

Let’s say that we want to know the product of two numbers. The first column is a and the second column is b. With traditional programming, we create an algorithm (computer code), or c = a × b. The results are 24, 15, and 18.

a    b    c = a × b
6    4    24
3    5    15
9    2    18

Supervised Machine Learning

Let’s use the same numbers as our example above as labelled input data to train a supervised machine learning system. We feed the system with these relationships:

6 and 4 are related to 24
3 and 5 are related to 15

We want to know how these numbers relate, so we let the system evaluate the relationships of known values and check its accuracy.

6 ? 4 = 24
3 ? 5 = 15

The system determines that the relationship between each pair is “multiply.” If we say that the question mark is multiply and check our results, we find that they are correct.

So, if we then feed 9 and 2 into the system, it will tell us that the relationship is 9 × 2 = 18.

When the machine learning algorithm is trained on large amounts of labelled training data, it produces predictions for additional examples.

Expert Systems versus Machine Learning

Expert systems (ESs) are computer systems that attempt to mimic human experts by applying expertise in a specific domain. Essentially, an ES transfers expertise from a human domain expert (or other source) to the system. This knowledge is then stored in the system, typically in the form of IF-THEN rules. The more complex ESs are composed of thousands of these rules.

ESs can make inferences and arrive at conclusions. Then, like a human expert, they offer advice or recommendations. Also like human experts, they can explain the logic behind the advice.

Expert systems do present some problems. For instance, transferring domain expertise from human experts to the expert system can be difficult because humans cannot always explain how they know what they know. In addition, even if the domain experts can explain their entire reasoning process, it might not be possible to automate that process. The process might be either too complex or too vague, or it might require too many rules. Essentially, it is very difficult to program all the possible decision paths into an expert system.

There are significant differences between expert systems and machine learning systems. First, ESs require human experts to provide the knowledge for the system. In contrast, ML systems do not require human experts. Further, much like traditional programming, expert systems must be formally structured in the form of rules.
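To make the idea of IF-THEN rules concrete, the following is a minimal sketch of how a tiny rule-based system might store and apply them. The printer-troubleshooting domain, the `Rule` type, the rule names, and the `consult` function are all hypothetical and chosen only for illustration; a real expert system would hold far more rules and a more capable inference engine.

```python
from typing import Callable, List, NamedTuple, Tuple


class Rule(NamedTuple):
    name: str
    condition: Callable[[dict], bool]  # the IF part
    advice: str                        # the THEN part


# Hypothetical rules that a human domain expert might supply.
RULES = [
    Rule("no_power", lambda f: not f["powers_on"], "Check the power cable."),
    Rule("paper_jam", lambda f: f["powers_on"] and f["paper_jam"],
         "Clear the paper path."),
    Rule("low_ink", lambda f: f["powers_on"] and f["ink_level"] < 0.1,
         "Replace the ink cartridge."),
]


def consult(facts: dict) -> List[Tuple[str, str]]:
    """Return (rule name, advice) for every rule whose IF part holds."""
    return [(rule.name, rule.advice) for rule in RULES if rule.condition(facts)]


facts = {"powers_on": True, "paper_jam": False, "ink_level": 0.05}
for name, advice in consult(facts):
    # Reporting the fired rule is a simple way to "explain" the advice.
    print(f"{advice} (fired rule: {name})")
```

Every decision path has to be written out by hand as a rule, which is why covering all possible paths becomes difficult as the domain grows.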
By contrast, machine learning algorithms learn from ingesting vast amounts of data and adjusting hyperparameters and parameters (discussed below).

Machine Learning Bias (Also Called Algorithm Bias)

Designers must consider the many types of bias when developing machine learning systems. The sources of bias include underspecification, how developers approach a problem, and the data used to train the system.

Underspecification

The training process for a machine learning system can produce multiple models, all of which pass the testing phase. However, these models will differ in small, arbitrary ways, depending on things such as the random values given to the nodes in a neural network before training starts, the number of training runs, and more (see Section 14.3). Developers typically overlook these differences if they do not impact how an ML model performs on its test. Unfortunately, these differences can lead to huge variations in how the model performs in the real world. Essentially, even if a training process can produce a good model, it could still ultimately produce a poor model. The process will not know the difference, and neither will the developers, until the model is employed in the real world.

How Developers Approach a Problem

Let’s look at a simple example that illustrates how you might frame a problem. Consider the following numbers: 4, 9, 3, 6, 11, and 5. What is the next number in this series? Your answer to this problem is a product of how you intuitively see the problem and how you frame it. If you think in arithmetic terms, you may try some combination of addition and subtraction based on a pattern that you think exists. If you are a statistician, you may try to perform regression on the numbers to determine what the next one would be. There are many possible approaches.

The “answer” here is that there is no right answer because we picked these numbers at random. The critical point is how you chose to approach the problem.
Your approach reveals your bias as to how you would try to solve the problem.

This simple example illustrates an essential issue in the field of AI. How developers approach or frame a problem determines how they set up the process of building the AI system and, ultimately, how the algorithm learns and produces answers.

How Data Can Bias an ML System

The third type of bias comes from the data that are used to train the system. One type of data bias, known as data shift, comes from a mismatch between the data used to train and test the system and the data the system actually encounters in the real world. For example, an ML system trained only with current customers might not be able to predict the behaviours of new customers who are not represented in the training data. Another source of data bias is hidden relationships in the data. For example, it is common for people to live in neighbourhoods where mostly people of their own racial background live. Therefore, when ML systems use postal codes/areas/neighbourhoods (for marketing, hiring, etc., purposes), it can lead to outcomes that are biased against people of certain races.

In addition, an ML system trained on biased data will likely pick up the same biases that already exist in society. For instance, ML systems used for criminal risk assessment have been found to be biased against people of colour.

As a result, machine learning raises many ethical questions. ML systems trained on data sets collected from biased samples can exhibit these same biases when they are used, a problem called algorithmic bias. For example, using job hiring data from a firm with biased hiring policies could cause an ML system to duplicate this bias by scoring job applicants accordingly.
Clearly, collecting the data and documenting the algorithmic rules used by an ML system in a responsible manner is a critical component of developing ML systems.

False Positives

Another challenging problem when building AI systems or evaluating outputs is seeing conditions where none actually exist, which is called a false positive. A false positive is a result that indicates that a given condition exists when it in fact does not. Examples of false positives include convicting an innocent person, identifying an email as spam when it is not, and flagging a legitimate transaction as fraudulent.

Analyzing complex data sets can be difficult. However, by being aware of false positives, AI practitioners can assess data objectively and not be misled by apparent, but erroneous, conditions.

We now turn our attention to the various types of machine learning: supervised, semi-supervised, unsupervised, reinforcement, and deep. Figure 14.1 provides a look at how these types differ.

FIGURE 14.1 The types of machine learning.

[Flowchart, described in order:]
Is the system looking for patterns in massive amounts of data? If no: “Game over. You are looking at the wrong flowchart.” If yes: “Great! It is machine learning.”
Is the system being told what to look for? If yes: “OK, telltale sign of supervised learning.” If no, continue to the next question.
Is the system trying to reach an objective through trial and error? If yes: “Definitely reinforcement learning.” If no: “Then it must be unsupervised learning.”
Is the system using deep neural networks? (All three outcomes above lead to this question.) If yes: “Welcome to the land of deep learning,” so add “deep” to each technique’s name (for example, deep supervised learning). If no: “How boring. Keep each technique’s name as is.”

Supervised Learning

As we discussed in the previous section, supervised learning is a type of machine learning in which the system is given labelled input data and the expected output results. Developers input massive amounts of data during the training phase and stipulate what output should be obtained from each specific input value. Developers then input unlabelled, never-been-seen data values to verify that the model is accurate.

Classification and regression analysis are important techniques for supervised learning. Classification algorithms are used when the outputs are restricted to a limited set of values; regression algorithms are used when the outputs can have any numerical value within a certain range.

Classification refers to a predictive modelling problem in which the system generates a class label for a given set of input data. There are four types of classification.

Binary classification refers to classification problems that have only two class labels.
Examples are email spam detection (spam or not), churn prediction (churn or not), and conversion prediction (buy or not).

Multi-class classification refers to classification problems with more than two class labels. Examples are news article categories, plant species classification, and optical character recognition.

Multi-label classification refers to classification problems that have two or more class labels, where one or more class labels can be predicted for each example. Consider the example of photo classification, where a given photo may have multiple objects in the scene. The classification model may predict the presence of multiple known objects in the photo, such as an automobile, a person, a stop sign, and so on.

The main difference between multi-class classification and multi-label classification is that, in multi-class classification, the classes are mutually exclusive (e.g., an email is either spam or not). However, in multi-label classification, each label represents a different classification task.

Imbalanced classification refers to classification problems in which the number of examples in each class is unequally distributed. Typically, imbalanced classification problems are binary classification problems in which the majority of data points in the training data belong to one class and a minority to another class. Examples are fraud detection, outlier detection, and medical diagnostic tests.

Linear regression is a supervised machine learning algorithm in which the predicted output is continuous and has a constant slope. This algorithm is used to predict continuous variables such as sales or price, rather than classifying them into categories with a classification algorithm. There are two main types of linear regression: simple and multiple.

In simple linear regression, a single independent variable is used to predict the value of a dependent variable.
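A minimal sketch of simple linear regression, assuming the usual least-squares fit of a straight line y = slope × x + intercept. The function name and the small data set are made up for illustration, and the data are chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: one independent variable (x), one dependent variable (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
prediction = slope * 5.0 + intercept  # predict y for a new, unseen x
print(slope, intercept, prediction)
```

Because the output is a continuous number rather than a class label, this is a regression problem and not a classification problem.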
For example, the Italian clothing company Benetton is examining the relationship between its annual sales and the amount the firm spends on advertising. Benetton uses simple linear regression, using advertising as the independent (predictor) variable to predict the dependent variable, sales.

In multiple linear regression, two or more independent variables are used to predict the value of a dependent variable. Suppose that Benetton wants to analyze the impact of product price, product advertising expense, store location, and season of the year on product sales. The firm would conduct a multiple linear regression, with price, advertising expense, store location, and season as the independent variables predicting the dependent variable, product sales.

Semi-Supervised Learning

Semi-supervised learning is a type of machine learning that combines a small amount of labelled data with a large amount of unlabelled data during training. For example, semi-supervised learning is excellent for text document classification because it is very difficult to find a large amount of labelled text documents. The reason is that it is not efficient to have a human read through entire text documents to classify and label them. In this case, the algorithm learns from a small amount of labelled text documents while still being able to classify large amounts of unlabelled text documents in the training data.

Unsupervised Learning

Unsupervised learning is a type of machine learning that searches for previously undetected patterns in a data set with no pre-existing labels and with minimal human supervision. The best time to use unsupervised learning is when an organization does not have data on desired outcomes. An example is when the firm wants to determine a target market for an entirely new product that it has never sold before.

Cluster analysis is one of the primary techniques in unsupervised learning.
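As a rough illustration of grouping unlabelled data, the sketch below runs a very small k-means procedure on one-dimensional values. The text does not name a specific clustering algorithm, so k-means is an assumption here, and the ages and starting centroids are invented for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group 1-D values around the given starting centroids (basic k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical, unlabelled customer ages; no "right answers" are supplied.
ages = [22, 25, 27, 58, 61, 64]
centroids, clusters = kmeans_1d(ages, centroids=[20, 70])
print(clusters)
```

No labels are given to the procedure; the two groups emerge only from how close the values are to one another.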
Cluster analysis groups, or segments, data points to identify common characteristics. It then reacts based on whether each new piece of data exhibits these characteristics.

Example: Finding customer segments. Clustering is an unsupervised ML technique in which the goal is to find groups or clusters in input data. Developers use clustering to determine customer segments in marketing data using variables such as gender, location, age, education, income bracket, and many others.

Example: Feature selection. Assume that developers want to predict which customers of a bank are likely to be interested in a new credit card promotion. Since the bank cannot offer the promotion to all its customers, the goal is to find customers who are most likely to be interested in the new credit card promotion. The bank needs to analyze large amounts of data about each customer to make these predictions, including the customer’s average monthly income, average monthly debt payments, credit history, age, interest in similar prior promotions, and many other variables.

Because banks typically collect more data than they use for such decisions, not all of the variables are relevant for predicting a customer’s likelihood of interest in a new promotion. For instance, does an applicant’s age make any difference in deciding whether they would be interested in the promotion? Is the customer’s gender important? (Note that, in certain countries, it might be illegal to include gender or age considerations in such decisions.) For this reason, eliminating unnecessary variables is an essential part of training an ML system. In feature selection, developers try to eliminate a subset of the original set of features (variables).

Reinforcement Learning

Reinforcement learning is a type of machine learning in which the system learns to achieve a goal in an uncertain, potentially complex environment.
In reinforcement learning, the system faces a game-like situation where it employs trial and error to find a solution to a problem. The developer awards penalties or rewards to the system for the actions it performs so that it will do what the developer wants. The system’s goal is to maximize the total reward.

Although the designers set the reward policy (that is, the rules of the game), they give the model no hints or suggestions for how to solve the problem. The system must determine how to perform the task to maximize the reward, beginning with totally random trials and finishing with sophisticated tactics.

There are numerous examples of reinforcement learning applications. Some of these are:

Recommendation systems
Automated ad bidding and buying
Dynamic resource allocation in wind farms, HVAC (heating, ventilation, and air conditioning) systems, and computer clusters in data centres
Automated calibration of engines and other machines
Robotic control
Autonomous vehicles such as self-driving cars
Supply chain optimization

Deep Learning

Deep learning is a subset of machine learning in which artificial neural networks learn from large amounts of data. When supervised, semi-supervised, unsupervised, and reinforcement learning systems use neural networks, we add the term deep to each one, resulting in deep supervised learning, deep semi-supervised learning, and so on.

Deep learning systems can solve complex problems even when they utilize a data set that is very diverse and unstructured. These systems can discover new patterns without being exposed to labelled historical or training data. Widely used examples of deep learning are automatic speech recognition, image recognition, natural language processing, customer relationship management, recommendation systems, and drug discovery.

IT’s About Business 14.1
POM HRM
AI in the Global Shipping Industry

The Shipping Industry

The shipping industry is changing with the aim of enhancing efficiency and safety at ports and at sea.
From small boats to huge container ships, vessels are integral components of the global economy. According to the United Nations, ships transport approximately 90 percent of all worldwide commerce.

Maritime companies are leveraging machine learning and other technologies to design smart ports. In addition, they are working to create more autonomous ships. At times, transforming a port involves updating an existing technical infrastructure made up of less sophisticated components. For example, historically some ports have relied on low-tech, manual solutions. In these ports, workers physically visit a vessel with ropes, which they use to measure the length and width of the ship. Then they decide which dock the ship should enter.

Smart ports. In contrast, smart ports employ advanced, innovative technologies to monitor and improve their operations. These technologies frequently take the form of digital twins. A digital twin is a digital representation of a physical system that maps that entity into a three-dimensional virtual system. Digital twins integrate and analyze multiple data sources, including sensors on port equipment and vessels, inbound and outbound ship traffic, harbour size, live weather conditions, and many other variables. Digital twins also enable port authorities to improve mooring and casting off, and to remotely control cranes and other equipment.

For example, in shallow water, the tides play a major role in scheduling loading and unloading operations, especially for larger vessels. Buoys equipped with sensors monitor tidal changes, water temperature, and other variables. These metrics provide a clearer picture of real-time conditions. Instead of the port sending a human team to check the buoys, the buoys transmit their information to the port in real time.

When ships arrive or leave the dock, they must be loaded and unloaded efficiently and safely. To do so, ports use industrial cranes to transfer containers to and from each ship and around the port.
Over time, the transfer of tonnes of cargo as winds blow through a port can compromise the cranes’ structural integrity. At times, metal fatigue can even cause the catastrophic collapse of a crane itself. By monitoring the crane’s structure and the meteorological environment at the port, operators can adjust docking and crane operations to increase safety. Cranes are equipped with cameras, an anemometer (an instrument that measures wind velocity), and other sensors to monitor the stress on the crane’s structure while it is operating. Machine learning can then analyze these data to monitor trends and predict failures before they occur.

When a ship arrives and is unloading, port authorities must have the necessary number and types of trucks and trains waiting to receive the containers. The authorities monitor GPS sensors on the trucks and trains to precisely determine their locations, thus increasing the efficiency of transferring the containers.

Once the various sensors, ships, and port systems are integrated, machine learning systems can then optimize maritime scheduling. The system can answer such questions as: Which dock should we send the ship to? Which train or trucks should be waiting as the ship offloads its cargo? Where should the trains or trucks be positioned in the port when a ship is ready to load? These insights help to reduce bottlenecks and prevent accidents.

The largest port in Europe, the Port of Rotterdam (Netherlands), has embraced digitization. Through its Smart Infrastructure program, it aims to have ships autonomously enter and leave the port by 2030. Further, the port operates an unstaffed container terminal that utilizes autonomous cranes.

The second-largest port in Europe is the Port of Antwerp (Belgium), which has also digitized. The port utilizes a digital three-dimensional map that contains actionable real-time information as its digital twin.
Smart cameras and computer vision are integrated into the map to produce an “intelligent wharf” that ensures that ships are safely and properly moored while also reducing wait times. The cameras have automatic image-recognition capabilities that increase security around the port and enable authorities to analyze and optimize equipment movements.

Ports are also working to prevent human operators from getting too tired. In the past, crane operators had to work under difficult conditions with narrow margins for error. They would sit for hours in a cockpit 10 to 12 stories high and watch several monitors while they controlled a giant crane. They also had to constantly take wind pressure into account. This job was stressful and caused fatigue.

In contrast, crane operators can now utilize sensor data to control their cranes remotely from buildings on the ground. This process reduces both operator fatigue and the risk of error during loading and unloading operations.

Smart ships. The shipping industry is reimagining the way cargo moves at sea. Smart ships use a number of sensors such as GPS, cameras, radar, and LIDAR for operations. (LIDAR is a system that measures distance to a target by illuminating the target with laser light and measuring the reflected light with a sensor.) Sensors enable these ships to operate with reduced crew sizes. Because the system reduces the number of crew members, the ship can be constructed without the life-support systems used to accommodate human crew members such as galleys, housing compartments, food storage, and restrooms. In turn, reducing ship size can minimize construction costs and fuel consumption and leave more room for storage. Put simply, a smaller autonomous ship can carry roughly the same amount of cargo as a much larger crewed vessel.

Smart ships must provide for control of the vessel and have the capacity to monitor the condition of each component of the vessel.
Therefore, these ships typically contain redundant systems that prevent the system from failing in the event of a malfunction. Machine learning systems can analyze trends in data to predict operational failures in advance and notify ship owners of necessary maintenance.

Questions

Which type of machine learning applies to the following applications in this case? Support your answer for each application.
    Port operations;
    Crane operations;
    Predicting preventive maintenance on port equipment;
    Optimizing the loading and unloading of ships.
What are the advantages to ship owners of implementing ML applications? Support your answer.
What are the disadvantages to ship owners of implementing ML applications? Support your answer.
What are the advantages to crew members of implementing ML applications on ships? Support your answer.

Sources: Compiled from Fine Art Shippers, “What Is Smart Shipping, and How Can It Change Art Logistics?,” Fine Art Shippers, August 12, 2022; C. Cole, “Creating a Digital Twin: The Key to Building a Smart Vessel,” Siemens.com blog, January 27, 2022; Sobel Network Shipping, “Smart Shipping for the Supply Chain,” Sobel Network Shipping Company, Inc., December 21, 2021; N. Joshi, “Why AI Adoption Is Lagging in International Shipping,” BBN Times, December 7, 2021; M. Ball, “New AI-Powered Smart Shipping Solutions,” Unmanned Systems Technology, November 5, 2021; A. Inam, “5 Ways AI Can Help Mitigate the Global Shipping Crisis,” TechCrunch, August 9, 2021; J. Donnelly, “How Can Digital Twins Help Ports?” Port Technology, July 20, 2021; J. Donnelly, “Digital Twin Shortens Playing Field for Expansive Belfast Harbour,” Port Technology, May 21, 2021; J. Donnelly, “Measure, Optimise Terminal Machinery to Truly Greenify Ports,” Port Technology, April 1, 2021; J. Donnelly, “AIDrivers Emphasises Digital Twin in PSA Singapore Project,” Port Technology, March 19, 2021; N. Joshi, “Charting the Role of Artificial Intelligence in Shipping,” BBN Times, October 7, 2020; A. Oriel, “Decoding the Future of Global Shipping with Artificial Intelligence,” Industry Wired, August 28, 2020; J. Jackson, “What Are Smart Ports and How Will They Change the Shipping Industry?” Searates, August 21, 2020; R. Adams, “AI on the High Seas: Digital Transformation Is Revolutionizing Global Shipping,” TechRepublic, August 19, 2020; F. Martin, “How AI & Automation Has Overhauled the Shipping Industry,” Analytics India Magazine, January 31, 2019; G. Spencer, “AI and Cargo Shipping: Full Speed Ahead for Global Maritime Trade,” Microsoft.com, April 23, 2018.

Before you go on…

What is the difference between traditional computer programming and machine learning systems?
What is the difference between expert systems and machine learning systems?
Describe three types of bias that can negatively impact the development of machine learning systems.
Differentiate between supervised learning, semi-supervised learning, unsupervised learning, reinforcement learning, and deep learning.