Recent Advances in Artificial Intelligence
ABSTRACT
In recent years, intelligent machines have replaced or augmented human capabilities in many areas. The term
artificial intelligence refers to the intelligence exhibited by machines or software. Artificial intelligence has greatly
improved the performance of manufacturing and service systems. Research within various areas of AI has given
rise to the rapidly growing technology known as the expert system. Application areas of AI are having an enormous
impact on many fields of life, as expert systems are now widely used to solve complex problems in areas such as
science, engineering, business, medicine, and forecasting. AI is currently making significant progress in image
recognition, computer vision, language processing, and prediction, with potentially large impacts on healthcare,
transportation, media, and the military. The objective of this paper is to give an overview of this technology and its
application areas. The paper also explores the current use of artificial intelligence technologies in various fields, as
well as the recent developments made in this domain.
1. INTRODUCTION
Artificial intelligence plays an increasing role in research on management science and operational research.
Intelligence is commonly considered the ability to collect knowledge and reason about that knowledge in order to
solve complex problems. In the near future, intelligent machines will replace or augment human capabilities in
many areas. Artificial intelligence is the study and development of intelligent machines and software that can
reason, learn, gather knowledge, communicate, manipulate, and perceive objects. It is the study of the computation
that makes it possible to perceive, reason, and act. Artificial intelligence differs from psychology in its emphasis on
computation, and from computer science in its emphasis on perception, reasoning, and action. It makes machines
smarter and more useful. It works with the help of artificial neurons (artificial neural networks) and scientific
theorems (if-then statements and logic). Artificial intelligence has advantages over natural intelligence: it is more
permanent and consistent, less expensive, easy to duplicate and disseminate, can be documented, and can perform
certain tasks much faster and better than a human. Major artificial intelligence areas are expert systems, natural
language processing, speech understanding, robotics and sensory systems, computer vision and scene recognition,
intelligent computer-aided instruction, and neural computing. The main techniques applied in artificial intelligence
are neural networks, fuzzy logic, evolutionary computing, and hybrid artificial intelligence.
2. LITERATURE SURVEY
2.1.1 Definition
A new automated AI system with improved computational efficiency and a much smaller carbon footprint has
recently been developed at MIT. The system trains one large neural network comprising many pre-trained
subnetworks of different sizes that can be tailored to diverse hardware platforms without retraining, cutting the
energy required to train and run neural networks.
2.1.2 Motive
Artificial intelligence has become a focus of certain ethical concerns, but it also has major sustainability issues. A
recent report estimated that training and searching for a certain neural network architecture involves the emission
of roughly 626,000 pounds of carbon dioxide, equivalent to nearly five times the lifetime emissions of the average
car, including its manufacturing. The issue becomes even more severe in the model deployment phase, where deep
neural networks must be deployed on diverse hardware platforms, each with different properties and computational
resources.
To overcome such issues, a new automated AI system has been developed for training and running certain neural
networks. Results indicate that, by improving the computational efficiency of the system in some key ways, it can
cut the carbon emissions involved, in some cases down to low triple digits of pounds. The system, called a
"once-for-all" network, trains one large neural network comprising many pre-trained subnetworks of different sizes
that can be tailored to diverse hardware platforms without retraining. This dramatically reduces the energy usually
required to train each specialized neural network for a new platform, which can include billions of
internet-of-things (IoT) devices. Using the system to train a computer-vision model, the researchers estimate that
the process required roughly 1/1,300 the carbon emissions of today's state-of-the-art neural architecture search
approaches, while reducing inference time by 1.5-2.6 times.
2.1.3 Approach
The system builds on a recent AI advance called AutoML (automatic machine learning), which eliminates manual
network design: AutoML systems automatically search massive design spaces for network architectures tailored,
for instance, to specific hardware platforms. But there is still a training-efficiency issue, since each model must be
selected and then trained from scratch for its platform architecture.
The researchers instead used an AutoML system that trains a single, large "once-for-all" (OFA) network serving as
a "mother" network, which nests an extremely large number of subnetworks that are sparsely activated from it.
OFA shares all its learned weights with all subnetworks, meaning they come essentially pre-trained. Thus, each
subnetwork can operate independently at inference time without retraining.
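To make the weight-sharing idea concrete, the following is a minimal sketch, not the researchers' implementation, of how a subnetwork can reuse slices of a mother network's weights. The layer sizes, the extract_subnet helper, and the use of NumPy are illustrative assumptions.

```python
import numpy as np

# Hypothetical "mother" fully connected layer: 512 inputs -> 512 outputs.
# All subnetworks below reuse slices of this single weight matrix,
# so no subnetwork needs its own training from scratch.
rng = np.random.default_rng(0)
mother_W = rng.standard_normal((512, 512)) * 0.01
mother_b = np.zeros(512)

def extract_subnet(in_dim, out_dim):
    """Return a smaller layer that shares (slices of) the mother weights.

    In a real OFA network the most important channels are kept first,
    so taking the leading slice retains the most useful weights.
    """
    return mother_W[:in_dim, :out_dim], mother_b[:out_dim]

def forward(x, W, b):
    # Single shared-weight layer followed by a ReLU nonlinearity.
    return np.maximum(x @ W + b, 0.0)

# A small subnetwork for a constrained device, a larger one for a phone.
W_small, b_small = extract_subnet(128, 128)
W_large, b_large = extract_subnet(512, 256)

y_small = forward(rng.standard_normal(128), W_small, b_small)
y_large = forward(rng.standard_normal(512), W_large, b_large)
print(y_small.shape, y_large.shape)  # (128,) (256,)
```

Because both layers are views into the same trained weights, neither requires retraining; they differ only in how much of the mother network they activate.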
2.1.4 Applications
An OFA convolutional neural network (CNN), commonly used for image-processing tasks, was trained with
versatile architectural configurations, including different numbers of layers and "neurons," diverse filter sizes, and
diverse input image resolutions. Given a specific platform, the system uses the OFA network as the search space to
find the best subnetwork based on the accuracy and latency tradeoffs that correspond to the platform's power and
speed limits.
For an IoT device, for instance, the system will find a smaller subnetwork. For smartphones, it will select larger
subnetworks, but with different structures depending on individual battery lifetimes and computational resources.
OFA decouples model training from architecture search and spreads the one-time training cost across many
inference hardware platforms and resource constraints.
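A minimal sketch of this kind of constrained selection is given below. The candidate configurations, the accuracy and latency predictors, and the budget values are hypothetical stand-ins for the actual search machinery.

```python
# Hypothetical search over subnetwork configurations: pick the most
# accurate candidate whose predicted latency fits the device budget.
# The predictor functions below are illustrative toy proxies.

candidates = [
    {"layers": 12, "width": 1.0,  "resolution": 224},
    {"layers": 8,  "width": 0.75, "resolution": 192},
    {"layers": 4,  "width": 0.5,  "resolution": 160},
]

def predicted_accuracy(cfg):
    # Toy proxy: deeper and wider subnetworks score higher.
    return 0.5 + 0.01 * cfg["layers"] + 0.1 * cfg["width"]

def predicted_latency_ms(cfg, device_speed):
    # Toy proxy: cost grows with depth, width, and input resolution.
    return cfg["layers"] * cfg["width"] * cfg["resolution"] / device_speed

def best_subnet(device_speed, latency_budget_ms):
    feasible = [c for c in candidates
                if predicted_latency_ms(c, device_speed) <= latency_budget_ms]
    return max(feasible, key=predicted_accuracy) if feasible else None

# A slow IoT-class device gets the smallest subnetwork; a fast
# phone-class device gets the largest one within the same budget.
print(best_subnet(device_speed=50.0, latency_budget_ms=10.0))
print(best_subnet(device_speed=400.0, latency_budget_ms=30.0))
```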
OFA training relies on a "progressive shrinking" algorithm that efficiently trains the network to support all of the
subnetworks simultaneously. Training starts with the full network at its maximum size, then progressively shrinks
the network to include smaller subnetworks. Smaller subnetworks are trained with the help of larger ones, so that
they grow together. In the end, subnetworks of all sizes are supported, allowing fast specialization based on a
platform's power and speed limits, and new hardware devices can be supported with zero additional training cost.
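The following sketch illustrates the progressive-shrinking schedule at a high level; the stage ordering follows the description above, while the train_epochs function and the epoch counts are placeholder assumptions.

```python
# High-level sketch of progressive shrinking: start from the full
# network, then progressively add smaller depth/width settings so that
# smaller subnetworks are trained jointly with larger ones.
# `train_epochs` is a placeholder for the real optimization loop, in
# which each batch would sample a random subnetwork from the currently
# supported set.

def train_epochs(active_depths, active_widths, epochs):
    print(f"train {epochs} epochs, "
          f"depths={active_depths}, widths={active_widths}")

depths, widths = [12], [1.0]
train_epochs(depths, widths, epochs=10)  # Stage 1: full network only.

# Later stages: enable progressively smaller settings, keeping the
# larger ones active so all sizes continue to be trained together.
for new_depth, new_width in [(8, 0.75), (4, 0.5)]:
    depths.append(new_depth)
    widths.append(new_width)
    train_epochs(depths, widths, epochs=5)
```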
2.2.1 Definition
A generative model is a powerful way of learning any kind of data distribution using unsupervised learning, and
such models have achieved tremendous success in just a few years. All types of generative models aim to learn the
true data distribution of the training set so as to generate new data points with some variations. But it is not always
possible to learn the exact distribution of the data, either implicitly or explicitly, so we instead try to model a
distribution that is as similar as possible to the true data distribution. For this, we can leverage the power of neural
networks to learn a function that approximates the model distribution to the true distribution.
A generative model models the distribution of individual classes; it is a model that can generate data. It models
both the features and the class, i.e., the complete data. If we model the joint distribution P(x,y), we can use this
probability distribution to generate new data points, and hence all algorithms modeling P(x,y) are generative.
2.2.2 Examples
1. Naive Bayes models P(c) and P(d|c), where c is the class and d is the feature vector.
2. Also, P(c,d) = P(c) * P(d|c).
3. Hence, Naive Bayes in effect models P(c,d); a small sketch of this is given below.
4. Bayes nets are another example of generative models.
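As a concrete illustration of the Naive Bayes example above, the sketch below fits P(c) and P(d|c) on a toy binary dataset and then samples new (c, d) pairs from the joint P(c,d). The dataset and the hand-rolled estimation are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary dataset: rows of X are feature vectors d, y holds classes c.
X = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]])
y = np.array([0, 0, 1, 1])

classes = np.unique(y)  # class labels double as row indices below
# P(c): relative class frequencies.
p_c = np.array([(y == c).mean() for c in classes])
# P(d_j = 1 | c): per-class Bernoulli parameter for each feature,
# with Laplace smoothing to avoid zero probabilities.
p_d_given_c = np.array([(X[y == c].sum(axis=0) + 1) / ((y == c).sum() + 2)
                        for c in classes])

def sample(n):
    """Sample (c, d) pairs from the joint P(c, d) = P(c) * P(d | c)."""
    cs = rng.choice(classes, size=n, p=p_c)
    ds = (rng.random((n, X.shape[1])) < p_d_given_c[cs]).astype(int)
    return cs, ds

cs, ds = sample(5)
print(cs)
print(ds)
```

Because the model represents the full joint distribution, it can both classify (via Bayes' rule) and generate new feature vectors, which is exactly what distinguishes it from a purely discriminative model.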
2.2.3 Approaches
The two most common approaches are Variational Autoencoders (VAEs) and Generative Adversarial Networks
(GANs). A VAE aims at maximizing the lower bound of the data log-likelihood, while a GAN aims at achieving an
equilibrium between its generator and discriminator.
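In standard notation (not taken from a specific source), the two objectives can be written as follows: the VAE maximizes a variational lower bound on the data log-likelihood, while the GAN plays a minimax game between generator G and discriminator D.

```latex
% VAE: maximize the evidence lower bound (ELBO) on the data log-likelihood.
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)

% GAN: generator G and discriminator D play a minimax game.
\min_{G}\max_{D}\;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]
```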
2.2.4 Applications
Generative models have many short-term applications. But in the long run, they hold the potential to
automatically learn the natural features of a dataset, whether categories or dimensions or something else
entirely.
3. CONCLUSIONS
The main objective of this paper is to analyze the various advances made in the field of artificial intelligence. An
important development is a system that reduces the carbon footprint of AI neural network models. In total, one
OFA network can comprise more than 10 quintillion (a 1 followed by 19 zeroes) architectural settings, covering
probably all platforms ever needed, yet training the OFA and searching it ends up being far more efficient than
spending hours training each neural network per platform. Moreover, OFA does not compromise accuracy or
inference efficiency. Instead, it provides state-of-the-art ImageNet accuracy on mobile devices, and, compared with
state-of-the-art industry-leading CNN models, the researchers say OFA provides a 1.5-2.6 times speedup with
superior accuracy. Another important development is generative models, which are a more complete extension of
the standard statistical learning framework and are supposed to learn more general knowledge about the underlying
data.