Research Paper On iOS
Abstract— Recent advances in software development and efforts toward context awareness and
personalization have brought closer the long-standing vision of the ubiquitous intelligent personal assistant.
This has become particularly salient in the context of smartphones and tablets, where natural
language interaction has the potential to considerably enhance the mobile experience and offers new
options for the user interface. This trend may well usher in a genuine paradigm shift in man-machine
communication. This paper presents an overview of Siri, Apple's voice-controlled assistant, the technology
behind it, its features, pros, and cons, and finally discusses how the current implementation might evolve in
the near future to best mitigate any downsides.
I. INTRODUCTION
With advances in the fields of artificial intelligence, natural language processing, speech recognition,
and related areas, commercial products such as Apple's Siri and Microsoft's Cortana have been developed
that can process speech input and return the desired results. Research is now ongoing in areas such as health
care, navigation, and translation [1], with the aim of applying the above-mentioned technologies to solve
different problems in an efficient and intelligent way.
A personal assistant is a person or agent able to provide dedicated help at a given time and in a given
activity context. For instance, a secretary employed in a company performs activities such as answering
incoming calls, scheduling meetings and appointments, ordering products, and interacting with clients. An
important characteristic of personal assistants is that they adapt themselves to the demands of their superiors,
and a one-to-one relationship exists between the assistant and the superior. The defining features of digital
assistants are ease of interaction, flexibility, and simplicity. A voice-based input/output interface is the easiest
way to process speech input, because voice-based interaction is usually simple and flexible and does not demand
cognitive effort, attention, or memory resources on the side of the user. Alongside these benefits, there are
many constraints on personal digital assistants, including the complexity of human speech and varying contexts.
For this reason, commercial personal digital assistants are developed only for a specific, well-defined scope of
tasks.
Artificial Intelligence (AI) has been propelled into the mainstream of learning. AI draws on many areas,
including computer science, cognitive and learning sciences, game design, psychology, sociology, philosophy,
mathematics, neuroscience, linguistics, the defence industry, medicine, and education [2]. AI uses logical
sequences of steps called algorithms, together with advanced cognitive computing technologies, applying search
and pattern-matching techniques to produce answers to the questions posed. AI is an interdisciplinary field that is
used for the diagnosis of illnesses, criminal identification, and automated instruction. To enable communication
between humans and computers, an AI system must be able to reason while processing natural language, drawing
on data and developments from the fields mentioned above.
AI has other descriptions as well: it is the ability to comprehend, learn, solve, interpret, and execute
complex mental processes, and it is a subfield of computer science. Natural Language Processing (NLP)
supports human-computer interaction by combining human learning and machine reasoning [3-5]. NLP is the
analysis of linguistic data, most commonly textual data such as documents or publications, using
computational methods.
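To make the idea of computational analysis of text concrete, the following is a minimal sketch using Apple's NaturalLanguage framework to tag each word of a sentence with its lexical class. The example sentence is invented for illustration, and the sketch is a generic NLP exercise, not part of Siri's own pipeline.

import NaturalLanguage

// Part-of-speech tagging of a single invented sentence.
let text = "Schedule a meeting with the design team tomorrow at noon."

let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

// Walk the sentence word by word and print each token together with
// its lexical class (noun, verb, preposition, and so on).
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: [.omitPunctuation, .omitWhitespace]) { tag, range in
    if let tag = tag {
        print("\(text[range]) -> \(tag.rawValue)")
    }
    return true // keep enumerating over the remaining tokens
}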
II. SIRI
Some Apple products have a built-in intelligent personal assistant, SIRI. SIRI is an offshoot of CALO
(Cognitive Assistant that Learns and Organizes), a DARPA-funded project [6]. The SIRI program can help a user
schedule reminders or assist in texting, and can also do fun things such as telling exactly which planes are flying
above your location and their departure times. With more recent Apple products, a user can go entirely hands-free
by saying, "Hey Siri."
SIRI uses voice queries and a natural language user interface to answer questions and perform actions by
delegating requests to a set of Internet services. With continued use, the software adapts to the user's individual
language usage, searches, and preferences, and the returned results are individualized. Apple opened up
third-party access to Siri with the release of iOS 10 in 2016, including third-party messaging, payment,
ride-sharing, and Internet calling apps.
A. Technology used in SIRI
Siri relies on machine learning technologies to function. It uses automatic speech recognition (ASR) to
transcribe human speech (in this case, short utterances of commands, questions, or dictations) into text. Users
speak natural language voice commands in order to operate their mobile devices (iPhone 4S and later, and
newer iPad and iPod Touch devices) and their applications. The idea is to provide high-level modelling
primitives as an integral part of a data model, in order to facilitate the representation of real-world situations.
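As a rough illustration of the ASR step, the sketch below uses Apple's public Speech framework to transcribe a short pre-recorded utterance into text. This is a generic example rather than Siri's own recognizer, and the audio file name is a placeholder.

import Foundation
import Speech

// Transcribe a short recorded command or question bundled with the app.
func transcribe(fileNamed name: String) {
    // Ask the user for permission to run speech recognition.
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.isAvailable,
              let url = Bundle.main.url(forResource: name, withExtension: "m4a")
        else { return }

        // Recognize the recorded utterance and print the resulting text.
        let request = SFSpeechURLRecognitionRequest(url: url)
        _ = recognizer.recognitionTask(with: request) { result, error in
            if let result = result, result.isFinal {
                print("Transcription: \(result.bestTranscription.formattedString)")
            } else if let error = error {
                print("Recognition error: \(error)")
            }
        }
    }
}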
B. Working of SIRI
The working of SIRI [7] can be explained in the following steps:
1. Voice Recognition – Whenever a person issues a command in his or her natural voice, the assistant must
convert that analog signal into a digital one, 'understand' what was said by putting the recognized keywords
together, and finally act on the request. This might sound trivial, but it is the essential first step, since without
overcoming the hurdles of regional accents, surrounding noise, and individual voices, the assistant cannot work
at all. Over time it also learns how its user sounds when speaking specific words. The speech recognition that
Apple's Siri uses is reported to be about 95% accurate, i.e. it has a very low error rate; a sketch of how such an
error rate is typically measured is given below.
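Speech recognizers are usually scored by word error rate (WER): the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of words in the reference, so that roughly 95% accuracy corresponds to a WER of about 5%. The following is a small, self-contained sketch of that computation; the sample sentences are invented.

// Word error rate via dynamic-programming edit distance over words,
// where substitutions, insertions, and deletions each cost 1.
func wordErrorRate(reference: String, hypothesis: String) -> Double {
    let ref = reference.lowercased().split(separator: " ").map(String.init)
    let hyp = hypothesis.lowercased().split(separator: " ").map(String.init)
    guard !ref.isEmpty else { return hyp.isEmpty ? 0 : 1 }
    guard !hyp.isEmpty else { return 1 }

    var dist = Array(repeating: Array(repeating: 0, count: hyp.count + 1),
                     count: ref.count + 1)
    for i in 0...ref.count { dist[i][0] = i }
    for j in 0...hyp.count { dist[0][j] = j }
    for i in 1...ref.count {
        for j in 1...hyp.count {
            let cost = ref[i - 1] == hyp[j - 1] ? 0 : 1
            dist[i][j] = min(dist[i - 1][j] + 1,        // deletion
                             dist[i][j - 1] + 1,        // insertion
                             dist[i - 1][j - 1] + cost) // substitution
        }
    }
    return Double(dist[ref.count][hyp.count]) / Double(ref.count)
}

// One substituted word in a five-word reference gives a WER of 0.2,
// i.e. roughly 80% word-level accuracy.
print(wordErrorRate(reference: "set an alarm for seven",
                    hypothesis: "set an alarm for eleven"))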
2. Send everything to the Apple servers in the cloud – Siri does not run entirely locally on the mobile device,
consuming its limited resources; instead it sends the request to powerful servers so as to achieve maximum
efficiency and improve continuously. An algorithm identifies the keywords in the request and walks down the
branches of a flowchart (conceptually a tree data structure) that best match those keywords, so as to reach a
meaningful conclusion. If that fails, it searches another branch; if that fails too, it asks whether the user wants
results from the Web. Siri has not reached the point of being a truly conversational app, but its code contains
numerous conditional statements that respond according to the user's actions. A toy sketch of this
keyword-driven branching is given below.
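The sketch below illustrates the kind of keyword-driven tree walk described in step 2, assuming a small hand-built tree of command categories. The tree and the branch names are hypothetical; Siri's actual back end is proprietary and far more sophisticated.

// A branch of the conceptual flowchart: a name, the keywords that
// select it, and its more specific children.
struct Branch {
    let name: String
    let keywords: Set<String>
    let children: [Branch]
}

// Score each child by keyword overlap, descend into the best non-zero
// match, and fall back to offering a web search when nothing matches.
func resolve(words: Set<String>, in branch: Branch) -> String {
    let best = branch.children
        .map { (child: $0, score: $0.keywords.intersection(words).count) }
        .filter { $0.score > 0 }
        .max { $0.score < $1.score }
    guard let match = best else {
        // A leaf is a conclusion; an unmatched inner node means "ask the Web".
        return branch.children.isEmpty ? branch.name : "Would you like me to search the web?"
    }
    return resolve(words: words, in: match.child)
}

// Hypothetical fragment of such a tree.
let root = Branch(name: "root", keywords: [], children: [
    Branch(name: "weather report", keywords: ["weather", "rain", "forecast"], children: []),
    Branch(name: "reminders", keywords: ["remind", "reminder", "schedule"], children: [
        Branch(name: "create reminder", keywords: ["remind", "set"], children: []),
        Branch(name: "list reminders", keywords: ["show", "list"], children: [])
    ])
])

print(resolve(words: ["remind", "me", "to", "call", "mom"], in: root))  // "create reminder"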
3. Understand what the statement implies – If you ask Siri "Is there any nearby Indian restaurant?", it will
check for one through GPS, but what if you said "I love Margherita"? A human easily understands that this
refers to a type of pizza, but an AI will not get it unless it is built to. A developer writing an artificial
intelligence must decide how sophisticated the machine ought to be. It must also be able to recognize the difference
between words like byte and bite, sheep and ship, or dear and deer. Siri is able to relate words to one another
as nouns, adjectives, and verbs, and to the sentence as a whole. For instance, if you say "How fast can a deer
run?", it will match these words and conclude that the word cannot be "dear"; likewise, if you say "I'll bite
you", it will understand that there is no possibility that the word is "byte". There is nothing surprising about
such statements: people play with anything they find new, and it is no wonder that they speak any random
thought out loud. Thus, Siri is able to understand the context of a sentence efficiently and process it
accordingly. A deliberately simplified sketch of such context-based disambiguation is given below.
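The sketch below models context-based disambiguation in the spirit of the "deer vs. dear" and "bite vs. byte" examples above. The context-word lists are invented for illustration; a real recognizer uses statistical language models rather than fixed keyword sets.

// For each homophone pair, a set of words that typically appear near
// each spelling.
let contextClues: [String: [String: Set<String>]] = [
    "deer/dear": [
        "deer": ["run", "fast", "forest", "animal"],
        "dear": ["friend", "letter", "my", "love"]
    ],
    "bite/byte": [
        "bite": ["teeth", "dog", "you", "food"],
        "byte": ["memory", "data", "file", "bit"]
    ]
]

// Pick whichever spelling shares the most words with the rest of the
// sentence; ties are broken arbitrarily.
func disambiguate(pair: String, sentenceWords: Set<String>) -> String? {
    guard let candidates = contextClues[pair] else { return nil }
    return candidates.max { lhs, rhs in
        lhs.value.intersection(sentenceWords).count <
        rhs.value.intersection(sentenceWords).count
    }?.key
}

let sentence: Set<String> = ["how", "fast", "can", "a", "run"]
print(disambiguate(pair: "deer/dear", sentenceWords: sentence) ?? "unknown")  // prints "deer"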
4. Action based on what was commanded – This is the most challenging part. Siri, or any other AI assistant
one might plan to develop, must understand what the user says; if it fails, it might drag the user into a
potentially dangerous situation. For instance, if asked to book a flight, it must be capable of understanding the
request and of interacting with other apps to perform the given task. It must also avoid interacting with services
that are not of interest to the user, especially those that involve credit or debit card payments. One could be in
real trouble if the assistant does not behave appropriately.
There are other voice assistants as well, but they have not yet reached the standard of accuracy that SIRI
currently offers.
With the announcement of iOS 10 in June 2016, Apple opened up limited third-party developer access to
Siri through a dedicated application programming interface (API) [9].
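A minimal sketch of what that third-party access looks like in practice is given below: an app extension class adopting one of the SiriKit intent-handling protocols (here, sending a message). The class name is hypothetical; a real extension would also resolve recipients and message content and declare its supported intents in the extension's Info.plist.

import Foundation
import Intents

// Hypothetical handler for the "send message" SiriKit intent.
class MessageIntentHandler: NSObject, INSendMessageIntentHandling {

    // Siri calls this before committing to the action, so the app can
    // verify that it is actually able to send the message.
    func confirm(intent: INSendMessageIntent,
                 completion: @escaping (INSendMessageIntentResponse) -> Void) {
        completion(INSendMessageIntentResponse(code: .ready, userActivity: nil))
    }

    // Siri calls this once the user has confirmed; the app performs the
    // action and reports success or failure back to Siri.
    func handle(intent: INSendMessageIntent,
                completion: @escaping (INSendMessageIntentResponse) -> Void) {
        // Placeholder: hand the message off to the app's messaging code here.
        completion(INSendMessageIntentResponse(code: .success, userActivity: nil))
    }
}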
D. Pros and Cons of SIRI
The advantages of Apple's Siri are:
Siri's original release on iPhone 4S received praise for its voice recognition and contextual knowledge of
user information, including calendar appointments.
Siri is an easier, faster way to get things done.
Siri is always with you on your iPhone, iPad, Mac, Apple Watch, and Apple TV, ready to help throughout
your day.
The imperfections in Apple's Siri are:
Siri's original release on iPhone 4S was criticized for requiring stiff user commands and having a lack of
flexibility.
It was also criticized for lacking information on certain nearby places, and for its inability to understand
certain English accents.
A number of media reports have indicated that Siri is lacking in innovation, particularly against new
competing voice assistants from other technology companies.
ACKNOWLEDGEMENT
We would like to thank our HOD, Mrs. Yasmeen, for providing us with useful and valuable comments on the
study. This study would not have been possible without her support. All remaining errors, if any, are our
responsibility. The usual disclaimer applies.
REFERENCES
[1] R. Belvin, R. Burns, and C. Hein, “Development of the HRL route navigation dialogue system,” in
Proceedings of ACL-HLT, 2001, pp. 1–5.
[2] V. Zue, S. Seneff, J. R. Glass, J. Polifroni, C. Pao, T. J. Hazen, and L. Hetherington, "JUPITER: A
Telephone-Based Conversational Interface for Weather Information," IEEE Transactions on Speech and Audio
Processing, vol. 8, no. 1, pp. 85-96, 2000.
[3] M. Kolss, D. Bernreuther, M. Paulik, S. Stücker, S. Vogel, and A. Waibel, "Open Domain Speech
Recognition & Translation: Lectures and Speeches," in Proceedings of ICASSP, 2006.
[4] D. R. S. Caon, T. Simonnet, P. Sendorek, J. Boudy, and G. Chollet, "vAssist: The Virtual Interactive
Assistant for Daily Home-Care," in Proceedings of pHealth, 2011.
[5] D. Crevier, AI: The Tumultuous History of the Search for Artificial Intelligence. New York, NY: Basic
Books, 1993, ISBN 0-465-02997-3.
[6] E. Sadun and S. Sande, Talking to Siri: Mastering the Language of Apple's Intelligent Assistant. Que
Publishing, 2014.
[7] https://www.quora.com/How-does-Siri-work-2.
[8] https://www.cs.montana.edu/mwittie/publications/Assefi15Experimental.pdf.
[9] https://link.springer.com/chapter/10.1007/978-1-4614-6018-3_7.