
A

Case Study Report

On
Software Engineering for Machine Learning

Submitted in partial fulfilment of the requirement of the University of Mumbai


for the Degree of Bachelor of Engineering

(Computer Engineering)

By
Pratiksha Ahire
Mrunali Shelar

Under the guidance of

Prof. Kalidas Bhavale

Department of Computer Engineering


Suman Educational Trust’s
Dilkap Research Institute of Engineering & Management Studies
Mamdapur, Post: Neral, Taluka: Karjat, Dist: Raigad-410101
University of Mumbai
ACADEMIC YEAR 2022-2023

Dilkap Research Institute of Engineering & Management Studies

Department of Computer Engineering
Academic Year 2022-2023

CERTIFICATE

This is to certify that

Ms. Pratiksha Ahire,

Ms. Mrunali Shelar,

Sem. VII, BE Computer, Roll Nos. 01, 59, have satisfactorily completed the requirements of the
Mini Project entitled

"Software Engineering for Machine Learning"

as prescribed by the University of Mumbai, under the guidance of

Prof. Kalidas Bhavale

Guide HOD

(Prof. Kalidas Bhavale) (Prof. Indira Joshi)

Internal Examiner External Examiner


ABSTRACT

Recent advances in machine learning have stimulated widespread interest
within the Information Technology sector in integrating AI capabilities into
software and services. This goal has forced organizations to evolve their
development processes. We report on a study that we conducted observing
software teams at Microsoft as they develop AI-based applications. We consider
a nine-stage workflow process informed by prior experiences developing AI
applications (e.g., search and NLP) and data science tools (e.g., application
diagnostics and bug reporting). We found that various Microsoft teams have
integrated this workflow into pre-existing, well-evolved, Agile-like software
engineering processes, providing insights about several essential engineering
challenges that organizations may face in creating large-scale AI solutions for
the marketplace. We collected some best practices from Microsoft teams to
address these challenges. In addition, we have identified three aspects of the AI
domain that make it fundamentally different from prior software application
domains: 1) discovering, managing, and versioning the data needed for machine
learning applications is much more complex and difficult than in other types of
software engineering; 2) model customization and model reuse require very
different skills than are typically found in software teams; and 3) AI
components are more difficult to handle as distinct modules than traditional
software components: models may be “entangled” in complex ways and
experience non-monotonic error behaviour. We believe that the lessons learned
by Microsoft teams will be valuable to other organizations.
INTRODUCTION

Personal computing. The Internet. The Web. Mobile computing. Cloud
computing. Nary a decade goes by without a disruptive shift in the dominant
application domain of the software industry. Each shift brings with it new
software engineering goals that spur software organizations to evolve their
development practices in order to address the novel aspects of the domain. The
latest trend to hit the software industry is around integrating artificial
intelligence (AI) capabilities based on advances in machine learning. AI broadly
includes technologies for reasoning, problem solving, planning, and learning,
among others. Machine learning refers to the statistical modelling techniques that
have powered recent excitement in the software and services marketplace.
Microsoft product teams have used machine learning to create application suites
such as Bing Search or the Cortana virtual assistant, as well as platforms such as
Microsoft Translator for real-time translation of text, voice, and video,
Cognitive Services for vision, speech, and language understanding for building
interactive, conversational agents, and the Azure AI platform to enable
customers to build their own machine learning applications [1]. To create these
software products, Microsoft has leveraged its pre-existing capabilities in AI
and developed new areas of expertise across the company.
II. BACKGROUND
A. Software Engineering Processes
The changing application domain trends in the software industry have
influenced the evolution of the software processes practiced by teams at
Microsoft. For at least a decade and a half, many teams have used
feedback-intense Agile methods to develop their software [2], [3], [4]
because they needed to be responsive in addressing changing customer
needs through faster development cycles. Agile methods have been
helpful in supporting further adaptation, for example, the most recent shift
to re-organize numerous teams' practices around DevOps [5], which
better matched the needs of building and supporting cloud computing
applications and platforms. The change to DevOps occurred fairly
quickly because these teams were able to leverage prior capabilities in
continuous integration and diagnostic gathering, making it simpler to
implement continuous delivery.

B. ML Workflow
One commonly used machine learning workflow at Microsoft has been
depicted in various forms across industry and research [1], [9], [10], [11].
It has commonalities with prior workflows defined in the context of data
science and data mining, such as TDSP [12], KDD [13], and CRISP-DM
[14]. Despite the minor differences, these representations have in
common the data-centred essence of the process and the multiple
feedback loops among the different stages.
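
To make the workflow concrete, the following minimal Python sketch enumerates the nine stages reported in the underlying Microsoft study and the kind of feedback edges the text describes; the stage list follows that study, while the FEEDBACK mapping and the printing loop are illustrative assumptions, not tooling observed in it.

    # Minimal sketch of the nine-stage ML workflow; stage names follow
    # the underlying Microsoft study, while the feedback edges below are
    # an illustrative assumption showing how later stages loop back.
    STAGES = [
        "model requirements", "data collection", "data cleaning",
        "data labeling", "feature engineering", "model training",
        "model evaluation", "model deployment", "model monitoring",
    ]
    FEEDBACK = {  # hypothetical loops back to earlier stages
        "model evaluation": ["model training", "feature engineering"],
        "model monitoring": ["data collection"],
    }
    for stage in STAGES:
        back = FEEDBACK.get(stage, [])
        print(stage + (" -> back to " + ", ".join(back) if back else ""))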

C. Software Engineering for Machine Learning


The need for adjusting software engineering practices in the recent era
has been discussed in the context of hidden technical debt [18] and
troubleshooting integrative AI [19], [20]. This work identifies various
aspects of ML system architecture and requirements which need to be
considered during system design. Some of these aspects include hidden
feedback loops, component entanglement and eroded boundaries,
non-monotonic error propagation, continuous quality states, and
mismatches between the real world and evaluation sets.
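
To illustrate non-monotonic error propagation, the toy Python sketch below (with made-up numbers, not data from the cited work) chains two components in which the downstream one was tuned to cancel the upstream one's bias; "improving" the upstream component in isolation then worsens the end-to-end error.

    # Toy two-component pipeline: component B subtracts the bias that
    # version 1 of component A was known to add. The numbers are
    # invented purely to demonstrate the non-monotonic effect.
    def component_a_v1(x):
        return x + 0.5          # biased upstream estimate
    def component_a_v2(x):
        return x + 0.1          # locally "better": smaller bias
    def component_b(a_out):
        return a_out - 0.5      # tuned to cancel v1's bias
    inputs = [1.0, 2.0, 3.0]
    for a in (component_a_v1, component_a_v2):
        err = sum(abs(component_b(a(x)) - x) for x in inputs) / len(inputs)
        print(a.__name__, "end-to-end error:", round(err, 2))
    # v1 -> 0.0, v2 -> 0.4: the improved component degrades the pipeline.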
D. Process Maturity
Software engineers face a constantly changing set of platforms
and technologies that they must learn to build the newest applications for
the software marketplace. Some engineers learn new methods and
techniques in school, and bring them to the organizations they work for.
Others learn new skills on the job or on the side, as they anticipate their
organization's need for latent talent [25]. Software teams, composed of
individual engineers with varying amounts of experience in the skills
necessary to professionally build ML components and their support
infrastructure, themselves exhibit varying levels of proficiency
depending on their aggregate experience in the domain.
III. STUDY

We collected data in two phases: an initial set of interviews to gather the
major topics relevant to our research questions and a wide-scale survey about
the identified topics. Our study design was approved by Microsoft's Ethics
Advisory Board.

A. Interviews
Because the work practice around building and integrating machine
learning into software and services is still emerging and is not uniform
across all product teams, there is no systematic way to identify the key
stakeholders on the topic of adoption. We therefore used a snowball
sampling strategy, starting with (1) leaders of teams with mature use of
machine learning (ML) (e.g., Bing), (2) leaders of teams where AI is a
major aspect of the user experience (e.g., Cortana), and (3) people
conducting company-wide internal training in AI and ML.

B. Survey
Based on the results of the interviews, we designed an open-ended
questionnaire whose focus was on existing work practice, challenges in
that work practice, and best practices (Figure 2). We asked about
challenges both directly and indirectly by asking informants to imagine
“dream tools” and improvements that would make their work practice
better. We sent the questionnaire to 4195 members of internal mailing
lists on the topics of AI and ML.
IV. APPLICATIONS OF AI

Many teams across Microsoft have augmented their applications with
machine learning and inference, some in surprising domains. We asked
survey respondents for the ways that they used AI on their teams. We card
sorted this data twice, once to capture the application domain in which AI was
being applied, and a second time to look at the (mainly) ML algorithms used to
build that application. We found AI is used in traditional areas such as search,
advertising, machine translation, predicting customer purchases, voice
recognition, and image recognition, but also saw it being used in novel areas,
such as identifying customer leads, providing design advice for presentations
and word processing documents, providing unique drawing features, healthcare,
and improving gameplay. In addition, machine learning is being used heavily in
infrastructure projects to manage incident reporting, identify the most likely
causes for bugs, monitor fraudulent fiscal activity, and monitor network
streams for security breaches.
V. BEST PRACTICES WITH MACHINE LEARNING
IN SOFTWARE ENGINEERING

In this section, we present our respondents' viewpoints on some of the
essential challenges associated with building large-scale ML applications and
platforms and how they address them in their products. We categorized the
challenges by card sorting the interview and survey free-response questions, and
then used our own judgment as software engineering and AI researchers to
highlight those that are essential to the practice of AI on software teams.

A. End-to-End Pipeline Support
As machine learning components have become
more mature and integrated into larger software systems, our participants
recognized the importance of integrating ML development support into the
traditional software development infrastructure. They noted that having a
seamless development experience covering (possibly) all the different stages
described in Figure 1 was important to automation. However, achieving this
level of integration can be challenging because of the different characteristics of
ML modules compared with traditional software components. For example,
previous work in this field [18], [19] found that variation in the inherent
uncertainty (and error) of data-driven learning algorithms and complex
component entanglement caused by hidden feedback loops could impose
substantial changes, even in specific stages which were previously well
understood in software engineering (e.g., specification, testing, and debugging).
Nevertheless, due to the experimental and even more iterative
nature of ML development, unifying and automating the day-to-day workflow
of software engineers reduces overhead and facilitates progress in the field.
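
As a hedged sketch of what such unification might look like in practice, the Python fragment below gates deployment on an evaluation metric so the train-evaluate-deploy loop runs without manual glue; the function names, stub implementations, and the 0.90 threshold are hypothetical, not tools or numbers reported by the respondents.

    # Hypothetical evaluation-gated pipeline step; train_model,
    # evaluate, and deploy stand in for whatever stage implementations
    # a team already has.
    ACCURACY_GATE = 0.90  # assumed quality bar, chosen for illustration

    def run_pipeline(train_data, eval_data, train_model, evaluate, deploy):
        model = train_model(train_data)
        accuracy = evaluate(model, eval_data)
        if accuracy >= ACCURACY_GATE:
            deploy(model)          # ship only models that clear the gate
            return True, accuracy
        return False, accuracy     # otherwise surface the result and iterate

    # Toy usage with stub implementations:
    shipped, acc = run_pipeline(
        train_data=[(0, 0), (1, 1)], eval_data=[(0, 0), (1, 1)],
        train_model=lambda data: dict(data),  # "model" memorizes pairs
        evaluate=lambda m, d: sum(m.get(x) == y for x, y in d) / len(d),
        deploy=lambda m: print("deployed:", m),
    )
    print("shipped:", shipped, "accuracy:", acc)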

B. Data Availability, Collection, Cleaning, and Management
Since many machine
learning techniques are centred around learning from large datasets, the
success of ML-centric projects often heavily depends on data availability,
quality, and management [28]. Labelling datasets is costly and time consuming,
so it is important to make them available for use within the company (subject to
compliance constraints). Our respondents confirm that it is important to “reuse
the data as much as possible to reduce duplicated effort.” In addition to
availability, our respondents focus most heavily on supporting the following
data attributes: “accessibility, accuracy, authoritativeness, freshness, latency,
structuredness, ontological typing, connectedness, and semantic joinability.”
Automation is a vital cross-cutting concern, enabling teams to more efficiently
aggregate data, extract features, and synthesize labelled examples. The increased
efficiency enables teams to “speed up experimentation and work with live data
while they experiment with new models.”
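
One minimal way to support the reuse and the data attributes the respondents describe is to catalogue dataset versions by content hash together with their metadata; the Python sketch below is an illustrative assumption, not the internal tooling the study observed.

    import hashlib
    import json

    # Hypothetical in-memory dataset catalogue keyed by (name, version),
    # where the version is a hash of the serialized content; a real
    # system would persist this and enforce compliance constraints.
    CATALOG = {}

    def register_dataset(name, records, metadata):
        blob = json.dumps(records, sort_keys=True).encode("utf-8")
        version = hashlib.sha256(blob).hexdigest()[:12]  # content-derived
        CATALOG[(name, version)] = {"records": records, "meta": metadata}
        return version

    v = register_dataset(
        "support-tickets",  # made-up dataset name
        [{"text": "app crashes on login", "label": "bug"}],
        {"freshness": "2022-10-01", "source": "internal tracker"},
    )
    print("registered version:", v)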

C. Education and Training
As the integration of machine learning
becomes more ubiquitous in customer-facing products (machine
learning components are now widely used in productivity software such as email
and word processing, and in embedded devices for edge computing), engineers
with traditional software engineering backgrounds need to learn how to work
alongside ML specialists. A variety of players within Microsoft have
found it valuable to scaffold their engineers' education in a number
of ways. First, the company hosts a twice-yearly internal conference on machine
learning and data science, with at least one day devoted to introductions to the
basics of technologies, algorithms, and best practices. In addition, employees
give talks about internal tools and the engineering details behind novel projects
and product features, and researchers present cutting-edge advances they have
seen at, and contributed to, academic conferences. Second, a number of Microsoft
teams host weekly open forums on machine learning and deep learning,
enabling practitioners to get together and learn more about AI. Finally, mailing
lists and online forums with thousands of participants enable anyone to ask and
answer technical and pragmatic questions about AI and machine learning, as
well as frequently share recent results from academic conferences.

D. Model Debugging and Interpretability
Debugging activities for components
that learn from data focus not only on programming bugs, but also on
inherent issues that arise from model errors and uncertainty. Understanding
when and how models fail to make accurate predictions is an active research
area [29], [30], [31], which is attracting more attention as ML algorithms and
optimization techniques become more complex. Several survey respondents and
the larger Explainable AI community [32], [33] propose to use more
interpretable models, or to develop visualization techniques that make
black-box models more interpretable. For larger, multi-model systems, respondents
apply modularization in a conventional, layered, and tiered software
architecture to simplify error analysis and debuggability.
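
A simple form of the error analysis that such modularization enables is slicing the evaluation data by an attribute and comparing per-slice accuracy to locate where a model fails; the records and stub predictor in the Python sketch below are made up for illustration.

    from collections import defaultdict

    # Toy slice-based error analysis: group evaluation examples by an
    # attribute and compute accuracy per slice.
    def accuracy_by_slice(records, predict, slice_key):
        hits, totals = defaultdict(int), defaultdict(int)
        for rec in records:
            s = rec[slice_key]
            totals[s] += 1
            hits[s] += int(predict(rec) == rec["label"])
        return {s: hits[s] / totals[s] for s in totals}

    records = [  # invented evaluation examples
        {"lang": "en", "label": 1}, {"lang": "en", "label": 0},
        {"lang": "de", "label": 1}, {"lang": "de", "label": 1},
    ]
    stub_predict = lambda rec: 1 if rec["lang"] == "en" else 0  # toy model
    print(accuracy_by_slice(records, stub_predict, "lang"))
    # {'en': 0.5, 'de': 0.0} -- the model fails badly on the 'de' slice.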

VI. LIMITATIONS

Our case study was conducted with teams at Microsoft, a large, worldwide
software company with a diverse portfolio of software products. It is also one of
the largest purveyors of machine learning-based products and platforms. Some
findings are likely to be specific to the Microsoft teams and team members who
participated in our interviews and surveys. Nevertheless, given the high variety
of projects represented by our informants, we expect that many of the lessons
we present in this paper will apply to other companies. Some of our findings
depend on the particular ML workflow used by some software teams at
Microsoft. The reader should be able to identify how our model abstractions fit
into the particulars of the models they use. Finally, interviews and surveys rely
on self-selected informants and self-reported data. Wherever appropriate, we
stated that findings were our informants’ perceptions and opinions. This is
especially true with this implementation of our ML process maturity model,
which triangulated its measures against other equally subjective measures with
no objective baseline. Future implementations of the maturity model should
endeavour to gather objective measures of team process performance and
evolution.
VII. CONCLUSION

Many teams at Microsoft have put significant effort into developing an
extensive portfolio of AI applications and platforms by integrating machine
learning into existing software engineering processes and cultivating and
growing ML talent. In this paper, we described the results of a study to learn
more about the process and practice changes undertaken by a number of
Microsoft teams in recent years. From these findings, we synthesized a set of
best practices to address issues fundamental to the large-scale development and
deployment of ML-based applications. Some reported issues were correlated
with the respondents' experience with AI, while others were applicable to most
respondents building AI applications. We presented an ML process maturity
metric to help teams self-assess how well they work with machine learning and
offer guidance towards improvements. Finally, we identified three aspects of the
AI domain that make it fundamentally different from prior application domains.
Their impact will require significant research efforts to address in the future.
REFERENCES

[1] M. Salvaris, D. Dean, and W. H. Tok, “Microsoft AI Platform,” in Deep
Learning with Azure. Springer, 2018, pp. 79–98.
[2] A. Begel and N. Nagappan, “Usage and perceptions of agile software
development in an industrial context: An exploratory study,” in First
International Symposium on Empirical Software Engineering and Measurement
(ESEM 2007), Sept 2007, pp. 255–264.
[3] A. Begel and N. Nagappan, “Pair programming: What’s in it for me?” in
Proc. of the Second ACM-IEEE International Symposium on Empirical
Software Engineering and Measurement, 2008, pp. 120–128.
[4] B. Murphy, C. Bird, T. Zimmermann, L. Williams, N. Nagappan, and A.
Begel, “Have agile techniques been the silver bullet for software development at
Microsoft?” in 2013 ACM/IEEE Intl. Symp. on Empirical Software
Engineering and Measurement, Oct 2013, pp. 75–84.
[5] M. Senapathi, J. Buchan, and H. Osman, “DevOps capabilities, practices, and
challenges: Insights from a case study,” in Proc. of the 22nd International
Conference on Evaluation and Assessment in Software Engineering 2018, 2018,
pp. 57–67.
[6] T. D. LaToza, G. Venolia, and R. DeLine, “Maintaining mental models: A
study of developer work habits,” in Proc. of the 28th International Conference
on Software Engineering, 2006, pp. 492–501.
[7] M. Kim, T. Zimmermann, R. DeLine, and A. Begel, “The emerging role of
data scientists on software development teams,” in Proc. of the 38th
International Conference on Software Engineering, 2016, pp. 96–107.
[8] M. Kim, T. Zimmermann, R. DeLine, and A. Begel, “Data scientists in
software teams: State of the art and challenges,” IEEE Transactions on Software
Engineering, vol. 44, no. 11, pp. 1024–1038, 2018.
[9] C. Hill, R. Bellamy, T. Erickson, and M. Burnett, “Trials and tribulations of
developers of intelligent systems: A field study,” in 2016 IEEE Symposium on
Visual Languages and Human-Centric Computing (VL/HCC), 2016,
pp. 162–170.
