SoP Draft 5

The document is a statement of purpose from an applicant for a graduate program in data science. It summarizes the applicant's background in data analysis projects at previous jobs, highlights their inclination towards problem solving and mathematics from an early age, and expresses a desire to further their education in data science through advanced coursework and exploration of deep learning principles. The applicant believes the graduate program will provide expertise to pursue opportunities as a data scientist.

Uploaded by

Nishu Garg
Copyright
© All Rights Reserved

Statement of Purpose

According to Forbes, more data has been created in the past two years than in the entire previous history
of the human race, and information security is soon to become a $100 billion industry. This is a
testament to the worth of data. I came to realise this during my short stint in the vast and potent field of
data science. Being conversant with the basic concepts of data analysis, I look forward to strengthening my
core competencies and equipping myself with advanced data analysis tools through a <<GRADUATE
PROGRAM>> in <<UNIVERSITY>>.
Right from school, I had an inclination towards puzzle solving, which nourished my aptitude for
mathematics. I scored a centum in my 10th grade board exams, qualified for the national level of the
International Mathematics Olympiad and was a finalist for the NTSE and KVPY fellowships, which encourage
academics and research in mathematics and applied sciences. Subsequently, I pursued my undergraduate
studies in the Department of Engineering Design at the prestigious IIT Madras. Courses like Functional and
Conceptual Design, Linear Algebra and Optimization, Introduction to Computation and Visualization,
and Numerical Methods introduced me to non-numeric data and techniques to utilize it. The
opportunity to test their effectiveness presented itself during my internship with the Institute for Financial
Management and Research, where my team was charged with determining fair meter rates for
the local autorickshaws, which were notorious for fleecing customers. We conducted a survey and
came up with a simple linear model to fix the rates, which were later presented to the Transport
Commissioner of Chennai and implemented throughout the state. This was my first encounter with data
analysis. I used my subsequent academic projects to explore it further, such as:
• A project in Information and Communication Technology for Development to identify potential
customers for organic products.
• A power and dimension optimization for the mini excavator that my team built for Product Design
Lab 2 as well as the multifunctional hospital bed that I designed for Sundaram Medical Devices,
during my academic internship.
My enthusiasm for data analysis led Cognizant Corporation to offer me a job in its Data Sciences
division, where I was trained in the basic statistical concepts and tools used in data analysis. I scaled the
steep learning curve and secured Wiley's certification in Advanced Analytics, Applied Machine Learning, and
Big Data.
My professional data science career commenced with my first project, for a human resource consultant.
The task was to shortlist the right candidates for job interviews from 1.8 million resumes and profiles
created on their website. There were two segments to the problem: one was to cluster similar jobs
together based on their job descriptions, and the other was to identify candidates that fit each job cluster.
Our team's first step was to perform hierarchical clustering on skill sets, which trimmed 250 jobs down to
35 clusters. I then employed phrase extraction and entity recognition to extract educational status,
expected pay and length of previous experience from the resumes, and co-word analysis and topic
modelling to extract skills. Feature selection was essential in narrowing the variables down to the 32 most
important ones. Together with my team, I developed logistic regression models for each cluster to
estimate the probability of a candidate being placed, using historical placement records; this improved
the screening efficiency by 12%. I automated the system end-to-end so that new candidates, and
candidates who had updated their information, were scored afresh every 15 minutes. I further
designed the machine learning engine to relearn every month, compare itself with the incumbent
model and retain whichever performed better. This engagement helped me learn the ropes of a real-world
project by acquainting me with IT concepts like database management, and data science concepts like
handling unstructured data, data extraction, cleaning, model building and visualization.
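The monthly relearn-and-compare step mentioned above can be illustrated with a minimal sketch. It is not the production engine: the function names, the accuracy metric and the toy models are all assumptions made for illustration, and models are represented as plain callables.

```python
def accuracy(model, examples):
    """Fraction of (features, label) pairs the model classifies correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def retain_better(incumbent, challenger, holdout, metric=accuracy):
    """Return the challenger only if it beats the incumbent on the holdout set.

    Ties go to the incumbent, so the live model changes only on a real gain.
    """
    if metric(challenger, holdout) > metric(incumbent, holdout):
        return challenger
    return incumbent


# Toy usage: each model maps a single score to a 0/1 label (hypothetical).
holdout = [(0.2, 0), (0.6, 1), (0.9, 1)]
incumbent = lambda score: int(score > 0.8)   # stands in for the current model
challenger = lambda score: int(score > 0.5)  # stands in for the retrained model
live_model = retain_better(incumbent, challenger, holdout)
```

Scoring both models on the same held-out data keeps the comparison fair; any other metric with the same signature could be passed in place of `accuracy`.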
My eagerness to traverse the ocean of data science was rewarded with an offbeat project for a
beverage distributor, which was to identify ideal sites for retail stores in Iowa. The data contained the GPS
locations of prominent civil structures like hospitals, churches and schools, and the demographic
distribution of the region. I divided the state into 100 regions with 4 sub-regions each (North, East, West
and South centres) and derived a table of sites and the proximity of structures around each. I developed a
Huff gravity model to estimate the average footfall each site would receive based on the above
information, married it to the demographics and sales data of pre-existing stores, and built a linear model
to identify the sites likely to generate maximum sales. While this assignment introduced me to spatial
analytics, where I learned to create spatial polygon maps, I wished I had the in-depth academic knowledge
to devise a better model, and I resolved to equip myself with stronger theoretical foundations.
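For readers unfamiliar with the Huff gravity model, here is a minimal sketch of the standard textbook form: the share of an origin's visitors that a site captures grows with the site's attractiveness and falls with distance. The exponents `alpha` and `beta`, the attractiveness scores and the toy numbers are assumptions for illustration, not the values used in the project.

```python
def huff_probabilities(attractiveness, distances, alpha=1.0, beta=2.0):
    """Return probs[i][j]: probability that origin i chooses site j.

    attractiveness[j] is a positive score for site j (assumed input);
    distances[i][j] is the positive distance from origin i to site j.
    """
    probs = []
    for row in distances:
        utilities = [a ** alpha / d ** beta for a, d in zip(attractiveness, row)]
        total = sum(utilities)
        probs.append([u / total for u in utilities])
    return probs


def expected_footfall(population, probs):
    """Expected visitors per site: population of each origin times its share."""
    n_sites = len(probs[0])
    return [
        sum(pop * row[j] for pop, row in zip(population, probs))
        for j in range(n_sites)
    ]


# Toy usage: two origins, two equally attractive candidate sites.
probs = huff_probabilities([1.0, 1.0], [[1.0, 1.0], [1.0, 2.0]])
footfall = expected_footfall([100, 200], probs)
```

The per-site footfall estimates could then serve as one input feature to a downstream sales model, alongside demographics.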
My earlier work led me to a telecommunications giant, where I probed further into data analysis. I
was tasked with reducing the sales and service call volume using online customer interactions across
multiple channels. I applied PrefixSpan to identify the most frequent sequences and alert the client once
their support crossed a threshold, which helped reduce the sales and service call volume by 8%. The next
task was to identify potential billing-related calls using the historical bill records of 5 million customers. I
merged these with the customers' demographics and product subscription data, and built an ensemble of
random forest and gradient boosted trees to calculate the probability of a customer making a call and
identify the prominent driving factors. Using those features, I categorized the population into 20 segments
ranging from most unlikely to call to most likely to call. Here, again, my limited understanding compelled
me to use an ensemble to present better results, instead of fine-tuning the individual models.
On the strength of my performance thus far, I was assigned a bigger challenge: highlighting the paths taken
by customers on the website that culminated in an order. I designed a second-order Markov chain based
multi-channel attribution model to isolate the major channel transitions and the obstacles encountered in
the journey, and to use the transition probabilities to choose an appropriate link to re-route the customer
to a path with a higher probability of success. The mathematical complexity that higher-order Markov
chains bring confined me to the second order.
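The core of a second-order Markov chain is that the next step depends on the previous two. A minimal sketch of estimating such transition probabilities from observed journeys follows; the channel names and journeys are invented for illustration, and the attribution and re-routing logic built on top is not shown.

```python
from collections import Counter, defaultdict


def second_order_transitions(journeys):
    """Estimate P(next | previous, current) from lists of visited channels."""
    counts = defaultdict(Counter)
    for path in journeys:
        # Slide a window of three steps: (previous, current) -> next.
        for prev, cur, nxt in zip(path, path[1:], path[2:]):
            counts[(prev, cur)][nxt] += 1
    return {
        state: {nxt: n / sum(ctr.values()) for nxt, n in ctr.items()}
        for state, ctr in counts.items()
    }


# Toy usage with hypothetical channel names.
journeys = [
    ["start", "search", "email", "order"],
    ["start", "search", "email", "exit"],
    ["start", "search", "display", "exit"],
]
transitions = second_order_transitions(journeys)
p_order = transitions[("search", "email")]["order"]
```

The state space grows with the square of the number of channels at second order and with the cube at third order, which is one reason higher orders quickly become hard to estimate reliably.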
In an attempt to make it even easier to place an order, I was asked to come up with a product
recommendation system. I proposed a hybrid of content-based suggestions, which used the user's web
searches to capture their preferences, and collaborative filtering to promote products bought by other
similar users. The user preferences were captured in the form of latent features using matrix factorisation
and matched using cosine similarity.
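The matching step can be sketched as below, assuming the latent feature vectors have already been produced by a matrix factorisation step that is not shown here. The user names and vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_user(target, others):
    """Return the key in `others` whose latent vector is closest to `target`."""
    return max(others, key=lambda name: cosine_similarity(target, others[name]))


# Toy usage: two-dimensional latent vectors for two hypothetical users.
others = {"user_a": [0.9, 0.1], "user_b": [0.0, 1.0]}
nearest = most_similar_user([1.0, 0.2], others)
```

Cosine similarity compares the direction of two preference vectors rather than their magnitude, so a heavy and a light user with the same tastes still match.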
I was excited by each challenge I faced, the learning opportunities it brought, the algorithms I
got to experiment with, and the chance to draw out the hidden story that data can narrate, which is not
evident in plain sight. These experiences also reinforced my belief that I could have achieved better results
with a comprehensive understanding of the mathematics behind the algorithms. Over the past 30 months, I
have acclimatized to the norms of the IT and data science industry and developed a strong foothold in basic
data analysis routines. I intend to build on this foundation and delve into the frontiers of the domain.
Until now, I have worked extensively with static datasets and static models. I wish to foray into the
paradigm of constantly evolving dynamic models and dive into deep learning principles. I believe a
graduate program in << GRADUATE PROGRAM / UNIVERSITY >> would provide me with the right
platform to explore the spectrum and choose my colour.
<< Paragraph about the program, courses and professors>>
I am a well-rounded individual who devotes due attention to life outside academia and work, being a
published short story writer and poet. I have captained the school and university hockey teams on
various competitive stages and won laurels. I have also contributed to social causes while
working with the National Service Scheme of IIT Madras and with Transparent Chennai, an NGO that
works to uplift roadside shop owners and vendors. I hope to add to the diversity of the community outside
academics too.
Such rich exposure to an array of data analysis techniques has made me realise the unending
possibilities that can be achieved with mastery over them. I believe that a master's program at
(<<UNIVERSITY NAME>>) will endow me with the technical and functional wherewithal to pursue
employment opportunities as (Data Scientist / Business intelligence professional / something Relevant)
in top-drawer firms specializing in << the specialization>>, and will help me navigate the volatile
adversities of the industry and come out with flying colours.
