Reading Assignment - 474 Final

John Snow conducted one of the earliest data analysis exercises in 1854 by collecting cholera death counts in London and visualizing the data on a map. This revealed cholera was spread through water, not air as previously believed. Today we have vast amounts of digital data or "Big Data" that requires specialized analysis. Big Data is characterized by its large volume, variety, and velocity. Companies use tools like Hadoop and MongoDB to analyze unstructured Big Data for insights into operations, marketing, and more. Examples include Netflix's recommendations and Amazon's anticipatory shipping based on customer patterns. Key challenges are understanding and managing unstructured data at large scales while ensuring privacy and security.

Uploaded by

John Park

Big Data: An Introduction

-By Rahila Athanikar, Fall 2015 ISDS 474, CSUF


Perhaps one of the earliest and most impactful data collection and interpretation exercises was
conducted by the physician John Snow in Soho, London, in 1854. During the cholera outbreak,
when people were fleeing for their lives, Snow visited the affected localities and undertook one of
the riskiest data collection exercises ever. He went door to door, counting cholera-related deaths.
By plotting this data on a map of the city, Snow was able to show that cholera spread through water
and not through air, as was believed at the time. Without Snow's data visualization, cholera might
well have remained a dreadful reality today.

Cut to the modern world: data and its interpretation continue to make modern life, as we know it,
possible. Today we have no dearth of data. In fact, we face the other extreme: data available in
such extreme magnitudes, and of such an ‘untamed’ nature, that we have coined a name for it, ‘Big
Data’. In this paper, we describe Big Data, some implementation examples, and some issues and
challenges surrounding Big Data.

What is Big Data?

The term Big Data refers to large and complex data that may or may not be structured and that
cannot be processed using traditional data-processing applications. Most sources characterize
Big Data by the 3Vs: volume, variety, and velocity (the speed at which the data arrives and is
processed). Given the large volume and variety of the data, specialized data mining methods are used
to obtain insight, to predict, or to classify. In the business world, since Big Data has the
potential to help companies make more informed decisions, organizations are ready to take the leap
and invest in software techniques to gather and analyze it.

Analyzing Big Data and Related Software.

Structured Big Data may be analyzed with software tools commonly used for predictive analytics,
data mining, and text analytics. However, the traditional data analytics tools, relational databases, and
data processing warehouses fail to work with Big Data if it is unstructured or semi-structured.

Such unstructured Big Data is collected, analyzed, and processed with newer technologies and tools,
which are broadly either open source or proprietary. The most popular open source platform is
Hadoop, often used alongside NoSQL databases such as MongoDB and the R language. Paid offerings
include SAP, Teradata, Salesforce, Amazon Web Services EC2, Oracle, and IBM's Big Data platform.

Big Data analysis can be used for both basic and advanced insights. Data can simply be sliced, diced,
visualized, and monitored at a basic level, or it can be used for predictive modeling and
pattern matching. Analytics are also used to improve operational performance or simply to
drive revenue.
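
As a rough illustration of what "slicing and dicing" means in practice, the sketch below filters a handful of records on one dimension and then totals a measure over chosen dimensions. The records, field names, and helper functions (`slice_records`, `total_by`) are hypothetical and use only the Python standard library; a real Big Data workload would run the same idea on a distributed platform.

```python
from collections import defaultdict

# Hypothetical sales records; every field name and value is made up for illustration.
SALES = [
    {"region": "West", "product": "A", "quarter": "Q1", "revenue": 120},
    {"region": "West", "product": "B", "quarter": "Q1", "revenue": 80},
    {"region": "East", "product": "A", "quarter": "Q1", "revenue": 100},
    {"region": "East", "product": "A", "quarter": "Q2", "revenue": 150},
    {"region": "West", "product": "A", "quarter": "Q2", "revenue": 90},
]


def slice_records(records, **fixed):
    """Slice: keep only the records whose fields match every fixed value."""
    return [r for r in records if all(r[k] == v for k, v in fixed.items())]


def total_by(records, *dims, measure="revenue"):
    """Dice: total a measure over each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for r in records:
        key = tuple(r[d] for d in dims)
        totals[key] += r[measure]
    return dict(totals)


# Slice to the first quarter, then total revenue per region.
q1 = slice_records(SALES, quarter="Q1")
by_region = total_by(q1, "region")

# Dice the full data set by region and product at once.
by_region_product = total_by(SALES, "region", "product")

print(by_region)
print(by_region_product)
```

The keys of each result are tuples, so the same helper works for one dimension or several.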

Implementation Examples.

Big Data can be used across industries, by for-profit and non-profit organizations alike. It can
impact marketing, operations, finance, retail, new product development, and new market discovery;
the list is practically endless. With a huge influx of data at their disposal, most organizations
have deployed or are planning to deploy Big Data projects to analyze data and gain actionable
insights. One such example is Netflix, a giant in the entertainment business that uses data to gain
a competitive edge. Netflix’s successful recommendation system is backed by cutting-edge data
analytics.

In another recent example, the e-commerce giant Amazon was reported in January 2014 to have
patented what it calls ‘anticipatory shipping’. With this technology, backed by Big Data analysis,
Amazon would box and ship items that it predicts customers will want, based on previous orders,
product searches, wish lists, etc. This could help Amazon cut delivery times and gain an edge over
its rivals.

A certain fast food company (name not disclosed) is training cameras on its drive-through lanes.
Based on the length of the queue, the digital menu board is customized to showcase items that
people are likely to want to order given their wait time in the queue.
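
The drive-through example can be pictured as a simple rule that maps queue length to what the menu board displays. The sketch below is purely illustrative: the company's actual logic is not described in the source, so the function name `choose_menu`, the thresholds, and the rule that longer queues favor quicker items are all assumptions.

```python
def choose_menu(cars_in_queue, short_limit=3, long_limit=7):
    """Pick a menu category from the queue length.

    The thresholds and the category names are hypothetical; they only
    illustrate how a queue measurement could drive the menu board.
    """
    if cars_in_queue < 0:
        raise ValueError("queue length cannot be negative")
    if cars_in_queue <= short_limit:
        return "full menu"
    if cars_in_queue <= long_limit:
        return "popular combos"
    return "quick-to-prepare items"


# Example: queue lengths as a camera system might report them over time.
for cars in (1, 5, 10):
    print(cars, "->", choose_menu(cars))
```

In a real deployment the queue length would come from the camera analysis, and the categories would be tuned from order history rather than fixed by hand.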

There is no end to the list of astounding big data applications. But the common thread through all
these stories is that they are possible only because it has become possible to tinker with Big Data.

Key Issues and Challenges.

Most of the voluminous data available to organizations is unstructured and changes too fast,
challenging the processing capacity of existing database and software techniques. Understanding the
data is one of the key challenges organizations face. Visualization techniques can help make the
data easier to understand.

Data management is another issue: organizations must extract, store, and process data to
derive meaningful insights. Other issues include synchronizing data across platforms, dealing
with outliers, addressing data quality, and a shortage of talented Big Data analysts.
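
One common way to deal with outliers is the interquartile range (IQR) rule, which flags values that lie far outside the middle half of the data. The source does not prescribe a particular method, so the sketch below is only one reasonable choice; the readings and the function name `iqr_outliers` are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with one obviously suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95, 12]
outliers = iqr_outliers(readings)
print(outliers)
```

Whether a flagged value is an error to discard or a genuine event worth investigating is a data quality judgment that the rule itself cannot make.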

Data handling performance, scalability, data security, and workload diversity are a few of the
technical challenges that businesses face when dealing with Big Data. Implementing the right
technologies, ones flexible enough to accommodate business needs without compromising security or
performance, is especially important.

A challenge from the societal perspective is the issue of privacy. Intimate information about
individual customers may become public, causing much panic or embarrassment. Previously concealed
information such as race, gender, and sexual orientation can be deduced by analyzing such Big Data
and used to discriminate in employment, credit card approvals, etc.

References (In addition to links provided to us on Titanium)

10 Big Data analytics privacy problems. Retrieved 11/11/15, http://bit.ly/1WJmmFR

What is Big Data and why it matters. (n.d.). Retrieved 11/08/15, http://bit.ly/1WMKfHd

PBS series - ‘How we got to Now’, S1: Episode 1 ‘Clean’ on Netflix

Amazon wants to ship your packet before you buy it. Retrieved 11/11/15, http://on.wsj.com/1M2x565

Big data for Dummies. Retrieved 11/08/15, http://bit.ly/1kli2vt

Ten big data case studies in a nut-shell. Retrieved 11/11/15, http://bit.ly/1NZPEZj
