
The 42 V's of Big Data and Data Science PDF

The document lists 42 "V's" that represent concepts related to big data and data science. It begins with the original 3 V's of volume, velocity and variety that were proposed in 2001 to describe big data trends. Since then, various other lists have expanded on these initial 3 V's, with the document synthesizing these into a comprehensive list of 42 V's. These V's cover a wide range of topics from validity and value to variability, vastness, and more. The author provides this expanded list to help build a simple mental model for understanding big data and data science concepts.


14/7/2019 The 42 V's of Big Data and Data Science


The 42 V's of Big Data and Data Science


Tom Shafer
April 1, 2017


Understanding and effectively communicating a concept often requires first building a simple
mental model. Consider, for example, how we teach the physical laws to students: it helps to walk
with algebra before you can run with calculus. This kind of model trades correctness (shaving off
"unnecessary" detail) for an increased ability to grasp the larger picture.

https://www.elderresearch.com/blog/42-v-of-big-data 1/8

In 2001, Gartner (perhaps) accidentally abetted an avalanche of alliteration with an article that
forecast trends in the industry, gathering them under the headings Data Volume, Data Velocity,
and Data Variety. Of course, inflation continues its inexorable march, and about a decade
later we had the 4 V's of Big Data, then 7 V's, and then 10 V's.

But it's 2017 and we now operate in an ever more sophisticated world of analytics. To keep up
with the times, we present our updated 2017 list: The 42 V's of Big Data and Data Science.

1. Vagueness: The meaning of found data is often very unclear, regardless of how much data
is available.

2. Validity: Rigor in analysis (e.g., Target Shuffling) is essential for valid predictions.

3. Valor: In the face of big data, we must gamely tackle the big problems.

4. Value: Data science continues to provide ever-increasing value for users as more data
becomes available and new techniques are developed.

5. Vane: Data science can aid decision making by pointing in the correct direction.

6. Vanilla: Even the simplest models, constructed with rigor, can provide value.

7. Vantage: Big data allows us a privileged view of complex systems.


8. Variability: Data science often models variable data sources. Models deployed into
production can encounter especially wild data.

9. Variety: In data science, we work with many data formats (flat files, relational
databases, graph networks) and varying levels of data completeness.

10. Varifocal: Big data and data science together allow us to see both the forest and the trees.

11. Varmint: As big data gets bigger, so can software bugs!

12. Varnish: How end-users interact with our work matters, and polish counts.

13. Vastness: With the advent of the Internet of Things (IoT), the "bigness" of big data is
accelerating.

14. Vaticination: Predictive analytics provides the ability to forecast. (Of course, these
forecasts can be more or less accurate depending on rigor and the complexity of the
problem. The future is pesky and never conforms to our March Madness brackets.)

15. Vault: With many data science applications based on large and often sensitive data sets,
data security is increasingly important.

16. Veer: With the rise of agile data science, we should be able to navigate the customer's
needs and change directions quickly when called upon.

17. Veil: Data science provides the capability to peer behind the curtain and examine the
effects of latent variables in the data.

18. Velocity: Not only is the volume of data ever increasing, but the rate of data generation
(from the Internet of Things, social media, etc.) is increasing as well.

19. Venue: Data science work takes place in different locations and under
different arrangements: Locally, on customer workstations, and in the cloud.

20. Veracity: Reproducibility is essential for accurate analysis.

21. Verdict: As an increasing number of people are affected by models' decisions, Veracity and
Validity become ever more important.

22. Versed: Data scientists often need to know a little about a great many things: mathematics,
statistics, programming, databases, etc.

23. Version Control: You're using it, right?

24. Vet: Data science allows us to vet our assumptions, augmenting intuition with evidence.

25. Vexed: Some of the excitement around data science is based on its potential to shed light
on large, complicated problems.


26. Viability: It is difficult to build robust models, and it's harder still to build systems that will
be viable in production.

27. Vibrant: A thriving data science community is vital, and it provides insights, ideas, and
support in all of our endeavors.

28. Victual: Big data — the food that fuels data science.

29. Viral: How does data spread among other users and applications?

30. Virtuosity: If data scientists need to know a little about many things, we should also grow
to know a lot about one thing.

31. Viscosity: Related to Velocity; how difficult is the data to work with?

32. Visibility: Data science provides visibility into complex big data problems.

33. Visualization: Often the only way customers interact with models.

34. Vivify: Data science has the potential to animate all manner of decision making and
business processes, from marketing to fraud detection.

35. Vocabulary: Data science provides a vocabulary for addressing a variety of problems.
Different modeling approaches tackle different problem domains, and different validation
techniques harden these approaches in different applications.

36. Vogue: "Machine Learning" becomes "Artificial Intelligence", which becomes...?

37. Voice: Data science provides the ability to speak with knowledge (though not all knowledge,
of course) on a diverse range of topics.

38. Volatility: Especially in production systems, one has to prepare for data volatility. Data that
should "never" be missing suddenly disappears, numbers suddenly contain characters!

39. Volume: More people use data-collecting devices as more devices become internet-
enabled. The volume of data is increasing at a staggering rate.

40. Voodoo: Data science and big data aren't voodoo, but how can we convince potential
customers of data science's value to deliver results with real-world impact?

41. Voyage: May we always keep learning as we tackle the problems that data science
provides.

42. Vulpine: Nate Silver would like you to be a fox, please.

Besides our own additions, this list of V's incorporates the lists of several articles written over the
last few years:

3 V's, 2001 (and again)

4 V's, 2012 (and again)

4 V's, 2013

7 V's, 2013

6 V's, 2013

5 V's, 2013

10 V's, 2014

8 V's, 2014

5 V's, 2014 (and again)

These nine distinct sets encompass fifteen different "V's," orbiting the original three. We can
safely say we are now well on the way to 100 V's of Big Data and Data Science!

Request a consultation to speak to a data analytics consultant about how Elder Research
can help drive better insight from your big data analytics projects.



About the Author


Data Scientist Tom Shafer joined Elder Research after completing a
Ph.D. in Physics at the University of North Carolina. In his research, Dr.
Shafer computed the decay properties of heavy atomic nuclei to study
an astrophysical process that formed the heaviest elements. Those
computations were made possible by collaboratively developing a new
parallel computing algorithm to calculate nuclear properties. Tom also
earned degrees in Physics and Mathematics from the University of
North Carolina at Wilmington.
