5 Vs of Big Data
Volume
Volume, the first of the 5 V's of big data, refers to the amount of data that exists. Volume is
the foundation of big data, since it describes the size and amount of data that is collected. If the
volume of data is large enough, it can be considered big data. What counts as big data is relative,
though, and changes with the computing power available on the market.
Velocity
The next of the 5 V's of big data is velocity. It refers to how quickly data is generated and how
quickly that data moves. This is an important aspect for companies that need their data to
flow quickly, so it's available at the right times to make the best business decisions possible.
An organization that uses big data will have a large and continuous flow of data that is being
created and sent to its end destination. Data could flow from sources such as machines,
networks, smartphones or social media. This data needs to be digested and analyzed quickly,
and sometimes in near real time.
As an example, in healthcare, there are many medical devices made today to monitor patients
and collect data. From in-hospital medical equipment to wearable devices, collected data needs
to be sent to its destination and analyzed quickly.
In some cases, however, it may be better to collect a limited set of data than to collect
more data than an organization can handle, since the excess can lead to slower data velocities.
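The trade-off above can be made concrete with simple arithmetic. The following is a minimal sketch, assuming a pipeline with a constant arrival rate and a constant processing rate, both in records per second. The function names and the figures are hypothetical and chosen only for illustration.

```python
def backlog_after(arrival_rate: float, processing_rate: float, seconds: float) -> float:
    """Return the number of unprocessed records after `seconds`.

    Both rates are in records per second (illustrative assumption).
    """
    surplus_per_second = arrival_rate - processing_rate
    # A pipeline that keeps up with its input never accumulates a backlog.
    return max(0.0, float(surplus_per_second * seconds))


def delay_for_newest_record(backlog: float, processing_rate: float) -> float:
    """Seconds a newly arrived record waits behind the existing backlog."""
    return backlog / processing_rate


if __name__ == "__main__":
    # Hypothetical figures: 1,200 records/s arriving, 1,000 records/s processed.
    backlog = backlog_after(arrival_rate=1200, processing_rate=1000, seconds=60)
    print(backlog)                                                  # 12000.0
    print(delay_for_newest_record(backlog, processing_rate=1000))   # 12.0
```

Under these assumptions, one minute of collecting 20% more than the system can process leaves the newest record 12 seconds behind, and the delay keeps growing for as long as the surplus continues.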
Variety
The next V in the 5 V's of big data is variety. Variety refers to the diversity of data types. An
organization might obtain data from a number of different data sources, which may vary in
value. Data can come from sources both inside and outside an enterprise. The challenge in variety
concerns the standardization and distribution of all data being collected.
than unstructured data. Structured data, meanwhile, is data that has been organized into a
formatted repository. This means the data is made more addressable for effective data
processing and analysis.
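As an illustration of the standardization challenge, the sketch below maps records from two differently shaped sources into one common structure. It is only an example under stated assumptions: the source formats (a CSV feed with `id,hr` columns and a JSON feed with `patient` and `bpm` keys), the target field names and the function names are all hypothetical.

```python
from __future__ import annotations

import csv
import io
import json


def from_csv(text: str) -> list[dict]:
    """Parse CSV rows with header columns `id,hr` into the common structure."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"patient_id": row["id"], "heart_rate": int(row["hr"])} for row in reader]


def from_json(text: str) -> list[dict]:
    """Parse a JSON array of objects with `patient` and `bpm` keys."""
    return [
        {"patient_id": str(item["patient"]), "heart_rate": int(item["bpm"])}
        for item in json.loads(text)
    ]


if __name__ == "__main__":
    # Hypothetical sample inputs from two different sources.
    csv_source = "id,hr\nA1,72\nB2,88\n"
    json_source = '[{"patient": "C3", "bpm": 65}]'
    records = from_csv(csv_source) + from_json(json_source)
    for record in records:
        print(record)
```

Once every source is mapped to the same field names and types, downstream analysis can treat the combined records uniformly, whatever their origin.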
Veracity
Veracity is the fourth V in the 5 V's of big data. It refers to the quality and accuracy of data.
Gathered data could have missing pieces, may be inaccurate or may not be able to provide real,
valuable insight. Veracity, overall, refers to the level of trust there is in the collected data.
Data can sometimes become messy and difficult to use. A large amount of data can cause more
confusion than insight if it's incomplete. For example, in the medical field, if data
about what drugs a patient is taking is incomplete, then the patient's life may be endangered.
Both value and veracity help define the quality and insights gathered from data.
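One simple way to quantify part of veracity is a completeness check. The sketch below is a minimal example, assuming medication records stored as dictionaries. The required field names and the sample records are hypothetical, and completeness is only one aspect of data quality. It does not measure accuracy.

```python
# Hypothetical set of fields a medication record must contain.
REQUIRED_FIELDS = ("patient_id", "drug_name", "dose_mg")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent, None, or an empty string."""
    return [field for field in REQUIRED_FIELDS if record.get(field) in (None, "")]


def completeness(records: list) -> float:
    """Return the fraction of records in which every required field is present."""
    if not records:
        return 0.0
    complete = sum(1 for record in records if not missing_fields(record))
    return complete / len(records)


if __name__ == "__main__":
    # Illustrative sample data, not real patient records.
    records = [
        {"patient_id": "A1", "drug_name": "ibuprofen", "dose_mg": 200},
        {"patient_id": "B2", "drug_name": "", "dose_mg": 50},
    ]
    print(missing_fields(records[1]))  # ['drug_name']
    print(completeness(records))       # 0.5
```

A score like this could be tracked over time or used to reject records before they reach analysis, so that gaps are found before anyone acts on the data.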
Value
The last V in the 5 V's of big data is value. This refers to the value that big data can provide, and
it relates directly to what organizations can do with that collected data. Being able to pull value
from big data is a requirement, as the value of big data increases significantly depending on the
insights that can be gained from it.
Organizations can use the same big data tools to gather and analyze the data, but how they
derive value from that data should be unique to them.