Big Data and Block Chain


BIG DATA

Characteristics of Big Data – The 4 V’s

1. Volume

First up is volume. Unsurprisingly, the main characteristic that makes any dataset “big” is
the sheer size of the thing. When we’re talking about Big Data, we’re referring to the
datasets that stretch from petabytes to exabytes. These huge volumes require powerful
processing technologies – much more powerful than a regular laptop or desktop
processor. As an example of a high-volume dataset, we can think about Facebook. The
world’s most popular social media platform reported roughly 2.89 billion monthly active users in 2021,
many of whom spend hours each day posting updates, commenting on images, liking
posts, clicking on ads, playing games, and doing a zillion other things that generate data
that can be analyzed. This is high-volume big data in no uncertain terms.

2. Velocity

Velocity refers to the speed at which data is generated. High velocity means that more data
is available on any given day than the day before, and that the analysis has to keep pace
with it. Data professionals today don’t gather data over time and then carry out a single
analysis at the end of the week, month, or quarter. Rather, the analysis is live – and the
faster the data can be collected and processed, the more valuable it is in both the long
and short term. Facebook messages, Twitter posts, credit card swipes, and e-commerce
sales transactions are all examples of high-velocity data.
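
The idea of live analysis can be illustrated with a running total that is updated as each
event arrives, rather than computed once at the end of a period. The following is a minimal
sketch, not a production design; the class name `SlidingWindowTotal`, the 60-second window,
and the sample card swipes are all illustrative assumptions.

```python
from collections import deque


class SlidingWindowTotal:
    """Running total of event amounts seen in the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, amount) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp: float, amount: float) -> float:
        """Record one event and return the up-to-date window total."""
        self._events.append((timestamp, amount))
        self.total += amount
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_amount = self._events.popleft()
            self.total -= old_amount
        return self.total


# Hypothetical card swipes: (seconds since start, amount in dollars).
swipes = [(0, 20.0), (10, 5.0), (45, 12.5), (70, 8.0)]
window = SlidingWindowTotal(window_seconds=60)
running_totals = [window.add(t, amount) for t, amount in swipes]
print(running_totals)  # [20.0, 25.0, 37.5, 20.5]
```

Each new event updates the answer immediately, and old events are dropped as soon as they
leave the window, so the result is always current without reprocessing the whole history.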

3. Variety

Facebook, of course, is just one source of big data. We can imagine just how much data
can be sourced from a company’s website traffic, from review sites, social media (not
just Facebook, but Twitter, Pinterest, Instagram, and all the rest of the gang as well),
email, CRM systems, mobile data, Google Ads, and so on. All these sources produce
data that can be collected, stored, processed, and analyzed. This data can be broken
down into three distinct types – structured, semi-structured, and unstructured.

Structured data consists of clearly defined data types that are well organized and can be
easily searched. Airline reservation records, lists of customer names and account
histories, and simple spreadsheets are all examples of structured data.

Unstructured data, by comparison, is unorganized. Things like text and multimedia
content – videos, images, social media posts, instant message communications – are all
examples of unstructured data.

Email is an example of semi-structured data – there is often metadata (i.e. data about
data) attached to emails within a database, making it more structured than unstructured
data, but less so than structured.
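
The difference between the three types shows up in how each one has to be handled in code.
The sketch below is only an illustration: the customer table, the email fields, and the
social media post are invented sample data, not taken from any real system.

```python
import csv
import io
import json

# Structured: fixed columns, every value has a known position and type.
structured_csv = "customer_id,name,balance\n101,Ada,250.00\n102,Lin,75.50\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
balances = {row["name"]: float(row["balance"]) for row in rows}

# Semi-structured: the metadata fields are queryable, the body is free text.
email_json = json.dumps({
    "from": "ada@example.com",
    "to": ["lin@example.com"],
    "subject": "Invoice question",
    "body": "Hi Lin, could you check the total on last month's invoice?",
})
email = json.loads(email_json)
sender = email["from"]

# Unstructured: no schema at all; analysis must work on the raw content.
post = "Loving the new update!! Checkout was so much faster today"
word_count = len(post.split())

print(balances, sender, word_count)
```

The structured rows can be queried by column name straight away, the email can be filtered
by its metadata but not by the meaning of its body, and the post yields nothing beyond
surface measures such as a word count until further processing is applied.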

4. Veracity

Veracity refers to the quality, accuracy, and trustworthiness of the data collected. As
such, veracity is not necessarily a distinctive characteristic of big data (even little data
needs to be trustworthy), but given the high volume, variety, and velocity, high reliability
is of paramount importance if a business is to draw accurate conclusions from its data.
High-veracity data is the truly valuable stuff that contributes in a meaningful way to
overall results, so the collection process itself needs to ensure high quality. When
analyzing Twitter data, for instance, it’s imperative that the data is extracted directly
from the Twitter site itself rather than from some third-party system that might not be
trustworthy. Low-veracity or bad data is estimated to cost US companies over $3.1 trillion a
year, both because bad decisions are made on the basis of it and because of the money spent
scrubbing, cleansing, and rehabilitating it.
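
A first step in scrubbing and cleansing is usually a basic quality check over incoming
records. The sketch below is a simplified illustration under stated assumptions: the
function name `quality_report`, the field names, and the sample records are hypothetical,
and real pipelines apply many more rules than completeness and duplicate detection.

```python
def quality_report(records, required_fields):
    """Count records that are complete, incomplete, or duplicated by id."""
    seen_ids = set()
    report = {"ok": 0, "missing_fields": 0, "duplicates": 0}
    clean = []
    for record in records:
        # A record whose id was already accepted is treated as a duplicate.
        if record.get("id") in seen_ids:
            report["duplicates"] += 1
            continue
        # A required field that is absent, None, or empty fails the check.
        if any(record.get(field) in (None, "") for field in required_fields):
            report["missing_fields"] += 1
            continue
        seen_ids.add(record["id"])
        report["ok"] += 1
        clean.append(record)
    return report, clean


# Hypothetical sample of collected posts.
records = [
    {"id": 1, "user": "ada", "text": "great service"},
    {"id": 2, "user": "", "text": "no author recorded"},
    {"id": 1, "user": "ada", "text": "great service"},
    {"id": 3, "user": "lin", "text": None},
    {"id": 4, "user": "sam", "text": "slow delivery"},
]
report, clean = quality_report(records, ["id", "user", "text"])
print(report)  # {'ok': 2, 'missing_fields': 2, 'duplicates': 1}
```

Only the records that pass the checks are kept for analysis, and the report gives a rough
measure of how trustworthy the source is.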

BLOCKCHAIN
Prospects:

1. Bitcoin is just one of the applications for the technology, whose use is being tested
across industries.
2. Healthcare, banking, education, agriculture, electricity distribution, and land records are
sectors that could benefit.
3. Blockchain-powered smart contracts, in which every piece of information is recorded, can
enhance the ease of doing business.
4. It will augment the credibility, accuracy, and efficiency of a contract while
substantially reducing the risk of fraud.
5. Blockchain could play a crucial part in health insurance claims management by reducing
the risk of insurance claim fraud.
6. The technology can also be used to prevent the sale of spurious drugs in the country by
tracking every step of the supply chain network.
7. Artificial Intelligence and the Internet of Things (IoT) can gain immensely from
blockchain applications.
8. In an IoT world, thousands of devices would need to rapidly and seamlessly transact
with each other in real-time.
9. The adoption of blockchain by banks can help avert fraud, as the technology updates
information across all users simultaneously.
10. It could be used to further strengthen our national institutions, including the judiciary and
the Election Commission.
11. Critical citizen information like land records, census data, birth and death records,
business licenses, criminal records, intellectual property registries, and electoral rolls
could all be maintained as blockchain-powered, tamper-proof public ledgers.
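
The tamper-resistance mentioned in the points above comes from linking each record to the
hash of the one before it. The following is a minimal single-machine sketch of that idea,
not a real blockchain: there is no network, no consensus, and no signatures, and the
function names and the land-record data are illustrative assumptions.

```python
import hashlib
import json


def block_hash(index, data, previous_hash):
    """SHA-256 over the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a new block that points at the hash of the current last block."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical land-record ledger.
ledger = []
append_block(ledger, {"parcel": "A-17", "owner": "Ada"})
append_block(ledger, {"parcel": "A-17", "owner": "Lin"})
valid_before = is_valid(ledger)

ledger[0]["data"]["owner"] = "Mallory"  # tamper with an old record
valid_after = is_valid(ledger)
print(valid_before, valid_after)  # True False
```

Changing an old record changes its hash, which no longer matches what was stored, so the
edit is detected. In a real deployment many independent parties hold copies of the chain,
which is what makes quietly rewriting it impractical.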

Challenges: Blockchain has been considered the technology of the future. However, its large-
scale implementation involves an array of serious challenges that need to be addressed.

1. Costs: Blockchain technology is expensive to initially put in place.

2. Scalability: One of the key challenges in Blockchain implementation is its limited ability
to scale. Networks like Bitcoin and Ethereum experience a slowing down of transactions
and a consequent increase in the transaction fee as more users join the network.

3. Speed: Blockchain operations are slow compared to traditional transaction processing. This
is because of the consensus mechanism, additional layers of security, and encryption.

4. Energy consumption: Blockchain networks, particularly those that rely on proof-of-work
mining, consume massive amounts of energy to function.

5. Security: Safeguarding the privacy of individuals and companies is difficult because
public blockchains are usually open ledgers for everyone to see. Public blockchains that
rely on mining are also vulnerable to 51% attacks. Transactions on such a blockchain are
validated by the majority of the mining power in the network. If any miner or group of
miners gains control of more than half of that power, they can prevent other miners from
creating new blocks and hinder the validation of new transactions. They can even reverse
their own recently confirmed transactions, which makes double-spending possible.

6. Lack of standardization: With so many platforms present in the Blockchain space, there
is currently no standard that allows them to interact with each other. There are
thousands of active Blockchain projects today running on different platforms. Each of
these platforms has its own programming language, set of protocols, and consensus
mechanism. It is difficult for the platforms to interact with each other without a translator.

7. Lack of knowledge: Knowledge of the benefits of distributed ledger technology is still
limited.

8. If automated risk management, smart contracts, and similar tools are deployed across a
network, cascades of rapid and hard-to-control obligations and liquidity flows could
propagate through it.

9. This interdependence will likely call for creative organizational thinking to address the
need for governance and strong risk management.
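
The speed and energy challenges listed above both follow from how mining-based consensus
works: a block is accepted only after someone finds a value that makes its hash meet a
difficulty target, and the only way to find it is trial and error. The sketch below is a toy
version of that search, assuming a simplified target of leading zero hex digits; the
function name `mine` and the sample transaction text are hypothetical, and real networks
use far higher difficulty and a different block format.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Find a nonce whose SHA-256 hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


results = []
for difficulty in (1, 2, 3, 4):
    nonce, digest = mine("alice pays bob 5", difficulty)
    results.append((difficulty, nonce, digest))
    print(f"difficulty={difficulty} attempts={nonce + 1} hash={digest[:12]}...")
```

On average each extra zero digit multiplies the number of attempts by 16, which is why
raising the difficulty makes blocks slower and more energy-hungry to produce, and why
controlling most of the network's mining power is so costly for an attacker.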

In addition, risk managers must be aware of at least seven challenges that are distinctive
to cryptocurrencies.

1. Diversity
2. Valuation difficulties
3. Regulatory and legal dilemmas
4. Data and modeling obstacles
5. Illiquidity and trading costs
6. Custody, clearing, and settlement problems
7. Highly risky cryptocurrency derivatives
