Commercial Usage Using Big Data
Prof. Pavan Kulkarni, Dept. of Computer Engineering, Trinity College of Engineering and Research, Pune University, Pune, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Big data is concerned with huge amounts of data, including complex and multiple data sets. As technologies advance rapidly, data usage expands day by day; on social media, 1 million or 1 billion posts or data items are updated every second, so it has become very difficult to manage data with traditional databases. Hence the concept of data mining arises. In data mining, different methodologies are used to manage such data, such as clustering and frequent patterns. This paper presents the HACE theorem, which is used to manage different types of data, such as organizational, educational, industrial, and social data, along with security and accuracy.

Key Words: Big Data, Hadoop, Hive, Data Mining, Clustering, HACE.

1. INTRODUCTION

Data mining is the process of examining data from various perspectives and summarizing it into meaningful information - information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to study data from many different dimensions or angles, categorize it, and summarize the relationships identified. Data mining is the activity of uncovering correlations or patterns among many fields in large relational databases.

1.1 LITERATURE SURVEY

1. “Algorithms for Mining the Evolution of Conserved Relational States in Dynamic Networks,” Dec. 2012.
Author: R. Ahmed and G. Karypis

Dynamic networks have recently been identified as a powerful idea for modeling and representing temporal changes and dynamic aspects of the core data in complex systems. Recognizing the transitions from one conserved state to the next can give evidence of external factors that are accountable for changing relational patterns in the network. This paper presents a new data mining technique that analyzes the relational states between entities of a dynamic network and identifies the maximal non-redundant paths of stable relational states.[1]

2. “Novel Approaches to Crawling Important Pages Early”
Author: M.H. Alam, J.W. Ha, and S.K. Lee

In data mining, web crawlers are used by web applications such as web search engines, web archives, and directories, which maintain web pages. The designed algorithms utilize different features, including the title of the page and topic relevance. Experiments using publicly available data sets study the effect of each feature on crawl ordering and evaluate the performance of the different algorithms.[2]

3. “Identifying Influential and Susceptible Members of Social Networks”
Author: S. Aral and D. Walker

Identifying social influence in networks is critical to understanding how behaviors spread. We present a method that uses randomized experiments to identify influence and susceptibility in networks while avoiding the biases inherent in traditional estimates of social contagion. Estimation in a representative sample of 1.3 million Facebook users showed that younger users are more susceptible to influence than older users, men are more influential than women, women influence men more than they influence other women, and married individuals are the least susceptible to influence in the decision to adopt the product offered. Analysis of influence and susceptibility together with network structure revealed that influential individuals are less susceptible
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 221
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 10 | Oct-2016 www.irjet.net p-ISSN: 2395-0072
to influence than non-influential individuals and that they cluster in the network, while susceptible individuals do not.[3]

changes in DJIA closing values. Our results show that the accuracy of DJIA predictions can be significantly increased.[5]
ways clustered-lattice networks than across corresponding random networks.[7]

8. “Parallel Algorithms for Mining Large-Scale Rich-Media Data”
Author: E.Y. Chang, H. Bai, and K. Zhu

The number of online photos and videos is now in the tens of billions. To organize, index, and retrieve these large-scale rich-media data, a system must employ scalable data management and mining algorithms. The research community needs to consider solving large-scale problems rather than problems with small data sets that do not reflect real-life scenarios. This tutorial presents key challenges in large-scale rich-media data mining and presents parallel algorithms for addressing them. We present our parallel implementations of Spectral Clustering (PSC), FP-Growth (PFP), Latent Dirichlet Allocation (PLDA), and Support Vector Machines (PSVM).[8]

Author: R. Chen, K. Sivakumar, and H. Kargupta

interest has been shown because of the heavy usage of social networks. Given a social network structure, the problem of influence maximization is to find a minimum set of nodes that could maximize the spread of influence. With a large-scale social network, the efficiency and practicality of such algorithms are critical. Although many recent studies have focused on the problem of influence maximization, these works are time-consuming when the social network is large-scale. In this paper, we propose two novel algorithms, CDH-Kcut and Community and Degree Heuristic on Kcut/SHRINK, to solve the influence maximization problem based on a graphic model. The algorithms use the community structure, which significantly reduces the number of candidate influential nodes, to avoid information overlap. The experimental results on both synthetic and real data sets indicate that our algorithms not only outperform the state-of-the-art algorithms in efficiency but also possess graceful scalability.[10]

Fig: Big data processing framework
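The influence maximization problem summarized for [10] can be illustrated with a minimal Python sketch that greedily selects seed nodes by one-hop coverage. This is a simplified stand-in chosen for illustration, not the CDH-Kcut or Kcut/SHRINK community heuristics of [10]; the function name greedy_seed_selection, the toy graph, and the use of one-hop coverage as the influence measure are all assumptions.

```python
from typing import Dict, List, Set

# Hypothetical representation: node -> set of neighbours it can influence.
Graph = Dict[str, Set[str]]


def greedy_seed_selection(graph: Graph, k: int) -> List[str]:
    """Pick up to k seed nodes that greedily maximise one-hop coverage.

    Coverage (the seed plus its direct neighbours) is only a proxy for
    influence spread; it is an assumption of this sketch.
    """
    covered: Set[str] = set()
    seeds: List[str] = []
    for _ in range(k):
        best_node, best_gain = None, 0
        for node in sorted(graph):  # sorted for deterministic tie-breaking
            if node in seeds:
                continue
            # Marginal gain: nodes newly reached if this node becomes a seed.
            gain = len(({node} | graph[node]) - covered)
            if gain > best_gain:
                best_node, best_gain = node, gain
        if best_node is None:  # no remaining node covers anything new
            break
        seeds.append(best_node)
        covered |= {best_node} | graph[best_node]
    return seeds


# Toy undirected network, invented for this example.
toy_graph: Graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
seeds = greedy_seed_selection(toy_graph, 2)
print(seeds)
```

Each round rescans every node, so the cost grows with both the network size and the number of seeds. That cost is the motivation given in [10] for first narrowing the candidate set, for example by community structure.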
Heterogeneous: Heterogeneity is a concept often used in science and statistics relating to the uniformity of a substance. Something that is homogeneous remains the same in all cases and at all times in characteristics such as shape, size, height, weight, texture, distribution, disease, temperature, radioactivity, design, etc.; something that is heterogeneous is readily distinguishable as non-constant in one of these qualities.

Autonomous: Sources with distributed and decentralized control are a main feature of Big Data.

Complex: Unstructured data, which is raw data yet to be processed.

Evolving: Data increases day by day, along with new types of data.

CONCLUSION

Because of the increase in the amount of data in fields such as genomics, meteorology, biology, and environmental research, it becomes hard to manage the data, to find connections and patterns, and to analyze large data sets. As organizations collect much more data at this scale, validating the process of big data analysis will become paramount. The paper describes different algorithms used to manage such large data sets and gives an overview of the architecture and algorithms used with large data sets.

precious suggestions and guidance. Lastly, we would like to thank our family and friends for the support and confidence they have given us during the course of our work.

REFERENCES

1) “Algorithms for Mining the Evolution of Conserved Relational States in Dynamic Networks,” Dec. 2012. Author: R. Ahmed and G. Karypis.

2) “Novel Approaches to Crawling Important Pages Early.” Author: M.H. Alam, J.W. Ha, and S.K. Lee.

3) “Identifying Influential and Susceptible Members of Social Networks.” Author: S. Aral and D. Walker.

4) “Analyzing Collective Behavior from Blogs Using Swarm Intelligence.” Author: S. Banerjee and N. Agarwal.

5) “Twitter Mood Predicts the Stock Market.” Author: J. Bollen, H. Mao, and X. Zeng.

6) “Network Analysis in the Social Sciences.” Author: S. Borgatti, A. Mehra, D. Brass, and G. Labianca.

7) “The Spread of Behavior in an Online Social Network Experiment.” Author: D. Centola.

8) “Parallel Algorithms for Mining Large-Scale Rich-Media Data.” Author: E.Y. Chang, H. Bai, and K. Zhu.