
Bda - Unit 1

BDA unit Notes

Uploaded by

VISHWA PRIYA I
Copyright
© All Rights Reserved
Understanding Big Data

Syllabus
Introduction to big data - convergence of key trends - unstructured data - industry examples of big data - web analytics - big data applications - big data technologies - introduction to Hadoop - open source technologies - cloud and big data - mobile business intelligence - crowd sourcing analytics - inter and trans firewall analytics.

Contents
1.1 Introduction to Big Data
1.2 Convergence of Key Trends
1.3 Unstructured Data
1.4 Industry Examples of Big Data
1.5 Web Analytics
1.6 Big Data Applications
1.7 Big Data Technologies
1.8 Introduction to Hadoop
1.9 Open Source Technologies
1.10 Cloud and Big Data
1.11 Mobile Business Intelligence
1.12 Crowd Sourcing Analytics
1.13 Inter and Trans Firewall Analytics
1.14 Two Marks Questions with Answers

1.1 Introduction to Big Data

• Big data can be defined as very large volumes of data available at various sources, in varying degrees of complexity, generated at different speeds, i.e. velocities, and with varying degrees of ambiguity, which cannot be processed using traditional technologies, processing methods, algorithms or any commercial off-the-shelf solutions.
• 'Big data' is a term used to describe a collection of data that is huge in size and yet growing exponentially with time. In short, such data is so large and complex that none of the traditional data management tools are able to store it or process it efficiently.
• Big data processing begins with the raw data that isn't aggregated or organized and is most often impossible to store in the memory of a single computer.

Sr. No. | Data science | Big data
1. | It is a field of scientific analysis of data in order to solve analytically complex problems, together with the significant and necessary activity of cleansing and preparing data. | Big data is the storing and processing of a volume of structured and unstructured data that cannot be handled by traditional applications.
2. | It is used in biotech, energy, gaming and social media. | Used in retail, education, insurance, healthcare.
3. | Goals : Data classification, anomaly detection, prediction, scoring and ranking. | Goals : To provide better customer service, identifying revenue opportunities, effective marketing etc.

Benefits of Big Data Processing

Benefits of big data processing :
1. Improved customer service.
2. Business can utilize outside intelligence while taking decisions.
3. Reducing maintenance costs.
4. Re-develop your products : Big data can also help you understand how others perceive your products so that you can adapt them or your marketing, if need be.
5. Early identification of risk to the product / services, if any.
6. Better operational efficiency.

TECHNICAL PUBLICATIONS® - an up-thrust for knowledge

Big Data Challenges

• Collecting, storing and processing big data comes with its own set of challenges :
1. Big data is growing exponentially and existing data management solutions have to be constantly updated to cope with the three Vs.
2. Organizations do not have enough skilled data professionals who can understand and work with big data and big data tools.

1.2 Convergence of Key Trends

• ... i.e., it is a process of producing data. Some data are the records related to culture and society and others are the descriptions of phenomena of the universe and life. The large scale of data is rapidly generated and stored in computer systems, which is called ...
• Data is generated automatically by mobile devices and computers; think of Facebook, search queries, directions and GPS locations and image capture.
• Sensors also generate volumes of data, including medical data and commerce location-based sensors. Experts expect 55 billion IP-enabled sensors by 2021. Even storage of all this data is expensive. Analysis gets more important and more expensive every year.
• Fig. 1.2.1 shows the big data explosion caused by the current data boom and how critical it is for us to be able to extract meaning from all of this data.
• The phenomenon of exponential multiplication of data that gets stored is termed "Data Explosion". Continuous inflow of real-time data from various processes, machinery and manual ...
• Sending emails, making phone calls, collecting information for campaigns; each day we create a massive amount of data just by going about our normal business, and this data explosion does not seem to be slowing down. In fact, 90 % of the ...
• The reason for this data explosion is innovation.

1. Business model transformation : Innovation changed the way in which we do business and provide services. The data world is governed by three fundamental trends : business model transformation, globalization and personalization of services.
• Organizations have traditionally treated data as a legal or compliance requirement, supporting limited management reporting requirements. Consequently, organizations have treated data as a cost to be minimized.
• Businesses are required to produce more data related to products and to provide services that cater to each sector and channel of customer.
2. Globalization : Globalization is an emerging trend in business where organizations start operating on an international scale. From manufacturing to customer service, globalization has changed the commerce of the world. A variety of data in different formats is generated due to globalization.
3. Personalization of services : To enhance customer service, one-to-one marketing in the form of personalization of service is opted for by the customer. Customers expect ...
4. New sources of data : The shift to online advertising supported by the likes of Google, Yahoo and others is a key driver in the data boom.
Social media, mobile devices, sensor networks and new media are at the fingertips of customers or users. The data generated through these is used by corporations for decision support systems like business intelligence and analytics. The growth of technology has helped new business models to emerge over the last decade or more. Integration of all the data across the enterprise is used to create a business decision support platform.

• We differentiate big data from traditional data by one or more of the five V's :

1. Volume : Volumes of data are larger than the infrastructure can cope with. It consists of terabytes or petabytes of data.
• Fig. 1.2.2 shows big data volume.

Fig. 1.2.2 Big data volume (labels include clickstream logs, emails, machine data and geo-spatial data)

2. Velocity : The term 'velocity' refers to the speed of generation of data. This speed determines the real potential in the data. It is created in or near real time.
3. Variety : It refers to heterogeneous sources and the nature of data, both ...
• Fig. 1.2.3 (a) and Fig. 1.2.3 (b) show big data velocity and data variety.

Fig. 1.2.3 (a) Data velocity (labels include mobile, social and web-based companies such as Amazon, Facebook, Yahoo and Google)
Fig. 1.2.3 (b) Data variety (labels include semi-structured)

4. Value : It represents the ... The ultimate objective of any big data project should be to generate some sort of value for the company doing all the analysis. Otherwise, you're just performing some technological task for technology's sake.
• For real-time spatial big data, decisions can be enhanced through visualization of dynamic change in such spatial phenomena as climate, traffic, social-media-based attitudes and massive inventory locations.
• Exploration of data trends can include spatial proximities and relationships. Once spatial big data are structured, formal spatial analytics can be applied, such as spatial autocorrelation, overlays, buffering, spatial cluster techniques and location quotients.
5. Veracity : Big data must be fed with relevant and true data. We will not be able to perform useful analytics if much of the incoming data comes from false sources or has errors. Veracity relates to the assurance of the data's quality, integrity, credibility and accuracy. We must evaluate the data for accuracy before using it for business insights because it is obtained from multiple sources.

Compare Cloud Computing and Big Data

Sr. No. | Cloud computing | Big data
1. | It provides resources on demand. | It provides a way to handle huge volumes of data and generate insights.
2. | It refers to internet services from SaaS, PaaS to IaaS. | It refers to data, which can be structured, semi-structured or unstructured.
3. | Cloud is used to store data and information on remote servers. | It is used to describe a huge volume of data and information.
4. | Cloud computing is economical as it has low maintenance costs, a centralized platform, no upfront cost and disaster safe implementation. | Big data is a highly scalable, robust ecosystem and cost-effective.
5. | Vendors and solution providers of cloud computing are Google, Amazon Web Services, Dell, Microsoft, Apple and IBM. | Vendors and solution providers of big data are Cloudera, Hortonworks, Apache and MapR.
6. | The main focus of cloud computing is to provide computer resources and services with the help of a network connection. | The main focus of big data is solving problems when a huge amount of data is being generated and processed.

1.3 Unstructured Data

• Unstructured data is data that does not follow a specified format. Rows and columns are not used for unstructured data. Therefore, it is difficult to retrieve.
• Examples of unstructured data are e-mails, click streams, textual data, images, log data and videos. In the case of unstructured data, the size is not the only problem; extracting information from unstructured data is also challenging as compared to structured data.
• Unstructured data can be in the form of text (documents, email messages, customer feedback), audio, video and images. Email is an example of unstructured data.
• Even today, in most organizations more than 80 % of the data is in unstructured form. This carries lots of information, but extracting information from these various sources is a very big challenge.

Characteristics of unstructured data :
1. ...
2. Data can be of any type.
3. Unstructured data does not follow any structural rules.
4. There are no predefined formats, restrictions or sequence for unstructured data.
5. Since there is no structural binding for unstructured data, it is unpredictable in nature.

Examples of machine generated unstructured data :
1. Satellite images : This includes weather data or the data that the government captures in its satellite surveillance imagery.
2. Scientific data : This includes atmospheric data and high energy physics.
3. Photographs and video : This includes security, surveillance and traffic video.

Structured Data

• Structured data is arranged in rows and column format. It helps applications to retrieve and process data easily. A database management system is used for storing structured data.
• Any data that can be stored in a particular fixed format is known as structured data. For example, data stored in the tables of relational database management systems is a form of structured data.

Difference between Structured and Unstructured Data

Structured data is in discrete form, i.e. stored ...

1.4 Industry Examples of Big Data

• Big data plays an important role in digital marketing. Each day, the information shared digitally increases significantly.
• With the help of big data, marketers can analyze every action of the consumer. It provides better marketing insights and it helps marketers to ...

Reasons why big data is important for digital marketers :
a) Real-time customer insights
b) Personalized targeting
c) Increasing sales
d) Improves the efficiency of a marketing campaign
e) Budget optimization
f) ...

• Data constantly informs marketing teams of customer behaviors and industry trends and is used to optimize future efforts, create innovative campaigns and build lasting relationships with customers.
• Big data regarding customers provides marketers with details about user demographics, ... which can be used to personalize the product experience and ...
• Big data solutions can help ... pinpoint which marketing campaigns, strategies or social channels are getting the most traction. This lets marketers allocate marketing ... costs for projects that are not ...
• Personalized targeting : Nowadays, personalization is the key strategy for every marketer. Engaging customers at the right moment with the right message is the biggest issue for marketers. Big data helps marketers to create targeted and personalized campaigns. Personalized marketing means creating and delivering messages to individuals or to a group of the audience through data analysis, with the help of consumer data such as geolocation, browsing history, clickstream behavior and purchasing history. It is also known as one-to-one marketing.
• Consumer insights : In this day and age, marketing has become the ability of a company to interpret the data and change its strategies accordingly. Big data allows for real-time consumer insights, which are crucial to understanding the habits of your customers.
By interacting with your consumers through social media you will know exactly what they want and expect from your product or service, which will be key to distinguishing your campaign from your competitors.
• Help increase sales : Big data will help with demand predictions for a product or service. Information gathered on user behaviour will allow marketers to answer what types of product their users are buying, how often they conduct purchases or search for a product or service and, lastly, what payment methods they prefer using.
• Big data allows marketers to measure their campaign performance. Marketers will be able to measure any negative changes to marketing KPIs. If they have not achieved the desired results, it will be a signal that the strategy would need to be changed in order to maximize revenue and make your marketing efforts more scalable in future.
• Enhanced appreciation of what consumers like or do.

1.5 Web Analytics

• Web analytics is the ... The focus is on identifying ... the website data to ... of those goals and to drive strategy and ...
• The WWW is an evolving system for publishing and accessing resources and services across the Internet. The web is an open system. Its operations are based on freely published communication standards and document standards.
• Analyze website conversions.
• Web analytics platforms are used to measure and benchmark performance and to look at key performance indicators that drive the business, such as purchase conversion rate.
• Website analytics provide insights and data that can be used to create a better user experience for website visitors. Understanding customer behavior is also key to optimizing a website for key conversion metrics. For example, web analytics will show us the most popular pages on your website and the most popular paths to purchase.
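The two measures mentioned above, the most popular pages and the purchase conversion rate, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `PAGE_VIEWS` log, the page names, the `/purchase` marker and both function names are hypothetical and do not come from any particular web analytics platform. Conversion rate is taken here as the share of sessions that reached the purchase page, which is one common definition among several.

```python
from collections import Counter

# Hypothetical page-view log as (session_id, page) pairs.
# The page names and the "/purchase" marker are illustrative assumptions.
PAGE_VIEWS = [
    ("s1", "/home"), ("s1", "/product"), ("s1", "/purchase"),
    ("s2", "/home"), ("s2", "/product"),
    ("s3", "/product"), ("s3", "/cart"), ("s3", "/purchase"),
    ("s4", "/home"),
]


def most_popular_pages(views, top_n=3):
    """Return up to top_n (page, view_count) pairs, most viewed first."""
    counts = Counter(page for _, page in views)
    return counts.most_common(top_n)


def purchase_conversion_rate(views, purchase_page="/purchase"):
    """Return the share of sessions that reached purchase_page.

    Returns 0.0 when the log contains no sessions.
    """
    sessions = {session for session, _ in views}
    converted = {session for session, page in views if page == purchase_page}
    if not sessions:
        return 0.0
    return len(converted) / len(sessions)


if __name__ == "__main__":
    print(most_popular_pages(PAGE_VIEWS))
    print(f"Conversion rate: {purchase_conversion_rate(PAGE_VIEWS):.0%}")
```

In this toy log, two of the four sessions reach the purchase page, so the conversion rate is 50 %. A real platform would compute the same kind of ratio over far larger logs and would usually also break it down by campaign or traffic source.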
• With website analytics, we can accurately track the effectiveness of online marketing campaigns to inform future efforts.
• Web analytics can help a digital marketer understand their customers better ...
1. ...
2. Conversion challenges
3. ...

1.6 Big Data Applications

• Big data applications can help companies to make better business decisions by analyzing large volumes of data and discovering hidden patterns. These data sets might be from social media, data captured by sensors, website logs, customer feedback, etc.
• Organizations are spending huge amounts on big data applications to discover hidden patterns, unknown associations, market style, consumer preferences and other valuable business information.
• Domains where big data can be applied include health care, media and entertainment, ...
• Relation between IIoT and Big Data : Big data production in the Industrial Internet of Things (IIoT) is evident due to the massive deployment of sensors and Internet of Things (IoT) devices. However, big data processing is challenging due to limited computational, networking and storage resources at the IoT device end. Big Data Analytics (BDA) is expected to provide operational and customer-level intelligence in IIoT systems.
• The extensive installation of sensors on machines causes a massive increase in the volume of data collected within industrial processes. The data consist of operating data, error lists, history of maintenance activities and the like.
• In combination with the related business data, the overall plethora of data provides the raw material for process optimizations and other applications. To set this potential for optimizations free, the raw data needs to be processed systematically, passing through various algorithms. The results are prepared information with specific application objectives.
Pattern detection deserves particular mention in this context, since this method identifies and quantifies cause and effect correlations and allows predictions of state changes. The significance of the information given out by the analysis depends on the amount of data processed.

1. Healthcare :
• Big data analytics for healthcare uses health-related information of an individual or community to understand a patient, organization or community. In the past, managing and analyzing healthcare data was tedious and expensive. More recently, technology has helped the healthcare sector make leaps and bounds to keep up with the flow of big data in healthcare. Diagnostic devices, medical machinery, instrumentation and online services are sources that transfer data throughout a healthcare network. This is done with the help of big data tools such as Hadoop and Spark.
• One of the most current and relevant big data examples in healthcare is how it has impacted the global coronavirus crisis. Big data analytics for healthcare supported the rapid development of COVID-19 vaccines. Researchers can
