Making Markets Function Better
Global Research & Analytics

International Youth Centre, Teen Murti Marg, Chanakyapuri, New Delhi - 110 021, India
Phone: 91-11-23010199, Fax: 91-11-23015452, Email: [email protected]
Website: www.nasscom.in

Big Data: The Next Big Thing

Copyright 2012
Published by NASSCOM, New Delhi
Designed & Produced by CREATIVE INC. Phone: 91-11-41634301
Printed at P.S. Press Services

Disclaimer

The information contained herein has been obtained from sources believed to be reliable. NASSCOM and CRISIL GR&A disclaim all warranties as to the accuracy, completeness or adequacy of such information. NASSCOM and CRISIL GR&A shall have no liability for errors, omissions or inadequacies in the information contained herein, or for interpretations thereof. Service provider profiles are representative of the Indian players. We have tried to cover players across the Big Data spectrum: hardware, software, analytics, system integration and IT services. Identification of players is based on reliable industry sources, interviews, and organisation websites. This report is not a recommendation to invest/disinvest in any organisation covered in the report.

The material in this publication is copyrighted. No part of this report may be reproduced either on paper or electronic media in part or in full without permission in writing from NASSCOM. Request for permission to reproduce any part of the report may be sent to NASSCOM.

Usage of Information

Forwarding, copying or using this information in publications without approval from NASSCOM will be considered an infringement of intellectual property rights.

Every few years, we come across the next big technological idea which radically transforms the way businesses function by opening up new opportunities and efficiencies.
Big Data has now emerged as the next big thing, the big idea whose time has come. And like most big ideas in the recent past, Big Data offers a big opportunity for India. In this study, jointly conducted by NASSCOM and CRISIL Global Research and Analytics (GR&A), we look at the opportunity, which lies in offering services around Big Data implementation and analytics for global multinationals.

By 2015, Big Data is expected to become a USD 25 billion industry, driven by uses across industries such as manufacturing, retail, financial services, telecom and healthcare. We expect the Indian Big Data industry to grow from USD 200 million in 2012 to USD 1 billion in 2015 at a CAGR in excess of 83 per cent. Indian service providers are already leveraging partnerships, M&As and venture funding to capture the Big Data outsourcing opportunity. We are confident that India will be at the forefront in offering Big Data analytics and related IT services.

The challenge, however, is in meeting the demand for data scientists and IT engineers, which is estimated to reach approximately 15,000-20,000, at a CAGR of 80 per cent by 2015. The signs, though, are encouraging. India follows close on the heels of the US and is well ahead of other outsourcing destinations in terms of Big Data talent availability and service providers' initiatives to build such talent for the Big Data opportunity. To further augment this capacity, organisations are leveraging their academic alliance programmes with universities in India to introduce courses on various areas of Big Data. Their efforts are being complemented by private IT training institutes in the country, which are developing talent through courses specific to Big Data skills.

Today, data is omniscient and omnipresent. This data is getting generated at a rapid pace: around 2.5 billion GB of data is generated every day, and more than 90 per cent of the data available today has been created in the past 3-4 years.
This has primarily been because of the explosion in our use of click stream, mobile applications and social media. It's estimated that Twitter alone generates 12 Terabytes of data daily. It's a gold mine for businesses which can separate the wheat from the chaff to identify the trends. Organisations across segments are now looking at this pool of data to determine how best it can be mined to gauge their customers' likes and dislikes.

Storing, analysing and making sense of data of such unwieldy dimension will be a challenge of epic proportions. However, we believe India is on the right path to steal a march over others. In this study, we offer a big perspective on Big Data and how it can be turned into actionable insights.

Foreword
Roopa Kudva, Managing Director and CEO, CRISIL
Som Mittal, President, NASSCOM

Contents
Acknowledgements
Key Takeaways
Introduction to Big Data
Global Perspective on Big Data
India's Advantage in the Big Data Opportunity
The Future of Big Data
Annexure

This publication was prepared through a collaborative effort by several institutions and individuals. We would like to acknowledge the support of our Executive Council for providing the essential and gracious counsel and guidance. NASSCOM has published, and continues to work on, various reports on the IT-BPO sector; information from these reports has been used in this study.

We gratefully acknowledge the contribution of our members and partners including Genpact, EMC, Sears Holding, HP Analytics, Mu Sigma, AbsolutData, Computer Sciences Corporation, Deloitte, Frost & Sullivan, Marlabs, LatentView, EXL Services, Fidelity Investments, Impetus and JP Morgan Chase in terms of their valuable time and informative case studies.
We deeply appreciate the efforts of CRISIL Global Research & Analytics (GR&A) and its team comprising Gaurav Dua, Kumar Rajendran, Priya Khemka, Gunja Rastogi, Mehak Mayor, Praveen Kalani, Hemant Bisht, Ridhima Sudan, Santosh Kandwal and Sonam Gupta, who were instrumental in producing this report. We also convey our special acknowledgement to NASSCOM's research team for their effort and contribution towards the production of this report.

Acknowledgements

Key Takeaways

India showcases competitive advantage in Big Data offerings

An Introduction to Big Data

Big Data is defined by volume, variety and velocity

Organisations worldwide are turning their attention to Big Data as they scramble to derive insights from the deluge of information generated from various sources. In the past few years, the global marketplace has seen exponential growth in data volumes, created and consumed by a diverse cross-section of stakeholders. The term Big Data signifies large datasets in multiple formats, growing at an enormous rate and posing problems for traditional storage and analytical platforms.

Big Data is distinct from large existing data stored in various relational databases, as it warrants a more advanced mechanism for both storage and analysis. Technologies such as NoSQL databases and MapReduce/Hadoop frameworks are at the core of the solutions heralding a paradigm shift. Big Data is thus characterised by three attributes of data: volume, variety and the velocity at which it is generated.

Traditional analytics on transactional or structured data have helped data-driven organisations gain insights from various enterprise data. As data from weblogs, social media posts, sensors, images, emails, audio and video files emerge as sources of insights, it presents a huge competitive opportunity for businesses.
The need to derive predictive and actionable insights from this data for improved business operations and better decision making is what drives Big Data analytics.

The data being generated globally is undergoing exponential growth

Data volume is the primary characteristic of Big Data. With data becoming an indispensable part of every economy, industry, organisation, business function and individual, it is being actively captured by organisations to better understand their customers, suppliers, partners and operations. Large datasets yield more information and hence improved analysis compared to limited records of data, leading to better competitive advantage and business operations.

This data is being generated at a rapid pace: around 2.5 billion GB of data is generated every day, and more than 90 per cent of the data available today has been created in the past 3-4 years. According to IDC, data generated globally is expected to witness a 41.0 per cent CAGR between 2009 and 2020 to reach 35.0 Zettabytes.

Moreover, the technological landscape has changed with innovation in both managing and storing large data. As organisations move away from traditional data storage systems such as file systems and databases to newer technologies such as cloud-based storage and open source software, data storage and management costs are seeing a downward trend. According to IDC, storage costs have plummeted from USD 18.9/gigabyte in 2005 to USD 1.6/gigabyte in 2011, and are expected to further decline to USD 0.7/gigabyte by 2015. Apart from storage costs, the evolution of several open source analytical tools and platforms has made data analytics flexible, reliable and relatively affordable for Big Data.

Today 80 per cent of data existing in any enterprise is unstructured data

Organisations worldwide are increasingly realising that unstructured data, if analysed, can provide a competitive edge.
While structured data is transactional and can be stored in rows and columns with an identifiable structure, unstructured data such as audio, video and social media messages is raw or semi-structured. This data is generated in several forms such as web clicks, emails, phone conversations, weather data, audio and video files, location coordinates and pictures. Moreover, unstructured data is highly dynamic and does not have a particular format, i.e., it may be in different languages, have several terminologies, and may exist in the form of X-ray sheets, voice mails, digital photographs, or phone conversations.

Organisations are overwhelmed by the volume of unstructured data and are looking at ways to manage and analyse it in a systematic manner. As a result, one of the key focus areas for organisations wanting to leverage Big Data is to handle unstructured data and adopt new technologies to deal with it. It is imperative to develop technologies that can enable storage of such huge data as well as maintain transactional consistency between structured and unstructured data. Newer technologies such as NoSQL databases to store unstructured data and processing methods such as Hadoop and massively parallel processing are gaining prominence in the area of Big Data and Big Data analytics.

Increased data velocity enables real-time use of Big Data

The proliferation of the internet and the mobile era has increased the rate at which data is created and stored; hence, there is a need for tools and technologies to analyse data at an equal speed. The shelf-life of data has dropped from months to hours and seconds. The ubiquitous nature of the internet, coupled with massive computing power and accessibility, has transformed data processing from an auxiliary function into an essential mechanism that enables organisations to transform their businesses.
Big Data service providers are increasingly leveraging technologies such as stream processing and in-memory computing, which mitigate the shortcomings of batch processing and enable faster storage and data processing. Earlier, these technologies were popular in verticals considered more critical, such as the financial and government sectors. However, as the criticality of analysing data in real time emerges, several other industries are also adopting solutions based on these technologies.

Social media analytics, sentiment analysis and behavioural analysis are the upcoming Big Data analytics services

Big Data analytics is the process of applying advanced analytical techniques to large datasets to uncover hidden patterns, unknown correlations and other useful information. Big Data analytics helps businesses:
Take better business decisions: The most important objective of Big Data analytics is to help organisations make better business decisions, taking into account all the available information. This is achieved by analysing large volumes of structured and unstructured data from sources that are left unutilised by conventional business intelligence solutions
Predict and identify change: Big Data analytics helps organisations closely monitor their ecosystem, discover what has changed, and decide how they should react.
It also enables them to predict change, which is crucial given the current competitive business environment
Identify new opportunities: Advanced Big Data analytics is an effective way to discover new opportunities such as new business segments, best suppliers, associate products of affinity and sales seasonality

The evolution of advanced analytical techniques such as machine learning, predictive analytics, data mining, statistical analysis, artificial intelligence and natural language processing has enabled organisations to generate insights across all aspects of their businesses. Organisations are now able to analyse complete datasets, including unstructured data, instead of smaller samples, resulting in better outcomes.

New visualisation tools and techniques are helping data scientists and business users understand Big Data and make decisions based on it. Visual tools for generating insights have also evolved from simple graphs, PowerPoint presentations and dashboards to heat maps, cluster analysis and real-time advanced dashboards. Some of the widely used Big Data visualisation tools are:
Tag cloud: A weighted visual list where words that appear most frequently are larger and words that appear less frequently are smaller
Clustergram: Used to visualise how clusters are formed and how cluster members are assigned to clusters as the number of clusters increases
Heat map: A graphical representation of data where the individual values contained in a matrix are represented as colours
Dashboard: A real-time graphical presentation of data analysis
History flow: Charts the evolution of a document as it is edited by multiple contributing authors

Big Data analytics is the application of advanced techniques on big datasets to answer questions previously considered beyond reach

Big Data analytics is an evolving and multifaceted area for analytics players.
The key differentiating factors between traditional analytics, advanced analytics and Big Data analytics are:
Big Data analytics differs from advanced analytics in terms of different data formats and structures, and new application requirements for Big Data
While traditional analytics performs rear-view analysis on structured data, advanced analytics and Big Data analytics provide a progressive view, enabling organisations to anticipate and deal with future opportunities, i.e., Big Data analytics has a definitive predictive end-result in its use
Big Data analytics has enabled cross-channel analytics and real-time insights at greater speed, access and collaboration. For example, detecting a consumer's emotions when a competitor is mentioned on a call, or converting a service call into an opportunity by leveraging Big Data analytics, is more relevant in real time than after the interaction ends.

Big Data management, analytics, IT services and applications are the key constituents of the Big Data ecosystem

The Big Data ecosystem includes multiple elements, from the data that is analysed to the IT infrastructure that supports it and the applications that enable its analysis and usage. Elements of Big Data include:
Data management refers to systems where the data resides. It comprises the legacy systems as well as Hadoop-based systems and NoSQL databases. Legacy systems include databases that store and manage structured data, i.e., RDBMS to store and analyse structured data, and MPP systems to scale up for large structured datasets. Hadoop is an open source software framework to support applications that enable analysis of petabyte and xetabyte-sized data. Given Hadoop's popularity and wide adoption, several other open-source projects have become associated with it, adding new functionality and enterprise-ready features to make it a compelling enterprise solution.
These sub-projects include Hadoop Distributed File System (HDFS), HBase, Hive, Mahout, Pig, ZooKeeper, Avro, Cassandra, and Chukwa. Once Big Data is collected and processed, it becomes operational data, i.e., it represents Big Data outcomes or serves as input data for analytics.
Big Data analytics includes the technologies and tools to analyse the operational data and generate insight from it. After the data is analysed, it becomes available for business users through various visualisation techniques.
Data consumption involves enabling the Big Data insights to work in Business Intelligence (BI) and end-user applications
IT services enable integration of the Big Data framework with the traditional business intelligence infrastructure

Traditional storage architectures limit the potential of Big Data, thereby compelling businesses to move to a new data foundation

The traditional analytics technology stack has evolved into the Big Data analytics technology stack. The inability of traditional BI applications to process unstructured datasets makes them less relevant in the Big Data analytics space.

Big Data management, infrastructure and storage systems: Growth in Big Data has led to significant infrastructure requirements to support the distributed processing of unstructured data analytics. Unlike traditional relational databases, which are structured, normalised, and densely populated, the Big Data technology stack mainly comprises Hadoop architecture that has a distributed file system, analytics and data storage platforms, and an application layer that manages distributed processing, parallel computation, workflow and configuration management for unstructured data. Other than Hadoop, there are non-relational databases such as NoSQL databases and MPP systems that are scalable, network-oriented, semi-structured, and sparsely populated.
This layer also comprises servers, networks, and storage used for scale-out deployment of Big Data technology. With the emergence of Big Data, traditional RDBMS, MPP and DW are transitioning into a new role of supporting Big Data management by processing structured datasets as outputs of Hadoop or MapReduce technologies and then as input for BI software and analytical applications.

Big Data analytics: While traditional analytics primarily catered to structured or row/column-based data, Big Data analytics enables analytical processing of multi-structured data for text analytics, predictive modelling, and social media analytics, using techniques such as MapReduce and in-database analytical functions. Moreover, traditional analytics leveraged basic visualisation techniques such as charts and graphs to communicate analysis to business users, while Big Data analytics uses new visualisation tools such as real-time dashboards, heat maps and tag clouds.

Key players across the traditional and Big Data technology stack

As Big Data technologies become mainstream, the vendor landscape is evolving rapidly. Data management includes vendors of Hadoop-based solutions, other MapReduce technology suppliers as well as cloud and datacentre providers. The increased demand for Big Data analytics has changed the competitive landscape for the Big Data analytics service providers. In addition to the incumbent IT/BPO/Knowledge service players, there are now more pure-play analytics players, some of whom provide sector-specific analytics solutions. Some of the larger organisations have set up captives, which provide data analytics solutions to the other divisions and subsidiaries of those organisations. Even the breadth of the services provided by analytics organisations has substantially increased from data storage and management to delivering real-time insights and end-to-end data analytics services.
Big Data management and storage: Many new organisations have emerged as providers of Apache open source Hadoop distributions, with various levels of proprietary customisation for data management. Cloudera and Hortonworks are the major players for Hadoop distributions. While Cloudera contributes significantly to Apache HBase, the Hadoop-based non-relational database that enables low latency, Hortonworks mainly offers next-generation MapReduce architecture. Other pure players include MapR, Hadapt and Zettaset. Moreover, mega IT vendors have also entered the Big Data market through acquisitions. The Big Data warehouse market is mainly led by four players: IBM Netezza, EMC Greenplum, HP Vertica and Teradata Aster Data. Non-Hadoop vendors are also significantly contributing to the Big Data market opportunity; Splunk, HPCC Systems and DataStax are some of the key players.

Big Data analytics: With the deluge of data, it has become pertinent to have applications and platforms that leverage the underlying Hadoop infrastructure for data analytics. Some of the key players in this segment are: Karmasphere, which offers an analytical development platform to perform ad-hoc queries on Hadoop-based data via an SQL interface; Datameer, which provides a Hadoop-based business intelligence platform that leverages a spreadsheet-like interface to analyse data; and service providers such as QlikView, Revolution Analytics, Informatica, 1010data, and ClickFox, which offer cloud-based Big Data applications and services.

Big Data use: Big Data analytics engages with large datasets which may be difficult to understand for business users. A number of organisations such as Amazon Web Services, Google, and Intellicus are launching new user applications which facilitate the usage of Big Data analytics.
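The MapReduce model that underlies the Hadoop distributions discussed above can be illustrated with a small word count, which is also the calculation behind a tag cloud's word weights. The sketch below is a single-process Python imitation of the idea, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample posts are hypothetical and chosen only for illustration. In a real cluster the map and reduce steps would run in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values for each key to get one count per word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample input standing in for social media posts.
posts = ["big data is big", "data drives insight"]
counts = word_count(posts)
print(counts)
```

Because each document is mapped independently and each word is reduced independently, both steps can be spread over many machines, which is what makes the approach suitable for very large datasets.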
Additionally, the landscape for Big Data IT services is growing exponentially, with established service providers such as Oracle, IBM and CSC building their Big Data service portfolio. Moreover, Indian IT/BPO players such as TCS, Infosys and Wipro are also bolstering their capabilities in Big Data-specific software development and implementation.

Big Data enables better customer segmentation, improved productivity and fraud detection across all industry sectors

As organisations adjust to the rapidly changing digital lifestyle of consumers worldwide, they are beginning to discover the importance of understanding and envisaging the impact of information generated from non-traditional sources such as blogs, Facebook posts, tweets, emails, smartphone applications, electronic sensors, images and YouTube videos. Big Data not only helps organisations gain a multi-dimensional view of their ecosystem, but also generates powerful insights that can help them better execute their operations and take well-informed decisions.

Big Data is increasingly being leveraged through advanced data analytics tools and techniques to provide organisations with a better understanding of their customers, competitors, operations, suppliers and partners. High performance analytics, which previously took days or weeks to perform, can now be undertaken in seconds, minutes or hours through Big Data technologies.

The public and private sectors are adopting Big Data analytics on a large scale to generate strategic insights, improve their product/service strategy and operational efficiency, and gain a deeper understanding of their customers, competitors and suppliers. Big Data analytics is enabling them to predict trends in near real time, make more accurate forecasts and adjust their operations quickly to changing demand or new business opportunities.

Public sector: Big Data can be of immense use in the public/development sectors.
It enables government departments and developmental organisations to analyse large amounts of data across populations and to provide better governance and service. Big Data analytics can help them to improve transparency, enhance decision making, and adopt innovative practices in healthcare, public administration, defence, disaster management, transportation and energy. For example, Big Data has emerged as a new focal point for the US Government, which announced a USD 200 million Big Data Research and Development Initiative in March 2012.

Financial services: Big Data analytics can enable financial institutions to make better trading and risk decisions, protect themselves from frauds and security threats, and improve their products through better customer identification and marketing campaigns. Further, Big Data analytics is moving investment banks away from relying on overnight batch data to make trading decisions. It has improved risk decisions by leveraging real-time analysis of current data rather than risk management models based on historical data. For example, CITIC Bank Credit Card Center used Big Data technology to identify customers unlikely to activate their credit card services, and direct marketing incentives to those most likely to activate, thereby improving the effectiveness of the marketing campaign by 65 per cent, while Westpac New Zealand used Big Data technology to analyse social media data to gain real-time insights into the bank's brand health and its product performance across different geographies by correlating specific branch performance to customers' social data.

Healthcare: The surge in volumes of clinical data on medication, allergies, and procedures owing to the implementation of electronic health records has led healthcare organisations to seek opportunities to predict and react more rapidly to critical clinical events, resulting in better care for patients and more effective cost management.
For example, several of the United States' largest integrated delivery networks such as Cleveland Clinic, MedStar, University Hospitals, St. Joseph Health System, Catholic Health Partners and Summa Health System use the Big Data platform for real-time exploration, performance and predictive analytics of clinical data.

Manufacturing: Organisations are increasingly leveraging Big Data and finding new opportunities to predict maintenance problems, enhance manufacturing quality and reduce costs. For example, Volvo leverages Big Data to analyse information received from its vehicles, customer relationship management systems, and product development and design systems, to identify, in advance, potential issues such as manufacturing and mechanical problems and proactively resolve them by adjusting its manufacturing process.

Telecommunications: Organisations in the telecom industry are increasingly relying on real-time analysis of data generated by mobile devices, including phone calls, text messages, applications, and web browsing, for better customer service and to build on retention and loyalty. For instance, Nokia collects a huge amount of unstructured data from phones in use, services, log files and other sources and uses it to gain insights and understand the collective behaviour of consumers to improve the quality of its phones and their features, while Cablecom deploys Big Data analytics to identify when a customer is most likely to make a decision to leave its network and offers special deals and incentives to retain the customer at the right time.

Retail: With large amounts of data being generated from the point-of-sale at stores, online transactions, and social media posts, Big Data offers numerous opportunities to retailers to improve marketing, merchandising, operations and supply chain, and to develop new business models.
Retailers are deploying Big Data analytics to improve the accuracy of forecasts, anticipate changes in demand and react accordingly. For example, the use of Big Data analytics led to significant growth in the number of active members of Sears' loyalty programme (membership crossed 80 million customers).

Other industries: Big Data can also be used in other industries. Data-intensive verticals such as utilities, oil & gas, and transportation, where data is generated through smart meters, GPS systems, and satellites, are gradually using Big Data analytics to make real-time predictions of their operations.

Social gaming, mobile applications and internet search portals are key end-user applications leveraging Big Data analytics

As adoption of Big Data analytics by enterprises is gaining traction, players are also gearing up towards mainstream adoption, i.e., B2C applications. Many Big Data players are solving difficult problems for consumers by providing Big Data applications on PCs, smartphones, tablets and other web-enabled devices. Consumers are using Big Data analytics for everyday chores such as locating vacant parking spaces more effectively, and for real-time comparison of prices. With new applications coming into play every day, the B2C market for Big Data is likely to replicate the success of current mobile applications in the coming years.

While innovation is taking place in Big Data technologies, success would be determined by mass adoption and a large number of businesses getting valuable insights through the new and compelling end-user applications that allow regular business users or customers to quickly derive practical and actionable insights.
Global Perspective on Big Data

North America drives the Big Data opportunity with over 55 per cent of the world's data

North America and Europe, the two major data hubs of the world, account for a substantial portion of the global demand potential for Big Data analytics. Big Data service providers and leading IT players have significantly ramped up their capabilities in these developed regions that embraced the concept of Big Data, particularly in data-intensive industries such as digital media, manufacturing, healthcare, retail and financial services.

While North America and Europe are poised to drive the growth of Big Data for the next 2-3 years, developing economies such as India and China are expected to catch up soon, riding high on the rapid expansion of multimedia content, increasing popularity of social media and proliferation of mobile devices. Further, while developed economies are likely to continue to be the major Big Data contributors in terms of revenue opportunity, emerging economies, particularly India, are all set to emerge as the preferred Big Data analytics and associated IT service providers.

Global Big Data market is estimated at ~USD 8.0 billion in 2012

Though still in an embryonic stage, with large firms piloting Big Data implementation, the industry is witnessing exponential growth and market penetration. Statistics suggest that the industry is poised to grow by more than 50 per cent in 2012 to approximately USD 8.0 billion from USD 5.0 billion in 2011. Tremendous opportunities have mushroomed for players across the technology spectrum (hardware and software applications providers, systems integrators, technology consultants and analytics service providers), with a large number of organisations implementing Big Data technologies. The IT-BPO industry is expected to account for about 36-38 per cent of the market opportunity, followed by applications software at approximately 26-28 per cent.
The market is further expected to experience a high penetration rate, with investments expanding beyond Silicon Valley leaders such as eBay, Amazon, Yahoo and Google, the organisations that initiated the Big Data revolution, to industry verticals such as manufacturing, financial services, healthcare and retail.

Emergence of niche start-ups and technological developments fostering growth in the Big Data industry

New database architectures and innovative analytics tools and techniques to facilitate Big Data implementations

The key stimulus for Big Data implementation is the innovation in database architectures and analytical tools. Technologies are emerging in the areas of:

- Data storage and management (architectures): A number of database architectures and systems such as Hadoop, NoSQL database systems, and MPP systems have emerged, enabling easy storage and analysis of high-volume unstructured data and improving scalability and fault tolerance. These systems perform data management functions much faster through distributed processing and rapid parallel computations on large clusters of computer nodes.
- Data storage, advanced analytics, and data processing: The need for faster data access, storage and analysis has led to the development of in-memory databases such as SAP HANA and Terracotta's BigMemory, which store data in a computer's memory, as opposed to disk-based database systems, thereby enabling faster data processing and low-latency, real-time analytical queries. In-memory databases particularly help in algorithmic trading, e-Commerce and social media analytics, where datasets are large and real-time analysis is required. Moreover, analytics tools such as Kognitio, SAP HANA, and SAS analytics server enable rapid computing and real-time analysis by reducing response time and providing a flexible and agile analytical environment through massively parallel processing of queries.
- Advanced visualisation: Tools and techniques such as tag clouds, real-time dashboards, and heat maps enable the representation of multi-dimensional data, enhancing the quality of analysis and insight by facilitating rapid and accurate observations. Unlike traditional visualisation tools, these new techniques facilitate an integrated display of performance metrics updated in real-time, enabling users to quickly visualise complex data and get faster insights.

Emergence of niche Big Data start-ups to boost technological innovation

Tools and technologies required to manage and analyse Big Data present a growth opportunity for start-ups to innovate and come up with new products. New organisations across the Big Data technology stack have been thriving on the back of the robust investments anticipated in the Big Data space. The centrepiece of Big Data technology innovation, the Hadoop distribution, has been put to commercial use by many start-ups such as Cloudera, Hortonworks, Zettaset, and MapR, with some customisation of the open source software. Furthermore, the business environment is witnessing a slew of start-ups in non-Hadoop systems, such as NoSQL and next-generation (MPP) data warehousing, including Couchbase, Splunk and VoltDB. The industry also has many start-ups emerging in analytics platforms and cloud-based applications as well as in the advanced data visualisation space. While the past 2-3 years have mainly seen new organisations coming up in the data management space, analytics applications are expected to provide the impetus for growth in the next few years. Some of the start-ups in this field include Karmasphere, Kognitio, 1010Data, Revolution Analytics and QlikView.

The Big Data technology space is witnessing a lot of venture capital activity, with funding in Big Data start-ups reaching ~USD 2.5 billion in 2011, compared with ~USD 1.5 billion in 2010.
These start-ups are innovation hubs that are gaining importance across industry verticals. Most of these organisations are witnessing high double-digit revenue growth driven by the huge demand for their solutions. Moreover, many start-ups are being acquired by larger IT players given the growth opportunities and the need to build Big Data capabilities. For instance, IBM has acquired Tealeaf Technology, Vivisimo and Varicent; Teradata acquired eCircle; and EMC acquired Greenplum.

Large IT players leveraging M&As to add Big Data capabilities to their service portfolios

The Big Data space is witnessing a string of M&As driven by the need to ramp up capabilities quickly and to have a complete set of capabilities to service clients who are keen on Big Data implementation. Leading technology players such as Oracle, IBM, SAP, and EMC are aggressively acquiring smaller Independent Software Vendors (ISVs) and data analytics firms to strengthen their Big Data portfolios. IBM is at the forefront of this phenomenon, with multiple acquisitions in the Big Data space over 2010-12. To bolster its Big Data capabilities, it acquired Vivisimo and Tealeaf Technology in 2012, i2 Limited in 2011, and Coremetrics and Netezza Corporation in 2010. Further, HP acquired Autonomy for more than USD 10 billion, making it the largest deal in the Big Data industry. HP aims to cater to the Big Data market by leveraging Autonomy's pattern matching technology, which recognises and processes Big Data.

Emergence of cloud-based development and deployment for Big Data solutions

As data is increasingly becoming unstructured, complex and varied, it has become imperative to process and analyse it in real-time.
New data-centric solutions such as Database Platform-as-a-Service (PaaS), on-demand database services, analytics Software-as-a-Service (SaaS), and on-demand data preparation, storage or enrichment through Data-as-a-Service (DaaS) are now commercially available. These Big Data cloud solutions enable traditional enterprises to scale up their data management and storage at lower costs and provide them with real-time insights into data that could not be stored before.

While the existing SaaS application service providers are working towards product/service differentiation to ensure that customers derive more value from their applications, new pure-play service providers are launching Big Data-specific cloud applications and services. For example, Google, Amazon Web Services and Microsoft have enhanced their cloud offerings to offer PaaS and analytics SaaS for Big Data. Leading technology players are also launching Big Data cloud solutions: in June 2012, CSC launched its DaaS offering ClimateEdge, a suite of reports that leverages data from NASA, the National Oceanic and Atmospheric Administration (NOAA) and other government sources and uses on-demand advanced analytics to manage climate-related risk and exposure. New players such as 1010Data and Kognitio are also offering their cloud-based Big Data solutions to their customers, enabling them to analyse Big Data on-demand.

However, the adoption of Big Data through cloud applications may face a few roadblocks in terms of data privacy and security concerns. For example, regulations such as the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, which protects the privacy of shared patient data, may inhibit the adoption of on-demand Big Data analytics.
Potential shortfall of 1.5 million data-savvy managers and ~150,000 data scientists in the US in 2018

The Big Data phenomenon has led to an increasing demand for data scientists: professionals conversant with both the business context and data analytics, who play a crucial role in extracting insights from large datasets, analysing these and then presenting the value-added information to business users or non-data experts. Big Data needs a new breed of professionals with deep expertise in statistics and machine learning, as well as managers and analysts who can leverage insights from Big Data. The shortage of such talent is a significant challenge that organisations need to address for successful Big Data implementation. According to McKinsey, the US alone faces a shortage of 140,000-190,000 analysts and 1.5 million managers who can analyse Big Data.

To address the shortage, organisations have embarked on initiatives to train their existing employees and develop new talent. Organisations such as EMC, Oracle and IBM are partnering with universities to offer courses on various elements of Big Data. Internally, enterprises are creating organisational cultures that are favourable to data-driven decisions by hiring employees from academic fields such as statistics and mathematics, as well as through on-the-job training on emerging technologies in the Big Data space.

Slow enterprise adoption due to lack of awareness about benefits of Big Data

While there is a lot of attention on Big Data and organisations worldwide have started investing in it, adoption by traditional enterprises has been slower than expected. This is partly due to difficulties in understanding the Big Data paradigm, how to integrate it with legacy systems and how to extract business value from it.
Industry studies show that a majority of respondents, mainly senior executives from diverse industry verticals the world over, acknowledge that Big Data holds significant business opportunities; however, there is a lack of understanding about how data can be used to drive businesses forward. Further, whether investment in Big Data implementation will achieve a high RoI is also a major concern. Given the gap in understanding the benefits and opportunities of Big Data, many enterprises are less inclined to give it high priority for immediate investment. However, the market appears receptive, as most of the leading organisations across industry verticals are willing to integrate Big Data into their existing systems and are running pilot projects to test their success.

The value offered by Big Data is not yet beyond doubt, as sceptics are still questioning whether it is worth all the investment being poured into it. This is in part due to the lack of abundant and well-publicised business cases on successful implementation and the benefits accrued. Therefore, as executives lack an understanding of Big Data, and in some cases do not sponsor it, IT organisations may face additional complexities in the form of budget and bandwidth constraints when implementing Big Data.

Data related regulations like Dodd-Frank and Basel III to impact Big Data implementations

An increasing number of regulations are driving organisations to source, analyse and report large amounts of data. Regulations such as Dodd-Frank, Basel III and HITECH mandate greater transparency and real-time reporting of data collected from multiple systems and sources, as well as its aggregation, analysis and storage. Consequently, organisations in various industry verticals are leveraging Big Data analytics to comply and provide more transparency. This has prompted data management, storage and analysis to become more comprehensive and real-time.
While regulations in industry verticals are driving Big Data adoption, regulations such as the EU Data Protection Directive may hold back the adoption of Big Data analytics, particularly in cloud-based delivery models. Further, with businesses collecting and storing large amounts of customer data, privacy-related concerns have also increased. Some countries have already enacted legislation to protect the privacy of individuals and many are at different stages of formulating it. Therefore, businesses will also have to consider certain regulatory aspects as they move towards leveraging Big Data analytics using stored customer data.

India's Advantage in the Big Data Opportunity

India's Big Data market opportunity estimated at ~USD 200 million in 2012

India is rising to play an important role as a key outsourcing destination in the global Big Data landscape for services relating to Big Data technology implementation and analytics, capitalising on its well-established IT/BPO and knowledge service outsourcing industry, which offers significant cost and intellectual arbitrage to global multinationals. India's domestic demand for Big Data analytics is at a nascent stage, since most Indian organisations still consider Big Data mere hype. The opportunity for Indian service providers arises from offering Big Data technology implementation and analytics outsourcing services, a market that is growing robustly.

In 2011, India's Big Data outsourcing opportunity was estimated by CRISIL GR&A to be around USD 90 million and is projected to grow by ~110-115 per cent in 2012 to USD 200-205 million. The IT services segment, which primarily comprises Big Data technology implementation, including data collection, integration, and the design of Big Data architecture and data analytical tools, is expected to account for 82-84 per cent of this growth projection, while Big Data analytics services are likely to account for 16-18 per cent.
Although an immense amount of data is being generated across all industry verticals, including financial services, manufacturing, retail, healthcare, telecom and logistics, financial services and telecom are the early adopters of Big Data technologies.

Key factors pushing organisations to adopt Big Data analytics include the large volumes of data being generated across global organisations as a result of the increasing use of the Internet, mobile, social media marketing and Machine-to-Machine (M2M) conversations, and the need to utilise this data to derive meaningful insights that help organisations make well-informed decisions.

Global In-house centres, pure-play analytics firms and IT/BPO players expected to benefit from the Big Data opportunity

The Big Data outsourcing market, though still at an embryonic stage, is being tapped aggressively by global in-house centres (captive centres of multinationals) as well as Indian service providers comprising IT/BPO players, pure-play analytics firms and knowledge service providers.

Global In-house Centres: Global multinationals have set up these centres across India to offer support on various back-end processes such as accounting, HR, and payroll, as well as to offer an offshore base for knowledge services such as business research, financial research, data management and analytics, and legal services. With growing interest in Big Data, organisations are leveraging their already established in-house centres for Big Data technology implementation as well as to handle large volumes of unstructured data to provide business intelligence and data analytics solutions. Global in-house centres have been successfully leveraged to unleash the power of Big Data as they enable seamless sharing of data, given that they are a business unit/division of the parent organisation.
This is because there are no data security/privacy issues and there is a high level of data integration with the parent. Further, the management enjoys tighter control over the data and applies analytics closely related to business needs, given that these centres have built-in domain knowledge. Some of the key players who have set up in-house centres to deliver Big Data analytics to their parent organisation are:

- Retailers such as Sears Holdings and Walmart
- IT/technology service providers such as Google, Yahoo, HP, SAP, Oracle, IBM and Dell
- Financial service organisations such as JPMorgan Chase, Merrill Lynch, HSBC, American Express, Goldman Sachs, Barclays, Bank of America, Citigroup and Wells Fargo

Pure-play Analytics Players: These primarily comprise Indian and global pure-play analytics firms as well as major knowledge service outsourcing providers who offer analytics and are now establishing their presence in the Big Data analytics field. Key pure-play analytics firms operating in the industry are: Bridge i2i, Nuevora, Mu Sigma, Cognilytics, Fractal and AbsolutData. Key knowledge services outsourcing players such as CRISIL GR&A, Ugam Solutions, and SmartCube are increasingly taking interest in expanding their analytics capabilities to harness the potential of Big Data. These service providers enjoy strong subject matter expertise, leverage the best practices in the industry to offer analytics services, and offer optimally priced services, given the economies of scale that come from serving various clients with Big Data analytics. These players face key challenges such as low levels of data integration with the clients, intellectual property and data security.
Integrated IT/BPO Providers: Several integrated IT/BPO players engaged in application development and management and in infrastructure management, as well as BPO players providing outsourcing services for back-end functions, have also entered the Big Data market and are moving from simpler business process services to providing Big Data implementation, tools, and technologies. To strengthen their presence in Big Data, these players leverage their global presence and their existing base of multinational clients looking at Big Data implementation, and utilise their strong technology orientation to provide Big Data tools and technologies. This business model mainly comprises two categories of players:

- IT-BPO providers such as Infosys, TCS, Wipro, and HCL. TCS and Infosys are helping their global multinational clients in designing and implementing Big Data technology
- Key BPO vendors such as Genpact, EXL, and WNS

Pure-play providers and integrated IT service providers are active in providing services in the Big Data environment

Global in-house centres to be the front-runners in Big Data servicing; but IT/Analytics players follow closely

Big Data analytics came into play globally in late 2011. In 2011, many multinationals were sceptical about Big Data implementation and were trying to quantify the Return on Investment (RoI) to build a case for it. The early adopters of Big Data analytics have tried to leverage their global in-house centres in India, given the talent shortage in the developed world, to generate meaningful insights from Big Data. The ease of seamlessly sharing data and information also prompted multinationals to leverage their analytics and knowledge centres in India to conduct Big Data analytics. Global multinationals across verticals such as financial services, retail, technology, and healthcare have started leveraging their Indian centres for Big Data implementation and analytics.
In 2012-13, the success of global in-house centres in the Big Data market is expected to catapult the emergence of a hybrid service model in which the in-house centres of global organisations would offer analytical services to external clients in addition to their internal business units. Further, pure-play analytics firms present in India are increasingly deploying advanced analytical tools and techniques on Big Data sets to gain significant business traction as more and more Big Data business opportunities move to India. Integrated IT/BPO service providers are building Big Data architecture and offering analytics services to their clients.

Some of the key initiatives taken by Indian service providers and global multinationals are:

- In 2012, Sears Holdings, the fourth largest retailer in the US, created a wholly-owned subsidiary, MetaScale, to target and sell its managed Hadoop services (or Big Data services) to customers with revenue of between USD 1.0 million and USD 10.0 million across healthcare and entertainment verticals
- Walmart expanded its e-Commerce operations to India by opening an @Walmartlabs facility in Bengaluru, India, in April 2012, to develop social media analytics and Big Data infrastructure
- In July 2012, Yahoo also set up a Grid Computing Lab at the IIT-Madras campus in partnership with the institute to enable researchers to access web-scale data and conduct research on Big Data issues such as search, personalisation and digital advertising
- Infosys aggressively focuses on offering major enablers for Big Data analytics adoption, including solutions, services, and expertise across key industry verticals such as financial services, manufacturing, healthcare, and telecom
- In 2012, TCS won Big Data contracts to deliver next-gen insights using Big Data frameworks for a global airline, a US-based bank and a global market research firm, as well as to set up a leading-edge distributed data warehouse for a hi-tech firm using Big Data
- BPO service providers such as Genpact and IBM Daksh are also being seen as strong contenders in the analytics domain and are well poised to capitalise on the Big Data trend

The Big Leap in Big Data is expected to come by 2014, when the stage of testing the waters will have been successfully crossed and Big Data pilot projects will have delivered profitable results or the expected RoI for clients. Once multinational organisations realise the potential opportunity offered by Big Data analytics, more and more organisations are expected to undertake Big Data implementation in a big way to strengthen their business and enhance profitability. All the players are expected to expand their operations to tap the growth in the market. Hence, the industry is expected to witness:

- The emergence of several new Big Data analytics firms to cash in on the growing Big Data opportunity. Further, these analytics firms and knowledge service players are expected to play a dominant role in the Big Data analytics space
- Integrated IT services providers offering services across the Big Data value chain, from implementation and consulting to analytical services
- Continued growth of global in-house centres, with more and more multinational organisations expected to leverage this business model and set up or expand their in-house centres for Big Data implementation

Service providers are leveraging partnerships, M&As and venture funding to capture the Big Data outsourcing opportunity in India

Major service providers across the country are undertaking several strategic initiatives to capitalise on the Big Data outsourcing opportunity. The industry is witnessing an increasing thrust on leveraging venture capital funding; collaboration for developing Big Data technologies and joint go-to-market; mergers and acquisitions to enhance capability across Big Data software and services; and expansion of overseas presence to capture the market.
- Venture Capital (VC) funding: In recent months, venture and growth capital firms have invested large amounts in Big Data organisations, primarily to enable these firms to strengthen their operations
- Partnerships with foreign players: Big Data service providers are entering into technology partnerships and collaborations to expand their capabilities to serve new markets and industry verticals
  - In August 2012, Intel built partnerships with India-based Independent Software Vendors (ISVs) across various business segments such as financial services, manufacturing, education, retail, telecom, and healthcare to foster its presence in the Big Data ecosystem in India
  - In July 2012, BPO players such as Infosys BPO announced plans to look for partners in the Big Data analytics field to strengthen their capabilities
- Strategic M&A to gain Big Data capabilities: The hype in the industry has led to the mushrooming of various smaller players offering Big Data services such as application development, system integration, consulting, storage and architecture design.
Established integrated IT/BPO service providers and pure-play analytics firms are aggressively acquiring niche players to broaden their capabilities
  - In June 2012, Wipro acquired Australia-based Promax Applications Group, a specialised trade promotion management firm, for USD 36.6 million to reinforce its presence in the Australian market and strengthen its capabilities in Big Data analytics solutions
- Geographic expansion: Indian organisations are also looking to expand their overseas presence to market their Big Data capabilities and capture the market opportunity
- Strengthening workforce: Various organisations are planning to collaborate with academia to train and certify data scientists to counter the impending shortage of data scientists, analysts, and managers that is likely to challenge the growth of the Big Data market
  - In August 2012, Intel announced plans to collaborate with educational institutions to bring innovation in data analytics and research, and has tied up with ~300 colleges and universities in India, including the IITs and other educational institutes such as Pune University, to foster research and innovation in Big Data analytics

India has an early mover advantage vis-à-vis other geographies in creating a strong base of Big Data workforce

India is expected to be a forerunner in Big Data talent supply, not as a cheaper alternative but as a go-to destination for the quality of talent in the country. India produces more than 2.5 million university graduates and about 750,000 post graduates every year, of which ~700,000 are graduates in Mathematics and Science and ~300,000 are post graduates in these fields. With the firepower of its intellectual pool in Mathematics and Science, India is geared up for the Big Data revolution. Further, with the ever-increasing number of students having domain expertise in decision sciences, India is well-positioned to address the global demand for Big Data solutions.
With India already catering to the business analytics needs of global multinationals at the best possible performance-to-cost ratio, the country has a huge potential to supply data scientists for the Big Data industry. Tier I cities such as NCR (Delhi, Gurgaon, and NOIDA), Bengaluru, and Mumbai have emerged as good breeding grounds in India for global organisations to set up their analytics centres of excellence, and they account for more than two-thirds of the analytics professionals in India. Further, more than 60 per cent of the analytical workforce in India has work experience of 3-10 years, which is a boon to Big Data analytics. These professionals have the ability to apply advanced analytics and can be trained internally by organisations to work on Big Data.

Indian academia is also aggressively developing capabilities to match the ever-growing demand for, and dearth in supply of, data scientists with analytical training, through concerted intervention at the education level and by imparting training on analytical and statistical tools. Premier colleges/universities in India already have courses in place to impart training in analytics.
Key analytics courses in India include:

- Business Analytics and Intelligence (BAI), IIM Bengaluru: An executive course, BAI requires at least five years of work experience and is suitable for professionals who are already working in analytics to enhance their knowledge as well as for those with an analytical aptitude
- Executive Programme in Business Analytics, IIM Calcutta: This is a one-year distance programme offered in association with Hughes Education, and covers topics such as data mining, soft computing, design of experiments, survey sampling, statistical inference, investment management, financial modelling, and advanced marketing research
- Advanced Certificate Programme in Business Analytics, IIT Bombay: Designed in partnership with HughesNet Global Education, it is a part-time course for analysts to develop the skills and competencies of key analytics techniques such as behaviour and data modelling
- Business Analytic & Data Mining, Indian Statistical Institute (ISI) Pune: Designed to guide business analytics professionals in analysing large quantities of data to study unknown interesting patterns through cluster analysis, dependencies (association rule mining), classification of data, and predictive analytics
- Post Graduate Certificate in Research and Analytics, MICA Ahmedabad: This is a one-year programme based on a practical and non-technical approach through various data analysis software

Indian universities continue to introduce courses in statistics and data analytics to produce graduates to meet the manpower shortage in the global Big Data market.
Recent academia initiatives for developing the talent pool for Big Data analytics include:

- In August 2012, Academy of Decision Science and Analytics started offering an e-learning Post Graduate Programme (PGP) course in data analytics in association with Ivory Education
- In July 2012, The Institute of Management Technology (IMT), Ghaziabad, signed an MoU with Genpact to develop and implement an analytics elective for the two-year post graduate diploma in management programme to provide both theoretical and practical work experience in analytics as applied in different industries
  - Pankaj Kulshreshtha, Senior Vice President and business leader, Smart Decision Services, Analytics and Research, Genpact, stated, "The emergence of big data, regulatory changes and social media are causing a big shift in the way businesses operate and students of IMT will learn how to combine process, analytics and technology to make organisations smarter in this dynamic new world. It is also a great example of two organisations, both leaders in their respective fields working together to build talent in an area which is expected to more than double in the next 2-3 years in India."
- In May 2012, IIM Lucknow partnered with the US-based Kelley School of Business to provide two certificate programmes in business analytics and global strategy
  - Dan Smith, Dean of the Kelley School, said, "Our collaborative goal is to fundamentally advance the quality of decision making by business leaders by improving their ability to draw meaningful insights from the massive amounts of data available to them today."
- In November 2011, Indian School of Business (ISB) Hyderabad launched Asia Analytics Lab for its students, which is a focal point for data analytics initiatives, education, research and business applications in the Asian context
- In 2011, the Indian Institute of Science (IISc) Bengaluru launched Master of Management, a two-year course to focus on training students in Technology Management and Business Analytics

Indian service providers are also investing heavily and innovating in creating and grooming a new breed of talent. For example, IBM has partnered with 500 universities in India to help more than 30,000 students develop skills in predictive analytics.

India is at an advantage vis-à-vis other geographies because, apart from the ample number of graduates it produces each year, organisations in India are also making huge investments in breeding and grooming such talent. Further, India retains advantages due to demographic factors and the fact that the education system is producing a huge pool of analytical talent.

Indian service providers offering Big Data solutions across verticals

1. Manufacturing: Indian service providers enable manufacturing organisations to analyse large datasets for effective decision making

The manufacturing sector generates large volumes of text, image and numerical data in its production processes, R&D and engineering functions. The sector generates data from a multitude of sources, including instrumented production machinery (process control), supply chain management systems, and performance monitoring systems. The large datasets thus aggregated are then subjected to different Big Data analytical tools and techniques to generate useful insights across the value chain. Hence, Big Data finds application across R&D, product design, supply chain management, production, marketing and sales, and after-sales service.
R&D and product design: The use of Big Data in R&D processes offers opportunities to accelerate product development, help designers focus on product features based on concrete customer inputs, and use designs that minimise production costs
- Aggregate customer data and make them available to improve service and enable design-to-value
- Source and share data through virtual collaboration sites (idea marketplaces to enable crowdsourcing)
- Build consistent, interoperable, cross-functional R&D and product design databases to enable rapid experimentation, simulation, and co-creation

Procurement: Manufacturing firms use Big Data analytics during the procurement process to drive efficiency in their supply chain and improve demand forecasting processes. Manufacturers deploy Big Data analytics to
- Gather sales, customer feedback, and demand patterns from distributors/retailers to rectify any deviation in real-time, thereby improving supply chain responsiveness
- Conduct a path analysis to design ways to move a product more effectively from the factory to the customer
- Automate stock optimisation and replenishment decisions based on the analysis of inventory-related data trends

Production: The deployment of the Internet of Things (actuators and sensors) also allows manufacturers to leverage real-time data from sensors to track parts, monitor machinery, and guide actual operations.
At the production stage, Big Data analytics is used in
- Digital factory simulations: Manufacturers take inputs from product development and historical production data and apply advanced computational methods to create a digital model of the production process, and thus design optimal production layouts and digital shop floor control
- Sensor-based operations: Firms apply Big Data analytics to the volumes of real-time, highly granular data gathered from sensors deployed across production lines to forecast operational costs, schedule predictive systems maintenance, monitor labour and equipment performance, and improve fault detection by identifying patterns that lead to potential equipment failure

Sales & Distribution: Manufacturing organisations track customer-related transaction data to generate actionable insights on customer buying patterns and behaviour, strengthen their marketing and sales strategies, and make informed product decisions. Analytics can be applied on this data to
- Ensure improved customer segmentation and better customer relationship management
- Improve product inventory tracking
- Enhance the effectiveness of the sales force and marketing campaigns

After-Sales Service: Warranty analytics and real-time analysis of sales and feedback data are the key Big Data applications being leveraged by manufacturing firms. These applications primarily involve analysing large volumes of warranty claims to improve product development, with the aim of improving product quality and reducing warranty costs. Further, after-sales and feedback data can help enhance after-sales service as well as detect and rectify manufacturing and design errors to enhance customer satisfaction
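As an illustration of the sensor-based operations use case above, in which readings that depart from a machine's recent pattern are flagged for inspection, the following is a minimal sketch and not a description of any provider's actual method. It uses a trailing-window z-score, one of the simplest possible anomaly checks. The function name `flag_anomalies`, the window size, the threshold and the vibration readings are all hypothetical and chosen only for demonstration.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from the trailing window.

    A reading is flagged when it lies more than `threshold` standard
    deviations away from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)  # holds only the last `window` readings
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Skip the check when the window is perfectly flat (sigma == 0).
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(i)
        recent.append(value)
    return flagged


# Hypothetical vibration readings from one machine (illustrative values only).
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 2.5, 1.0, 1.02]
print(flag_anomalies(vibration))
```

In practice a production system would work on streaming data from many sensors and use richer models, but the same idea applies: compare each new reading with recent history and flag large departures for maintenance review.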
Some of the key benefits delivered by Big Data analytics for the manufacturing sector include:

- Product demand forecasting and supply planning: Using real-time data from sales and demand patterns or from customer feedback and purchasing behaviours, manufacturers can rectify any deviation in real-time, engage in effective demand forecasting, adjust production levels, and increase the frequency of supply planning cycles to match production cycles
- Improved collaborative engineering through crowdsourcing: Manufacturers leverage crowdsourcing to collect product- and market-related data, enabling collaborative engineering that results in innovative design from customers. For example, auto manufacturing organisations encourage ideas from consumers to make improvements to new car models. Big Data analytics enables these organisations to gather and analyse data from tweets, blogs and other social media platforms effectively to offer innovative features in newer versions of the vehicles
- Mass customisation: By enabling design-to-value, Big Data analytics allows manufacturers to leverage quantitative customer insights mined from sources such as PoS, customer feedback from retail surveys, and social media platforms, to improve their output quantities as well as facilitate mass customisation
- Efficient planning and operations: Big Data aids in designing, simulating and testing product or factory plans virtually, before actual production or construction. Further, it is used to predict equipment failures and system replacements to better anticipate any roadblocks in the manufacturing processes

To capitalise on this huge opportunity, various Indian Big Data service providers such as Infosys, Intel, Fractal, and Wipro have built capabilities to win new clients as well as to better serve existing ones in the manufacturing sector.
- In 2012, Infosys was selected as the sole sourced partner for cloud strategy and Big Data infrastructure for a North American manufacturer, to devise a Big Data strategy and roadmap
- In August 2012, Intel announced the signing of partnerships with India-based ISVs across various business segments, including manufacturing, to build Big Data analytics capabilities across India

Case examples: Indian service providers serving global manufacturers on custom designed Big Data implementations and analytics

2. Retail: Indian service providers help retailers understand customer buying patterns and maintain optimal stock levels

Retailers generate Big Data through various sources such as social media, Point of Sale (PoS) and web/online sales platforms (credit cards and rewards cards, purchases), consumer surveys, loyalty programme profiles, in-store tools and footfalls. This customer-focused data can be used to gain significant and meaningful insights into consumer behaviour, buying patterns, and changing preferences.

Big Data analytics helps both online and brick and mortar retailers to improve their decision making; manage the supply chain, inventory levels, merchandising and pricing; and enhance focus on customer segmentation, and hence introduce targeted products/services as well as marketing/promotional campaigns. Further, Big Data allows retailers to enhance their margins and productivity by enabling them to perform real-time analysis of customer response to pricing/product changes/productivity and refine their strategies based on such analysis.

Some of the important areas within the retail industry where Big Data analytics is being used are:

Supply chain and procurement: Retailers use Big Data analytics to help them better manage their own and their suppliers' inventory levels, manage relationships with suppliers, and make informed decisions on stock levels.
For example, Barnes & Noble deployed a Big Data analytics solution from IBM to enable suppliers to monitor its inventory and take appropriate replenishment decisions. Big Data enables retailers to
- Improve inventory management, stocking decisions and stock forecasting by combining multiple datasets such as sales history, weather predictions and seasonal sales cycles
- Optimise transportation and vehicle routing by using GPS-enabled Big Data telematics to improve fleet and distribution management, and enhance productivity by rationalising fuel efficiency, preventive maintenance, driver behaviour, and vehicle routing
- Base their supplier negotiations for price discounts, and changes in raw material preferences, on analysis of customer preferences and buying behaviour data

Merchandising: Big Data implementation and analytics on PoS and RFID data can help retailers strengthen their merchandising-oriented decisions such as
- Assortment optimisation: Retailers make product assortment decisions in stores based on demographic and purchasing pattern data
- Price optimisation: Retail firms can apply advanced demand-elasticity models to the pricing and sales data available to decide the optimum pricing of products and services
- Placement and design optimisation: Brick and mortar retailers optimise the placement of goods and the visual design of their store layouts by mining sales data at the SKU level and even foot-traffic data, while online retailers adjust website placements based on page interaction data such as website traffic, scrolling, clicks, and mouse-overs

Operations: To create operational value and efficiency, retail firms are deploying Big Data implementation to
- Ensure performance transparency by analysing store sales, SKU sales, and sales per employee data
- Reduce costs while maintaining service levels by leveraging labour input, time and attendance data, and tracking labour scheduling information

Sales and marketing:
It is the most common business function for which retail firms use Big Data analytics. Key sales and marketing functions where Big Data implementation finds use are:
- Use customers' demographics, purchase history, preferences, and real-time location data for cross-selling and up-selling of goods
- Undertake location-based marketing for offering promotional discounts and special offers, primarily leveraging the personal data generated by smartphones
- Enable customer micro-segmentation to deliver personalisation of products/services to customers based on traditional market research data as well as data available from behavioural tracking
- Use sentiment analysis that leverages consumer data generated by social media platforms to make informed business decisions, such as assessing the real-time response to marketing campaigns
- Study in-store consumer behaviour to improve store layout, product mix, and shelf positioning by tracking shopping patterns, real-time location data from smartphone applications, and shopping cart transponders

Customer services: By applying Big Data analytics to customer behaviour, which can be tracked through service centres (IVR and call centres) and social media platforms, retailers can improve their interaction with customers for better service delivery

Big Data analytics has found significant acceptance in the retail sector, especially among the leading players. Walmart acquired social media firm Kosmix to create WalmartLabs and is using this specialist R&D unit to redesign its business by merging social, mobile and retail data to understand consumers' buying habits. Further, in April 2012, Walmart expanded its e-Commerce operations to India and opened the @WalmartLabs facility in Bengaluru, India, to develop social media analytics and Big Data infrastructure.
Other retailers such as Sears utilise their in-house IT/technology centres in India to provide Big Data analytics to set product prices in real-time and move inventories. Sears also has a subsidiary, MetaScale, which helps other organisations in industries such as energy and healthcare implement Hadoop.

Big Data-driven analytics holds much potential for retailers in the realm of customer intelligence. This includes:
- The ability to profile and segment customers based on socioeconomic characteristics, which allows firms to market to different segments based on their discrete preferences and hence achieve better customer retention rates
- Online social network analysis, which enables businesses to monitor consumer sentiment towards their brands, react to trends as they develop, and identify influential individuals within networks for direct marketing
- Predictive models for customer behaviour and purchase patterns, which facilitate the accurate appraisal of each Customer's Lifetime Value (CLV) to a firm, allowing resources to be allocated towards acquiring and retaining profitable clients, thereby raising overall profitability

Sears is leveraging Big Data analytics to turn itself around, and is also keen on offering analytics services to external clients

3. Financial services: Witnessing increased adoption of Big Data analytics, to reduce risk and uncover new market opportunities

Financial services is considered to be a very data-intensive sector, with more data per million of revenue/operating expenditure or per employee than almost all other sectors. Within the sector, structured and unstructured data is available from a variety of sources such as customer and transaction data from various channels such as branch, kiosks, mobile and web; social media; emails; credit card data; insurance claims data; stock market data; statistical data; PDF and Excel files; news; videos; and government filings.
With the industry facing a multitude of challenges such as higher customer expectations, an uncertain operating environment, strict regulations, stiff competition, and slowing economic growth, Big Data analytics can help banks, capital markets and insurance organisations by providing tools to reduce costs and improve productivity. Increasing regulatory compliance requirements and the need to collect and standardise every piece of data are driving the growth of Big Data analytics.

Several areas within the financial services sector are expected to gain from Big Data technologies. They include:

Banking

Credit reward programme analysis: Banks are increasingly using unstructured data to understand customer profiles and introduce successful credit cards with innovative rewards programmes
- For example: A national bank used a Big Data solution to analyse data from sources such as call centres, customer service emails, and social media conversations to create a credit card offering with a rewards programme to attract a young, professional demographic. This helped in providing information to the marketing department to create a targeted promotion campaign, including strategically placed social messaging and monitoring

Capital Markets

Trading surveillance: The financial sector leverages Big Data to monitor trading activities and identify abnormal trading patterns. In surveillance, Big Data analytics allows online access to trade-by-trade history for investigation, trending, and discovery to be combined with real-time data, providing both a real-time and a historical context to behaviour
- For example: Organisations combine data about the parties that participate in a trade with the complex data that describes relationships among those parties and how they interact with one another.
The combination allows the bank to recognise unusual trading activity and to flag it for review

Insurance

Insurance organisations are increasingly using unstructured data to predict client longevity, along with examining prospective clients' medical status by analysing their general comments, visits to particular websites, and enquiries about specific products. They also use weather and calamity information to manage claims exposure and losses, based on unstructured data from weather measurements and soil observations.
- E.g. An insurance organisation sells Total Weather Insurance, which pays local farmers when they are impacted by weather events that affect their profits. The organisation uses a cloud-driven Big Data analytics service to predict the possibility of extreme weather, along with the potential impact. It builds and prices its insurance policies accordingly, based on 2.5 million daily weather measurements, 150 billion soil observations, and 10 trillion scenario data points

Big Data is being extensively used across all domains of financial services for risk management, fraud detection, compliance and customer relationship management:

Risk management: Predictive modelling of customer behaviour and scoring techniques enable financial sector organisations to assess and minimise default risks at an individual level and make customised offerings in line with the customer's risk profile
- E.g. A large bank wanted to use 12 years of monthly account-level credit card data, credit bureau information and bank account information to better assess the risk before granting loans or raising credit limits. Ideally, it wanted this information in real time.
To speed the computing, it used an in-database Big Data approach, which helped the bank to calculate risk 70 times faster

Fraud detection: Big Data technologies give financial services organisations the ability to run exploratory modelling and discovery on data, thereby increasing the accuracy of fraud detection models. The faster processing capability enables organisations to quickly build or refresh fraud detection models, and also helps in detecting fraud in real-time by analysing and streaming transaction data

Compliance and regulatory reporting: Increased oversight and scrutiny of organisations' operations, funding and investment portfolios has led financial services organisations to adopt sophisticated Big Data technologies to store and process vast amounts of data to simplify and streamline their regulatory and compliance reporting
- For example: The Reserve Bank of India (RBI) has directed all Indian banks to standardise their regulatory reporting by following an Automated Data Flow (ADF) approach to ensure 100 per cent accuracy and zero human intervention in every stage of reporting, right from data extraction from source systems to the actual submission of returns. Firms that could not utilise complete information and firms that believed reporting did not really require management attention are increasingly focusing on Big Data analytics

Customer relationship management: Big Data analytics also helps financial services organisations in acquiring new customers and cross-selling their offerings to existing customers by using Big Data to identify the most profitable customers and run effective marketing campaigns. The large volume of unstructured data from social media is combined with CRM systems to study customer behaviour and optimise customer experience.
Apart from customer acquisition, organisations can improve customer retention by using predictive analytics to detect early signs of disengagement

Financial services organisations are gaining business advantage by mining and analysing Big Data to stay ahead of the competition, improve customer service, detect fraud, accurately calculate risks and maximise operational efficiencies, along with adhering to stringent regulations and compliance requirements.

Indian service providers are enabling Big Data analytics in the areas of fraud detection, client behaviour analysis, trading pattern analysis, risk calculation on large portfolios of loans, and improved and targeted marketing campaigns. Further, Indian financial sector organisations are increasingly favouring Big Data analytics to tackle terabytes of unstructured data:
- YES Bank is exploring solutions to handle the increasing pile of unstructured data from mobile devices and social media networks, as well as customer transactions such as cash withdrawals at branches and ATMs. The bank feels the regulatory requirement of storing internally generated data is driving banks to adopt Big Data

Case examples: Financial services firms are using Big Data to prevent fraud and better understand customer profiles

4. Telecom services: Telcos are using Big Data to boost marketing, reduce attrition rate and enhance network productivity

The telecommunication service industry is characterised by extremely high levels of competition. This has resulted in telecom organisations' shift in focus from simply reducing costs and increasing profitability to delivering value and managing customers' experience over their networks. Further, commoditisation of traditional voice-based services has led to declining Average Revenue per User (ARPU) and margins.
So it has become important for telecom service providers to differentiate themselves by providing innovative and high quality services, while avoiding network overload and cost overruns.

The telecom industry generates large volumes of real-time data, including customer call logs, billing and usage data, as well as data from networks and routers, access points, mobile devices, and social media platforms. This presents a huge opportunity for telecom players to leverage Big Data analytics to derive meaningful insights, gain better control of services and make effective operating and investment decisions.

Following are some areas where Big Data analytics can play a significant part:

Network planning and optimisation: Big Data implementation can help operators to efficiently plan and predict network growth based on past capacity utilisation, marketing demand forecasts and service consumption trends, and implement network changes just ahead of the demand curve. It can also help them to analyse the data available on various web metrics to assess bandwidth utilisation and better plan how to use unused resources

Service quality management: Big Data analytics can give telecom service providers the ability to analyse real-time streaming data from network elements and consumer devices to predict network failures and take preemptive steps

Price & product customisation: Using insights generated by combining customer usage and subscription data with network, cost and revenue data, telecom organisations can provide a wide range of services to their customers

Strategy assessment and decision making: Leveraging the data generated from customer records across various platforms, telecom organisations can design their marketing campaigns and promotional schemes/discounts/offers to better target customer groups and offer more personalised/targeted services

Customer attrition management: By conducting predictive churn management analytics on their customer data, telecom service providers can identify high risk customers who are likely to leave the network and offer them timely and attractive deals to retain them

As the telecom industry has been one of the early adopters of Big Data tools and technologies, it is reaping several benefits from the usage of Big Data analytics, including planning efficient utilisation of network bandwidth, improving service levels by proactively detecting network and router failure, and better customer retention through targeted marketing and promotion campaigns.

Various Indian telecom service providers have adopted Big Data technologies and realised their benefits
- Reliance Communications plans to use Big Data implementation on the data generated by its telecom business for analytical planning and strategic decision making. Reliance has adopted a Massively Parallel Processing (MPP) database to store CDR and unstructured data and perform analytics on it
- Bharti Airtel creates more than 5,000 targeted campaigns a day using Big Data generated from its customer usage, billing and sales details

Case example: Impetus provided a pragmatic approach using NoSQL Apache Cassandra

5. Healthcare: Is getting transformed with the adoption of Big Data analytics, substantially improving patient care

In the healthcare industry, data is being generated at a faster pace owing to rapid digitisation of patient healthcare records, monitoring of in-patients and out-patients through sensors, generation of epidemic data, genomics research, medical imaging (MRI, CT scan) and implementation of Hospital Information Systems (HIS) and Picture Archiving and Communication Systems (PACS), as well as gathering of patient behaviour and sentiment data from social media platforms.
Currently, only a few physician offices and hospitals, mainly in the US and UK, have Electronic Health Records (EHR) systems in place, but that number is likely to increase as the Health and Human Services departments and private hospitals are likely to support EHR adoption rapidly in the coming years. This has prompted healthcare providers, payers, and pharmaceutical and medical products organisations to adopt Big Data and explore measures to manage costs, develop products and provide better healthcare to patients.

There are several areas where Big Data technologies play a critical role:

R&D, Life Sciences/Biomedicine: In this area, Big Data technologies are useful in drug discovery analysis, data annotation and validity analysis of genomic, proteomic, and metabolic data, and studying gene expression for next-generation sequencing and read mapping
- E.g. Clinical Genomics uses algorithms and analytics to find treatments for conditions based on a patient's genetic profile. Doctors can use Clinical Analytics to analyse patients' similarities, predict outcomes, evaluate risk benefits and view treatment options

Patient Care: Big Data analytics is increasingly being used in areas of patient care such as patient monitoring and assessment, patient care personalisation, providing effective and value-added services, preventative care, and identifying potential causes of infections, readmission, and diseases. Some instances of how Big Data improves patient care are:
- Application of Big Data analytics to patient profiles (e.g., segmentation and predictive modelling), which helps to identify individuals who would benefit from proactive care or lifestyle changes.
For instance, these approaches can help identify patients who are at high risk of developing a specific disease (e.g., diabetes) and would benefit from a preventive care programme
- Using visual analytics, doctors can look more deeply into care processes to identify the most effective ones and how they can be fine-tuned
- Improve patient care by analysing data coming from myriad remote patient monitoring devices such as wearable devices, home sensing devices, and video monitoring

Healthcare operations: Healthcare operations include activities such as understanding and influencing consumer behaviour, optimising physician interactions, clinical decision support systems, monitoring and educating patients, and Comparative Effectiveness Research (CER).
- E.g. Automated diagnoses of early-stage breast cancer using Big Data analytics technology. By automatically analysing large sets of mammographic images using a unique image classification approach, healthcare organisations can classify large collections of images based on a small set of training images, which helps radiologists to reduce their time-to-diagnose

Epidemiology: Big Data technologies are helpful for analysing patterns and trends in health issues across a geography, tracking the spread of disease based on streaming data, and visualising global outbreaks, enabling the determination of the source of infection

Healthcare security: The healthcare sector loses huge sums of money due to medical fraud. Big Data technologies enable government and insurance organisations to detect fraud in real-time and prevent financial losses arising from fraudulent claims

Therefore, Big Data helps organisations within the healthcare, life sciences and pharmaceutical space to improve the quality of patient care or proactive care, lower the cost of healthcare services and patient care, enhance fraud detection, make hospital operations more efficient, and accelerate research and development.
Organisations such as Quintiles and Accenture are leaders in providing Big Data analytics for the healthcare and pharmaceutical space.

The Future of Big Data

Global Big Data market to reach ~USD 25 billion by 2015

As enterprises undertake pilots for Big Data implementation and large IT organisations and start-ups compete for market share, the global Big Data market is expected to grow by about 46 per cent to more than USD 25 billion by 2015. IT and IT-enabled services, including analytics, are expected to grow the fastest, at a rate of more than 60 per cent, with their share in the total Big Data market expected to increase to ~45 per cent in 2015 from ~31 per cent in 2011.

Big Data analytics is likely to be driven by the near-ubiquitous nature of the data and the proliferation of technologies and applications such as mobile sensors, smartphones and social networking, along with the growing realisation of the benefits of Big Data by enterprises. While Big Data could add momentous value in the coming years, it might have to overcome certain roadblocks. Though early movers are formulating Big Data strategies, mass adoption may be hindered by the lack of best practices and the significant cultural change organisations require for sharing data. However, as organisations leverage large datasets from within and outside, Big Data is likely to continue to grow as an area which can deliver substantial benefits. Finally, the aggressive efforts of service providers, both large IT organisations and niche start-ups, to demonstrate their domain expertise and ability to derive valuable insights from Big Data would be an enabler of this opportunity.

India Big Data outsourcing opportunity to increase over 2012-15 to lie between USD 1.1-1.2 billion

India's Big Data outsourcing opportunity is likely to grow by about 83 per cent annually to ~USD 1.0 billion during 2011-15.
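To make the compounding behind such growth figures concrete, the following minimal sketch shows how a compound annual growth rate (CAGR) relates a starting value, an ending value and a number of years. It is an illustration only. It assumes that "2011-15" means four annual compounding periods and takes the ~83 per cent rate and ~USD 1.0 billion end value from the sentence above; the implied 2011 base it prints (roughly USD 89 million under these assumptions) is derived arithmetic, not a figure stated in this report. The function names `cagr` and `implied_start` are hypothetical.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def implied_start(end_value, rate, years):
    """Starting value implied by an end value and a constant annual growth rate."""
    return end_value / (1.0 + rate) ** years


# Figures taken from the text: ~USD 1.0 billion by 2015 at ~83 per cent a year.
# Assumption: 2011-15 is treated as four compounding periods.
end_2015_usd_mn = 1000.0
annual_rate = 0.83
periods = 4

base_2011_usd_mn = implied_start(end_2015_usd_mn, annual_rate, periods)
print(round(base_2011_usd_mn, 1))  # implied 2011 base, in USD million
print(round(cagr(base_2011_usd_mn, end_2015_usd_mn, periods), 2))  # recovers the rate
```

The same two functions can be used to sanity-check any of the growth rates quoted in this section against the start and end values they are meant to connect.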
India is expected to be the preferred destination for analytics and IT services for Big Data due to its pre-eminence in IT/BPO services, knowledge services outsourcing and analytics, as well as its intellectual pool of talent. The share of analytics in the overall Big Data opportunity is expected to rise from ~16 per cent in 2011 to 25 per cent in 2015.

The key drivers for India include the efforts of service providers to develop talent and increase their domain expertise and breadth of services. Moreover, a number of Indian service providers are leveraging partnerships with Big Data technology players to facilitate delivery of Big Data solutions. Finally, while the current demand for Big Data analytics is generated from global clients, domestic demand in India is also gaining traction. For example, Asian Paints and Star India have leveraged Big Data analytics to track and analyse large datasets.

Global Big Data market to evolve, India to emerge as a preferred destination for analytics and IT services

The Big Data industry is likely to continue to strengthen its foundation over the coming years by investing more in Big Data technologies and tools. While the emphasis was on technology innovation in database storage and management in 2011-12, the focus is expected to shift to the delivery of Big Data analytics, with newer applications coming in for analytics and visualisation. Further, with the growing use cases of Big Data implementation, best practices could help in the wider adoption of Big Data analytics.

The US is likely to remain the major market, while demand for Big Data solutions from APAC and Europe is expected to gain traction in the next 2-3 years. Service providers are expected to continue expanding their offerings and educating clients about the benefits of Big Data.
Large integrated IT-BPO players are likely to leverage technological partnerships with large Big Data players such as Cloudera, EMC (Greenplum), and Hortonworks, as well as evaluate M&As as a tool to build a robust portfolio and provide a global delivery network.

As things stand, India is ideally poised to capitalise on Big Data, but there is still work to be done if it is to fully realise its potential in terms of a refined talent pool, a mature service-provider landscape and innovative service delivery. Continued excellence, along with India's key value proposition, will ensure the country's position as a hub for Big Data analytics.

Indian service providers expected to hold a lion's share in analytics and IT services for Big Data

With more businesses embracing a data-driven decision making culture, Indian IT/BPO service providers and analytics players are providing clients the necessary tools and solutions required to harness Big Data. They offer analytics-led solutions for better customer insights, unique market differentiation, and managing risks and financial metrics more effectively.

Indian players are likely to build capabilities across the entire spectrum of the Big Data ecosystem. While IT/BPO players would build capabilities in the development of infrastructure, implementation, and delivery, analytics and knowledge service providers are expected to scale up their capabilities in providing Big Data analytics. As Indian IT/BPO players already have a leading position in industry-specific software development and implementation, they have a huge growth opportunity to build Big Data end-user applications and develop a Big Data management and storage service portfolio. On similar lines, analytics players such as Genpact, Mu Sigma, CRISIL GR&A, AbsolutData, and LatentView are gearing up to build the robust advanced analytics required to manage the insights engine for Big Data.
Concerted efforts by the service providers and academia to improve talent employability

The rising demand for Big Data analytics is expected to lead to a global shortfall of IT and analytics professionals with the skills to implement Big Data technologies and manage project mandates that derive business value from these datasets, and of data scientists who can run complex techniques to unravel insights from them. If this issue is not addressed, businesses might not be able to gain from the potentially valuable insights in Big Data. However, Big Data analytics as an integrated discipline has only just emerged in the academic curriculum, and it will take some time before academic institutions start producing Big Data professionals. Because the supply of freshly trained data scientists will be limited, the industry is aggressively implementing effective recruitment practices and training modules to develop the existing pool of BI analysts and IT professionals for Big Data analytics. While major efforts are being undertaken globally to develop Big Data talent, India is at the forefront and has an early-mover advantage over other outsourcing destinations in terms of initiatives by academia and corporates to build fresh talent for Big Data.

Indian enterprises and academia have also started addressing the Big Data skills shortage. Technology firms such as EMC, Oracle, and IBM are planning to work with universities in India and overseas to introduce full-length electives or crash courses on various elements of Big Data. Training organisations such as NIIT and Aptech are also exploring the design of curricula to develop specialised talent for the Big Data industry.
Further, enterprises are creating organisational cultures that are conducive to data-driven decision making by:

Effective recruitment: While recruiting new talent, the focus is shifting from business-oriented degrees to other academic fields such as hard sciences, statistics, and mathematics. Further, candidates are being tested for intellectual curiosity and technical depth to address Big Data challenges.

On-the-job training: Organisations, both global and Indian, are investing in on-the-job training in emerging Big Data technologies to eliminate skill gaps in their existing workforce.

Annexure

AbsolutData Research & Analytics
Accenture
CRISIL Global Research & Analytics
CSC
EMC Corporation
Fractal Analytics
Genpact
IBM
Impetus Technologies
Infosys Limited
LatentView Analytics
Marlabs
MetaScale LLC
Mu Sigma
Nuevora

Glossary

Apache Cassandra: An open source distributed database management system. It is a NoSQL solution, designed to handle very large amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure.
Apache Thrift: An interface definition language that is used to define and create services for numerous languages.
Apache Avro: A remote procedure call and serialisation framework developed within Apache's Hadoop project. It defines data types and protocols, and serialises data in a compact binary format.
Apache Pig: Refers to the data flow language and execution framework for parallel computation, built on HDFS.
BASEL III: A global regulatory standard on bank capital adequacy, stress testing and market liquidity risk, agreed upon by the members of the Basel Committee on Banking Supervision in 2010-11.
BFSI Services: Refer to banking, financial services and insurance services, and include players such as banks, asset managers, mutual funds, insurers, brokers, traders, etc.
Big Data: Relates to rapidly growing, structured and unstructured datasets with sizes beyond the ability of conventional database tools to store, manage, and analyse them. In addition to size and complexity, the term refers to the ability of such data to support evidence-based decision making with a high impact on business operations.
Business Intelligence: Use of analysis tools to query data repositories and generate analyst reports, helping managers make business decisions by identifying trends and patterns in the industry.
Convectional File System: The log-structured file system, designed for high write throughput. All updates to data and metadata are written sequentially to a continuous stream known as the log.
Chukwa: A Hadoop subproject for large-scale log collection and analysis.
Clustergram: Used to visualise how clusters are formed and how cluster members are assigned to clusters as the number of clusters increases.
Data Acquisition: Refers to primary data collection through phone, field surveys and interviews, as well as secondary data collection through web searches, printed sources and databases.
Data Analytics: Refers to the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive business decisions and actions.
Data Collection: Refers to collection of data, survey programming and hosting, data search integration and programming.
Data Integration: Involves combining data residing in different sources and providing users with a unified view of these data.
Data Management: Refers to the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise.
Data Scientist: Refers to professionals conversant with both the business context and data analytics. Their role encompasses extracting insights from large datasets, analysing them and presenting the value-added information to business users or non-data experts.
Data Warehousing: Refers to the systems where the data resides or is stored.
Delivery Centre: Refers to a regional office at an onshore or offshore location, established to deliver services to clients.
Distributed Hardware: Consists of multiple autonomous computers that communicate through a computer network.
Dodd-Frank: The Dodd-Frank Wall Street Reform and Consumer Protection Act places regulation of the financial industry in the hands of the government. It aims to prevent another significant financial crisis by creating new financial regulatory processes that enforce transparency and accountability while implementing rules for consumer protection.
Equity Research: Study and analysis of fundamental parameters of companies and industries, to offer insights on investment opportunities.
ETL & Data Integration: ETL is a process in database usage and in data warehousing that involves extracting data from multiple sources, transforming it to fit operational needs, and loading it into a database, operational data store, data mart or data warehouse.
EU Data Protection Directive: Enables protection of individuals with regard to the processing of personal data and the free movement of such data in the European Union.
Financial Research: Includes equity research, fixed income research, credit research, knowledge support for investment and wealth management, commodities and foreign exchange, derivatives, as well as emerging services such as risk management analytics and insurance actuarial services.
Fixed Income Research: Refers to research in instruments with fixed interest payment obligations, such as bonds and swaps.
FTE: Refers to Full-Time Equivalent (FTE) employees, who work a number of hours equivalent to that of a full-time employee.
Gigabytes: Units of digital information; one gigabyte (GB) is roughly one billion bytes.
Hadoop Distributed File System (HDFS): The primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computations.
Hadoop Sqoop: Enables import and export of data from structured data stores such as relational databases, enterprise data warehouses, and NoSQL systems.
Hadoop: An open source software framework for Big Data, licensed under the Apache v2 license, that supports data-intensive distributed applications.
HBase: Refers to a Bigtable-like structured storage system for Hadoop HDFS.
Heat Maps: A graphical representation of data where the individual values contained in a matrix are represented as colours.
HIPAA: Refers to the Health Insurance Portability and Accountability Act (HIPAA), which protects health insurance coverage for workers and their families when they change or lose their jobs, and requires the establishment of national standards for electronic healthcare transactions and national identifiers for providers, health insurance plans, and employers.
History Flow: Charts the evolution of a document as it is edited by multiple contributing authors.
HITECH Act: The HITECH Act set meaningful use of interoperable electronic health records (EHR) in the healthcare system as a critical national goal and incentivised EHR adoption.
Hive: A data warehouse infrastructure which allows SQL-like ad-hoc querying of data (in any format) stored in Hadoop.
Global In-house Centres: The offshore centres of global corporations for services that are to be kept in-house and involve intellectual property and sensitive data.
In-memory Analytics: A BI methodology used to solve complex and time-sensitive business scenarios. It works by increasing the speed, performance and reliability of queries on data held in the server's random access memory.
In-memory Databases: Refers to database management systems that primarily rely on main memory for computer data storage.
Machine Learning: A scientific discipline that deals with the design and development of algorithms that take as input empirical data, such as that from sensors or databases, and yield patterns or predictions thought to be features of the underlying mechanism that generated the data.
Mahout: Refers to scalable machine learning algorithms built on Hadoop.
MapReduce: A programming model for processing large datasets, and the name of an implementation of the model by Google. MapReduce is typically used to do distributed computing on clusters of computers.
Massively Parallel Processing: Refers to the coordinated processing of a programme by multiple processors that work on different parts of the programme, with each processor using its own operating system and memory.
NoSQL Databases: Refers to the next-generation databases that are non-relational, distributed, open-source and horizontally scalable. NoSQL databases are not primarily built on tables and can manage large unstructured datasets.
Online Analytical Processing: Provides solutions to multi-dimensional analytical queries. OLAP is a category of business intelligence, which also encompasses relational reporting and data mining. Key applications of OLAP include business reporting for sales, marketing, management reporting, business process management, budgeting and forecasting, financial reporting, etc.
Pure-play Service Providers: Refer to independent service providers who are specialists in offering a broad range of Big Data services.
RDBMS: A Relational Database Management System (RDBMS) is a traditional system of storing data in which data is stored in tables and the relationships among the data are also stored in tables.
Real-time Dashboards: A real-time graphical presentation of data analysis.
Software-as-a-Service: Refers to software that is accessed via a web browser and paid for on a subscription basis, and for which the user does not have to pay for ownership, maintenance and installation.
Spatial Information Flow: Describes the physical location of objects and the metric relationships between objects, using sources such as aerial and satellite remote sensing imagery, the Global Positioning System (GPS), and Computerised Geographic Information Systems (GIS).
Stochastic Optimisation: Optimisation methods that generate and use random variables. Stochastic optimisation methods also include methods with random iterates.
Structured Datasets: Data that resides in fixed fields within a record or file. Relational databases and spreadsheets are examples of structured data.
Tag Cloud: Refers to a weighted visual list in which words that appear most frequently are larger and words that appear less frequently are smaller.
Unstructured Datasets: Refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. Examples include tweets, emails, RSS feeds and XML.
Zookeeper: A high performance coordination service for distributed applications.
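To make the MapReduce glossary entry more concrete, the following is a minimal single-process sketch of the map, shuffle and reduce phases, using word counting as the task. It is an illustration only and not part of the original study: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the sketch does not use Hadoop or any distributed runtime. A real framework would run the map and reduce steps in parallel across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key (done by the framework in practice)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence on an in-memory collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    print(word_count(["big data is big", "data is everywhere"]))
```

The split into three functions mirrors the model itself: the map and reduce steps are the parts a developer would write, while the grouping in between is what a framework such as Hadoop would handle across machines.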
List of Abbreviations

APAC: Asia Pacific
B2C: Business-to-Consumer
BFSI: Banking, Financial Services and Insurance
BI: Business Intelligence
BPO: Business Process Outsourcing
CA: Chartered Accountant
CFA: Chartered Financial Analyst
CAGR: Compound Annual Growth Rate
CARC: Cumulative Aggregate Rate of Change
CoE: Centre of Excellence
CLV: Customer Lifetime Value
CPG: Consumer Packaged Goods
CRISIL: Credit Rating Information Services of India Ltd.
GR&A: Global Research & Analytics
CRM: Customer Relationship Management
CEO: Chief Executive Officer
COO: Chief Operating Officer
CIO: Chief Information Officer
CTO: Chief Technology Officer
Delhi NCR: Delhi National Capital Region
DW: Data Warehouse
ERP: Enterprise Resource Planning
ETL: Extract, Transform and Load
EU: European Union
FTE: Full-Time Equivalent
FTP: File Transfer Protocol
GB: Gigabyte
HITECH: Health Information Technology for Economic and Clinical Health
HR: Human Resources
IP: Intellectual Property
IPO: Initial Public Offering
IT: Information Technology
ITeS: Information Technology-enabled Services
IVR: Interactive Voice Response
MBA: Master of Business Administration
MIS: Management Information Systems
M&A: Mergers and Acquisitions
MNC: Multinational Corporation
MPP: Massively Parallel Processing
M.S.: Master of Science
NoSQL: Not Only Structured Query Language
OLAP: Online Analytical Processing
PACS: Picture Archiving and Communication System
PB: Petabyte
Ph.D: Doctor of Philosophy
PoS: Point of Sale
P&L: Profit & Loss
RDBMS: Relational Database Management System
RFID: Radio-Frequency Identification
R&D: Research and Development
RoI: Return on Investment
RoW: Rest of the World
RSS: Rich Site Summary/Really Simple Syndication
SAS: Statistical Analysis System
SaaS: Software-as-a-Service
SKU: Stock Keeping Unit
SME: Small and Medium Enterprises
SPSS: Statistical Package for the Social Sciences
SQL: Structured Query Language
SVP: Senior Vice President
TB: Terabyte
UK: United Kingdom
US: United States (of America)
USD: United States Dollar
VC: Venture Capital
VP: Vice President
XML: Extensible Markup Language