Ibm 0453 Analytics
The Analytics Revolution: Optimizing Reporting and Analytics to Make Actionable Intelligence Pervasive
Prepared for IBM by: David Loshin Knowledge Integrity, Inc. January, 2010
1 (301) 754-6350
Delivering trustworthy actionable intelligence to the right people when they need it short-circuits analysis paralysis and encourages rational and confident decisions. At one end of the continuum, overall company performance is reviewed and alternatives are considered to help adjust corporate strategy for long-term value generation. At the other end, operational activities are improved with specific pieces of intelligence that can adjust and optimize activities in real time. Actionable intelligence informs both strategic and operational processes, and its pervasive delivery to staff members up and down the org chart can facilitate a transition from reacting to what has happened in the past to predicting the optimal choices for the future.
Industry Examples
From one standpoint, opportunities for improvement manifest themselves differently depending on the industry; from another, there are common dimensions of operations that can be improved in any industry. Certainly, a company within a particular industry can benefit from reporting and analytics associated with the specific aspects of that industry, such as these vertical examples:

Health Care: Monitoring business process performance permeates all aspects of quality of care. For example, understanding why some practitioners are more successful at treating certain conditions can lead to improved quality of care. Analytics can help to discover the factors that contribute to the success of one approach over others, and to see whether those successes depend on variables within the practitioner's control or on factors outside it. Improved diagnostic approaches can reduce the demand for high-cost diagnostic resources such as imaging machinery, and better treatments can reduce the duration of patient stays, freeing up beds, improving throughput, and enabling more efficient bed utilization.

Logistics/Supply Chain: Integrated analysis for transportation and logistics management provides insight into many aspects of an efficient supply chain. For example, business intelligence is used to analyze usage patterns for particular products based on a series of geographic, demographic, and psychographic dimensions. Predictability becomes the magic word: knowing what types of individuals in which types of areas account for purchases of the range of products over particular time periods can help in more accurately predicting (and therefore meeting) demand. As a result, the manufacturer can route the right amounts of products to reduce or eliminate out-of-stocks. At the same time, understanding demand by region over different time periods leads to more accurate planning of delivery packaging, methods, and scheduling. One can map the sales of products in relation to distance from the origination point; if sales are lower in some locations than others, it may indicate a failure in the supply chain that can be reviewed and potentially remediated in real time.

Telecommunications: In an industry continually battling customer attrition, increasing a customer's business commitment contributes to maintaining a long customer lifetime. For example, examining customer cell phone usage can help to identify each individual's core network. If a customer calls a small number of residential land lines or personal mobile phones, that customer may be better served by a "friends and family" service plan that lowers the cost for the most frequently called numbers. Identifying household relationships within the core network may enable service bundling, either by consolidating mobile accounts or by cross-selling additional services such as landline service, internet, and other entertainment services. On the other hand, if the calls from the customer's individual mobile phone are largely to business telephone numbers and last between a half hour and an hour, that customer may be better served with a business telephony relationship that bundles calling with additional mobile connectivity services.
Retail: The large volume of point-of-sale data makes it a ripe resource for analysis, and retail establishments are always looking for ways to optimize their product placement to increase sales while reducing overhead to increase their margins, especially when market baskets can be directly tied to individuals via affinity cards. Understanding the relationship between a brick-and-mortar store location and the types of people who live in the surrounding area helps store managers select products for the store assortment. Strategic product placement (such as middle shelf or end-cap) can be reserved for those items that drive profitability, and this can be based on a combination of product sales by customer segment coupled with maps of customer travel patterns through the store. Product placement is not limited to physical locations; massive web logs can be analyzed for customer behavior to help dynamically rearrange offer placement on a web site, as well as to encourage product upselling based on abandoned cart analysis, collaborative filtering, or the customer's own preferences.

Financial Services/Insurance: In both insurance and banking, identifying risks and managing exposure are critical to improved profitability. Banks providing a collection of financial services develop precise models associated with customer activities and profiles that identify additional risk variables. For example, analyzing large populations of credit card purchases in relation to mortgage failures may show increased default risk for individuals shopping at particular shopping malls or eating at certain types of fast food restaurants. In turn, recognizing behaviors that are indicative of default risk may help the bank anticipate default events and reach out to those individuals with alternate products that keep them in their homes, reduce the risk of default, and improve predictability of the loan's cash flow over long periods of time.

Manufacturing: Plant performance analysis is critical to maintaining predictable and reliable productivity. It involves tracking production line performance, machinery downtime, production quality, work in progress, and safety incidents, and delivering measurements of operational performance indicators along the management escalation chain so that adverse events can be addressed in the proper context and within a reasonable timeframe.

Hospitality: Hotel chains assess customer profiles and related travel patterns, and know that certain customers may be dividing their annual night allocation among competitors. By analyzing customer travel preferences and preferred locations, the company may present incentive offers through the loyalty program to capture more of that customer's night allocation.
The examples for these industries are similar in that the analysis ranges from straightforward reporting of key business performance indicators to exploring opportunities for optimizing the way the organization is run or improving interactions with customers and other business partners. Other industries benefit from reporting and analytics.
(and diverse) data sets to support both management and decision making at operational, tactical, and strategic levels. Through data collection, aggregation, analysis, and presentation, actionable intelligence can be delivered to best serve a wide range of target users. Organizations that have matured their data warehousing programs allow those users to extract actionable knowledge from the corporate information asset and rapidly realize business value. But while traditional data warehouse infrastructures support business analyst querying and canned reporting or senior management dashboards, a comprehensive program for information insight and intelligence can enhance the decision-making process for all types of staff members in numerous strategic, tactical, and operational roles.

Even better, integrating the relevant information within the immediate operational context becomes the differentiating factor. Offline customer analysis providing general sales strategies is one thing, but real-time actionable intelligence can provide specific alternatives to the salesperson talking to a specific customer, based on that customer's interaction history, in ways that best serve the customer while simultaneously optimizing corporate profitability as well as the salesperson's commission. Maximizing overall benefit to all of the parties involved ultimately improves sales, increases customer and employee satisfaction, and improves response rate while reducing the cost of goods sold: a true win-win-win for everyone.

The wide range of analytical capabilities all help suggest answers to a series of increasingly valuable questions:

What? Predefined reports provide the answer to operational managers, detailing what has happened within the organization and offering various ways of slicing and dicing the results of those queries to understand basic characteristics of business activity (e.g., counts, sums, frequencies, locations). Traditional BI reporting provides 20/20 hindsight: it tells you what has happened, it may provide aggregate data about what has happened, and it may even direct individuals to specific actions in reaction to what has happened.

Why? More comprehensive ad hoc querying, coupled with review of measurements and metrics within a time series, enables more focused review. Drilling down through reported dimensions lets the business client get answers to more pointed questions, such as identifying the sources of any reported issues or comparing specific performance across relevant dimensions.

What if? More advanced statistical analysis, data mining models, and forecasting models allow business analysts to consider how different actions and decisions might have impacted the results, enabling new ideas for improving the business.

What next? By evaluating the different options within forecasting, planning, and predictive models, senior strategists can weigh the possibilities and make strategic decisions.

How? By considering approaches to organizational performance optimization, C-level managers can adapt business strategies that change the way the organization does business.
Information analysis makes it possible to answer these questions. Improved decision-making processes depend on supporting business intelligence and analytic capabilities that increase in complexity and value across a broad spectrum for delivering actionable knowledge (as is shown in ). As the analytical functionality increases in sophistication, the business client can gain more insight into the mechanics of optimization. Statistical analysis will help in isolating the root causes of any reported issues as well as provide some forecasting capabilities should existing patterns and trends continue without adjustment. Predictive models that capture past patterns help in projecting what-if scenarios that guide tactics and strategy towards organizational high performance.
Intelligent analytics and business intelligence are maturing into tools that can help optimize the business. That is true whether those tools are used to help C-level executives review options to meet strategic objectives, to help senior managers streamline their lines of business, or to support operational decision-making in ways never thought possible.
These analytics incorporate data warehousing, data mining, multidimensional analysis, streams, and mash-ups to provide a penetrating vision that enables immediate reactions to emerging opportunities while also allowing one to evaluate the environment over time and see ways to change the business.
mechanisms for actionable knowledge. Today, delivering the right information to the right people at the right time is the culmination of best practices in data management and organization combined with a technical infrastructure designed to direct many steady channels of data into a high-performance analysis platform that can deliver trustworthy results within real-time constraints. This depends on three areas of technology: continuous synchronization of data from multiple sources to provide a coherent and consistent view; cohesive information integration that allows massive amounts of high-quality data to be harmonized and aligned for effective analysis; and a comprehensive set of analytics services that reduce or even eliminate the need to be a power user to derive benefit from actionable intelligence.
In this section we look at some of the key technical considerations necessary for a modern analytics environment, and how the results can be fully integrated into hundreds, if not thousands, of daily business processes. Given an understanding of the technology components, we will begin to see some challenges emerge in environments with heterogeneous technologies cobbled together to support the analytics program.
Maintaining a reasonable degree of synchrony and coherence among the multitude of data sources available within (as well as from outside) the organization requires technical strategies for continuous data availability that do not impose a strain on the environment. Some of those strategies include:

Change data capture (CDC) for data replication, which involves a managed, synchronized copying of data from a source to one or more target data systems. CDC is an event-driven mechanism for capturing changes from source datasets and propagating those changes through various channels, either directly to target databases or through a message queue for subsequent processing. Synchronizing data modifications via CDC keeps operational systems and analytical systems coherent, enabling the discovery of actionable opportunities in real time while maintaining consistency across reporting systems.

Data federation, which enables transparent access to heterogeneous (and generally physically distributed) data types, platforms, and sources, in numerous formats, without requiring a staging area or centralized repository (see ). Federation is an effective way to capture subsets of very large or very distributed data sets, and is frequently used when data is offsite, is in an older format, or is infrequently used. For example, a data federation framework will allow an application to access databases, XML data, flat files, or even data services or data streams using a uniform access mechanism. In turn, the federation server dynamically accesses the data sources and returns the synchronized results. Federation simplifies consolidating data from multiple sources, enabling cross-pollination of information to better discover opportunities, and is very effective at joining dissimilar data before it arrives at the target.

Information stream processing, which provides applications with continuous access to streaming data sources in real time. Connecting streaming data with persistent data sources supports complex event processing and real-time discovery of opportunities based on emerging knowledge, such as weather-based commodity trading or immediate deliveries to prevent retail out-of-stocks.
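To make the change data capture mechanism described above more concrete, the following is a minimal sketch in Python. It is illustrative only and is not drawn from any particular CDC product: the `ChangeEvent`, `SourceTable`, and `apply_changes` names are hypothetical, an in-memory `queue.Queue` stands in for the message queue, and plain dictionaries stand in for the source and target databases.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change (hypothetical event shape)."""
    op: str                               # "insert", "update", or "delete"
    key: str                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


class SourceTable:
    """Stand-in for an operational table that emits an event per change."""

    def __init__(self, channel: Queue) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self._channel = channel

    def upsert(self, key: str, row: Dict[str, Any]) -> None:
        op = "update" if key in self.rows else "insert"
        self.rows[key] = dict(row)
        # Capture the change at the moment it happens (event-driven).
        self._channel.put(ChangeEvent(op, key, dict(row)))

    def delete(self, key: str) -> None:
        if self.rows.pop(key, None) is not None:
            self._channel.put(ChangeEvent("delete", key))


def apply_changes(channel: Queue, target: Dict[str, Dict[str, Any]]) -> int:
    """Drain the queue and replay each change, in order, on the target copy."""
    applied = 0
    while not channel.empty():
        event = channel.get()
        if event.op == "delete":
            target.pop(event.key, None)
        else:
            target[event.key] = event.row
        applied += 1
    return applied


channel: Queue = Queue()
source = SourceTable(channel)
replica: Dict[str, Dict[str, Any]] = {}   # analytical copy of the source

source.upsert("cust-1", {"name": "Ada", "plan": "basic"})
source.upsert("cust-2", {"name": "Grace", "plan": "business"})
source.upsert("cust-1", {"name": "Ada", "plan": "family"})
source.delete("cust-2")

applied = apply_changes(channel, replica)
print(applied, replica)
```

Because only the changes travel through the channel, the target stays consistent with the source without repeated full extracts, which is the property the text relies on for low-strain synchronization.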
Data Profiling: As with any asset, it is valuable to inventory data: to determine what it is, who it belongs to, where it is used, and its serviceability. Data profiling does this. It supports a combination of artifact review and empirical analysis of source data sets to understand the characteristics of data element metadata that are critical for analysis. Data profiling can provide the ground truth associated with the actual data values as well as evidence of consistency with metadata, and will provide insight into the suitability of candidate sources to satisfy the target analytical needs. A data quality assessment that uses data profiling in conjunction with a review of the consuming application's expectations can uncover data quality rules, which are then managed within the metadata repository.

Data Cleansing: Data profiling will expose potential anomalies and errors in the data, and erroneous data used as input to analytical processing will undermine the believability of its results. When the source data system owners are able to address design or process flaws that introduce data issues, the root causes can be eliminated and so can the errors. However, when the source data is outside of the organization's administrative control, processes and technical infrastructure must be in place to parse, standardize, enhance/enrich, and cleanse the data to satisfy the downstream analytic needs. Data cleansing is generally required when the data is being repurposed, especially when the new purposes have stricter data quality expectations. Data cleansing enhances the value of the data, and high-quality, consistent data makes the decision-making process trustworthy.

Data Validation: There are also often opportunities for new errors to be introduced into the data, and identifying and eliminating potential data flaws early in the information production flow will reduce variance and inconsistency downstream and improve overall operational efficiency. Even more importantly, validating supplied data against defined data quality rules contributes to the delivery of trustworthy data.

Identity Resolution: Due to the organic growth of the myriad operational systems deployed across the enterprise, it is not unusual for multiple data instances in different systems to represent the same real-world entities in a variety of ways. Conversely, there are often situations in which two real-world entities share the same identifying attributes, making it difficult to distinguish between them. Both of these types of issues reflect the same core challenge: the ability to evaluate the similarity between pairs of records and determine whether they represent the same thing. Identity resolution addresses these issues by calculating and scoring the degree of similarity between any two records. When the score is above a specific threshold, the two records are presumed to match; below another threshold, they are deemed not to match. Identity resolution is used to match records in the presence of variations or incomplete attributes, or to determine that two records truly represent distinct entities. Identity resolution is a key component of master data management, and increasing precision in entity matching reduces data duplication and supports high-quality reporting and analysis.

Pervasive Delivery Mechanisms: The spreadsheet is no longer the sole means for delivering analytical results. Self-service configuration of reports and query results from within business applications eliminates the bottleneck caused by relying on IT staff for support, and web-based delivery simplifies the availability of results using a variety of intuitive, interactive visual presentation objects (such as graphs and heat maps). By automating event-driven notifications that can be tailored to an assortment of interfaces, ranging from desktops to hand-held PDAs, actionable intelligence can be provided directly where it is needed.
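Returning to the identity resolution step described earlier in this section, the two-threshold scoring approach can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any vendor's matching engine: the attribute weights, the threshold values, the "review" outcome for scores that fall between the two thresholds, and the use of Python's standard `difflib.SequenceMatcher` as the string similarity measure are all choices made for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Assumed attribute weights and thresholds; real values would be tuned.
WEIGHTS = {"name": 0.5, "street": 0.3, "city": 0.2}
MATCH_THRESHOLD = 0.85       # at or above: presumed the same entity
NON_MATCH_THRESHOLD = 0.60   # below: deemed distinct entities

Record = Dict[str, Optional[str]]


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after trivial standardization (case, spacing)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def similarity_score(r1: Record, r2: Record) -> float:
    """Weighted average over the attributes present in both records."""
    total = 0.0
    weight_used = 0.0
    for field, weight in WEIGHTS.items():
        a, b = r1.get(field), r2.get(field)
        if not a or not b:
            continue  # tolerate incomplete attributes by skipping them
        total += weight * field_similarity(a, b)
        weight_used += weight
    return total / weight_used if weight_used else 0.0


def classify(r1: Record, r2: Record) -> str:
    """Apply the two thresholds to the pairwise similarity score."""
    score = similarity_score(r1, r2)
    if score >= MATCH_THRESHOLD:
        return "match"
    if score < NON_MATCH_THRESHOLD:
        return "non-match"
    return "review"  # assumed handling for the zone between thresholds


crm = {"name": "John Smith", "street": "12 Oak St", "city": "Boston"}
billing = {"name": "JOHN SMITH ", "street": "12 oak st", "city": None}
other = {"name": "Wu Li", "street": "999", "city": "Kyzyl"}

print(classify(crm, billing))
print(classify(crm, other))
```

Scores between the two thresholds are the ambiguous cases; routing them to manual review is one common option, though the text above does not prescribe how they should be handled.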
Analytics Services
The range of analytics services supports the variety of data consumers across the organization:

Reporting and Ad Hoc Querying: Standard, static reports derived from user specifications provide a consistent view of particular aspects of the business, generated in batch and typically delivered on a scheduled basis through a standard (web) interface. The static nature of reports drives the need for alternative methods for additional insight. One approach is to extract the reported data into spreadsheets for additional data manipulation, while also allowing ad hoc queries to gather additional data for analysis. Standard reports can provide knowledge to a broad spectrum of consumers, even if those consumers must have contextual knowledge to identify the key indicators and take action. However, given the growth of data into the petabytes, standard reporting is rapidly yielding to exception reporting.
Scorecards and Dashboards: If a trained eye is required to scan key performance metrics from canned reports, simplifying the presentation of those metrics may better enable the knowledge worker to transition from seeing what has already happened to understanding the changes necessary to improve the business process. Scorecards and dashboards customize an up-to-date presentation of summarized performance metrics, allowing continuous monitoring throughout the day. Pervasive delivery mechanisms can push dashboards to a large variety of channels, ranging from the traditional browser-based format to hand-held mobile devices. Through the interactive nature of the dashboard, the knowledge worker can drill down through the key indicators regarding any emerging opportunities, as well as take action through integrated process-flow and communication engines.

Mash-ups: The mash-up takes the dashboard to the next level, giving knowledge consumers themselves the ability to combine analytics and reports with external data streams, news feeds, social networks, and other Web 2.0 resources in a visualization framework that specifically suits their own business needs and objectives. The mash-up framework provides the glue for integrating data streams and business intelligence with interactive business applications.

Multidimensional Analysis and Online Analytical Processing (OLAP): The multidimensional analysis provided by OLAP tools helps analysts slice and dice relationships between different variables (within their own hierarchies), such as "What are corporate revenues by time?" or "What is the availability of products by supplier by location?" The use of the word "by" suggests a pivot around which the data can be viewed, allowing one to look at sales grouped by time period, then by region, or the other way around, grouped by region, then by time period. OLAP lets the analyst drill up and down along the hierarchies in the different dimensions to uncover dependent relationships that are hidden within the hierarchies.
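The "by" pivot described above can be illustrated with a small sketch in plain Python. The sales records, dimension names, and the `pivot` and `roll_up` helpers are invented for the example; an actual OLAP tool would work against a dimensional model rather than an in-memory list.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical fact records: one measure (amount) and two dimensions.
SALES: List[dict] = [
    {"quarter": "Q1", "region": "East", "amount": 100},
    {"quarter": "Q1", "region": "West", "amount": 150},
    {"quarter": "Q2", "region": "East", "amount": 120},
    {"quarter": "Q2", "region": "West", "amount": 90},
    {"quarter": "Q2", "region": "East", "amount": 30},
]


def pivot(facts: List[dict], outer: str, inner: str,
          measure: str = "amount") -> Dict[str, Dict[str, int]]:
    """Sum the measure grouped by the outer dimension, then the inner one."""
    cube: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for fact in facts:
        cube[fact[outer]][fact[inner]] += fact[measure]
    return {key: dict(cells) for key, cells in cube.items()}


def roll_up(pivoted: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Drill up one level: collapse the inner dimension into a total."""
    return {key: sum(cells.values()) for key, cells in pivoted.items()}


by_time = pivot(SALES, "quarter", "region")    # sales by time, then region
by_region = pivot(SALES, "region", "quarter")  # the other way around

print(by_time)
print(by_region)
print(roll_up(by_time))
```

The same facts feed both views; only the order of the grouping dimensions changes, which is what pivoting around "by" amounts to in this simplified setting.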
and their related attributes, characteristics, profiles, and transaction histories and can be used in real-time embedded predictive models to enhance operational decision-making.
Challenges
A mature business intelligence and analytics program will have a full complement of these technology components to support knowledge consumers across the full analysis spectrum. In many environments the analytics program has grown organically, with a variety of acquired tools and internally developed solutions integrated with different choices of hardware, network, and software (such as database management systems), leading to a workable, if not the most efficient, solution. And while the horde of vendors has endeavored to support interoperability (so that they all seem to "play well together"), BI systems have been engineered over time with heterogeneous components provided by different vendors, with little thought for the complexity of component integration, let alone performance or optimization. In fact, as the need for speed grows, we are recognizing some latent challenges for organizations that try to "home brew" their comprehensive analytics infrastructure, including these factors:

Organic Development and Heterogeneity: The organic nature of the development means that the analytic applications have been incorporated on an on-demand basis, with neither a comprehensive program plan nor an assessment of business needs across the enterprise. This leads to technical dependencies based on development decisions that are not specifically related to addressing business needs, and in time, those dependencies may impede the maturation of a flexible end-to-end analytics solution.

Flexibility and Extensibility: Despite the attempts by many vendors to enable interoperability, each product can work well only with those (usually released) product versions for which there are published specifications. In reality, this imposes stringent integration constraints which may prevent the customer from enabling all available product capabilities. For example, if the selected data cleansing tool only works with version 5.7 of the selected ETL product, the customer must refrain from upgrading to version 6 of the ETL product until the data cleansing vendor enhances its product to support the ETL product upgrade. Also, as business needs, requirements, analytics expectations, or numbers of consumers change, the underlying analytics infrastructure will have to adapt to those changes. This suggests a need for an ability to easily add capabilities and functionality to the business intelligence infrastructure.

Data Quality: Even with an increased concentration on data governance and quality management, intermittent data validation, different tools for parsing, standardization, and cleansing, and conflicting rule sets still contribute to data flaws and inconsistency.

Time to Value: Installing, testing, and validating a variety of components and ensuring that they operate well together requires a significant investment of time and resources in planning, design, implementation, and deployment. The increased complexity of implementation and deployment increases the time until the systems can be used productively.

Performance and Scalability: Many BI systems become victims of their own success; as the number of users increases, the query load grows, or the amount of data to be analyzed grows, the system's inability to scale appropriately leads to performance degradation. This scalability challenge only increases when interoperability constraints artificially throttle back the performance potential of any of the integrated components.
Performance and Scalability: When an end-to-end solution is designed to run on top of specific hardware, the developers are able to take advantage of a number of optimizations integrated directly into both the hardware and software platforms, such as workload management, task and process scheduling, load balancing, parallel I/O channels, or high availability. Optimized analytical database management services allow for high-performance analytical data warehousing, supported by parallelized data integration plus high-speed federation services. Increasing numbers of queries can be offloaded to alternate processing units or routed to in-memory databases, decreasing DBMS loads while increasing response rates and throughput.
By transitioning from an organically evolved corporate business intelligence and analytics environment built on top of a myriad of technology components to a strategically architected end-to-end solution, your organization can gain a rapid time to value through real-time, integrated analytics, resulting in advantageous intelligence delivered to the appropriate decision-makers at the right time.