Big Data Implementation
Project Management: Methodologies, Caveats and Considerations
Tiffani Crawford, PhD, builds global Big Data Analytics systems. She has 20 years of
high-technology experience with Fortune 500 companies, including Cisco Systems, Cognizant,
Bank of America, VISA/Inovant, BAE Systems, Applied Competitive Technologies, Ditech
Networks/Nuance, Big 4 financial firms, defense contractors and startups. She has worked in
seminal technology development in Big Data, analytics, cloud, networking,
telecommunications, software development, distributed multi-tier applications,
multimedia/digital, geographic information systems, intelligent transport systems, finance,
security, policy systems and structural equation modeling. She is a credited software
developer and published author. She earned her PhD from the University of Southern
California in 2005 and also holds Master's and Bachelor's degrees and various technology
certifications. She is a member of PMI and has made various philanthropic contributions.
Hands-On Exercise
• Create a repeatable model for a client’s Big Data Analytics in a focus area
that inspires you and your team
• Continue this exercise and discussion via social media and online
applications
Large quantities of many data types
• Structured
• Semi-Structured
• Unstructured
• Human
• Machine
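The difference between the first three data types above can be made concrete with a small sketch. It uses only the Python standard library; the sample records, field names and the three `parse_*` helper functions are hypothetical and invented for illustration, not taken from any client system.

```python
import csv
import io
import json

# Hypothetical sample records, invented for illustration only.
structured = "customer_id,amount\n101,25.50\n102,40.00\n"
semi_structured = '{"customer_id": 101, "tags": ["mobile", "promo"], "device": {"os": "android"}}'
unstructured = "Customer 101 called to say the mobile promo was confusing."


def parse_structured(text):
    """Structured: a fixed schema, every row has the same named columns."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_semi_structured(text):
    """Semi-structured: self-describing, nested fields that may vary per record."""
    return json.loads(text)


def parse_unstructured(text):
    """Unstructured: no schema, so fall back to simple lowercase word tokens."""
    return [word.strip(".,").lower() for word in text.split()]


rows = parse_structured(structured)
doc = parse_semi_structured(semi_structured)
tokens = parse_unstructured(unstructured)

print(rows[0]["amount"])    # value read by column name
print(doc["device"]["os"])  # value read by nested key
print(tokens[:3])           # first few free-text tokens
```

Each step down the list gives up some schema, which is why unstructured sources usually need the most preparation before they can feed analytics.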
Getting to Value
• Quantitative
• Qualitative
• Correlation
• Longitudinal
• Social
• Search
• Operational
• Inferential
• Ethnographic
• Interview-Based
• Causal
[Figure: Getting to Value. The analysis types listed above applied to data points, annotated with example transformations: x/2, 4xy, *5z, and *3t if over 18 years old]
[Figure: Components: Business Context, Execution, Information Delivery, Analytics & Insights, Processing]
[Figure: Hadoop cluster layout showing a Name Node, a Job Tracker, a Secondary (Passive) Node, and multiple Data Nodes and Task Trackers]
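The Job Tracker and Task Tracker labels in the cluster diagram belong to Hadoop's MapReduce layer, where a job is split into map and reduce tasks that run across the nodes. The following is a minimal single-process sketch of that processing model in plain Python, not Hadoop code. The function names and the input splits are hypothetical, chosen only to show the map, shuffle and reduce steps.

```python
from collections import defaultdict


def map_phase(split):
    """Map task: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(mapped_outputs):
    """Group emitted values by key, as the framework does between the phases."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce task: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input splits; on a cluster each would be a block held by a Data Node.
splits = ["big data big value", "data node task tracker", "big data"]

mapped = [map_phase(split) for split in splits]  # one map task per split
counts = reduce_phase(shuffle(mapped))

print(counts["big"], counts["data"])
```

On a real cluster the map tasks run in parallel near the data they read, which is the main reason the model scales to large inputs.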
HBase: Data storage for large distributed tables, with random, real-time read/write access to Big Data
Sqoop: Open source data integration tool for moving data into HDFS from sources outside of Hadoop
Pig: High-level query language platform for analyzing very large data sets with complex rules
Cassandra: Highly scalable NoSQL database that combines a fully distributed design with Bigtable's data model
Bigtable
…
[Figure: architecture with Data Sources, Data Integration, Enterprise Data Warehouse, Analytics Appliance and BigQuery]
[Figure: example view with Start Date and End Date fields and a Competitors list (Competitor1 to Competitor5)]
CURRENT STATE
CHANGE MANAGEMENT
• Organizational discipline to embed analytically-driven process into unified framework for offer management
ANALYTICAL
• Lack of channel preference segmentation, propensity scores by channel and program
• Absence of learning system for refinement
DATA INTEGRATION
• Data resides in silos
• Lack of 360-degree view of customer
ORGANIZATIONAL
• Programs “owned” by different groups and/or third parties
• Different priorities and varying approaches
Approach
Incremental steps reduce gaps
• Subset of channels, programs and members
• Process design + learning system
• Framework for scaling
[Figure labels: Data Submitted, No Action]
Introductory Workshop
Project Management
• Provide a rollout baseline and plan
• Develop high level roadmap (draft)
• Develop high level business plan
• Gap analysis of current and desired analytics and business value
• Develop draft rollout plan

• Detailed review of current state reporting metrics/KPIs
• Detailed gap analysis and mitigation plan
• Define future state roadmap and tactical plan

• Align roadmap with relevant initiatives
• Identify key projects and prioritize by ROI
• Develop baseline rollout plan
• Develop and present results

Objective: Develop one next generation analytics and visualization view
Deliverables
• Obtain sample data from each source
• Project expected results for each query type
• Identify next steps for additional data types, BUs and entities
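The step of identifying key projects and prioritizing them by ROI can be sketched as a simple ranking. This is one possible approach under stated assumptions: ROI is taken as (benefit - cost) / cost, and the `Project` class, the `prioritize` function, the project names and all figures are hypothetical placeholders rather than data from an actual engagement.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    expected_benefit: float  # hypothetical estimate, in currency units
    estimated_cost: float    # hypothetical estimate, in currency units

    @property
    def roi(self):
        """Return on investment as a ratio: (benefit - cost) / cost."""
        return (self.expected_benefit - self.estimated_cost) / self.estimated_cost


def prioritize(projects):
    """Order projects from highest to lowest ROI."""
    return sorted(projects, key=lambda project: project.roi, reverse=True)


# Placeholder candidates with invented figures.
candidates = [
    Project("Project A", expected_benefit=500_000, estimated_cost=200_000),
    Project("Project B", expected_benefit=300_000, estimated_cost=100_000),
    Project("Project C", expected_benefit=400_000, estimated_cost=250_000),
]

ranked = prioritize(candidates)
for project in ranked:
    print(f"{project.name}: ROI {project.roi:.0%}")
```

A ratio-based ranking favors small, cheap projects, so in practice it is often read alongside absolute benefit and alignment with the roadmap rather than used alone.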
Hadoop – http://www.hadoopilluminated.com/
NoSQL – http://nosql-database.org/
Agile – http://agilemanifesto.org/
Scrum – http://www.scrumalliance.org/