Dawn Issac Sam
SUMMARY
Senior Data Engineer with 12+ years of IT experience in Big Data, Cloud Migration, Data Warehousing,
ETL, Analytics, Business Intelligence, Digital Marketing, and Data Migration.
Worked as Lead Data Engineer, ETL Architect, and Data Architect for multiple Fortune 500 companies
across the USA, Australia, and India, designing and developing ETL, BI, and marketing solutions from
the ground up.
Strong hands-on knowledge of Python, SQL, Spark, Git, and CI/CD.
Knowledgeable in data integration on AWS using S3, AppFlow, Lambda, API Gateway, Secrets Manager,
KMS, Kinesis, etc., with CDK for infrastructure deployment.
Knowledgeable in requirements gathering and analysis, ELT environment design and build, data modeling,
platform builds on on-prem servers or in the cloud, and data migration.
Strong believer in winning together and owning problems collectively.
Keen eye for optimization, enhancements and process improvements.
People person with a knack for integrating talents and bringing people together around ideas.
Extensive knowledge of the Telecom, Media, Retail, Manufacturing, Hospitality, and Healthcare domains.
Proven leader delivering successful data products/platforms as well as end-to-end business solutions.
Extensive experience building highly specialized data platforms to enable customer engagement,
email personalization, PEP/SMS messaging, model-based customer scoring, marketing funnels, and
BI dashboards.
Excellent written and spoken communication skills.
Technical Skills
Languages: Python, SQL, Shell Scripting, C
ETL & BI Tools: Apache Spark, NiFi, Informatica, IBM DataStage, SAS, Tableau
Databases: Snowflake, MongoDB, Teradata, Exadata, Oracle, SQL Server
Platforms/Tools: AWS, TOAD, Erwin, PuTTY, Maestro, UC4, Airflow, GitHub
Domain: Telecom, Retail, E-commerce, Manufacturing, Hospitality, Healthcare
PROFESSIONAL EXPERIENCE
The Walt Disney Company – Guest 360 Platform
Disney employs multiple membership platforms across its different regions of operation. To build an
enterprise-wide membership platform, business rules needed to be rewritten, real-time data integrated,
and a complex recognition system employed, ensuring that data is sourced directly from each line of
business or platform and integrated.
This role requires an extensive understanding of Disney’s core business across various subject areas,
combining business knowledge with input from SMEs to build the data layer and consumption layers for
internal customers.
Responsibilities:
Participated in the overall design of the new Guest 360 platform
Created ingestion patterns in AWS using CDK (Infrastructure as Code)
Created APIs for consuming events from Salesforce
Provisioned highly available data via AWS Kinesis for internal customers
Set up the Snowflake environment
Led the migration from on-premise Exadata/SAS to Snowflake/Python, resulting in annual cost savings
of over $1 million and a performance increase of over 200%
Designed and implemented real-time data pipelines to ingest guest events from various APIs using API
Gateway, Lambda, S3, and Kinesis
Created complex SQL scripts, CTEs, and stored procedures to load and manipulate data per business
logic, and built the semantic layer using views
Created data ingestion pipelines using Python, manipulating data in DataFrames before loading to Snowflake
The Walt Disney Company – Ad Tech Platform March 2019 – June 2021
Built a marketing data platform that integrates known and unknown profile data, matched with identifiers
at the person level, campaign level, lead level, channel type, etc. The platform needed to be tightly
aligned across multiple grains to empower effective marketing and communication with leads and guests
without redundancy.
Responsibilities:
Involved in tech stack decisions for the Ad Tech platform
Designed the logic for creating unique identifiers; coded and integrated identifiers at different grains
in Snowflake and Python
Developed PoCs using new technologies to dry-run architectural proposals
Designed the architecture and layout for the data side of the platform in Snowflake
Created Python scripts for data ingestion from multiple databases as well as APIs
Integrated data using Lambda services in AWS
Optimized database tables using cluster keys and automated warehouse sizing
Automated processes and integrated them with CI/CD pipelines
Automated monitoring of resource usage within Snowflake for cost optimization
Technology stack: Python, SAS, AWS, Hadoop, GitHub, Exadata, Snowflake
The Walt Disney Company – Customer Engagement Feb 2017 – Feb 2019
Engineered and built the Common Campaign Profile (CCP), a unified data platform enabling drill-down at
the transaction or guest level across multiple guest engagement platforms at Disney. The platform
enabled a holistic view of guest behavior, campaign personalization, optimal allocation of assets in
campaigns, and segregated customer profiles for greater engagement, and allowed cross-platform
marketing to increase customer engagement and revenue.
This role utilized an extensive understanding of Disney’s core business across various subject areas,
combining business knowledge with input from SMEs to build an atomic data layer for consumption by
data experts.
Responsibilities:
Designed the new data platform for the Common Customer Profile
Implemented the physical data model on Snowflake and created Python jobs to pull JSON, XML, AWS S3,
and flat-file data from various APIs to set up the Customer Data Platform
Integrated data using Glue/Lambda services in AWS
Optimized database tables using cluster keys and automated warehouse sizing
Built a decision engine to switch load processes between Spark, NiFi, and Python
Automated processes and integrated them with CI/CD pipelines
Created a health-metric system to monitor multiple environments and ensure data integrity and congruency
Spun up EC2 instances and scheduled ad-hoc Python jobs
Automated ETL processes across billions of rows of data, reducing data availability time from 24 hours
to 3 hours
Technology stack: Python, SAS, NiFi, AWS, GitHub, Kafka, Spark, YARN, Hive, Presto, Snowflake, Teradata
The Walt Disney Company – Customer Engagement Sep 2013 – Jan 2017
ETL Developer
Developed end-to-end marketing campaign data pipelines based on business requirements. Built data pulls
using Python/Teradata and created customized, guest-specific emails and app and PEP messages using
Informatica and SAS.
Responsibilities:
Created technical designs based on user requirements
Developed and debugged programs to create personalized emails for Email Marketing. Used Python to
produce RTF and HTML reports.
Developed an automated system for updating Canadian compliance indicators for marketing
Developed and maintained production Python batch jobs.
Maintained and enhanced existing SAS reporting programs for campaigns; participated in code reviews to
verify expected output and efficiency.
Designed standardized and automated ad-hoc reporting for new product development, online marketing,
direct mail, reward-based marketing, etc.
Designed and developed Informatica ETL mappings to extract and load master and transactional data from
heterogeneous data feeds
The Walt Disney Company – Information Technology Division Sep 2011 – Aug 2013
ETL Architect
Guest Analytics and Business Intelligence Environment (GABE) was a migration project to forklift the
Guest Data Mart (GDM) from Oracle to a Teradata environment. The project involved converting Oracle
triggers, PL/SQL, DataStage mappings, and SAS scripts to the Teradata environment and integrating a
data lake to provide PII data.
Responsibilities:
Installed and set up the Teradata appliance on test servers.
Converted the logical data model into a physical data model.
Designed the guest data layer for DataStage to load efficiently into the new data mart (GDM).
Monitored the DataStage conversion and recommended best practices to leverage maximum performance and
efficiency from Teradata.
Trained the CMR (Customer Marketing and Relationships) team to bring them up to speed on Teradata
technology.
Converted SQL in SAS scripts.
Reviewed and approved DataStage jobs converted by Teradata Professional Services.
Built a parallel test environment in Oracle and maintained data for validation executions.
Conducted performance tests and optimized performance in the new environment.
Created test SQL and automated it with shell scripts to validate migrated data and DDLs.
Built load processes in Python to load XML data from the data lake, integrating web traffic data into
the data mart.
LSI Corporation – Project Logan
Project Logan was an initiative in the Storage division of LSI Corporation to build a brand-new Teradata
data warehouse integrating data from the operational system in SAP, incident reporting in Agile, and
test/warranty repairs done by contract manufacturers in Siebel.
The project was designed to develop a one-stop shop for all reporting needs and build scorecards for
AFR, ARR, IRR, IQR, etc. This EDW implementation gave business groups a window into a combined
environment. The end goal was a holistic environment for monitoring and analyzing supply-chain
efficiency, optimizing cost, and strategizing product releases driven by analytics down to the lowest
grain of the business.
Responsibilities:
Spearheaded the design team to stand up a data warehouse from the ground up.
Engaged with business leaders to get roadmap approval.
Guided lead engineers in preparing detailed technical specifications.
Designed the system with a database environment hosted on a Unix platform, a control environment for
batch processing, data pipelines for ingestion, and an ETL environment for business transformations.
Coordinated work assignments with the off-site team.
Customized the Manufacturing Logical Data Model (M-LDM).
Created Teradata Parallel Transporter (TPT) bulk-load script templates for the team to work from.
Made the environment UTF-8 compliant and set up the database and scripts accordingly.
Set up processes to load XML data using Python.
Built scripts for mail alerting (using mailx), sequential file management, file filtering, error handling, etc.
Set up the SAS environment and created pilot scripts to provide data analysis for business analysts.
Telstra, Australia Aug 2005 – Feb 2008
ETL Developer / Production Support
Environment: Teradata V2R6, Control-M, Informatica, UNIX, IBM Mainframe