CV ALBERIC KOUADIO
Data Engineer
Statistician, economist, and military veteran with 5+ years of experience in administrative support, database management,
and statistical management, and with experience in machine learning and data mining on large structured and unstructured
data sets, including data acquisition, data validation, predictive modeling, and data visualization. Financial services
and industry experience in text mining, transforming words and phrases in unstructured data sets into numerical values.
Innovative and scientifically rigorous recent graduate with significant data science internship experience.
Team-oriented and eager to contribute my abilities in quantitative modeling and experimentation to
enhance the experience of Pinterest users around the world.
SKILLS
Analytical methods: Panel Data, Dynamic Panel Data, Advanced Data Modelling, Forecasting, Time Series Models,
Regression Analysis, Predictive Analytics, Statistical Analysis (ANOVA, ANCOVA, AR, MA, and ARIMA), Inferential
Statistics, Bayesian Statistics, Descriptive Statistics
Programming: SAS (base SAS and Macros), SQL, R, Python, Spark and SparkSQL
Strong SQL skills including complex query building and query performance tuning.
ETL experience using Informatica, DataStage, or an equivalent technology.
Proficient in SQL, Python, and data modeling tools such as Power BI or Tableau
Experience communicating and delivering results to data scientists and machine learning and modeling teams to support
the implementation of use cases.
Python (or similar language) for data collection, analysis, and communication, including familiarity with the pandas
package, object-oriented programming, and built-in functions. Experience with R.
Familiarity with cloud-based data platforms such as AWS, Azure, or GCP
Databricks, AWS Glue, S3, EC2
2 years of experience in data engineering including ETL development, data modeling, and schema design
Supervised Learning: linear and logistic regressions, decision trees, support vector machines (SVM)
Unsupervised Learning: K-means clustering, principal component analysis (PCA)
Python (NumPy, Pandas, Scikit-learn, Keras, TensorFlow), SQL (MySQL, Postgres, Microsoft SQL Server), Time Series
Forecasting, Productionizing Models, Recommendation Engines, Customer Segmentation
Data Visualization: Excel, Google Sheets, Tableau, Power BI, Excel Power Query, MicroStrategy
WORK EXPERIENCE
⬧ Developed data visualizations and dashboards to communicate key insights using Power BI and Excel Power
Query.
⬧ Created and managed impactful analytics dashboards using Power BI or Excel Power Query for data reporting.
⬧ Data analysis and reporting using Advanced Excel 365 (VLOOKUP, PivotChart, Power Pivot, Power Query,
PivotTable, formulas, graphs).
⬧ Model governance and review, including model deficiency identification and remediation, independent testing,
ongoing performance monitoring, and compensating controls in all aspects of the model lifecycle.
⬧ Statistical analysis using R and Python.
⬧ Design and deploy graphic visualizations using Power BI.
⬧ Create calculated fields, combine fields, bins, sets, and hierarchies using Power BI.
⬧ Cleaned unstructured datasets.
⬧ Data wrangling in notebooks such as Jupyter Notebook.
⬧ Querying, validating, manipulating, and transferring data sets across multiple formats using SQL.
⬧ Worked on source-to-target data mappings in Excel and in the Informatica Data Quality (IDQ) tool to deliver
application-level rules to the developers, and performed data profiling.
⬧ Authored various use cases, activity diagrams, and sequence diagrams using Rational RequisitePro and used
UML methodology to define the Data Flow Diagrams (DFD).
⬧ Performed forward engineering and reverse engineering, and applied naming standards, in Erwin.
⬧ Worked closely with ETL, Hadoop, and SSRS developers to deliver the business rules they needed to produce output
in report, flat-file, or PDF format.
⬧ Cleaned unstructured datasets into structured datasets in Power BI.
⬧ Extracted and transformed unstructured datasets in Power Query and loaded them (ETL) into Power BI for visualization.
⬧ Translate complex technical findings into an easily understood narrative in graphical, verbal, or written form.
⬧ Identify, design, and implement engineering process improvements by automating manual processes,
optimizing data pipelines, and re-designing services for greater scalability.
⬧ Data analysis and reporting, including dashboard development, data cleansing, metrics analysis, and research.
⬧ Implemented cross-curricular lessons that used multiple instructional methods to promote academic
growth in a 9th- to 12th-grade classroom for twelve weeks.
⬧ Strengthened the classroom management plan to increase students' focus and motivation.
⬧ Worked with teachers to develop standardized assessments.
⬧ Worked with 4 interns to conduct an attitude study, which led current buyers to purchase products 13% more
frequently.
⬧ Built data visualizations using SQL and Tableau for business KPIs that reduced manual reporting by 9 hours weekly.
⬧ Received, cleaned, and prepped data from a client using Python, SQL, and Excel to help data scientists build
marketing mix models that lifted ROI by 4 basis points.
⬧ Created a calculator with Excel and SQL that lets a client prioritize a project roadmap by changing inputs.
⬧ Collaborated with and garnered feedback from product managers and analysts, and documented user data
⬧ Determined a strategic marketing opportunity for a client through analysis, identifying savings of $12K in the
annual campaign budget.
⬧ Contributed to reports on product development and design.
⬧ Create taxonomies, glossaries, and ontologies using TopBraid EDG for the Naval Meteorology and
Oceanography Command.
⬧ Made recommendations for Naval Oceanography enterprise controlled vocabulary governance (Navy).
⬧ Manage Metadata for the Naval Meteorology and Oceanography Command
⬧ Created an XML document and transformed XML documents into XLS using XSLT.
⬧ Handle unstructured and structured data to derive information that supports the development of the
company.
⬧ Write SPARQL queries to perform advanced data extraction, transformation, and data management tasks,
providing on-the-go responses to management questions by performing complex joins and queries.
⬧ Analyze benchmarking data, reports, processes, and measurements for Naval meteorology and oceanography
indicators.
⬧ Used Python 3.0 (NumPy, SciPy, Pandas, scikit-learn, Seaborn, NLTK) to develop a variety of models and
algorithms for analytic purposes.
⬧ Lead and perform activities to develop and execute tests that validate ATG system functionality against specifications.
⬧ Conduct statistical inventory reconciliation on more than 600 underground storage tanks.
⬧ Analyze inventory and dispensing data to determine whether a tank system is leaking.
⬧ Gather more than 200 tank metrics on a continuous basis and produce reports to communicate performance
against goals, areas for improvement, and progress toward the desired state.
⬧ Manage the People.Net database to set up, add, and delete drivers when implementing Electronic Logging Devices.
⬧ Collaborated with other analysts and key stakeholders to identify underlying internal and external trends
impacting current and future enrolment and financial considerations, and incorporated those trends into forecast models.
⬧ Handled a large data set drawn from five different sources.
⬧ Used analysis packages (e1071, caTools, scikit-learn, dplyr, ggplot2) in R and Python for data querying,
extraction, and manipulation.
⬧ Monitor 30 workflows daily using Salesforce and an integrated data system.
⬧ Used Excel, including pivot tables, graphing, mechanization, and fast load.
⬧ Enhanced existing linear statistical models for predicting the best commercialization prices by
applying a mix of linear regression and random forest regression.
⬧ Created impactful dashboards using Tableau or Excel for data reporting.
⬧ Data analysis and reporting using Advanced Excel (VLOOKUP, PivotChart, Power Pivot, formulas, graphs).
⬧ Model governance and review, including model deficiency identification and remediation, independent testing,
ongoing performance monitoring, and compensating controls in all aspects of the model lifecycle.
⬧ Design and deploy graphic visualizations using Tableau.
⬧ Create calculated fields, combine fields, bins, sets, and hierarchies using Tableau.
⬧ Implemented Data blending to blend related data from different data sources in Tableau.
PROJECTS
Fantasy Football Models
⬧ Aggregated and prepped 3 years of fantasy football projection data from 3 independent sources into a MySQL
database.
⬧ Created a random forest model in R combining disparate sources into one projection that outperformed the mean
absolute error of the next best projection by 15%.
Pima Indians Project
⬧ The goal of this project is to build a classifier that predicts whether a patient has diabetes,
using a support vector machine (SVM) with four kernel functions.
⬧ Our application of SVM found models that achieved fair classification performance, with the RBF kernel reaching
an accuracy of around 82.61%.
Text Data Analysis (YouTube Case-Study)
⬧ Perform sentiment analysis using pandas, seaborn, matplotlib, and NumPy.
⬧ Perform emoji analysis.
Marketing Department (New York Bank)
⬧ Understand the power of data science to perform market segmentation using the K-means algorithm and Principal
Component Analysis.
⬧ Perform exploratory data analysis, visualize the data set, and understand the theory behind K-means clustering
and the elbow method.
Entertainment Engine
⬧ Built an enhanced entertainment recommender using k-nearest neighbors in scikit-learn after aggregating data
from Rotten Tomatoes.
⬧ Built visualizations in Tableau to show how ratings changed over time and how the model was performing.
⬧ Saved 15+ minutes on entertainment selections relative to previous methodology.
Operation Department (Toronto Hospital)
⬧ Develop a model that can detect and classify chest disease and reduce the cost and time of detection.
⬧ Learn how to leverage the power of deep learning to improve the process.
⬧ Understand the theory and intuition behind convolutional neural networks and deep learning, and perform
image classification.