Rohit
Data Analyst
PROFESSIONAL SUMMARY:
· 8+ years of IT experience spanning Data Analysis, Data Engineering, Business Requirement Gathering, Data Modeling, and Technical Documentation, working as a full-stack data scientist.
· Good exposure to machine learning models and techniques, the statistics behind them, and workarounds for different types of data: structured, unstructured, numerical, text, and image.
· Experienced and comfortable working with databases such as SQL Server, AWS Redshift, Oracle, PostgreSQL, and NoSQL stores like MongoDB and Redis.
· Developed analytical measures, delivered insights to stakeholders, and created Tableau reports at store, city, and pan-India levels for performance assessment.
· Well conversant with test case design, test case execution, test data preparation, SDLC concepts, and the Defect Life Cycle.
· Expert in using design and management tools like Erwin, Toad, and SQL Developer.
· Thorough knowledge of creating DDL, DML, and transaction queries in SQL for Oracle databases.
· Expertise in software management tools such as ClearQuest, Quality Center, HP ALM, and JIRA for defect tracking and reporting.
· Proficient in Data Governance, Data Lifecycle, Data Quality Improvement, Master Data
Management, and Metadata Management.
· Worked with the Data Governance team to evaluate test results for fulfillment of all data requirements.
· Expertise in test data validation with multi-dimensional cubes such as SSAS and BO.
· Cloud technologies: AWS (EMR, EC2, S3, Athena) and Azure (Blob Storage, Azure HDInsight, MS SQL Server).
· Proficient in Oracle 11g, Teradata, SQL Server, and PL/SQL on UNIX and Windows platforms. Extensive experience in database activities such as data modeling, design, development, maintenance, performance monitoring and tuning, troubleshooting, and data migration.
· Capable of preparing financial and on-demand reports using Oracle SQL and Teradata. Developed ad hoc queries and reports using Oracle SQL, PL/SQL, and UNIX to fulfill data requests from business analysts, operations analysts, and financial analysts.
· Proficient in preparing presentations, graphs, and pivot tables using Microsoft Excel and PowerPoint.
TECHNICAL SKILLS:
Databases & Cloud: AWS, GCP, Redshift, MongoDB, MySQL, PostgreSQL, Redis, DB2, Oracle, PL/SQL, SQL Server, Teradata, Microsoft PDW
SCM & Version Control: VSS, Team Foundation Server, ClearQuest, ClearCase, IBM CMVC
ETL Tools: Airflow, Docker, Cron, CI/CD, Informatica, SSIS, Ab Initio, DataStage
Testing Tools: HP Quality Center, Mercury Quality Center/TestDirector, ClearCase
Languages: Python, R, Java, SQL, Node.js, C, C++, MATLAB
Frameworks & Data Platforms: PyTorch, TensorFlow, NumPy, Spark, Flink, Kafka, Hadoop, RDBMS (SQL), Redshift, MySQL, PostgreSQL, MongoDB (NoSQL), HBase, Redis, Elasticsearch, AWS
Other Tools & Skills: Tableau, Kibana, CI/CD, Simulation & Optimization, Data Visualization, Machine Learning, Deep Learning, Data Mining, Business Analytics
PROFESSIONAL EXPERIENCE:
Responsibilities:
· Calibrated devices against reference (government) device values; the devices monitor air pollutants such as CO, NO2, O3, and PM2.5/PM10 levels. Pre-processed sensor data and applied machine learning models such as non-linear regression, ARIMA, and ARIMAX; a GAM produced the desired results and received government approval (a calibration sketch follows this list).
· Enhanced supply chain and logistics by building an end-to-end Vehicle Routing Logistics (VRL) pipeline using the Open Source Routing Machine (OSRM), iteratively reducing bias in trip generation.
· Optimized the supply chain through root cause analysis of funnel events and processes causing delays, such as partner availability, idle time, handshakes, and item location updates.
· Performed geographic segmentation and visualization in Kepler.gl using OpenStreetMap (OSM) data stored in a PostgreSQL database.
· Worked with data standardization, merging, and data pattern matching for MDM setup.
· Manipulated and prepared data and extracted data from databases for business analysts using SAS.
· Analyzed the Business Requirements Specification Documents and Source to Target Mapping
Documents and identified the test requirements.
· Coordinated with business users to understand functional requirements; this included creating the ETL Specification Document and participating in review meetings.
· Worked with AWS Athena and the Snowflake database for data analysis.
· Selected and generated data into CSV files, stored them in AWS S3 using AWS EC2, and then structured and loaded them into AWS Redshift (see the S3-to-Redshift sketch after this list).
· Performed UAT and obtained sign-off from the application owner before making the application live in AWS.
· Tested the source data for data completeness and data correctness.
· Tested the PL/SQL package that loaded data into staging from the source database.
· Created the ETL process that loaded the data into target database after performing all the
transformations according to the business requirements.
· Validated the test data in DB2 tables on Mainframes and on Teradata using SQL queries.
· Wrote complex SQL queries for checking counts and validating the data at the field level (a validation-query sketch follows this list).
· Tested the format of the reports against the specifications provided and compared the report data with the backend data mart using SQL, with Excel used for additional data comparison.
· Created XML schema definitions (XSDs) with the XMLSpy tool and converted them into Informatica metadata.
· Participated in biweekly technical huddle meetings with the development and DBA teams, participated in the weekly data analyst meeting, and submitted weekly data governance status.
· Created Excel templates using macros and extensively used VB scripting to create various reports according to end-user requirements.
· Performed segmentation to extract data and create lists to support direct marketing mailings
and marketing mailing campaigns.
· Worked with the systems engineering team to deploy and test new Hadoop environments and expand existing Hadoop clusters.
· Built FastLoad and FastExport scripts to load data into Teradata and extract data from Teradata.
· Involved in end-to-end testing of the entire process flow, from the source database to the target data mart to the reports, considering all possible scenarios.
· Was responsible for maintaining all the test cases and defects in HP Quality Center 10 for all the
team members to review.
· Worked with data compliance and data governance teams to maintain data models, metadata, and data dictionaries, and to define source fields and their definitions.
· Prepared data quality criteria and governance for Data Warehousing Application.
· Used VBA for Excel to automate data entry forms and help standardize data.
· Used SAS for data pre-processing, SQL queries, data analysis, report generation, graphics, and statistical analysis.
· Performed integration testing of Hadoop packages for ingestion, transformation, and loading of massive structured and unstructured data into a benchmark cube.
· Carried out the testing strategy/validations against MDM subject area by implementing key test
cases.
· Exploited the power of Teradata to solve complex business problems through data analysis on large data sets.
· Worked with VB Script and UNIX Shell scripting for File Validations.
· Involved in writing detailed-level test documentation for reports and Universe testing, and in developing detailed test strategies, test plans, test cases, and test procedures using Quality Center for functional and regression testing.
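Below is a minimal sketch of the sensor-calibration step referenced above: fitting an ARIMAX-style model (statsmodels SARIMAX with an exogenous regressor) of reference-station values on raw sensor readings. The file name, column names, and hourly frequency are assumptions for illustration; the non-linear regression and GAM variants actually compared in the project are not shown.

```python
# Sketch: calibrate low-cost sensor PM2.5 readings against a government
# reference monitor using SARIMAX with the raw sensor signal as an
# exogenous regressor. Column names are hypothetical.
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

df = (pd.read_csv("sensor_vs_reference.csv", parse_dates=["timestamp"],
                  index_col="timestamp")
        .asfreq("H")
        .interpolate())

train, test = df.iloc[:-168], df.iloc[-168:]          # hold out the last week

model = SARIMAX(train["reference_pm25"],              # target: reference values
                exog=train[["sensor_pm25"]],          # exogenous: raw sensor signal
                order=(1, 0, 1))
fit = model.fit(disp=False)

# Calibrated prediction for the hold-out week, driven by the raw sensor series.
calibrated = fit.forecast(steps=len(test), exog=test[["sensor_pm25"]])
mae = (calibrated - test["reference_pm25"]).abs().mean()
print(f"Hold-out MAE: {mae:.2f} µg/m³")
```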
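The CSV-to-S3-to-Redshift flow could look roughly like the following, using boto3 and psycopg2 as one possible tool choice. Bucket, table, connection, and IAM role names are placeholders, not the project's actual values.

```python
# Sketch: stage a CSV extract in S3, then bulk-load it into Redshift via COPY.
import boto3
import psycopg2

s3 = boto3.client("s3")
s3.upload_file("daily_extract.csv", "my-analytics-bucket",
               "extracts/daily_extract.csv")

conn = psycopg2.connect(host="my-cluster.example.redshift.amazonaws.com",
                        port=5439, dbname="analytics",
                        user="etl_user", password="***")
with conn, conn.cursor() as cur:
    # Redshift COPY bulk-loads the staged CSV into the target table.
    cur.execute("""
        COPY staging.daily_extract
        FROM 's3://my-analytics-bucket/extracts/daily_extract.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        FORMAT AS CSV IGNOREHEADER 1;
    """)
conn.close()
```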
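A sketch of the count and field-level validation queries mentioned above, wrapped in Python for convenience. The schema, table, and column names are hypothetical, and any DBAPI connection would work in place of pyodbc.

```python
# Sketch: compare source vs. target row counts, then flag rows whose key
# exists in both systems but whose field values disagree after ETL.
import pyodbc

SRC_COUNT = "SELECT COUNT(*) FROM src_schema.customer"
TGT_COUNT = "SELECT COUNT(*) FROM tgt_schema.customer"

FIELD_CHECK = """
SELECT s.customer_id
FROM   src_schema.customer s
JOIN   tgt_schema.customer t ON t.customer_id = s.customer_id
WHERE  COALESCE(s.email, '')  <> COALESCE(t.email, '')
   OR  COALESCE(s.status, '') <> COALESCE(t.status, '')
"""

conn = pyodbc.connect("DSN=warehouse")
cur = conn.cursor()
src_cnt = cur.execute(SRC_COUNT).fetchone()[0]
tgt_cnt = cur.execute(TGT_COUNT).fetchone()[0]
mismatches = cur.execute(FIELD_CHECK).fetchall()
print(f"counts: {src_cnt} vs {tgt_cnt}, field mismatches: {len(mismatches)}")
```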
Responsibilities:
· Segmented customers using the Recency, Frequency & Monetary (RFM) model and analyzed various KPIs to build segment-specific A/B test experiments and strategies for increasing user engagement, which grew the high-performance customer segment by 8.59% (an RFM scoring sketch follows this list).
· Performed Optical Character Recognition (OCR) and text extraction on PDF files, and analyzed geographic data using ArcGIS APIs.
· Modeled insurance data to forecast the likelihood of a car breakdown and the resulting profit/loss.
· Worked with Data Modeling team to create Logical/Physical models for Enterprise Data
Warehouse.
· Designed and Developed Complex Active reports and Dashboards with different data
visualizations using Tableau desktop on customer data.
· Experience in monitoring and managing the Hadoop cluster using Cloudera Manager.
· Created various PL/SQL stored procedures for dropping and recreating indexes on target tables.
· Worked in a Mainframe environment and used SQL to query various reporting databases.
· Familiar with using Set, Multiset, Derived, Volatile, and Global Temporary tables in Teradata for larger ad hoc SQL requests.
· Used advanced Teradata OLAP functions such as CSUM, MAVG, MSUM, and MDIFF (an illustrative query follows this list).
· Developed reports using advanced Teradata techniques such as RANK and ROW_NUMBER.
· Efficient in process modeling using Erwin in both forward and reverse engineering cases.
· Pulling data using SQL from various servers including DB2 and SQL Server.
· Developed Data Migration and Cleansing rules for the Integration Architecture (OLTP, ODS, DW).
· Developed data mapping documents between Legacy, Production, and User Interface Systems.
· Documented data content, data relationships and structure, and processed the data using Informatica PowerCenter Metadata Exchange.
· Analyzed client data and business terms from a data quality and integrity perspective.
· Performed root cause analysis on smaller self-contained data analysis tasks related to assigned data processes.
· Worked to ensure high levels of data consistency between diverse source systems including flat
files, XML and SQL Database.
· Extracted data from databases like Oracle, SQL server and DB2 using Informatica to load it
into a single repository for data analysis.
· Developed and ran ad hoc data queries against multiple database types to identify systems of record, data inconsistencies, and data quality issues.
· Involved in translating the business requirements into data requirements across different
systems.
· Involved in understanding the customer needs with regards to data, documenting requirements,
developing complex SQL statements to extract the data and packaging/encrypting data for
delivery to customers.
· Designed and maintained Tableau reports used to graphically analyze business data.
· Wrote SQL Stored Procedures and Views, and coordinate and perform in-depth testing of new
and existing systems.
· Analyzed data using SAS for automation and determined business data trends.
· Utilized Excel macros and VBA to automate the process of variance and trend analysis.
· Provided support to Data Architect and Data Modeler in Designing and Implementing Databases
for MDM using ERWIN Data Modeler Tool and MS Access.
· Analyzed test cases for Big Data analytics platform for Claim and provider’s data using Hadoop
and Hive.
· Responsible for understanding the business rules for Master Data management (MDM)
· Worked with Data Architect in Designing the CIM Model for Master Data Management.
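A minimal RFM scoring sketch with pandas, assuming a hypothetical orders extract with customer_id, order_date, and amount columns; the actual segment cut-offs and A/B-test design are not reproduced here.

```python
# Sketch: compute Recency, Frequency, Monetary values per customer and
# bucket each into quintile scores (5 = best) to form RFM segments.
import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Recency is reversed: a smaller number of days since last order is better.
rfm["R"] = pd.qcut(rfm["recency"], 5, labels=[5, 4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 5,
                   labels=[1, 2, 3, 4, 5]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"], 5, labels=[1, 2, 3, 4, 5]).astype(int)
rfm["segment"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
print(rfm.head())
```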
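An illustrative Teradata-style query using the CSUM and MAVG OLAP functions referenced above. Table and column names are hypothetical, and the teradatasql driver is shown only as one possible way to execute it.

```python
# Sketch: running total and 7-day moving average of daily sales per store,
# using Teradata's legacy OLAP functions (GROUP BY acts as the reset group).
import teradatasql

SQL = """
SELECT store_id,
       sale_date,
       daily_sales,
       CSUM(daily_sales, sale_date)    AS running_total,
       MAVG(daily_sales, 7, sale_date) AS moving_avg_7d
FROM   sales_daily
GROUP  BY store_id
"""

con = teradatasql.connect(host="tdhost", user="analyst", password="***")
cur = con.cursor()
cur.execute(SQL)
for row in cur.fetchall()[:10]:
    print(row)
con.close()
```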
Environment: Informatica, SAS/BASE, SAS/Access, SAS/Connect, XML, MS Office, DB2, VBA Excel 2013,
Mainframes, VSAM, JCL, PL/SQL, Tableau, Access, SQL Server, Erwin, InfoSphere Data Architect,
Teradata, Windows, Oracle.
· Created and maintained MATRIX, an internal R Shiny live application providing near-real-time KPI monitoring, performance and operational tracking, and visualizations that help stakeholders, managers, and associates make decisions and strategize accordingly.
· Engineered an ETL pipeline using Java, Flink, Kafka, Google Dataproc clusters, and Airflow for a Single View of Customer consisting of aggregated analytical data for each customer, such as age, trips, lifetime value, offers used, and referrals (an Airflow DAG sketch follows this list).
· Responsible for generating daily reports and for extracting and analyzing data from various data sources such as MongoDB, Cassandra, and Presto using a Flask API, SQL queries, and Metabase, automating these processes.
· Predicted vehicle prices for the next 5 years; price depends on make, model, trim, age, distance travelled, scheduled maintenance, repairs, location, and demand, based on data preparation and analysis (a modeling sketch follows this list).
· Performed predictive maintenance of mobile vehicle objects: analyzed their dynamics, estimated remaining life, and found defects and anomalies in their performance. Calculated the speed of a mobile object from sensor data, estimated vehicle load, and monitored the pressure, temperature, and health of the object.
· Assisted in creating fact and dimension table implementation in Star Schema model based on
requirements.
· Analyzed and rectified data in source systems and Financial Data Warehouse databases.
· Generated and reviewed reports to analyze data using different Excel formats.
· Troubleshooting, resolving and escalating data related issues and validating data to improve
data quality.
· Optimizing/Tuning several complex SQL queries for better performance and efficiency.
· Designed and developed UNIX shell scripts as part of the ETL process to automate loading and pulling data.
· Validated cube and query data from the reporting system back to the source system.
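A minimal Airflow DAG skeleton for the Single View of Customer pipeline described above. Task bodies, schedule, and IDs are placeholders; the real pipeline's Flink/Kafka jobs on Dataproc are not shown here.

```python
# Sketch: a two-step Airflow DAG that extracts raw customer events and then
# aggregates them into the per-customer view (age, trips, lifetime value, ...).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_trips(**context):
    ...  # placeholder: pull raw trip / referral events


def aggregate_customer_view(**context):
    ...  # placeholder: aggregate age, trips, lifetime value, offers, referrals


with DAG(
    dag_id="single_view_of_customer",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_trips",
                             python_callable=extract_trips)
    aggregate = PythonOperator(task_id="aggregate_customer_view",
                               python_callable=aggregate_customer_view)
    extract >> aggregate
```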
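A sketch of the vehicle price model as a scikit-learn pipeline over hypothetical feature names (make, model, trim, age, mileage, region, demand); the original feature engineering and 5-year projection logic are not shown.

```python
# Sketch: gradient-boosted regression of price on vehicle attributes,
# with one-hot encoding for the categorical columns.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.read_csv("vehicle_history.csv")          # hypothetical extract
X = df[["make", "model", "trim", "age_years", "km_travelled",
        "scheduled_maintenance", "repairs", "region", "demand_index"]]
y = df["price"]

pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"),
      ["make", "model", "trim", "region"])],
    remainder="passthrough",
)
pipe = Pipeline([("prep", pre), ("gbr", GradientBoostingRegressor())])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
pipe.fit(X_tr, y_tr)
print("R^2 on hold-out:", pipe.score(X_te, y_te))
```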
Client: Amigos Software Solutions, Hyderabad, India January 2016 to May 2017
Responsibilities:
· Studied a large number of research papers to understand text classification techniques. Used SentiWordNet 2.0 for training on customer review data and a Multinomial Naïve Bayes classifier from the NLTK Python library to classify customer reviews as positive or negative (a classification sketch follows this list).
· Designed, developed, and implemented two professionally finished systems for tracking IT requests and providing a data repository for reports; documented all system functionality.
· Participated in testing of procedures and data, utilizing PL/SQL, to ensure integrity and quality of
data in data warehouse.
· Performed metrics reporting, data mining, and trend analysis in a helpdesk environment using Access.
· Gathered data from the Help Desk Ticketing System and wrote ad hoc reports, charts, and graphs for analysis.
· Identified and reported various computer problems within the company to upper management.
· Reported on emerging trends to identify changes or trouble within the systems, using Access and Crystal Reports.
· Guided, trained, and supported teammates in testing processes, procedures, analysis, and quality control of data, drawing on past experience and training in Oracle, SQL, UNIX, and relational databases.
· Maintained Excel workbooks, such as development of pivot tables, exporting data from external
SQL databases, producing reports and updating spreadsheet information.
· Ran workflows created in Informatica by developers, then compared data before and after transformation to ensure the transformation was successful.
· Modified user profiles, which included changing users' cost center locations and users' authority to grant monetary amounts to certain departments; these monetary amounts were part of the overall budget granted per department.
· Deleted users from cost centers, removed users' authority to grant certain monetary amounts to certain departments, and deleted certain cost centers and profit centers from the database.
· Created Excel pivot tables showing users who had not scanned journal voucher documents; users could find documents by double-clicking their name within the pivot table.
· Merge the duplicate records and ensure that the information is associated with company
records.
· Standardize company names, addresses, and ensure that necessary data fields are populated.
· Review the database proactively to identify inconsistencies in the data and conduct research using internal and external sources to determine whether information is accurate.
· Coordinate activities and workflow with other Data Stewards in the firm to ensure data changes are made effectively and efficiently.
· Review the database to identify and recommend adjustments and enhancements, including
external systems and types of data that could add value to the system.
· Extract data from the database and provide data analysis using SQL to business users based on the requirements; create pivots and charts in Excel sheets to report data in the requested format.
· Assist other Data Stewards with Data Change Management (DCM) Inbox in resolving various
tickets created by the User Change Request in Interaction Database.
· Developed and Created Logical and Physical Database Architecture using ERWIN.
· Designed STAR Schemas for the detailed Data Marts and plan Data Marts involving Shared
Dimensions.
· Conduct Design reviews with the business analysts and content developers to create a proof of
concept for the reports.
· Conducted the required GAP analysis between their AS-IS submission process and TO-BE
Encounter Data Submission Process.
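A minimal sketch of the review sentiment classifier. The original work used SentiWordNet 2.0 features with NLTK; this illustration swaps in a plain bag-of-words plus MultinomialNB pipeline from scikit-learn, with a hypothetical reviews.csv file.

```python
# Sketch: classify customer reviews as positive or negative with a
# bag-of-words Multinomial Naive Bayes pipeline (scikit-learn used here
# as a stand-in for the original NLTK/SentiWordNet setup).
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

reviews = pd.read_csv("reviews.csv")   # hypothetical columns: text, label
X_tr, X_te, y_tr, y_te = train_test_split(reviews["text"], reviews["label"],
                                           test_size=0.2, random_state=42)

clf = Pipeline([("bow", CountVectorizer(stop_words="english")),
                ("nb", MultinomialNB())])
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
print(clf.predict(["great product, fast delivery",
                   "stopped working after a week"]))
```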
Environment: MS Outlook, MS Project, MS Word, MS Excel, MS Visio, MS Access, Power MHS, Citrix,
Clarity, MS SharePoint
Client: Couth Infotech Pvt. Ltd, Hyderabad, India July 2013 to December 2015
Responsibilities:
· Extensively used SQL programming in back-end and front-end functions, procedures, and packages to implement business rules and security.
· Worked with SSIS system variables, passing control and audit variables between packages.
· Created ad hoc reports for testing and supporting UAT and presented them as Excel spreadsheets for data verification.
· Assisted UAT by testing data in different types of reports, like Master/Detail, Cross Tab and
Charts (for trend analysis).
· Developed scripts, utilities, simulators, data sets, and other programmatic test tools as required to execute test plans.
· Created test cases for ETL mappings and design documents for production support.
· Assisted in test strategy and data creation for testing the mapping document.
· Assisted in creating fact and dimension table implementation in Star Schema model based on
requirements.
· Wrote complex SQL queries for querying data against different databases for the data verification process.
· Created data requirements and test data to test Type 2 slowly changing dimension tables (validation sketches follow this list).
· Worked with the ETL group to understand mappings for dimensions and facts.
· Analyzed data from various sources like Oracle, flat files and SQL Server.
· Extensively tested several Cognos reports for data quality, data values, functionality, calculations, fonts, headers, and cosmetics.
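Two sanity checks commonly used when testing Type 2 slowly changing dimensions, written against hypothetical table and column names (dim_customer, dim_key, eff_start_dt, eff_end_dt, current_flag) and run through any DBAPI connection.

```python
# Sketch: SCD Type 2 validation -- (1) exactly one current row per business
# key, (2) no overlapping effective-date ranges for the same key.
import pyodbc

ONE_CURRENT_ROW = """
SELECT customer_id, COUNT(*) AS current_rows
FROM   dim_customer
WHERE  current_flag = 'Y'
GROUP  BY customer_id
HAVING COUNT(*) <> 1
"""

NO_OVERLAP = """
SELECT a.customer_id
FROM   dim_customer a
JOIN   dim_customer b
  ON   a.customer_id  = b.customer_id
 AND   a.dim_key      <> b.dim_key
 AND   a.eff_start_dt < b.eff_end_dt
 AND   b.eff_start_dt < a.eff_end_dt
"""

conn = pyodbc.connect("DSN=warehouse")
cur = conn.cursor()
for name, sql in [("one current row", ONE_CURRENT_ROW), ("no overlap", NO_OVERLAP)]:
    rows = cur.execute(sql).fetchall()
    status = "OK" if not rows else f"{len(rows)} violations"
    print(f"{name}: {status}")
```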
Environment: Informatica PowerCenter (PowerCenter Designer, Workflow Manager, Workflow Monitor), Mercury TestDirector, QTP, SQL*Loader, UNIX, Oracle 8i, SQL Server, Erwin, Windows, TOAD