Krishna
Data Scientist
+1 (713) 478-5282
INTRODUCTION
7+ years of professional experience in requirement gathering, analysis, development, testing, and implementation across the software development life cycle, utilizing approaches such as Agile, Waterfall, and Test-Driven Development (TDD).
PROFESSIONAL SUMMARY
Profound experience as a Data Scientist, Machine Learning Engineer, Data Engineer, and Data Analyst with excellent statistical analysis, data mining, and machine learning skills.
Worked in the domains of Financial & Insurance Services, Healthcare and Retail.
Expertise in managing the full life cycle of data science projects, transforming business requirements into data collection, data cleaning, data preparation, data validation, data mining, and data visualization from structured and unstructured data sources.
Hands-on experience writing queries in SQL and R to extract, transform, and load (ETL) data from large datasets using data staging.
Hands-on experience with statistical modeling techniques such as linear regression, Lasso regression, logistic regression, ANOVA, clustering analysis, and principal component analysis.
Hands-on experience writing user-defined functions (UDFs) in Scala and Python to extend functionality for data preprocessing.
Professional working experience in Machine Learning algorithms such as LDA, Linear Regression, Logistic
Regression, SVM, Random Forest, Decision Trees, Clustering, Neural Networks and Principal Component
Analysis.
Working knowledge of anomaly detection, recommender systems, and feature creation, with validation using ROC curves and k-fold cross-validation.
Professional working experience of using programming languages and tools such as Python, R, Hive, Spark,
and PL/SQL.
Hands on experience in ELK (Elasticsearch, Logstash, and Kibana) stack and AWS, Azure, and GCP.
Hands-on experience with data science libraries in Python such as Pandas, NumPy, SciPy, scikit-learn, Matplotlib, Seaborn, Beautiful Soup, and NLTK.
Working experience in RDBMS such as SQL Server 2012/2008 and Oracle 11g.
Extensive experience with Hadoop, Hive, Snowflake, and NoSQL databases such as MongoDB, Cassandra, and HBase.
Experience in data visualizations using Python, R, Power BI, and Tableau 9.4/9.2.
Highly experienced in MS SQL Server Business Intelligence, including SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), and SAS.
Familiar with conducting GAP analysis, User Acceptance Testing (UAT), SWOT analysis, cost benefit analysis
and ROI analysis.
Deep understanding of the Software Development Life Cycle (SDLC) as well as Agile/Scrum methodology to accelerate software development iterations.
Experience with version control tools such as Git and SVN.
Extensive experience in handling multiple tasks to meet deadlines and creating deliverables in fast-paced
environments and interacting with business and end users.
Proficient knowledge of statistics, mathematics, machine learning, recommendation algorithms and analytics
with an excellent understanding of business operations and analytics tools for effective analysis of data.
TECHNICAL SKILLS
Programming Languages: Python, Java, R, C, C++, SAS Enterprise Miner, and SQL (Oracle & SQL Server)
ETL Tools: Informatica PowerCenter, Talend Open Studio, and DataStage
Operating Systems: Linux, Unix, and Windows
Methodologies: Agile, Waterfall, OOAD, SCRUM
Version Control: SVN, CVS, Git, and ClearCase
PROFESSIONAL EXPERIENCE
Client: Allied Solutions May 2019-Present
Role: Data Scientist/ Machine Learning Engineer
Responsibilities:
Applied machine learning and statistical modeling techniques to develop and evaluate algorithms that improve performance, quality, data management, and accuracy.
Experienced in implementing LDA and Naive Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, and Principal Component Analysis, with good knowledge of recommender systems.
Gathered, analyzed, documented, and translated application requirements into data models, supported
standardization of documentation and the adoption of standards and practices related to data and applications.
Responsible for identifying the data sets required to come up with predictive models for providing solutions for
both internal and external business problems.
Performed data preparation, including mapping unaligned data from various formats, identifying missing data, finding correlations, scaling features, and removing junk data to further process the data for building predictive models in Apache Spark.
Responsible for supervising data cleansing, validation, data classification, and data modeling activities.
Developed algorithms in Python such as K-Means, Random Forest, linear regression, XGBoost, and SVM as part of data analysis.
Proficient with deep learning frameworks such as TensorFlow and Keras, and libraries such as scikit-learn.
Selected the final model based on overall statistics, model performance, and run time, achieving accuracy, precision, and recall in the range of 75-80% on average for the validated models.
Calculated statistical thresholds for A/B Tests and routinely collected data for multiple tests at a time.
Experienced in testing of ETL and message queue workflows and workloads.
Implementation experience in machine learning and deep learning, including regression, classification, neural networks, object tracking, and Natural Language Processing (NLP), using packages such as TensorFlow, Keras, NLTK, spaCy, and BERT.
Developed web services using REST APIs for sending and receiving data from external interfaces in JSON format.
Understanding of data structures, data modeling and software architecture
Constructed event pipeline around AWS to simulate and display status of trades in real time
Worked on Azure databases; the database server is hosted on Azure and uses Microsoft credentials to log in to the DB rather than the Windows authentication that is typically used.
Built a streaming pipeline with Confluent Kafka on AWS Kubernetes using Python to support CI/CD.
Utilized Spark SQL to extract and process data by parsing with Datasets or RDDs in HiveContext, applying transformations and actions (map, flatMap, filter, reduce, reduceByKey).
Environment: Python, NumPy, Pandas, TensorFlow, Seaborn, SciPy, NLP, Matplotlib, GitHub, Spark, Sqoop, Kafka, Hive, JavaScript, AWS, Azure, S3, Jupyter Notebook, MongoDB, Tableau, and Postman.
Environment: Python (scikit-learn/SciPy/NumPy/Pandas), Linux, Tableau, Hadoop, MapReduce, Hive, Oracle, Windows 10/XP, JIRA.