Data Engineer Rithick Bisher

Rithick Bisher is a Data Engineer with over 9 years of experience in Big Data technologies, proficient in tools such as Hadoop, Spark, and various programming languages including Scala, Java, and Python. He has extensive experience in developing and deploying applications in the Hadoop ecosystem, ETL processes, and machine learning algorithms, with a strong background in data architecture and analytics. His recent roles include providing architectural leadership and developing predictive models at Centene Corporation and Chewy, along with optimizing data processing systems at UBS.

RITHICK BISHER

Email: [email protected] PH: 901-492-1051


Data Engineer
PROFESSIONAL SUMMARY:
 9+ years of IT experience across a variety of industries working on Big Data, using technologies such as
Cloudera and Hortonworks distributions. Hadoop working environment includes Hadoop, Spark, MapReduce,
Kafka, Hive, Ambari, Sqoop, HBase, and Impala.
 Fluent programming experience with Scala, Java, Python, SQL, T-SQL, and R.
 Hands-on experience in developing and deploying enterprise-based applications using major Hadoop ecosystem
components like MapReduce, YARN, Hive, HBase, Flume, Sqoop, Spark MLlib, Spark GraphX, Spark SQL,
Kafka.
 Adept at configuring and installing Hadoop/Spark Ecosystem Components.
 Proficient with Spark Core, Spark SQL, Spark MLlib, Spark GraphX, and Spark Streaming for processing and
transforming complex data using in-memory computing capabilities, written in Scala. Worked with Spark to
improve the efficiency of existing algorithms using Spark Context, Spark SQL, Spark MLlib, DataFrames, pair
RDDs, and Spark on YARN.
 Experience integrating various data sources such as Oracle SE2, SQL Server, flat files, and unstructured files
into a data warehouse.
 Able to use Sqoop to migrate data between RDBMS, NoSQL databases and HDFS.
 Experience in Extraction, Transformation and Loading (ETL) of data from various sources into data warehouses, as
well as data processing such as collecting, aggregating, and moving data from various sources using Apache Flume,
Kafka, Power BI, and Microsoft SSIS.
 Hands-on experience with Hadoop architecture and its components such as the Hadoop Distributed File System (HDFS),
JobTracker, TaskTracker, NameNode, DataNode, and Hadoop MapReduce programming.
 Comprehensive experience in developing simple to complex MapReduce and Streaming jobs using Scala and Java
for data cleansing, filtering, and aggregation, along with detailed knowledge of the MapReduce framework.
 Used IDEs like Eclipse, IntelliJ IDEA, PyCharm, Notepad++, and Visual Studio for development.
 Seasoned practice in machine learning algorithms and predictive modeling such as linear regression, logistic
regression, naive Bayes, decision trees, random forests, KNN, neural networks, and K-means clustering.
 Ample knowledge of data architecture including data ingestion pipeline design, Hadoop/Spark architecture, data
modeling, data mining, machine learning and advanced data processing.
 Experience working with NoSQL databases like Cassandra and HBase and developed real-time read/write access
to very large datasets via HBase.
 Developed Spark applications that handle data from various RDBMS (MySQL, Oracle Database) and
streaming sources; a minimal PySpark sketch of this ingestion pattern appears after this list.
 Proficient SQL experience in querying, data extraction/transformations and developing queries for a wide range
of applications.
 Capable of processing large sets (Gigabytes) of structured, semi-structured or unstructured data.
 Experience in analyzing data using HiveQL, Pig, HBase and custom MapReduce programs in Java 8.
 Experience working with GitHub/Git 2.12 source and version control systems.
 Strong in core Java concepts including Object-Oriented Design (OOD) and Java components such as the Collections
Framework, exception handling, and the I/O system.
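
A minimal PySpark sketch of the RDBMS-plus-streaming ingestion pattern referenced above, assuming the spark-sql-kafka connector is available; the host, database, table, topic, and credential values are illustrative placeholders, not details from any project listed below.

# Minimal PySpark sketch: batch read from an RDBMS over JDBC plus a Kafka streaming read.
# All connection details below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdbms-and-stream-ingest").getOrCreate()

# Batch read from MySQL via JDBC (placeholder host, database, table, and credentials)
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:mysql://db-host:3306/sales")
          .option("dbtable", "orders")
          .option("user", "etl_user")
          .option("password", "secret")
          .load())

# Streaming read from Kafka (placeholder broker and topic; needs the spark-sql-kafka package)
events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load())

orders.printSchema()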

TECHNICAL SKILLS:
Languages: Python 3.7.0+, Java 1.8, Scala 2.11.8+, SQL, T-SQL, R 3.5.0+, C++, C, MATLAB.
Cluster Management & Monitoring: Cloudera Manager 6.0.0+, Hortonworks Ambari 2.6.0+, CloudxLab.
Hadoop Ecosystem: Hadoop 2.8.4+, Spark 2.0.0+, MapReduce, HDFS, Kafka 0.11.0.1+, Hive 2.1.0+, HBase 1.4.4+, Sqoop 1.99.7+, Pig 0.17, Flume 1.6.0+, Keras 2.2.4.
Database: MySQL 5.X, SQL Server, Oracle 11g, HBase 1.2.3+, Cassandra 3.11.
Visualization: PowerBI, Oracle BI, Tableau 10.0+.
Virtualization: VMware Workstation, AWS.
Operating Systems: Linux, Windows, Ubuntu.
Markup Languages: HTML5, CSS3, JavaScript.
Other Tools: Jupyter Notebook, KNIME, MS SSMS, Putty, WinSCP, MS Office 365, Sage Math, SEED Ubuntu, TensorFlow, NumPy.
IDE: Eclipse, GitHub, PyCharm, Maven, IntelliJ, RStudio, Visual Studio.

PROFESSIONAL EXPERIENCE:
Centene Corporation St Louis, Missouri February 2024 to Present
Sr. Data Engineer
Responsibilities:
 Provided the architectural leadership in shaping strategic, business technology projects, with an emphasis on
application architecture.
 Utilized domain knowledge and application portfolio knowledge to play a key role in defining the future state of
large, business technology programs.
 Participated in all phases of data mining, data collection, data cleaning, developing models, validation, and
visualization and performed Gap analysis.
 Developed MapReduce/Spark Python modules for machine learning and predictive analytics in Hadoop on AWS.
Implemented a Python-based distributed random forest via Python streaming.
 Migrated the application onto the AWS Cloud.
 Created ecosystem models (e.g. conceptual, logical, physical, canonical) that are required for supporting services
within the enterprise data architecture (conceptual data model for defining the major subject areas used,
ecosystem logical model for defining standard business meaning for entities and fields, and an ecosystem
canonical model for defining the standard messages and formats to be used in data integration services throughout
the ecosystem).
 Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn, and NLTK in Python to develop machine
learning models, applying algorithms such as linear regression, multivariate regression, naive Bayes,
random forests, K-means, and KNN for data analysis.
 Conducted studies and rapid plotting, using advanced data mining and statistical modeling techniques to build a
solution that optimizes data quality and performance.
 Demonstrated experience in the design and implementation of statistical models, predictive models, enterprise
data models, metadata solutions, and data lifecycle management in both RDBMS and Big Data environments.
 Designed multiple Python packages used within a large ETL process that loads 2TB of data from an
existing Oracle database into a new PostgreSQL cluster.
 Analyzed large data sets, applied machine learning techniques, and developed predictive and statistical models,
enhancing them by leveraging best-in-class modeling techniques.
 Worked on database design, relational integrity constraints, OLAP, OLTP, cubes, and normalization (3NF) and
de-normalization of the database.
 Leveraged ETL methods for ETL solutions and data warehouse tools for reporting and analysis.
 Used CSVExcelStorage to parse files with different delimiters in Pig.
 Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of
Spark with Scala.
 Developed multiple MapReduce jobs in Java to clean datasets.
 Developed code to write canonical model JSON records from numerous input sources to Kafka queues; a minimal Python producer sketch appears after this list.
 Worked on customer segmentation using an unsupervised learning technique - clustering.
 Worked with various Teradata 15 tools and utilities like Teradata Viewpoint, MultiLoad, ARC, Teradata
Administrator, BTEQ, and other Teradata utilities.
 Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and Python, along with a broad variety of
machine learning methods including classification, regression, and dimensionality reduction.
 Developed Linux Shell scripts by using NZSQL/NZLOAD utilities to load data from flat files to the Netezza
database.
 Designed and implemented system architecture for an Amazon EC2-based cloud-hosted solution for the client.
 Tested complex ETL mappings and sessions based on business user requirements and business rules to load data
from source flat files and RDBMS tables to Confidential tables.
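
A hedged sketch of the canonical-model JSON publishing step referenced above, written with the kafka-python client; the broker address, topic name, and record fields are illustrative placeholders, not values from the Centene project.

# Publish canonical-model JSON records to a Kafka topic (kafka-python client).
# Broker, topic, and record fields are placeholders.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["broker:9092"],
    value_serializer=lambda rec: json.dumps(rec).encode("utf-8"),
)

# Illustrative canonical record; real field names would come from the canonical model.
record = {"member_id": "12345", "source_system": "claims", "event_type": "update"}
producer.send("canonical-members", value=record)
producer.flush()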
Environment: Erwin r9.6, Python, SQL, Oracle 12c, Netezza, SQL Server, Informatica, Java, SSRS, PL/SQL, T-SQL,
Tableau, MLlib, regression, cluster analysis, Scala, NLP, Spark, Kafka, MongoDB, logistic regression, Hadoop, Hive,
Teradata, random forest, OLAP, Azure, MariaDB, SAP CRM, HDFS, ODS, NLTK, SVM, JSON, XML,
Cassandra, MapReduce, AWS.

Chewy Dania Beach, FL April 2021 to January 2024


Sr. Data Engineer
Responsibilities:
 Supported MapReduce Programs running on the cluster.
 Evaluated business requirements and prepared detailed specifications that follow project guidelines required to
develop written programs.
 Configured the Hadoop cluster with NameNode and slave nodes and formatted HDFS.
 Used Oozie workflow engine to run multiple Hive and Pig jobs.
 Executed MapReduce programs running on the cluster.
 Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
 Analyzed the partitioned and bucketed data and computed various metrics for reporting.
 Involved in loading data from RDBMS and web logs into HDFS using Sqoop and Flume.
 Worked on loading the data from MySQL to HBase where necessary using Sqoop.
 Developed Hive queries for Analysis across different banners.
 Extracted data from Twitter using Java and the Twitter API. Parsed JSON-formatted Twitter data and uploaded it to a
database.
 Launched Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configured the launched
instances for specific applications.
 Exported the result set from Hive to MySQL using Sqoop after processing the data.
 Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
 Gained hands-on experience working with Sequence files, Avro, and HAR file formats and compression.
 Used Hive to partition and bucket data; a PySpark sketch of the same layout appears after this list.
 Fetched live stream data from DB2 into an HBase table using Spark Streaming and Apache Kafka.
 Implemented Apache Pig scripts to load data from and store data into Hive.
 Experience in writing MapReduce programs with the Java API to cleanse structured and unstructured data.
 Wrote Pig Scripts to perform ETL procedures on the data in HDFS.
 Created HBase tables to store various data formats of data coming from different portfolios.
 Worked on improving performance of existing Pig and Hive Queries.
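
A hedged PySpark sketch of the partitioning and bucketing layout mentioned above, expressed with the DataFrame writer rather than HiveQL; the database, table, and column names are illustrative placeholders.

# Write a Hive table partitioned by date and bucketed on a join key.
# Source/target table names and columns are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("partition-bucket-demo")
         .enableHiveSupport()
         .getOrCreate())

orders = spark.table("staging.orders")  # placeholder source table

(orders.write
 .partitionBy("order_date")             # partition column (illustrative)
 .bucketBy(16, "customer_id")           # 16 buckets on a join key (illustrative)
 .sortBy("customer_id")
 .format("parquet")
 .mode("overwrite")
 .saveAsTable("analytics.orders_bucketed"))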
Environment: SQL Server, Oracle 9i, MS Office, Apache, Teradata, Informatica, ER Studio, XML, Business Objects.

UBS Weehawken, NJ January 2019 to March 2021


Data Engineer
Responsibilities:
 Member of the Business intelligence team, responsible for designing and optimizing systems.
 Optimized the system that processes ~500 GB of logs generated by the Nexmo API platform every day and loads
them into the data warehouse.
 Designed and implemented data loading and aggregation frameworks and jobs able to handle hundreds of GBs of
JSON files using Spark, Airflow, and Snowflake; a minimal Airflow DAG sketch appears after this list.
 Built tools using Tableau to allow internal and external teams to visualize and extract insights from big data
platforms.
 Responsible for expanding and optimizing data and data pipeline architecture, as well as optimizing data flow and
collection for cross functional teams.
 Built best-practice ETLs with Apache Spark to load and transform raw data into easy-to-use dimensional data for
self-service reporting.
 Improved the deployment and testing infrastructure within AWS, using tools like Jenkins, Puppet and Docker.
 Worked closely with the Product, Infrastructure, and Core teams to ensure data needs were considered during
product development and to guide data-related decisions.
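
A hedged sketch of the Spark-to-Snowflake orchestration described above, written as an Airflow 1.10-style DAG; the DAG name, schedule, script paths, and commands are illustrative placeholders rather than the production pipeline.

# Daily DAG: run a Spark aggregation job, then load the results into Snowflake.
# DAG id, schedule, and script paths are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "data-eng",
    "retries": 1,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_log_aggregation",
    default_args=default_args,
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    aggregate_logs = BashOperator(
        task_id="spark_aggregate_logs",
        bash_command="spark-submit /jobs/aggregate_logs.py --date {{ ds }}",
    )

    load_snowflake = BashOperator(
        task_id="load_snowflake",
        bash_command="python /jobs/load_to_snowflake.py --date {{ ds }}",
    )

    aggregate_logs >> load_snowflake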
Environment: Scala 2.13, Spark 2.4, Spark SQL, Kafka 2.3.0, Apache Airflow 1.10.4, Snowflake, AWS (Redshift,
Jenkins, Docker), Tableau 2019.2

Careator Technologies Pvt Ltd Hyderabad, India September 2016 to October 2018
Data Engineer
Responsibilities:
 Extensively involved in installation and configuration of Cloudera Distribution Hadoop platform.
 Extracted, transformed, and loaded (ETL) data from multiple federated data sources (JSON, relational databases, etc.)
with DataFrames in Spark.
 Utilized Spark SQL to extract and process data by parsing with Datasets or RDDs in HiveContext, applying
transformations and actions (map, flatMap, filter, reduce, reduceByKey).
 Extended the capabilities of DataFrames using user-defined functions in Scala.
 Resolved missing fields in Data Frame rows using filtering and imputation.
 Integrated visualizations into a Spark application using Databricks and popular visualization libraries (ggplot,
Matplotlib).
 Trained analytical models with Spark ML estimators including linear regression, decision trees, logistic
regression, and k-means.
 Performed pre-processing on a dataset prior to training, including standardization and normalization.
 Created processing pipelines that chain transformations, estimators, and evaluation of analytical models.
 Evaluated model accuracy by dividing data into training and test datasets and computing metrics using evaluators.
 Tuned training hyper-parameters by integrating cross-validation into pipelines; a minimal PySpark pipeline and cross-validation sketch appears after this list.
 Used Spark MLlib functionality that was not present in Spark ML by converting DataFrames to RDDs and
applying RDD transformations and actions.
 Troubleshot and tuned machine learning algorithms in Spark.
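
A hedged PySpark sketch of the pipeline and cross-validation pattern referenced above; the feature columns, synthetic data, parameter grid, and metric are illustrative placeholders standing in for the project's actual dataset and models.

# Pipeline (assemble -> scale -> logistic regression) tuned with 3-fold cross-validation.
# The random DataFrame below stands in for real training data.
import random

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

spark = SparkSession.builder.appName("pipeline-cv-demo").getOrCreate()

rows = [(random.random(), random.random(), random.random(), float(random.randint(0, 1)))
        for _ in range(200)]
df = spark.createDataFrame(rows, ["f1", "f2", "f3", "label"])  # synthetic placeholder data

assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="raw_features")
scaler = StandardScaler(inputCol="raw_features", outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
pipeline = Pipeline(stages=[assembler, scaler, lr])

grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.01, 0.1])
        .addGrid(lr.maxIter, [50, 100])
        .build())

cv = CrossValidator(estimator=pipeline,
                    estimatorParamMaps=grid,
                    evaluator=BinaryClassificationEvaluator(labelCol="label"),
                    numFolds=3)

train, test = df.randomSplit([0.8, 0.2], seed=42)
model = cv.fit(train)
auc = BinaryClassificationEvaluator(labelCol="label").evaluate(model.transform(test))
print("test AUC:", auc)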
Environment: Spark 2.0.0, Spark MLlib, Spark ML, Hive 2.1.0, Sqoop 1.99.7, Flume 1.6.0, HBase 1.2.3, MySQL 5.1.73,
Scala 2.11.8, Shell Scripting, Tableau 10.0, Agile

Brio Technologies Private Limited Hyderabad, India December 2014 to August 2016
Spark Developer
Responsibilities:
 Imported required modules such as Keras and NumPy into the Spark session and created directories for data and
output.
 Read training and test data into the data directory and into Spark variables for easy access, then trained on the data
based on a sample submission.
 Stored all images as NumPy arrays for easier display and data manipulation.
 Created a validation set using Keras2DML to test whether the trained model was working as intended.
 Defined multiple helper functions used while running the neural network in a session, along with placeholders and
the number of neurons in each layer.
 Created the neural network's computational graph after defining weights and biases; a minimal TensorFlow 1.x sketch appears after this list.
 Created a TensorFlow session to run the neural network and validate the model's accuracy on the
validation set.
 After executing the program and achieving an acceptable validation accuracy, created a submission that is
stored in the submission directory.
 Executed multiple Spark SQL queries after forming the database to gather the specific data corresponding to an
image.
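
A hedged TensorFlow 1.x-style sketch of the placeholder, weights-and-biases, and session pattern described above; the layer sizes and synthetic batch are illustrative placeholders standing in for the image arrays and Keras2DML validation set.

# Single-hidden-layer network built on the TF 1.x graph API (placeholders, variables, session).
# Layer sizes and the synthetic batch are placeholders.
import numpy as np
import tensorflow as tf  # written against the 1.x graph API (e.g. TensorFlow 1.9)

n_inputs, n_hidden, n_classes = 784, 128, 10  # illustrative layer sizes

x = tf.placeholder(tf.float32, [None, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])

# Weights and biases for one hidden layer and the output layer
w1 = tf.Variable(tf.random_normal([n_inputs, n_hidden], stddev=0.05))
b1 = tf.Variable(tf.zeros([n_hidden]))
w2 = tf.Variable(tf.random_normal([n_hidden, n_classes], stddev=0.05))
b2 = tf.Variable(tf.zeros([n_classes]))

hidden = tf.nn.relu(tf.matmul(x, w1) + b1)
logits = tf.matmul(hidden, w2) + b2

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1)), tf.float32))

# Synthetic batch standing in for the image arrays read from the data directory
batch_x = np.random.rand(32, n_inputs).astype(np.float32)
batch_y = np.eye(n_classes)[np.random.randint(0, n_classes, 32)].astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})
    print("validation accuracy:", sess.run(accuracy, feed_dict={x: batch_x, y: batch_y}))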
Environment: Scala 2.12.8, Python 3.7.2, PySpark, Spark 2.4, Spark ML Lib, Spark SQL, TensorFlow 1.9, NumPy
1.15.2, Keras 2.2.4, PowerBI
