Harshil GS
Boston, MA | E-mail | LinkedIn | Portfolio
PROFESSIONAL SUMMARY
Results-driven Software Engineer and Data Engineer with 2+ years of experience in
designing, implementing, and optimizing scalable data solutions. Expertise in distributed
systems, data modeling, cloud computing, and full-stack development. Proficient in
Python, Java, SQL, Spark, AWS, and Kubernetes, with a strong foundation in data
processing, machine learning, and infrastructure automation. Passionate about building
high-performance data platforms to drive business insights and innovation.
TECHNICAL SKILLS
● Programming Languages: Python, Java, C++, SQL
● Data Engineering & Analytics: Apache Spark, Hadoop, Airflow, Pandas, NumPy,
SQL
● Cloud & Infrastructure: AWS (Lambda, S3, EMR, SageMaker), Terraform,
Kubernetes, Docker
● Databases & Storage: PostgreSQL, MongoDB, SQL Server, NoSQL, Columnar
Databases
● Machine Learning & AI: TensorFlow, PyTorch, Scikit-Learn, XGBoost
● Development & DevOps: CI/CD (Jenkins, GitHub Actions), REST API, FastAPI,
Flask
● Data Governance & Security: Data Modeling, Access Control, Traceability
PROFESSIONAL EXPERIENCE
Ed Tech Solutions – Boston, MA
● Designed and optimized scalable data pipelines, reducing processing latency by
40%.
● Developed and deployed high-performance ETL workflows using Apache Spark
and Airflow.
● Implemented data models and schema designs, ensuring compliance with data
governance policies.
● Leveraged AWS (S3, EMR, Lambda) to build distributed data solutions, enhancing
scalability and efficiency.
● Developed and maintained RESTful APIs using FastAPI and Flask for data access
and visualization.
● Integrated Prometheus and Grafana for real-time monitoring of data workflows
and system performance.
● Engineered predictive analytics models for demand forecasting and customer
segmentation, improving forecasting accuracy by 25%.
● Designed and built Spark-based data aggregation pipelines, improving query
performance by 30%.
● Developed and optimized SQL-based data transformations, enhancing data
retrieval speed and accuracy.
● Containerized and deployed machine learning models using Docker and
Kubernetes.
● Collaborated with cross-functional teams to define data strategies and best
practices for cloud-based data architectures.
● Developed data pipelines and implemented data governance strategies to ensure
data integrity and security.
● Applied domain-driven design to create scalable business applications, reducing
latency and increasing throughput.
● Optimized model performance using hyperparameter tuning and implemented
efficient feature engineering techniques.
● Designed and automated CI/CD pipelines for seamless AI model deployment
across cloud environments.
EDUCATION
Boston University, Boston, MA
● Relevant Coursework: Data Engineering, Cloud Computing, Machine Learning,
Distributed Systems
● Bachelor of Information and Communication Technology | July 2018 – July 2022 |
GPA: 3.75