12/1, Second Floor, Raghava Building,
Bashyam Basheer Ahmed Rd, Alwarpet,
Chennai, Tamil Nadu 600018

Site Reliability Engineer


Full-time | Mid Senior to Senior level | Chennai, Tamil Nadu, India | Hybrid Work Culture

About Rheo
Rheo is an intelligent industrial AI platform that utilizes sensors and machine learning to
optimize operational processes.

Rheo fosters the right harmony between people and technology through data-led focus
and transparency, thereby supercharging manufacturing/operations teams into a cohesive unit. At
Rheo, we apply the same principles we advocate to our customers by creating effective lean
solutions.

Job Summary
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with at least
2 years of experience in maintaining AWS Cloud infrastructure and working with Kubernetes. The
primary objective of this role is to ensure near-zero downtime for our services and applications by
proactively identifying and addressing issues. The ideal candidate will have a strong background
in troubleshooting, bug tracking, reporting, and resolution, as well as a willingness to participate
in an on-call schedule.

Key Responsibilities:
1. AWS Cloud Maintenance: Maintain and optimize AWS Cloud infrastructure to ensure
scalability, reliability, and performance. Monitor AWS resources and services to identify
and rectify potential issues before they impact the system.
2. Kubernetes Management: Manage and maintain Kubernetes clusters, ensuring high
availability and performance. Implement best practices for container orchestration and
scaling.
3. Incident Response: Participate in an on-call rotation to provide 24/7 support and respond
to critical incidents promptly. Collaborate with cross-functional teams to troubleshoot and
resolve system issues efficiently.
4. Bug Tracking and Resolution: Identify and document software and infrastructure bugs,
working closely with development teams to prioritize and resolve them. Continuously
improve monitoring and alerting systems to proactively detect issues.
5. Performance Optimization: Analyze system performance and implement optimizations to
enhance reliability and reduce downtime.
6. Automation: Develop and maintain automation scripts and tools for provisioning,
deployment, and monitoring.
7. Documentation: Create and update documentation for systems, processes, and incident
response procedures.
8. Security and Compliance: Ensure security best practices are followed and participate in
security audits and compliance initiatives.

Requirements:
- Bachelor's degree in Computer Science, Information Technology, or a related field (or
equivalent work experience).
- At least 2 years of proven experience as a DevOps Engineer, Site Reliability Engineer, or in
a similar role.
- Strong hands-on experience with infrastructure-as-code tools like Terraform, configuration
management tools like Ansible, and version control systems like Git.
- Proficiency in scripting languages such as Python, Bash, or Ruby for automation tasks.
- In-depth knowledge of CI/CD concepts and experience with CI/CD tools like Jenkins,
GitLab CI/CD, CircleCI, or GitHub Actions.
- Extensive experience working with cloud platforms like AWS, Azure, or GCP.
- Solid understanding of containerization technologies such as Docker and container
orchestration tools like Kubernetes.
- Familiarity with monitoring and logging solutions like Prometheus, Grafana, ELK stack, etc.
- Excellent problem-solving skills and the ability to troubleshoot complex issues across
different technology stacks.
- Strong communication and interpersonal skills to effectively collaborate with
cross-functional teams.

Preferred qualifications:
- Relevant certifications in cloud platforms (AWS Certified DevOps Engineer, Azure DevOps
Engineer, etc.).
- Experience with infrastructure as code (e.g., Terraform, CloudFormation).
- Experience with serverless architectures and services.
- Experience working with Agile and DevOps methodologies.
- Knowledge of compliance frameworks (e.g., GDPR, HIPAA) and security best practices.

Work Environment:
We value a culture of continuous improvement, collaboration, and innovation. Our team is
dedicated to maintaining high availability for our services and ensuring a seamless experience for
our users. As an SRE, you will play a crucial role in achieving these goals and driving the
company's success. If you are a proactive, detail-oriented, and experienced SRE who is
passionate about minimizing downtime and improving system reliability, we encourage you to
apply and join our dynamic team. Together, we will ensure the highest level of service for our
customers.
