Site Reliability Engineer (SRE) - Remote

by DigiSource

Job Description

About the Role
We are looking for a skilled Site Reliability Engineer (SRE) to join our team and play a critical role in ensuring the availability, scalability, and performance of our Fintech and Insurance platforms. As an SRE, you will bridge the gap between development and operations, applying software engineering principles to system administration, monitoring, and incident management.
Key Responsibilities
System Reliability:
  • Design and maintain highly available, fault-tolerant systems to ensure uninterrupted service delivery.
  • Develop and implement monitoring, alerting, and automation tools to proactively identify and address system issues.
Incident Management:
  • Investigate and resolve incidents related to application performance, reliability, and availability.
  • Create post-incident analysis reports and implement changes to prevent future occurrences.
Performance Optimization:
  • Optimize system performance by analyzing and addressing bottlenecks and inefficiencies.
  • Collaborate with developers to ensure code is production-ready and adheres to reliability standards.
Infrastructure as Code (IaC):
  • Build and maintain infrastructure automation using tools like Terraform, Ansible, or similar.
  • Manage containerized environments using Kubernetes.
Collaboration:
  • Work closely with cross-functional teams to define and implement SRE best practices.
  • Advocate for a culture of operational excellence across the organization.

Job Requirements

Experience:
  • 5+ years of experience in site reliability, DevOps, or system administration roles.
  • Proven expertise in managing Kubernetes clusters in production environments.
  • Proficiency with monitoring tools such as Prometheus, Grafana, or similar solutions.
  • Strong scripting skills in languages such as Python, Bash, or Go.
  • Experience with CI/CD pipelines and version control systems like Git.
Soft Skills:
  • Fluent in English, with excellent communication and collaboration skills to work in a global team.
  • Strong problem-solving skills and attention to detail.
  • Ability to work independently and proactively in a fast-paced environment.
Preferred Qualifications: 
  • Experience in Fintech or Insurance domains.
  • Familiarity with cloud platforms like AWS, GCP, or Azure.
  • Knowledge of security best practices for cloud-native applications.
  • Certification in SRE, DevOps, or related fields is a plus.

About the Company & Benefits

What We Offer:
  • Competitive salary: 40,000,000 - 45,000,000 VND net/month (negotiable for exceptional candidates).
  • Flexible working hours, remote work available.
  • Contract duration: 3-6 months with the potential for extension based on performance.
Location and Working Hours:
  • Location: Remote, full-time
  • Working hours: 8 hours/day (8:30 - 17:30), Monday to Friday