- Experience: 5+ years
- Salary: not specified
- Openings: 1
- Posted: 2 weeks ago
- Work mode: In office
- Resume: Required to apply
Work location
Job description
Role overview
This position focuses on keeping production environments highly available, fast, and dependable through cloud operations, automation, monitoring, and disciplined incident handling.
What you'll do
- Design, build, and maintain scalable AWS-based infrastructure using Terraform or CloudFormation.
- Set up and operate observability and monitoring platforms such as Prometheus, Grafana, Splunk, or Datadog.
- Respond to incidents, perform root cause analysis, participate in on-call rotations, and work with SLIs, SLOs, and error budgets.
- Automate recurring operational work to increase reliability, efficiency, and recovery speed.
- Support Kubernetes, Docker, CI/CD pipelines, runbooks, and ITIL-based operational processes.
Skills and experience needed
- Hands-on background in SRE, DevOps, production support, or cloud operations with AWS exposure.
- Working knowledge of Kubernetes, Docker, Linux, and core networking concepts.
- Ability to script in Python, Bash, or Go.
- Experience with monitoring platforms and incident resolution / RCA workflows.
- Familiarity with infrastructure as code, CI/CD tools, and enterprise support systems is preferred.
Experience
A minimum of 5 years of experience in SRE, DevOps, cloud, or production support roles is required.