Junior Site Reliability Engineer

1 day ago · 3 views
General Details
Advertised By: Agency
Company Name: Job Placements
Job Type: Full-Time
Description
About the role
This role contributes to the reliability, scalability, and performance of the company's cloud-native infrastructure and production services.

Responsibilities:
System Monitoring & Observability
  • Configure and maintain monitoring tools (e.g., Prometheus, Datadog) to track key system metrics (latency, traffic, errors, saturation).
  • Create and refine dashboards and alerts to ensure rapid detection of anomalies and potential outages.
  • Assist in the implementation of distributed tracing and structured logging to improve debugging and performance analysis.
Incident Response & Management
  • Participate in a 24/7 on-call rotation as a secondary responder, escalating issues to senior team members as needed.
  • Follow incident response playbooks to diagnose and mitigate production incidents, aiming to restore service within defined SLOs.
  • Contribute to blameless post-incident reviews by documenting timelines, root causes, and action items to prevent recurrence.
Automation & Infrastructure as Code
  • Develop and maintain automation scripts (Python, Go, or Bash) to streamline repetitive operational tasks such as certificate rotation, user access management, and log rotation.
  • Assist in managing cloud infrastructure using IaC tools (Terraform, CloudFormation) to ensure consistent, version-controlled, and repeatable deployments.
  • Support CI/CD pipeline improvements (GitLab CI, GitHub Actions, Jenkins) to enable safe and efficient application deployments.
Capacity Planning & Performance Tuning
  • Collect and analyse resource usage trends (CPU, memory, storage, network) to help forecast capacity needs and recommend scaling actions.
  • Work with development teams to conduct load testing and identify performance bottlenecks.
Collaboration & Knowledge Sharing
  • Partner with software engineers to implement service level indicators (SLIs) and define realistic service level objectives (SLOs).
  • Document system architecture, operational runbooks, and common troubleshooting steps to empower the wider team.
  • Actively participate in team agile ceremonies, providing input on reliability risks for upcoming features.
Beneficial (Desired) Skills:
  • Container Orchestration: Hands-on experience with Kubernetes (cluster administration, Helm charts, pod autoscaling) or Docker Swarm.
  • Programming & Scripting: Proficiency in at least one high-level language (Python, Go) for automation and tooling; comfort with shell scripting.
  • CI/CD Pipelines: Familiarity with building and maintaining deployment pipelines, including canary deployments, feature flags, and rollback strategies.
  • Observability
Ad ID: 1353179938