Site Reliability Engineer

Provide and constantly improve site reliability and incident response

type of job

full-time

updated at

23 months ago

No longer available...

job details

Role Overview

Want to build Web3 with us? The next few years in crypto, NFTs and Web3 belong to builders and believers, not short-term speculators. At Rarible, we believe Web3 will only proliferate when teams build excellent infrastructure, gaps and solutions that serve communities and create a better internet for everyone. If that sounds like music to your ears, we’re looking for you.

We are looking for a Site Reliability Engineer.

You have experience with fast-moving small teams and are culturally aligned with them. You have worked at globally distributed startups. You are self-driven, comfortable wearing many hats, and able to deliver swiftly when needed. You can identify company priorities, own them, and iterate quickly to ship the best solution.

Responsibilities

Work closely with engineering teams to ensure Rarible has well-operated, well-monitored systems that are designed and implemented for failure.
Provide incident response and support for our production systems.
Continuously work with engineering teams to improve MTTR (Mean Time to Recovery).
Automate our operational processes as needed, with accuracy and in compliance with our security requirements.
Improve tools and advocate operational excellence for continuous monitoring, self-healing systems and alert transparency.
Work on the tooling, documentation, playbooks and education needed so that engineering teams can deliver and maintain reliable, observable and scalable systems in a self-managed way.
Make sure that reliability-related metrics are calculated, communicated and continuously improved.

Requirements

You have 5+ years of relevant experience in ensuring reliability and scalability of production systems.
You are proactive and good at communication.
System monitoring and observability are among your main skills, including the use of tracing, RUM (real user monitoring) and advanced alerts.
You are proficient in programming languages such as TypeScript/JavaScript and Java/Kotlin/Scala.
You have worked closely with software engineers on a day-to-day basis to jointly ensure the reliability of production systems and to handle incident response at both the infrastructure and software levels.
You have experience with CI/CD, so you can improve the deployment process and reduce risks.
You deeply understand and have worked with Kubernetes and LXC (Linux Containers).
You have managed MongoDB, PostgreSQL, Elasticsearch, Kafka and the JVM.

Benefits

Working for a rapidly expanding global organization
Mentorship, training and career progression plans with leadership focused on developing the teams
Team that cares about products and working conditions
Flexible hours
Full-time, paid vacations
Remote first with relocation packages available

About us

Rarible is a top multichain, community-centric NFT marketplace. It is underpinned by Rarible Protocol, a community-governed NFT trading API that simplifies building community marketplaces and other ground-breaking NFT projects and integrations.

With over $300 million in trading volume to date, Rarible is one of the leading NFT brands, constantly innovating on decentralized solutions for the web3 space.

We are growing and evolving non-stop, and are looking for a Site Reliability Engineer to join our dynamic and passionate web3 team.
