Transform healthcare access!
Doctolib is looking for a Site Reliability Engineer to take care of our application ecosystem. You will be the doctor of the Doctolib application, providing both curative and preventive care.
Join a team of hard-working entrepreneurs who are passionate about transforming healthcare, and play an active role in shaping a fast-growing company from day one!
As an SRE, you will deal with the Doctolib production platform: one of the biggest Postgres databases in Europe, 12k queries per second, more than 3k cores. These numbers grow by a factor of 1.5 every 6 months.
Your curative role: Fix the application as quickly as possible
- Be part of the on-call rotation to keep the platform reliable and performant
- Take care of our operational production environment (monitoring and alerting)
- Work hand in hand with developers to investigate production issues, especially security and performance ones
- Supervise application rollouts to ensure they are smooth and regular
Your preventive role: Maintain applications by measuring and monitoring availability, latency, and overall system health.
- Deploy and maintain a wide range of monitoring probes: technical probes, business rules shared with the product teams, and SLOs used by top management
- Regularly benchmark the platform to make sure we will be able to handle the traffic 6 months from now
- Proactively monitor the performance of the platform across the full stack (CDN, application, database)
- Regularly add features to the main codebase to improve the scalability and reliability of Doctolib's monitoring
- Maintain and improve all the tooling related to production rollouts: deployment scripts, canary releases, monitoring …
- Work hand in hand with development teams to write scalable and reliable features through workshops and pair programming
You can learn more about our values here and here
We are a team of 5 SREs located in Paris
You could be our next teammate if...
- You have hands-on experience with a production platform, ideally at scale
- You like troubleshooting performance issues in complex environments
- You are always looking for the perfect dashboard to monitor your application
- You want to write code in an application used by millions of users every day
- You are fluent, or want to become fluent, in various tools: Kubernetes, Rails, Terraform, Postgres …
Now, it would be fantastic if you could also...
- Speak English
- Have web development experience in a hyper-growth environment