Senior Site Reliability Engineer

tinybird

Software Engineering
Remote
Posted on Thursday, December 29, 2022

What are we looking for?

We are looking for someone to help us keep our software and infrastructure reliable and elastic as we scale. Someone who knows how to make hardware and software play together.

We run our stack on Linux. We try to keep things simple. Technologies we use:

  • Nginx: SSL termination and load balancing.
  • Varnish: load balancing and, sometimes, caching.
  • Redis: metadata store.
  • Python: most of our backend uses Python except some small bits that rely on C++ for hot paths.
  • ClickHouse: our main data store.
  • ZooKeeper: coordination of ClickHouse replicas.
  • Grafana, Loki and Mimir: monitoring and alerting.
  • Terraform: cloud provisioning (virtual machines, networks, K8s clusters).
  • Ansible: deployments, plus software and configuration provisioning.

The number of machines we run is still manageable, but it keeps growing as we add customers.

This is not about managing infrastructure but about making sure that our software uses the hardware resources wisely and flexibly. This means you will not only have to worry about automating machines, but also about helping the product team design and develop the architecture of the system as a whole. That will require you to work with our backend code and to understand how ClickHouse works.

Some challenges and things we want to improve:

  • Observability: from specific resource usage to a bird's eye view of the whole platform. This requires good knowledge of storage, networking, and computing.
  • High-availability and elasticity: as we keep adding customers, we need to architect our system to be more efficient and flexible.
  • Disaster recovery: improving our tooling to manage and discover problems, but also improving our on-call procedures.

One specific challenge: when our customers grow, we need to upgrade their accounts. Right now we do it manually. Not in the traditional sense of manual, because we have tools that automate much of the process, but we have to handle it one customer at a time: deciding what machines we need to spin up, how much storage to provision, etc. Ideally, our architecture should allow customers to upgrade themselves, with more resources assigned to them dynamically in the safest and most transparent way possible.
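As a rough illustration of the kind of per-customer decision described above (not Tinybird's actual tooling), here is a minimal Python sketch of a sizing rule. Everything in it is an assumption made for the example: the `Plan` type, the `size_cluster` function, the fully replicated layout where every node holds a complete copy of the data, and all default numbers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical output of a sizing decision for one customer."""
    nodes: int
    disk_gb_per_node: int


def size_cluster(data_gb: float, peak_qps: float, *,
                 qps_per_node: float = 50.0,   # assumed capacity of one node
                 headroom: float = 0.3,        # assumed spare disk fraction
                 min_nodes: int = 2) -> Plan:  # assumed floor for availability
    """Pick a node count and per-node disk size for one customer.

    Assumes a replicated, unsharded setup: every node stores all the data,
    so the node count depends only on query load.
    """
    if data_gb < 0 or peak_qps < 0:
        raise ValueError("data_gb and peak_qps must be non-negative")

    # Enough nodes to serve the peak query rate, never below the floor.
    nodes = max(min_nodes, math.ceil(peak_qps / qps_per_node))

    # Disk for the data plus headroom, rounded up to the next 100 GB.
    disk_gb = math.ceil(data_gb * (1 + headroom) / 100) * 100

    return Plan(nodes=nodes, disk_gb_per_node=disk_gb)


if __name__ == "__main__":
    print(size_cluster(data_gb=850, peak_qps=120))
```

A self-service upgrade would need a rule of this kind to run automatically from observed usage rather than being decided by a person for each customer.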

What will we value?

  • Experience designing, building and running distributed cloud architectures and large-scale web-based applications. That is, in so many words, what you will be responsible for at Tinybird.
  • Programming skills and willingness to dive into our codebase, ClickHouse’s or any other to figure out how things work. At Tinybird we work mostly with Python and C++.
  • Accountability and enthusiasm for taking on the responsibility of designing and managing the platform, and an urge to take on things that may be broken. Unafraid to break stuff because you own it and can fix it if need be.
  • Bias for action, iteration and delivery. Conscious that often decisions can be reversed quickly and that speed is of the essence in business and technology.
  • Thinking in terms of systems and being attuned to edge cases, failure modes, behaviors and specific implementations.
  • Comfortable collaborating and communicating asynchronously.
  • Keen documenter of everything you learn and build, to figure out things once and to make it easy for everybody else.
  • Experience with Nginx, Varnish, Redis, Terraform or Ansible would help you get up and running quickly, but we are not bringing you here to tell you what the right technologies are; rather, we expect you to recommend the right one for each challenge.
  • Experience with ClickHouse and/or rolling out database systems at scale would be a huge plus.

Some bits about the way we work

  • We are a fully remote company, and not just because of COVID-19: we have worked like this for many years. All of our previous companies were remote-friendly.
  • We will provide you with up to €2400 to get the right setup at home if you need it.
  • We are just starting up so your work will impact everything we do. We also believe in full transparency and you will always know what is going on.

Here are our company principles.

A bit more about the hiring process

  • Selected candidates will be invited to schedule a screening call with our tech team.
  • Next, you will be invited to schedule a second interview.
  • Following successful interviews, you will be invited to schedule a final meeting with the rest of the founding team.
  • Successful candidates will subsequently be made an offer via phone or video call.

Compensation

  • A competitive package, including Stock Options.
  • Up to 120K depending on experience.
  • 22 days of holiday a year (plus your birthday and public holidays), but who is counting.
  • Freedom to work from wherever suits you best. For this role, we are looking for people based in time zones between UTC-2 and UTC+3.

How to apply

Apply by telling us a bit about yourself, and ask us whatever you need to know about the problem we are trying to solve, the company, your role, etc.

In case you want to know more about us

Tinybird - How We Processed 12 Trillion Rows During Black Friday - Percona Live 2021

Build Fast APIs Faster Over Data at Scale 

Killing the ProcessPoolExecutor