Caching can help reduce database load and improve service performance; however, designing and using a cache optimally is one of the trickiest problems. In Doordash's monolithic world, there were many cache-usage anti-patterns, which made the cache itself a scalability issue. In this chapter, I am going to revisit these bad practices, discuss the right principles, and cover some interesting cache problems.
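Before looking at the anti-patterns, it helps to have a baseline in mind. Below is a minimal sketch of the cache-aside pattern with a TTL; this is illustrative code, not Doordash's actual implementation, and the `db_fetch` callback is a hypothetical stand-in for a real database query.

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: check the cache first, fall back to the
    database on a miss, and store the result with a TTL so stale entries
    eventually expire instead of being served forever."""

    def __init__(self, db_fetch, ttl_seconds=60):
        self.db_fetch = db_fetch          # function: key -> value (hypothetical DB call)
        self.ttl = ttl_seconds
        self._store = {}                  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.time() < expires_at:  # cache hit, still fresh
                return value
        value = self.db_fetch(key)        # cache miss or expired: load from the DB
        self._store[key] = (value, time.time() + self.ttl)
        return value

# Usage: the second get() is served from the cache, not the "database".
calls = []
cache = CacheAside(db_fetch=lambda k: (calls.append(k), f"row-for-{k}")[1],
                   ttl_seconds=30)
cache.get("user:42")   # miss -> one DB call
cache.get("user:42")   # fresh hit -> no additional DB call
print(len(calls))      # -> 1
```

Many of the anti-patterns in this chapter are deviations from this shape: missing TTLs, caching the wrong granularity, or caching results that are cheap to compute anyway.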
The database is oftentimes the bottleneck of service scalability and reliability. Between 2018 and 2020, Doordash experienced multiple site-wide outages during peak traffic hours due to database issues. Before we could break down the monolithic service, we needed to put out the database fires to give the team enough room to focus on service extraction. In this chapter, I am going to introduce how we scaled the SQL databases.
ORM is widely used in Doordash's Django monolithic service. It enabled developers to focus on the business logic. However, it became a major scalability bottleneck as the service grew. In this chapter, I am going to talk about the downsides of ORM, and why we decided to drop ORM in our new microservices and use SQL queries instead.
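One classic example of the kind of ORM cost this chapter discusses is the N+1 query pattern, where lazy loading turns one logical read into many round trips. The sketch below uses an in-memory SQLite database with a hypothetical two-table schema (not Doordash's actual tables) to contrast the N+1 shape with a single explicit JOIN.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 8.5), (12, 2, 14.0);
""")

# N+1 shape: one query for the orders, then one extra query per order
# to resolve its user -- the pattern ORM lazy loading often produces.
orders = conn.execute("SELECT id, user_id, total FROM orders").fetchall()
n_plus_1 = [
    (conn.execute("SELECT name FROM users WHERE id = ?", (uid,)).fetchone()[0], total)
    for _, uid, total in orders
]  # 1 + N round trips to the database

# Hand-written SQL: one explicit JOIN, a single round trip.
joined = conn.execute("""
    SELECT u.name, o.total FROM orders o JOIN users u ON u.id = o.user_id
    ORDER BY o.id
""").fetchall()

print(n_plus_1)
print(joined)
```

Both produce the same rows, but the first issues four queries where one suffices; at production scale the difference compounds into database overload.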
In the monolithic world, the Doordash service had only two layers: a Python Django monolithic service as the backend, and the client layer, including the web and mobile clients. This simple architecture worked well when the business and the team were small, as it enabled the product team to move fast. However, it doesn't scale, as it increases the overhead of maintaining the backend.
The biggest issues Doordash faced in the past few years were reliability and scalability challenges. The two are connected: since we couldn't scale the servers as the business grew, we crashed very often. In the worst outage, Doordash lost millions of dollars, since besides issuing refunds it had to pay merchants for the prepared food and send extra apology credits to customers.
This is the third article discussing the challenges of implementing highly concurrent web applications. In the previous two articles, we mainly focused on the challenge of handling concurrent requests and the two design patterns for handling them. In this article, we are going to discuss the challenges of designing highly concurrent business logic.
A compiler is the translator between human-readable high-level languages and computer-readable low-level languages; it translates a program from a source language into a target language. Why do we need compilers? Because for human beings, programming in a machine language such as assembly is highly inefficient and time-consuming.
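You can watch this translation happen in miniature with CPython, which compiles high-level Python source into lower-level bytecode for its virtual machine. This is just an accessible stand-in for the source-to-target translation described above, not a full native-code compiler.

```python
import dis

# A compiler's job in miniature: CPython translates this high-level
# function into bytecode instructions for its stack-based VM.
def total(price, qty):
    return price * qty

# Disassemble the compiled bytecode; the output lists instructions
# such as LOAD_FAST and a multiply/return sequence.
dis.dis(total)
```

The `LOAD_FAST` and arithmetic opcodes you see are the "target language" here: simple instructions a machine (the Python VM) can execute directly, which is exactly what a compiler produces from source code a human can read.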
As discussed in the previous blog, the bottleneck of the thread-based architecture is that creating a new thread for each event carries a memory footprint, which will eventually exhaust all memory or hit the limit on the number of threads the system can support.
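To make the memory math concrete, here is a back-of-the-envelope sketch. The numbers are assumptions for illustration (thread stack sizes vary by OS and runtime, and stack reservations are virtual memory, so resident usage is lower), but they show why thread-per-event stops scaling.

```python
# Back-of-the-envelope: why thread-per-event exhausts memory.
# Assumed figures for illustration; real defaults vary by OS/runtime.
stack_per_thread_mib = 8          # a common default thread stack size on Linux
ram_mib = 16 * 1024               # a hypothetical 16 GiB server

# Upper bound on threads before stack reservations alone consume all RAM:
max_threads = ram_mib // stack_per_thread_mib
print(max_threads)                # -> 2048

# Serving 10,000 concurrent events with one thread each would reserve:
needed_mib = 10_000 * stack_per_thread_mib
print(needed_mib // 1024)         # -> 78 (GiB of stack reservations)
```

Even ignoring scheduler overhead and OS thread-count limits, the arithmetic alone caps concurrency far below what a busy service needs, which is what motivates the event-driven alternatives discussed in this series.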
A scalable service is one that can maintain a constant response time as the load increases, provided more nodes are added to the cluster and new server instances are started. Why is it so difficult to build a scalable service?
Software rarely crashes for a single, isolated reason. Each of these antipatterns will create, accelerate, or multiply cracks in the system. Learning the antipatterns helps us avoid applying them in our system and software design.