Database schema: Read and write separation to CQRS

Little scum · Posted on 5/4/2020 9:58:50 AM

Read-write separation

When a company's business continues to expand and the number of users increases significantly, the original database is likely to not be able to sustain itself. Then yes

Scale-in, which expands the performance of the hardware, but it is likely that the number of users will continue to grow, and the increased performance will soon be eaten up.
Read-write separation: The database can't hold on, it's nothing more than too much read and write, especially if there are some complex queries such as the most popular products in the last 24 hours. It requires very complex SQL statements, and of course it is slow to run.

However, in order to separate reads and writes, the database needs to be split into master and slave libraries.

The main relational databases on the market support data replication, so you can split a database into two roles: Master and Slave, write operations on the master, and synchronize the master server to other slave servers.

Offline operations such as read operations and data analysis are performed on the Slave server.

We know that many applications on the Internet are read, so that multiple slaves can share the load and ensure the availability and correctness of the data.

However, the corresponding original application code also needs to be modified, and it must be changed to use the master library to write data, and use the slave library when reading data, which is equivalent to rewriting.

Complex queries

However, even after rewriting the code, I found that the performance was still not significantly improved because too many complex queries were used, and we have said in the database component that joins are very performance-intensive.

So can we use a separate table to store the popular products of the past 24 hours, so that we only need to use simple SQL to do it.

In other words, a single set of database tables is inappropriate for different behaviors such as reports, searches, transactions, etc.

The current table is designed to add and modify data, and is not suitable for complex queries.

But we also need to consider how this query base is updated, or whether we can tolerate this delay, whether it may not be updated in real time.

CQRS

Whether the delay can be tolerated needs to be viewed from a business perspective, such as the popular best products in the past 24 hours, a little outdated information does not have much impact, only the final consistency is required.

We can use CQRS (Command Query Responsibility Segregation), that is, the separation of commands for adding or modifying commands from query responsibilities.

In CQRS, the emphasis is on the separation of read (Query) and write (Command), because the data read by users is usually outdated, so why need to read from the database, you can directly establish a read data source. It can be Cache, it can be XML, JSON, etc.

How to solve the problem of how to update mentioned before? You can use Event, that is, an event, for example, when a product is sold, you can publish an event to modify the original Read Model.

In this way, synchronization becomes asynchronous through the event mechanism.

Finally, this method is best used only for complex queries, and the original simple queries are still fetched in the relational database. Why? Because the introduction of a new technology requires a price, such as synchronous mutation steps and event mechanisms, we cannot only see the advantages of new technologies and not the disadvantages.

[Source] Database schema: Read and write separation to CQRS

Related Posts

Sections viewed