Load balancing, session hold, session synchronization

The knife is flying · Posted on 5/14/2015 12:16:44 AM

1. What is load balancing
A new website does not need load balancing: its traffic is small, so the extra machinery is not worth it. As traffic grows, however, a single server is limited by its own hardware and struggles to handle that many visits. At that point there are two options:
1. Upgrade the hardware of the single server: move from dual-core to quad-core, add memory, and so on.
2. Add more servers to share the load, which increases both the available network bandwidth and the total processing capacity.
The first method is vertical scaling, which always hits a limit. The second method is the one that actually solves the problem.
Load balancing can be done in two ways: in software, or in hardware (including products that combine hardware and software).
Software load balancing itself consumes some system resources and adds to response time. Examples are LVS, nginx, haproxy, and apache; software of this kind suits websites whose traffic is not especially large. For a website with very heavy traffic, such as sina or 163, hardware load balancing is the most obvious choice.
There are many load balancing algorithms, based on, for example, the number of requests, the source IP address, or the traffic volume. I often use two of them.
The first is based on the number of requests:
a. Requests are shared evenly among the servers, and if one server goes down the others carry on without serious impact.
b. State such as sessions must be kept in sync between the servers, and some other mechanism is needed to do that.
The second is based on IP:
a. The ip_hash algorithm maps each IP to one server, which solves the session synchronization problem.
b. The drawback of ip_hash is that if one server goes down, the users mapped to that server are affected.
c. ip_hash can easily lead to unbalanced load. Google search keywords are currently filtered in China, so Google often fails to open and then works again a little later. Frustrated Google users therefore go through overseas proxies. When that happens, all users behind the same proxy are assigned to the same server, which unbalances the load and can even bring that server down.
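
The two algorithms above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server names are hypothetical, and the hash used here is SHA-256 for convenience, not the hashing that nginx's ip_hash actually performs.

```python
import hashlib
from itertools import cycle

SERVERS = ["web-a", "web-b", "web-c"]  # hypothetical backend names


def round_robin(servers):
    """Hand out servers in turn, so each gets an equal share of requests."""
    return cycle(servers)


def ip_hash(client_ip, servers):
    """Map a client IP to one server; the same IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


# Request-count based: six requests are spread evenly over three servers.
rr = round_robin(SERVERS)
first_six = [next(rr) for _ in range(6)]

# IP based: 100 users behind one proxy share one IP, so they share one server.
proxy_users = [ip_hash("203.0.113.7", SERVERS) for _ in range(100)]
```

With round robin, consecutive requests from one user reach different servers, which is why session state has to be synchronized. With the IP hash, the proxy case in point c shows up directly: every entry in `proxy_users` is the same server.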

2. What is session hold and what does it do
Session hold (often called session persistence or sticky sessions) is a mechanism on the load balancer that ensures requests from the same user are sent to the same server while load balancing is in effect.
What does session hold do? An example:
Suppose a user's request is assigned to server A and the user logs in there. A short time later the same user sends another request. Without session hold, that request may well be assigned to server B, where the user has not logged in, so the user has to log in again. The user has no idea where each request is sent; from their point of view they were already logged in, so being asked to log in again makes for a very poor experience.
Likewise, buying something on Taobao goes through login => place the order => add an address => pay. This series of steps can be seen as one operation, and the whole of it should be handled by one server rather than being spread across different servers by the load balancer.
Session hold has a time limit (except when a client is mapped to a fixed server, as with ip_hash), and load balancing tools such as LVS and apache provide a setting for it. On the application side, PHP has a related setting, session.gc_maxlifetime, which controls how long session data survives.
The session hold time should be set longer than the session lifetime. This reduces the need to synchronize sessions but does not eliminate it, so session synchronization still has to be done.
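
As a rough illustration of session hold with a time limit, the sketch below keeps a table of which server each client was sent to. The class name `StickyTable`, the client ids, and the server names are all hypothetical; real load balancers implement this internally and configure it differently. The current time is passed in explicitly so the behavior is easy to follow.

```python
import itertools


class StickyTable:
    """Remember which server each client was sent to, for a limited time."""

    def __init__(self, servers, hold_seconds):
        self._next_server = itertools.cycle(servers)
        self._hold_seconds = hold_seconds
        self._entries = {}  # client id -> (server, time of last request)

    def pick(self, client_id, now):
        entry = self._entries.get(client_id)
        if entry is not None and now - entry[1] <= self._hold_seconds:
            server = entry[0]  # still within the hold time: same server
        else:
            server = next(self._next_server)  # new or expired: choose again
        self._entries[client_id] = (server, now)
        return server


table = StickyTable(["web-a", "web-b", "web-c"], hold_seconds=300)
s1 = table.pick("user-1", now=0)       # first request, goes to web-a
s2 = table.pick("user-1", now=100)     # within the hold time, stays on web-a
other = table.pick("user-2", now=100)  # a different user, goes to web-b
s3 = table.pick("user-1", now=1000)    # hold expired, reassigned to web-c
```

The last call shows why session hold alone is not enough: once the hold expires, the user can land on a server that has never seen their session.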

3. Session synchronization
Why sessions need to be synchronized was already covered in the discussion of session hold. For more information, see "Three Methods of Session Synchronization in a Web Cluster".

Three methods of session synchronization in a web cluster

Once you build a web cluster, session synchronization is the first thing to consider, because after load balancing, requests from the same IP for the same page may be assigned to different servers. This article gives three ways to solve the problem:
1. Use a database to synchronize sessions
I have not used this method for multi-server session synchronization myself, but if I had to, I can think of two approaches:
a. Use a low-end machine to host a dedicated database that stores the web servers' sessions, or put this dedicated database on the file server. When a user accesses a web server, the server checks this database for the session, which achieves session synchronization.
b. Put the session table alongside the other database tables. If mysql is also clustered, every mysql node must have this table, and the session table's data must be synchronized in real time.
Explanation: using a database to synchronize sessions adds load to the database, which is already prone to becoming a bottleneck. Of the two approaches, the first is better: it keeps the session table in a separate database and so reduces the load on the main database.
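
A minimal sketch of the dedicated session database idea follows. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the separate database server; the table name, column names, and helper functions are assumptions made for illustration.

```python
import json
import sqlite3


def open_session_db(path=":memory:"):
    """Open the (stand-in) session database and create the table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "id TEXT PRIMARY KEY, data TEXT NOT NULL, expires_at REAL NOT NULL)"
    )
    return db


def save_session(db, session_id, data, expires_at):
    """Store or overwrite one session as JSON text."""
    db.execute(
        "INSERT OR REPLACE INTO sessions (id, data, expires_at) VALUES (?, ?, ?)",
        (session_id, json.dumps(data), expires_at),
    )
    db.commit()


def load_session(db, session_id, now):
    """Return the session data, or None if it is missing or expired."""
    row = db.execute(
        "SELECT data FROM sessions WHERE id = ? AND expires_at > ?",
        (session_id, now),
    ).fetchone()
    return json.loads(row[0]) if row else None


db = open_session_db()
save_session(db, "abc123", {"user": "alice"}, expires_at=1440.0)  # written by server A
seen_by_b = load_session(db, "abc123", now=10.0)                   # read by server B
expired = load_session(db, "abc123", now=2000.0)                   # past the expiry time
```

Every request costs at least one query against this database, which is the extra load the explanation above warns about.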
2. Use cookies to synchronize sessions
A session is stored in a file on the server, and a cookie is stored in a file on the client, so how can they be synchronized? The method is simple: put the session generated by the user's page visit into the cookie, using the cookie as a relay station. You visit web server A, which generates a session and puts it in the cookie. Your next request is assigned to web server B. Server B first checks whether it has this session itself. If not, it looks for the session in the client's cookie. If the cookie does not have it either, the session really does not exist; if the cookie does have it, server B copies the session from the cookie onto itself, and the session is synchronized.
Note: this method is simple and convenient to implement and adds no load to the database. However, if the client disables cookies, the session cannot be synchronized, which costs the website business. Cookies are also not very secure: even when encrypted, they can still be forged.
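
The relay idea can be sketched as below. One addition that is not in the original text: the cookie value is signed with an HMAC so that a server can detect a forged or altered cookie. The secret key, function names, and cookie layout are hypothetical, and signing does not hide the contents from the client.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"shared-by-all-web-servers"  # hypothetical; every server needs the same key


def encode_cookie(session):
    """Serialize the session and append a signature: '<payload>.<signature>'."""
    raw = json.dumps(session, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_cookie(cookie):
    """Return the session if the signature checks out, otherwise None."""
    payload, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_cookie({"user": "alice"})  # set by web server A
restored = decode_cookie(cookie)           # read back by web server B
tampered = decode_cookie("AAAA" + cookie)  # client altered the payload
```

Server B would call `decode_cookie` only after failing to find the session locally, matching the order of checks described above.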

3. Use memcache to synchronize sessions
Memcache can be distributed; without that ability it could not be used for session synchronization. It can combine the memory of the web servers into one "mempool": whichever server generates a session can put it into this pool, and every other server can use it.
Advantages: synchronizing sessions this way adds no load to the database, it is much more secure than using cookies, and reading sessions from memory is much faster than reading them from files.
Disadvantages: memcache divides memory into storage blocks of many fixed sizes. Because of this design, memcache cannot make full use of memory and wastes some of it to fragmentation. If the storage blocks run out, memory overflows; by default memcached then evicts older items, so sessions can be lost.
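
The "mempool" idea can be illustrated without a real memcached installation. In the sketch below, plain dictionaries stand in for memcached processes, and the `SessionPool` class and node names are hypothetical. The point is only that every web server hashes a session id to the same node, so a session written by one server can be read by another.

```python
import hashlib


class SessionPool:
    """Spread session keys over several cache nodes (dicts stand in for memcached)."""

    def __init__(self, nodes):
        self._nodes = nodes          # node name -> storage shared by all web servers
        self._names = sorted(nodes)  # fixed order so every client hashes the same way

    def _node_for(self, session_id):
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        name = self._names[int.from_bytes(digest[:4], "big") % len(self._names)]
        return self._nodes[name]

    def set(self, session_id, data):
        self._node_for(session_id)[session_id] = data

    def get(self, session_id):
        return self._node_for(session_id).get(session_id)


nodes = {"cache-1": {}, "cache-2": {}}  # stand-ins for two memcached processes
client_on_a = SessionPool(nodes)        # used by web server A
client_on_b = SessionPool(nodes)        # used by web server B

client_on_a.set("abc123", {"user": "alice"})
found = client_on_b.get("abc123")
```

Each session lives on exactly one node, so losing that node, or having the item evicted, loses the session. That is the practical side of the disadvantages listed above.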

4. Summary
All three of the above methods are feasible:
The first method slows the system down the most and is not recommended;
The second method works well, but it carries security risks;
The third method is, in my opinion, the best, and I recommend it to everyone.
