Preface
HTTP is one of the most important protocols on the Internet. It looks simple, but it causes problems in practice more often than you would expect, and we have run into them several times, mostly around persistent (long-lived) connections and message parsing. A half-understanding of HTTP is not enough; you need to understand it thoroughly. So I wrote this series to share the problems we hit and what we learned about the HTTP protocol.
Both HTTP request and response messages have a header and a body. The body is the resource you want to get, such as an HTML page or a JPEG image, and the header is used to agree on certain conventions. For example, the client and the server agree on transmission formats: the client reads the header first, learns the format information, and then starts reading the body.
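As a rough illustration of "read the header first, then the body", here is a minimal Python sketch that splits a raw response at the blank line separating the two parts. The function name `split_response` and the sample response are made up for this example, and the parser is deliberately simplified (it ignores repeated headers, for instance).

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # The header section ends at the first blank line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize to lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Hypothetical sample response, used only to exercise the function.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
status_line, headers, body = split_response(raw)
print(status_line, headers, body)
```

Once the headers are in a dictionary, the client can look at entries such as `content-length` to decide how to read the body.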
Client: Accept-Encoding: gzip (compress it for me; I am paying for traffic, so I will download it first and decompress it at my leisure)
Server 1: Content-Encoding: null (no Content-Encoding header: I won't compress it, my CPU is busy, take it or leave it)
Server 2: Content-Encoding: gzip (I'll compress it to save you traffic)
Client: Connection: keep-alive (Big brother, we finally got a TCP connection set up; let's reuse it next time)
Server 1: Connection: keep-alive (it wasn't easy, let's keep using it)
Server 2: Connection: close (Who said I would keep using it with you? Our TCP connection is one-time; reconnect the next time you need me)
This negotiation has no three-way handshake: the client states what it would like when it requests a resource, and the server's answer is final. Some headers involve no negotiation at all; the server simply tells the client what to do. Content-Length is one example: with it, the server tells the client how large the body is. But the server may not be able to tell you the exact body size in advance. It has to write the header before the body, so to put the body size in the header it must know that size beforehand. If the body is generated dynamically, the server would have to finish generating the whole body before it could start writing the header, which adds considerable overhead, so the header may not contain Content-Length at all.
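To make the overhead concrete, here is a minimal Python sketch of a server-side helper that must know the body size before it can emit the header. The name `build_response` and the sample body parts are assumptions made for this illustration, not part of any real server.

```python
def build_response(body_parts):
    """Build a complete HTTP/1.1 response that carries Content-Length."""
    # The header goes on the wire first, so every part of the body has to
    # be generated and buffered before its total size can be written.
    body = b"".join(body_parts)
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Hypothetical dynamically generated body, produced piece by piece.
response = build_response([b"<html>", b"hello", b"</html>"])
print(response)
```

Nothing can be sent until the last part exists, which is exactly the buffering cost that makes a server reluctant to promise a Content-Length for dynamic content.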
So how does the client know the size of the body? The server tells you in one of three ways.
1. The server already knows the resource size and tells you through the Content-Length header.
Content-Length: 1076 (the body is 1076 bytes; read 1076 bytes and you are done)
Transfer-Encoding: null
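For the Content-Length case, the client side could look roughly like the Python sketch below. It assumes the headers were already parsed into a dictionary with lowercase names, and it uses `io.BytesIO` as a stand-in for a socket file object; `read_exact` and `read_body_by_length` are illustrative names.

```python
import io


def read_exact(stream, n):
    """Read exactly n bytes from a file-like stream, or fail on early EOF."""
    buf = bytearray()
    while len(buf) < n:
        # A single read may return fewer bytes than requested.
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError(f"connection closed after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)


def read_body_by_length(stream, headers):
    """Read a body whose size is announced by Content-Length."""
    return read_exact(stream, int(headers["content-length"]))


# Stand-in for a connection: a 5-byte body followed by the next response.
stream = io.BytesIO(b"hello" + b"HTTP/1.1 200 OK\r\n")
body = read_body_by_length(stream, {"content-length": "5"})
rest = stream.read()
print(body, rest)
```

Because the client stops after exactly the announced number of bytes, whatever follows on the same connection is left untouched for the next response, which is what makes connection reuse possible.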
2. The server cannot know the resource size in advance, or is unwilling to spend resources computing it, so it adds the header Transfer-Encoding: chunked to the HTTP response, which means the body is transferred in chunks. Each chunk uses a fixed format: the chunk size (in hexadecimal) comes first, then the data, and the body ends with a final chunk of size 0. When parsing, the client therefore has to strip the chunk-size lines and delimiters, which are not part of the resource itself.
Content-Length: null
Transfer-Encoding: chunked (I will send the body chunk by chunk; each chunk starts with its own size, and when I send a chunk of size 0, that is the end)
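The chunk format described above can be decoded with a short loop. The following Python sketch is a simplified decoder, assuming a file-like stream positioned at the start of the body; `read_chunked` is an illustrative name, trailer fields are skipped rather than parsed, and `io.BytesIO` stands in for a real connection.

```python
import io


def read_chunked(stream):
    """Decode a chunked body from a file-like stream and return the payload."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("connection closed before the final chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("connection closed in the middle of a chunk")
        body += data
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip optional trailer fields up to the terminating blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


# Two chunks ("hello" and " world") followed by the terminating 0 chunk.
wire = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
decoded = read_chunked(io.BytesIO(wire))
print(decoded)
```

Only the chunk data ends up in the result; the size lines and CRLF delimiters are the "useless fields" the client has to remove.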
3. The server does not know the resource size and does not support chunked transfer, so the response has neither a Content-Length header nor a Transfer-Encoding header. In this case the server must return Connection: close.
Content-Length: null
Transfer-Encoding: null
Connection: close (I don't know the size and I can't use chunked, so whenever I close the TCP connection, the transfer is over)
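Putting the three cases together, a client might pick its reading strategy from the headers along the lines of this Python sketch. The names `body_framing` and `read_until_close` are illustrative, the header dictionary is assumed to use lowercase names, and special cases such as responses that never carry a body are left out. Checking Transfer-Encoding before Content-Length follows the precedence HTTP/1.1 gives to chunked encoding when both headers appear.

```python
import io


def body_framing(headers):
    """Return 'chunked', 'length', or 'close' for a parsed header dict."""
    if "chunked" in headers.get("transfer-encoding", "").lower():
        return "chunked"  # case 2: the body arrives in sized chunks
    if "content-length" in headers:
        return "length"   # case 1: the body size is announced up front
    return "close"        # case 3: the body ends when the connection closes


def read_until_close(stream):
    """Read everything until the peer closes the connection (EOF)."""
    return stream.read()


# io.BytesIO stands in for a connection that the server closes after the body.
body = read_until_close(io.BytesIO(b"some body of unknown size"))
print(body_framing({"connection": "close"}), body)
```

In the third case the connection cannot be reused, because closing it is the only end-of-body signal the client gets.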