"What language is C itself written in?"
Put another way: a C program must be compiled before it can run, so where does the C compiler come from? What language is it written in? And if it is written in C itself, which came first, the chicken or the egg?
1
Suppose no compilers exist anywhere in the world, and start from machine language to see how one could be built.
Machine language can be executed directly by the CPU without the need for a compiler.
Next comes assembly language. Although assembly is only a set of mnemonics for machine language, it still has to be translated (assembled) into machine language before it can run, so there is no choice but to write this first assembler in machine language. It can be discarded later.
With assembly language available, we have taken a big step forward. It is now possible to write a C compiler in assembly language, which we can call the ancestor of all C compilers.
With this ancestor, you can compile any C program. So can a compiler be written in C itself? Yes: just compile it with the ancestor.
After all these layers, we finally have a C compiler written in C. It really is a lot of trouble.
At this point, the earlier C compiler written in assembly can be discarded.
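The chain above can be sketched as a small simulation. This is only a toy model, not how any real toolchain works: the `Tool` class, the `bootstrap` function, and the tool names are all made up for illustration. A tool can be built only if something already exists that can run or translate the language it is written in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    written_in: str   # language of the tool's own source code
    translates: str   # language the tool turns into machine code


def bootstrap(tools):
    """Return the names of the tools in the order they can be built from bare hardware."""
    usable = {"machine code"}   # the CPU executes machine code directly
    built = []
    pending = list(tools)
    progress = True
    while pending and progress:
        progress = False
        for tool in list(pending):
            if tool.written_in in usable:
                built.append(tool.name)
                usable.add(tool.translates)   # this language can now be used too
                pending.remove(tool)
                progress = True
    return built


# Hypothetical tools, listed deliberately in the "wrong" order.
CHAIN = [
    Tool("C compiler written in C", "C", "C"),
    Tool("C compiler written in assembly", "assembly", "C"),
    Tool("assembler written in machine code", "machine code", "assembly"),
]

for step in bootstrap(CHAIN):
    print(step)
```

If the list contains only the C compiler written in C, `bootstrap` returns an empty list: that is the chicken-and-egg problem. Add the two earlier tools and the chain resolves.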
Of course, if another high-level language such as Pascal already existed before C, a C compiler could be written in Pascal instead.
The first Pascal compiler is said to have been written in Fortran. As the first high-level language, Fortran presumably had its compiler written in assembly language.
2
Here's an interesting legend about the compiler:
Legend has it that Ken Thompson, one of the inventors of Unix, could walk up to any Unix machine at Bell Labs, enter his own username and password, and log in as root!
Bell Labs was full of talent, and some of the other experts vowed to find the hole. They read through the C source code of Unix and eventually found a backdoor in login. They removed it, recompiled Unix, and ran it, but Thompson was still able to log in.
Some people suspected that the compiler was at fault and had planted the backdoor while compiling Unix, so they wrote a new compiler in C and used it to compile Unix again.
But it still didn't work: Thompson could still log in as root, which was really demoralizing!
Later, Thompson himself revealed the secret: the problem was in the very first C compiler. That compiler naturally planted the backdoor whenever it compiled the Unix source code. But that was not all. Better still, any new compiler written in C has to be compiled into a binary, and the only thing available to compile it with was the first compiler, written by Thompson. So the compiler you wrote was contaminated too, and when your compiler compiled Unix again, it also planted the backdoor :-)
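To see why cleaning the source code does not help, here is a deliberately simplified toy model. It is not Thompson's actual code, and every name in it is invented for illustration. Programs are modelled as sets of behaviours, and the malicious behaviour exists only in the compiler binary, never in any source.

```python
BACKDOOR = "accept Ken's secret password"
INFECT = "infect login and compilers"


def run_compiler(compiler_binary, source):
    """Compile `source` with `compiler_binary`; both are sets of behaviour strings."""
    output = set(source)               # an honest translation keeps the source's behaviour
    if INFECT in compiler_binary:      # hidden behaviour lives only in the binary
        if "check password" in source:     # the input looks like the login program
            output.add(BACKDOOR)
        if "translate C" in source:        # the input looks like a compiler
            output.add(INFECT)
    return frozenset(output)


# Clean, fully audited sources: no trace of the backdoor.
CLEAN_LOGIN_SRC = frozenset({"check password"})
CLEAN_COMPILER_SRC = frozenset({"translate C"})

# The first compiler binary, which carries the hidden behaviour.
first_cc = frozenset({"translate C", INFECT})

# Rebuild the compiler from clean source, then rebuild login with the result.
new_cc = run_compiler(first_cc, CLEAN_COMPILER_SRC)
login = run_compiler(new_cc, CLEAN_LOGIN_SRC)

print(INFECT in new_cc)    # True: the freshly built compiler is contaminated
print(BACKDOOR in login)   # True: the backdoor is back in login
```

Only a compiler binary that never passed through the contaminated one breaks the cycle: in this model, `run_compiler(frozenset({"translate C"}), CLEAN_LOGIN_SRC)` yields a login without the backdoor.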
Speaking of which, I am reminded of the XcodeGhost incident a few years ago. Put simply, a Trojan horse was planted in copies of Xcode downloaded from unofficial channels, so the iOS apps compiled with that Xcode were contaminated, and hackers could use those apps to do illegal things.
Although XcodeGhost is far less sophisticated than Thompson's trick, it is a reminder that software should be downloaded through official channels: get it from the official website, check that the site uses HTTPS, and ideally verify the checksum.
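Verifying a checksum takes only a few lines. The sketch below assumes the vendor publishes a SHA-256 hex digest for the download; the helper names `sha256_of_file` and `matches_published_checksum` are my own, and only the standard `hashlib` and `hmac` modules are used.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the file's digest with the hex digest published by the vendor."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Call `matches_published_checksum` with the path of the downloaded installer and the digest copied from the official site, and refuse to install if it returns `False`. The check is only as trustworthy as the place the published digest came from.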
3
Some people may ask: I struggle to write even a Hello World in assembly, yet someone can use it to write a complex compiler? Is that really possible?
Of course. When the first generation of Unix was developed, there was no C language, and Ken Thompson and Dennis Ritchie wrote Unix in assembly, line by line. The first version of WPS was written in assembly by Qiu Bojun, and the Turbo Pascal compiler was written in assembly by Anders Hejlsberg. What these masters can do is beyond what ordinary people can imagine.
Compilers can also be developed in a "snowball" fashion:
Still taking C as an example, the first version can target a subset of the language, for example supporting only basic data types, control-flow statements, function calls, and so on. We call this subset C0.
Then write a compiler in assembly language that handles only this subset, C0, which makes it much easier to write.
Once the C0 language works, extend the subset by adding structs, pointers, and so on, and call the new language C1.
What is the C1 compiler written in? Naturally, C0.
Once C1 works, extend the language features again, write that compiler in C1, and get C2.
Then come C3, C4, and so on, until finally you get the full C language.
This process is called bootstrapping (自举 in Chinese).
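The key constraint in the snowball approach is that the compiler for each new stage may use only features that the previous stage already supports. Here is a minimal sketch of that rule. The `snowball` function is illustrative, and the C2 features and the features each compiler's source is said to use are hypothetical, since the text above names only the features of C0 and the additions in C1.

```python
def snowball(stages):
    """Check a staged bootstrap plan and return the build steps.

    Each stage is (name, features_added, features_used_by_its_compiler_source).
    """
    available = set()    # features of the newest language that already has a compiler
    host = "assembly"    # language the next compiler has to be written in
    steps = []
    for name, added, used_by_compiler in stages:
        missing = set(used_by_compiler) - available
        if missing:
            raise ValueError(
                f"{name} compiler uses {sorted(missing)}, which {host} lacks"
            )
        steps.append(f"{name} compiler written in {host}")
        available |= set(added)
        host = name
    return steps


PLAN = [
    # The C0 compiler is written in assembly, so it uses no C features at all.
    ("C0", {"basic types", "control flow", "function calls"}, set()),
    ("C1", {"structs", "pointers"}, {"basic types", "control flow", "function calls"}),
    # Hypothetical third stage, only to show the pattern continuing.
    ("C2", {"unions"}, {"structs", "pointers", "function calls"}),
]

for step in snowball(PLAN):
    print(step)
```

If the C1 compiler's source tried to use pointers, `snowball` would raise a `ValueError`, because C0 cannot compile them yet.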