
Some thoughts about XcodeGhost, RoTT, and Quine

Story background

Recently, a trojan called XcodeGhost infected many popular apps in China. Its method of spreading is unusual: it travels through a tampered copy of Xcode.

The whole story: because connections from China to Apple's servers are slow, people often host their own mirrors of Xcode on Chinese servers to make the download easier. This time, however, the attacker distributed a copy of Xcode with a trojan built into its iOS build toolchain, so that every app built with that copy contained a backdoor. It turns out that this backdoor collected information from millions of users in only half a month.

This reminds me of Ken Thompson's famous paper "Reflections on Trusting Trust", which I'll call RoTT in the rest of this article. The paper points out that it is possible to implant a piece of code in a compiler so that the programs it compiles behave the way the implanted code dictates. The logic is shown below:

The logic of the backdoor implant:

Basically, for a given piece of source code, we want the compiler to add some backdoor code to it, as shown in the pseudocode below.

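    function compile(source){
        // Patch only the code the attacker cares about, e.g. a login routine.
        if(match(source, "place where you want to implant backdoor")){
            append_the_code_with_backdoor(source);
        }
        // Everything else is translated as an honest compiler would.
    }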
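To make the pseudocode concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: `toy_compile`, the `check_password` target, and the `letmein` password are hypothetical names, and Python's built-in `compile()` stands in for a real compiler back end. It only shows the idea of patching matched source before compiling it, not how XcodeGhost or Thompson's compiler actually worked.

```python
# Hypothetical target and payload, invented for this sketch.
TARGET = "def check_password(user, password):\n"
BACKDOOR = (
    '    if password == "letmein":  # line injected by the toy compiler\n'
    "        return True\n"
)

# The honest source the developer wrote. It never mentions "letmein".
CLEAN_LOGIN = (
    "def check_password(user, password):\n"
    "    return password == 'correct horse'\n"
)


def toy_compile(source):
    """Stand-in for compile(): patch the source if it matches, then compile it."""
    if TARGET in source:
        # Insert the payload directly after the matched function header.
        source = source.replace(TARGET, TARGET + BACKDOOR, 1)
    return compile(source, "<toy>", "exec")


namespace = {}
exec(toy_compile(CLEAN_LOGIN), namespace)
check_password = namespace["check_password"]

print(check_password("root", "correct horse"))  # the honest path still works
print(check_password("root", "letmein"))        # the injected path works too
print("letmein" in CLEAN_LOGIN)                 # yet the source never mentions it
```

The point of the sketch is that reviewing `CLEAN_LOGIN` reveals nothing, because the extra branch exists only in what the compiler produces.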

Reproduction of a compiler containing a backdoor

When GCC is built, it bootstraps: the existing system compiler builds the GCC source into a first binary (stage 1), stage 1 builds the same source into stage 2, and stage 2 builds it once more into stage 3. The stage 2 and stage 3 results are then compared and must be identical. Therefore, to survive this process, a backdoored compiler has to reproduce its backdoor whenever it compiles the compiler's own source.

The logic is shown in this pseudocode.

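    function compile(source){
        if(match(source, "place where you want to implant backdoor")){
            // Target program: add the backdoor to it.
            append_the_code_with_backdoor(source);
        }
        else if(is_compiler(source)){
            // The compiler itself: re-insert this whole check into it.
            append_backdoor_to_compiler_source_code(source);
        }
    }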

However, this is harder than it looks. The function compile lives inside the compiler's own code, so the code it inserts must contain an exact copy of itself, including the part that does the inserting. This is the famous problem of the Quine: a program that prints its own source, or, put another way, a fixed point of the compiler. I'll try to explain that in my article on "Quine".
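As a rough illustration of how the self-reference can be resolved, here is a toy sketch in Python. It rests on several assumptions that are not from RoTT or any real compiler: a compiler "binary" is modelled as Python source text defining a hypothetical `compile_source` function, "compiling" just returns text, the bootstrap is shortened to two rounds, and `build`, `check_password`, and `letmein` are invented names. The self-copy uses the usual quine trick of a template with a `%r` hole that is filled with the template itself.

```python
# Clean sources. Neither of them mentions a backdoor.
CLEAN_COMPILER = "def compile_source(source):\n    return source\n"
CLEAN_LOGIN = "def check_password(password):\n    return password == 'secret'\n"

# Text of the trojaned compiler, with %r as a hole for the text itself.
# Filling the hole with the template is the quine trick.
TROJAN = '''def compile_source(source):
    trojan = %r
    if "def check_password(" in source:
        return source.replace("return ", "return password == 'letmein' or ", 1)
    if "def compile_source(" in source:
        return trojan %% trojan
    return source
'''


def build(source, compiler_binary):
    """Load a compiler 'binary' (Python text here) and compile source with it."""
    namespace = {}
    exec(compiler_binary, namespace)
    return namespace["compile_source"](source)


evil_binary = TROJAN % TROJAN

# Bootstrap from the CLEAN compiler source, starting with the evil binary.
stage1 = build(CLEAN_COMPILER, evil_binary)
stage2 = build(CLEAN_COMPILER, stage1)
print(stage1 == stage2 == evil_binary)  # does the trojan reproduce itself exactly?

# The rebuilt compiler still backdoors the clean login source.
namespace = {}
exec(build(CLEAN_LOGIN, stage2), namespace)
print(namespace["check_password"]("letmein"))
```

In this toy, rebuilding the compiler from clean source does not remove the trojan, and the two rounds produce identical output, so a comparison of the stages finds nothing to complain about.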

Conclusion

The reasoning above shows that a compiler can always carry an arbitrary piece of implanted code, so the logic of any program it compiles can be changed as the compiler's author wishes. This leads to a frustrating result: you can never be sure a piece of code is free of malice unless you wrote all of it yourself, and even if your own logic is flawless, the binary you build can still contain a backdoor unless you also wrote your compiler from scratch, which is almost impossible nowadays.

Another interesting related idea is the "Reasonable Person Principle" of Carnegie Mellon's School of Computer Science, the first two points of which are:

  • Everyone will be reasonable.
  • Everyone expects everyone else to be reasonable.

In cyberspace, the only thing we can do is to be reasonable and expect others to be reasonable. Or, following RoTT, we should at least try to trust trust itself, although sadly there is no way for us to verify that our trust in others is deserved.
