in Blog, Computer Security, TECH

Reflections on Trusting Trust by Ken Thompson

To what extent should one trust a statement that a program is free of Trojan horses [2]? Perhaps it is more important to trust the people who wrote the software -Ken Thompson[1]

In this lecture [3], Ken Thompson describes the essence of a computer program and to what extent it can be trusted. He argues that a code which misuses its very fundamental compilation deliberately is like someone breaking into a neighbor’s house, and it doesn’t matter whether it’s locked or unlocked. He further maintains that there are various tricks by means of which one can exploit unavoidable loopholes of compilations in almost any programming languages. Those intentional codes can dodge compilation process to break into something undesirable. He explains this process with the help of a simple self-reproducing program in C [4].

Self-reproducing programs are such program which basically doesn’t have any input but produces an output exactly same as a program itself.  He asserts this as a deep concept of learning. It can be written once and can be used (self-referencing definition) many times. To understand it fundamentally with a more sophisticated example, we can imagine compilers [5]. Compilers are programs which are used to create programs.

Self-reproducing programs can be written in any languages. Following are some of the examples of a self-reproducing program written in python [6] and in C.

Python:
s = 's = %r\nprint(s%%s)'
print(s%s)
C:
main() {
   char * a = "main()
   {char*a=c%s%c%;printf(a+42,34,a,34);};
   ) 43, a, 43, 24 + a(ftnirp; % c % s % c = a * rahc {)(niam ";printf(a+42,34,a,34);}

Author: smr

Notes: This program won the “Worst Abuse of the Rules” award in the 1994 Obfuscated C Code Contest. It claims to be the world’s shortest self-producing C program. Hey, it’s also a palindrome! Some compilers give fatal errors on this one.

You can find lots of self-reproducing programs here [7] written in C. In computer science, it’s called Quine [9] – named to honor Willard Van Orman Quine [8].

Thompson takes an example of C compiler, which was written in C itself. He calls this a “chicken and egg” situation. Taking an example of escape character “\n”, he explains that, such character tells a compiler to perform some particular operation. The “\n” represents a new line character in C.

“Hello world\n”

An amazing piece of code, he says – Compiler knows what specific character to compile for a new line in any character set. Having said that, he reveals a completely astonishing behavior of the compiler. What if somehow you alter a compiler to behave as per your wish for some other piece of a character or code? Say you teach compiler to compile “\v” for a vertical tab. Obviously, at the very beginning, it won’t recognize what “\v” is. But once you know the ASCII for a vertical tab, you can write a new version of a compiler with a code included for a vertical character. And compiling this compiler source you can create a new version of a portable compiler for C. Thompson calls these purposeful pieces of code not a bug but a “Trojan Horse”.

c = next();
if (c != '\\') /* not a backslash... */ return (c);
c = next();
if (c == '\\') /* literal backslash...*/ return('\\');
if (c == 'n') /* newline...*/ return ('\n');
if (c == 'v') /* vertical tab...*/ return ('\v');

Install this compiler as a new version, and there it is. Now your compiler will know what “\v” means.

Knowing that a compiler can be tricked. Thompson introduces “the cutest program he ever wrote”. He says, he can plant a bug in a UNIX login command. The replacement code he designs will miscompile the login command in such a way that it will accept any other encryption or known password/s. Thus, if the binaries are used to compile the login command, he can install binary codes to perform this activity. He reveals a powerful obscure functionality here. i.e. there would be no clue of bug even by inspecting the source code of login command itself. However, a simple examination of the compiler can show the fault.

Now, combining the above two ideas of a self-reproducing program and Trojan horse – injected in a compiler can create a never ending bugged program. How can this be achieved? Imagine you created a new compiler with an injected self-reproducing Trojan horse. Then we installed this new buggy compiler(binary) as the official C. Somehow you found the bug in the source of the compiler and deleted it. But then again, the new binary will reinsert the bugs whenever it is compiled again. Obviously, the login window will remain bugged forever, and it will be even harder to find the trace.

Finally, Thompson says that the moral is all clear. You can’t trust code that you did not create yourself. These so-called heroes(hackers) should be demotivated with some strict rules as it’s similar to a criminal act to get inside someone else’s machine despite one’s consent. He further says that “As the level of program gets lower, these bugs will be harder and harder to detect. A well-installed microcode bug will be almost impossible to detect.

Extra dose: What! does he mean that we should not trust Mr. Thompson? Does it infer that there were back doors on the original UNIX operation system, he and his team coded, you didn’t.

……………………..

This article – a summary I would love to call is based on Turing Award Lecture “Reflections on Trusting Trust” by Ken Thompson. You can read the article here. I have been tricky here not putting the link at the very beginning, coz it’s so wonderful that you will blindly ignore mine.

Disclaimer: I beg your clemency as I have adopted some of the words and even lines directly from the original paper, coz I couldn’t find a better way to explain.

Acknowledge to Mr. Kul Prasad Subedi, A Ph.D. Student at The University of Memphis.

References:
[1] Ken Thompson:
      https://en.wikipedia.org/wiki/Ken_Thompson
[2] Trojan Horses(Computing)
      http://www.cert.org/historical/advisories/CA-1999-02.cfm
[3] Ken Thompson, Reflection on Trusting Trust, Turing Award Lecture, August 1984
      https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf
[4] C: Dennis Ritchie, Brian Kernighan, The C Programming Language, 1978
      https://archive.org/details/TheCProgrammingLanguageFirstEdition
[5] Compilers
https://en.wikipedia.org/wiki/Compiler
[6] Python: Guido van Rossum, CWI Netherlands, 1989
      https://en.wikipedia.org/wiki/History_of_Python
[7] List of self-generation programs
      https://www.nyx.net/~gthompso/self_c.txt
[8] Willard Van Orman Quine
      https://en.m.wikipedia.org/wiki/Willard_Van_Orman_Quine
[9] Quine
http://www.madore.org/~david/computers/quine.html

Saurab Dulal
Follow me
Saurab Dulal
Follow me

Latest posts by Saurab Dulal (see all)

Write a Comment

Comment