A brief history of C

Here is a chapter from Paul Bilokon’s forthcoming book on the C programming language.

The history of the programming language C is inextricably linked to that of UNIX, an operating system whose development began in 1969 at the Bell Labs research centre, led by Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, Joe Ossanna, and their colleagues. Initially there was no intention to use UNIX outside the Bell System. However, AT&T began licensing UNIX to outside parties in the late 1970s, leading to a variety of academic and commercial Unix variants from vendors including the University of California, Berkeley (BSD), Microsoft (Xenix), Sun Microsystems (SunOS/Solaris), HP/HPE (HP-UX), and IBM (AIX).

Ken Thompson and Dennis Ritchie in 1973.

Since the 1990s, UNIX systems have appeared on home computers: BSD/OS was the first to be commercialized for i386 computers, and since then free UNIX-like clones of existing systems have been developed, such as FreeBSD and the combination of Linux and GNU.

The UNIX operating system was about making computers more productive. In the documentary The UNIX System: Making Computers More Productive, one of the two that Bell Labs made in 1982 about UNIX’s significance, impact and usability, Victor A. Vyssotsky, Executive Director, Research Communications Principles, explains:

The usual way to get a large computer application developed involves a big team of people working in close coordination. Most of the time this works surprisingly well, but it does have its problems. And large problems tend to get done poorly. They take a long time, they consume an astonishing amount of money, and in many cases the individual team members are dissatisfied.

This resonates with Fred Brooks’s (1931–2022) seminal book, The Mythical Man-Month (1975). Brooks’s observations are based on his experiences at IBM while managing the development of OS/360:

Large-system programming has over the past decade been such a tar pit, and many great and powerful beasts have thrashed violently in it. Most have emerged with running systems – few have met goals, schedules, and budgets. Large and small, massive or wiry, team after team has become entangled in the tar. No one thing seems to cause the difficulty – any particular paw can be pulled away. But the accumulation of simultaneous and interacting factors brings slower and slower motion. Everyone seems to have been surprised by the stickiness of the problem, and it is hard to discern the nature of it. But we must try to understand it if we are to solve it.

Vyssotsky continues:

So everybody in the computing business is constantly searching for ways to do a better job of developing computer applications. […] There are some things that can be done. A good programming environment helps a lot.

And the UNIX operating system was supposed to be just such a good programming environment, for many purposes. In the same documentary, Bell Labs’ John R. Mashey explained the difference between hardware and software:

There is a crying need for useful software to do effective jobs; we just do not have enough people to write all of that software. Keeping large amounts of software working in the face of changes is a big job; it takes a lot of skilled people to do this. Now, software is different from hardware. When you build hardware and send it out, you may have to fix it because it breaks, but you don’t demand, for example, that your radio suddenly turn into a television. And you don’t demand that a piece of hardware suddenly do a completely different function. People do this with software all of the time. There is a continual demand for changes, enhancements, new features that people find necessary once they get used to a system.

Since these changes mean that it is impossible to gather all requirements upfront, software must be built to be change-tolerant.

At the time, structured programming was new: a paradigm focused on writing clear, maintainable code using logical control structures (sequence, selection, iteration) and modular design, and on eliminating the confusing goto statements that produce so-called spaghetti code. Edsger Dijkstra had published his famous letter Go To Statement Considered Harmful in 1968.

The dominant programming languages of the 1960s were Fortran and COBOL. Fortran was originally designed by John Backus and developed at IBM, with a reference manual released in 1956, though compilers only began to produce accurate code two years later; its compiler was the first optimizing compiler. COBOL’s design was started in 1959 by CODASYL and was partly based on FLOW-MATIC, a programming language designed by Grace Hopper. It was created as part of a U.S. Department of Defense effort to produce a portable programming language for data processing.

Fortran and COBOL were not used to build the UNIX operating system primarily because of design philosophies and technical limitations that made them poorly suited for low-level system programming, particularly compared to C. The C language, developed at Bell Labs alongside UNIX, was specifically designed to address these needs.

Ken Thompson (b. 1943) was hired by Bell Labs in 1966. Dennis Ritchie (1941–2011) joined the Bell Labs Computing Science Research Center a year later, in 1967. In the 1960s at Bell Labs, Thompson and Ritchie worked on the Multics operating system. While writing Multics, Thompson created the Bon programming language. He also created a video game called Space Travel. Later, Bell Labs withdrew from the Multics project. In order to go on playing the game, Thompson found an old PDP-7 machine and rewrote Space Travel for it. Eventually, the tools developed by Thompson became the Unix operating system: working on the PDP-7, a team of Bell Labs researchers led by Thompson and Ritchie, and including Rudd Canaday, developed a hierarchical file system, the concepts of computer processes and device files, a command-line interpreter, pipes for easy inter-process communication, and some small utility programs. In 1970, Brian Kernighan suggested the name “Unix”, a pun on “Multics”. After the initial work on Unix, Thompson decided that it needed a system programming language and created B.

Though it is often said that B derived from BCPL, Thompson has recounted that he created B from a stripped-down Fortran source. As for the name, Thompson began with the letter Z and decremented it, one letter toward A, with each revision; when he finished, he had landed on B. After Ritchie ported B to another machine, he called the result “new B” (NB), a tongue-in-cheek nod to “newbie”. The port let Ritchie add types and structs, which in turn prompted another bump of the name, to C. Later, when writing about C, Ritchie stated that one could think of it as a descendant of BCPL, and from this stems much of the confusion about its origins.

B was designed for recursive, non-numeric, machine-independent applications, such as system and language software. It was a typeless language with the only data type being the underlying machine’s natural memory word format, whatever that might be. Depending on the context, the word was treated either as an integer or a memory address.

As machines with ASCII processing became common, notably the DEC PDP-11 that arrived at Bell Labs, support for character data stuffed in memory words became important. The typeless nature of the language was seen as a disadvantage, which led Thompson and Ritchie to develop an expanded version of the language supporting new internal and user-defined types, which became the ubiquitous C programming language.

Ritchie developed C between 1972 and 1973 to construct utilities running on Unix, and it was then applied to re-implement the kernel of the UNIX operating system. During the 1980s, C gradually gained popularity, and it has become one of the most widely used programming languages, with C compilers available for practically all modern computer architectures and operating systems. The book The C Programming Language, co-authored by the original language designer, served for many years as the de facto standard for the language. C has been standardized since 1989 by the American National Standards Institute (ANSI) and subsequently, jointly, by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC).

C is an imperative procedural language, supporting structured programming, lexical variable scope, and recursion, with a static type system. It was designed to be compiled, to provide low-level access to memory and language constructs that map efficiently to machine instructions, all with minimal runtime support. Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant C program written with portability in mind can be compiled for a wide variety of computer platforms and operating systems with few changes to its source code.

Since 2000, C has typically ranked as the most or second-most popular language in the TIOBE index.

The Unix philosophy

UNIX and C weren’t just technologies. They gave rise to the Unix philosophy, which originated with Ken Thompson’s early meditations on how to design a small but capable operating system with a clean service interface. It grew as the Unix culture learned things about how to get maximum leverage out of Thompson’s design, absorbing lessons from many sources along the way.

Doug McIlroy, the inventor of Unix pipes and one of the founders of the Unix tradition, had this to say at the time:

  • Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.
  • Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.
  • Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them.
  • Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.

He later summarized it this way (quoted in A Quarter Century of Unix):

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Eric S. Raymond abstracted the following rules from the Unix philosophy in his book The Art of Unix Programming:

  • Rule of Modularity: Write simple parts connected by clean interfaces.
  • Rule of Clarity: Clarity is better than cleverness.
  • Rule of Composition: Design programs to be connected to other programs.
  • Rule of Separation: Separate policy from mechanism; separate interfaces from engines.
  • Rule of Simplicity: Design for simplicity; add complexity only where you must.
  • Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.
  • Rule of Transparency: Design for visibility to make inspection and debugging easier.
  • Rule of Robustness: Robustness is the child of transparency and simplicity.
  • Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.
  • Rule of Least Surprise: In interface design, always do the least surprising thing.
  • Rule of Silence: When a program has nothing surprising to say, it should say nothing.
  • Rule of Repair: When you must fail, fail noisily and as soon as possible.
  • Rule of Economy: Programmer time is expensive: conserve it in preference to machine time.
  • Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.
  • Rule of Optimization: Prototype before polishing. Get it working before you optimize it.
  • Rule of Diversity: Distrust all claims for “one true way”.
  • Rule of Extensibility: Design for the future, because it will be here sooner than you think.

The celebrity Goldman Sachs quant Emanuel Derman, who also worked at Bell Labs, recalls in his book My Life as a Quant:

Until I arrived at the Labs in 1980, I had never realized how elegant and challenging programming could be. I had never used a computer terminal. During my student and postdoc years, all my programs had aimed merely to obtain the numerical values of complicated mathematical formulae, over and over again. I thought of the computer as a glorified adding machine. My only exception had been at the University of Cape Town, in 1965, when I used punched cards to enter a vocabulary into the machine and then created randomly generated short poems. I had always thought of that effort as a childish lark.

But at AT&T in 1980, the whole firm was embracing C, the simultaneously graceful and yet practical language invented by Dennis Ritchie about ten years earlier at Murray Hill. He had devised C to be a high-level tool with which to write portable versions of UNIX, the operating system also invented there by Ken Thompson and Ritchie. Now, everything from telephone switching systems to word-processing software was being written in C, on UNIX, all with amazing style. Eventually, even physicists, who are generally interested only in the number of digits after a decimal point, began to forsake ugly utilitarian FORTRAN for poetically stylish C. Programming was in the late stages of a revolution about which I was just beginning to learn.

Derman further reflects on the nature of programming:

What are you doing when you program? You are trying to use a language to specify an imagined world and its details as accurately as possible. You are trying to create this world on a machine that can understand and execute only simple commands. You do this solely by writing precise instructions, often many hundreds of thousands of lines long. Your sequence of instructions must be executed without ambiguity by an uncomprehending automaton, the computer, and yet, in parallel, must be read, comprehended, remembered and modified by you and other programmers. Just as poetry strives to resolve the tension between form and meaning, so programming must resolve the tension between intelligibility and concision. In this endeavour, the language you employ is critically important.

People rarely talk about UNIX nowadays, but they do talk a lot about Linux and GNU. Linux is a family of open source UNIX-like operating systems based on the Linux kernel, an operating system kernel first released on 17 September, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution (distro), which includes the kernel and supporting operating system software and libraries – most of which are provided by third parties – to create a complete operating system, designed as a clone of UNIX and released under the copyleft GPL license.

Thousands of Linux distributions exist, many based directly or indirectly on other distributions; popular Linux distributions include Debian, Fedora Linux, Linux Mint, Arch Linux, and Ubuntu, while commercial distributions include Red Hat Enterprise Linux, SUSE Linux Enterprise, and ChromeOS. Linux distributions are frequently used on server platforms. Many Linux distributions use the word “Linux” in their name, but the Free Software Foundation uses and recommends the name “GNU/Linux” to emphasize the use and importance of GNU software in many distributions, which has caused some controversy.

GNU is a recursive acronym for “GNU’s Not Unix!”, chosen because GNU’s design is UNIX-like, but differs from UNIX by being free software and containing no UNIX code. Formally, GNU is an extensive collection of free software, which can be used as an operating system or can be used in parts of other operating systems. The use of the completed GNU tools was one of the drivers behind the popularity of Linux. Most of GNU is licensed under the GNU Project’s own General Public License (GPL).

GNU is also the project within which the free software concept originated. Richard Stallman, the founder of the project, views GNU as a technical means to a social end. Relatedly, Lawrence Lessig states in his introduction to the second edition of Stallman's book Free Software, Free Society that in it Stallman has written about the social aspects of software and how Free Software can create community and social justice.

Other than the Linux kernel, key components that make up a distribution may include a display server (windowing system), a package manager, a bootloader, and a Linux shell.

Over time, Linux started supplanting UNIX as a core operating system in financial services. Consider the article Linux lands big Reuters win (https://www.zdnet.com/article/linux-lands-big-reuters-win/) contributed to ZDNET by Stephen Shankland on 16 May, 2002:

Reuters will announce plans Thursday to bring its financial information software to Linux in conjunction with Red Hat, Intel and Hewlett-Packard, sources said, a major achievement for the comparatively young open-source operating system. Morgan Stanley will be one among other customers using the software, sources familiar with the project said. Red Hat, the leading seller of the Linux operating system, announced in March that Morgan Stanley was a customer.

The software involved is Reuters Market Data System (RMDS), used by financial services personnel such as stock traders to retrieve and digest financial statistics and news. The software runs on both workstations and servers.

Unix computers – chiefly from Sun Microsystems – have been popular among these customers. Intel and the PC community, however, have dedicated themselves to ousting these systems with less expensive computers made of PC components and Linux or Windows. Red Hat, meanwhile, has emerged as one of the chief advocates of Linux in corporations, and its biggest marketing push has been to replace Unix.

Linus Torvalds in a famous video Nothing better than C (2019) said:

I have to say, I’m kind of old-fashioned. The reason I got into Linux and operating systems in the first place is that I really love hardware. I love tinkering with hardware. Not in the sense that I’m a hardware person – giving me a soldering iron is a bad idea – but I like interacting with hardware from a software perspective. And I have yet to see a language that comes even close to C in that respect. It’s not just that you can use C to generate good code for hardware. It’s that if you think like a computer, writing C actually makes sense. The people who designed C designed it at a time when compilers had to be simple and the language had to be geared towards what the output was. So when I read C I know what the assembly language will look like.

This matters because, nowadays, in financial services and beyond, it is a common practice to develop the software on Windows (the user-friendly interface facilitates rapid development) but test, deploy, and run the code in production on Linux. C, then, is not only the programming language of choice, but part of the standard development infrastructure.

