APUE2e - Chapter 2 - Standardization and Implementations

The first half of chapter 2 lists a dizzyingly long, eye-glaze-inducing timeline of UNIX-related standards and implementations, all with slightly varying names and meanings. The main concern here is which headers belong to which standards and which implementations provide which headers. The chapter has nice tables showing which systems have which headers. It seems like good reference material.

The UNIX and UNIX-like implementations considered and used for testing examples in APUE2e were FreeBSD 5.2.1, Linux 2.4.22 (Mandrake 9), Mac OS X 10.3 (Darwin 7.4.0) and Solaris 9. Although I have a bias for Debian in the Linux world, that certainly seems like a representative collection of the UNIX-like systems I have encountered and will encounter.

ISO C

There have been two major standards for the C programming language.

The 15 header files in ANSI C, the ones listed at the back of K&R2e, are all provided by the four implementations considered in the book.

The ISO C 1999 standard requires 24 headers (including the 15 in ANSI C) but not all UNIX-like systems include all of the headers. I'm happy to see the wide character headers, wchar.h and wctype.h, are widely available as they seem like they would be important in any modern application (e.g. programming language, text editor, HTTP server) which probably deals with at least Unicode and UTF-8. I'm looking forward to learning more about handling character encodings in C.

My JavaScript browser scripting experience warns me that just because an implementation has a header, it doesn't mean the header contains everything it should or that the functions are bug-free.

IEEE POSIX

The POSIX section is a messy one. There have been many specifications, and to be POSIX-compliant an implementation needs to adhere to standards spread across many different documents.

The POSIX standard requires the 24 ISO C headers and 26 more headers (e.g. unistd.h, sys/socket.h). APUE2e shows that all four of the considered implementations have all these headers with one exception: OS X 10.3 did not have the wordexp.h header. Looking at my OS X 10.5 system, I can see that the header is now present.

There are 34 more optional headers in the POSIX standard with varying levels of support in the implementations. The dlfcn.h and pthread.h headers catch my eye as important ones that are supported in the four implementations. I believe the Apache server makes extensive use of both of these headers.

The Single UNIX Specification

The Single UNIX Specification (SUS) is a superset of POSIX and a system that implements all of SUS can be certified as real "UNIX". This section is another standards history lesson. It seems important to plough through this sort of reading once so I know it is there if I need to reference it in the future.

Limits, Options and Feature Testing

A major part of the UNIX-related specifications is about limits, especially minimum limits, for the size of things. The limits.h and float.h headers in ANSI C are the ones I had encountered in the past. For example, INT_MAX in limits.h must be at least 32767.
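To make the compile-time case concrete, here is a minimal sketch that checks a handful of the ISO C minimums and prints the values the local implementation actually uses. The function names meets_iso_c_minimums and print_basic_limits are my own, not anything from the book.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a few ISO C minimum limits hold. */
int meets_iso_c_minimums(void)
{
    return CHAR_BIT >= 8
        && INT_MAX >= 32767
        && INT_MIN <= -32767
        && FLT_DIG >= 6
        && DBL_DIG >= 10;
}

/* Hypothetical helper: prints the values this implementation chose. */
void print_basic_limits(void)
{
    printf("CHAR_BIT    = %d\n", CHAR_BIT);
    printf("INT_MAX     = %d\n", INT_MAX);
    printf("LONG_MAX    = %ld\n", LONG_MAX);
    printf("FLT_DIG     = %d\n", FLT_DIG);
    printf("DBL_EPSILON = %g\n", DBL_EPSILON);
}
```

Calling print_basic_limits() from main shows how far above the minimums a given system sits. The printed values depend on the platform, which is the whole point.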

Not surprisingly, there are all sorts of system-related limits and options that are outside the scope of the C standard. Chapter 2 spends a lot of time describing the kinds of limits. There are limits that can be determined at compile time, as INT_MAX can. There are also runtime limits that can possibly be determined by calling the functions sysconf, pathconf and fpathconf.
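A rough sketch of the runtime side, assuming a POSIX system: sysconf and pathconf both return -1 either when the limit is indeterminate (errno left unchanged) or when the call fails (errno set), so errno has to be cleared first to tell the two apart. The wrapper names query_sysconf and query_pathconf and the label parameter are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical wrapper: report one sysconf limit; returns the value or -1. */
long query_sysconf(const char *label, int name)
{
    long val;

    errno = 0;                      /* so we can tell "no limit" from "error" */
    val = sysconf(name);
    if (val == -1) {
        if (errno == 0)
            printf("%s: no definite limit\n", label);
        else
            printf("%s: not supported (errno %d)\n", label, errno);
    } else {
        printf("%s = %ld\n", label, val);
    }
    return val;
}

/* Hypothetical wrapper: same idea for a limit tied to a file or directory. */
long query_pathconf(const char *label, const char *path, int name)
{
    long val;

    errno = 0;
    val = pathconf(path, name);
    if (val == -1) {
        if (errno == 0)
            printf("%s (%s): no definite limit\n", label, path);
        else
            printf("%s (%s): error (errno %d)\n", label, path, errno);
    } else {
        printf("%s (%s) = %ld\n", label, path, val);
    }
    return val;
}
```

For example, query_sysconf("_SC_OPEN_MAX", _SC_OPEN_MAX) and query_pathconf("_PC_NAME_MAX", "/", _PC_NAME_MAX) report two of the limits the chapter discusses.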

That all seems simple enough until the book starts discussing indeterminate runtime limits. Two extended examples discuss determining the maximum length of a pathname (so storage can be allocated to hold a pathname string) and the maximum number of files that can be open (so all open files can be closed). What a disappointment that on some systems these simple limits cannot be determined.

For the pathname example the book provides a path_alloc function. It is somewhat unsatisfying to read that "If pathconf indicates that PATH_MAX is indeterminate, we have to punt and just guess a value." Ugh. What an ugly situation.
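The shape of that fallback can be sketched as follows. This is a simplified version written from memory of the idea, not the book's listing; the name path_alloc_sketch and the guess of 1024 are my own assumptions.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

#define PATH_MAX_GUESS 1024     /* arbitrary guess, an assumption of this sketch */

/* Hypothetical simplified allocator: returns a buffer for a pathname and,
 * if sizep is non-NULL, stores the buffer size there. Returns NULL on error. */
char *path_alloc_sketch(size_t *sizep)
{
    static long pathmax = 0;    /* cached after the first call */
    char *buf;

    if (pathmax == 0) {
#ifdef PATH_MAX
        pathmax = PATH_MAX;     /* compile-time limit is available */
#else
        errno = 0;
        pathmax = pathconf("/", _PC_PATH_MAX);
        if (pathmax < 0) {
            if (errno != 0) {
                pathmax = 0;    /* real error: retry on the next call */
                return NULL;
            }
            pathmax = PATH_MAX_GUESS;   /* indeterminate: punt and guess */
        } else {
            pathmax++;          /* the value is relative to "/", so add one */
        }
#endif
    }

    buf = malloc((size_t)pathmax + 1);  /* one extra byte for the terminator */
    if (buf != NULL && sizep != NULL)
        *sizep = (size_t)pathmax + 1;
    return buf;
}
```

The caller owns the buffer and frees it with free(). If the guess turns out to be too small, a function such as getcwd fails with ERANGE, and the caller has to grow the buffer and try again.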

In the maximum number of open files example, the book similarly provides an open_max function with the unfortunate disclaimer "Our best option in [the indeterminate] case is to just close up all descriptors up to some arbitrary limit, say 256. As with our pathname example, this is not guaranteed to work for all cases, but it's the best we can do." :-(
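Again as a simplified sketch rather than the book's code: ask sysconf for _SC_OPEN_MAX and fall back to the arbitrary 256 only when the limit is indeterminate. The name open_max_sketch is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

#define OPEN_MAX_GUESS 256      /* the arbitrary fallback mentioned in the quote */

/* Hypothetical simplified version: returns the maximum number of open
 * files per process, a guess if the limit is indeterminate, or -1 on error. */
long open_max_sketch(void)
{
    static long openmax = 0;    /* cached after the first call */

    if (openmax == 0) {
        errno = 0;
        openmax = sysconf(_SC_OPEN_MAX);
        if (openmax < 0) {
            if (errno != 0) {
                openmax = 0;    /* real error: retry on the next call */
                return -1;
            }
            openmax = OPEN_MAX_GUESS;   /* indeterminate: guess */
        }
    }
    return openmax;
}
```

A daemon would then loop from 0 up to open_max_sketch() - 1 calling close on each descriptor. Any descriptor numbered above a wrong guess stays open, which is exactly the failure mode the disclaimer warns about.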

I'm thinking these sorts of guessing situations can lead to bugs that are very difficult to debug somewhere in the distant future when the guesses are no longer good ones. I can imagine that there are some programming purists who would turn their noses up at this guessing. I could easily end up being one of them.

I've thoroughly investigated this sort of cross-platform feature testing in the browser. It's a great challenge but not a fun challenge. One very encouraging aspect of APUE2e's treatment of system feature testing is that there is nothing comparable to the "browser sniffing" that is widespread in the browser scripting world. I don't see any code like #ifdef LINUX followed by assumptions about the limits in Linux implementations. Those sorts of assumptions don't survive the test of time, as Linux will inevitably change. Instead, APUE2e uses direct tests on the actual features of interest. Certainly a good practice.
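As a small illustration of testing the feature rather than the platform, here is a sketch that asks whether POSIX threads are supported. It assumes the usual POSIX convention for option macros: a value greater than 0 means always supported, -1 means never supported, and anything else means ask sysconf at runtime. The function name has_posix_threads is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if POSIX threads are supported,
 * 0 if they are not, and -1 if the question could not be answered. */
int has_posix_threads(void)
{
#if defined(_POSIX_THREADS) && (_POSIX_THREADS - 0) > 0
    return 1;                   /* compile-time: always supported */
#elif defined(_POSIX_THREADS) && (_POSIX_THREADS - 0) == -1
    return 0;                   /* compile-time: never supported */
#elif defined(_SC_THREADS)
    long val;

    errno = 0;
    val = sysconf(_SC_THREADS); /* runtime check of the option itself */
    if (val > 0)
        return 1;
    return (val == -1 && errno == 0) ? 0 : -1;
#else
    return 0;                   /* no way to ask: assume unsupported */
#endif
}
```

Nothing here names an operating system, so the answer stays correct when a platform gains or loses the option.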

Summary

Chapter 2 isn't an exciting chapter. It's the kind of chapter I'd hope didn't need to exist. It would be nice if there were just one standard with well-defined limits and all the implementations conformed. I imagine the fact that some UNIX-like systems run supercomputers while others run your toothbrush demands a large amount of flexibility in standards support and limit sizes, to accommodate hardware with widely varying power, size and speed.

It looks like with chapters 1 and 2 out of the way, the remainder of the book will be more exciting: actual details about controlling file I/O, processes, signals, threads, daemons, network communication, terminal I/O, etc.
