   
6. Experiences

   
6.1 Experiences in Kernel Development

During the time we worked on this system, we gained considerable experience developing and testing kernel code; it proved to be a challenging task. These comments were born out of working on the SunOS 4.x operating system, but they apply equally to many other environments based on a monolithic kernel.

   
6.1.1 Debugging

The single largest problem when developing kernel code is debugging. Each time a new test had to be made, we had to edit the kernel sources, rebuild the kernel executable (/vmunix), install it, reboot, wait for the machine to come up, and then start our tests. On a SparcStation II, this whole cycle averaged around 30 minutes even for very small code changes. (That might explain why this work spanned several years.)

   
6.1.1.1 printf()s

The method we came to rely on most for debugging the kernel was copious printf statements in the particular code sections being debugged at the moment. We had to be careful about how many print statements we inserted and where we placed them. For example, busy sections such as the name-resolution function (au_lookuppn()) are bad places to insert them: the output generated by the kernel -- which gets printed on the console and handed to the syslog [SMCC90e] daemon -- is so voluminous that the machine spends most of its time displaying debugging output, and user processes are pushed down in scheduling priority.
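To illustrate the style of guarded debugging statements involved, here is a minimal sketch. The flag dbg_lookup, the DPRINTF macro, and the trace_component() routine are hypothetical names of our own invention, not part of SunOS or of the system described here; only printf() itself is the kernel facility discussed above.

/*
 * Hypothetical guard around kernel printf() debugging.  The global flag
 * and the DPRINTF macro are illustrative names only; kernel printf()
 * output is sent to the console and to the syslog daemon.
 */
int dbg_lookup = 0;		/* nonzero enables lookup tracing */

#define DPRINTF(flag, args)	do { if (flag) printf args; } while (0)

/* example use inside a pathname-resolution routine */
static void
trace_component(name)
	char *name;
{
	DPRINTF(dbg_lookup, ("lookuppn: resolving component \"%s\"\n", name));
}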

However, at times not even printf helped us. Output has to pass through a kernel buffer, out to the syslog mechanism, and then to the console (or wherever /etc/syslog.conf directs it). When a kernel panic occurs, the kernel printf buffer almost certainly contains output that has not yet been flushed to the console, and that output is lost when the machine panics. Unfortunately, that output is the most critical to have, because it is the debugging information leading up to the panic. The best way to avoid these problems was to be extremely careful when writing kernel code; see Section 6.1.2.

Our solution was to introduce a new system call dedicated to turning kernel debugging on and off for any section of our code, and to querying or even changing information accumulated by the filesystem code. See Sections 4.2 and 4.3.
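As a rough sketch only, such a call can be as simple as setting or returning a global debugging mask. The names fsdebug_syscall, FSDEBUG_GET, FSDEBUG_SET, and fs_debug_mask below are hypothetical and are not the interface of Sections 4.2 and 4.3; a real SunOS system-call handler also uses a different calling convention than shown here.

/*
 * Hypothetical debugging system call: get or set a global mask with
 * one bit per code section to trace.  Names are illustrative only.
 */
#define FSDEBUG_GET	0	/* return the current debugging mask */
#define FSDEBUG_SET	1	/* install a new debugging mask */

int fs_debug_mask = 0;		/* consulted by guarded printf statements */

int
fsdebug_syscall(cmd, mask)
	int cmd, mask;
{
	switch (cmd) {
	case FSDEBUG_GET:
		return (fs_debug_mask);
	case FSDEBUG_SET:
		fs_debug_mask = mask;
		return (0);
	default:
		return (-1);	/* a real handler would return an errno such as EINVAL */
	}
}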

   
6.1.1.2 Kernel Debuggers

We tried other methods for debugging kernels, such as the kernel debugger kadb, but found them cumbersome and far less flexible than user-level debuggers such as gdb.

   
6.1.1.3 Source Browsing

SunOS 4.1.2's kernel sources comprise almost half a million lines of C code. No one person could be an expert in every part of so large a system. Our work concentrated mainly on less than one tenth of that amount, and in all, no more than one fifth of the code (about 73,000 lines) had to be examined to achieve our goals. In one respect, this exemplifies just how much modularity there is in the kernel. On the other hand, we had to learn how the SunOS kernel operates entirely on our own, testing one section at a time. A lot of time was spent placing printf statements at various points in the kernel and checking what output was produced.

That is how we learned the flow of execution in the kernel. At any given point it is difficult to know how the kernel got there. One of the main reasons is the ``object-oriented'' programming style of the SunOS kernel. Many routines are not called directly, but as a consequence of a macro expansion on a field of a structure containing opaque data and generic structures full of pointers to functions. One of these function pointers is dereferenced and then called on the actual data it was passed. Here is an example from <sys/vnode.h> showing how this is achieved:

#define VOP_GETATTR(VP,VA,C)		(*(VP)->v_op->vn_getattr)(VP,VA,C)

The operation might have been applied to any vnode, but at that level the knowledge of which filesystem the vnode belonged to has been lost. The best way we found to recover that information was to compare the addresses of the operations vectors -- in this example, comparing (VP)->v_op with the address of the global nfs_vnodeops. That way we could tell whether the vnode in question was an NFS one.
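A minimal sketch of this test follows, assuming the SunOS 4.x declarations in <sys/vnode.h> and the NFS client's global operations vector nfs_vnodeops; the helper name vnode_is_nfs is ours and purely illustrative. The comparison is between the address stored in v_op and the address of the global vector; no function pointer is dereferenced.

#include <sys/vnode.h>

extern struct vnodeops nfs_vnodeops;	/* NFS client vnode operations vector */

/* return nonzero if the vnode belongs to an NFS filesystem */
static int
vnode_is_nfs(vp)
	struct vnode *vp;
{
	return (vp->v_op == &nfs_vnodeops);
}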

Another problem was the lack of documentation specific to SunOS kernels, or even more general documentation about ``modern'' operating systems resembling SunOS. The books available to us at the time were outdated, too broad, or inapplicable [Bach86,Leffler89,Tanenbaum87].

   
6.1.2 Coding Practices

When coding in the kernel, we found that many of the assumptions and habits accumulated over years of user-level programming were false. They proved futile in the kernel: the slightest problem causes a panic, followed by a long dump of kernel memory pages and the obvious need to fix the code.

These are some of our recommendations when writing kernel code:

   
6.2 Vendor Bugs

Even if you are an experienced C programmer and write bug-free kernel code, you may still get kernel panics: vendor kernel code is hardly bug-free. In our case, hundreds of patches exist for the various versions of SunOS 4.x.

The system administrators at our site had installed most of these patches and were constantly installing new ones. However, the sources we were working from were those of the original, unpatched system, which meant that the kernels we built from them did not include any of these bug fixes. Our kernels crashed several times due to known bugs for which we had no source fixes.

On a few occasions we tried to install binary kernel patches to the object modules of files we knew we were not modifying, working under the assumption that it is better to fix some bugs than none at all. That assumption held only half of the time. Large patches are often distributed in collections known as ``Jumbo Kernel Patches.'' These patches are so extensive that they span many kernel modules and make incompatible changes that must be coordinated among different code sections. Installing only a few of the patched modules and expecting the rest to be built from our sources often did not work. If we were lucky, the kernel would not build due to missing symbols. If we were unlucky, the kernel linked but failed to run at some stage, sometimes several days after the system was rebooted.

