A ``vnode'' is a data structure used within Unix-based operating systems to represent an open file, directory, device, or other entity (e.g., a socket) that can appear in the file system name-space. The ``vnode interface'' is an interface within an operating system's file system module that allows higher-level operating system modules to perform operations on vnodes. The vnode interface was invented by Sun Microsystems to facilitate the coexistence of multiple file systems [Kleiman86], specifically the local file system that manages disk storage and the NFS [Sun89,Pawlowski94] remote file system. When a vnode represents storage (such as a file or directory), it does not expose what type of physical file system implements the storage. This ``virtual file system'' concept has proven very useful, and nearly every version of Unix includes some form of vnodes and a vnode interface.
One notable improvement to the vnode concept is ``vnode stacking'' [Rosenthal92,Heidemann94,Skinner93], a technique for modularizing file system functions. The idea is to allow one vnode interface to call another. Before stacking existed, there was only a single vnode interface: higher-level operating system code called the vnode interface, which in turn called code for a specific file system. With vnode stacking, several vnode interfaces may exist and may call each other in sequence: the code for a given operation at stack level N calls the corresponding operation at level N+1, and so on.
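The calling pattern described above can be sketched in C. This is a minimal illustration under stated assumptions, not the interface of any particular Unix: the names `vnodeops`, `vop_read`, `vn_read`, `basefs_read`, and `xorfs_read` are hypothetical, the interface is reduced to a single read operation, and the "transform" layer (XOR with a constant) merely stands in for real work such as decryption or decompression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vnode;

/* Hypothetical one-operation vnode interface; real ones have many more. */
struct vnodeops {
    int (*vop_read)(struct vnode *vp, char *buf, size_t len);
};

struct vnode {
    const struct vnodeops *ops; /* operations implementing this level */
    struct vnode *lower;        /* vnode at level N+1, NULL at the bottom */
    const char *data;           /* backing bytes, used by the bottom level only */
};

/* Entry point for callers: dispatch through the vnode's interface. */
static int vn_read(struct vnode *vp, char *buf, size_t len)
{
    assert(vp != NULL && vp->ops != NULL);
    return vp->ops->vop_read(vp, buf, len);
}

/* Bottom level: a stand-in for a physical file system. */
static int basefs_read(struct vnode *vp, char *buf, size_t len)
{
    size_t n = strlen(vp->data);
    if (n > len)
        n = len;
    memcpy(buf, vp->data, n);
    return (int)n;
}

/* Level N: call the same operation at level N+1, then transform the result. */
static int xorfs_read(struct vnode *vp, char *buf, size_t len)
{
    int n = vn_read(vp->lower, buf, len);
    for (int i = 0; i < n; i++)
        buf[i] ^= 0x20; /* toy transform: flips the case of ASCII letters */
    return n;
}

static const struct vnodeops basefs_ops = { basefs_read };
static const struct vnodeops xorfs_ops  = { xorfs_read };
```

A caller that builds a two-level stack (an `xorfs` vnode whose `lower` field points at a `basefs` vnode) and invokes `vn_read` on the top vnode never needs to know how many levels lie beneath it. Because `xorfs_read` reaches the lower level only through `vn_read`, the same layer could be placed on top of any other vnode that implements the interface.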
For an example of the utility of vnode stacking, consider the complex caching file system (Cachefs) shown in Figure fig-intro-decompose. Here, files are accessed from a compressed (Gzipfs), replicated (Replicfs) file system and cached in an encrypted (Cryptfs), compressed file system. One of the replicas of the source file system is itself encrypted, presumably with a key different from that of the encrypted cache. The cache is stored in a UFS [LoVerso91] physical file system. Each of the three replicas is stored in a different type of physical file system: UFS, NFS, and PCFS [Forin94].
One could design a single file system that includes all of this functionality. However, the result would probably be complex and difficult to debug and maintain. Alternatively, one could decompose such a file system into a set of components:
These components can be combined in many ways provided that they are written to call and be callable by other, unknown, components. Figure fig-intro-decompose shows how the cryptographic file system can stack on top of either a physical file system (PCFS) or a non-physical one (Gzipfs). Vnode stacking facilitates this design concept by providing a convenient inter-component interface. The introduction of one module on top of another in the stack is called ``interposition.''
Building file systems by component interposition carries the expected advantages of greater modularity, easier debugging, and scalability. The primary disadvantage is performance: every crossing of the vnode interface adds overhead. However, I claim that this overhead can be made small enough that any loss in performance is outweighed by the benefits (see Section sec-design-eval-performance).
The example in Figure fig-intro-decompose illustrates another property of vnode stacking: fan-out. The implementation of Replicfs calls three different file systems. Fan-in can exist, too: more than one upper-level component may call the same lower-level one. There is no reason to restrict the stacking concept to a linear stack or chain of file systems.
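Fan-out can be sketched in the same style. The following C fragment is an illustration only: the text does not say how Replicfs chooses among its replicas, so the first-success failover policy here is an assumption, as are the names `replic_read`, `leaf_read`, and `NREPLICAS`, and the use of a NULL `data` pointer to simulate an unavailable replica.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NREPLICAS 3

struct vnode;

/* Hypothetical one-operation vnode interface. */
struct vnodeops {
    int (*vop_read)(struct vnode *vp, char *buf, size_t len);
};

struct vnode {
    const struct vnodeops *ops;
    struct vnode *lower[NREPLICAS]; /* fan-out: several vnodes beneath this one */
    const char *data;               /* NULL simulates an unavailable replica */
};

/* Entry point for callers: dispatch through the vnode's interface. */
static int vn_read(struct vnode *vp, char *buf, size_t len)
{
    assert(vp != NULL && vp->ops != NULL);
    return vp->ops->vop_read(vp, buf, len);
}

/* Leaf stand-in for a physical file system; returns -1 if unavailable. */
static int leaf_read(struct vnode *vp, char *buf, size_t len)
{
    if (vp->data == NULL)
        return -1;
    size_t n = strlen(vp->data);
    if (n > len)
        n = len;
    memcpy(buf, vp->data, n);
    return (int)n;
}

/* Replication layer (assumed policy): return the first replica that succeeds. */
static int replic_read(struct vnode *vp, char *buf, size_t len)
{
    for (int i = 0; i < NREPLICAS; i++) {
        if (vp->lower[i] == NULL)
            continue;
        int n = vn_read(vp->lower[i], buf, len);
        if (n >= 0)
            return n;
    }
    return -1; /* every replica failed */
}

static const struct vnodeops leaf_ops   = { leaf_read };
static const struct vnodeops replic_ops = { replic_read };
```

The replication layer holds an array of lower vnodes rather than a single pointer, so the structure beneath it is a tree rather than a chain. Each lower vnode could itself be the top of another stack, which is how one replica in the example can be encrypted while the others are not.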