... design.1.1
Examples of such re-thinking can be found in [Tait91b] and [Tait91a].
... copen().2.1
Copen() is the common code for open() and create().
....2.2
The C preprocessor (cpp) symbol NFSMGR is used to enclose our code in the kernel sources. When defined, our changes are included in the built kernel image.
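The guarding pattern can be sketched in a few lines of C. This is only an illustration of conditional inclusion with the NFSMGR symbol; the function name nfsmgr_compiled_in() is hypothetical and does not appear in the kernel sources.

```c
/*
 * Sketch of code enclosed by the NFSMGR cpp symbol.
 * nfsmgr_compiled_in() is a hypothetical name used for illustration only.
 * Compile with -DNFSMGR to take the first branch.
 */
static int nfsmgr_compiled_in(void)
{
#ifdef NFSMGR
    /* Our changes would be compiled into the image here. */
    return 1;
#else
    /* Stock behavior: none of the added code is present. */
    return 0;
#endif
}
```

Building the same source with and without -DNFSMGR yields two images that differ only in the enclosed code, which makes it easy to compare a modified kernel against an unmodified one.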
... directory.2.3
There are other entities represented as vnodes, such as devices and network communication end-points, but these are irrelevant to our work.
... socket.2.4
Arguably, these should not have to be distinguished at this point. After all, the vnode interface should not care whether something is a vnode or a network file-descriptor. This is the unfortunate result of the ``hacks'' that were made to the original BSD 4.3 kernels (on which SunOS 4.x was based) when networking code was added later.
... field!2.5
Code can obviously break if programmers are unaware of this feature. These semantics exist to support fork/dup/pipe. If a child wants to maintain a different offset into the same file, it must close and reopen it.
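The shared-offset behavior can be observed from user level. The following is a minimal sketch, assuming a POSIX system; it uses dup() rather than fork(), since both leave two descriptors referring to the same open-file entry. The function name and the scratch file name are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write 10 bytes to a scratch file, read 4 bytes through one descriptor,
 * and return the offset seen through a second descriptor.
 * If use_dup is nonzero, the second descriptor is a dup() of the first
 * (shared offset); otherwise it comes from a separate open() of the same
 * file (independent offset). Returns -1 on error.
 */
static long offset_seen_by_second_fd(int use_dup)
{
    char path[] = "/tmp/offset_demo_XXXXXX";  /* hypothetical scratch name */
    char buf[4];
    long result = -1;
    int fd1, fd2;

    fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    if (write(fd1, "0123456789", 10) == 10 && lseek(fd1, 0, SEEK_SET) == 0) {
        fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
        if (fd2 >= 0) {
            /* Advance the offset through the first descriptor only. */
            if (read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf)
                result = (long)lseek(fd2, 0, SEEK_CUR);
            close(fd2);
        }
    }
    close(fd1);
    unlink(path);
    return result;
}
```

Calling offset_seen_by_second_fd(1) reports that the duplicate has moved along with the original, while offset_seen_by_second_fd(0) shows that a separately opened descriptor still sits at the start of the file. This is why a child that needs its own offset must reopen the file.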
... invoked.2.6
Amd is invoked for every file operation that traverses its automount filesystem or acts on its node. Most of these operations are empty stubs that simply return without performing any action (for example, NFS_WRITE). Once name resolution has passed the path component of the automounter, by crossing the symbolic link that Amd presented, the invoking process is no longer at the ``mercy'' of the automounter, but of whatever filesystem server it crossed over to.
... program3.1
Written by Van Jacobson and widely available by anonymous ftp from ftp.uu.net in /networking/ip/trace/traceroute_pkg.tar.Z.
... operations.3.2
These are the NFS_GETATTR, NFS_READ, and sometimes NFS_NULL operations. The last of these might have been better suited for our needs as it exhibits the least variability, but it does not occur often enough. Note also that the null operation does not account for the whole performance of the server (for example, it excludes disk performance), but mostly characterizes the network.
... environment.3.3
A better trigger function would take into account the absolute latency; see Section 8.1.
... RPC3.4
Non-blocking operation is provided by a special kernel implementation of Sun RPC.
... it3.5
Amd provides an RPC interface used by its query client amq that we use to query and control Amd.
... comparison.3.6
This avoids the need to lock the call out to nfsmgrd.
... inactivity3.7
Note also that a filesystem cannot be unmounted, even if nfs_umount is called, as long as there are open file-descriptors in use on that filesystem. That means that even if we can avoid using a filesystem because we have a replacement for it, we may not be able to release the kernel resources it occupies.
... systems3.8
That is, they are exported as read-only to some hosts (including our client hosts), although they might be exported as read-write to others.
... reads.3.9
An example of ``careful update'' is provided by the SUP utility [Shafer92]. SUP transfers the new file to a temporary name, renames the target file to another temporary name, renames the newly transferred file to the final name, and then unlinks the old, renamed file. This is meant to make sure that any open descriptors on the old file can still access it and will not encounter possible paging problems.
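The sequence of steps described above can be sketched as follows. This is not SUP's code; it is a minimal illustration of the same rename ordering, assuming a POSIX system. The function name careful_update() and the ".new" and ".old" temporary suffixes are hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Replace the contents of 'target' with 'data' using a careful update:
 *   1. write the new contents under a temporary name,
 *   2. rename the existing target to a second temporary name,
 *   3. rename the new file to the final name,
 *   4. unlink the old, renamed file.
 * The ".new" and ".old" suffixes are assumptions made for this sketch.
 * Returns 0 on success, -1 on error.
 */
static int careful_update(const char *target, const char *data, size_t len)
{
    char newtmp[1024], oldtmp[1024];
    FILE *fp;
    int n;

    n = snprintf(newtmp, sizeof newtmp, "%s.new", target);
    if (n < 0 || (size_t)n >= sizeof newtmp)
        return -1;
    n = snprintf(oldtmp, sizeof oldtmp, "%s.old", target);
    if (n < 0 || (size_t)n >= sizeof oldtmp)
        return -1;

    /* Step 1: transfer the new contents to a temporary name. */
    fp = fopen(newtmp, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        unlink(newtmp);
        return -1;
    }
    if (fclose(fp) != 0) {
        unlink(newtmp);
        return -1;
    }

    /* Step 2: move the current target out of the way. */
    if (rename(target, oldtmp) != 0) {
        unlink(newtmp);
        return -1;
    }

    /* Step 3: put the new file in place; restore the old one on failure. */
    if (rename(newtmp, target) != 0) {
        rename(oldtmp, target);
        unlink(newtmp);
        return -1;
    }

    /* Step 4: drop the old name; open descriptors keep the old data alive. */
    return unlink(oldtmp);
}
```

A process that opened the target before the update continues to read the old contents through its descriptor, while any later open of the same name sees the new contents.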
... updated3.10
That is, updated by a host to which the file system is exported read-write.
... level.4.1
The spl routines are kernel routines that set the processor's interrupt priority level.
... limits.5.1
Under SunOS, the maximum path name length alone, MAXPATHLEN, is 4096 bytes long.
... priority.6.1
Console output is considered a high-priority event in SunOS 4.x.
... gdb.6.2
Gdb can debug kernels over the network and/or from processes, but only for micro-kernel based operating systems, such as Mach 3.0 [Stallman94].
... files.8.1
One might suppose that a ``most common subset'' of system files could be designated and loaded. However, specifying such a subset becomes ever harder as programs depend on more and more files for configuration and auxiliary information. This approach also increases the user's responsibility for system administration, which we regard as a poor way to design systems. One possible solution is a caching filesystem such as [SMCC92b]. With a caching filesystem, only a small working set of the most frequently used files is stored on a smaller local disk, alleviating the need to go to a remote server for file access. Only when rarely used files are requested would a file search on remote hosts be conducted, perhaps using our switching mechanism.
Erez Zadok
1999-02-17