KNFSD - The Kernel NFS Daemon for Linux
Intro
KNFSD, the kernel-based NFS daemon (as opposed to
UNFSD, the Universal Usermode daemon)
seems to have been mostly written by
Olaf Kirch,
though he doesn't seem to be working on it any more.
"G. Allen Morris III" <gam3@ixlabs.com> has a
web page
which claims to be the Kernel NFS Daemon Maintainer's Page, but it is
a bit out of date now.
H.J. Lu of VA-Research did a
lot of work getting knfsd in 2.2 to pass Connect-a-thon tests and did
some SPEC-SFS benchmarks. He also maintains the user-land support
programs. They can be found in the linuxnfs project on SourceForge:
http://www.linuxnfs.sourceforge.org
SGI have an "NFSv3" project at
http://oss.sgi.com/projects/nfs3/.
Patches
Patches against recent 2.2 kernels can be found on
SourceForge. We are
hoping they will get into 2.2.18 (but we have hoped that sort of thing
since 2.2.14).
My current project for 2.4.0-test is providing an nfsd_operations interface between
knfsd and filesystems. Currently, ext2fs is the only filesystem in
the main tree that can be effectively exported; however, Chris Mason
has patches to make reiserfs work.
I have one outstanding patch at the moment:
- A: Add nfsd_operations support, including some support added to
fs/dcache.c. With this, ext2fs is definitely the only filesystem type
that can be exported, but others can provide support through a
well-defined interface.
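As a rough illustration of what such an interface could look like, here is a user-space model in Python rather than kernel code. The names NfsdOperations, dentry_to_fh, fh_to_dentry, Dentry, ToyFs and can_export are all hypothetical. The real interface would be a C structure of function pointers, and its actual members are not specified here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dentry:
    """Hypothetical stand-in for a kernel dentry."""
    ino: int
    name: str


class NfsdOperations(ABC):
    """Hypothetical interface a filesystem implements to be exportable."""

    @abstractmethod
    def dentry_to_fh(self, dentry: Dentry) -> bytes:
        """Encode a dentry as an opaque filehandle."""

    @abstractmethod
    def fh_to_dentry(self, fh: bytes) -> Optional[Dentry]:
        """Decode a filehandle, or return None if nothing matches."""


class ToyFs(NfsdOperations):
    """Toy filesystem whose filehandle is the inode number as 4 bytes."""

    def __init__(self, dentries):
        self._by_ino = {d.ino: d for d in dentries}

    def dentry_to_fh(self, dentry: Dentry) -> bytes:
        return dentry.ino.to_bytes(4, "big")

    def fh_to_dentry(self, fh: bytes) -> Optional[Dentry]:
        return self._by_ino.get(int.from_bytes(fh, "big"))


def can_export(fs) -> bool:
    """In this model, only filesystems providing the operations are exportable."""
    return isinstance(fs, NfsdOperations)
```

The point of the sketch is only that the server never interprets the handle itself: it asks the filesystem to encode and decode it, so each filesystem is free to choose its own handle layout.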
Documentation
I have started writing up a
commentary on the code.
ToDo List
- trying to mount a mounted-on but not exported directory dies. (DONE)
- UDP checksums aren't checked (DONE)
- work around bug in Solaris 7 whereby the top bit of mtime/atime gets set
by excl create, which annoys the client. (DONE)
- fix stable/non-stable write code in vfs.c (DONE)
- make reply cache check port+proto as well as ip+xid (DONE)
- really sync inodes now that we can (DONE)
- make sure lockd is running each time we start nfsd, since it might
have been shot. (Already there.)
- check nfsd and lockd are registered each time we start nfsd. Well,
if you kill them off and then restart, it will be fine.
- If a filesystem is mounted in two places, make it possible
to export the two differently
- Linux currently uses 1-second resolution mtime. NFS can benefit
from higher resolution to do cache consistency checking. This
could be provided without underlying filesystems being changed by:
  - Adding microsecond (or nanosecond?) fields to the inode
    structure.
  - Having VFS set these appropriately.
  - When an inode is loaded, presetting the nanosecond field to the
    maximum possible value (10^9-1). If the filesystem supports
    better resolution it can set it; otherwise the default value
    should do the closest possible to the "right thing".
- In Linux 2.3, NFSv3 cannot return pre-op attributes for writes as
it doesn't lock the inode (the lock happens lower down). This
should be fixed, but it is not clear how.
- The "nohide" export option is not fully supported by
exportfs/mountd. When an export entry is given to the kernel
(e.g. in response to a mount request), all export entries for child
filesystems which are marked "nohide" should also be given to the kernel.
- NFSD makes more demands on filesystems than VFS does. In some
cases, NFSD could use certain services that a filesystem could
supply, but doesn't because they aren't defined in the various
_operations structures.
There should be an nfsd_operations structure that filesystems
could provide that would define operations like "convert filehandle
to dentry" and "convert dentry to filehandle" and others.
- NFSEXP_ASYNC should be OFF by default. This is controlled by the
user-land utilities. This needs to be agreed to by users before the
change is made.
- NFSv4 support <grin>
- Restructure export table to:
  - Allow non-independent subtrees of a filesystem (e.g. / and /tmp) to
    be exported with different options (assuming subtreecheck).
  - Be friendly to authentication styles other than AUTH_UNIX.
  - Work well with an upcall mechanism to tell mountd if a request
    arrives for an unknown client or export point. This would require
    negative entries, and timing out of unused entries.
- Provide an upcall mechanism to mountd in case of a request or
filehandle that cannot be authenticated/authorised by info in the
kernel. It would have to allow for:
  - Unknown client
  - Unknown authentication scheme
  - Uninitialised authentication context
  - Unknown filesystem
  - Unknown export options for client/filesystem
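The sub-second timestamp proposal in the list above could be modelled roughly as follows. This is a user-space Python sketch, not kernel code. The Inode class, load_inode and vfs_touch are hypothetical names, and the sketch assumes a nanosecond field. One plausible reading of the "maximum possible value" default is that a timestamp reloaded from a 1-second filesystem then never compares earlier than any finer-grained timestamp within the same second.

```python
from dataclasses import dataclass

NSEC_MAX = 10**9 - 1  # largest valid nanosecond value


@dataclass
class Inode:
    """Hypothetical in-memory inode carrying a split mtime."""
    mtime_sec: int
    mtime_nsec: int = NSEC_MAX

    def mtime(self):
        # Tuples compare field by field, so (sec, nsec) orders correctly.
        return (self.mtime_sec, self.mtime_nsec)


def load_inode(disk_mtime_sec, disk_mtime_nsec=None):
    """Hypothetical loader: a filesystem with only 1-second
    resolution passes None and gets the preset default."""
    if disk_mtime_nsec is None:
        return Inode(disk_mtime_sec, NSEC_MAX)
    return Inode(disk_mtime_sec, disk_mtime_nsec)


def vfs_touch(inode, now_ns):
    """Hypothetical VFS hook: record a modification time given as
    integer nanoseconds since the epoch."""
    inode.mtime_sec, inode.mtime_nsec = divmod(now_ns, 10**9)
```

In this model an inode modified in memory carries its exact time, while the same inode reloaded from a coarse filesystem reports the last nanosecond of that second.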
I have a separate ToDone list of things that
have moved off the ToDo list.
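As an illustration of the completed reply cache item (checking port and protocol as well as IP address and XID), here is a small user-space model in Python. CacheKey and ReplyCache are hypothetical names and this is not the kernel's data structure. It only shows why the wider key matters: with a key of IP and XID alone, two requests from one host that happen to share an XID would be indistinguishable.

```python
from typing import NamedTuple, Optional


class CacheKey(NamedTuple):
    """Identifies a request; all four fields must match for a hit."""
    ip: str
    port: int
    proto: str
    xid: int


class ReplyCache:
    """Toy duplicate-request cache: remembers the reply sent per key."""

    def __init__(self):
        self._replies = {}

    def lookup(self, key: CacheKey) -> Optional[bytes]:
        # Returns the cached reply, or None if the request is new.
        return self._replies.get(key)

    def store(self, key: CacheKey, reply: bytes) -> None:
        self._replies[key] = reply
```

A retransmission arrives with the same four values and is answered from the cache, while a request from a different source port or over a different protocol is treated as new even if its XID matches.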
File System Interface
NFSD has the following requirements of any underlying filesystem that
it is going to support. These might well be changed when
nfsd_operations becomes a reality.
- The filesystem must be a FS_REQUIRES_DEV filesystem,
partly because a stable device number is needed for the handle, partly
to exclude PROCFS and NFS from being exported.
- The filesystem must provide a read_inode method which will take an
inode number and provide the inode.
- If read_inode is presented with an inode number which refers to an
inactive (never used or deleted) inode, it should return a
bad_inode created with make_bad_inode.
Correspondingly, if delete_inode is presented with an
is_bad_inode inode, it should quietly ignore it.
- The filesystem should set and maintain the
i_generation field in the inode structure. Without
this nfsd might provide access to the wrong file when it should
return ESTALE.
- The filesystem must support lookup of ".." to find the parent of a
directory. VFS never actually uses this, but most filesystems seem
to provide it anyway.
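The read_inode and i_generation requirements above can be sketched together in a user-space Python model. This is an illustration under stated assumptions, not the kernel code: ToyFs, its create method, the (inode number, generation) handle layout and fh_to_inode are all hypothetical, and the bad flag merely stands in for an inode marked with make_bad_inode.

```python
import errno
from dataclasses import dataclass


@dataclass
class Inode:
    ino: int
    generation: int = 0
    bad: bool = False  # stands in for an inode marked by make_bad_inode


class ToyFs:
    """Toy filesystem mapping inode numbers to inodes."""

    def __init__(self):
        self._inodes = {}

    def create(self, ino, generation):
        # Reusing an inode number with a new generation models a
        # file being deleted and its inode recycled for a new file.
        self._inodes[ino] = Inode(ino, generation)

    def read_inode(self, ino):
        inode = self._inodes.get(ino)
        if inode is None:
            # Inactive inode number: hand back a bad inode, do not fail.
            return Inode(ino, bad=True)
        return inode


def fh_to_inode(fs, fh):
    """Resolve a hypothetical (ino, generation) handle or raise ESTALE."""
    ino, generation = fh
    inode = fs.read_inode(ino)
    if inode.bad or inode.generation != generation:
        raise OSError(errno.ESTALE, "stale file handle")
    return inode
```

If the generation were not compared, a handle issued for a deleted file would silently resolve to whatever new file later received the same inode number.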
Last updated 15th December 1999