Not Bruce's Home Page

Try this.

Various Linux/NFS/etc. todos:

Stuff that has to be done before we can lift the "experimental" marking:

  • Reboot recovery: if we're going to replace the current system, that's a user-visible change; would be nice to do that sooner rather than later.
  • Simplified pseudofilesystem management, consistent v2/v3/v4 export paths: steved is working on this.
Stuff to play with:
  • oprofile
  • stap
Other stuff:
  • big rpcsec_gss init upcalls
  • delegations
  • nfsd4 state problems:
    • various callback problems
    • Audit lifetime rules for clients.
    • Fix locking on start/stop.
    • In setclientid, allow client to blow away old state (even if credentials conflict) if old state has no open state associated with it.
  • Fix open on the server; problems:
    • We shouldn't be rechecking open permissions in the case where the open creates a file. But we don't actually know whether it did create the file on exit from nfsd_create_v3. Hm.
    • Truncate is handled strangely: we don't set op_truncate at all in the non-create case, and in the create case we only set it if the call sets the size to exactly zero. What happens otherwise? Also, what does MAY_TRUNC really mean?
  • Server handles open-upgrades in a weird way, by hanging on to the same struct file and doing a get_write_access if necessary. It would probably be better to keep two struct files in the stateid instead; sketch:
    • Call the two st_reader and st_writer
    • If the first open is for read or both, assign it to st_reader and leave st_writer null. (Arbitrary choice.)
    • Otherwise, in write-only case, assign to st_writer.
    • On upgrade, assign to the other if necessary.
    • On close, put one of them if necessary; note closes are required to balance open upgrades in such a way that this is always possible.
    • Eliminate the extra bitmask accounting while we're at it, which is only there to catch out-of-spec clients anyway.
  • nfs client idmapping error handling: should probably, for example, be erroring out instead of falling back on nobody when setting uid/gid?
  • nfs client: make sure setting either acl or mode invalidates cache of the other.
    • Not completely done yet. Need to ensure same invalidation happens on mode. Need to ensure what's invalidated isn't more than necessary.
  • krb5p implementation is ugly; it has unclear requirements from client and server code, and does a lot of complicated and inefficient things for obscure reasons. To do:
    • summarize client- and server-side requirements, make sure we've accounted for the obvious nits in the current code
    • attempt pencil-and-paper design of simpler krb5p API.
    • implement incrementally....
  • nfsv4 deferral problems.

    Goal in too-many-upcalls case: make sure threads that *don't* require an upcall can still make progress.

    Current deferral process:

    • svc_recv sets rq_chandle.defer to svc_defer. (rq_chandle is a struct cache_req, and defer is its only field.) Defer just takes its cache_req and returns a cache_deferred_req, which contains a "revisit" function and is embedded in whatever other data you need to do the revisit. ->defer is called from cache_defer_req, which is called from cache_check.
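
    A rough userspace model of that call chain, in Python rather than kernel C, just to make the shape visible. Assumptions, none of them from the real code: the owner back-pointer stands in for container_of(), subclassing stands in for struct embedding, the single pending list replaces per-entry bookkeeping, upcall_done is a made-up helper, and the too_many argument to revisit is how I remember the hook looking.

```python
class CacheReq:
    """Models struct cache_req: its only real field is the defer hook."""
    def __init__(self, defer, owner):
        self.defer = defer
        self.owner = owner  # hypothetical stand-in for container_of()


class CacheDeferredReq:
    """Models struct cache_deferred_req: carries the revisit callback."""
    def __init__(self, revisit):
        self.revisit = revisit


class SvcDeferredReq(CacheDeferredReq):
    """The 'other data' needed for the revisit; subclassing models embedding."""
    def __init__(self, saved_args, requeue):
        super().__init__(svc_revisit)
        self.saved_args = saved_args
        self.requeue = requeue


def svc_revisit(dreq, too_many):
    """Requeue the saved request, or silently drop it if too_many is set."""
    if too_many:
        return
    dreq.requeue.append(dreq.saved_args)


def svc_defer(req):
    """Takes a cache_req, returns a cache_deferred_req."""
    rqstp = req.owner
    return SvcDeferredReq(rqstp.args, rqstp.requeue)


class SvcRqst:
    """Models the part of svc_rqst that matters here."""
    def __init__(self, args, requeue):
        self.args = args
        self.requeue = requeue
        # svc_recv is where rq_chandle.defer gets pointed at svc_defer.
        self.rq_chandle = CacheReq(svc_defer, self)


def cache_defer_req(req, pending):
    """Calls ->defer and remembers the result until the upcall answers."""
    dreq = req.defer(req)
    if dreq is not None:
        pending.append(dreq)
    return dreq


def cache_check(cache, key, req, pending):
    """Return the cached value, or defer the request and return None."""
    if key in cache:
        return cache[key]
    cache_defer_req(req, pending)
    return None


def upcall_done(cache, key, value, pending):
    """Hypothetical helper: fill the cache, then revisit everything deferred."""
    cache[key] = value
    while pending:
        dreq = pending.pop(0)
        dreq.revisit(dreq, False)
```

    The point of the model: the request thread gets None back from cache_check and moves on, and the saved request only reappears on the requeue list once the upcall has been answered.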

    Detailed plan:

    • NEXT: Store nfsv4-specific state with the deferral (current & saved fh, op #, offset to next op (in raw xdr, or in decoded result??), partially xdr'd response); may need a callback from svcsock to nfsd4 to do this. Maybe some of that state could be moved into the svc_rqst struct itself?
    • Remove the special case code for idmapd.
    • Also consider extending deferral handling to allow deferring larger requests, with an appropriate limit on the amount of memory used for deferred requests.
    • Also modify the deferral code so that if it decides to drop a request later, it either forces a revisit and a -ETIMEDOUT return, or it actually closes down the connection. (The former might be easiest: we could probably just re-run it and count on resource limits to ensure the -ETIMEDOUT as soon as we attempt to defer it again.) (Note: Trond points out that closing the connection may put some particular stress on the replay cache if we happen to be processing another request on that socket at the same time....)
    • Lower priority: at some point, audit mountd for how it handles errors from the filehandle nfsctl(s)--I'm not sure it takes into account the fact that exp_parent can fail?
  • nfsv4 compound limitations: Can't handle, e.g., multiple reads/writes in one compound. Greg has some idea that the page buffers should be eliminated completely and we should read directly out of the network's buffers. That might help....
  • nfsv4 compound processing: note -ENOMEM -> drop, possibly leading to client replaying nonidempotent operation.
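
The two-struct-file idea for open upgrades (st_reader/st_writer, in the list above) can be modeled in a few lines of Python. This is only a toy under stated assumptions: FakeFile stands in for a kernel struct file, the READ/WRITE constants and the open/downgrade/close method names are made up for illustration, and the client is assumed to be in-spec, so a downgrade only ever asks to keep access that matches earlier opens.

```python
READ, WRITE = 1, 2
BOTH = READ | WRITE


class FakeFile:
    """Hypothetical stand-in for a struct file opened with some access mode."""
    def __init__(self, mode):
        self.mode = mode
        self.is_open = True

    def put(self):
        self.is_open = False


class OpenStateid:
    """Toy stateid holding up to two files instead of one upgraded file."""
    def __init__(self):
        self.st_reader = None
        self.st_writer = None

    def access(self):
        """Access currently held, derived from the files themselves."""
        bits = 0
        for f in (self.st_reader, self.st_writer):
            if f is not None:
                bits |= f.mode
        return bits

    def open(self, want):
        """First open, or an upgrade of an existing open."""
        if self.st_reader is None and self.st_writer is None:
            if want & READ:
                # Read or both goes to st_reader (the arbitrary choice).
                self.st_reader = FakeFile(want)
            else:
                self.st_writer = FakeFile(WRITE)
            return
        missing = want & ~self.access()
        if missing & WRITE:
            self.st_writer = FakeFile(WRITE)
        elif missing & READ:
            self.st_reader = FakeFile(READ)

    def downgrade(self, keep):
        """Put any file that provides none of the access still wanted."""
        for slot in ("st_reader", "st_writer"):
            f = getattr(self, slot)
            if f is not None and not (f.mode & keep):
                f.put()
                setattr(self, slot, None)

    def close(self):
        self.downgrade(0)
```

Note there is no separate bitmask of past opens here; the held access is read off the two files, which is roughly what dropping the extra accounting would look like.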

Pointless lists:


Yeah, yeah, everybody has to have a blog these days.

I kept it elsewhere until May 2005.


NFS specs: Some NFS history to read: Language resources: Local weather