- ==========================
- General Filesystem Caching
- ==========================
- ========
- OVERVIEW
- ========
- This facility is a general purpose cache for network filesystems, though it
- could be used for caching other things such as ISO9660 filesystems too.
- FS-Cache mediates between cache backends (such as CacheFS) and network
- filesystems:
-     +---------+
-     |         |                        +--------------+
-     |   NFS   |--+                     |              |
-     |         |  |                 +-->|   CacheFS    |
-     +---------+  |   +----------+  |   |  /dev/hda5   |
-                  |   |          |  |   +--------------+
-     +---------+  +-->|          |  |
-     |         |      |          |--+
-     |   AFS   |----->| FS-Cache |
-     |         |      |          |--+
-     +---------+  +-->|          |  |
-                  |   |          |  |   +--------------+
-     +---------+  |   +----------+  |   |              |
-     |         |  |                 +-->|  CacheFiles  |
-     |  ISOFS  |--+                     |  /var/cache  |
-     |         |                        +--------------+
-     +---------+
- Or to look at it another way, FS-Cache is a module that provides a caching
- facility to a network filesystem such that the cache is transparent to the
- user:
-     +---------+
-     |         |
-     | Server  |
-     |         |
-     +---------+
-          |                  NETWORK
-     ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-          |
-          |           +----------+
-          V           |          |
-     +---------+      |          |
-     |         |      |          |
-     |   NFS   |----->| FS-Cache |
-     |         |      |          |--+
-     +---------+      |          |  |   +--------------+   +--------------+
-          |           |          |  |   |              |   |              |
-          V           +----------+  +-->|  CacheFiles  |-->|  Ext3        |
-     +---------+                        |  /var/cache  |   |  /dev/sda6   |
-     |         |                        +--------------+   +--------------+
-     |   VFS   |                                ^                     ^
-     |         |                                |                     |
-     +---------+                                +--------------+      |
-          |                  KERNEL SPACE                      |      |
-     ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
-          |                  USER SPACE                        |      |
-          V                                                    |      |
-     +---------+                                           +--------------+
-     |         |                                           |              |
-     | Process |                                           | cachefilesd  |
-     |         |                                           |              |
-     +---------+                                           +--------------+
- FS-Cache does not follow the idea of completely loading every netfs file
- opened in its entirety into a cache before permitting it to be accessed and
- then serving the pages out of that cache rather than the netfs inode because:
- (1) It must be practical to operate without a cache.
- (2) The size of any accessible file must not be limited to the size of the
- cache.
- (3) The combined size of all opened files (this includes mapped libraries)
- must not be limited to the size of the cache.
- (4) The user should not be forced to download an entire file just to do a
- one-off access of a small portion of it (such as might be done with the
- "file" program).
- It instead serves the cache out in PAGE_SIZE chunks as and when requested by
- the netfs('s) using it.
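- As a rough illustration of reason (4) and of page-granular caching, the
- sketch below works out which page indices a byte-range read touches. The
- 4096-byte PAGE_SIZE and the pages_touched() helper are assumptions made for
- this example; neither is part of FS-Cache.

```python
PAGE_SIZE = 4096  # assumed for illustration; the real value depends on the architecture


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> range:
    """Return the page indices covered by a read of `length` bytes at `offset`."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return range(0)
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


if __name__ == "__main__":
    # A small probe of the start of a large file only needs the first page.
    print(list(pages_touched(0, 100)))
    # A read that straddles a page boundary needs both neighbouring pages.
    print(list(pages_touched(4000, 200)))
```

- Only the pages in the returned range would need to be fetched or cached,
- however large the file is.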
- FS-Cache provides the following facilities:
- (1) More than one cache can be used at once. Caches can be selected
- explicitly by use of tags.
- (2) Caches can be added / removed at any time.
- (3) The netfs is provided with an interface that allows either party to
- withdraw caching facilities from a file (required for (2)).
- (4) The interface to the netfs returns as few errors as possible, preferring
- rather to let the netfs remain oblivious.
- (5) Cookies are used to represent indices, files and other objects to the
- netfs. The simplest cookie is just a NULL pointer - indicating nothing
- cached there.
- (6) The netfs is allowed to propose - dynamically - any index hierarchy it
- desires, though it must be aware that the index search function is
- recursive, stack space is limited, and indices can only be children of
- indices.
- (7) Data I/O is done direct to and from the netfs's pages. The netfs
- indicates that page A is at index B of the data-file represented by cookie
- C, and that it should be read or written. The cache backend may or may
- not start I/O on that page, but if it does, a netfs callback will be
- invoked to indicate completion. The I/O may be either synchronous or
- asynchronous.
- (8) Cookies can be "retired" upon release. At this point FS-Cache will mark
- them as obsolete and the index hierarchy rooted at that point will get
- recycled.
- (9) The netfs provides a "match" function for index searches. In addition to
- saying whether a match was made or not, this can also specify that an
- entry should be updated or deleted.
- (10) As much as possible is done asynchronously.
- FS-Cache maintains a virtual indexing tree in which all indices, files, objects
- and pages are kept. Bits of this tree may actually reside in one or more
- caches.
-                                           FSDEF
-                                             |
-                             +-------------------------------+
-                             |                               |
-                            NFS                             AFS
-                             |                               |
-                +--------------------------+           +-----------+
-                |                          |           |           |
-             homedir                    mirror      afs.org    redhat.com
-                |                          |                       |
-          +-----------+            +---------------+          +----------+
-          |           |            |               |          |          |
-        00001       00002        00007           00125    vol00001   vol00002
-          |           |            |               |                     |
-      +---+---+    +-----+       +---+      +------+------+         +-----+----+
-      |   |   |    |     |       |   |      |      |      |         |     |    |
-     PG0 PG1 PG2  PG0  XATTR    PG0 PG1   DIRENT DIRENT DIRENT     R/W   R/O  Bak
-                         |                                          |
-                        PG0                                     +-------+
-                                                                |       |
-                                                              00001   00003
-                                                                |
-                                                            +---+---+
-                                                            |   |   |
-                                                           PG0 PG1 PG2
- In the example above, you can see two netfs's being backed: NFS and AFS. These
- have different index hierarchies:
- (*) The NFS primary index contains per-server indices. Each server index is
- indexed by NFS file handles to get data file objects. Each data file
- object can have an array of pages, but may also have further child
- objects, such as extended attributes and directory entries. Extended
- attribute objects themselves have page-array contents.
- (*) The AFS primary index contains per-cell indices. Each cell index contains
- per-logical-volume indices. Each volume index contains up to three
- indices for the read-write, read-only and backup mirrors of that volume.
- Each of these contains vnode data file objects, each of which contains an
- array of pages.
- The very top index is the FS-Cache master index in which individual netfs's
- have entries.
- Any index object may reside in more than one cache, provided it only has index
- children. Any index with non-index object children will be assumed to only
- reside in one cache.
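- The two structural rules stated above (indices can only be children of
- indices, and an index may span caches only if all of its children are
- indices) can be sketched with a toy model. CacheObject and may_span_caches()
- are hypothetical names for this illustration and do not correspond to kernel
- structures or functions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CacheObject:
    """Toy node in the virtual indexing tree (illustrative only)."""
    name: str
    is_index: bool
    children: List["CacheObject"] = field(default_factory=list)

    def add(self, child: "CacheObject") -> "CacheObject":
        # An index may only be the child of another index.
        if child.is_index and not self.is_index:
            raise ValueError(
                f"index {child.name!r} cannot be a child of non-index {self.name!r}"
            )
        self.children.append(child)
        return child


def may_span_caches(obj: CacheObject) -> bool:
    """An index may reside in several caches only if every child is an index."""
    return obj.is_index and all(child.is_index for child in obj.children)


# Part of the example tree: FSDEF -> NFS -> homedir -> data file 00001.
fsdef = CacheObject("FSDEF", is_index=True)
nfs = fsdef.add(CacheObject("NFS", is_index=True))
homedir = nfs.add(CacheObject("homedir", is_index=True))
file_00001 = homedir.add(CacheObject("00001", is_index=False))
```

- In this model FSDEF and NFS may span caches, whereas homedir may not because
- it has a data file as a child.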
- The netfs API to FS-Cache can be found in:
- Documentation/filesystems/caching/netfs-api.txt
- The cache backend API to FS-Cache can be found in:
- Documentation/filesystems/caching/backend-api.txt
- A description of the internal representations and object state machine can be
- found in:
- Documentation/filesystems/caching/object.txt
- =======================
- STATISTICAL INFORMATION
- =======================
- If FS-Cache is compiled with the following options enabled:
- CONFIG_FSCACHE_STATS=y
- CONFIG_FSCACHE_HISTOGRAM=y
- then it will gather certain statistics and display them through a number of
- proc files.
- (*) /proc/fs/fscache/stats
- This shows counts of a number of events that can happen in FS-Cache:
-     CLASS   EVENT   MEANING
-     ======= ======= =======================================================
-     Cookies idx=N   Number of index cookies allocated
-             dat=N   Number of data storage cookies allocated
-             spc=N   Number of special cookies allocated
-     Objects alc=N   Number of objects allocated
-             nal=N   Number of object allocation failures
-             avl=N   Number of objects that reached the available state
-             ded=N   Number of objects that reached the dead state
-     ChkAux  non=N   Number of objects that didn't have a coherency check
-             ok=N    Number of objects that passed a coherency check
-             upd=N   Number of objects that needed a coherency data update
-             obs=N   Number of objects that were declared obsolete
-     Pages   mrk=N   Number of pages marked as being cached
-             unc=N   Number of uncache page requests seen
-     Acquire n=N     Number of acquire cookie requests seen
-             nul=N   Number of acq reqs given a NULL parent
-             noc=N   Number of acq reqs rejected due to no cache available
-             ok=N    Number of acq reqs succeeded
-             nbf=N   Number of acq reqs rejected due to error
-             oom=N   Number of acq reqs failed on ENOMEM
-     Lookups n=N     Number of lookup calls made on cache backends
-             neg=N   Number of negative lookups made
-             pos=N   Number of positive lookups made
-             crt=N   Number of objects created by lookup
-             tmo=N   Number of lookups timed out and requeued
-     Updates n=N     Number of update cookie requests seen
-             nul=N   Number of upd reqs given a NULL parent
-             run=N   Number of upd reqs granted CPU time
-     Relinqs n=N     Number of relinquish cookie requests seen
-             nul=N   Number of rlq reqs given a NULL parent
-             wcr=N   Number of rlq reqs waited on completion of creation
-     AttrChg n=N     Number of attribute changed requests seen
-             ok=N    Number of attr changed requests queued
-             nbf=N   Number of attr changed rejected -ENOBUFS
-             oom=N   Number of attr changed failed -ENOMEM
-             run=N   Number of attr changed ops given CPU time
-     Allocs  n=N     Number of allocation requests seen
-             ok=N    Number of successful alloc reqs
-             wt=N    Number of alloc reqs that waited on lookup completion
-             nbf=N   Number of alloc reqs rejected -ENOBUFS
-             int=N   Number of alloc reqs aborted -ERESTARTSYS
-             ops=N   Number of alloc reqs submitted
-             owt=N   Number of alloc reqs waited for CPU time
-             abt=N   Number of alloc reqs aborted due to object death
-     Retrvls n=N     Number of retrieval (read) requests seen
-             ok=N    Number of successful retr reqs
-             wt=N    Number of retr reqs that waited on lookup completion
-             nod=N   Number of retr reqs returned -ENODATA
-             nbf=N   Number of retr reqs rejected -ENOBUFS
-             int=N   Number of retr reqs aborted -ERESTARTSYS
-             oom=N   Number of retr reqs failed -ENOMEM
-             ops=N   Number of retr reqs submitted
-             owt=N   Number of retr reqs waited for CPU time
-             abt=N   Number of retr reqs aborted due to object death
-     Stores  n=N     Number of storage (write) requests seen
-             ok=N    Number of successful store reqs
-             agn=N   Number of store reqs on a page already pending storage
-             nbf=N   Number of store reqs rejected -ENOBUFS
-             oom=N   Number of store reqs failed -ENOMEM
-             ops=N   Number of store reqs submitted
-             run=N   Number of store reqs granted CPU time
-             pgs=N   Number of pages given store req processing time
-             rxd=N   Number of store reqs deleted from tracking tree
-             olm=N   Number of store reqs over store limit
-     VmScan  nos=N   Number of release reqs against pages with no pending store
-             gon=N   Number of release reqs against pages stored by time lock granted
-             bsy=N   Number of release reqs ignored due to in-progress store
-             can=N   Number of page stores cancelled due to release req
-     Ops     pend=N  Number of times async ops added to pending queues
-             run=N   Number of times async ops given CPU time
-             enq=N   Number of times async ops queued for processing
-             can=N   Number of async ops cancelled
-             rej=N   Number of async ops rejected due to object lookup/create failure
-             ini=N   Number of async ops initialised
-             dfr=N   Number of async ops queued for deferred release
-             rel=N   Number of async ops released (should equal ini=N when idle)
-             gc=N    Number of deferred-release async ops garbage collected
-     CacheOp alo=N   Number of in-progress alloc_object() cache ops
-             luo=N   Number of in-progress lookup_object() cache ops
-             luc=N   Number of in-progress lookup_complete() cache ops
-             gro=N   Number of in-progress grab_object() cache ops
-             upo=N   Number of in-progress update_object() cache ops
-             dro=N   Number of in-progress drop_object() cache ops
-             pto=N   Number of in-progress put_object() cache ops
-             syn=N   Number of in-progress sync_cache() cache ops
-             atc=N   Number of in-progress attr_changed() cache ops
-             rap=N   Number of in-progress read_or_alloc_page() cache ops
-             ras=N   Number of in-progress read_or_alloc_pages() cache ops
-             alp=N   Number of in-progress allocate_page() cache ops
-             als=N   Number of in-progress allocate_pages() cache ops
-             wrp=N   Number of in-progress write_page() cache ops
-             ucp=N   Number of in-progress uncache_page() cache ops
-             dsp=N   Number of in-progress dissociate_pages() cache ops
-     CacheEv nsp=N   Number of object lookups/creations rejected due to lack of space
-             stl=N   Number of stale objects deleted
-             rtr=N   Number of objects retired when relinquished
-             cul=N   Number of objects culled
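- The counters lend themselves to scripted inspection. The sketch below parses
- text in an assumed "Class : key=N key=N ..." line layout into nested
- dictionaries and checks the rel=N/ini=N invariant noted in the table. The
- line layout, the SAMPLE text and both function names are assumptions made
- for this example; check them against the real file on the running kernel.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical sample in the assumed layout; not captured from a real system.
SAMPLE = """\
FS-Cache statistics
Cookies: idx=2 dat=10 spc=0
Ops    : pend=3 run=40 enq=41 can=0 rej=0
Ops    : ini=12 dfr=1 rel=12 gc=1
"""


def parse_fscache_stats(text: str) -> Dict[str, Dict[str, int]]:
    """Parse 'Class : key=N ...' lines into {class: {key: N}}."""
    stats: Dict[str, Dict[str, int]] = defaultdict(dict)
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # title or blank line
        for token in rest.split():
            key, eq, value = token.partition("=")
            if eq and value.isdigit():
                # Several lines may share a class label; merge their counters.
                stats[label.strip()][key] = int(value)
    return dict(stats)


def ops_balanced(stats: Dict[str, Dict[str, int]]) -> bool:
    """True if every initialised async op has been released (rel == ini)."""
    ops = stats.get("Ops", {})
    return "ini" in ops and ops["ini"] == ops.get("rel")


if __name__ == "__main__":
    parsed = parse_fscache_stats(SAMPLE)
    print(parsed["Cookies"], ops_balanced(parsed))
```

- On a live system the text would be read from /proc/fs/fscache/stats instead
- of SAMPLE.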
- (*) /proc/fs/fscache/histogram
- cat /proc/fs/fscache/histogram
- JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS
- ===== ===== ========= ========= ========= ========= =========
- This shows, for each of a variety of tasks, how many times the task took a
- given length of time to run, from 0 jiffies up to HZ-1 jiffies. The
- columns are as follows:
-     COLUMN    TIME MEASUREMENT
-     ========= =======================================================
-     OBJ INST  Length of time to instantiate an object
-     OP RUNS   Length of time a call to process an operation took
-     OBJ RUNS  Length of time a call to process an object event took
-     RETRV DLY Time between requesting a read and lookup completing
-     RETRIEVLS Time between beginning and end of a retrieval
- Each row shows the number of events that took a particular range of times.
- Each step is 1 jiffy in size. The JIFS column indicates the particular
- jiffy range covered, and the SECS field the equivalent number of seconds.
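- The relationship between the JIFS and SECS columns is a division by HZ. A
- minimal sketch follows; the helper names are hypothetical and the HZ value
- must be supplied by the caller, since it depends on how the kernel was
- configured.

```python
from typing import Dict


def jiffies_to_secs(jiffies: int, hz: int) -> float:
    """Convert a JIFS bucket to seconds; hz is the kernel's HZ (ticks per second)."""
    if hz <= 0:
        raise ValueError("hz must be positive")
    return jiffies / hz


def mean_secs(bucket_counts: Dict[int, int], hz: int) -> float:
    """Approximate mean duration for one column, given {jiffies: event_count}."""
    total = sum(bucket_counts.values())
    if total == 0:
        return 0.0
    weighted = sum(jiffies * count for jiffies, count in bucket_counts.items())
    return jiffies_to_secs(weighted, hz) / total


if __name__ == "__main__":
    # With HZ=1000 a 250-jiffy bucket corresponds to a quarter of a second.
    print(jiffies_to_secs(250, 1000))
```

- Because each bucket is one jiffy wide, the mean computed this way is only
- accurate to within a jiffy.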
- ===========
- OBJECT LIST
- ===========
- If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
- list of all the objects currently allocated and allow them to be viewed
- through:
- /proc/fs/fscache/objects
- This will look something like:
- [root@andromeda ~]# head /proc/fs/fscache/objects
- OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
- ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
- 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
- 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
- where the first set of columns before the '|' describe the object:
-     COLUMN  DESCRIPTION
-     ======= ===============================================================
-     OBJECT  Object debugging ID (appears as OBJ%x in some debug messages)
-     PARENT  Debugging ID of parent object
-     STAT    Object state
-     CHLDN   Number of child objects of this object
-     OPS     Number of outstanding operations on this object
-     OOP     Number of outstanding child object management operations
-     IPR
-     EX      Number of outstanding exclusive operations
-     READS   Number of outstanding read operations
-     EM      Object's event mask
-     EV      Events raised on this object
-     F       Object flags
-     S       Object work item busy state mask (1:pending 2:running)
- and the second set of columns describe the object's cookie, if present:
-     COLUMN           DESCRIPTION
-     ================ =======================================================
-     NETFS_COOKIE_DEF Name of netfs cookie definition
-     TY               Cookie type (IX - index, DT - data, hex - special)
-     FL               Cookie flags
-     NETFS_DATA       Netfs private data stored in the cookie
-     OBJECT_KEY       Object key      } 1 column, with separating comma
-     AUX_DATA         Object aux data } presence may be configured
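- A row of this listing can be split mechanically at the '|' separator. The
- sketch below does so using the column names from the two tables above;
- parse_object_row() is a hypothetical helper, and it assumes whitespace
- separated fields with the key and aux data joined by a comma, as in the
- sample output.

```python
from typing import Dict, Optional, Tuple

OBJECT_COLUMNS = ["OBJECT", "PARENT", "STAT", "CHLDN", "OPS", "OOP", "IPR",
                  "EX", "READS", "EM", "EV", "F", "S"]
COOKIE_COLUMNS = ["NETFS_COOKIE_DEF", "TY", "FL", "NETFS_DATA"]


def parse_object_row(line: str) -> Tuple[Dict[str, str], Optional[Dict[str, str]]]:
    """Split one data row into (object fields, cookie fields or None)."""
    left, sep, right = line.partition("|")
    obj = dict(zip(OBJECT_COLUMNS, left.split()))
    cookie = None
    if sep and right.strip():
        # The first four fields are fixed; the remainder is "key, aux".
        fields = right.split(None, len(COOKIE_COLUMNS))
        cookie = dict(zip(COOKIE_COLUMNS, fields))
        if len(fields) > len(COOKIE_COLUMNS):
            key, _, aux = fields[-1].partition(",")
            cookie["OBJECT_KEY"] = key.strip()
            cookie["AUX_DATA"] = aux.strip()
    return obj, cookie
```

- All values are kept as strings, since several columns (EM, EV, the key and
- the aux data) are printed in hexadecimal.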
- The data shown may be filtered by attaching a key to an appropriate keyring
- before viewing the file. Something like:
- keyctl add user fscache:objlist <restrictions> @s
- where <restrictions> are a selection of the following letters:
- K Show hexdump of object key (don't show if not given)
- A Show hexdump of object aux data (don't show if not given)
- and the following paired letters:
- C Show objects that have a cookie
- c Show objects that don't have a cookie
- B Show objects that are busy
- b Show objects that aren't busy
- W Show objects that have pending writes
- w Show objects that don't have pending writes
- R Show objects that have outstanding reads
- r Show objects that don't have outstanding reads
- S Show objects that have work queued
- s Show objects that don't have work queued
- If neither side of a letter pair is given, then both are implied. For example:
- keyctl add user fscache:objlist KB @s
- shows objects that are busy, and lists their object keys, but does not dump
- their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
- not implied.
- By default all objects and all fields will be shown.
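- The pairing rule can be made concrete with a small model of how a
- restriction string expands once a key is present. expand_restrictions() is a
- hypothetical helper that mirrors the rules described above; it is not how
- the kernel parses the key.

```python
FIELD_LETTERS = "KA"                         # shown only if given
LETTER_PAIRS = ["Cc", "Bb", "Ww", "Rr", "Ss"]  # both sides implied if neither given


def expand_restrictions(spec: str) -> set:
    """Return the set of letters that are effectively in force for `spec`."""
    known = set(FIELD_LETTERS) | set("".join(LETTER_PAIRS))
    unknown = set(spec) - known
    if unknown:
        raise ValueError(f"unknown restriction letters: {sorted(unknown)}")
    effective = {ch for ch in spec if ch in FIELD_LETTERS}
    for pair in LETTER_PAIRS:
        given = {ch for ch in pair if ch in spec}
        # If neither side of a pair is given, both are implied.
        effective |= given if given else set(pair)
    return effective


if __name__ == "__main__":
    print(sorted(expand_restrictions("KB")))
```

- For "KB" the result contains K, B and all of CcWwRrSs, but neither A nor b,
- matching the worked example above.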
- =========
- DEBUGGING
- =========
- If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime
- debugging enabled by adjusting the value in:
- /sys/module/fscache/parameters/debug
- This is a bitmask of debugging streams to enable:
-     BIT     VALUE   STREAM                          POINT
-     ======= ======= =============================== =======================
-     0       1       Cache management                Function entry trace
-     1       2                                       Function exit trace
-     2       4                                       General
-     3       8       Cookie management               Function entry trace
-     4       16                                      Function exit trace
-     5       32                                      General
-     6       64      Page handling                   Function entry trace
-     7       128                                     Function exit trace
-     8       256                                     General
-     9       512     Operation management            Function entry trace
-     10      1024                                    Function exit trace
-     11      2048                                    General
- The appropriate set of values should be OR'd together and the result written to
- the control file. For example:
- echo $((1|8|64|512)) >/sys/module/fscache/parameters/debug
- will turn on all function entry debugging.
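- The bit assignments follow a regular pattern: each stream occupies three
- consecutive bits, in the order entry, exit, general. A minimal sketch that
- builds a mask from that pattern follows; the short stream and point names
- are labels invented for this example, not kernel identifiers.

```python
from typing import Iterable, Tuple

STREAMS = ["cache", "cookie", "page", "operation"]  # order as in the table
POINTS = ["entry", "exit", "general"]


def debug_bit(stream: str, point: str) -> int:
    """Return the bit number for a (stream, point) pair."""
    return STREAMS.index(stream) * len(POINTS) + POINTS.index(point)


def debug_mask(selections: Iterable[Tuple[str, str]]) -> int:
    """OR together the values for the selected (stream, point) pairs."""
    mask = 0
    for stream, point in selections:
        mask |= 1 << debug_bit(stream, point)
    return mask


if __name__ == "__main__":
    # Function entry tracing for every stream.
    print(debug_mask((stream, "entry") for stream in STREAMS))
```

- The resulting integer is the value that would be written to
- /sys/module/fscache/parameters/debug.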