123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270 |
- NILFS2
- ------
- NILFS2 is a log-structured file system (LFS) supporting continuous
- snapshotting. In addition to versioning capability of the entire file
- system, users can even restore files mistakenly overwritten or
- destroyed just a few seconds ago. Since NILFS2 can keep consistency
- like conventional LFS, it achieves quick recovery after system
- crashes.
- NILFS2 creates a number of checkpoints every few seconds or per
- synchronous write basis (unless there is no change). Users can select
- significant versions among continuously created checkpoints, and can
- change them into snapshots which will be preserved until they are
- changed back to checkpoints.
- There is no limit on the number of snapshots until the volume gets
- full. Each snapshot is mountable as a read-only file system
- concurrently with its writable mount, and this feature is convenient
- for online backup.
- The userland tools are included in nilfs-utils package, which is
- available from the following download page. At least "mkfs.nilfs2",
- "mount.nilfs2", "umount.nilfs2", and "nilfs_cleanerd" (so called
- cleaner or garbage collector) are required. Details on the tools are
- described in the man pages included in the package.
- Project web page: http://nilfs.sourceforge.net/
- Download page: http://nilfs.sourceforge.net/en/download.html
- List info: http://vger.kernel.org/vger-lists.html#linux-nilfs
- Caveats
- =======
- Features which NILFS2 does not support yet:
- - atime
- - extended attributes
- - POSIX ACLs
- - quotas
- - fsck
- - defragmentation
- Mount options
- =============
- NILFS2 supports the following mount options:
- (*) == default
- barrier(*) This enables/disables the use of write barriers. This
- nobarrier requires an IO stack which can support barriers, and
- if nilfs gets an error on a barrier write, it will
- disable again with a warning.
- errors=continue Keep going on a filesystem error.
- errors=remount-ro(*) Remount the filesystem read-only on an error.
- errors=panic Panic and halt the machine if an error occurs.
- cp=n Specify the checkpoint-number of the snapshot to be
- mounted. Checkpoints and snapshots are listed by lscp
- user command. Only the checkpoints marked as snapshot
- are mountable with this option. Snapshot is read-only,
- so a read-only mount option must be specified together.
- order=relaxed(*) Apply relaxed order semantics that allows modified data
- blocks to be written to disk without making a
- checkpoint if no metadata update is going. This mode
- is equivalent to the ordered data mode of the ext3
- filesystem except for the updates on data blocks still
- conserve atomicity. This will improve synchronous
- write performance for overwriting.
- order=strict Apply strict in-order semantics that preserves sequence
- of all file operations including overwriting of data
- blocks. That means, it is guaranteed that no
- overtaking of events occurs in the recovered file
- system after a crash.
- norecovery Disable recovery of the filesystem on mount.
- This disables every write access on the device for
- read-only mounts or snapshots. This option will fail
- for r/w mounts on an unclean volume.
- discard This enables/disables the use of discard/TRIM commands.
- nodiscard(*) The discard/TRIM commands are sent to the underlying
- block device when blocks are freed. This is useful
- for SSD devices and sparse/thinly-provisioned LUNs.
- Ioctls
- ======
- There is some NILFS2 specific functionality which can be accessed by applications
- through the system call interfaces. The list of all NILFS2 specific ioctls are
- shown in the table below.
- Table of NILFS2 specific ioctls
- ..............................................................................
- Ioctl Description
- NILFS_IOCTL_CHANGE_CPMODE Change mode of given checkpoint between
- checkpoint and snapshot state. This ioctl is
- used in chcp and mkcp utilities.
- NILFS_IOCTL_DELETE_CHECKPOINT Remove checkpoint from NILFS2 file system.
- This ioctl is used in rmcp utility.
- NILFS_IOCTL_GET_CPINFO Return info about requested checkpoints. This
- ioctl is used in lscp utility and by
- nilfs_cleanerd daemon.
- NILFS_IOCTL_GET_CPSTAT Return checkpoints statistics. This ioctl is
- used by lscp, rmcp utilities and by
- nilfs_cleanerd daemon.
- NILFS_IOCTL_GET_SUINFO Return segment usage info about requested
- segments. This ioctl is used in lssu,
- nilfs_resize utilities and by nilfs_cleanerd
- daemon.
- NILFS_IOCTL_SET_SUINFO Modify segment usage info of requested
- segments. This ioctl is used by
- nilfs_cleanerd daemon to skip unnecessary
- cleaning operation of segments and reduce
- performance penalty or wear of flash device
- due to redundant move of in-use blocks.
- NILFS_IOCTL_GET_SUSTAT Return segment usage statistics. This ioctl
- is used in lssu, nilfs_resize utilities and
- by nilfs_cleanerd daemon.
- NILFS_IOCTL_GET_VINFO Return information on virtual block addresses.
- This ioctl is used by nilfs_cleanerd daemon.
- NILFS_IOCTL_GET_BDESCS Return information about descriptors of disk
- block numbers. This ioctl is used by
- nilfs_cleanerd daemon.
- NILFS_IOCTL_CLEAN_SEGMENTS Do garbage collection operation in the
- environment of requested parameters from
- userspace. This ioctl is used by
- nilfs_cleanerd daemon.
- NILFS_IOCTL_SYNC Make a checkpoint. This ioctl is used in
- mkcp utility.
- NILFS_IOCTL_RESIZE Resize NILFS2 volume. This ioctl is used
- by nilfs_resize utility.
- NILFS_IOCTL_SET_ALLOC_RANGE Define lower limit of segments in bytes and
- upper limit of segments in bytes. This ioctl
- is used by nilfs_resize utility.
- NILFS2 usage
- ============
- To use nilfs2 as a local file system, simply:
- # mkfs -t nilfs2 /dev/block_device
- # mount -t nilfs2 /dev/block_device /dir
- This will also invoke the cleaner through the mount helper program
- (mount.nilfs2).
- Checkpoints and snapshots are managed by the following commands.
- Their manpages are included in the nilfs-utils package above.
- lscp list checkpoints or snapshots.
- mkcp make a checkpoint or a snapshot.
- chcp change an existing checkpoint to a snapshot or vice versa.
- rmcp invalidate specified checkpoint(s).
- To mount a snapshot,
- # mount -t nilfs2 -r -o cp=<cno> /dev/block_device /snap_dir
- where <cno> is the checkpoint number of the snapshot.
- To unmount the NILFS2 mount point or snapshot, simply:
- # umount /dir
- Then, the cleaner daemon is automatically shut down by the umount
- helper program (umount.nilfs2).
- Disk format
- ===========
- A nilfs2 volume is equally divided into a number of segments except
- for the super block (SB) and segment #0. A segment is the container
- of logs. Each log is composed of summary information blocks, payload
- blocks, and an optional super root block (SR):
- ______________________________________________________
- | |SB| | Segment | Segment | Segment | ... | Segment | |
- |_|__|_|____0____|____1____|____2____|_____|____N____|_|
- 0 +1K +4K +8M +16M +24M +(8MB x N)
- . . (Typical offsets for 4KB-block)
- . .
- .______________________.
- | log | log |... | log |
- |__1__|__2__|____|__m__|
- . .
- . .
- . .
- .______________________________.
- | Summary | Payload blocks |SR|
- |_blocks__|_________________|__|
- The payload blocks are organized per file, and each file consists of
- data blocks and B-tree node blocks:
- |<--- File-A --->|<--- File-B --->|
- _______________________________________________________________
- | Data blocks | B-tree blocks | Data blocks | B-tree blocks | ...
- _|_____________|_______________|_____________|_______________|_
- Since only the modified blocks are written in the log, it may have
- files without data blocks or B-tree node blocks.
- The organization of the blocks is recorded in the summary information
- blocks, which contains a header structure (nilfs_segment_summary), per
- file structures (nilfs_finfo), and per block structures (nilfs_binfo):
- _________________________________________________________________________
- | Summary | finfo | binfo | ... | binfo | finfo | binfo | ... | binfo |...
- |_blocks__|___A___|_(A,1)_|_____|(A,Na)_|___B___|_(B,1)_|_____|(B,Nb)_|___
- The logs include regular files, directory files, symbolic link files
- and several meta data files. The mata data files are the files used
- to maintain file system meta data. The current version of NILFS2 uses
- the following meta data files:
- 1) Inode file (ifile) -- Stores on-disk inodes
- 2) Checkpoint file (cpfile) -- Stores checkpoints
- 3) Segment usage file (sufile) -- Stores allocation state of segments
- 4) Data address translation file -- Maps virtual block numbers to usual
- (DAT) block numbers. This file serves to
- make on-disk blocks relocatable.
- The following figure shows a typical organization of the logs:
- _________________________________________________________________________
- | Summary | regular file | file | ... | ifile | cpfile | sufile | DAT |SR|
- |_blocks__|_or_directory_|_______|_____|_______|________|________|_____|__|
- To stride over segment boundaries, this sequence of files may be split
- into multiple logs. The sequence of logs that should be treated as
- logically one log, is delimited with flags marked in the segment
- summary. The recovery code of nilfs2 looks this boundary information
- to ensure atomicity of updates.
- The super root block is inserted for every checkpoints. It includes
- three special inodes, inodes for the DAT, cpfile, and sufile. Inodes
- of regular files, directories, symlinks and other special files, are
- included in the ifile. The inode of ifile itself is included in the
- corresponding checkpoint entry in the cpfile. Thus, the hierarchy
- among NILFS2 files can be depicted as follows:
- Super block (SB)
- |
- v
- Super root block (the latest cno=xx)
- |-- DAT
- |-- sufile
- `-- cpfile
- |-- ifile (cno=c1)
- |-- ifile (cno=c2) ---- file (ino=i1)
- : : |-- file (ino=i2)
- `-- ifile (cno=xx) |-- file (ino=i3)
- : :
- `-- file (ino=yy)
- ( regular file, directory, or symlink )
- For detail on the format of each file, please see include/linux/nilfs2_fs.h.
|