123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229 |
- ============
- Fiemap Ioctl
- ============
- The fiemap ioctl is an efficient method for userspace to get file
- extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
- returns a list of extents.
- Request Basics
- --------------
- A fiemap request is encoded within struct fiemap:
- struct fiemap {
- __u64 fm_start; /* logical offset (inclusive) at
- * which to start mapping (in) */
- __u64 fm_length; /* logical length of mapping which
- * userspace cares about (in) */
- __u32 fm_flags; /* FIEMAP_FLAG_* flags for request (in/out) */
- __u32 fm_mapped_extents; /* number of extents that were
- * mapped (out) */
- __u32 fm_extent_count; /* size of fm_extents array (in) */
- __u32 fm_reserved;
- struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
- };
- fm_start, and fm_length specify the logical range within the file
- which the process would like mappings for. Extents returned mirror
- those on disk - that is, the logical offset of the 1st returned extent
- may start before fm_start, and the range covered by the last returned
- extent may end after fm_length. All offsets and lengths are in bytes.
- Certain flags to modify the way in which mappings are looked up can be
- set in fm_flags. If the kernel doesn't understand some particular
- flags, it will return EBADR and the contents of fm_flags will contain
- the set of flags which caused the error. If the kernel is compatible
- with all flags passed, the contents of fm_flags will be unmodified.
- It is up to userspace to determine whether rejection of a particular
- flag is fatal to its operation. This scheme is intended to allow the
- fiemap interface to grow in the future but without losing
- compatibility with old software.
- fm_extent_count specifies the number of elements in the fm_extents[] array
- that can be used to return extents. If fm_extent_count is zero, then the
- fm_extents[] array is ignored (no extents will be returned), and the
- fm_mapped_extents count will hold the number of extents needed in
- fm_extents[] to hold the file's current mapping. Note that there is
- nothing to prevent the file from changing between calls to FIEMAP.
- The following flags can be set in fm_flags:
- * FIEMAP_FLAG_SYNC
- If this flag is set, the kernel will sync the file before mapping extents.
- * FIEMAP_FLAG_XATTR
- If this flag is set, the extents returned will describe the inodes
- extended attribute lookup tree, instead of its data tree.
- Extent Mapping
- --------------
- Extent information is returned within the embedded fm_extents array
- which userspace must allocate along with the fiemap structure. The
- number of elements in the fiemap_extents[] array should be passed via
- fm_extent_count. The number of extents mapped by kernel will be
- returned via fm_mapped_extents. If the number of fiemap_extents
- allocated is less than would be required to map the requested range,
- the maximum number of extents that can be mapped in the fm_extent[]
- array will be returned and fm_mapped_extents will be equal to
- fm_extent_count. In that case, the last extent in the array will not
- complete the requested range and will not have the FIEMAP_EXTENT_LAST
- flag set (see the next section on extent flags).
- Each extent is described by a single fiemap_extent structure as
- returned in fm_extents.
- struct fiemap_extent {
- __u64 fe_logical; /* logical offset in bytes for the start of
- * the extent */
- __u64 fe_physical; /* physical offset in bytes for the start
- * of the extent */
- __u64 fe_length; /* length in bytes for the extent */
- __u64 fe_reserved64[2];
- __u32 fe_flags; /* FIEMAP_EXTENT_* flags for this extent */
- __u32 fe_reserved[3];
- };
- All offsets and lengths are in bytes and mirror those on disk. It is valid
- for an extents logical offset to start before the request or its logical
- length to extend past the request. Unless FIEMAP_EXTENT_NOT_ALIGNED is
- returned, fe_logical, fe_physical, and fe_length will be aligned to the
- block size of the file system. With the exception of extents flagged as
- FIEMAP_EXTENT_MERGED, adjacent extents will not be merged.
- The fe_flags field contains flags which describe the extent returned.
- A special flag, FIEMAP_EXTENT_LAST is always set on the last extent in
- the file so that the process making fiemap calls can determine when no
- more extents are available, without having to call the ioctl again.
- Some flags are intentionally vague and will always be set in the
- presence of other more specific flags. This way a program looking for
- a general property does not have to know all existing and future flags
- which imply that property.
- For example, if FIEMAP_EXTENT_DATA_INLINE or FIEMAP_EXTENT_DATA_TAIL
- are set, FIEMAP_EXTENT_NOT_ALIGNED will also be set. A program looking
- for inline or tail-packed data can key on the specific flag. Software
- which simply cares not to try operating on non-aligned extents
- however, can just key on FIEMAP_EXTENT_NOT_ALIGNED, and not have to
- worry about all present and future flags which might imply unaligned
- data. Note that the opposite is not true - it would be valid for
- FIEMAP_EXTENT_NOT_ALIGNED to appear alone.
- * FIEMAP_EXTENT_LAST
- This is the last extent in the file. A mapping attempt past this
- extent will return nothing.
- * FIEMAP_EXTENT_UNKNOWN
- The location of this extent is currently unknown. This may indicate
- the data is stored on an inaccessible volume or that no storage has
- been allocated for the file yet.
- * FIEMAP_EXTENT_DELALLOC
- - This will also set FIEMAP_EXTENT_UNKNOWN.
- Delayed allocation - while there is data for this extent, its
- physical location has not been allocated yet.
- * FIEMAP_EXTENT_ENCODED
- This extent does not consist of plain filesystem blocks but is
- encoded (e.g. encrypted or compressed). Reading the data in this
- extent via I/O to the block device will have undefined results.
- Note that it is *always* undefined to try to update the data
- in-place by writing to the indicated location without the
- assistance of the filesystem, or to access the data using the
- information returned by the FIEMAP interface while the filesystem
- is mounted. In other words, user applications may only read the
- extent data via I/O to the block device while the filesystem is
- unmounted, and then only if the FIEMAP_EXTENT_ENCODED flag is
- clear; user applications must not try reading or writing to the
- filesystem via the block device under any other circumstances.
- * FIEMAP_EXTENT_DATA_ENCRYPTED
- - This will also set FIEMAP_EXTENT_ENCODED
- The data in this extent has been encrypted by the file system.
- * FIEMAP_EXTENT_NOT_ALIGNED
- Extent offsets and length are not guaranteed to be block aligned.
- * FIEMAP_EXTENT_DATA_INLINE
- This will also set FIEMAP_EXTENT_NOT_ALIGNED
- Data is located within a meta data block.
- * FIEMAP_EXTENT_DATA_TAIL
- This will also set FIEMAP_EXTENT_NOT_ALIGNED
- Data is packed into a block with data from other files.
- * FIEMAP_EXTENT_UNWRITTEN
- Unwritten extent - the extent is allocated but its data has not been
- initialized. This indicates the extent's data will be all zero if read
- through the filesystem but the contents are undefined if read directly from
- the device.
- * FIEMAP_EXTENT_MERGED
- This will be set when a file does not support extents, i.e., it uses a block
- based addressing scheme. Since returning an extent for each block back to
- userspace would be highly inefficient, the kernel will try to merge most
- adjacent blocks into 'extents'.
- VFS -> File System Implementation
- ---------------------------------
- File systems wishing to support fiemap must implement a ->fiemap callback on
- their inode_operations structure. The fs ->fiemap call is responsible for
- defining its set of supported fiemap flags, and calling a helper function on
- each discovered extent:
- struct inode_operations {
- ...
- int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
- u64 len);
- ->fiemap is passed struct fiemap_extent_info which describes the
- fiemap request:
- struct fiemap_extent_info {
- unsigned int fi_flags; /* Flags as passed from user */
- unsigned int fi_extents_mapped; /* Number of mapped extents */
- unsigned int fi_extents_max; /* Size of fiemap_extent array */
- struct fiemap_extent *fi_extents_start; /* Start of fiemap_extent array */
- };
- It is intended that the file system should not need to access any of this
- structure directly. Filesystem handlers should be tolerant to signals and return
- EINTR once fatal signal received.
- Flag checking should be done at the beginning of the ->fiemap callback via the
- fiemap_check_flags() helper:
- int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags);
- The struct fieinfo should be passed in as received from ioctl_fiemap(). The
- set of fiemap flags which the fs understands should be passed via fs_flags. If
- fiemap_check_flags finds invalid user flags, it will place the bad values in
- fieinfo->fi_flags and return -EBADR. If the file system gets -EBADR, from
- fiemap_check_flags(), it should immediately exit, returning that error back to
- ioctl_fiemap().
- For each extent in the request range, the file system should call
- the helper function, fiemap_fill_next_extent():
- int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
- u64 phys, u64 len, u32 flags, u32 dev);
- fiemap_fill_next_extent() will use the passed values to populate the
- next free extent in the fm_extents array. 'General' extent flags will
- automatically be set from specific flags on behalf of the calling file
- system so that the userspace API is not broken.
- fiemap_fill_next_extent() returns 0 on success, and 1 when the
- user-supplied fm_extents array is full. If an error is encountered
- while copying the extent to user memory, -EFAULT will be returned.
|