.TH "LVMCACHE" "7" "LVM TOOLS #VERSION#" "Red Hat, Inc" "\""
.
.SH NAME
.
lvmcache \(em LVM caching
.
.SH DESCRIPTION
.
\fBlvm\fP(8) includes two kinds of caching that can be used to improve the
performance of a Logical Volume (LV). When caching, varying subsets of an
LV's data are temporarily stored on a smaller, faster device (e.g. an SSD)
to improve the performance of the LV.
.P
To do this with lvm, a new special LV is first created from the faster
device. This LV will hold the cache. Then, the new fast LV is attached to
the main LV by way of an lvconvert command. lvconvert inserts one of the
device mapper caching targets into the main LV's i/o path. The device
mapper target combines the main LV and fast LV into a hybrid device that
looks like the main LV, but has better performance. While the main LV is
being used, portions of its data will be temporarily and transparently
stored on the special fast LV.
.P
The two kinds of caching are:
.P
.IP \[bu] 2
A read and write hot-spot cache, using the dm-cache kernel module.
This cache tracks access patterns and adjusts its content deliberately so
that commonly used parts of the main LV are likely to be found on the fast
storage. LVM refers to this using the LV type \fBcache\fP.
.
.IP \[bu]
A write cache, using the dm-writecache kernel module. This cache can be
used with SSD or PMEM devices to speed up all writes to the main LV. Data
read from the main LV is not stored in the cache, only newly written data.
LVM refers to this using the LV type \fBwritecache\fP.
.
.SH USAGE
.
.SS 1. Identify main LV that needs caching
.
The main LV may already exist, and is located on larger, slower devices.
A main LV would be created with a command like:
.P
# lvcreate -n main -L Size vg /dev/slow_hdd
.
.SS 2. Identify fast LV to use as the cache
.
A fast LV is created using one or more fast devices, like an SSD. This
special LV will be used to hold the cache:
.P
# lvcreate -n fast -L Size vg /dev/fast_ssd
.P
# lvs -a
.nf
  LV   Attr       Type   Devices
  fast -wi------- linear /dev/fast_ssd
  main -wi------- linear /dev/slow_hdd
.fi
.
.SS 3. Start caching the main LV
.
To start caching the main LV, convert the main LV to the desired caching
type, and specify the fast LV to use as the cache:
.P
using dm-cache (with cachepool):
.P
# lvconvert --type cache --cachepool fast vg/main
.P
using dm-cache (with cachevol):
.P
# lvconvert --type cache --cachevol fast vg/main
.P
using dm-writecache (with cachevol):
.P
# lvconvert --type writecache --cachevol fast vg/main
.P
For more alternatives see:
.br
dm-cache command shortcut
.br
dm-cache with separate data and metadata LVs
.
.SS 4. Display LVs
.
Once the fast LV has been attached to the main LV, lvm reports the main LV
type as either \fBcache\fP or \fBwritecache\fP depending on the type used.
While attached, the fast LV is hidden, and renamed with a _cvol or _cpool
suffix. It is displayed by lvs -a. The _corig or _wcorig LV represents
the original LV without the cache.
.sp
using dm-cache (with cachepool):
.P
# lvs -ao+devices
.nf
  LV                 Pool          Type       Devices
  main               [fast_cpool]  cache      main_corig(0)
  [fast_cpool]                     cache-pool fast_cpool_cdata(0)
  [fast_cpool_cdata]               linear     /dev/fast_ssd
  [fast_cpool_cmeta]               linear     /dev/fast_ssd
  [main_corig]                     linear     /dev/slow_hdd
.fi
.sp
using dm-cache (with cachevol):
.P
# lvs -ao+devices
.P
.nf
  LV           Pool         Type   Devices
  main         [fast_cvol]  cache  main_corig(0)
  [fast_cvol]               linear /dev/fast_ssd
  [main_corig]              linear /dev/slow_hdd
.fi
.sp
using dm-writecache (with cachevol):
.P
# lvs -ao+devices
.P
.nf
  LV            Pool         Type       Devices
  main          [fast_cvol]  writecache main_wcorig(0)
  [fast_cvol]                linear     /dev/fast_ssd
  [main_wcorig]              linear     /dev/slow_hdd
.fi
.
.SS 5. Use the main LV
.
Use the LV until the cache is no longer wanted, or needs to be changed.
.
.SS 6. Stop caching
.
To stop caching the main LV and also remove the unneeded cache pool, use
the --uncache option:
.P
# lvconvert --uncache vg/main
.P
# lvs -a
.nf
  LV   VG Attr       Type   Devices
  main vg -wi------- linear /dev/slow_hdd
.fi
.P
To stop caching the main LV and keep the fast LV, separate the fast LV
from the main LV. This changes the type of the main LV back to what it
was before the cache was attached.
.P
# lvconvert --splitcache vg/main
.P
# lvs -a
.nf
  LV   VG Attr       Type   Devices
  fast vg -wi------- linear /dev/fast_ssd
  main vg -wi------- linear /dev/slow_hdd
.fi
.
.SS 7. Create a new LV with caching
.
A new LV can be created with caching attached at the time of creation
using the following command:
.P
.nf
# lvcreate --type cache|writecache -n Name -L Size
      --cachedevice /dev/fast_ssd vg /dev/slow_hdd
.fi
.P
The main LV is created with the specified Name and Size from the slow_hdd.
A hidden fast LV is created on the fast_ssd and is then attached to the
new main LV. If the fast_ssd is unused, the entire disk will be used as
the cache unless the --cachesize option is used to specify a size for the
fast LV. The --cachedevice option can be repeated to use multiple disks
for the fast LV.
.
.SH OPTIONS
.
.SS option args
.
.B --cachepool
.IR CachePoolLV | LV
.P
Pass this option a cachepool LV or a standard LV. When using a cache
pool, lvm places cache data and cache metadata on different LVs. The two
LVs together are called a cache pool. This gives slightly better
performance for dm-cache and permits specific placement and segment type
selection for the data and metadata volumes.
A cache pool is represented as a special type of LV
that cannot be used directly. If a standard LV is passed with this
option, lvm will first convert it to a cache pool by combining it with
another LV to use for metadata. This option can be used with dm-cache.
.P
.B --cachevol
.I LV
.P
Pass this option a fast LV that should be used to hold the cache. With a
cachevol, cache data and metadata are stored in different parts of the
same fast LV. This option can be used with dm-writecache or dm-cache.
.P
.B --cachedevice
.I PV
.P
This option can be used in place of --cachevol, in which case a cachevol
LV will be created using the specified device. This option can be
repeated to create a cachevol using multiple devices, or a tag name can be
specified in which case the cachevol will be created using any of the
devices with the given tag. If a named cache device is unused, the entire
device will be used to create the cachevol. To create a cachevol of a
specific size from the cache devices, include the --cachesize option.
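.P
For example, assuming /dev/fast_ssd is unused, a cachevol of a chosen size
can be created on it and attached in a single step with a command like the
following (the 10G size is only illustrative):
.P
.nf
# lvconvert --type writecache --cachedevice /dev/fast_ssd \\
      --cachesize 10G vg/main
.fi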
.
.SS dm-cache block size
.
A cache pool will have a logical block size of 4096 bytes if it is created
on a device with a logical block size of 4096 bytes.
.P
If a main LV has logical block size 512 (with an existing xfs file system
using that size), then it cannot use a cache pool with a 4096 logical
block size. If the cache pool is attached, the main LV will likely fail
to mount.
.P
To avoid this problem, use a mkfs option to specify a 4096 block size for
the file system, or attach the cache pool before running mkfs.
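.P
For example, with xfs, a 4096 byte sector size can be requested when the
file system is created (the device name here is only illustrative):
.P
# mkfs.xfs -s size=4096 /dev/vg/main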
.
.SS dm-writecache block size
.
The dm-writecache block size can be 4096 bytes (the default), or 512
bytes. The default 4096 has better performance and should be used except
when 512 is necessary for compatibility. The dm-writecache block size is
specified with --cachesettings block_size=4096|512 when caching is started.
.P
When a file system like xfs already exists on the main LV prior to
caching, and the file system is using a block size of 512, then the
writecache block size should be set to 512. (The file system will likely
fail to mount if a writecache block size of 4096 is used in this case.)
.P
Check the xfs sector size while the fs is mounted:
.P
# xfs_info /dev/vg/main
.nf
Look for sectsz=512 or sectsz=4096
.fi
.P
The writecache block size should be chosen to match the xfs sectsz value.
.P
It is also possible to specify a sector size of 4096 to mkfs.xfs when
creating the file system. In this case the writecache block size of 4096
can be used.
.P
The writecache block size is displayed by the command:
.br
lvs -o writecacheblocksize VG/LV
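.P
For example, to start caching with a 512 byte writecache block size
(matching a 512 byte file system sector size), a command like the
following could be used:
.P
.nf
# lvconvert --type writecache --cachevol fast \\
      --cachesettings 'block_size=512' vg/main
.fi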
.
.SS dm-writecache memory usage
.
The amount of main system memory used by dm-writecache can be a factor
when selecting the writecache cachevol size and the writecache block size.
.P
.IP \[bu] 2
writecache block size 4096: each 100 GiB of writecache cachevol uses
slightly over 2 GiB of system memory.
.IP \[bu] 2
writecache block size 512: each 100 GiB of writecache cachevol uses
a little over 16 GiB of system memory.
.
.SS dm-writecache settings
.
To specify dm-writecache tunable settings on the command line, use:
.br
--cachesettings 'option=N' or
.br
--cachesettings 'option1=N option2=N ...'
.P
For example, --cachesettings 'high_watermark=90 writeback_jobs=4'.
.P
To include settings when caching is started, run:
.P
.nf
# lvconvert --type writecache --cachevol fast \\
      --cachesettings 'option=N' vg/main
.fi
.P
To change settings for an existing writecache, run:
.P
.nf
# lvchange --cachesettings 'option=N' vg/main
.fi
.P
To clear all settings that have been applied, run:
.P
.nf
# lvchange --cachesettings '' vg/main
.fi
.P
To view the settings that are applied to a writecache LV, run:
.P
.nf
# lvs -o cachesettings vg/main
.fi
.P
Tunable settings are:
.
.TP
high_watermark = <percent>
Start writeback when the writecache usage reaches this percent (0-100).
.
.TP
low_watermark = <percent>
Stop writeback when the writecache usage reaches this percent (0-100).
.
.TP
writeback_jobs = <count>
Limit the number of blocks that are in flight during writeback. Setting
this value reduces writeback throughput, but it may improve latency of
read requests.
.
.TP
autocommit_blocks = <count>
When the application writes this number of blocks without issuing the
FLUSH request, the blocks are automatically committed.
.
.TP
autocommit_time = <milliseconds>
The data is automatically committed if this time passes and no FLUSH
request is received.
.
.TP
fua = 0|1
Use the FUA flag when writing data from persistent memory back to the
underlying device.
Applicable only to persistent memory.
.
.TP
nofua = 0|1
Don't use the FUA flag when writing back data and send the FLUSH request
afterwards. Some underlying devices perform better with fua, some with
nofua. Testing is necessary to determine which.
Applicable only to persistent memory.
.
.TP
cleaner = 0|1
Setting cleaner=1 enables the writecache cleaner mode in which data is
gradually flushed from the cache. If this is done prior to detaching the
writecache, then the splitcache command will have little or no flushing to
perform. If not done beforehand, the splitcache command enables the
cleaner mode and waits for flushing to complete before detaching the
writecache. Adding cleaner=0 to the splitcache command will skip the
cleaner mode, and any required flushing is performed in device suspend.
.
.TP
max_age = <milliseconds>
Specifies the maximum age of a block in milliseconds. If a block is stored
in the cache for too long, it will be written to the underlying device and
cleaned up.
.
.TP
metadata_only = 0|1
Only metadata is promoted to the cache. This option improves performance
for heavier REQ_META workloads.
.
.TP
pause_writeback = <milliseconds>
Pause writeback if any write I/O was redirected to the origin volume
within the last given number of milliseconds.
.
.SS dm-writecache using metadata profiles
.
In addition to specifying writecache settings on the command line, they
can also be set in lvm.conf, or in a profile file, using the
allocation/cache_settings/writecache config structure shown below.
.P
It's possible to prepare a number of different profile files in the
\fI#DEFAULT_SYS_DIR#/profile\fP directory and specify the file name
of the profile when starting writecache.
.P
.I Example
.nf
# cat <<EOF > #DEFAULT_SYS_DIR#/profile/cache_writecache.profile
allocation {
.RS
cache_settings {
.RS
writecache {
.RS
high_watermark=60
writeback_jobs=1024
.RE
}
.RE
}
.RE
}
EOF
.P
# lvcreate -an -L10G --name fast vg /dev/fast_ssd
# lvcreate --type writecache -L10G --name main --cachevol fast \\
      --metadataprofile cache_writecache vg /dev/slow_hdd
.fi
.
.SS dm-cache with separate data and metadata LVs
.
The preferred way of using dm-cache is to place the cache metadata and
cache data on separate LVs. To do this, a "cache pool" is created, which
is a special LV that references two sub LVs, one for data and one for
metadata.
.P
To create a cache pool of a given data size and let lvm2 calculate an
appropriate metadata size:
.P
# lvcreate --type cache-pool -L DataSize -n fast vg /dev/fast_ssd1
.P
To create a cache pool from a separate LV and let lvm2 calculate an
appropriate cache metadata size:
.P
# lvcreate -n fast -L DataSize vg /dev/fast_ssd1
.br
# lvconvert --type cache-pool vg/fast /dev/fast_ssd1
.P
To create a cache pool from two separate LVs:
.P
# lvcreate -n fast -L DataSize vg /dev/fast_ssd1
.br
# lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
.br
# lvconvert --type cache-pool --poolmetadata fastmeta vg/fast
.P
Then use the cache pool LV to start caching the main LV:
.P
# lvconvert --type cache --cachepool fast vg/main
.P
A variation of the same procedure automatically creates a cache pool when
caching is started. To do this, use a standard LV as the --cachepool
(this will hold cache data), and use another standard LV as the
--poolmetadata (this will hold cache metadata). LVM will create a
cache pool LV from the two specified LVs, and use the cache pool to start
caching the main LV.
.P
.nf
# lvcreate -n fast -L DataSize vg /dev/fast_ssd1
# lvcreate -n fastmeta -L MetadataSize vg /dev/fast_ssd2
# lvconvert --type cache --cachepool fast \\
      --poolmetadata fastmeta vg/main
.fi
.
.SS dm-cache cache modes
.
The default dm-cache cache mode is "writethrough". Writethrough ensures
that any data written will be stored both in the cache and on the origin
LV. The loss of a device associated with the cache in this case would not
mean the loss of any data.
.P
A second cache mode is "writeback". Writeback delays writing data blocks
from the cache back to the origin LV. This mode will increase
performance, but the loss of a cache device can result in lost data.
.P
With the --cachemode option, the cache mode can be set when caching is
started, or changed on an LV that is already cached. The current cache
mode can be displayed with the cache_mode reporting option:
.P
.B lvs -o+cache_mode VG/LV
.P
.BR lvm.conf (5)
.B allocation/cache_mode
.br
defines the default cache mode.
.P
.nf
# lvconvert --type cache --cachemode writethrough \\
      --cachepool fast vg/main
.P
# lvconvert --type cache --cachemode writethrough \\
      --cachevol fast vg/main
.fi
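.P
For example, to switch an already cached LV to writeback mode, a command
like the following could be used:
.P
.B lvchange --cachemode writeback VG/LV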
.
.SS dm-cache chunk size
.
The size of data blocks managed by dm-cache can be specified with the
--chunksize option when caching is started. The default unit is KiB. The
value must be a multiple of 32 KiB between 32 KiB and 1 GiB. Cache chunks
bigger than 512 KiB should only be used when necessary.
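.P
For example, a 256 KiB chunk size (the value here is only illustrative)
could be requested when caching is started with a command like:
.P
.nf
# lvconvert --type cache --cachepool fast --chunksize 256 vg/main
.fi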
.P
Using a chunk size that is too large can result in wasteful use of the
cache, in which small reads and writes cause large sections of an LV to be
stored in the cache. It can also require increasing the migration
threshold, which defaults to 2048 sectors (1 MiB); lvm2 ensures the
migration threshold is at least 8 chunks in size. This may in some cases
result in a very high bandwidth load when transferring data between the
cache LV and its cache origin LV. However, choosing a chunk size that is
too small can result in more overhead trying to manage the numerous chunks
that become mapped into the cache. Overhead can include both excessive
CPU time searching for chunks, and excessive memory tracking chunks.
.P
Command to display the chunk size:
.P
.B lvs -o+chunksize VG/LV
.P
.BR lvm.conf (5)
.B allocation/cache_pool_chunk_size
.br
controls the default chunk size.
.P
The default value is shown by:
.P
.B lvmconfig --type default allocation/cache_pool_chunk_size
.P
Command to display the migration threshold (in sectors) of a running
cached LV:
.br
.B lvs -o+kernel_cache_settings VG/LV
.
.SS dm-cache cache settings
.
To set a dm-cache cache setting, use:
.P
--cachesettings 'option1=N option2=N ...'
.P
To unset a cache setting and restore its default kernel value, use the
special keyword 'default' as the option value:
.P
--cachesettings 'option1=default option2=default ...'
.
.SS dm-cache migration threshold cache setting
.
Migrating data between the origin and cache LV uses bandwidth.
The user can set a throttle to prevent more than a certain amount of
migration occurring at any one time. Currently dm-cache does not take
account of normal i/o traffic going to the devices.
.P
The migration threshold can be set with the cache setting
"migration_threshold=<#sectors>", which specifies the maximum number of
sectors being migrated. The default is 2048 sectors (1 MiB) or 8 cache
chunks, whichever of those two values is larger.
.P
Command to set the migration threshold to 2 MiB (4096 sectors):
.P
.B lvchange --cachesettings 'migration_threshold=4096' VG/LV
.P
Command to display the migration threshold:
.P
.B lvs -o+kernel_cache_settings,cache_settings VG/LV
.br
.B lvs -o+chunksize VG/LV
.P
Command to restore/revert to the default value:
.P
.B lvchange --cachesettings 'migration_threshold=default' VG/LV
.
.SS dm-cache cache policy
.
The dm-cache subsystem has additional per-LV parameters: the cache policy
to use, and possibly tunable parameters for the cache policy. Three
policies are currently available: "smq" is the default policy, "mq" is an
older implementation, and "cleaner" is used to force the cache to write
back (flush) all cached writes to the origin LV.
.P
The older "mq" policy has a number of tunable parameters. The defaults
are chosen to be suitable for the majority of systems, but in special
circumstances, changing the settings can improve performance. However,
newer kernels alias the "mq" policy to "smq". The settings used to
configure the "mq" policy (random_threshold, sequential_threshold,
discard_promote_adjustment, read_promote_adjustment,
write_promote_adjustment) are therefore silently ignored, and performance
matches "smq".
.P
With the --cachepolicy and --cachesettings options, the cache policy and
settings can be set when caching is started, or changed on an existing
cached LV (both options can be used together). The current cache policy
and settings can be displayed with the cache_policy and cache_settings
reporting options:
.P
.B lvs -o+cache_policy,cache_settings VG/LV
.P
Change the cache policy and settings of an existing LV:
.nf
# lvchange --cachepolicy mq --cachesettings \\
      \(aqmigration_threshold=2048 random_threshold=4\(aq vg/main
.fi
.P
.BR lvm.conf (5)
.B allocation/cache_policy
.br
defines the default cache policy.
.P
.BR lvm.conf (5)
.B allocation/cache_settings
.br
defines the default cache settings.
.
.SS dm-cache using metadata profiles
.
A cache pool allows a variety of options to be set. Many of these
settings can be specified in lvm.conf or in profile settings. You can
prepare a number of different profiles in the
\fI#DEFAULT_SYS_DIR#/profile\fP directory and simply specify the metadata
profile file name when caching an LV or creating a cache pool.
Check the output of \fBlvmconfig --type default --withcomments\fP
for a detailed description of all individual cache settings.
.P
.I Example
.nf
# cat <<EOF > #DEFAULT_SYS_DIR#/profile/cache_big_chunk.profile
allocation {
.RS
cache_pool_metadata_require_separate_pvs=0
cache_pool_chunk_size=512
cache_metadata_format=2
cache_mode="writethrough"
cache_policy="smq"
cache_settings {
.RS
smq {
.RS
migration_threshold=8192
.RE
}
.RE
}
.RE
}
EOF
.P
# lvcreate --cache -L10G --metadataprofile cache_big_chunk vg/main \\
      /dev/fast_ssd
# lvcreate --cache -L10G vg/main --config \\
      'allocation/cache_pool_chunk_size=512' /dev/fast_ssd
.fi
.
.SS dm-cache spare metadata LV
.
See
.BR lvmthin (7)
for a description of the "pool metadata spare" LV.
The same concept is used for cache pools.
.
.SS dm-cache metadata formats
.
There are two disk formats for dm-cache metadata. The metadata format can
be specified with --cachemetadataformat when caching is started, and
cannot be changed. Format \fB2\fP has better performance; it is more
compact, and stores dirty bits in a separate btree, which improves the
speed of shutting down the cache. With \fBauto\fP, lvm selects the best
option provided by the current dm-cache kernel module.
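.P
For example, format 2 could be requested explicitly when caching is
started with a command like:
.P
.nf
# lvconvert --type cache --cachepool fast \\
      --cachemetadataformat 2 vg/main
.fi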
.
.SS RAID1 cache device
.
RAID1 can be used to create the fast LV holding the cache so that it can
tolerate a device failure. (When using dm-cache with separate data
and metadata LVs, each of the sub-LVs can use RAID1.)
.P
.nf
# lvcreate -n main -L Size vg /dev/slow
# lvcreate --type raid1 -m 1 -n fast -L Size vg /dev/ssd1 /dev/ssd2
# lvconvert --type cache --cachevol fast vg/main
.fi
.
.SS dm-cache command shortcut
.
A single command can be used to cache the main LV with automatic
creation of a cache pool:
.P
.nf
# lvcreate --cache --size CacheDataSize VG/LV [FastPVs]
.fi
.P
or the longer variant:
.P
.nf
# lvcreate --type cache --size CacheDataSize \\
      --name NameCachePool VG/LV [FastPVs]
.fi
.P
In this command, the specified LV already exists, and is the main LV to be
cached. The command creates a new cache pool of the given size and name
(if no name is given, one is selected automatically from the sequence
lvolX_cpool), using the optionally specified fast PV(s) (typically an
SSD). Then it attaches the new cache pool to the existing main LV to
begin caching.
.P
(Note: ensure that the specified main LV is a standard LV. If a cache
pool LV is mistakenly specified, then the command does something
different.)
.P
(Note: the type option is interpreted differently by this command than by
normal lvcreate commands in which --type specifies the type of the newly
created LV. In this case, an LV with type cache-pool is being created,
and the existing main LV is being converted to type cache.)
.
.SH SEE ALSO
.
.nh
.ad l
.BR lvm.conf (5),
.BR lvchange (8),
.BR lvcreate (8),
.BR lvdisplay (8),
.BR lvextend (8),
.BR lvremove (8),
.BR lvrename (8),
.BR lvresize (8),
.BR lvs (8),
.br
.BR vgchange (8),
.BR vgmerge (8),
.BR vgreduce (8),
.BR vgsplit (8),
.P
.BR cache_check (8),
.BR cache_dump (8),
.BR cache_repair (8)