Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Documentation: Document the blk-crypto framework

The blk-crypto framework adds support for inline encryption. There are
numerous changes throughout the storage stack. This patch documents the
main design choices in the block layer, the API presented to users of
the block layer (like fscrypt or layered devices) and the API presented
to drivers for adding support for inline encryption.

Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Satya Tangirala and committed by Jens Axboe
54b259f6 81ca627a

+264
+1
Documentation/block/index.rst
···
    cmdline-partition
    data-integrity
    deadline-iosched
+   inline-encryption
    ioprio
    kyber-iosched
    null_blk
+263
Documentation/block/inline-encryption.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Inline Encryption
+=================
+
+Background
+==========
+
+Inline encryption hardware sits logically between memory and the disk, and can
+en/decrypt data as it goes in/out of the disk. Inline encryption hardware has a
+fixed number of "keyslots" - slots into which encryption contexts (i.e. the
+encryption key, encryption algorithm, data unit size) can be programmed by the
+kernel at any time. Each request sent to the disk can be tagged with the index
+of a keyslot (and also a data unit number to act as an encryption tweak), and
+the inline encryption hardware will en/decrypt the data in the request with the
+encryption context programmed into that keyslot. This is very different from
+full disk encryption solutions like self encrypting drives/TCG OPAL/ATA
+Security standards, since with inline encryption, any block on disk could be
+encrypted with any encryption context the kernel chooses.
+
+
+Objective
+=========
+
+We want to support inline encryption (IE) in the kernel.
+To allow for testing, we also want a crypto API fallback when actual
+IE hardware is absent. We also want IE to work with layered devices
+like dm and loopback (i.e. we want to be able to use the IE hardware
+of the underlying devices if present, or else fall back to crypto API
+en/decryption).
+
+
+Constraints and notes
+=====================
+
+- IE hardware has a limited number of "keyslots" that can be programmed
+  with an encryption context (key, algorithm, data unit size, etc.) at any time.
+  One can specify a keyslot in a data request made to the device, and the
+  device will en/decrypt the data using the encryption context programmed into
+  that specified keyslot. When possible, we want to make multiple requests with
+  the same encryption context share the same keyslot.
+
+- We need a way for upper layers like filesystems to specify an encryption
+  context to use for en/decrypting a struct bio, and a device driver (like UFS)
+  needs to be able to use that encryption context when it processes the bio.
+
+- We need a way for device drivers to expose their inline encryption
+  capabilities in a unified way to the upper layers.
+
+
+Design
+======
+
+We add a :c:type:`struct bio_crypt_ctx` to :c:type:`struct bio` that can
+represent an encryption context, because we need to be able to pass this
+encryption context from the upper layers (like the fs layer) to the
+device driver to act upon.
+
+While IE hardware works on the notion of keyslots, the FS layer has no
+knowledge of keyslots - it simply wants to specify an encryption context to
+use while en/decrypting a bio.
+
+We introduce a keyslot manager (KSM) that handles the translation from
+encryption contexts specified by the FS to keyslots on the IE hardware.
+This KSM also serves as the way IE hardware can expose its capabilities to
+upper layers. The generic mode of operation is: each device driver that wants
+to support IE will construct a KSM and set it up in its struct request_queue.
+Upper layers that want to use IE on this device can then use this KSM in
+the device's struct request_queue to translate an encryption context into
+a keyslot. The presence of the KSM in the request queue shall be used to mean
+that the device supports IE.
+
+The KSM uses refcounts to track which keyslots are idle (either they have no
+encryption context programmed, or there are no in-flight struct bios
+referencing that keyslot).
+When a new encryption context needs a keyslot, the KSM
+tries to find a keyslot that has already been programmed with the same
+encryption context, and if there is no such keyslot, it evicts the least
+recently used idle keyslot and programs the new encryption context into that
+one. If no idle keyslots are available, then the caller will sleep until there
+is at least one.
+
+
+blk-mq changes, other block layer changes and blk-crypto-fallback
+=================================================================
+
+We add a pointer to a ``bi_crypt_context`` and ``keyslot`` to
+:c:type:`struct request`. These will be referred to as the ``crypto fields``
+for the request. This ``keyslot`` is the keyslot into which the
+``bi_crypt_context`` has been programmed in the KSM of the ``request_queue``
+that this request is being sent to.
+
+We introduce ``block/blk-crypto-fallback.c``, which allows upper layers to remain
+blissfully unaware of whether or not real inline encryption hardware is present
+underneath. When a bio is submitted with a target ``request_queue`` that doesn't
+support the encryption context specified with the bio, the block layer will
+en/decrypt the bio with the blk-crypto-fallback.
+
+If the bio is a ``WRITE`` bio, a bounce bio is allocated, and the data in the bio
+is encrypted and stored in the bounce bio - blk-mq will then proceed to process the
+bounce bio as if it were not encrypted at all (except when blk-integrity is
+concerned). ``blk-crypto-fallback`` sets the bounce bio's ``bi_end_io`` to an
+internal function that cleans up the bounce bio and ends the original bio.
+
+If the bio is a ``READ`` bio, the bio's ``bi_end_io`` (and also ``bi_private``)
+is saved and overwritten by ``blk-crypto-fallback`` with
+``bio_crypto_fallback_decrypt_bio``.
+The bio's ``bi_crypt_context`` is also
+overwritten with ``NULL``, so that to the rest of the stack, the bio looks
+as if it were a regular bio that never had an encryption context specified.
+``bio_crypto_fallback_decrypt_bio`` will decrypt the bio, restore the original
+``bi_end_io`` (and also ``bi_private``) and end the bio again.
+
+Regardless of whether real inline encryption hardware is used or the
+blk-crypto-fallback is used, the ciphertext written to disk (and hence the
+on-disk format of data) will be the same (assuming the hardware's implementation
+of the algorithm being used adheres to spec and functions correctly).
+
+If a ``request_queue``'s inline encryption hardware claims to support the
+encryption context specified with a bio, then the bio will not be handled by the
+``blk-crypto-fallback``. We will eventually reach a point in blk-mq when a
+:c:type:`struct request` needs to be allocated for that bio. At that point,
+blk-mq tries to program the encryption context into the ``request_queue``'s
+keyslot_manager, and obtain a keyslot, which it stores in the request's newly
+added ``keyslot`` field. This keyslot is released when the request is completed.
+
+When the first bio is added to a request, ``blk_crypto_rq_bio_prep`` is called,
+which sets the request's ``crypt_ctx`` to a copy of the bio's
+``bi_crypt_context``. ``bio_crypt_do_front_merge`` is called whenever a subsequent
+bio is merged to the front of the request, which updates the ``crypt_ctx`` of
+the request so that it matches the newly merged bio's ``bi_crypt_context``.
+In particular, the request keeps a copy of the ``bi_crypt_context`` of the first
+bio in its bio-list (blk-mq needs to be careful to maintain this invariant
+during bio and request merges).
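The keyslot behaviour that the patch describes (share a slot when the context is already programmed, otherwise evict the least recently used idle slot, otherwise wait) can be modeled in a few lines of userspace C. This is only a toy sketch: every name in it (``toy_ksm``, ``toy_ksm_get_slot``, ``toy_ksm_put_slot``) is hypothetical, encryption contexts are reduced to non-negative integers, and where the kernel would put the caller to sleep the sketch simply returns -1. It is not the kernel's keyslot manager.

```c
#include <assert.h>

#define TOY_NUM_SLOTS 2
#define TOY_NO_CTX    (-1)

/* One hypothetical keyslot: which context it holds and who is using it. */
struct toy_slot {
	int ctx;                  /* programmed context id, or TOY_NO_CTX */
	int refcount;             /* number of in-flight users of this slot */
	unsigned long idle_since; /* tick at which the slot last became idle */
};

struct toy_ksm {
	struct toy_slot slots[TOY_NUM_SLOTS];
	unsigned long tick;       /* logical clock used for LRU ordering */
};

static void toy_ksm_init(struct toy_ksm *ksm)
{
	ksm->tick = 0;
	for (int i = 0; i < TOY_NUM_SLOTS; i++) {
		ksm->slots[i].ctx = TOY_NO_CTX;
		ksm->slots[i].refcount = 0;
		ksm->slots[i].idle_since = 0;
	}
}

/*
 * Returns the slot index holding ctx, programming it if needed.
 * Returns -1 if every slot is busy (the kernel would sleep here instead).
 */
static int toy_ksm_get_slot(struct toy_ksm *ksm, int ctx)
{
	int victim = -1;

	assert(ctx >= 0);
	for (int i = 0; i < TOY_NUM_SLOTS; i++) {
		struct toy_slot *s = &ksm->slots[i];

		if (s->ctx == ctx) {          /* already programmed: share it */
			s->refcount++;
			return i;
		}
		/* Remember the idle slot that has been idle the longest. */
		if (s->refcount == 0 &&
		    (victim < 0 || s->idle_since < ksm->slots[victim].idle_since))
			victim = i;
	}
	if (victim < 0)
		return -1;
	ksm->slots[victim].ctx = ctx;         /* evict and reprogram */
	ksm->slots[victim].refcount = 1;
	return victim;
}

/* Drops one reference; the slot becomes idle when the count reaches zero. */
static void toy_ksm_put_slot(struct toy_ksm *ksm, int slot)
{
	struct toy_slot *s = &ksm->slots[slot];

	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->idle_since = ++ksm->tick;
}
```

In this model a released slot keeps its context, so a later request with the same context can reuse it without reprogramming, which is the sharing behaviour the constraints section asks for.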
+
+To make it possible for inline encryption to work with request queue based
+layered devices, when a request is cloned, its ``crypto fields`` are cloned as
+well. When the cloned request is submitted, blk-mq programs the
+``bi_crypt_context`` of the request into the clone's request_queue's keyslot
+manager, and stores the returned keyslot in the clone's ``keyslot``.
+
+
+API presented to users of the block layer
+=========================================
+
+``struct blk_crypto_key`` represents a crypto key (the raw key, size of the
+key, the crypto algorithm to use, the data unit size to use, and the number of
+bytes required to represent data unit numbers that will be specified with the
+``bi_crypt_context``).
+
+``blk_crypto_init_key`` allows upper layers to initialize such a
+``blk_crypto_key``.
+
+``bio_crypt_set_ctx`` should be called on any bio that a user of
+the block layer wants en/decrypted via inline encryption (or the
+blk-crypto-fallback, if hardware support isn't available for the desired
+crypto configuration). This function takes the ``blk_crypto_key`` and the
+data unit number (DUN) to use when en/decrypting the bio.
+
+``blk_crypto_config_supported`` allows upper layers to query whether or not
+an encryption context passed to a request queue can be handled by blk-crypto
+(either by real inline encryption hardware, or by the blk-crypto-fallback).
+This is useful e.g. when blk-crypto-fallback is disabled, and the upper layer
+wants to use an algorithm that may not be supported by hardware - this function
+lets the upper layer know ahead of time that the algorithm isn't supported,
+and the upper layer can fall back to something else if appropriate.
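As a rough illustration of how the data unit size and the "bytes required to represent data unit numbers" in a ``blk_crypto_key`` relate to the DUN handed to ``bio_crypt_set_ctx``, the arithmetic can be sketched in userspace C. The sketch assumes that the DUN given for a bio names its first data unit and that each following data unit uses the next number, and that the data unit size is a power of two; neither is spelled out in the text above. Both helper names are hypothetical and are not part of the blk-crypto API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: DUN of the data unit that contains byte `offset`
 * of a bio whose first data unit has number `first_dun`.
 * Assumes a power-of-two data unit size (an assumption of this sketch).
 */
static uint64_t toy_dun_at_offset(uint64_t first_dun, uint64_t offset,
				  uint32_t data_unit_size)
{
	assert(data_unit_size != 0 &&
	       (data_unit_size & (data_unit_size - 1)) == 0);
	return first_dun + offset / data_unit_size;
}

/*
 * Hypothetical helper: smallest number of bytes that can represent every
 * DUN up to and including `max_dun`.
 */
static unsigned int toy_dun_bytes(uint64_t max_dun)
{
	unsigned int bytes = 1;

	while (max_dun >>= 8)
		bytes++;
	return bytes;
}
```

For example, with 4096-byte data units and a first DUN of 100, byte offset 8192 falls in the data unit numbered 102, and a user that never needs a DUN above 0xFFFFFFFF would need 4 bytes to represent its DUNs.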
+
+``blk_crypto_start_using_key`` - Upper layers must call this function on a
+``blk_crypto_key`` and a ``request_queue`` before using the key with any bio
+headed for that ``request_queue``. This function ensures that either the
+hardware supports the key's crypto settings, or the crypto API fallback has
+transforms for the needed mode allocated and ready to go. Note that this
+function may allocate an ``skcipher``, and must not be called from the data
+path, since allocating ``skciphers`` from the data path can deadlock.
+
+``blk_crypto_evict_key`` *must* be called by upper layers before a
+``blk_crypto_key`` is freed. Further, it *must* be called only after
+there are no more in-flight requests that use that ``blk_crypto_key``.
+``blk_crypto_evict_key`` will ensure that a key is removed from any keyslots in
+inline encryption hardware that the key might have been programmed into (or the
+blk-crypto-fallback).
+
+API presented to device drivers
+===============================
+
+A :c:type:`struct blk_keyslot_manager` should be set up by device drivers in
+the ``request_queue`` of the device. The device driver needs to call
+``blk_ksm_init`` on the ``blk_keyslot_manager``, specifying the number of
+keyslots supported by the hardware.
+
+The device driver also needs to tell the KSM how to actually manipulate the
+IE hardware in the device to do things like programming the crypto key into
+a particular keyslot of the IE hardware. All this is achieved through the
+:c:type:`struct blk_ksm_ll_ops` field in the KSM that the device driver
+must fill in after initializing the ``blk_keyslot_manager``.
+
+The KSM also handles runtime power management for the device when applicable
+(e.g.
+when it wants to program a crypto key into the IE hardware, the device
+must be runtime powered on) - so the device driver must also set the ``dev``
+field in the KSM to point to the ``struct device`` for the KSM to use for runtime
+power management.
+
+``blk_ksm_reprogram_all_keys`` can be called by device drivers if the device
+needs every one of its keyslots to be reprogrammed with the key it
+"should have" at the point in time when the function is called. This is useful
+e.g. if a device loses all its keys on runtime power down/up.
+
+``blk_ksm_destroy`` should be called to free up all resources used by a keyslot
+manager that was set up with ``blk_ksm_init``, once the ``blk_keyslot_manager``
+is no longer needed.
+
+
+Layered Devices
+===============
+
+Request queue based layered devices like dm-rq that wish to support IE need to
+create their own keyslot manager for their request queue, and expose whatever
+functionality they choose. When a layered device wants to pass a clone of a
+request to another ``request_queue``, blk-crypto will initialize and prepare the
+clone as necessary - see ``blk_crypto_insert_cloned_request`` in
+``blk-crypto.c``.
+
+
+Future Optimizations for layered devices
+========================================
+
+Creating a keyslot manager for a layered device uses up memory for each
+keyslot, and in general, a layered device merely passes the request on to a
+"child" device, so the keyslots in the layered device itself are completely
+unused, and don't need any refcounting or keyslot programming. We can instead
+define a new type of KSM, the "passthrough KSM", that layered devices can use
+to advertise an unlimited number of keyslots, and support for any encryption
+algorithms they choose, while not actually using any memory for each keyslot.
+Another use case for the "passthrough KSM" is for IE devices that do not have a
+limited number of keyslots.
+
+
+Interaction between inline encryption and blk integrity
+=======================================================
+
+At the time of this patch, there is no real hardware that supports both of these
+features. However, these features do interact with each other, and it's not
+completely trivial to make them both work together properly. In particular,
+when a WRITE bio wants to use inline encryption on a device that supports both
+features, the bio will have an encryption context specified, after which
+its integrity information is calculated (using the plaintext data, since
+the encryption will happen while data is being written), and the data and
+integrity info are sent to the device. Obviously, the integrity info must be
+verified before the data is encrypted. After the data is encrypted, the device
+must not store the integrity info that it received with the plaintext data
+since that might reveal information about the plaintext data. As such, it must
+re-generate the integrity info from the ciphertext data and store that on disk
+instead. Another issue with storing the integrity info of the plaintext data is
+that it changes the on disk format depending on whether hardware inline
+encryption support is present or the kernel crypto API fallback is used (since
+if the fallback is used, the device will receive the integrity info of the
+ciphertext, not that of the plaintext).
+
+Because there isn't any real hardware yet, it seems prudent to assume that
+hardware implementations might not implement both features together correctly,
+and disallow the combination for now.
+Whenever a device supports integrity, the
+kernel will pretend that the device does not support hardware inline encryption
+(by essentially setting the keyslot manager in the request_queue of the device
+to NULL). When the crypto API fallback is enabled, this means that all bios with
+an encryption context will use the fallback, and IO will complete as usual.
+When the fallback is disabled, a bio with an encryption context will be failed.
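The routing rules stated in the patch (hardware when the queue claims support, the fallback otherwise, failure when the fallback is disabled, and integrity-capable devices treated as if they had no keyslot manager) can be condensed into one decision function. This is a minimal userspace sketch for reading purposes only; ``toy_queue``, ``toy_path`` and ``toy_route_bio`` are hypothetical names, and the three booleans stand in for state that the kernel derives from the request queue and its build configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a bio that carries an encryption context ends up in this model. */
enum toy_path { TOY_PATH_HW, TOY_PATH_FALLBACK, TOY_PATH_FAIL };

/* Hypothetical summary of the properties that decide the routing. */
struct toy_queue {
	bool hw_supports_ctx;  /* IE hardware claims support for the context */
	bool has_integrity;    /* device supports blk-integrity */
	bool fallback_enabled; /* crypto API fallback is enabled */
};

static enum toy_path toy_route_bio(const struct toy_queue *q)
{
	assert(q != NULL);

	/*
	 * An integrity-capable device is treated as if its keyslot manager
	 * were NULL, so hardware inline encryption is never chosen for it.
	 */
	bool ksm_visible = q->hw_supports_ctx && !q->has_integrity;

	if (ksm_visible)
		return TOY_PATH_HW;
	return q->fallback_enabled ? TOY_PATH_FALLBACK : TOY_PATH_FAIL;
}
```

Under this model a device with both inline encryption and integrity support routes every encrypted bio to the fallback when it is enabled, and fails the bio when it is not.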