
fscrypt: improve format of no-key names

When an encrypted directory is listed without the key, the filesystem
must show "no-key names" that uniquely identify directory entries, are
at most 255 (NAME_MAX) bytes long, and don't contain '/' or '\0'.
Currently, for short names the no-key name is the base64 encoding of the
ciphertext filename, while for long names it's the base64 encoding of
the ciphertext filename's dirhash and second-to-last 16-byte block.

This format has the following problems:

- Since it doesn't always include the dirhash, it's incompatible with
directories that will use a secret-keyed dirhash over the plaintext
filenames. In this case, the dirhash won't be computable from the
ciphertext name without the key, so it instead must be retrieved from
the directory entry and always included in the no-key name.
Casefolded encrypted directories will use this type of dirhash.

- It's ambiguous: it's possible to craft two filenames that map to the
same no-key name, since the method used to abbreviate long filenames
doesn't use a proper cryptographic hash function.
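
The ambiguity can be seen at the byte level with a small userspace sketch. It mirrors the offset arithmetic of the old FSCRYPT_FNAME_DIGEST() macro (removed by this patch), which keeps only the second-to-last 16-byte block of the ciphertext. The function names and the test buffers are hypothetical, and the sketch assumes both directory entries also carry the same dirhash; it does not show how to obtain such ciphertexts from real plaintext filenames.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16 /* stands in for FS_CRYPTO_BLOCK_SIZE */

/*
 * Offset of the 16-byte block the old format kept for long names:
 * round_down(len - BLOCK_SIZE - 1, BLOCK_SIZE).
 */
size_t old_digest_offset(size_t len)
{
	assert(len > 2 * BLOCK_SIZE); /* old format only abbreviated names > 32 bytes */
	return (len - BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}

/* Returns 1 if two equal-length ciphertexts share the same old-style digest. */
int old_digest_equal(const unsigned char *a, const unsigned char *b, size_t len)
{
	size_t off = old_digest_offset(len);

	return memcmp(a + off, b + off, BLOCK_SIZE) == 0;
}
```

Any two equal-length ciphertexts that agree on that one block compare equal here even if every other byte differs, which is why the old abbreviation is not collision resistant.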

Solve both these problems by switching to a new no-key name format that
is the base64 encoding of a variable-length structure that contains the
dirhash, up to 149 bytes of the ciphertext filename, and (if any bytes
remain) the SHA-256 of the remaining bytes of the ciphertext filename.

This ensures that each no-key name contains everything needed to find
the directory entry again, contains only legal characters, doesn't
exceed NAME_MAX, is unambiguous unless there's a SHA-256 collision, and
that we only take the performance hit of SHA-256 on very long filenames.
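
The size bound can be checked with a little arithmetic. The following is a minimal userspace sketch, not the kernel code: the struct mirrors the layout described above, the names ending in _sketch/_SKETCH are hypothetical, and BASE64_CHARS() is assumed to compute the length of unpadded base64, i.e. ceil(4n/3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAME_MAX_SKETCH		255	/* NAME_MAX on Linux */
#define SHA256_DIGEST_SIZE	32

/* Assumption: unpadded base64 needs ceil(nbytes * 4 / 3) characters. */
#define BASE64_CHARS(nbytes)	(((nbytes) * 4 + 2) / 3)

/* Hypothetical userspace mirror of the variable-length structure. */
struct nokey_name_sketch {
	uint32_t dirhash[2];			/* 8 bytes */
	uint8_t bytes[149];			/* leading ciphertext bytes */
	uint8_t sha256[SHA256_DIGEST_SIZE];	/* hash of the remainder */
};

/* Largest decoded size: everything up to the end of 'sha256', no padding. */
#define NOKEY_NAME_MAX_SKETCH \
	(offsetof(struct nokey_name_sketch, sha256) + SHA256_DIGEST_SIZE)

static_assert(NOKEY_NAME_MAX_SKETCH == 189, "8 + 149 + 32 bytes");
static_assert(BASE64_CHARS(NOKEY_NAME_MAX_SKETCH) <= NAME_MAX_SKETCH,
	      "encoded no-key name must fit in NAME_MAX");

/* Size of the structure actually encoded for a given ciphertext length. */
size_t nokey_decoded_size(size_t ciphertext_len)
{
	if (ciphertext_len <= 149)
		return offsetof(struct nokey_name_sketch, bytes) + ciphertext_len;
	return NOKEY_NAME_MAX_SKETCH;
}

/* Length of the resulting no-key name in characters. */
size_t nokey_encoded_len(size_t ciphertext_len)
{
	return BASE64_CHARS(nokey_decoded_size(ciphertext_len));
}
```

Under these assumptions a 16-byte ciphertext yields a 32-character no-key name, and any ciphertext longer than 149 bytes yields exactly 252 characters.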

Note: this change does *not* address the existing issue where users can
modify the 'dirhash' part of a no-key name and the filesystem may still
accept the name.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[EB: improved comments and commit message, fixed checking return value
of base64_decode(), check for SHA-256 error, continue to set disk_name
for short names to keep matching simpler, and many other cleanups]
Link: https://lore.kernel.org/r/20200120223201.241390-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>

Authored by Daniel Rosenberg and committed by Eric Biggers
edc440e3 aec992aa

+171 -127
+1 -1
Documentation/filesystems/fscrypt.rst
···
 allows the filesystem to still, with a high degree of confidence, map
 the filename given in ->lookup() back to a particular directory entry
 that was previously listed by readdir(). See :c:type:`struct
-fscrypt_digested_name` in the source for more details.
+fscrypt_nokey_name` in the source for more details.

 Note that the precise way that filenames are presented to userspace
 without the key is subject to change in the future. It is only meant
+1
fs/crypto/Kconfig
···
 	select CRYPTO_CTS
 	select CRYPTO_ECB
 	select CRYPTO_HMAC
+	select CRYPTO_SHA256
 	select CRYPTO_SHA512
 	select CRYPTO_XTS
+167 -51
fs/crypto/fname.c
···

 #include <linux/namei.h>
 #include <linux/scatterlist.h>
+#include <crypto/hash.h>
+#include <crypto/sha.h>
 #include <crypto/skcipher.h>
 #include "fscrypt_private.h"
+
+/**
+ * struct fscrypt_nokey_name - identifier for directory entry when key is absent
+ *
+ * When userspace lists an encrypted directory without access to the key, the
+ * filesystem must present a unique "no-key name" for each filename that allows
+ * it to find the directory entry again if requested. Naively, that would just
+ * mean using the ciphertext filenames. However, since the ciphertext filenames
+ * can contain illegal characters ('\0' and '/'), they must be encoded in some
+ * way. We use base64. But that can cause names to exceed NAME_MAX (255
+ * bytes), so we also need to use a strong hash to abbreviate long names.
+ *
+ * The filesystem may also need another kind of hash, the "dirhash", to quickly
+ * find the directory entry. Since filesystems normally compute the dirhash
+ * over the on-disk filename (i.e. the ciphertext), it's not computable from
+ * no-key names that abbreviate the ciphertext using the strong hash to fit in
+ * NAME_MAX. It's also not computable if it's a keyed hash taken over the
+ * plaintext (but it may still be available in the on-disk directory entry);
+ * casefolded directories use this type of dirhash. At least in these cases,
+ * each no-key name must include the name's dirhash too.
+ *
+ * To meet all these requirements, we base64-encode the following
+ * variable-length structure. It contains the dirhash, or 0's if the filesystem
+ * didn't provide one; up to 149 bytes of the ciphertext name; and for
+ * ciphertexts longer than 149 bytes, also the SHA-256 of the remaining bytes.
+ *
+ * This ensures that each no-key name contains everything needed to find the
+ * directory entry again, contains only legal characters, doesn't exceed
+ * NAME_MAX, is unambiguous unless there's a SHA-256 collision, and that we only
+ * take the performance hit of SHA-256 on very long filenames (which are rare).
+ */
+struct fscrypt_nokey_name {
+	u32 dirhash[2];
+	u8 bytes[149];
+	u8 sha256[SHA256_DIGEST_SIZE];
+}; /* 189 bytes => 252 bytes base64-encoded, which is <= NAME_MAX (255) */
+
+/*
+ * Decoded size of max-size nokey name, i.e. a name that was abbreviated using
+ * the strong hash and thus includes the 'sha256' field. This isn't simply
+ * sizeof(struct fscrypt_nokey_name), as the padding at the end isn't included.
+ */
+#define FSCRYPT_NOKEY_NAME_MAX	offsetofend(struct fscrypt_nokey_name, sha256)
+
+static struct crypto_shash *sha256_hash_tfm;
+
+static int fscrypt_do_sha256(const u8 *data, unsigned int data_len, u8 *result)
+{
+	struct crypto_shash *tfm = READ_ONCE(sha256_hash_tfm);
+
+	if (unlikely(!tfm)) {
+		struct crypto_shash *prev_tfm;
+
+		tfm = crypto_alloc_shash("sha256", 0, 0);
+		if (IS_ERR(tfm)) {
+			fscrypt_err(NULL,
+				    "Error allocating SHA-256 transform: %ld",
+				    PTR_ERR(tfm));
+			return PTR_ERR(tfm);
+		}
+		prev_tfm = cmpxchg(&sha256_hash_tfm, NULL, tfm);
+		if (prev_tfm) {
+			crypto_free_shash(tfm);
+			tfm = prev_tfm;
+		}
+	}
+	{
+		SHASH_DESC_ON_STACK(desc, tfm);
+
+		desc->tfm = tfm;
+
+		return crypto_shash_digest(desc, data, data_len, result);
+	}
+}

 static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
 {
···
 			       u32 max_encrypted_len,
 			       struct fscrypt_str *crypto_str)
 {
-	const u32 max_encoded_len =
-		max_t(u32, BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE),
-		      1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)));
+	const u32 max_encoded_len = BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX);
 	u32 max_presented_len;

 	max_presented_len = max(max_encoded_len, max_encrypted_len);
···
  *
  * The caller must have allocated sufficient memory for the @oname string.
  *
- * If the key is available, we'll decrypt the disk name; otherwise, we'll encode
- * it for presentation. Short names are directly base64-encoded, while long
- * names are encoded in fscrypt_digested_name format.
+ * If the key is available, we'll decrypt the disk name. Otherwise, we'll
+ * encode it for presentation in fscrypt_nokey_name format.
+ * See struct fscrypt_nokey_name for details.
  *
  * Return: 0 on success, -errno on failure
  */
···
 			      struct fscrypt_str *oname)
 {
 	const struct qstr qname = FSTR_TO_QSTR(iname);
-	struct fscrypt_digested_name digested_name;
+	struct fscrypt_nokey_name nokey_name;
+	u32 size; /* size of the unencoded no-key name */
+	int err;

 	if (fscrypt_is_dot_dotdot(&qname)) {
 		oname->name[0] = '.';
···
 	if (fscrypt_has_encryption_key(inode))
 		return fname_decrypt(inode, iname, oname);

-	if (iname->len <= FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE) {
-		oname->len = base64_encode(iname->name, iname->len,
-					   oname->name);
-		return 0;
-	}
+	/*
+	 * Sanity check that struct fscrypt_nokey_name doesn't have padding
+	 * between fields and that its encoded size never exceeds NAME_MAX.
+	 */
+	BUILD_BUG_ON(offsetofend(struct fscrypt_nokey_name, dirhash) !=
+		     offsetof(struct fscrypt_nokey_name, bytes));
+	BUILD_BUG_ON(offsetofend(struct fscrypt_nokey_name, bytes) !=
+		     offsetof(struct fscrypt_nokey_name, sha256));
+	BUILD_BUG_ON(BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX) > NAME_MAX);
+
 	if (hash) {
-		digested_name.hash = hash;
-		digested_name.minor_hash = minor_hash;
+		nokey_name.dirhash[0] = hash;
+		nokey_name.dirhash[1] = minor_hash;
 	} else {
-		digested_name.hash = 0;
-		digested_name.minor_hash = 0;
+		nokey_name.dirhash[0] = 0;
+		nokey_name.dirhash[1] = 0;
 	}
-	memcpy(digested_name.digest,
-	       FSCRYPT_FNAME_DIGEST(iname->name, iname->len),
-	       FSCRYPT_FNAME_DIGEST_SIZE);
-	oname->name[0] = '_';
-	oname->len = 1 + base64_encode((const u8 *)&digested_name,
-				       sizeof(digested_name), oname->name + 1);
+	if (iname->len <= sizeof(nokey_name.bytes)) {
+		memcpy(nokey_name.bytes, iname->name, iname->len);
+		size = offsetof(struct fscrypt_nokey_name, bytes[iname->len]);
+	} else {
+		memcpy(nokey_name.bytes, iname->name, sizeof(nokey_name.bytes));
+		/* Compute strong hash of remaining part of name. */
+		err = fscrypt_do_sha256(&iname->name[sizeof(nokey_name.bytes)],
+					iname->len - sizeof(nokey_name.bytes),
+					nokey_name.sha256);
+		if (err)
+			return err;
+		size = FSCRYPT_NOKEY_NAME_MAX;
+	}
+	oname->len = base64_encode((const u8 *)&nokey_name, size, oname->name);
 	return 0;
 }
 EXPORT_SYMBOL(fscrypt_fname_disk_to_usr);
···
  * get the disk_name.
  *
  * Else, for keyless @lookup operations, @iname is the presented ciphertext, so
- * we decode it to get either the ciphertext disk_name (for short names) or the
- * fscrypt_digested_name (for long names). Non-@lookup operations will be
+ * we decode it to get the fscrypt_nokey_name. Non-@lookup operations will be
  * impossible in this case, so we fail them with ENOKEY.
  *
  * If successful, fscrypt_free_filename() must be called later to clean up.
···
 int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
 			   int lookup, struct fscrypt_name *fname)
 {
+	struct fscrypt_nokey_name *nokey_name;
 	int ret;
-	int digested;

 	memset(fname, 0, sizeof(struct fscrypt_name));
 	fname->usr_fname = iname;
···
 	 * We don't have the key and we are doing a lookup; decode the
 	 * user-supplied name
 	 */
-	if (iname->name[0] == '_') {
-		if (iname->len !=
-		    1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)))
-			return -ENOENT;
-		digested = 1;
-	} else {
-		if (iname->len >
-		    BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE))
-			return -ENOENT;
-		digested = 0;
-	}

-	fname->crypto_buf.name =
-		kmalloc(max_t(size_t, FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE,
-			      sizeof(struct fscrypt_digested_name)),
-			GFP_KERNEL);
+	if (iname->len > BASE64_CHARS(FSCRYPT_NOKEY_NAME_MAX))
+		return -ENOENT;
+
+	fname->crypto_buf.name = kmalloc(FSCRYPT_NOKEY_NAME_MAX, GFP_KERNEL);
 	if (fname->crypto_buf.name == NULL)
 		return -ENOMEM;

-	ret = base64_decode(iname->name + digested, iname->len - digested,
-			    fname->crypto_buf.name);
-	if (ret < 0) {
+	ret = base64_decode(iname->name, iname->len, fname->crypto_buf.name);
+	if (ret < (int)offsetof(struct fscrypt_nokey_name, bytes[1]) ||
+	    (ret > offsetof(struct fscrypt_nokey_name, sha256) &&
+	     ret != FSCRYPT_NOKEY_NAME_MAX)) {
 		ret = -ENOENT;
 		goto errout;
 	}
 	fname->crypto_buf.len = ret;
-	if (digested) {
-		const struct fscrypt_digested_name *n =
-			(const void *)fname->crypto_buf.name;
-		fname->hash = n->hash;
-		fname->minor_hash = n->minor_hash;
-	} else {
-		fname->disk_name.name = fname->crypto_buf.name;
-		fname->disk_name.len = fname->crypto_buf.len;
+
+	nokey_name = (void *)fname->crypto_buf.name;
+	fname->hash = nokey_name->dirhash[0];
+	fname->minor_hash = nokey_name->dirhash[1];
+	if (ret != FSCRYPT_NOKEY_NAME_MAX) {
+		/* The full ciphertext filename is available. */
+		fname->disk_name.name = nokey_name->bytes;
+		fname->disk_name.len =
+			ret - offsetof(struct fscrypt_nokey_name, bytes);
 	}
 	return 0;

···
 	return ret;
 }
 EXPORT_SYMBOL(fscrypt_setup_filename);
+
+/**
+ * fscrypt_match_name() - test whether the given name matches a directory entry
+ * @fname: the name being searched for
+ * @de_name: the name from the directory entry
+ * @de_name_len: the length of @de_name in bytes
+ *
+ * Normally @fname->disk_name will be set, and in that case we simply compare
+ * that to the name stored in the directory entry. The only exception is that
+ * if we don't have the key for an encrypted directory and the name we're
+ * looking for is very long, then we won't have the full disk_name and instead
+ * we'll need to match against a fscrypt_nokey_name that includes a strong hash.
+ *
+ * Return: %true if the name matches, otherwise %false.
+ */
+bool fscrypt_match_name(const struct fscrypt_name *fname,
+			const u8 *de_name, u32 de_name_len)
+{
+	const struct fscrypt_nokey_name *nokey_name =
+		(const void *)fname->crypto_buf.name;
+	u8 sha256[SHA256_DIGEST_SIZE];
+
+	if (likely(fname->disk_name.name)) {
+		if (de_name_len != fname->disk_name.len)
+			return false;
+		return !memcmp(de_name, fname->disk_name.name, de_name_len);
+	}
+	if (de_name_len <= sizeof(nokey_name->bytes))
+		return false;
+	if (memcmp(de_name, nokey_name->bytes, sizeof(nokey_name->bytes)))
+		return false;
+	if (fscrypt_do_sha256(&de_name[sizeof(nokey_name->bytes)],
+			      de_name_len - sizeof(nokey_name->bytes), sha256))
+		return false;
+	return !memcmp(sha256, nokey_name->sha256, sizeof(sha256));
+}
+EXPORT_SYMBOL_GPL(fscrypt_match_name);

 /**
  * fscrypt_fname_siphash() - calculate the SipHash of a filename
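
The length check that fscrypt_setup_filename() applies after base64_decode() in the diff above can be restated as a small standalone sketch. This is a userspace illustration under stated assumptions, not the kernel code: the struct is a hypothetical mirror of struct fscrypt_nokey_name, and classify_decoded_len() is an invented helper whose return convention exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHA256_DIGEST_SIZE 32

/* Hypothetical userspace mirror of struct fscrypt_nokey_name. */
struct nokey_name_sketch {
	uint32_t dirhash[2];
	uint8_t bytes[149];
	uint8_t sha256[SHA256_DIGEST_SIZE];
};

#define NOKEY_HDR_SIZE	offsetof(struct nokey_name_sketch, bytes)
#define NOKEY_HASH_OFF	offsetof(struct nokey_name_sketch, sha256)
#define NOKEY_NAME_MAX	(NOKEY_HASH_OFF + SHA256_DIGEST_SIZE)

static_assert(NOKEY_HDR_SIZE == 8, "dirhash occupies 8 bytes");
static_assert(NOKEY_HASH_OFF == 157, "no padding before sha256");
static_assert(NOKEY_NAME_MAX == 189, "maximum decoded size");

/*
 * Classify the decoded length of a user-supplied no-key name.
 * Returns -1 if the length is invalid (the lookup would fail with ENOENT),
 * 0 if the name was abbreviated and carries a SHA-256 of the remainder,
 * or the ciphertext length (1..149) if the full ciphertext is present.
 */
int classify_decoded_len(int n)
{
	if (n < (int)(NOKEY_HDR_SIZE + 1))
		return -1;	/* decode error or no ciphertext bytes at all */
	if (n > (int)NOKEY_HASH_OFF && n != (int)NOKEY_NAME_MAX)
		return -1;	/* a partial sha256 field is never valid */
	if (n == (int)NOKEY_NAME_MAX)
		return 0;	/* abbreviated name */
	return n - (int)NOKEY_HDR_SIZE;
}
```

Only decoded lengths from 9 through 157, or exactly 189, are accepted, so a name that ends partway through the sha256 field is rejected before any directory search happens.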
+2 -75
include/linux/fscrypt.h
···
 				     u32 hash, u32 minor_hash,
 				     const struct fscrypt_str *iname,
 				     struct fscrypt_str *oname);
-
-#define FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE	32
-
-/* Extracts the second-to-last ciphertext block; see explanation below */
-#define FSCRYPT_FNAME_DIGEST(name, len)	\
-	((name) + round_down((len) - FS_CRYPTO_BLOCK_SIZE - 1, \
-			     FS_CRYPTO_BLOCK_SIZE))
-
-#define FSCRYPT_FNAME_DIGEST_SIZE	FS_CRYPTO_BLOCK_SIZE
-
-/**
- * fscrypt_digested_name - alternate identifier for an on-disk filename
- *
- * When userspace lists an encrypted directory without access to the key,
- * filenames whose ciphertext is longer than FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE
- * bytes are shown in this abbreviated form (base64-encoded) rather than as the
- * full ciphertext (base64-encoded). This is necessary to allow supporting
- * filenames up to NAME_MAX bytes, since base64 encoding expands the length.
- *
- * To make it possible for filesystems to still find the correct directory entry
- * despite not knowing the full on-disk name, we encode any filesystem-specific
- * 'hash' and/or 'minor_hash' which the filesystem may need for its lookups,
- * followed by the second-to-last ciphertext block of the filename. Due to the
- * use of the CBC-CTS encryption mode, the second-to-last ciphertext block
- * depends on the full plaintext. (Note that ciphertext stealing causes the
- * last two blocks to appear "flipped".) This makes accidental collisions very
- * unlikely: just a 1 in 2^128 chance for two filenames to collide even if they
- * share the same filesystem-specific hashes.
- *
- * However, this scheme isn't immune to intentional collisions, which can be
- * created by anyone able to create arbitrary plaintext filenames and view them
- * without the key. Making the "digest" be a real cryptographic hash like
- * SHA-256 over the full ciphertext would prevent this, although it would be
- * less efficient and harder to implement, especially since the filesystem would
- * need to calculate it for each directory entry examined during a search.
- */
-struct fscrypt_digested_name {
-	u32 hash;
-	u32 minor_hash;
-	u8 digest[FSCRYPT_FNAME_DIGEST_SIZE];
-};
-
-/**
- * fscrypt_match_name() - test whether the given name matches a directory entry
- * @fname: the name being searched for
- * @de_name: the name from the directory entry
- * @de_name_len: the length of @de_name in bytes
- *
- * Normally @fname->disk_name will be set, and in that case we simply compare
- * that to the name stored in the directory entry. The only exception is that
- * if we don't have the key for an encrypted directory and a filename in it is
- * very long, then we won't have the full disk_name and we'll instead need to
- * match against the fscrypt_digested_name.
- *
- * Return: %true if the name matches, otherwise %false.
- */
-static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
-				      const u8 *de_name, u32 de_name_len)
-{
-	if (unlikely(!fname->disk_name.name)) {
-		const struct fscrypt_digested_name *n =
-			(const void *)fname->crypto_buf.name;
-		if (WARN_ON_ONCE(fname->usr_fname->name[0] != '_'))
-			return false;
-		if (de_name_len <= FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE)
-			return false;
-		return !memcmp(FSCRYPT_FNAME_DIGEST(de_name, de_name_len),
-			       n->digest, FSCRYPT_FNAME_DIGEST_SIZE);
-	}
-
-	if (de_name_len != fname->disk_name.len)
-		return false;
-	return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
-}
-
+extern bool fscrypt_match_name(const struct fscrypt_name *fname,
+			       const u8 *de_name, u32 de_name_len);
 extern u64 fscrypt_fname_siphash(const struct inode *dir,
 				 const struct qstr *name);