ext4: fix indirect punch hole corruption

Commit 4f579ae7de56 ("ext4: fix punch hole on files with indirect
mapping") rewrote FALLOC_FL_PUNCH_HOLE for ext4 files with indirect
mapping. However, that rewrite has bugs in several corner cases. This
patch fixes five distinct bugs:

1. When there is at least one entire level of indirection between the
start and end of the punch range and the end of the punch range is the
first block of its level, we can't return early; we have to free the
intervening levels.

2. When the end is at a higher level of indirection than the start and
ext4_find_shared returns a top branch for the end, we still need to free
the rest of the shared branch it returns; we can't decrement partial2.

3. When a punch happens within one level of indirection, we need to
converge on an indirect block that contains the start and end. However,
because the branches returned from ext4_find_shared do not necessarily
start at the same level (e.g., the partial2 chain will be shallower if
the last block occurs at the beginning of an indirect group), the walks
of the two chains can "miss" each other and free a bunch of extra
blocks in the process. This mismatch can be handled by first making
sure that the chains are at the same level, then walking them together
until they converge.

4. When the punch happens within one level of indirection and
ext4_find_shared returns a top branch for the start, we must free it,
but only if the end does not occur within that branch.

5. When the punch happens within one level of indirection and
ext4_find_shared returns a top branch for the end, then we shouldn't
free the block referenced by the end of the returned chain (this mirrors
the different levels case).
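
The "levels" and "top branch" cases above are easier to follow with a
small model of how a logical block number turns into a path of offsets.
The following is only a sketch, loosely modeled on the kernel's
ext4_block_to_path. The names block_to_path, same_prefix, NDIR and ADDRS
are assumptions made for illustration, and ADDRS is deliberately tiny
(a filesystem with 4 KiB blocks holds 1024 addresses per indirect block).

```c
/* Assumed layout for this sketch: 12 direct slots, then the single,
 * double and triple indirect slots at indices 12, 13 and 14.
 * ADDRS is a toy "addresses per indirect block" value. */
#define NDIR  12
#define ADDRS 4

/* Fill offsets[] with the path to logical block blk and return the
 * path length (1 = direct, 2 = single indirect, ...), or 0 if the
 * block is out of range. */
static int block_to_path(long blk, int offsets[4])
{
	const long ind = ADDRS;
	const long dind = (long)ADDRS * ADDRS;
	int n = 0;

	if (blk < 0)
		return 0;
	if (blk < NDIR) {
		offsets[n++] = (int)blk;
	} else if ((blk -= NDIR) < ind) {
		offsets[n++] = NDIR;
		offsets[n++] = (int)blk;
	} else if ((blk -= ind) < dind) {
		offsets[n++] = NDIR + 1;
		offsets[n++] = (int)(blk / ADDRS);
		offsets[n++] = (int)(blk % ADDRS);
	} else if ((blk -= dind) < dind * ADDRS) {
		offsets[n++] = NDIR + 2;
		offsets[n++] = (int)(blk / dind);
		offsets[n++] = (int)((blk / ADDRS) % ADDRS);
		offsets[n++] = (int)(blk % ADDRS);
	}
	return n;
}

/* Same comparison as the "subtree" loop in the patch: returns 1 when
 * both paths agree on entries 0..level, meaning the end of the range
 * lies underneath the branch that the start path selects. */
static int same_prefix(const int *a, const int *b, int level)
{
	for (int i = 0; i <= level; i++)
		if (a[i] != b[i])
			return 0;
	return 1;
}
```

With ADDRS set to 4, logical block 16 is the first block of the double
indirect level and maps to the path {13, 0, 0}. Blocks 21 and 25 map to
{13, 1, 1} and {13, 2, 1}: they share the top slot (level 0) but diverge
at level 1, so a branch hanging off level 1 of the start path does not
contain the end and would be safe to free in case 4.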

Signed-off-by: Omar Sandoval <osandov@osandov.com>

authored by Omar Sandoval and committed by Theodore Ts'o 6f30b7e3 2d5b86e0

Changed files
+73 -36
fs/ext4/indirect.c
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -1393,10 +1393,7 @@
 				 * to free. Everything was covered by the start
 				 * of the range.
 				 */
-				return 0;
-			} else {
-				/* Shared branch grows from an indirect block */
-				partial2--;
+				goto do_indirects;
 			}
 		} else {
 			/*
@@ -1424,56 +1427,96 @@
 	/* Punch happened within the same level (n == n2) */
 	partial = ext4_find_shared(inode, n, offsets, chain, &nr);
 	partial2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
-	/*
-	 * ext4_find_shared returns Indirect structure which
-	 * points to the last element which should not be
-	 * removed by truncate. But this is end of the range
-	 * in punch_hole so we need to point to the next element
-	 */
-	partial2->p++;
-	while ((partial > chain) || (partial2 > chain2)) {
-		/* We're at the same block, so we're almost finished */
-		if ((partial->bh && partial2->bh) &&
-		    (partial->bh->b_blocknr == partial2->bh->b_blocknr)) {
-			if ((partial > chain) && (partial2 > chain2)) {
-				ext4_free_branches(handle, inode, partial->bh,
-						   partial->p + 1,
-						   partial2->p,
-						   (chain+n-1) - partial);
-				BUFFER_TRACE(partial->bh, "call brelse");
-				brelse(partial->bh);
-				BUFFER_TRACE(partial2->bh, "call brelse");
-				brelse(partial2->bh);
+
+	/* Free top, but only if partial2 isn't its subtree. */
+	if (nr) {
+		int level = min(partial - chain, partial2 - chain2);
+		int i;
+		int subtree = 1;
+
+		for (i = 0; i <= level; i++) {
+			if (offsets[i] != offsets2[i]) {
+				subtree = 0;
+				break;
 			}
+		}
+
+		if (!subtree) {
+			if (partial == chain) {
+				/* Shared branch grows from the inode */
+				ext4_free_branches(handle, inode, NULL,
+						   &nr, &nr+1,
+						   (chain+n-1) - partial);
+				*partial->p = 0;
+			} else {
+				/* Shared branch grows from an indirect block */
+				BUFFER_TRACE(partial->bh, "get_write_access");
+				ext4_free_branches(handle, inode, partial->bh,
+						   partial->p,
+						   partial->p+1,
+						   (chain+n-1) - partial);
+			}
+		}
+	}
+
+	if (!nr2) {
+		/*
+		 * ext4_find_shared returns Indirect structure which
+		 * points to the last element which should not be
+		 * removed by truncate. But this is end of the range
+		 * in punch_hole so we need to point to the next element
+		 */
+		partial2->p++;
+	}
+
+	while (partial > chain || partial2 > chain2) {
+		int depth = (chain+n-1) - partial;
+		int depth2 = (chain2+n2-1) - partial2;
+
+		if (partial > chain && partial2 > chain2 &&
+		    partial->bh->b_blocknr == partial2->bh->b_blocknr) {
+			/*
+			 * We've converged on the same block. Clear the range,
+			 * then we're done.
+			 */
+			ext4_free_branches(handle, inode, partial->bh,
+					   partial->p + 1,
+					   partial2->p,
+					   (chain+n-1) - partial);
+			BUFFER_TRACE(partial->bh, "call brelse");
+			brelse(partial->bh);
+			BUFFER_TRACE(partial2->bh, "call brelse");
+			brelse(partial2->bh);
 			return 0;
 		}
+
 		/*
-		 * Clear the ends of indirect blocks on the shared branch
-		 * at the start of the range
+		 * The start and end partial branches may not be at the same
+		 * level even though the punch happened within one level. So, we
+		 * give them a chance to arrive at the same level, then walk
+		 * them in step with each other until we converge on the same
+		 * block.
 		 */
-		if (partial > chain) {
+		if (partial > chain && depth <= depth2) {
 			ext4_free_branches(handle, inode, partial->bh,
-					  partial->p + 1,
-					  (__le32 *)partial->bh->b_data+addr_per_block,
-					  (chain+n-1) - partial);
+					   partial->p + 1,
+					   (__le32 *)partial->bh->b_data+addr_per_block,
+					   (chain+n-1) - partial);
 			BUFFER_TRACE(partial->bh, "call brelse");
 			brelse(partial->bh);
 			partial--;
 		}
-		/*
-		 * Clear the ends of indirect blocks on the shared branch
-		 * at the end of the range
-		 */
-		if (partial2 > chain2) {
+		if (partial2 > chain2 && depth2 <= depth) {
 			ext4_free_branches(handle, inode, partial2->bh,
 					   (__le32 *)partial2->bh->b_data,
 					   partial2->p,
-					   (chain2+n-1) - partial2);
+					   (chain2+n2-1) - partial2);
 			BUFFER_TRACE(partial2->bh, "call brelse");
 			brelse(partial2->bh);
 			partial2--;
 		}
 	}
+	return 0;
 
 do_indirects:
 	/* Kill the remaining (whole) subtrees */
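
The depth comparison in the new loop is what keeps the two chains from
missing each other. A toy model of just that stepping rule may help. It
is a sketch under stated assumptions: walk_in_step is a hypothetical
helper, idx and idx2 stand in for (partial - chain) and
(partial2 - chain2), and the block freeing and the convergence check on
b_blocknr are left out entirely.

```c
#include <stddef.h>

/*
 * Toy model of the lockstep walk for a punch within one level, where
 * both paths have the same length n. On each pass the cursor with
 * fewer levels below it moves one step toward the root; when both
 * have the same number of levels below them, they move together.
 * Every visited (idx, idx2) pair is written to trace; the return
 * value is the number of pairs recorded (at most cap).
 */
static size_t walk_in_step(int n, int idx, int idx2,
			   int trace[][2], size_t cap)
{
	size_t steps = 0;

	while ((idx > 0 || idx2 > 0) && steps < cap) {
		/* Computed once per pass, as in the patch. */
		int depth = (n - 1) - idx;   /* levels below start cursor */
		int depth2 = (n - 1) - idx2; /* levels below end cursor */

		trace[steps][0] = idx;
		trace[steps][1] = idx2;
		steps++;

		if (idx > 0 && depth <= depth2)
			idx--;
		if (idx2 > 0 && depth2 <= depth)
			idx2--;
	}
	return steps;
}
```

Starting from n = 4 with the start cursor at index 3 and the end cursor
at index 1, the recorded pairs are (3, 1), (2, 1), (1, 1): only the
start cursor moves until both sit at the same level, and from then on
they step together. Without the depth tests, both cursors would move on
every pass and the pair (1, 1) would never be visited.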