OCFS2: Allow huge (> 16 TiB) volumes to mount

The OCFS2 developers have already done all of the hard work to allow
volumes larger than 16 TiB. But there is still a "sanity check" in
fs/ocfs2/super.c that prevents the mounting of such volumes, even when
the cluster size and journal options would allow it.
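
For context, the 16 TiB figure is what 32-bit block numbers can cover at a 4 KiB block size. The block size is an assumption here, since the message above does not state it. A minimal sketch of that arithmetic in user-space C, using a hypothetical helper name that does not appear in the patch:

```c
#include <stdint.h>

/* Hypothetical helper (not part of the patch): the largest volume size,
 * in bytes, that can be covered when block numbers are limited to 32 bits.
 * blocksize_bits is log2 of the block size, e.g. 12 for 4 KiB blocks. */
uint64_t max_bytes_32bit_blocks(unsigned blocksize_bits)
{
	/* 2^32 addressable blocks, each 2^blocksize_bits bytes long. */
	return (UINT64_C(1) << 32) << blocksize_bits;
}
```

With blocksize_bits = 12 this gives 2^44 bytes, which is 16 TiB.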

This patch replaces that sanity check with a more sophisticated one that
allows a huge volume to mount provided that (a) it is addressable by the
raw word/address size of the system (borrowing a test from ext4); (b) the
volume is using JBD2; and (c) the JBD2_FEATURE_INCOMPAT_64BIT flag is
set on the journal.
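
The three conditions can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel code: struct journal_model, the model_* function names, and the sector_bits parameter (standing in for the width of the kernel's sector type) are all hypothetical, and condition (a) is simplified to a single sector-number range check, so any further checks made by the real kernel helper are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the journal state the kernel would consult. */
struct journal_model {
	bool is_jbd2;   /* condition (b): the volume uses JBD2 */
	bool has_64bit; /* condition (c): JBD2_FEATURE_INCOMPAT_64BIT is set */
};

/* Condition (a), simplified: the last block must map to a 512-byte sector
 * number that fits in sector_bits bits (assumed to be 32 or 64). */
int model_check_addressable(unsigned blocksize_bits, uint64_t num_blocks,
			    unsigned sector_bits)
{
	uint64_t max_sector;

	assert(sector_bits >= 1 && sector_bits <= 64);
	if (num_blocks == 0)
		return 0;
	if (blocksize_bits < 9)
		return -EINVAL;
	max_sector = (sector_bits == 64) ? UINT64_MAX
					 : (UINT64_C(1) << sector_bits) - 1;
	if (num_blocks - 1 > (max_sector >> (blocksize_bits - 9)))
		return -EFBIG;
	return 0;
}

/* Conditions (b) and (c): a volume whose last block number fits in 32 bits
 * is always fine; a larger one needs JBD2 with 64-bit block support. */
int model_journal_addressable(uint64_t num_blocks,
			      const struct journal_model *j)
{
	if (num_blocks == 0 || num_blocks - 1 <= UINT32_MAX)
		return 0;
	if (j->is_jbd2 && j->has_64bit)
		return 0;
	return -EFBIG;
}

/* Combined decision: 0 if the volume may mount, a negative errno if not. */
int model_can_mount(unsigned blocksize_bits, uint64_t num_blocks,
		    unsigned sector_bits, const struct journal_model *j)
{
	int status = model_check_addressable(blocksize_bits, num_blocks,
					     sector_bits);
	if (status)
		return status;
	return model_journal_addressable(num_blocks, j);
}
```

In this model, a volume of 2^32 + 1 blocks with 4 KiB blocks is rejected with -EFBIG unless the journal has both flags set and sector_bits is 64.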

I factored out the sanity check into its own function. I also moved it
from ocfs2_initialize_super() down to ocfs2_check_volume(); any earlier,
and the journal will not have been initialized yet.

This patch is one of a pair, and it depends on the other ("JBD2: Allow
feature checks before journal recovery").

I have tested this patch on small volumes, huge volumes, and huge
volumes without 64-bit block support in the journal. All of them appear
to work or to fail gracefully, as appropriate.

Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

Authored by Patrick J. LoPresti and committed by Joel Becker
3bdb8efd 1113e1b5

+46 -5
fs/ocfs2/super.c
···
 	return 0;
 }
 
+/* Make sure entire volume is addressable by our journal. Requires
+   osb_clusters_at_boot to be valid and for the journal to have been
+   initialized by ocfs2_journal_init(). */
+static int ocfs2_journal_addressable(struct ocfs2_super *osb)
+{
+	int status = 0;
+	u64 max_block =
+		ocfs2_clusters_to_blocks(osb->sb,
+					 osb->osb_clusters_at_boot) - 1;
+
+	/* 32-bit block number is always OK. */
+	if (max_block <= (u32)~0ULL)
+		goto out;
+
+	/* Volume is "huge", so see if our journal is new enough to
+	   support it. */
+	if (!(OCFS2_HAS_COMPAT_FEATURE(osb->sb,
+				       OCFS2_FEATURE_COMPAT_JBD2_SB) &&
+	      jbd2_journal_check_used_features(osb->journal->j_journal, 0, 0,
+					       JBD2_FEATURE_INCOMPAT_64BIT))) {
+		mlog(ML_ERROR, "The journal cannot address the entire volume. "
+		     "Enable the 'block64' journal option with tunefs.ocfs2");
+		status = -EFBIG;
+		goto out;
+	}
+
+ out:
+	return status;
+}
+
 static int ocfs2_initialize_super(struct super_block *sb,
 				  struct buffer_head *bh,
 				  int sector_size,
···
 	struct ocfs2_journal *journal;
 	__le32 uuid_net_key;
 	struct ocfs2_super *osb;
+	u64 total_blocks;
 
 	mlog_entry_void();
 
···
 		goto bail;
 	}
 
-	if (ocfs2_clusters_to_blocks(osb->sb, le32_to_cpu(di->i_clusters) - 1)
-	    > (u32)~0UL) {
-		mlog(ML_ERROR, "Volume might try to write to blocks beyond "
-		     "what jbd can address in 32 bits.\n");
-		status = -EINVAL;
+	total_blocks = ocfs2_clusters_to_blocks(osb->sb,
+						le32_to_cpu(di->i_clusters));
+
+	status = generic_check_addressable(osb->sb->s_blocksize_bits,
+					   total_blocks);
+	if (status) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
 		goto bail;
 	}
 
···
 		mlog(ML_ERROR, "Could not initialize journal!\n");
 		goto finally;
 	}
+
+	/* Now that journal has been initialized, check to make sure
+	   entire volume is addressable. */
+	status = ocfs2_journal_addressable(osb);
+	if (status)
+		goto finally;
 
 	/* If the journal was unmounted cleanly then we don't want to
 	 * recover anything. Otherwise, journal_load will do that