IB/mthca: Fix and simplify page size calculation in mthca_reg_phys_mr()

In mthca_reg_phys_mr(), we calculate the page size for the HCA
hardware to use to map the buffer list passed in by the consumer.
For example, if the consumer passes in

[0] addr 0x1000, size 0x1000
[1] addr 0x2000, size 0x1000

then the algorithm would come up with a page size of 0x2000 and a list
of two pages, at 0x0000 and 0x2000. Usually, this would work fine
since the memory region would start at an offset of 0x1000 and have a
length of 0x2000.
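
(For reference, the removed multi-buffer branch of that selection loop
can be replayed outside the driver.  The sketch below is user-space C,
not driver code: PAGE_SHIFT is assumed to be 12, the kernel's u64
becomes uint64_t, and the struct and variable names are only for
illustration.  It arrives at the same shift of 13, i.e. a page size of
0x2000.)

#include <stdio.h>
#include <stdint.h>

struct buf { uint64_t addr, size; };

int main(void)
{
        /* The consumer's buffer list from the example above. */
        struct buf buffer_list[] = { { 0x1000, 0x1000 }, { 0x2000, 0x1000 } };
        int num_phys_buf = 2, shift, i;
        int page_shift = 12;            /* assumed PAGE_SHIFT */
        uint64_t mask = 0;

        /*
         * Old code: every buffer boundary except the start of the first
         * buffer and the end of the last one is OR'd into a mask of bits
         * that must lie below the chosen page size.
         */
        for (i = 0; i < num_phys_buf; ++i) {
                if (i != 0)
                        mask |= buffer_list[i].addr;
                if (i != num_phys_buf - 1)
                        mask |= buffer_list[i].addr + buffer_list[i].size;
        }

        /* Old code: walk up from PAGE_SHIFT until a masked bit is hit. */
        for (shift = page_shift; shift < 31; ++shift)
                if ((1ULL << shift) & mask)
                        break;

        /* Prints "shift 13, page size 0x2000". */
        printf("shift %d, page size 0x%llx\n", shift, 1ULL << shift);
        return 0;
}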

However, the old code did not take into account the alignment of the
IO virtual address passed in. For example, if the consumer passed in
a virtual address of 0x6000 for the above, then the offset of 0x1000
would not be used correctly because the page mask of 0x1fff would
result in an offset of 0.
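
(To make the mismatch concrete: with 0x2000 pages the data really
begins 0x1000 into its page, but the offset derived from the IO
virtual address 0x6000 is 0.  A few throwaway lines of user-space C,
with made-up variable names, show the two offsets side by side.)

#include <stdio.h>

int main(void)
{
        unsigned long iova       = 0x6000;  /* IO virtual address from the consumer */
        unsigned long first_addr = 0x1000;  /* start of the first buffer */
        unsigned long page_size  = 0x2000;  /* chosen by the old algorithm */
        unsigned long page_mask  = page_size - 1;

        /* The data really starts 0x1000 into the first HCA page... */
        printf("offset of data in its page: 0x%lx\n", first_addr & page_mask);
        /* ...but the offset implied by the IO virtual address is 0. */
        printf("offset implied by the iova: 0x%lx\n", iova & page_mask);
        return 0;
}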

We can fix this quite neatly by making sure that the page shift we use
is no bigger than the position of the lowest bit in which the start of
the first buffer and the IO virtual address differ.  We can also
simplify the code by dropping the special case for a single buffer,
since it does no harm there to use a page size that is bigger than
strictly necessary.  With both changes, the loop that computed the
page shift collapses into a single call to __ffs().
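
(The new calculation can be sanity-checked in user space as well.  In
the sketch below __builtin_ctzl() stands in for the kernel's __ffs(),
unsigned long is assumed to be wide enough for the example addresses,
and only the first buffer's address is folded into the mask.)

#include <stdio.h>

int main(void)
{
        unsigned long iova  = 0x6000;   /* IO virtual address from the consumer */
        unsigned long addr0 = 0x1000;   /* start of the first buffer */
        unsigned long mask;
        int shift;

        /*
         * New code: any bit in which the first buffer's address and the
         * IO virtual address differ must fall below the page size, so
         * their XOR seeds the mask.  (In the driver the interior buffer
         * boundaries are OR'd in as well; for this example they would
         * not change the result, so the sketch skips them.)
         */
        mask = addr0 ^ iova;

        /*
         * New code: the page shift is the lowest set bit of the mask.
         * OR'ing in 1 << 31 bounds the result when the mask is otherwise
         * zero; __builtin_ctzl() plays the role of the kernel's __ffs().
         */
        shift = __builtin_ctzl(mask | 1UL << 31);

        /*
         * Prints "shift 12, page size 0x1000": with 0x1000 pages the
         * buffer start and the IO virtual address have the same offset
         * within a page, so the mapping comes out right.
         */
        printf("shift %d, page size 0x%lx\n", shift, 1UL << shift);
        return 0;
}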

Thanks to Bryan S Rosenburg <rosnbrg@us.ibm.com> for pointing out the
original bug and suggesting several ways to improve this patch.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

+3 -17
drivers/infiniband/hw/mthca/mthca_provider.c
···
         struct mthca_mr *mr;
         u64 *page_list;
         u64 total_size;
-        u64 mask;
+        unsigned long mask;
         int shift;
         int npages;
         int err;
         int i, j, n;

-        /* First check that we have enough alignment */
-        if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK))
-                return ERR_PTR(-EINVAL);
-
-        mask = 0;
+        mask = buffer_list[0].addr ^ *iova_start;
         total_size = 0;
         for (i = 0; i < num_phys_buf; ++i) {
                 if (i != 0)
···
         if (mask & ~PAGE_MASK)
                 return ERR_PTR(-EINVAL);

-        /* Find largest page shift we can use to cover buffers */
-        for (shift = PAGE_SHIFT; shift < 31; ++shift)
-                if (num_phys_buf > 1) {
-                        if ((1ULL << shift) & mask)
-                                break;
-                } else {
-                        if (1ULL << shift >=
-                            buffer_list[0].size +
-                            (buffer_list[0].addr & ((1ULL << shift) - 1)))
-                                break;
-                }
+        shift = __ffs(mask | 1 << 31);

         buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1);
         buffer_list[0].addr &= ~0ull << shift;