Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

iommu: inline iommu_num_pages

A profile of a network benchmark showed iommu_num_pages rather high up:

0.52% iommu_num_pages

The annotated profile shows that an integer divide takes almost all of the time:

%
: c000000000376ea4 <.iommu_num_pages>:
1.93 : c000000000376ea4: fb e1 ff f8 std r31,-8(r1)
0.00 : c000000000376ea8: f8 21 ff c1 stdu r1,-64(r1)
0.00 : c000000000376eac: 7c 3f 0b 78 mr r31,r1
3.86 : c000000000376eb0: 38 84 ff ff addi r4,r4,-1
0.00 : c000000000376eb4: 38 05 ff ff addi r0,r5,-1
0.00 : c000000000376eb8: 7c 84 2a 14 add r4,r4,r5
46.95 : c000000000376ebc: 7c 00 18 38 and r0,r0,r3
45.66 : c000000000376ec0: 7c 84 02 14 add r4,r4,r0
0.00 : c000000000376ec4: 7c 64 2b 92 divdu r3,r4,r5
0.00 : c000000000376ec8: 38 3f 00 40 addi r1,r31,64
0.00 : c000000000376ecc: eb e1 ff f8 ld r31,-8(r1)
1.61 : c000000000376ed0: 4e 80 00 20 blr

Since every caller of iommu_num_pages passes a constant power of two as
io_page_size, we can inline the function so that the compiler replaces
the divide with a shift. The entire function is only a few instructions
once optimised, so it is a good candidate for inlining overall.
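The equivalence the message relies on can be sketched in user space. This is a minimal illustration, not kernel code: DIV_ROUND_UP is redefined locally to mirror the kernel macro, and EXAMPLE_IO_PAGE_SHIFT, EXAMPLE_IO_PAGE_SIZE, num_pages_by_shift and shift_matches_divide are hypothetical names introduced only for this sketch. It assumes a 4 KiB IOMMU page size.

```c
/* Local stand-in for the kernel's DIV_ROUND_UP (rounding-up division). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical page geometry for this sketch: 4 KiB IOMMU pages. */
#define EXAMPLE_IO_PAGE_SHIFT 12
#define EXAMPLE_IO_PAGE_SIZE  (1UL << EXAMPLE_IO_PAGE_SHIFT)

/* The helper as the patch defines it in the header. */
static inline unsigned long iommu_num_pages(unsigned long addr,
                                            unsigned long len,
                                            unsigned long io_page_size)
{
	/* Offset of addr within its page, plus the length of the mapping. */
	unsigned long size = (addr & (io_page_size - 1)) + len;

	return DIV_ROUND_UP(size, io_page_size);
}

/*
 * Hand-written shift form (hypothetical helper). When io_page_size is a
 * compile-time constant power of two, an optimising compiler can reduce
 * the division in the inlined helper to a shift like this one.
 */
static inline unsigned long num_pages_by_shift(unsigned long addr,
                                               unsigned long len)
{
	unsigned long size = (addr & (EXAMPLE_IO_PAGE_SIZE - 1)) + len;

	return (size + EXAMPLE_IO_PAGE_SIZE - 1) >> EXAMPLE_IO_PAGE_SHIFT;
}

/* Returns 1 if both forms agree over a sample of addresses and lengths. */
static int shift_matches_divide(void)
{
	unsigned long addr, len;

	for (addr = 0; addr < 3 * EXAMPLE_IO_PAGE_SIZE; addr += 509)
		for (len = 0; len < 3 * EXAMPLE_IO_PAGE_SIZE; len += 251)
			if (iommu_num_pages(addr, len, EXAMPLE_IO_PAGE_SIZE) !=
			    num_pages_by_shift(addr, len))
				return 0;
	return 1;
}
```

With the out-of-line version in lib/iommu-helper.c, io_page_size is just a run-time argument inside the callee, so the compiler has to emit a real divide there. Once the body is visible at each call site, the constant propagates and the divide can be strength-reduced.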

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Anton Blanchard, committed by Linus Torvalds
e269b085 85c9fe8f

+10 -11

include/linux/iommu-helper.h (+10 -2)
···
 #ifndef _LINUX_IOMMU_HELPER_H
 #define _LINUX_IOMMU_HELPER_H
 
+#include <linux/kernel.h>
+
 static inline unsigned long iommu_device_max_index(unsigned long size,
                                                    unsigned long offset,
                                                    u64 dma_mask)
···
                                   unsigned long boundary_size,
                                   unsigned long align_mask);
 
-extern unsigned long iommu_num_pages(unsigned long addr, unsigned long len,
-                                     unsigned long io_page_size);
+static inline unsigned long iommu_num_pages(unsigned long addr,
+                                            unsigned long len,
+                                            unsigned long io_page_size)
+{
+	unsigned long size = (addr & (io_page_size - 1)) + len;
+
+	return DIV_ROUND_UP(size, io_page_size);
+}
 
 #endif

lib/iommu-helper.c (-9)
···
 	return -1;
 }
 EXPORT_SYMBOL(iommu_area_alloc);
-
-unsigned long iommu_num_pages(unsigned long addr, unsigned long len,
-                              unsigned long io_page_size)
-{
-	unsigned long size = (addr & (io_page_size - 1)) + len;
-
-	return DIV_ROUND_UP(size, io_page_size);
-}
-EXPORT_SYMBOL(iommu_num_pages);