Kernel-provided User Helpers
============================

These are segments of kernel-provided user code reachable from user space
at a fixed address in kernel memory.  This is used to provide user space
with some operations which require kernel help because of unimplemented
native features and/or instructions in many ARM CPUs.  The idea is for this
code to be executed directly in user mode for best efficiency, while being
too intimate with its kernel counterpart to be left to user libraries.
In fact this code might even differ from one CPU to another depending on
the available instruction set, or whether it is an SMP system.  In other
words, the kernel reserves the right to change this code as needed without
warning.  Only the entry points and their results as documented here are
guaranteed to be stable.

This is different from (but doesn't preclude) a full blown VDSO
implementation; however a VDSO would prevent some assembly tricks with
constants that allow for efficient branching to those code segments.  And
since those code segments only use a few cycles before returning to user
code, the indirect far call through a VDSO would add measurable overhead
to such minimalistic operations.

User space is expected to bypass those helpers and implement those things
inline (either in the code emitted directly by the compiler, or as part of
the implementation of a library call) when optimizing for a recent enough
processor that has the necessary native support, but only if the resulting
binaries are already going to be incompatible with earlier ARM processors
due to the use of similar native instructions for other things.
In other words,
don't make binaries unable to run on earlier processors just for the sake
of not using these kernel helpers if your compiled code is not going to
use new instructions for other purposes.

New helpers may be added over time, so an older kernel may be missing some
helpers present in a newer kernel.  For this reason, programs must check
the value of __kuser_helper_version (see below) before assuming that it is
safe to call any particular helper.  This check should ideally be
performed only once at process startup time, and execution aborted early
if the required helpers are not provided by the kernel version that
process is running on.

kuser_helper_version
--------------------

Location:	0xffff0ffc

Reference declaration:

  extern int32_t __kuser_helper_version;

Definition:

  This field contains the number of helpers being implemented by the
  running kernel.  User space may read this to determine the availability
  of a particular helper.

Usage example:

#define __kuser_helper_version (*(int32_t *)0xffff0ffc)

void check_kuser_version(void)
{
	if (__kuser_helper_version < 2) {
		fprintf(stderr, "can't do atomic operations, kernel too old\n");
		abort();
	}
}

Notes:

  User space may assume that the value of this field never changes
  during the lifetime of any single process.
This means that this
  field can be read once during the initialisation of a library or
  the startup phase of a program.

kuser_get_tls
-------------

Location:	0xffff0fe0

Reference prototype:

  void * __kuser_get_tls(void);

Input:

  lr = return address

Output:

  r0 = TLS value

Clobbered registers:

  none

Definition:

  Get the TLS value as previously set via the __ARM_NR_set_tls syscall.

Usage example:

typedef void * (__kuser_get_tls_t)(void);
#define __kuser_get_tls (*(__kuser_get_tls_t *)0xffff0fe0)

void foo()
{
	void *tls = __kuser_get_tls();
	printf("TLS = %p\n", tls);
}

Notes:

  - Valid only if __kuser_helper_version >= 1 (from kernel version 2.6.12).

kuser_cmpxchg
-------------

Location:	0xffff0fc0

Reference prototype:

  int __kuser_cmpxchg(int32_t oldval, int32_t newval, volatile int32_t *ptr);

Input:

  r0 = oldval
  r1 = newval
  r2 = ptr
  lr = return address

Output:

  r0 = success code (zero or non-zero)
  C flag = set if r0 == 0, clear if r0 != 0

Clobbered registers:

  r3, ip, flags

Definition:

  Atomically store newval in *ptr only if *ptr is equal to oldval.
  Return zero if *ptr was changed or non-zero if no exchange happened.
  The C flag is also set if *ptr was changed to allow for assembly
  optimization in the calling code.

Usage example:

typedef int (__kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
#define __kuser_cmpxchg (*(__kuser_cmpxchg_t *)0xffff0fc0)

int atomic_add(volatile int *ptr, int val)
{
	int old, new;

	do {
		old = *ptr;
		new = old + val;
	}
	while (__kuser_cmpxchg(old, new, ptr));

	return new;
}

Notes:

  - This routine already includes memory barriers as needed.

  - Valid only if __kuser_helper_version >= 2 (from kernel version 2.6.12).

kuser_memory_barrier
--------------------

Location:	0xffff0fa0

Reference prototype:

  void __kuser_memory_barrier(void);

Input:

  lr = return address

Output:

  none

Clobbered registers:

  none

Definition:

  Apply any needed memory barrier to preserve consistency with data modified
  manually and with __kuser_cmpxchg usage.

Usage example:

typedef void (__kuser_dmb_t)(void);
#define __kuser_dmb (*(__kuser_dmb_t *)0xffff0fa0)

Notes:

  - Valid only if __kuser_helper_version >= 3 (from kernel version 2.6.15).

kuser_cmpxchg64
---------------

Location:	0xffff0f60

Reference prototype:

  int __kuser_cmpxchg64(const int64_t *oldval,
                        const int64_t *newval,
                        volatile int64_t *ptr);

Input:

  r0 = pointer to oldval
  r1 = pointer to newval
  r2 = pointer to target value
  lr = return address

Output:

  r0 = success code (zero or non-zero)
  C flag = set if r0 == 0, clear if r0 != 0

Clobbered registers:

  r3, lr, flags

Definition:

  Atomically store the 64-bit value pointed to by newval in *ptr only if
  *ptr is equal to the 64-bit value pointed to by oldval.
Return zero if *ptr was236236+ changed or non-zero if no exchange happened.237237+238238+ The C flag is also set if *ptr was changed to allow for assembly239239+ optimization in the calling code.240240+241241+Usage example:242242+243243+typedef int (__kuser_cmpxchg64_t)(const int64_t *oldval,244244+ const int64_t *newval,245245+ volatile int64_t *ptr);246246+#define __kuser_cmpxchg64 (*(__kuser_cmpxchg64_t *)0xffff0f60)247247+248248+int64_t atomic_add64(volatile int64_t *ptr, int64_t val)249249+{250250+ int64_t old, new;251251+252252+ do {253253+ old = *ptr;254254+ new = old + val;255255+ } while(__kuser_cmpxchg64(&old, &new, ptr));256256+257257+ return new;258258+}259259+260260+Notes:261261+262262+ - This routine already includes memory barriers as needed.263263+264264+ - Due to the length of this sequence, this spans 2 conventional kuser265265+ "slots", therefore 0xffff0f80 is not used as a valid entry point.266266+267267+ - Valid only if __kuser_helper_version >= 5 (from kernel version 3.1).
arch/arm/kernel/entry-armv.S (94 additions, 152 deletions):
···
 	.endm

 	.macro	kuser_cmpxchg_check
-#if __LINUX_ARM_ARCH__ < 6 && !defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)
+#if !defined(CONFIG_CPU_32v6K) && !defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)
 #ifndef CONFIG_MMU
 #warning "NPTL on non MMU needs fixing"
 #else
···
 	@ perform a quick test inline since it should be false
 	@ 99.9999% of the time.  The rest is done out of line.
 	cmp	r2, #TASK_SIZE
-	blhs	kuser_cmpxchg_fixup
+	blhs	kuser_cmpxchg64_fixup
 #endif
 #endif
 	.endm
···
 /*
  * User helpers.
  *
- * These are segment of kernel provided user code reachable from user space
- * at a fixed address in kernel memory.  This is used to provide user space
- * with some operations which require kernel help because of unimplemented
- * native feature and/or instructions in many ARM CPUs.  The idea is for
- * this code to be executed directly in user mode for best efficiency but
- * which is too intimate with the kernel counter part to be left to user
- * libraries.  In fact this code might even differ from one CPU to another
- * depending on the available instruction set and restrictions like on
- * SMP systems.  In other words, the kernel reserves the right to change
- * this code as needed without warning.  Only the entry points and their
- * results are guaranteed to be stable.
- *
  * Each segment is 32-byte aligned and will be moved to the top of the high
  * vector page.  New segments (if ever needed) must be added in front of
  * existing ones.
This mechanism should be used only for things that are776764 * really small and justified, and not be abused freely.777765 *778778- * User space is expected to implement those things inline when optimizing779779- * for a processor that has the necessary native support, but only if such780780- * resulting binaries are already to be incompatible with earlier ARM781781- * processors due to the use of unsupported instructions other than what782782- * is provided here. In other words don't make binaries unable to run on783783- * earlier processors just for the sake of not using these kernel helpers784784- * if your compiled code is not going to use the new instructions for other785785- * purpose.766766+ * See Documentation/arm/kernel_user_helpers.txt for formal definitions.786767 */787768 THUMB( .arm )788769···780799__kuser_helper_start:781800782801/*783783- * Reference prototype:784784- *785785- * void __kernel_memory_barrier(void)786786- *787787- * Input:788788- *789789- * lr = return address790790- *791791- * Output:792792- *793793- * none794794- *795795- * Clobbered:796796- *797797- * none798798- *799799- * Definition and user space usage example:800800- *801801- * typedef void (__kernel_dmb_t)(void);802802- * #define __kernel_dmb (*(__kernel_dmb_t *)0xffff0fa0)803803- *804804- * Apply any needed memory barrier to preserve consistency with data modified805805- * manually and __kuser_cmpxchg usage.806806- *807807- * This could be used as follows:808808- *809809- * #define __kernel_dmb() \810810- * asm volatile ( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #95" \811811- * : : : "r0", "lr","cc" )802802+ * Due to the length of some sequences, __kuser_cmpxchg64 spans 2 regular803803+ * kuser "slots", therefore 0xffff0f80 is not used as a valid entry point.812804 */805805+806806+__kuser_cmpxchg64: @ 0xffff0f60807807+808808+#if defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)809809+810810+ /*811811+ * Poor you. 
+	 * No fast solution possible...
+	 * The kernel itself must perform the operation.
+	 * A special ghost syscall is used for that (see traps.c).
+	 */
+	stmfd	sp!, {r7, lr}
+	ldr	r7, 1f			@ it's 20 bits
+	swi	__ARM_NR_cmpxchg64
+	ldmfd	sp!, {r7, pc}
+1:	.word	__ARM_NR_cmpxchg64
+
+#elif defined(CONFIG_CPU_32v6K)
+
+	stmfd	sp!, {r4, r5, r6, r7}
+	ldrd	r4, r5, [r0]			@ load old val
+	ldrd	r6, r7, [r1]			@ load new val
+	smp_dmb	arm
+1:	ldrexd	r0, r1, [r2]			@ load current val
+	eors	r3, r0, r4			@ compare with oldval (1)
+	eoreqs	r3, r1, r5			@ compare with oldval (2)
+	strexdeq r3, r6, r7, [r2]		@ store newval if eq
+	teqeq	r3, #1				@ success?
+	beq	1b				@ if no then retry
+	smp_dmb	arm
+	rsbs	r0, r3, #0			@ set returned val and C flag
+	ldmfd	sp!, {r4, r5, r6, r7}
+	bx	lr
+
+#elif !defined(CONFIG_SMP)
+
+#ifdef CONFIG_MMU
+
+	/*
+	 * The only thing that can break atomicity in this cmpxchg64
+	 * implementation is either an IRQ or a data abort exception
+	 * causing another process/thread to be scheduled in the middle of
+	 * the critical sequence.  The same strategy as for cmpxchg is used.
+	 */
+	stmfd	sp!, {r4, r5, r6, lr}
+	ldmia	r0, {r4, r5}			@ load old val
+	ldmia	r1, {r6, lr}			@ load new val
+1:	ldmia	r2, {r0, r1}			@ load current val
+	eors	r3, r0, r4			@ compare with oldval (1)
+	eoreqs	r3, r1, r5			@ compare with oldval (2)
+2:	stmeqia	r2, {r6, lr}			@ store newval if eq
+	rsbs	r0, r3, #0			@ set return val and C flag
+	ldmfd	sp!, {r4, r5, r6, pc}
+
+	.text
+kuser_cmpxchg64_fixup:
+	@ Called from kuser_cmpxchg_fixup.
+	@ r2 = address of interrupted insn (must be preserved).
+	@ sp = saved regs.
+	@ r7 and r8 are clobbered.
+	@ 1b = first critical insn, 2b = last critical insn.
+	@ If r2 >= 1b and r2 <= 2b then saved pc_usr is set to 1b.
+	mov	r7, #0xffff0fff
+	sub	r7, r7, #(0xffff0fff - (0xffff0f60 + (1b - __kuser_cmpxchg64)))
+	subs	r8, r2, r7
+	rsbcss	r8, r8, #(2b - 1b)
+	strcs	r7, [sp, #S_PC]
+#if __LINUX_ARM_ARCH__ < 6
+	bcc	kuser_cmpxchg32_fixup
+#endif
+	mov	pc, lr
+	.previous
+
+#else
+#warning "NPTL on non MMU needs fixing"
+	mov	r0, #-1
+	adds	r0, r0, #0
+	usr_ret	lr
+#endif
+
+#else
+#error "incoherent kernel configuration"
+#endif
+
+	/* pad to next slot */
+	.rept	(16 - (. - __kuser_cmpxchg64)/4)
+	.word	0
+	.endr
+
+	.align	5

 __kuser_memory_barrier:				@ 0xffff0fa0
 	smp_dmb	arm
 	usr_ret	lr

 	.align	5
-
-/*
- * Reference prototype:
- *
- *	int __kernel_cmpxchg(int oldval, int newval, int *ptr)
- *
- * Input:
- *
- *	r0 = oldval
- *	r1 = newval
- *	r2 = ptr
- *	lr = return address
- *
- * Output:
- *
- *	r0 = returned value (zero or non-zero)
- *	C flag = set if r0 == 0, clear if r0 != 0
- *
- * Clobbered:
- *
- *	r3, ip, flags
- *
- * Definition and user space usage example:
- *
- *	typedef int (__kernel_cmpxchg_t)(int oldval, int newval, int *ptr);
- *	#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *)0xffff0fc0)
- *
- * Atomically store newval in *ptr if *ptr is equal to oldval for user space.
- * Return zero if *ptr was changed or non-zero if no exchange happened.
- * The C flag is also set if *ptr was changed to allow for assembly
- * optimization in the calling code.
- *
- * Notes:
- *
- *    - This routine already includes memory barriers as needed.
- *
- * For example, a user space
- * atomic_add implementation could look like this:
- *
- * #define atomic_add(ptr, val) \
- *	({ register unsigned int *__ptr asm("r2") = (ptr); \
- *	   register unsigned int __result asm("r1"); \
- *	   asm volatile ( \
- *	       "1: @ atomic_add\n\t" \
- *	       "ldr	r0, [r2]\n\t" \
- *	       "mov	r3, #0xffff0fff\n\t" \
- *	       "add	lr, pc, #4\n\t" \
- *	       "add	r1, r0, %2\n\t" \
- *	       "add	pc, r3, #(0xffff0fc0 - 0xffff0fff)\n\t" \
- *	       "bcc	1b" \
- *	       : "=&r" (__result) \
- *	       : "r" (__ptr), "rIL" (val) \
- *	       : "r0","r3","ip","lr","cc","memory" ); \
- *	   __result; })
- */

 __kuser_cmpxchg:				@ 0xffff0fc0
···
 	usr_ret	lr

 	.text
-kuser_cmpxchg_fixup:
+kuser_cmpxchg32_fixup:
 	@ Called from kuser_cmpxchg_check macro.
 	@ r2 = address of interrupted insn (must be preserved).
 	@ sp = saved regs.  r7 and r8 are clobbered.
···

 	.align	5

-/*
- * Reference prototype:
- *
- *	int __kernel_get_tls(void)
- *
- * Input:
- *
- *	lr = return address
- *
- * Output:
- *
- *	r0 = TLS value
- *
- * Clobbered:
- *
- *	none
- *
- * Definition and user space usage example:
- *
- *	typedef int (__kernel_get_tls_t)(void);
- *	#define __kernel_get_tls (*(__kernel_get_tls_t *)0xffff0fe0)
- *
- * Get the TLS value as previously set via the __ARM_NR_set_tls syscall.
- *
- * This could be used as follows:
- *
- * #define __kernel_get_tls() \
- *	({ register unsigned int __val asm("r0"); \
- *	   asm( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #31" \
- *	        : "=r" (__val) : : "lr","cc" ); \
- *	   __val; })
- */
-
 __kuser_get_tls:				@ 0xffff0fe0
 	ldr	r0, [pc, #(16 - 8)]	@ read TLS, set in kuser_get_tls_init
 	usr_ret	lr
···
 	.rep	4
 	.word	0	@ 0xffff0ff0 software TLS value, then
 	.endr		@ pad up
 			@ to __kuser_helper_version
-
-/*
- * Reference declaration:
- *
- *	extern unsigned int __kernel_helper_version;
- *
- * Definition and user space usage example:
- *
- *	#define __kernel_helper_version (*(unsigned int *)0xffff0ffc)
- *
- * User space may read this to determine the curent number of helpers
- * available.
- */

 __kuser_helper_version:				@ 0xffff0ffc
 	.word	((__kuser_helper_end - __kuser_helper_start) >> 5)