commits
Pull more Kbuild updates from Masahiro Yamada:
- improve boolinit.cocci and use_after_iter.cocci semantic patches
- fix alignment for kallsyms
- move 'asm goto' compiler test to Kconfig and clean up jump_label
CONFIG option
- generate asm-generic wrappers automatically if arch does not
implement mandatory UAPI headers
- remove redundant generic-y defines
- misc cleanups
* tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: rename generated .*conf-cfg to *conf-cfg
kbuild: remove unnecessary stubs for archheader and archscripts
kbuild: use assignment instead of define ... endef for filechk_* rules
arch: remove redundant UAPI generic-y defines
kbuild: generate asm-generic wrappers if mandatory headers are missing
arch: remove stale comments "UAPI Header export list"
riscv: remove redundant kernel-space generic-y
kbuild: change filechk to surround the given command with { }
kbuild: remove redundant target cleaning on failure
kbuild: clean up rule_dtc_dt_yaml
kbuild: remove UIMAGE_IN and UIMAGE_OUT
jump_label: move 'asm goto' support test to Kconfig
kallsyms: lower alignment on ARM
scripts: coccinelle: boolinit: drop warnings on named constants
scripts: coccinelle: check for redeclaration
kconfig: remove unused "file" field of yylval union
nds32: remove redundant kernel-space generic-y
nios2: remove unneeded HAS_DMA define
Pull perf tooling updates from Ingo Molnar:
"A final batch of perf tooling changes: mostly fixes and small
improvements"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
perf session: Add comment for perf_session__register_idle_thread()
perf thread-stack: Fix thread stack processing for the idle task
perf thread-stack: Allocate an array of thread stacks
perf thread-stack: Factor out thread_stack__init()
perf thread-stack: Allow for a thread stack array
perf thread-stack: Avoid direct reference to the thread's stack
perf thread-stack: Tidy thread_stack__bottom() usage
perf thread-stack: Simplify some code in thread_stack__process()
tools gpio: Allow overriding CFLAGS
tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
tools thermal tmon: Allow overriding CFLAGS assignments
tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
perf c2c: Increase the HITM ratio limit for displayed cachelines
perf c2c: Change the default coalesce setup
perf trace beauty ioctl: Beautify USBDEVFS_ commands
perf trace beauty: Export function to get the files for a thread
perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
tools headers uapi: Grab a copy of usbdevice_fs.h
perf trace: Store the major number for a file when storing its pathname
...
Remove the dot-prefixing since it is just a matter of the
.gitignore file.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The semantics of what "in core" means for the mincore() system call are
somewhat unclear, but Linux has always (since 2.3.52, which is when
mincore() was initially done) treated it as "page is available in page
cache" rather than "page is mapped in the mapping".
The problem with that traditional semantic is that it exposes a lot of
system cache state that it really probably shouldn't, and that users
shouldn't really even care about.
So let's try to avoid that information leak by simply changing the
semantics to be that mincore() counts actual mapped pages, not pages
that might be cheaply mapped if they were faulted (note the "might be"
part of the old semantics: being in the cache doesn't actually guarantee
that you can access them without IO anyway, since things like network
filesystems may have to revalidate the cache before use).
In many ways the old semantics were somewhat insane even aside from the
information leak issue. From the very beginning (and that beginning is
a long time ago: 2.3.52 was released in March 2000, I think), the code
had a comment saying
Later we can get more picky about what "in core" means precisely.
and this is that "later". Admittedly it is much later than is really
comfortable.
NOTE! This is a real semantic change, and it is for example known to
change the output of "fincore", since that program literally does an
mmap without populating it, and then does "mincore()" on that mapping
that doesn't actually have any pages in it.
I'm hoping that nobody actually has any workflow that cares, and the
info leak is real.
We may have to do something different if it turns out that people have
valid reasons to want the old semantics, and if we can limit the
information leak sanely.
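
The fincore-style access pattern described above can be sketched in user
space roughly as follows. This is only an illustration: the function name
demo_resident_pages() is hypothetical, and the count it returns depends on
which mincore() semantics the running kernel implements.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a file without touching it, then ask mincore() about the mapping.
 * Returns the number of pages reported as "in core", or -1 on error.
 * demo_resident_pages() is a hypothetical helper, not part of any tool.
 */
long demo_resident_pages(const char *path)
{
	long result = -1;
	long page = sysconf(_SC_PAGESIZE);
	struct stat st;
	size_t len, npages, i;
	unsigned char *vec;
	void *map;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (page <= 0 || fstat(fd, &st) < 0)
		goto out_close;
	if (st.st_size == 0) {
		result = 0;		/* nothing to map */
		goto out_close;
	}

	len = (size_t)st.st_size;
	npages = (len + (size_t)page - 1) / (size_t)page;

	/* No MAP_POPULATE and no reads: the mapping has no pages faulted in. */
	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	vec = malloc(npages);
	if (vec && mincore(map, len, vec) == 0) {
		result = 0;
		for (i = 0; i < npages; i++)
			result += vec[i] & 1;	/* bit 0: page is "in core" */
	}
	free(vec);
	munmap(map, len);
out_close:
	close(fd);
	return result;
}
```

With the old semantics the count reflects page cache state; with the
semantics described here it would be expected to be zero for a mapping
that was never touched.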
Cc: Kevin Easton <kevin@guarana.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Masatake YAMATO <yamato@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf c2c:
Jiri Olsa:
- Change the default coalesce setup from '--coalesce pid,iaddr' to just '--coalesce iaddr'.
- Increase the HITM ratio limit for displayed cachelines.
perf script:
Andi Kleen:
- Fix LBR skid dump problems in brstackinsn.
perf trace:
Arnaldo Carvalho de Melo:
- Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter.
- Do not hardcode the size of the tracepoint common_ fields.
- Beautify USBDEVFS_ ioctl commands.
Colin Ian King:
- Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*".
perf python:
Jiri Olsa:
- Do not force closing original perf descriptor in evlist.get_pollfd().
tools misc:
Jiri Olsa:
- Allow overriding CFLAGS and LDFLAGS.
perf build:
Stanislav Fomichev:
- Don't unconditionally link the libbfd feature test to -liberty and -lz
thread-stack:
Adrian Hunter:
- Fix processing for the idle task, having a stack per cpu.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Make simply skips a missing rule when it is marked as .PHONY.
Remove the dummy targets.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
It turns out that the bug wasn't actually in that commit itself (which
would have been surprising: it was mostly a no-op), but in how the
addition of access_ok() to the strncpy_from_user() and strnlen_user()
functions now triggered the case where those functions would test the
access of the very last byte of the user address space.
The string functions actually did that user range test before too, but
they did it manually by just comparing against user_addr_max(). But
with user_access_begin() doing the check (using "access_ok()"), it now
exposed problems in the architecture implementations of that function.
For example, on alpha, the access_ok() helper macro looked like this:
#define __access_ok(addr, size) \
((get_fs().seg & (addr | size | (addr+size))) == 0)
and what it basically tests is if any of the high bits get set (the
USER_DS masking value is 0xfffffc0000000000).
And that's completely wrong for the "addr+size" check. Because it's
off-by-one for the case where we check to the very end of the user
address space, which is exactly what the strn*_user() functions do.
Why? Because "addr+size" will be exactly the size of the address space,
so trying to access the last byte of the user address space will fail
the __access_ok() check, even though it shouldn't. As a result, the
user string accessor functions failed consistently - because they
literally don't know how long the string is going to be, and the max
access is going to be that last byte of the user address space.
Side note: that alpha macro is buggy for another reason too - it re-uses
the arguments twice.
And SH has another version of almost the exact same bug:
#define __addr_ok(addr) \
((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
so far so good: yes, a user address must be below the limit. But then:
#define __access_ok(addr, size) \
(__addr_ok((addr) + (size)))
is wrong with the exact same off-by-one case: the case when "addr+size"
is exactly _equal_ to the limit is actually perfectly fine (think "one
byte access at the last address of the user address space")
The SH version is actually seriously buggy in another way: it doesn't
actually check for overflow, even though it did copy the _comment_ that
talks about overflow.
So it turns out that both SH and alpha actually have completely buggy
implementations of access_ok(), but they happened to work in practice
(although the SH overflow one is a serious serious security bug, not
that anybody likely cares about SH security).
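
The two problems with the limit-comparison form can be reproduced in a
small user-space model. This is only a sketch: the 32-bit address type,
the DEMO_ADDR_LIMIT value and the demo_* names are assumptions made for
illustration, not the real SH definitions.

```c
#include <stdint.h>

/* Hypothetical user address limit, chosen only for illustration. */
#define DEMO_ADDR_LIMIT UINT32_C(0x80000000)

/* A user address must be strictly below the limit. */
int demo_addr_ok(uint32_t addr)
{
	return addr < DEMO_ADDR_LIMIT;
}

/*
 * Same shape as the old macro: it tests "addr + size" directly.
 * That rejects an access ending exactly at the limit, and the
 * unsigned addition silently wraps around on overflow.
 */
int demo_old_access_ok(uint32_t addr, uint32_t size)
{
	return demo_addr_ok(addr + size);
}
```

In this model a one-byte access at DEMO_ADDR_LIMIT - 1 is refused even
though it is valid, while a huge size that wraps "addr + size" back to a
small value is accepted even though it is not.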
This fixes the problems by using a similar macro on both alpha and SH.
It isn't trying to be clever, the end address is based on this logic:
unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
which basically says "add start and length, and then subtract one unless
the length was zero". We can't subtract one for a zero length, or we'd
just hit an underflow instead.
For a lot of access_ok() users the length is a constant, so this isn't
actually as expensive as it initially looks.
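
As a rough user-space model of the end-address logic, using the alpha
masking value quoted earlier, the old and new checks can be compared like
this. The demo_* function names are hypothetical, and plain functions are
used instead of macros so that each argument is evaluated only once.

```c
#include <stdint.h>

/* The USER_DS masking value quoted above for alpha. */
#define DEMO_USER_MASK UINT64_C(0xfffffc0000000000)

/* "add start and length, and then subtract one unless the length was zero" */
uint64_t demo_ao_end(uint64_t addr, uint64_t size)
{
	return addr + size - !!size;
}

/* Old form: tests one past the last byte, so the last byte always fails. */
int demo_access_ok_old(uint64_t addr, uint64_t size)
{
	return (DEMO_USER_MASK & (addr | size | (addr + size))) == 0;
}

/* New form: tests the address of the last byte actually accessed. */
int demo_access_ok_new(uint64_t addr, uint64_t size)
{
	return (DEMO_USER_MASK & (addr | size | demo_ao_end(addr, size))) == 0;
}
```

A one-byte access at 0x3ffffffffff, the last byte below the mask, fails
the old check and passes the new one, while an access starting at
0x40000000000 is rejected by both.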
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Caused by making the variable static:
kernel/sched/fair.c:119:21: warning: 'capacity_margin' defined but not used [-Wunused-variable]
Seems easiest to just move it up under the existing ifdef CONFIG_SMP
that's a few lines above.
Fixes: ed8885a14433a ('sched/fair: Make some variables static')
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a comment to perf_session__register_idle_thread() to bring attention to
a pitfall with the idle task thread structure. The pitfall is that there
should really be a 'struct thread' for the idle task of each cpu, but there
is only one that can have pid == tid == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
You do not have to use define ... endef for filechk_* rules.
For simple cases, the use of assignment looks cleaner, IMHO.
I updated the usage for scripts/Kbuild.include in case somebody
misunderstands that 'define ... endef' is a requirement.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Pull fscrypt updates from Ted Ts'o:
"Add Adiantum support for fscrypt"
* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: add Adiantum support
Pull x86 platform update from Ingo Molnar:
"An OLPC platform support simplification patch"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/olpc: Do not call of_platform_bus_probe()
perf creates a single 'struct thread' to represent the idle task. That
is because threads are identified by PID and TID, and the idle task
always has PID == TID == 0.
However, there are actually separate idle tasks for each CPU. That
creates a problem for thread stack processing which assumes that each
thread has a single stack, not one stack per CPU.
Fix that by passing through the CPU number, and in the case of the idle
"thread", pick the thread stack from an array based on the CPU number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
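
The per-CPU stack selection described in the idle task fix above can be
sketched as follows. This is a simplified model, not the perf source: the
demo_* types, fields and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-ins for perf's thread and thread stack structures. */
struct demo_thread_stack {
	int cpu;
	size_t depth;
};

struct demo_thread {
	int pid;
	int tid;
	struct demo_thread_stack *ts;	/* one stack, or an array for idle */
	unsigned int ts_cnt;		/* number of entries in ts */
};

/* The idle task is the one thread with pid == tid == 0. */
int demo_is_idle(const struct demo_thread *t)
{
	return t->pid == 0 && t->tid == 0;
}

/* Ordinary threads use their single stack; idle picks one by CPU number. */
struct demo_thread_stack *demo_thread_stack(struct demo_thread *t, int cpu)
{
	if (!t || !t->ts)
		return NULL;
	if (!demo_is_idle(t))
		return &t->ts[0];
	if (cpu < 0 || (unsigned int)cpu >= t->ts_cnt)
		return NULL;
	return &t->ts[cpu];
}
```

Passing the CPU number through to every lookup keeps the callers uniform:
only the idle "thread" ever indexes beyond the first entry.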
Now that Kbuild automatically creates asm-generic wrappers for missing
mandatory headers, it is redundant to list the same headers in
generic-y and mandatory-y.
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Pull ext4 bug fixes from Ted Ts'o:
"Fix a number of ext4 bugs"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix special inode number checks in __ext4_iget()
ext4: track writeback errors using the generic tracking infrastructure
ext4: use ext4_write_inode() when fsyncing w/o a journal
ext4: avoid kernel warning when writing the superblock to a dead device
ext4: fix a potential fiemap/page fault deadlock w/ inline_data
ext4: make sure enough credits are reserved for dioread_nolock writes
Add support for the Adiantum encryption mode to fscrypt. Adiantum is a
tweakable, length-preserving encryption mode with security provably
reducible to that of XChaCha12 and AES-256, subject to a security bound.
It's also a true wide-block mode, unlike XTS. See the paper
"Adiantum: length-preserving encryption for entry-level processors"
(https://eprint.iacr.org/2018/720.pdf) for more details. Also see
commit 059c2a4d8e16 ("crypto: adiantum - add Adiantum support").
On sufficiently long messages, Adiantum's bottlenecks are XChaCha12 and
the NH hash function. These algorithms are fast even on processors
without dedicated crypto instructions. Adiantum makes it feasible to
enable storage encryption on low-end mobile devices that lack AES
instructions; currently such devices are unencrypted. On ARM Cortex-A7,
on 4096-byte messages Adiantum encryption is about 4 times faster than
AES-256-XTS encryption; decryption is about 5 times faster.
In fscrypt, Adiantum is suitable for encrypting both file contents and
names. With filenames, it fixes a known weakness: when two filenames in
a directory share a common prefix of >= 16 bytes, with CTS-CBC their
encrypted filenames share a common prefix too, leaking information.
Adiantum does not have this problem.
Since Adiantum also accepts long tweaks (IVs), it's also safe to use the
master key directly for Adiantum encryption rather than deriving
per-file keys, provided that the per-file nonce is included in the IVs
and the master key isn't used for any other encryption mode. This
configuration saves memory and improves performance. A new fscrypt
policy flag is added to allow users to opt-in to this configuration.
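
To illustrate what "the per-file nonce is included in the IVs" could look
like, here is a minimal sketch. The IV layout (a little-endian block
number followed by the nonce), the 32-byte IV size, the 16-byte nonce size
and all demo_* names are assumptions made for illustration; the text above
does not spell them out.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_IV_SIZE	32	/* assumed long-tweak size */
#define DEMO_NONCE_SIZE	16	/* assumed per-file nonce size */

/*
 * Build an IV that is unique per file and per block even though every
 * file is encrypted with the same master key.
 */
void demo_build_direct_key_iv(uint8_t iv[DEMO_IV_SIZE], uint64_t lblk_num,
			      const uint8_t nonce[DEMO_NONCE_SIZE])
{
	int i;

	memset(iv, 0, DEMO_IV_SIZE);
	for (i = 0; i < 8; i++)		/* block number, little-endian */
		iv[i] = (uint8_t)(lblk_num >> (8 * i));
	memcpy(iv + 8, nonce, DEMO_NONCE_SIZE);	/* per-file nonce */
}
```

Because the nonce differs per file, two files never present the same
(key, IV) pair for the same block number, which is what makes sharing the
master key acceptable in this configuration.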
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull x86 mm updates from Ingo Molnar:
"The main changes in this cycle were:
- Update and clean up x86 fault handling, by Andy Lutomirski.
- Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
and related fallout, by Dan Williams.
- CPA cleanups and reorganization by Peter Zijlstra: simplify the
flow and remove a few warts.
- Other misc cleanups"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
x86/mm/dump_pagetables: Use DEFINE_SHOW_ATTRIBUTE()
x86/mm/cpa: Rename @addrinarray to @numpages
x86/mm/cpa: Better use CLFLUSHOPT
x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function
x86/mm/cpa: Make cpa_data::numpages invariant
x86/mm/cpa: Optimize cpa_flush_array() TLB invalidation
x86/mm/cpa: Simplify the code after making cpa->vaddr invariant
x86/mm/cpa: Make cpa_data::vaddr invariant
x86/mm/cpa: Add __cpa_addr() helper
x86/mm/cpa: Add ARRAY and PAGES_ARRAY selftests
x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
x86/mm: Validate kernel_physical_mapping_init() PTE population
generic/pgtable: Introduce set_pte_safe()
generic/pgtable: Introduce {p4d,pgd}_same()
generic/pgtable: Make {pmd, pud}_same() unconditionally available
x86/fault: Clean up the page fault oops decoder a bit
x86/fault: Decode page fault OOPSes better
x86/vsyscall/64: Use X86_PF constants in the simulated #PF error code
x86/oops: Show the correct CS value in show_regs()
x86/fault: Don't try to recover from an implicit supervisor access
...
The DT core will probe the DT by default now, so the OLPC platform code
calling of_platform_bus_probe() is not necessary. The algorithm for what
nodes are probed is a little different in how compatible is handled, but
since OLPC uses compatible strings for matching it is not affected by
this difference.
Also, only the battery node located at the root level gets a device
created as the dcon is a PCI device and the RTC device is created in
olpc-xo1-rtc.c.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Lubomir Rintel <lkundrak@v3.sk>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: devicetree@vger.kernel.org
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181116201820.10065-1-robh@kernel.org
In preparation for fixing thread stack processing for the idle task,
allocate an array of thread stacks.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-7-adrian.hunter@intel.com
[ No need to check for NULL when calling zfree(), noticed by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some time ago, Sam pointed out a certain degree of overlap between
generic-y and mandatory-y. (https://lkml.org/lkml/2017/7/10/121)
I tweaked the meaning of mandatory-y a little bit; now it defines the
minimum set of ASM headers that all architectures must have.
If an arch does not have a specific implementation of a mandatory header,
Kbuild will let it fall back to the asm-generic one by automatically
generating a wrapper. This will allow dropping lots of redundant
generic-y defines.
Previously, "mandatory" was used in the context of UAPI, but I guess
this can be extended to kernel space ASM headers.
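
The fallback decision can be modelled in a few lines of C. This is only a
sketch of the idea, not the Kbuild implementation (which lives in
makefiles): the demo_* helpers are hypothetical, and the wrapper is assumed
to be a one-line header that includes the asm-generic version.

```c
#include <stdio.h>
#include <string.h>

/* Does the arch ship its own implementation of this header? */
int demo_arch_has_header(const char *const *arch_headers, size_t n,
			 const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(arch_headers[i], name) == 0)
			return 1;
	return 0;
}

/*
 * For a mandatory header: return 0 if the arch provides it, otherwise
 * write the wrapper contents into buf and return 1.
 */
int demo_make_wrapper(const char *const *arch_headers, size_t n,
		      const char *mandatory, char *buf, size_t len)
{
	if (demo_arch_has_header(arch_headers, n, mandatory))
		return 0;
	snprintf(buf, len, "#include <asm-generic/%s>\n", mandatory);
	return 1;
}
```

With this rule an arch only has to list the headers it actually
implements; everything else in the mandatory set is covered automatically.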
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Pull dma-mapping fixes from Christoph Hellwig:
"Fix various regressions introduced in this cycle:
- fix dma-debug tracking for the map_page / map_single
consolidation
- properly stub out DMA mapping symbols for !HAS_DMA builds to avoid
link failures
- fix AMD Gart direct mappings
- setup the dma address for no kernel mappings using the remap
allocator"
* tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations
x86/amd_gart: fix unmapping of non-GART mappings
dma-mapping: remove a few unused exports
dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA
dma-mapping: remove dmam_{declare,release}_coherent_memory
dma-mapping: implement dmam_alloc_coherent using dmam_alloc_attrs
dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs
The check for special (reserved) inode numbers in __ext4_iget()
was broken by commit 8a363970d1dc: ("ext4: avoid declaring fs
inconsistent due to invalid file handles"). This was caused by a
botched reversal of the sense of the flag now known as
EXT4_IGET_SPECIAL (when it was previously named EXT4_IGET_NORMAL).
Fix the logic appropriately.
Fixes: 8a363970d1dc ("ext4: avoid declaring fs inconsistent...")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
F2FS_HAS_FEATURE() uses F2FS_SB(sb) to get the sbi pointer in order to
access the .raw_super field. To avoid the unneeded pointer conversion,
change F2FS_HAS_FEATURE() to accept the sbi parameter directly.
This is just a cleanup, no logic change.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 fpu updates from Ingo Molnar:
"Misc preparatory changes for an upcoming FPU optimization that will
delay the loading of FPU registers to return-to-userspace"
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Don't export __kernel_fpu_{begin,end}()
x86/fpu: Update comment for __raw_xsave_addr()
x86/fpu: Add might_fault() to user_insn()
x86/pkeys: Make init_pkru_value static
x86/thread_info: Remove _TIF_ALLWORK_MASK
x86/process/32: Remove asm/math_emu.h include
x86/fpu: Use unsigned long long shift in xfeature_uncompacted_offset()
Use DEFINE_SHOW_ATTRIBUTE() instead of open coding it.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: keescook@chromium.org
Cc: luto@kernel.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20181119154334.18265-1-tiny.windzz@gmail.com
In preparation for fixing thread stack processing for the idle task,
factor out thread_stack__init().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These comments are leftovers of commit fcc8487d477a ("uapi: export all
headers under uapi directories").
Prior to that commit, exported headers must be explicitly added to
header-y. Now, all headers under the uapi/ directories are exported.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull chrome platform updates from Benson Leung:
- Changes for EC_MKBP_EVENT_SENSOR_FIFO handling.
- Also, maintainership changes. Olofj out, Enric balletbo in.
* tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
MAINTAINERS: add maintainers for ChromeOS EC sub-drivers
MAINTAINERS: platform/chrome: Add Enric as a maintainer
MAINTAINERS: platform/chrome: remove myself as maintainer
platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup
platform/chrome: straighten out cros_ec_get_{next,host}_event() error codes
We need to return a dma_addr_t even if we don't have a kernel mapping.
Do so by consolidating the phys_to_dma call in a single place and jump
to it from all the branches that return successfully.
Fixes: bfd56cd60521 ("dma-mapping: support highmem in the generic remap allocator")
Reported-by: Liviu Dudau <liviu@dudau.co.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Liviu Dudau <liviu@dudau.co.uk>
We are already using mapping_set_error() in fs/ext4/page_io.c, so all we
need to do is to use file_check_and_advance_wb_err() when handling
fsync() requests in ext4_sync_file().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 cpu updates from Ingo Molnar:
"Misc changes:
- Fix nr_cpus= boot option interaction bug with logical package
management
- Clean up UMIP detection messages
- Add WBNOINVD instruction detection
- Remove the unused get_scattered_cpuid_leaf() function"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Use total_cpus for max logical packages calculation
x86/umip: Make the UMIP activated message generic
x86/umip: Print UMIP line only once
x86/cpufeatures: Add WBNOINVD feature definition
x86/cpufeatures: Remove get_scattered_cpuid_leaf()
There is one user of __kernel_fpu_begin() and before invoking it,
it invokes preempt_disable(). So it could invoke kernel_fpu_begin()
right away. The 32-bit versions of arch_efi_call_virt_setup() and
arch_efi_call_virt_teardown() do this already.
The comment above *kernel_fpu*() claims that before invoking
__kernel_fpu_begin() preemption should be disabled and that KVM is a
good example of doing it. Well, KVM doesn't do that since commit
f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
so it is not an example anymore.
With EFI gone as the last user of __kernel_fpu_{begin|end}(), both can
be made static and not exported anymore.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-efi <linux-efi@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181129150210.2k4mawt37ow6c2vq@linutronix.de
The CPA_ARRAY interface works in single pages, and everywhere except
in these 'few' locations this variable is called 'numpages'.
Remove this 'addrinarray' aberration and use 'numpages' consistently.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom.StDenis@amd.com
Cc: dave.hansen@intel.com
Link: http://lkml.kernel.org/r/20181203171043.695039210@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"One last pull request before heading to Vancouver for LPC, here we have:
1) Don't forget to free VSI contexts during ice driver unload, from
Victor Raj.
2) Don't forget napi delete calls during device remove in ice driver,
from Dave Ertman.
3) Don't request VLAN tag insertion of ibmvnic device when SKB
doesn't have VLAN tags at all.
4) IPV4 frag handling code has to accommodate the situation where two
threads try to insert the same fragment into the hash table at the
same time. From Eric Dumazet.
5) Relatedly, don't flow separate on protocol ports for fragmented
frames, also from Eric Dumazet.
6) Memory leaks in qed driver, from Denis Bolotin.
7) Correct valid MTU range in smsc95xx driver, from Stefan Wahren.
8) Validate cls_flower nested policies properly, from Jakub Kicinski.
9) Clearing of stats counters in mv88e6xxx driver doesn't retain
important bits in the G1_STATS_OP register causing the chip to
hang. Fix from Andrew Lunn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
act_mirred: clear skb->tstamp on redirect
net: dsa: mv88e6xxx: Fix clearing of stats counters
tipc: fix link re-establish failure
net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
net: mvneta: correct typo
flow_dissector: do not dissect l4 ports for fragments
net: qualcomm: rmnet: Fix incorrect assignment of real_dev
net: aquantia: allow rx checksum offload configuration
net: aquantia: invalid checksumm offload implementation
net: aquantia: fixed enable unicast on 32 macvlan
net: aquantia: fix potential IOMMU fault after driver unbind
net: aquantia: synchronized flow control between mac/phy
net: smsc95xx: Fix MTU range
net: stmmac: Fix RX packet size > 8191
qed: Fix potential memory corruption
qed: Fix SPQ entries not returned to pool in error flows
qed: Fix blocking/unlimited SPQ entries leak
qed: Fix memory/entry leak in qed_init_sp_request()
inet: frags: better deal with smp races
net: hns3: bugfix for not checking return value
...
In preparation for fixing thread stack processing for the idle task,
allow for a thread stack array.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit removes redundant generic-y defines in
arch/riscv/include/asm/Kbuild.
[1] It is redundant to define the same generic-y in both
arch/$(ARCH)/include/asm/Kbuild and
arch/$(ARCH)/include/uapi/asm/Kbuild.
Remove the following generic-y:
errno.h
fcntl.h
ioctl.h
ioctls.h
ipcbuf.h
mman.h
msgbuf.h
param.h
poll.h
posix_types.h
resource.h
sembuf.h
setup.h
shmbuf.h
signal.h
socket.h
sockios.h
stat.h
statfs.h
swab.h
termbits.h
termios.h
types.h
[2] It is redundant to define generic-y when arch-specific
implementation exists in arch/$(ARCH)/include/asm/*.h
Remove the following generic-y:
cacheflush.h
module.h
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull hwspinlock updates from Bjorn Andersson:
"This adds support for the hardware semaphores found in STM32MP1"
* tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc:
hwspinlock: fix return value check in stm32_hwspinlock_probe()
hwspinlock: add STM32 hwspinlock device
dt-bindings: hwlock: Document STM32 hwspinlock bindings
There are multiple ChromeOS EC sub-drivers spread across different
subsystems. As all of them are related to Chrome, add Benson and
myself as maintainers for all these sub-drivers.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Benson Leung <bleung@chromium.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Acked-by: Guenter Roeck <groeck@chromium.org>
In many cases we don't have to create a GART mapping at all, which
also means there is nothing to unmap. Fix the range check that was
incorrectly modified when removing the mapping_error method.
Fixes: 9e8aa6b546 ("x86/amd_gart: remove the mapping_error dma_map_ops method")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
In no-journal mode, we previously used __generic_file_fsync(). This
triggers a lockdep warning, and in addition, it's not safe to depend
on the inode writeback mechanism in the case of ext4. We can solve
both problems by calling ext4_write_inode() directly.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before
gc starts, do_garbage_collect will skip counting seg_freed++, and this
will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++.
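
The counting problem can be shown with a simplified model of one section.
This is not the f2fs code: the demo_* functions, the valid_blocks[] array
(valid block count per segment before gc) and the migrated_ok[] array
(whether gc managed to empty that segment) are hypothetical.

```c
/*
 * Old behaviour as described above: a segment that already had 0 valid
 * blocks is skipped without being counted, so the section is never
 * reported as freed. Returns 1 if the section counts as freed.
 */
int demo_sec_freed_old(const int *valid_blocks, const int *migrated_ok,
		       int segs_per_sec)
{
	int seg_freed = 0;
	int i;

	for (i = 0; i < segs_per_sec; i++) {
		if (valid_blocks[i] == 0)
			continue;	/* skipped, seg_freed not bumped */
		if (migrated_ok[i])
			seg_freed++;
	}
	return seg_freed == segs_per_sec;
}

/* Intended behaviour: an already-empty segment counts as freed too. */
int demo_sec_freed_new(const int *valid_blocks, const int *migrated_ok,
		       int segs_per_sec)
{
	int seg_freed = 0;
	int i;

	for (i = 0; i < segs_per_sec; i++)
		if (valid_blocks[i] == 0 || migrated_ok[i])
			seg_freed++;
	return seg_freed == segs_per_sec;
}
```

For a two-segment section where the first segment starts empty and the
second is migrated successfully, the old count stops at 1 and the section
is missed, while the new count reaches 2.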
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 cleanups from Ingo Molnar:
"Misc cleanups"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Remove trampoline_handler() prototype
x86/kernel: Fix more -Wmissing-prototypes warnings
x86: Fix various typos in comments
x86/headers: Fix -Wmissing-prototypes warning
x86/process: Avoid unnecessary NULL check in get_wchan()
x86/traps: Complete prototype declarations
x86/mce: Fix -Wmissing-prototypes warnings
x86/gart: Rewrite early_gart_iommu_check() comment
nr_cpu_ids can be limited on the command line via nr_cpus=. This can break the
logical package management because it results in a smaller number of packages
in the kdump kernel.
Consider the following case:
A two-socket system where each socket has 8 cores, which gives 16 logical
CPUs per socket with HT turned on.
0 1 2 3 4 5 6 7 | 16 17 18 19 20 21 22 23
cores on socket 0 threads on socket 0
8 9 10 11 12 13 14 15 | 24 25 26 27 28 29 30 31
cores on socket 1 threads on socket 1
When the kdump kernel is started with the command line option nr_cpus=16 and
the panic was triggered on one of the CPUs 24-31, e.g. 26, the online CPUs will
be 1-15 and 26 (CPU 0 is disabled in kdump), ncpus will be 16 and
__max_logical_packages will be 1, although two packages were actually booted.
This issue can be reproduced by setting the kdump option nr_cpus=<real physical
core numbers> and then triggering a panic on one of the last socket's threads,
for example:
taskset -c 26 echo c > /proc/sysrq-trigger
Use total_cpus which will not be limited by nr_cpus command line to calculate
the value of __max_logical_packages.
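The effect on the package count can be sketched with plain arithmetic. This
is a minimal userspace illustration, not the kernel code: it assumes the
package count is the CPU count divided by the logical cpus per package,
rounded up, and max_logical_packages() is a hypothetical helper name.

```c
/* Round-up integer division, written out so the sketch is self-contained. */
unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/*
 * Hypothetical helper: number of packages needed to hold cpu_count logical
 * cpus when each package contributes cpus_per_package of them.
 */
unsigned int max_logical_packages(unsigned int cpu_count,
				  unsigned int cpus_per_package)
{
	return div_round_up(cpu_count, cpus_per_package);
}
```

For the example system, a count capped at 16 by nr_cpus= with 16 logical cpus
per package yields 1 package, while the uncapped total of 32 yields 2.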
Signed-off-by: Hui Wang <john.wanghui@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <guijianfeng@huawei.com>
Cc: <wencongyang2@huawei.com>
Cc: <douliyang1@huawei.com>
Cc: <qiaonuohan@huawei.com>
Link: https://lkml.kernel.org/r/20181107023643.22174-1-john.wanghui@huawei.com
The comment above __raw_xsave_addr() claims that the function does not
work for compacted buffers and was introduced in:
b8b9b6ba9dec3 ("x86/fpu: Allow setting of XSAVE state")
In this commit, the function was factored out of get_xsave_addr() and
this function claims that it works with "standard format or compacted
format of xsave area". It accesses the "xstate_comp_offsets" variable
for the actual offset and it was introduced in commit
7496d6458fe32 ("Define kernel API to get address of each state in xsave area")
Based on the code (back then and now):
- xstate_offsets holds the standard offset.
- if compacted mode is not supported then xstate_comp_offsets gets the
xstate_offsets copied.
- if compacted mode is supported then xstate_comp_offsets will hold the
offset for the compacted buffer.
Based on that the function works for compacted buffers as long as the
CPU supports it and this is what we care about.
Remove the "Note:" which is not accurate.
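The offset selection described above can be sketched as follows. This is a
simplified userspace illustration under stated assumptions: the offset values
are made up, and setup_offsets() and raw_feature_addr() are hypothetical
names, not the kernel functions.

```c
#include <string.h>

#define NR_FEATURES 3

/* Made-up offsets for illustration; real values come from CPUID. */
static const unsigned int std_offsets[NR_FEATURES] = { 0, 64, 256 };
static const unsigned int packed_offsets[NR_FEATURES] = { 0, 64, 128 };

/* Offsets actually used for lookups, filled in by setup_offsets(). */
unsigned int comp_offsets[NR_FEATURES];

/* Without compacted support the standard offsets are simply copied. */
void setup_offsets(int compacted_supported)
{
	const unsigned int *src = compacted_supported ? packed_offsets : std_offsets;

	memcpy(comp_offsets, src, sizeof(comp_offsets));
}

/* The lookup always goes through comp_offsets, so it is valid in both modes. */
void *raw_feature_addr(void *buf, int feature)
{
	return (char *)buf + comp_offsets[feature];
}
```

Because the lookup table is filled according to what the CPU supports, the
same accessor serves both buffer formats.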
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181128222035.2996-7-bigeasy@linutronix.de
Currently we issue an MFENCE before and after flushing a range. This
means that if we flush a bunch of single page ranges -- like with the
cpa array, we issue a whole bunch of superfluous MFENCEs.
Reorganize the code a little to avoid this.
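The saving can be illustrated with a small counting sketch. It does not issue
real MFENCE or cache flush instructions; fence() and flush_page() are
hypothetical stand-ins that only count calls, and both function names are
assumptions made for this illustration.

```c
/* Counts how many fences would have been issued. */
unsigned int fence_count;

void fence(void)
{
	fence_count++;
}

/* Stand-in for flushing one page; does nothing in this sketch. */
void flush_page(unsigned long addr)
{
	(void)addr;
}

/* Old shape: every single-page range is fenced on both sides. */
void flush_fenced_each(const unsigned long *pages, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		fence();
		flush_page(pages[i]);
		fence();
	}
}

/* New shape: one fence before and one after the whole batch. */
void flush_fenced_once(const unsigned long *pages, unsigned int n)
{
	fence();
	for (unsigned int i = 0; i < n; i++)
		flush_page(pages[i]);
	fence();
}
```

For n single-page ranges the first variant issues 2n fences and the second
always issues 2.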
[ mingo: capitalize instructions, tweak changelog and comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom.StDenis@amd.com
Cc: dave.hansen@intel.com
Link: http://lkml.kernel.org/r/20181203171043.626999883@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull Kbuild fixes from Masahiro Yamada:
- fix build errors in binrpm-pkg and bindeb-pkg targets
- fix false positive matches in merge_config.sh
- fix build version mismatch in deb-pkg target
- fix dtbs_install handling in (bin)deb-pkg target
- revert a commit that allows setlocalversion to write to source tree
* tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
builddeb: Fix inclusion of dtbs in debian package
Revert "scripts/setlocalversion: git: Make -dirty check more robust"
kbuild: deb-pkg: fix too low build version number
kconfig: merge_config: avoid false positive matches from comment lines
kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
kbuild: rpm-pkg: fix binrpm-pkg breakage when O= is used
If sch_fq is used at ingress, skbs that might have been
timestamped by net_timestamp_set() if a packet capture
is requesting timestamps could be delayed by an arbitrary
amount of time, since the sch_fq time base is MONOTONIC.
Fix this problem by moving code from sch_netem.c to act_mirred.c.
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for fixing thread stack processing for the idle task,
avoid direct reference to the thread's stack. The thread stack will
change to an array of thread stacks, at which point the meaning of the
direct reference will change.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-4-adrian.hunter@intel.com
[ Rename thread_stack__ts() to thread__stack() since this operates on a 'thread' struct ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
filechk_* rules often consist of multiple 'echo' lines. They must be
surrounded with { } or ( ) to work correctly. Otherwise, only the
string from the last 'echo' would be written into the target.
Let's take care of that in the 'filechk' in scripts/Kbuild.include
to clean up filechk_* rules.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull documentation fixes from Jonathan Corbet:
"A handful of late-arriving documentation fixes"
* tag 'docs-5.0-fixes' of git://git.lwn.net/linux:
doc: filesystems: fix bad references to nonexistent ext4.rst file
Documentation/admin-guide: update URL of LKML information link
Docs/kernel-api.rst: Remove blk-tag.c reference
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
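Why the NULL test misses the failure can be shown with a userspace sketch.
The helpers below are simplified stand-ins for the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR(); fake_ioremap_resource() and fake_probe() are
hypothetical names used only for this illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping routine: encoded errno on failure, never NULL. */
void *fake_ioremap_resource(int fail)
{
	static int regs;

	return fail ? err_ptr(-ENOMEM) : (void *)&regs;
}

/* Hypothetical probe: a NULL test would pass here even on failure. */
int fake_probe(int fail)
{
	void *base = fake_ioremap_resource(fail);

	if (is_err(base))
		return (int)ptr_err(base);
	return 0;
}
```

On failure the returned pointer is non-NULL, so only the is_err() style check
propagates the error code.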
Fixes: f24fcff1d267 ("hwspinlock: add STM32 hwspinlock device")
Acked-by: Benjamin Gaignard <benjamin.gaignard@gmail.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Enric has volunteered to help me with maintaining chrome-platform
as we change the development model toward strictly upstream-first for
any chrome-platform and cros_ec driver.
Signed-off-by: Benson Leung <bleung@chromium.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Now that the slow path DMA API calls are implemented out of line a few
helpers only used by them don't need to be exported anymore.
Signed-off-by: Christoph Hellwig <hch@lst.de>
The xfstests generic/475 test switches the underlying device with
dm-error while running a stress test. This results in a large number
of file system errors, and since we can't lock the buffer head when
marking the superblock dirty in the ext4_grp_locked_error() case, it's
possible for the superblock to be !buffer_uptodate() without
buffer_write_io_error() being true.
We need to set buffer_uptodate() before we call mark_buffer_dirty() or
this will trigger a WARN_ON. It's safe to do this since the
superblock must have been properly read into memory or the mount would
not have been successful. So if buffer_uptodate() is not set, we can
safely assume that this happened due to a failed attempt to write the
superblock.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Previously, we only accounted preflush commands in flush_merge mode,
so in noflush_merge mode we could not know the in-flight preflush
command count. Fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 build updates from Ingo Molnar:
- Resolve LLVM build bug by removing redundant GNU specific flag
- Remove obsolete -funit-at-a-time and -fno-unit-at-a-time use from x86,
PowerPC and UM.
The UML change was seen and acked by UML maintainer Richard
Weinberger.
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/um/vdso: Drop implicit common-page-size linker flag
x86, powerpc: Remove -funit-at-a-time compiler option entirely
x86/um: Remove -fno-unit-at-a-time workaround for pre-4.0 GCC
... and make it static. It is called only by the kretprobe_trampoline()
from asm.
It was marked __visible so that it is visible outside of the current
compilation unit but that is not needed as it is used only in this
compilation unit.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20181205162526.GB109259@gmail.com
Pull more Kbuild updates from Masahiro Yamada:
- improve boolinit.cocci and use_after_iter.cocci semantic patches
- fix alignment for kallsyms
- move 'asm goto' compiler test to Kconfig and clean up jump_label
CONFIG option
- generate asm-generic wrappers automatically if arch does not
implement mandatory UAPI headers
- remove redundant generic-y defines
- misc cleanups
* tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: rename generated .*conf-cfg to *conf-cfg
kbuild: remove unnecessary stubs for archheader and archscripts
kbuild: use assignment instead of define ... endef for filechk_* rules
arch: remove redundant UAPI generic-y defines
kbuild: generate asm-generic wrappers if mandatory headers are missing
arch: remove stale comments "UAPI Header export list"
riscv: remove redundant kernel-space generic-y
kbuild: change filechk to surround the given command with { }
kbuild: remove redundant target cleaning on failure
kbuild: clean up rule_dtc_dt_yaml
kbuild: remove UIMAGE_IN and UIMAGE_OUT
jump_label: move 'asm goto' support test to Kconfig
kallsyms: lower alignment on ARM
scripts: coccinelle: boolinit: drop warnings on named constants
scripts: coccinelle: check for redeclaration
kconfig: remove unused "file" field of yylval union
nds32: remove redundant kernel-space generic-y
nios2: remove unneeded HAS_DMA define
Pull perf tooling updates from Ingo Molnar:
"A final batch of perf tooling changes: mostly fixes and small
improvements"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
perf session: Add comment for perf_session__register_idle_thread()
perf thread-stack: Fix thread stack processing for the idle task
perf thread-stack: Allocate an array of thread stacks
perf thread-stack: Factor out thread_stack__init()
perf thread-stack: Allow for a thread stack array
perf thread-stack: Avoid direct reference to the thread's stack
perf thread-stack: Tidy thread_stack__bottom() usage
perf thread-stack: Simplify some code in thread_stack__process()
tools gpio: Allow overriding CFLAGS
tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
tools thermal tmon: Allow overriding CFLAGS assignments
tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
perf c2c: Increase the HITM ratio limit for displayed cachelines
perf c2c: Change the default coalesce setup
perf trace beauty ioctl: Beautify USBDEVFS_ commands
perf trace beauty: Export function to get the files for a thread
perf trace: Wire up ioctl's USBDEVFS_ cmd table generator
perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
tools headers uapi: Grab a copy of usbdevice_fs.h
perf trace: Store the major number for a file when storing its pathname
...
The semantics of what "in core" means for the mincore() system call are
somewhat unclear, but Linux has always (since 2.3.52, which is when
mincore() was initially done) treated it as "page is available in page
cache" rather than "page is mapped in the mapping".
The problem with that traditional semantic is that it exposes a lot of
system cache state that it really probably shouldn't, and that users
shouldn't really even care about.
So let's try to avoid that information leak by simply changing the
semantics to be that mincore() counts actual mapped pages, not pages
that might be cheaply mapped if they were faulted (note the "might be"
part of the old semantics: being in the cache doesn't actually guarantee
that you can access them without IO anyway, since things like network
filesystems may have to revalidate the cache before use).
In many ways the old semantics were somewhat insane even aside from the
information leak issue. From the very beginning (and that beginning is
a long time ago: 2.3.52 was released in March 2000, I think), the code
had a comment saying
Later we can get more picky about what "in core" means precisely.
and this is that "later". Admittedly it is much later than is really
comfortable.
NOTE! This is a real semantic change, and it is for example known to
change the output of "fincore", since that program literally does a
mmap without populating it, and then does "mincore()" on that mapping
that doesn't actually have any pages in it.
I'm hoping that nobody actually has any workflow that cares, and the
info leak is real.
We may have to do something different if it turns out that people have
valid reasons to want the old semantics, and if we can limit the
information leak sanely.
Cc: Kevin Easton <kevin@guarana.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Masatake YAMATO <yamato@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf c2c:
Jiri Olsa:
- Change the default coalesce setup from '--coalesce pid,iaddr' to just '--coalesce iaddr'.
- Increase the HITM ratio limit for displayed cachelines.
perf script:
Andi Kleen:
- Fix LBR skid dump problems in brstackinsn.
perf trace:
Arnaldo Carvalho de Melo:
- Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter.
- Do not hardcode the size of the tracepoint common_ fields.
- Beautify USBDEVFS_ ioctl commands.
Colin Ian King:
- Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*".
perf python:
Jiri Olsa:
- Do not force closing original perf descriptor in evlist.get_pollfd().
tools misc:
Jiri Olsa:
- Allow overriding CFLAGS and LDFLAGS.
perf build:
Stanislav Fomichev:
- Don't unconditionally link the libbfd feature test to -liberty and -lz
thread-stack:
Adrian Hunter:
- Fix processing for the idle task, having a stack per cpu.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
It turns out that the bug wasn't actually in that commit itself (which
would have been surprising: it was mostly a no-op), but in how the
addition of access_ok() to the strncpy_from_user() and strnlen_user()
functions now triggered the case where those functions would test the
access of the very last byte of the user address space.
The string functions actually did that user range test before too, but
they did it manually by just comparing against user_addr_max(). But
with user_access_begin() doing the check (using "access_ok()"), it now
exposed problems in the architecture implementations of that function.
For example, on alpha, the access_ok() helper macro looked like this:
#define __access_ok(addr, size) \
((get_fs().seg & (addr | size | (addr+size))) == 0)
and what it basically tests is if any of the high bits get set (the
USER_DS masking value is 0xfffffc0000000000).
And that's completely wrong for the "addr+size" check. Because it's
off-by-one for the case where we check to the very end of the user
address space, which is exactly what the strn*_user() functions do.
Why? Because "addr+size" will be exactly the size of the address space,
so trying to access the last byte of the user address space will fail
the __access_ok() check, even though it shouldn't. As a result, the
user string accessor functions failed consistently - because they
literally don't know how long the string is going to be, and the max
access is going to be that last byte of the user address space.
Side note: that alpha macro is buggy for another reason too - it re-uses
the arguments twice.
And SH has another version of almost the exact same bug:
#define __addr_ok(addr) \
((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
so far so good: yes, a user address must be below the limit. But then:
#define __access_ok(addr, size) \
(__addr_ok((addr) + (size)))
is wrong with the exact same off-by-one case: the case when "addr+size"
is exactly _equal_ to the limit is actually perfectly fine (think "one
byte access at the last address of the user address space")
The SH version is actually seriously buggy in another way: it doesn't
actually check for overflow, even though it did copy the _comment_ that
talks about overflow.
So it turns out that both SH and alpha actually have completely buggy
implementations of access_ok(), but they happened to work in practice
(although the SH overflow one is a serious serious security bug, not
that anybody likely cares about SH security).
This fixes the problems by using a similar macro on both alpha and SH.
It isn't trying to be clever, the end address is based on this logic:
unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
which basically says "add start and length, and then subtract one unless
the length was zero". We can't subtract one for a zero length, or we'd
just hit an underflow instead.
For a lot of access_ok() users the length is a constant, so this isn't
actually as expensive as it initially looks.
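The off-by-one and the end-address logic can be tried out in userspace. This
is a minimal sketch, loosely modeled on the SH-style check: USER_LIMIT is a
hypothetical small limit, the function names are made up for the
illustration, and the explicit wraparound test is an assumption of this
sketch rather than something quoted above.

```c
#include <stdbool.h>

/* Hypothetical limit: valid user addresses are 0 .. USER_LIMIT - 1. */
#define USER_LIMIT 0x1000UL

/* Old style: tests addr + size, so an access ending at the limit fails. */
bool access_ok_old(unsigned long addr, unsigned long size)
{
	return addr + size < USER_LIMIT;
}

/* New style: tests the last byte touched, and rejects wraparound. */
bool access_ok_new(unsigned long addr, unsigned long size)
{
	/* Add start and length, then subtract one unless the length is zero. */
	unsigned long end = addr + size - !!size;

	return end >= addr && end < USER_LIMIT;
}
```

A one-byte access at 0xfff is rejected by the old check and accepted by the
new one, while an access that wraps past the top of the address space is
accepted by the old check and rejected by the new one.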
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Caused by making the variable static:
kernel/sched/fair.c:119:21: warning: 'capacity_margin' defined but not used [-Wunused-variable]
Seems easiest to just move it up under the existing ifdef CONFIG_SMP
that's a few lines above.
Fixes: ed8885a14433a ('sched/fair: Make some variables static')
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a comment to perf_session__register_idle_thread() to bring attention to
a pitfall with the idle task thread structure. The pitfall is that there
should really be a 'struct thread' for the idle task of each cpu, but there
is only one that can have pid == tid == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
You do not have to use define ... endef for filechk_* rules.
For simple cases, the use of assignment looks cleaner, IMHO.
I updated the usage for scripts/Kbuild.include in case somebody
misunderstands 'define ... endef' as being a requirement.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
perf creates a single 'struct thread' to represent the idle task. That
is because threads are identified by PID and TID, and the idle task
always has PID == TID == 0.
However, there are actually separate idle tasks for each CPU. That
creates a problem for thread stack processing which assumes that each
thread has a single stack, not one stack per CPU.
Fix that by passing through the CPU number, and in the case of the idle
"thread", pick the thread stack from an array based on the CPU number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
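The per-CPU stack selection described in the idle task fix above can be
sketched as follows. This is a simplified illustration, not the perf code:
the fixed NR_CPUS, the struct layouts and thread_stack_for() are all
hypothetical.

```c
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical fixed CPU count for the sketch */

struct thread_stack {
	int depth;
};

struct thread {
	int pid, tid;
	/* Normal threads only ever use slot 0. */
	struct thread_stack stacks[NR_CPUS];
};

/* The idle "thread" (pid == tid == 0) gets one stack per CPU. */
struct thread_stack *thread_stack_for(struct thread *t, int cpu)
{
	if (t->pid == 0 && t->tid == 0) {
		if (cpu < 0 || cpu >= NR_CPUS)
			return NULL;
		return &t->stacks[cpu];
	}
	return &t->stacks[0];
}
```

Passing the CPU number through keeps the idle tasks of different CPUs from
sharing one stack, while ordinary threads resolve to the same stack
regardless of the CPU they ran on.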
Now that Kbuild automatically creates asm-generic wrappers for missing
mandatory headers, it is redundant to list the same headers in
generic-y and mandatory-y.
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Pull ext4 bug fixes from Ted Ts'o:
"Fix a number of ext4 bugs"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix special inode number checks in __ext4_iget()
ext4: track writeback errors using the generic tracking infrastructure
ext4: use ext4_write_inode() when fsyncing w/o a journal
ext4: avoid kernel warning when writing the superblock to a dead device
ext4: fix a potential fiemap/page fault deadlock w/ inline_data
ext4: make sure enough credits are reserved for dioread_nolock writes
Add support for the Adiantum encryption mode to fscrypt. Adiantum is a
tweakable, length-preserving encryption mode with security provably
reducible to that of XChaCha12 and AES-256, subject to a security bound.
It's also a true wide-block mode, unlike XTS. See the paper
"Adiantum: length-preserving encryption for entry-level processors"
(https://eprint.iacr.org/2018/720.pdf) for more details. Also see
commit 059c2a4d8e16 ("crypto: adiantum - add Adiantum support").
On sufficiently long messages, Adiantum's bottlenecks are XChaCha12 and
the NH hash function. These algorithms are fast even on processors
without dedicated crypto instructions. Adiantum makes it feasible to
enable storage encryption on low-end mobile devices that lack AES
instructions; currently such devices are unencrypted. On ARM Cortex-A7,
on 4096-byte messages Adiantum encryption is about 4 times faster than
AES-256-XTS encryption; decryption is about 5 times faster.
In fscrypt, Adiantum is suitable for encrypting both file contents and
names. With filenames, it fixes a known weakness: when two filenames in
a directory share a common prefix of >= 16 bytes, with CTS-CBC their
encrypted filenames share a common prefix too, leaking information.
Adiantum does not have this problem.
Since Adiantum also accepts long tweaks (IVs), it's also safe to use the
master key directly for Adiantum encryption rather than deriving
per-file keys, provided that the per-file nonce is included in the IVs
and the master key isn't used for any other encryption mode. This
configuration saves memory and improves performance. A new fscrypt
policy flag is added to allow users to opt-in to this configuration.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull x86 mm updates from Ingo Molnar:
"The main changes in this cycle were:
- Update and clean up x86 fault handling, by Andy Lutomirski.
- Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
and related fallout, by Dan Williams.
- CPA cleanups and reorganization by Peter Zijlstra: simplify the
flow and remove a few warts.
- Other misc cleanups"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
x86/mm/dump_pagetables: Use DEFINE_SHOW_ATTRIBUTE()
x86/mm/cpa: Rename @addrinarray to @numpages
x86/mm/cpa: Better use CLFLUSHOPT
x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function
x86/mm/cpa: Make cpa_data::numpages invariant
x86/mm/cpa: Optimize cpa_flush_array() TLB invalidation
x86/mm/cpa: Simplify the code after making cpa->vaddr invariant
x86/mm/cpa: Make cpa_data::vaddr invariant
x86/mm/cpa: Add __cpa_addr() helper
x86/mm/cpa: Add ARRAY and PAGES_ARRAY selftests
x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
x86/mm: Validate kernel_physical_mapping_init() PTE population
generic/pgtable: Introduce set_pte_safe()
generic/pgtable: Introduce {p4d,pgd}_same()
generic/pgtable: Make {pmd, pud}_same() unconditionally available
x86/fault: Clean up the page fault oops decoder a bit
x86/fault: Decode page fault OOPSes better
x86/vsyscall/64: Use X86_PF constants in the simulated #PF error code
x86/oops: Show the correct CS value in show_regs()
x86/fault: Don't try to recover from an implicit supervisor access
...
The DT core will probe the DT by default now, so the OLPC platform code
calling of_platform_bus_probe() is not necessary. The algorithm for what
nodes are probed is a little different in how compatible is handled, but
since OLPC uses compatible strings for matching it is not affected by
this difference.
Also, only the battery node located at the root level gets a device
created as the dcon is a PCI device and the RTC device is created in
olpc-xo1-rtc.c.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Lubomir Rintel <lkundrak@v3.sk>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: devicetree@vger.kernel.org
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181116201820.10065-1-robh@kernel.org
In preparation for fixing thread stack processing for the idle task,
allocate an array of thread stacks.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-7-adrian.hunter@intel.com
[ No need to check for NULL when calling zfree(), noticed by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some time ago, Sam pointed out a certain degree of overlap between
generic-y and mandatory-y. (https://lkml.org/lkml/2017/7/10/121)
I tweaked the meaning of mandatory-y a little bit; now it defines the
minimum set of ASM headers that all architectures must have.
If an arch does not have a specific implementation of a mandatory header,
Kbuild will let it fall back to the asm-generic one by automatically
generating a wrapper. This allows dropping lots of redundant
generic-y defines.
Previously, "mandatory" was used in the context of UAPI, but I guess
this can be extended to kernel space ASM headers.
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Pull dma-mapping fixes from Christoph Hellwig:
"Fix various regressions introduced in this cycle:
- fix dma-debug tracking for the map_page / map_single
consolidation
- properly stub out DMA mapping symbols for !HAS_DMA builds to avoid
link failures
- fix AMD Gart direct mappings
- setup the dma address for no kernel mappings using the remap
allocator"
* tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations
x86/amd_gart: fix unmapping of non-GART mappings
dma-mapping: remove a few unused exports
dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA
dma-mapping: remove dmam_{declare,release}_coherent_memory
dma-mapping: implement dmam_alloc_coherent using dmam_alloc_attrs
dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs
The check for special (reserved) inode numbers in __ext4_iget()
was broken by commit 8a363970d1dc: ("ext4: avoid declaring fs
inconsistent due to invalid file handles"). This was caused by a
botched reversal of the sense of the flag now known as
EXT4_IGET_SPECIAL (when it was previously named EXT4_IGET_NORMAL).
Fix the logic appropriately.
Fixes: 8a363970d1dc ("ext4: avoid declaring fs inconsistent...")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
In F2FS_HAS_FEATURE(), we use F2FS_SB(sb) to get the sbi pointer in order to
access the .raw_super field. To avoid the unneeded pointer conversion, this
patch changes F2FS_HAS_FEATURE() to accept the sbi parameter directly.
Just do cleanup, no logic change.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 fpu updates from Ingo Molnar:
"Misc preparatory changes for an upcoming FPU optimization that will
delay the loading of FPU registers to return-to-userspace"
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Don't export __kernel_fpu_{begin,end}()
x86/fpu: Update comment for __raw_xsave_addr()
x86/fpu: Add might_fault() to user_insn()
x86/pkeys: Make init_pkru_value static
x86/thread_info: Remove _TIF_ALLWORK_MASK
x86/process/32: Remove asm/math_emu.h include
x86/fpu: Use unsigned long long shift in xfeature_uncompacted_offset()
Use DEFINE_SHOW_ATTRIBUTE() instead of open coding it.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: keescook@chromium.org
Cc: luto@kernel.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20181119154334.18265-1-tiny.windzz@gmail.com
In preparation for fixing thread stack processing for the idle task,
factor out thread_stack__init().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull chrome platform updates from Benson Leung:
- Changes for EC_MKBP_EVENT_SENSOR_FIFO handling.
- Also, maintainership changes. Olofj out, Enric balletbo in.
* tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
MAINTAINERS: add maintainers for ChromeOS EC sub-drivers
MAINTAINERS: platform/chrome: Add Enric as a maintainer
MAINTAINERS: platform/chrome: remove myself as maintainer
platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup
platform/chrome: straighten out cros_ec_get_{next,host}_event() error codes
We need to return a dma_addr_t even if we don't have a kernel mapping.
Do so by consolidating the phys_to_dma call in a single place and jump
to it from all the branches that return successfully.
Fixes: bfd56cd60521 ("dma-mapping: support highmem in the generic remap allocator")
Reported-by: Liviu Dudau <liviu@dudau.co.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Liviu Dudau <liviu@dudau.co.uk>
Pull x86 cpu updates from Ingo Molnar:
"Misc changes:
- Fix nr_cpus= boot option interaction bug with logical package
management
- Clean up UMIP detection messages
- Add WBNOINVD instruction detection
- Remove the unused get_scattered_cpuid_leaf() function"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Use total_cpus for max logical packages calculation
x86/umip: Make the UMIP activated message generic
x86/umip: Print UMIP line only once
x86/cpufeatures: Add WBNOINVD feature definition
x86/cpufeatures: Remove get_scattered_cpuid_leaf()
There is one user of __kernel_fpu_begin() and before invoking it,
it invokes preempt_disable(). So it could invoke kernel_fpu_begin()
right away. The 32bit version of arch_efi_call_virt_setup() and
arch_efi_call_virt_teardown() does this already.
The comment above *kernel_fpu*() claims that before invoking
__kernel_fpu_begin() preemption should be disabled and that KVM is a
good example of doing it. Well, KVM doesn't do that since commit
f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
so it is not an example anymore.
With EFI gone as the last user of __kernel_fpu_{begin|end}(), both can
be made static and not exported anymore.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-efi <linux-efi@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181129150210.2k4mawt37ow6c2vq@linutronix.de
The CPA_ARRAY interface works in single pages, and everywhere except
in these 'few' locations this variable is called 'numpages'.
Remove this 'addrinarray' aberration and use 'numpages' consistently.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom.StDenis@amd.com
Cc: dave.hansen@intel.com
Link: http://lkml.kernel.org/r/20181203171043.695039210@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"One last pull request before heading to Vancouver for LPC, here we have:
1) Don't forget to free VSI contexts during ice driver unload, from
Victor Raj.
2) Don't forget napi delete calls during device remove in ice driver,
from Dave Ertman.
3) Don't request VLAN tag insertion of ibmvnic device when SKB
doesn't have VLAN tags at all.
4) IPV4 frag handling code has to accommodate the situation where two
threads try to insert the same fragment into the hash table at the
same time. From Eric Dumazet.
5) Relatedly, don't flow separate on protocol ports for fragmented
frames, also from Eric Dumazet.
6) Memory leaks in qed driver, from Denis Bolotin.
7) Correct valid MTU range in smsc95xx driver, from Stefan Wahren.
8) Validate cls_flower nested policies properly, from Jakub Kicinski.
9) Clearing of stats counters in mv88e6xxx driver doesn't retain
important bits in the G1_STATS_OP register causing the chip to
hang. Fix from Andrew Lunn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
act_mirred: clear skb->tstamp on redirect
net: dsa: mv88e6xxx: Fix clearing of stats counters
tipc: fix link re-establish failure
net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
net: mvneta: correct typo
flow_dissector: do not dissect l4 ports for fragments
net: qualcomm: rmnet: Fix incorrect assignment of real_dev
net: aquantia: allow rx checksum offload configuration
net: aquantia: invalid checksumm offload implementation
net: aquantia: fixed enable unicast on 32 macvlan
net: aquantia: fix potential IOMMU fault after driver unbind
net: aquantia: synchronized flow control between mac/phy
net: smsc95xx: Fix MTU range
net: stmmac: Fix RX packet size > 8191
qed: Fix potential memory corruption
qed: Fix SPQ entries not returned to pool in error flows
qed: Fix blocking/unlimited SPQ entries leak
qed: Fix memory/entry leak in qed_init_sp_request()
inet: frags: better deal with smp races
net: hns3: bugfix for not checking return value
...
In preparation for fixing thread stack processing for the idle task,
allow for a thread stack array.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit removes redundant generic-y defines in
arch/riscv/include/asm/Kbuild.
[1] It is redundant to define the same generic-y in both
arch/$(ARCH)/include/asm/Kbuild and
arch/$(ARCH)/include/uapi/asm/Kbuild.
Remove the following generic-y:
errno.h
fcntl.h
ioctl.h
ioctls.h
ipcbuf.h
mman.h
msgbuf.h
param.h
poll.h
posix_types.h
resource.h
sembuf.h
setup.h
shmbuf.h
signal.h
socket.h
sockios.h
stat.h
statfs.h
swab.h
termbits.h
termios.h
types.h
[2] It is redundant to define generic-y when arch-specific
implementation exists in arch/$(ARCH)/include/asm/*.h
Remove the following generic-y:
cacheflush.h
module.h
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull hwspinlock updates from Bjorn Andersson:
"This adds support for the hardware semaphores found in STM32MP1"
* tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc:
hwspinlock: fix return value check in stm32_hwspinlock_probe()
hwspinlock: add STM32 hwspinlock device
dt-bindings: hwlock: Document STM32 hwspinlock bindings
There are multiple ChromeOS EC sub-drivers spread across different
subsystems. As all of them are related to Chrome, add
Benson and myself as maintainers for all these sub-drivers.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Benson Leung <bleung@chromium.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Acked-by: Guenter Roeck <groeck@chromium.org>
In many cases we don't have to create a GART mapping at all, which
also means there is nothing to unmap. Fix the range check that was
incorrectly modified when removing the mapping_error method.
Fixes: 9e8aa6b546 ("x86/amd_gart: remove the mapping_error dma_map_ops method")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
In no-journal mode, we previously used __generic_file_fsync().
This triggers a lockdep warning, and in addition,
it's not safe to depend on the inode writeback mechanism in the case
of ext4. We can solve both problems by calling ext4_write_inode()
directly.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before
gc starts, do_garbage_collect will skip counting seg_freed++, and this
will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++.
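The accounting problem described above can be modelled in a few lines of userspace C. This is only a sketch: gc_section() and its count_empty switch are hypothetical names, and counting already-empty segments as freed is an assumed way to fix it, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of garbage-collecting one section.
 * valid[i] is the number of valid blocks in segment i before GC starts.
 * Returns the sec_freed increment: 1 if the whole section counts as
 * freed, 0 otherwise.
 */
static unsigned int gc_section(const unsigned int *valid,
			       unsigned int segs_per_sec, bool count_empty)
{
	unsigned int seg_freed = 0;
	unsigned int i;

	assert(segs_per_sec > 0);

	for (i = 0; i < segs_per_sec; i++) {
		if (valid[i] == 0) {
			/* Segment was already empty: old code skipped it silently. */
			if (count_empty)
				seg_freed++;
			continue;
		}
		/* Assume migration succeeds and empties the segment. */
		seg_freed++;
	}

	/* The section only counts as freed if every segment was counted. */
	return seg_freed == segs_per_sec ? 1 : 0;
}
```

With a four-segment section in which one segment starts out empty, the old accounting never reaches seg_freed == segs_per_sec, so the section is not reported as freed even though every segment ends up empty.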
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull x86 cleanups from Ingo Molnar:
"Misc cleanups"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Remove trampoline_handler() prototype
x86/kernel: Fix more -Wmissing-prototypes warnings
x86: Fix various typos in comments
x86/headers: Fix -Wmissing-prototypes warning
x86/process: Avoid unnecessary NULL check in get_wchan()
x86/traps: Complete prototype declarations
x86/mce: Fix -Wmissing-prototypes warnings
x86/gart: Rewrite early_gart_iommu_check() comment
nr_cpu_ids can be limited on the command line via nr_cpus=. This can break the
logical package management because it results in a smaller number of packages
in the kdump kernel.
Consider the following case:
A two-socket system in which each socket has 8 cores, i.e. 16 logical
CPUs per socket when HT is turned on.
 0  1  2  3  4  5  6  7 | 16 17 18 19 20 21 22 23
 cores on socket 0        threads on socket 0
 8  9 10 11 12 13 14 15 | 24 25 26 27 28 29 30 31
 cores on socket 1        threads on socket 1
If the kdump kernel is started with the command line option nr_cpus=16 and the
panic was triggered on one of the CPUs 24-31, e.g. 26, then the online CPUs will
be 1-15 and 26 (CPU 0 is disabled in kdump), ncpus will be 16 and
__max_logical_packages will be 1, although two packages were actually booted.
This issue can be reproduced by setting the kdump option nr_cpus=<real physical
core numbers> and then triggering a panic on the last socket's thread, for example:
taskset -c 26 echo c > /proc/sysrq-trigger
Use total_cpus which will not be limited by nr_cpus command line to calculate
the value of __max_logical_packages.
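The arithmetic behind the example can be sketched as follows. The rounded-up division is an assumption inferred from the numbers in the description (16 CPUs per package, result 1), and max_logical_packages() is a hypothetical helper, not the kernel function.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * cpu_count:        number of CPUs used as the basis (nr_cpu_ids or total_cpus)
 * cpus_per_package: logical CPUs in one package (cores * SMT threads)
 */
static unsigned int max_logical_packages(unsigned int cpu_count,
					 unsigned int cpus_per_package)
{
	assert(cpus_per_package > 0);
	return DIV_ROUND_UP(cpu_count, cpus_per_package);
}
```

For the two-socket system above, a basis of 16 (nr_cpu_ids limited by nr_cpus=16) yields one package, while a basis of 32 (total_cpus) yields two.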
Signed-off-by: Hui Wang <john.wanghui@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <guijianfeng@huawei.com>
Cc: <wencongyang2@huawei.com>
Cc: <douliyang1@huawei.com>
Cc: <qiaonuohan@huawei.com>
Link: https://lkml.kernel.org/r/20181107023643.22174-1-john.wanghui@huawei.com
The comment above __raw_xsave_addr() claims that the function does not
work for compacted buffers and was introduced in:
b8b9b6ba9dec3 ("x86/fpu: Allow setting of XSAVE state")
In this commit, the function was factored out of get_xsave_addr() and
this function claims that it works with "standard format or compacted
format of xsave area". It accesses the "xstate_comp_offsets" variable
for the actual offset and it was introduced in commit
7496d6458fe32 ("Define kernel API to get address of each state in xsave area")
Based on the code (back then and now):
- xstate_offsets holds the standard offset.
- if compacted mode is not supported then xstate_comp_offsets gets the
xstate_offsets copied.
- if compacted mode is supported then xstate_comp_offsets will hold the
offset for the compacted buffer.
Based on that, the function works for compacted buffers as long as the
CPU supports it, and this is what we care about.
Remove the "Note:" which is not accurate.
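The three points above can be illustrated with a small userspace model. The feature sizes, offsets and function names below are made up for illustration; only the relationship between the two offset tables follows the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_FEATURES 4

/* Hypothetical per-feature sizes and standard-format offsets. */
static const unsigned int xstate_sizes[NR_FEATURES]   = { 100, 200, 50, 80 };
static const unsigned int xstate_offsets[NR_FEATURES] = { 0, 100, 300, 350 };
static unsigned int xstate_comp_offsets[NR_FEATURES];

/* Fill xstate_comp_offsets depending on whether compaction is supported. */
static void setup_comp_offsets(bool compacted, unsigned int enabled_mask)
{
	unsigned int next = 0;
	unsigned int i;

	if (!compacted) {
		/* No compacted mode: the standard offsets are copied over. */
		memcpy(xstate_comp_offsets, xstate_offsets, sizeof(xstate_offsets));
		return;
	}

	/* Compacted mode: enabled features are packed back to back. */
	memset(xstate_comp_offsets, 0, sizeof(xstate_comp_offsets));
	for (i = 0; i < NR_FEATURES; i++) {
		if (!(enabled_mask & (1u << i)))
			continue;
		xstate_comp_offsets[i] = next;
		next += xstate_sizes[i];
	}
}

/* Model of the lookup: always goes through xstate_comp_offsets. */
static unsigned int raw_xsave_offset(unsigned int feature)
{
	assert(feature < NR_FEATURES);
	return xstate_comp_offsets[feature];
}
```

Because the lookup always reads xstate_comp_offsets, it returns the right offset in either mode, which is why the "Note:" is inaccurate.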
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181128222035.2996-7-bigeasy@linutronix.de
Currently we issue an MFENCE before and after flushing a range. This
means that if we flush a bunch of single page ranges -- like with the
cpa array, we issue a whole bunch of superfluous MFENCEs.
Reorganize the code a little to avoid this.
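The saving can be sketched with a userspace model that only counts operations. fake_mfence() and fake_flush_page() are hypothetical stand-ins, not the kernel helpers.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int fence_count;	/* simulated MFENCE instructions */
static unsigned int flush_count;	/* simulated page flushes */

static void fake_mfence(void)
{
	fence_count++;
}

static void fake_flush_page(unsigned long addr)
{
	(void)addr;
	flush_count++;
}

/* Before: every single-page range is fenced on its own. */
static void flush_pages_fence_each(const unsigned long *pages, size_t n)
{
	size_t i;

	assert(pages != NULL || n == 0);
	for (i = 0; i < n; i++) {
		fake_mfence();
		fake_flush_page(pages[i]);
		fake_mfence();
	}
}

/* After: one fence before and one after the whole batch. */
static void flush_pages_fence_once(const unsigned long *pages, size_t n)
{
	size_t i;

	assert(pages != NULL || n == 0);
	fake_mfence();
	for (i = 0; i < n; i++)
		fake_flush_page(pages[i]);
	fake_mfence();
}
```

For an array of four pages, the first variant issues eight fences and the second issues two, with the same number of flushes.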
[ mingo: capitalize instructions, tweak changelog and comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom.StDenis@amd.com
Cc: dave.hansen@intel.com
Link: http://lkml.kernel.org/r/20181203171043.626999883@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull Kbuild fixes from Masahiro Yamada:
- fix build errors in binrpm-pkg and bindeb-pkg targets
- fix false positive matches in merge_config.sh
- fix build version mismatch in deb-pkg target
- fix dtbs_install handling in (bin)deb-pkg target
- revert a commit that allows setlocalversion to write to source tree
* tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
builddeb: Fix inclusion of dtbs in debian package
Revert "scripts/setlocalversion: git: Make -dirty check more robust"
kbuild: deb-pkg: fix too low build version number
kconfig: merge_config: avoid false positive matches from comment lines
kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
kbuild: rpm-pkg: fix binrpm-pkg breakage when O= is used
If sch_fq is used at ingress, skbs that were timestamped by
net_timestamp_set() because a packet capture is requesting
timestamps could be delayed by an arbitrary amount of time,
since the sch_fq time base is MONOTONIC.
Fix this problem by moving code from sch_netem.c to act_mirred.c.
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for fixing thread stack processing for the idle task,
avoid direct reference to the thread's stack. The thread stack will
change to an array of thread stacks, at which point the meaning of the
direct reference will change.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-4-adrian.hunter@intel.com
[ Rename thread_stack__ts() to thread__stack() since this operates on a 'thread' struct ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
filechk_* rules often consist of multiple 'echo' lines. They must be
surrounded with { } or ( ) to work correctly. Otherwise, only the
string from the last 'echo' would be written into the target.
Let's take care of that in the 'filechk' in scripts/Kbuild.include
to clean up filechk_* rules.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull documentation fixes from Jonathan Corbet:
"A handful of late-arriving documentation fixes"
* tag 'docs-5.0-fixes' of git://git.lwn.net/linux:
doc: filesystems: fix bad references to nonexistent ext4.rst file
Documentation/admin-guide: update URL of LKML information link
Docs/kernel-api.rst: Remove blk-tag.c reference
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
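The difference between the two checks can be shown with a userspace model of the ERR_PTR convention. The macros below imitate the kernel's include/linux/err.h, and fake_ioremap_resource() is a hypothetical stand-in for devm_ioremap_resource().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* The top MAX_ERRNO addresses encode negative errno values. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping helper: returns ERR_PTR() on failure, never NULL. */
static void *fake_ioremap_resource(int fail)
{
	static char regs[16];

	return fail ? ERR_PTR(-ENOMEM) : (void *)regs;
}

/* Buggy check: the NULL test never fires, so the failure is missed. */
static int probe_null_check(int fail)
{
	void *base = fake_ioremap_resource(fail);

	if (!base)
		return -ENOMEM;
	return 0;
}

/* Fixed check: IS_ERR() catches the encoded error. */
static int probe_is_err_check(int fail)
{
	void *base = fake_ioremap_resource(fail);

	if (IS_ERR(base))
		return (int)PTR_ERR(base);
	return 0;
}
```

With the NULL test, a failed mapping is reported as success and the error pointer would later be dereferenced; with IS_ERR(), the probe returns -ENOMEM.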
Fixes: f24fcff1d267 ("hwspinlock: add STM32 hwspinlock device")
Acked-by: Benjamin Gaignard <benjamin.gaignard@gmail.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
The xfstests generic/475 test switches the underlying device with
dm-error while running a stress test. This results in a large number
of file system errors, and since we can't lock the buffer head when
marking the superblock dirty in the ext4_grp_locked_error() case, it's
possible for the superblock to be !buffer_uptodate() without
buffer_write_io_error() being true.
We need to set buffer_uptodate() before we call mark_buffer_dirty() or
this will trigger a WARN_ON. It's safe to do this since the
superblock must have been properly read into memory or the mount would
not have been successful. So if buffer_uptodate() is not set, we can
safely assume that this happened due to a failed attempt to write the
superblock.
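The ordering requirement can be modelled in userspace as below. struct fake_bh, fake_mark_buffer_dirty() and commit_super() are hypothetical simplifications; the real buffer-head API carries far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a buffer head: only the two flags discussed here. */
struct fake_bh {
	bool uptodate;
	bool dirty;
};

static unsigned int warn_count;	/* counts simulated WARN_ON hits */

/* Model: dirtying a buffer that is not uptodate triggers a warning. */
static void fake_mark_buffer_dirty(struct fake_bh *bh)
{
	if (!bh->uptodate)
		warn_count++;
	bh->dirty = true;
}

/*
 * Model of committing the superblock. With set_uptodate_first, a buffer
 * whose uptodate flag was cleared by a failed write is marked uptodate
 * again before it is dirtied.
 */
static void commit_super(struct fake_bh *sbh, bool set_uptodate_first)
{
	assert(sbh != NULL);
	if (set_uptodate_first && !sbh->uptodate)
		sbh->uptodate = true;
	fake_mark_buffer_dirty(sbh);
}
```

In this model only the call that skips the uptodate step raises the warning.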
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Pull x86 build updates from Ingo Molnar:
- Resolve LLVM build bug by removing redundant GNU specific flag
- Remove obsolete -funit-at-a-time and -fno-unit-at-a-time use from x86,
PowerPC and UM.
The UML change was seen and acked by UML maintainer Richard
Weinberger.
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/um/vdso: Drop implicit common-page-size linker flag
x86, powerpc: Remove -funit-at-a-time compiler option entirely
x86/um: Remove -fno-unit-at-a-time workaround for pre-4.0 GCC
... and make it static. It is called only by the kretprobe_trampoline()
from asm.
It was marked __visible so that it is visible outside of the current
compilation unit but that is not needed as it is used only in this
compilation unit.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20181205162526.GB109259@gmail.com