Linux kernel
============
There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.
In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``. The formatted documentation can also be read online at:
https://www.kernel.org/doc/html/latest/
There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
code
Clone this repository
https://tangled.org/tjh.dev/kernel
git@gordian.tjh.dev:tjh.dev/kernel
For self-hosted knots, clone URLs may differ based on your setup.
... and get rid of a bunch of bugs in it. Background:
the reason for path_mountpoint() is that umount() really doesn't
want attempts to revalidate the root of what it's trying to umount.
The thing we want to avoid actually happen from complete_walk();
solution was to do something parallel to normal path_lookupat()
and it both went overboard and got the boilerplate subtly
(and not so subtly) wrong.
A better solution is to do pretty much what the normal path_lookupat()
does, but instead of complete_walk() do unlazy_walk(). All it takes
to avoid that ->d_weak_revalidate() call... mountpoint_last() goes
away, along with everything it got wrong, and so does the magic around
LOOKUP_NO_REVAL.
Another source of bugs is that when we traverse mounts at the final
location (and we need to do that - umount . expects to get whatever's
overmounting ., if any, out of the lookup) we really ought to take
care of ->d_manage() - as it is, manual umount of autofs automount
in progress can lead to unpleasant surprises for the daemon. Easily
solved by using handle_lookup_down() instead of follow_mount().
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When a filesystem is unmounted, we currently call fsnotify_sb_delete()
before evict_inodes(), which means that fsnotify_unmount_inodes()
must iterate over all inodes on the superblock looking for any inodes
with watches. This is inefficient and can lead to livelocks as it
iterates over many unwatched inodes.
At this point, SB_ACTIVE is gone and dropping refcount to zero kicks
the inode out out immediately, so anything processed by
fsnotify_sb_delete / fsnotify_unmount_inodes gets evicted in that loop.
After that, the call to evict_inodes will evict everything else with a
zero refcount.
This should speed things up overall, and avoid livelocks in
fsnotify_unmount_inodes().
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.
Previous efforts were made in 2 functions, see:
c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes
but there hasn't been an audit of all walkers, so do that now. This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.
One loop remains: remove_dquot_ref(), because I'm not quite sure how
to deal with that one w/o taking the i_lock.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We cannot look at 'i->pipe' unless we know the iter is a pipe. Move the
ring_size load to a branch in iov_iter_alignment() where we've already
checked the iter is a pipe to avoid bogus dereference.
Reported-by: syzbot+bea68382bae9490e7dd6@syzkaller.appspotmail.com
Fixes: 8cefc107ca54 ("pipe: Use head and tail pointers for the ring, not cursor and length")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull networking fixes from David Miller:
1) More jumbo frame fixes in r8169, from Heiner Kallweit.
2) Fix bpf build in minimal configuration, from Alexei Starovoitov.
3) Use after free in slcan driver, from Jouni Hogander.
4) Flower classifier port ranges don't work properly in the HW offload
case, from Yoshiki Komachi.
5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.
6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.
7) Fix flow dissection in dsa TX path, from Alexander Lobakin.
8) Stale syncookie timestampe fixes from Guillaume Nault.
[ Did an evil merge to silence a warning introduced by this pull - Linus ]
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
r8169: fix rtl_hw_jumbo_disable for RTL8168evl
net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
r8169: add missing RX enabling for WoL on RTL8125
vhost/vsock: accept only packets with the right dst_cid
net: phy: dp83867: fix hfs boot in rgmii mode
net: ethernet: ti: cpsw: fix extra rx interrupt
inet: protect against too small mtu values.
gre: refetch erspan header from skb->data after pskb_may_pull()
pppoe: remove redundant BUG_ON() check in pppoe_pernet
tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
tcp: tighten acceptance of ACKs not matching a child socket
tcp: fix rejected syncookies due to stale timestamps
lpc_eth: kernel BUG on remove
tcp: md5: fix potential overestimation of TCP option space
net: sched: allow indirect blocks to bind to clsact in TC
net: core: rename indirect block ingress cb function
net-sysfs: Call dev_hold always in netdev_queue_add_kobject
net: dsa: fix flow dissection on Tx path
net/tls: Fix return values to avoid ENOTSUPP
net: avoid an indirect call in ____sys_recvmsg()
...
Pull more SCSI updates from James Bottomley:
"Eleven patches, all in drivers (no core changes) that are either minor
cleanups or small fixes.
They were late arriving, but still safe for -rc1"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: MAINTAINERS: Add the linux-scsi mailing list to the ISCSI entry
scsi: megaraid_sas: Make poll_aen_lock static
scsi: sd_zbc: Improve report zones error printout
scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI
scsi: qla2xxx: unregister ports after GPN_FT failure
scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan
scsi: pm80xx: Remove unused include of linux/version.h
scsi: pm80xx: fix logic to break out of loop when register value is 2 or 3
scsi: scsi_transport_sas: Fix memory leak when removing devices
scsi: lpfc: size cpu map by last cpu id set
scsi: ibmvscsi_tgt: Remove unneeded variable rc
In referenced fix we removed the RTL8168e-specific jumbo config for
RTL8168evl in rtl_hw_jumbo_enable(). We have to do the same in
rtl_hw_jumbo_disable().
v2: fix referenced commit id
Fixes: 14012c9f3bb9 ("r8169: fix jumbo configuration for RTL8168evl")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cifs fixes from Steve French:
"Nine cifs/smb3 fixes:
- one fix for stable (oops during oplock break)
- two timestamp fixes including important one for updating mtime at
close to avoid stale metadata caching issue on dirty files (also
improves perf by using SMB2_CLOSE_FLAG_POSTQUERY_ATTRIB over the
wire)
- two fixes for "modefromsid" mount option for file create (now
allows mode bits to be set more atomically and accurately on create
by adding "sd_context" on create when modefromsid specified on
mount)
- two fixes for multichannel found in testing this week against
different servers
- two small cleanup patches"
* tag '5.5-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
smb3: improve check for when we send the security descriptor context on create
smb3: fix mode passed in on create for modetosid mount option
cifs: fix possible uninitialized access and race on iface_list
cifs: Fix lookup of SMB connections on multichannel
smb3: query attributes on file close
smb3: remove unused flag passed into close functions
cifs: remove redundant assignment to pointer pneg_ctxt
fs: cifs: Fix atime update check vs mtime
CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks
Most people who review iSCSI are following linux-scsi, but some are not in
open-scsi. Make sure we are routing iSCSI patches to the right list.
Link: https://lore.kernel.org/r/85h82rvqza.fsf@collabora.com
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use the new tcf_proto_check_kind() helper to make sure user
provided value is well formed.
BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:606 [inline]
BUG: KMSAN: uninit-value in string+0x4be/0x600 lib/vsprintf.c:668
CPU: 0 PID: 12358 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
__msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
string_nocheck lib/vsprintf.c:606 [inline]
string+0x4be/0x600 lib/vsprintf.c:668
vsnprintf+0x218f/0x3210 lib/vsprintf.c:2510
__request_module+0x2b1/0x11c0 kernel/kmod.c:143
tcf_proto_lookup_ops+0x171/0x700 net/sched/cls_api.c:139
tc_chain_tmplt_add net/sched/cls_api.c:2730 [inline]
tc_ctl_chain+0x1904/0x38a0 net/sched/cls_api.c:2850
rtnetlink_rcv_msg+0x115a/0x1580 net/core/rtnetlink.c:5224
netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5242
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x110f/0x1330 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311
__sys_sendmsg net/socket.c:2356 [inline]
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg+0x305/0x460 net/socket.c:2363
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45a649
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f0790795c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045a649
RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000006
RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f07907966d4
R13: 00000000004c8db5 R14: 00000000004df630 R15: 00000000ffffffff
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
slab_alloc_node mm/slub.c:2773 [inline]
__kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
__kmalloc_reserve net/core/skbuff.c:141 [inline]
__alloc_skb+0x306/0xa10 net/core/skbuff.c:209
alloc_skb include/linux/skbuff.h:1049 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
netlink_sendmsg+0x783/0x1330 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311
__sys_sendmsg net/socket.c:2356 [inline]
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg+0x305/0x460 net/socket.c:2363
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 6f96c3c6904c ("net_sched: fix backward compatibility for TCA_KIND")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull misc vfs cleanups from Al Viro:
"No common topic, just three cleanups".
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
make __d_alloc() static
fs/namespace: add __user to open_tree and move_mount syscalls
fs/fnctl: fix missing __user in fcntl_rw_hint()
We had cases in the previous patch where we were sending the security
descriptor context on SMB3 open (file create) in cases when we hadn't
mounted with with "modefromsid" mount option.
Add check for that mount flag before calling ad_sd_context in
open init.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Fix sparse warning:
drivers/scsi/megaraid/megaraid_sas_base.c:187:12:
warning: symbol 'poll_aen_lock' was not declared. Should it be static?
Link: https://lore.kernel.org/r/20191125144454.22680-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>