Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Johannes Berg:
"The only really new thing is the long-standing seccomp work
(originally from 2021!). Wven if it still isn't enabled by default due
to security concerns it can still be used e.g. for tests.

- remove obsolete network transports

- remove PCI IO port support

- start adding seccomp-based process handling instead of ptrace"

* tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (29 commits)
um: remove "extern" from implementation of sigchld_handler
um: fix unused variable warning
um: fix SECCOMP 32bit xstate register restore
um: pass FD for memory operations when needed
um: Add SECCOMP support detection and initialization
um: Implement kernel side of SECCOMP based process handling
um: Track userspace children dying in SECCOMP mode
um: Add helper functions to get/set state for SECCOMP
um: Add stub side of SECCOMP/futex based process handling
um: Move faultinfo extraction into userspace routine
um: vector: Use mac_pton() for MAC address parsing
um: vector: Clean up and modernize log messages
um: chan_kern: use raw spinlock for irqs_to_free_lock
MAINTAINERS: remove obsolete file entry in TUN/TAP DRIVER
um: Fix tgkill compile error on old host OSes
um: stop using PCI port I/O
um: Remove legacy network transport infrastructure
um: vector: Eliminate the dependency on uml_net
um: Remove obsolete legacy network transports
um/asm: Replace "REP; NOP" with PAUSE mnemonic
...

+2683 -4361
+7 -40
Documentation/virt/uml/user_mode_linux_howto_v2.rst
···
 are creating its image. It is a good idea to change that to avoid
 "Oh, bummer, I rebooted the wrong machine".

-UML supports two classes of network devices - the older uml_net ones
-which are scheduled for obsoletion. These are called ethX. It also
-supports the newer vector IO devices which are significantly faster
-and have support for some standard virtual network encapsulations like
-Ethernet over GRE and Ethernet over L2TPv3. These are called vec0.
+UML supports vector I/O high performance network devices which have
+support for some standard virtual network encapsulations like
+Ethernet over GRE and Ethernet over L2TPv3. These are called vecX.

-Depending on which one is in use, ``/etc/network/interfaces`` will
-need entries like::
-
-   # legacy UML network devices
-   auto eth0
-   iface eth0 inet dhcp
+When vector network devices are in use, ``/etc/network/interfaces``
+will need entries like::

    # vector UML network devices
    auto vec0
···
 +-----------+--------+------------------------------------+------------+
 | vde       | vector | dep. on VDE VPN: Virt.Net Locator  | varies     |
 +-----------+--------+------------------------------------+------------+
-| tuntap    | legacy | none                               | ~ 500Mbit  |
-+-----------+--------+------------------------------------+------------+
-| daemon    | legacy | none                               | ~ 450Mbit  |
-+-----------+--------+------------------------------------+------------+
-| socket    | legacy | none                               | ~ 450Mbit  |
-+-----------+--------+------------------------------------+------------+
-| ethertap  | legacy | obsolete                           | ~ 500Mbit  |
-+-----------+--------+------------------------------------+------------+
-| vde       | legacy | obsolete                           | ~ 500Mbit  |
-+-----------+--------+------------------------------------+------------+

 * All transports which have tso and checksum offloads can deliver speeds
   approaching 10G on TCP streams.
···
 * All transports which have multi-packet rx and/or tx can deliver pps
   rates of up to 1Mps or more.

-* All legacy transports are generally limited to ~600-700MBit and 0.05Mps.
-
 * GRE and L2TPv3 allow connections to all of: local machine, remote
   machines, remote network devices and remote UML instances.
-
-* Socket allows connections only between UML instances.
-
-* Daemon and bess require running a local switch. This switch may be
-  connected to the host as well.


 Network configuration privileges
 ================================

 The majority of the supported networking modes need ``root`` privileges.
-For example, in the legacy tuntap networking mode, users were required
-to be part of the group associated with the tunnel device.
-
-For newer network drivers like the vector transports, ``root`` privilege
-is required to fire an ioctl to setup the tun interface and/or use
-raw sockets where needed.
+For example, for vector transports, ``root`` privilege is required to fire
+an ioctl to setup the tun interface and/or use raw sockets where needed.

 This can be achieved by granting the user a particular capability instead
 of running UML as root. In case of vector transport, a user can add the
···
 connect to a local area cloud (all the UML nodes using the same
 multicast address running on hosts in the same multicast domain (LAN)
 will be automagically connected together to a virtual LAN.
-
-Configuring Legacy transports
-=============================
-
-Legacy transports are now considered obsolete. Please use the vector
-versions.

 ***********
 Running UML
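For reference, a vector device is named on the UML command line and then configured inside the guest like any other interface. The option values below are illustrative only (check the howto for the transports and options your kernel supports):

```
# host: attach a vector device backed by a tap interface
./linux mem=1G ubd0=disk.img vec0:transport=tap,ifname=uml0

# guest: /etc/network/interfaces
auto vec0
iface vec0 inet dhcp
```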
+1 -2
MAINTAINERS
···
 S:	Orphan
 F:	drivers/net/ethernet/dec/tulip/

-TUN/TAP driver
+TUN/TAP DRIVER
 M:	Willem de Bruijn <willemdebruijn.kernel@gmail.com>
 M:	Jason Wang <jasowang@redhat.com>
 S:	Maintained
 W:	http://vtun.sourceforge.net/tun
 F:	Documentation/networking/tuntap.rst
-F:	arch/um/os-Linux/drivers/
 F:	drivers/net/tap.c
 F:	drivers/net/tun*
-6
arch/um/Kconfig
···
 config UML_IOMEM_EMULATION
 	bool
 	select INDIRECT_IOMEM
-	select HAS_IOPORT
 	select GENERIC_PCI_IOMAP
-	select GENERIC_IOMAP
-	select NO_GENERIC_PCI_IOPORT_MAP
-
-config NO_IOPORT_MAP
-	def_bool !UML_IOMEM_EMULATION

 config ISA
 	bool
-7
arch/um/configs/i386_defconfig
···
 CONFIG_UNIX=y
 CONFIG_INET=y
 # CONFIG_IPV6 is not set
-CONFIG_UML_NET=y
-CONFIG_UML_NET_ETHERTAP=y
-CONFIG_UML_NET_TUNTAP=y
-CONFIG_UML_NET_SLIP=y
-CONFIG_UML_NET_DAEMON=y
-CONFIG_UML_NET_MCAST=y
-CONFIG_UML_NET_SLIRP=y
 CONFIG_EXT4_FS=y
 CONFIG_QUOTA=y
 CONFIG_AUTOFS_FS=m
-7
arch/um/configs/x86_64_defconfig
···
 CONFIG_UNIX=y
 CONFIG_INET=y
 # CONFIG_IPV6 is not set
-CONFIG_UML_NET=y
-CONFIG_UML_NET_ETHERTAP=y
-CONFIG_UML_NET_TUNTAP=y
-CONFIG_UML_NET_SLIP=y
-CONFIG_UML_NET_DAEMON=y
-CONFIG_UML_NET_MCAST=y
-CONFIG_UML_NET_SLIRP=y
 CONFIG_EXT4_FS=y
 CONFIG_QUOTA=y
 CONFIG_AUTOFS_FS=m
+12 -192
arch/um/drivers/Kconfig
···
 menu "UML Network Devices"
 	depends on NET

-# UML virtual driver
-config UML_NET
-	bool "Virtual network device"
-	help
-	  While the User-Mode port cannot directly talk to any physical
-	  hardware devices, this choice and the following transport options
-	  provide one or more virtual network devices through which the UML
-	  kernels can talk to each other, the host, and with the host's help,
-	  machines on the outside world.
-
-	  For more information, including explanations of the networking and
-	  sample configurations, see
-	  <http://user-mode-linux.sourceforge.net/old/networking.html>.
-
-	  If you'd like to be able to enable networking in the User-Mode
-	  linux environment, say Y; otherwise say N. Note that you must
-	  enable at least one of the following transport options to actually
-	  make use of UML networking.
-
-config UML_NET_ETHERTAP
-	bool "Ethertap transport (obsolete)"
-	depends on UML_NET
-	help
-	  The Ethertap User-Mode Linux network transport allows a single
-	  running UML to exchange packets with its host over one of the
-	  host's Ethertap devices, such as /dev/tap0. Additional running
-	  UMLs can use additional Ethertap devices, one per running UML.
-	  While the UML believes it's on a (multi-device, broadcast) virtual
-	  Ethernet network, it's in fact communicating over a point-to-point
-	  link with the host.
-
-	  To use this, your host kernel must have support for Ethertap
-	  devices. Also, if your host kernel is 2.4.x, it must have
-	  CONFIG_NETLINK_DEV configured as Y or M.
-
-	  For more information, see
-	  <http://user-mode-linux.sourceforge.net/old/networking.html> That site
-	  has examples of the UML command line to use to enable Ethertap
-	  networking.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_TUNTAP
-	bool "TUN/TAP transport (obsolete)"
-	depends on UML_NET
-	help
-	  The UML TUN/TAP network transport allows a UML instance to exchange
-	  packets with the host over a TUN/TAP device. This option will only
-	  work with a 2.4 host, unless you've applied the TUN/TAP patch to
-	  your 2.2 host kernel.
-
-	  To use this transport, your host kernel must have support for TUN/TAP
-	  devices, either built-in or as a module.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_SLIP
-	bool "SLIP transport (obsolete)"
-	depends on UML_NET
-	help
-	  The slip User-Mode Linux network transport allows a running UML to
-	  network with its host over a point-to-point link. Unlike Ethertap,
-	  which can carry any Ethernet frame (and hence even non-IP packets),
-	  the slip transport can only carry IP packets.
-
-	  To use this, your host must support slip devices.
-
-	  For more information, see
-	  <http://user-mode-linux.sourceforge.net/old/networking.html>.
-	  has examples of the UML command line to use to enable slip
-	  networking, and details of a few quirks with it.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_DAEMON
-	bool "Daemon transport (obsolete)"
-	depends on UML_NET
-	help
-	  This User-Mode Linux network transport allows one or more running
-	  UMLs on a single host to communicate with each other, but not to
-	  the host.
-
-	  To use this form of networking, you'll need to run the UML
-	  networking daemon on the host.
-
-	  For more information, see
-	  <http://user-mode-linux.sourceforge.net/old/networking.html> That site
-	  has examples of the UML command line to use to enable Daemon
-	  networking.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_DAEMON_DEFAULT_SOCK
-	string "Default socket for daemon transport"
-	default "/tmp/uml.ctl"
-	depends on UML_NET_DAEMON
-	help
-	  This option allows setting the default socket for the daemon
-	  transport, normally it defaults to /tmp/uml.ctl.
-
 config UML_NET_VECTOR
 	bool "Vector I/O high performance network devices"
-	depends on UML_NET
 	select MAY_HAVE_RUNTIME_DEPS
 	help
 	  This User-Mode Linux network driver uses multi-message send
 	  and receive functions. The host running the UML guest must have
 	  a linux kernel version above 3.0 and a libc version > 2.13.
-	  This driver provides tap, raw, gre and l2tpv3 network transports
-	  with up to 4 times higher network throughput than the UML network
-	  drivers.
+	  This driver provides tap, raw, gre and l2tpv3 network transports.

-config UML_NET_VDE
-	bool "VDE transport (obsolete)"
-	depends on UML_NET
-	depends on !MODVERSIONS
-	select MAY_HAVE_RUNTIME_DEPS
-	help
-	  This User-Mode Linux network transport allows one or more running
-	  UMLs on a single host to communicate with each other and also
-	  with the rest of the world using Virtual Distributed Ethernet,
-	  an improved fork of uml_switch.
-
-	  You must have libvdeplug installed in order to build the vde
-	  transport into UML.
-
-	  To use this form of networking, you will need to run vde_switch
-	  on the host.
-
-	  For more information, see <http://wiki.virtualsquare.org/>
-	  That site has a good overview of what VDE is and also examples
-	  of the UML command line to use to enable VDE networking.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_MCAST
-	bool "Multicast transport (obsolete)"
-	depends on UML_NET
-	help
-	  This Multicast User-Mode Linux network transport allows multiple
-	  UMLs (even ones running on different host machines!) to talk to
-	  each other over a virtual ethernet network. However, it requires
-	  at least one UML with one of the other transports to act as a
-	  bridge if any of them need to be able to talk to their hosts or any
-	  other IP machines.
-
-	  To use this, your host kernel(s) must support IP Multicasting.
-
-	  For more information, see
-	  <http://user-mode-linux.sourceforge.net/old/networking.html> That site
-	  has examples of the UML command line to use to enable Multicast
-	  networking, and notes about the security of this approach.
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-config UML_NET_SLIRP
-	bool "SLiRP transport (obsolete)"
-	depends on UML_NET
-	help
-	  The SLiRP User-Mode Linux network transport allows a running UML
-	  to network by invoking a program that can handle SLIP encapsulated
-	  packets. This is commonly (but not limited to) the application
-	  known as SLiRP, a program that can re-socket IP packets back onto
-	  he host on which it is run. Only IP packets are supported,
-	  unlike other network transports that can handle all Ethernet
-	  frames. In general, slirp allows the UML the same IP connectivity
-	  to the outside world that the host user is permitted, and unlike
-	  other transports, SLiRP works without the need of root level
-	  privileges, setuid binaries, or SLIP devices on the host. This
-	  also means not every type of connection is possible, but most
-	  situations can be accommodated with carefully crafted slirp
-	  commands that can be passed along as part of the network device's
-	  setup string. The effect of this transport on the UML is similar
-	  that of a host behind a firewall that masquerades all network
-	  connections passing through it (but is less secure).
-
-	  NOTE: THIS TRANSPORT IS DEPRECATED AND WILL BE REMOVED SOON!!! Please
-	  migrate to UML_NET_VECTOR.
-
-	  If unsure, say N.
-
-	  Startup example: "eth0=slirp,FE:FD:01:02:03:04,/usr/local/bin/slirp"
+	  For more information, including explanations of the networking
+	  and sample configurations, see
+	  <file:Documentation/virt/uml/user_mode_linux_howto_v2.rst>.

 endmenu
···
 	  There's no official device ID assigned (yet), set the one you
 	  wish to use for experimentation here. The default of -1 is
 	  not valid and will cause the driver to fail at probe.
+
+config UML_PCI_OVER_VFIO
+	bool "Enable VFIO-based PCI passthrough"
+	select UML_PCI
+	help
+	  This driver provides support for VFIO-based PCI passthrough.
+	  Currently, only MSI-X capable devices are supported, and it
+	  is assumed that drivers will use MSI-X.
+3 -19
arch/um/drivers/Makefile
···
 # pcap is broken in 2.5 because kbuild doesn't allow pcap.a to be linked
 # in to pcap.o

-slip-objs := slip_kern.o slip_user.o
-slirp-objs := slirp_kern.o slirp_user.o
-daemon-objs := daemon_kern.o daemon_user.o
 vector-objs := vector_kern.o vector_user.o vector_transports.o
-umcast-objs := umcast_kern.o umcast_user.o
-net-objs := net_kern.o net_user.o
 mconsole-objs := mconsole_kern.o mconsole_user.o
 hostaudio-objs := hostaudio_kern.o
 ubd-objs := ubd_kern.o ubd_user.o
···
 harddog-objs := harddog_kern.o
 harddog-builtin-$(CONFIG_UML_WATCHDOG) := harddog_user.o harddog_user_exp.o
 rtc-objs := rtc_kern.o rtc_user.o
-
-LDFLAGS_vde.o = $(shell $(CC) $(CFLAGS) -print-file-name=libvdeplug.a)
-
-targets := vde_kern.o vde_user.o
-
-$(obj)/vde.o: $(obj)/vde_kern.o $(obj)/vde_user.o
-	$(LD) -r -dp -o $@ $^ $(ld_flags)
+vfio_uml-objs := vfio_kern.o vfio_user.o

 #XXX: The call below does not work because the flags are added before the
 # object name, so nothing from the library gets linked.
···
 obj-$(CONFIG_SSL) += ssl.o
 obj-$(CONFIG_STDERR_CONSOLE) += stderr_console.o

-obj-$(CONFIG_UML_NET_SLIP) += slip.o slip_common.o
-obj-$(CONFIG_UML_NET_SLIRP) += slirp.o slip_common.o
-obj-$(CONFIG_UML_NET_DAEMON) += daemon.o
 obj-$(CONFIG_UML_NET_VECTOR) += vector.o
-obj-$(CONFIG_UML_NET_VDE) += vde.o
-obj-$(CONFIG_UML_NET_MCAST) += umcast.o
-obj-$(CONFIG_UML_NET) += net.o
 obj-$(CONFIG_MCONSOLE) += mconsole.o
 obj-$(CONFIG_MMAPPER) += mmapper_kern.o
 obj-$(CONFIG_BLK_DEV_UBD) += ubd.o
···
 obj-$(CONFIG_UML_RTC) += rtc.o
 obj-$(CONFIG_UML_PCI) += virt-pci.o
 obj-$(CONFIG_UML_PCI_OVER_VIRTIO) += virtio_pcidev.o
+obj-$(CONFIG_UML_PCI_OVER_VFIO) += vfio_uml.o

 # pcap_user.o must be added explicitly.
-USER_OBJS := fd.o null.o pty.o tty.o xterm.o slip_common.o vde_user.o vector_user.o
+USER_OBJS := fd.o null.o pty.o tty.o xterm.o vector_user.o
 CFLAGS_null.o = -DDEV_NULL=$(DEV_NULL_PATH)

 CFLAGS_xterm.o += '-DCONFIG_XTERM_CHAN_DEFAULT_EMULATOR="$(CONFIG_XTERM_CHAN_DEFAULT_EMULATOR)"'
+5 -5
arch/um/drivers/chan_kern.c
···
  * be permanently disabled. This is discovered in IRQ context, but
  * the freeing of the IRQ must be done later.
  */
-static DEFINE_SPINLOCK(irqs_to_free_lock);
+static DEFINE_RAW_SPINLOCK(irqs_to_free_lock);
 static LIST_HEAD(irqs_to_free);

 void free_irqs(void)
···
 	struct list_head *ele;
 	unsigned long flags;

-	spin_lock_irqsave(&irqs_to_free_lock, flags);
+	raw_spin_lock_irqsave(&irqs_to_free_lock, flags);
 	list_splice_init(&irqs_to_free, &list);
-	spin_unlock_irqrestore(&irqs_to_free_lock, flags);
+	raw_spin_unlock_irqrestore(&irqs_to_free_lock, flags);

 	list_for_each(ele, &list) {
 		chan = list_entry(ele, struct chan, free_list);
···
 		return;

 	if (delay_free_irq) {
-		spin_lock_irqsave(&irqs_to_free_lock, flags);
+		raw_spin_lock_irqsave(&irqs_to_free_lock, flags);
 		list_add(&chan->free_list, &irqs_to_free);
-		spin_unlock_irqrestore(&irqs_to_free_lock, flags);
+		raw_spin_unlock_irqrestore(&irqs_to_free_lock, flags);
 	} else {
 		if (chan->input && chan->enabled)
 			um_free_irq(chan->line->read_irq, chan);
-29
arch/um/drivers/daemon.h
(file deleted)

/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
 */

#ifndef __DAEMON_H__
#define __DAEMON_H__

#include <net_user.h>

#define SWITCH_VERSION 3

struct daemon_data {
	char *sock_type;
	char *ctl_sock;
	void *ctl_addr;
	void *data_addr;
	void *local_addr;
	int fd;
	int control;
	void *dev;
};

extern const struct net_user_info daemon_user_info;

extern int daemon_user_write(int fd, void *buf, int len,
			     struct daemon_data *pri);

#endif
-95
arch/um/drivers/daemon_kern.c
(file deleted)

// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and
 * James Leu (jleu@mindspring.net).
 * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
 * Copyright (C) 2001 by various other people who didn't put their name here.
 */

#include <linux/init.h>
#include <linux/netdevice.h>
#include <net_kern.h>
#include "daemon.h"

struct daemon_init {
	char *sock_type;
	char *ctl_sock;
};

static void daemon_init(struct net_device *dev, void *data)
{
	struct uml_net_private *pri;
	struct daemon_data *dpri;
	struct daemon_init *init = data;

	pri = netdev_priv(dev);
	dpri = (struct daemon_data *) pri->user;
	dpri->sock_type = init->sock_type;
	dpri->ctl_sock = init->ctl_sock;
	dpri->fd = -1;
	dpri->control = -1;
	dpri->dev = dev;
	/* We will free this pointer. If it contains crap we're burned. */
	dpri->ctl_addr = NULL;
	dpri->data_addr = NULL;
	dpri->local_addr = NULL;

	printk("daemon backend (uml_switch version %d) - %s:%s",
	       SWITCH_VERSION, dpri->sock_type, dpri->ctl_sock);
	printk("\n");
}

static int daemon_read(int fd, struct sk_buff *skb, struct uml_net_private *lp)
{
	return net_recvfrom(fd, skb_mac_header(skb),
			    skb->dev->mtu + ETH_HEADER_OTHER);
}

static int daemon_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
{
	return daemon_user_write(fd, skb->data, skb->len,
				 (struct daemon_data *) &lp->user);
}

static const struct net_kern_info daemon_kern_info = {
	.init			= daemon_init,
	.protocol		= eth_protocol,
	.read			= daemon_read,
	.write			= daemon_write,
};

static int daemon_setup(char *str, char **mac_out, void *data)
{
	struct daemon_init *init = data;
	char *remain;

	*init = ((struct daemon_init)
		{ .sock_type	= "unix",
		  .ctl_sock	= CONFIG_UML_NET_DAEMON_DEFAULT_SOCK });

	remain = split_if_spec(str, mac_out, &init->sock_type, &init->ctl_sock,
			       NULL);
	if (remain != NULL)
		printk(KERN_WARNING "daemon_setup : Ignoring data socket "
		       "specification\n");

	return 1;
}

static struct transport daemon_transport = {
	.list		= LIST_HEAD_INIT(daemon_transport.list),
	.name		= "daemon",
	.setup		= daemon_setup,
	.user		= &daemon_user_info,
	.kern		= &daemon_kern_info,
	.private_size	= sizeof(struct daemon_data),
	.setup_size	= sizeof(struct daemon_init),
};

static int register_daemon(void)
{
	register_transport(&daemon_transport);
	return 0;
}

late_initcall(register_daemon);
-194
arch/um/drivers/daemon_user.c
(file deleted)

// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
 * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and
 * James Leu (jleu@mindspring.net).
 * Copyright (C) 2001 by various other people who didn't put their name here.
 */

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include "daemon.h"
#include <net_user.h>
#include <os.h>
#include <um_malloc.h>

enum request_type { REQ_NEW_CONTROL };

#define SWITCH_MAGIC 0xfeedface

struct request_v3 {
	uint32_t magic;
	uint32_t version;
	enum request_type type;
	struct sockaddr_un sock;
};

static struct sockaddr_un *new_addr(void *name, int len)
{
	struct sockaddr_un *sun;

	sun = uml_kmalloc(sizeof(struct sockaddr_un), UM_GFP_KERNEL);
	if (sun == NULL) {
		printk(UM_KERN_ERR "new_addr: allocation of sockaddr_un "
		       "failed\n");
		return NULL;
	}
	sun->sun_family = AF_UNIX;
	memcpy(sun->sun_path, name, len);
	return sun;
}

static int connect_to_switch(struct daemon_data *pri)
{
	struct sockaddr_un *ctl_addr = pri->ctl_addr;
	struct sockaddr_un *local_addr = pri->local_addr;
	struct sockaddr_un *sun;
	struct request_v3 req;
	int fd, n, err;

	pri->control = socket(AF_UNIX, SOCK_STREAM, 0);
	if (pri->control < 0) {
		err = -errno;
		printk(UM_KERN_ERR "daemon_open : control socket failed, "
		       "errno = %d\n", -err);
		return err;
	}

	if (connect(pri->control, (struct sockaddr *) ctl_addr,
		    sizeof(*ctl_addr)) < 0) {
		err = -errno;
		printk(UM_KERN_ERR "daemon_open : control connect failed, "
		       "errno = %d\n", -err);
		goto out;
	}

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0) {
		err = -errno;
		printk(UM_KERN_ERR "daemon_open : data socket failed, "
		       "errno = %d\n", -err);
		goto out;
	}
	if (bind(fd, (struct sockaddr *) local_addr, sizeof(*local_addr)) < 0) {
		err = -errno;
		printk(UM_KERN_ERR "daemon_open : data bind failed, "
		       "errno = %d\n", -err);
		goto out_close;
	}

	sun = uml_kmalloc(sizeof(struct sockaddr_un), UM_GFP_KERNEL);
	if (sun == NULL) {
		printk(UM_KERN_ERR "new_addr: allocation of sockaddr_un "
		       "failed\n");
		err = -ENOMEM;
		goto out_close;
	}

	req.magic = SWITCH_MAGIC;
	req.version = SWITCH_VERSION;
	req.type = REQ_NEW_CONTROL;
	req.sock = *local_addr;
	n = write(pri->control, &req, sizeof(req));
	if (n != sizeof(req)) {
		printk(UM_KERN_ERR "daemon_open : control setup request "
		       "failed, err = %d\n", -errno);
		err = -ENOTCONN;
		goto out_free;
	}

	n = read(pri->control, sun, sizeof(*sun));
	if (n != sizeof(*sun)) {
		printk(UM_KERN_ERR "daemon_open : read of data socket failed, "
		       "err = %d\n", -errno);
		err = -ENOTCONN;
		goto out_free;
	}

	pri->data_addr = sun;
	return fd;

 out_free:
	kfree(sun);
 out_close:
	close(fd);
 out:
	close(pri->control);
	return err;
}

static int daemon_user_init(void *data, void *dev)
{
	struct daemon_data *pri = data;
	struct timeval tv;
	struct {
		char zero;
		int pid;
		int usecs;
	} name;

	if (!strcmp(pri->sock_type, "unix"))
		pri->ctl_addr = new_addr(pri->ctl_sock,
					 strlen(pri->ctl_sock) + 1);
	name.zero = 0;
	name.pid = os_getpid();
	gettimeofday(&tv, NULL);
	name.usecs = tv.tv_usec;
	pri->local_addr = new_addr(&name, sizeof(name));
	pri->dev = dev;
	pri->fd = connect_to_switch(pri);
	if (pri->fd < 0) {
		kfree(pri->local_addr);
		pri->local_addr = NULL;
		return pri->fd;
	}

	return 0;
}

static int daemon_open(void *data)
{
	struct daemon_data *pri = data;
	return pri->fd;
}

static void daemon_remove(void *data)
{
	struct daemon_data *pri = data;

	close(pri->fd);
	pri->fd = -1;
	close(pri->control);
	pri->control = -1;

	kfree(pri->data_addr);
	pri->data_addr = NULL;
	kfree(pri->ctl_addr);
	pri->ctl_addr = NULL;
	kfree(pri->local_addr);
	pri->local_addr = NULL;
}

int daemon_user_write(int fd, void *buf, int len, struct daemon_data *pri)
{
	struct sockaddr_un *data_addr = pri->data_addr;

	return net_sendto(fd, buf, len, data_addr, sizeof(*data_addr));
}

const struct net_user_info daemon_user_info = {
	.init		= daemon_user_init,
	.open		= daemon_open,
	.close		= NULL,
	.remove		= daemon_remove,
	.add_address	= NULL,
	.delete_address	= NULL,
	.mtu		= ETH_MAX_PACKET,
	.max_packet	= ETH_MAX_PACKET + ETH_HEADER_OTHER,
};
-889
arch/um/drivers/net_kern.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and 5 - * James Leu (jleu@mindspring.net). 6 - * Copyright (C) 2001 by various other people who didn't put their name here. 7 - */ 8 - 9 - #include <linux/memblock.h> 10 - #include <linux/etherdevice.h> 11 - #include <linux/ethtool.h> 12 - #include <linux/inetdevice.h> 13 - #include <linux/init.h> 14 - #include <linux/list.h> 15 - #include <linux/netdevice.h> 16 - #include <linux/platform_device.h> 17 - #include <linux/rtnetlink.h> 18 - #include <linux/skbuff.h> 19 - #include <linux/slab.h> 20 - #include <linux/spinlock.h> 21 - #include <init.h> 22 - #include <irq_kern.h> 23 - #include <irq_user.h> 24 - #include "mconsole_kern.h" 25 - #include <net_kern.h> 26 - #include <net_user.h> 27 - 28 - #define DRIVER_NAME "uml-netdev" 29 - 30 - static DEFINE_SPINLOCK(opened_lock); 31 - static LIST_HEAD(opened); 32 - 33 - /* 34 - * The drop_skb is used when we can't allocate an skb. The 35 - * packet is read into drop_skb in order to get the data off the 36 - * connection to the host. 37 - * It is reallocated whenever a maximum packet size is seen which is 38 - * larger than any seen before. update_drop_skb is called from 39 - * eth_configure when a new interface is added. 
40 - */ 41 - static DEFINE_SPINLOCK(drop_lock); 42 - static struct sk_buff *drop_skb; 43 - static int drop_max; 44 - 45 - static int update_drop_skb(int max) 46 - { 47 - struct sk_buff *new; 48 - unsigned long flags; 49 - int err = 0; 50 - 51 - spin_lock_irqsave(&drop_lock, flags); 52 - 53 - if (max <= drop_max) 54 - goto out; 55 - 56 - err = -ENOMEM; 57 - new = dev_alloc_skb(max); 58 - if (new == NULL) 59 - goto out; 60 - 61 - skb_put(new, max); 62 - 63 - kfree_skb(drop_skb); 64 - drop_skb = new; 65 - drop_max = max; 66 - err = 0; 67 - out: 68 - spin_unlock_irqrestore(&drop_lock, flags); 69 - 70 - return err; 71 - } 72 - 73 - static int uml_net_rx(struct net_device *dev) 74 - { 75 - struct uml_net_private *lp = netdev_priv(dev); 76 - int pkt_len; 77 - struct sk_buff *skb; 78 - 79 - /* If we can't allocate memory, try again next round. */ 80 - skb = dev_alloc_skb(lp->max_packet); 81 - if (skb == NULL) { 82 - drop_skb->dev = dev; 83 - /* Read a packet into drop_skb and don't do anything with it. 
*/ 84 - (*lp->read)(lp->fd, drop_skb, lp); 85 - dev->stats.rx_dropped++; 86 - return 0; 87 - } 88 - 89 - skb->dev = dev; 90 - skb_put(skb, lp->max_packet); 91 - skb_reset_mac_header(skb); 92 - pkt_len = (*lp->read)(lp->fd, skb, lp); 93 - 94 - if (pkt_len > 0) { 95 - skb_trim(skb, pkt_len); 96 - skb->protocol = (*lp->protocol)(skb); 97 - 98 - dev->stats.rx_bytes += skb->len; 99 - dev->stats.rx_packets++; 100 - netif_rx(skb); 101 - return pkt_len; 102 - } 103 - 104 - kfree_skb(skb); 105 - return pkt_len; 106 - } 107 - 108 - static void uml_dev_close(struct work_struct *work) 109 - { 110 - struct uml_net_private *lp = 111 - container_of(work, struct uml_net_private, work); 112 - dev_close(lp->dev); 113 - } 114 - 115 - static irqreturn_t uml_net_interrupt(int irq, void *dev_id) 116 - { 117 - struct net_device *dev = dev_id; 118 - struct uml_net_private *lp = netdev_priv(dev); 119 - int err; 120 - 121 - if (!netif_running(dev)) 122 - return IRQ_NONE; 123 - 124 - spin_lock(&lp->lock); 125 - while ((err = uml_net_rx(dev)) > 0) ; 126 - if (err < 0) { 127 - printk(KERN_ERR 128 - "Device '%s' read returned %d, shutting it down\n", 129 - dev->name, err); 130 - /* dev_close can't be called in interrupt context, and takes 131 - * again lp->lock. 132 - * And dev_close() can be safely called multiple times on the 133 - * same device, since it tests for (dev->flags & IFF_UP). So 134 - * there's no harm in delaying the device shutdown. 135 - * Furthermore, the workqueue will not re-enqueue an already 136 - * enqueued work item. 
*/ 137 - schedule_work(&lp->work); 138 - goto out; 139 - } 140 - out: 141 - spin_unlock(&lp->lock); 142 - return IRQ_HANDLED; 143 - } 144 - 145 - static int uml_net_open(struct net_device *dev) 146 - { 147 - struct uml_net_private *lp = netdev_priv(dev); 148 - int err; 149 - 150 - if (lp->fd >= 0) { 151 - err = -ENXIO; 152 - goto out; 153 - } 154 - 155 - lp->fd = (*lp->open)(&lp->user); 156 - if (lp->fd < 0) { 157 - err = lp->fd; 158 - goto out; 159 - } 160 - 161 - err = um_request_irq(dev->irq, lp->fd, IRQ_READ, uml_net_interrupt, 162 - IRQF_SHARED, dev->name, dev); 163 - if (err < 0) { 164 - printk(KERN_ERR "uml_net_open: failed to get irq(%d)\n", err); 165 - err = -ENETUNREACH; 166 - goto out_close; 167 - } 168 - 169 - netif_start_queue(dev); 170 - 171 - /* clear buffer - it can happen that the host side of the interface 172 - * is full when we get here. In this case, new data is never queued, 173 - * SIGIOs never arrive, and the net never works. 174 - */ 175 - while ((err = uml_net_rx(dev)) > 0) ; 176 - 177 - spin_lock(&opened_lock); 178 - list_add(&lp->list, &opened); 179 - spin_unlock(&opened_lock); 180 - 181 - return 0; 182 - out_close: 183 - if (lp->close != NULL) (*lp->close)(lp->fd, &lp->user); 184 - lp->fd = -1; 185 - out: 186 - return err; 187 - } 188 - 189 - static int uml_net_close(struct net_device *dev) 190 - { 191 - struct uml_net_private *lp = netdev_priv(dev); 192 - 193 - netif_stop_queue(dev); 194 - 195 - um_free_irq(dev->irq, dev); 196 - if (lp->close != NULL) 197 - (*lp->close)(lp->fd, &lp->user); 198 - lp->fd = -1; 199 - 200 - spin_lock(&opened_lock); 201 - list_del(&lp->list); 202 - spin_unlock(&opened_lock); 203 - 204 - return 0; 205 - } 206 - 207 - static netdev_tx_t uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev) 208 - { 209 - struct uml_net_private *lp = netdev_priv(dev); 210 - unsigned long flags; 211 - int len; 212 - 213 - netif_stop_queue(dev); 214 - 215 - spin_lock_irqsave(&lp->lock, flags); 216 - 217 - len = 
(*lp->write)(lp->fd, skb, lp); 218 - skb_tx_timestamp(skb); 219 - 220 - if (len == skb->len) { 221 - dev->stats.tx_packets++; 222 - dev->stats.tx_bytes += skb->len; 223 - netif_trans_update(dev); 224 - netif_start_queue(dev); 225 - 226 - /* this is normally done in the interrupt when tx finishes */ 227 - netif_wake_queue(dev); 228 - } 229 - else if (len == 0) { 230 - netif_start_queue(dev); 231 - dev->stats.tx_dropped++; 232 - } 233 - else { 234 - netif_start_queue(dev); 235 - printk(KERN_ERR "uml_net_start_xmit: failed(%d)\n", len); 236 - } 237 - 238 - spin_unlock_irqrestore(&lp->lock, flags); 239 - 240 - dev_consume_skb_any(skb); 241 - 242 - return NETDEV_TX_OK; 243 - } 244 - 245 - static void uml_net_set_multicast_list(struct net_device *dev) 246 - { 247 - return; 248 - } 249 - 250 - static void uml_net_tx_timeout(struct net_device *dev, unsigned int txqueue) 251 - { 252 - netif_trans_update(dev); 253 - netif_wake_queue(dev); 254 - } 255 - 256 - #ifdef CONFIG_NET_POLL_CONTROLLER 257 - static void uml_net_poll_controller(struct net_device *dev) 258 - { 259 - disable_irq(dev->irq); 260 - uml_net_interrupt(dev->irq, dev); 261 - enable_irq(dev->irq); 262 - } 263 - #endif 264 - 265 - static void uml_net_get_drvinfo(struct net_device *dev, 266 - struct ethtool_drvinfo *info) 267 - { 268 - strscpy(info->driver, DRIVER_NAME); 269 - } 270 - 271 - static const struct ethtool_ops uml_net_ethtool_ops = { 272 - .get_drvinfo = uml_net_get_drvinfo, 273 - .get_link = ethtool_op_get_link, 274 - .get_ts_info = ethtool_op_get_ts_info, 275 - }; 276 - 277 - void uml_net_setup_etheraddr(struct net_device *dev, char *str) 278 - { 279 - u8 addr[ETH_ALEN]; 280 - char *end; 281 - int i; 282 - 283 - if (str == NULL) 284 - goto random; 285 - 286 - for (i = 0; i < 6; i++) { 287 - addr[i] = simple_strtoul(str, &end, 16); 288 - if ((end == str) || 289 - ((*end != ':') && (*end != ',') && (*end != '\0'))) { 290 - printk(KERN_ERR 291 - "setup_etheraddr: failed to parse '%s' " 292 - "as an 
ethernet address\n", str); 293 - goto random; 294 - } 295 - str = end + 1; 296 - } 297 - if (is_multicast_ether_addr(addr)) { 298 - printk(KERN_ERR 299 - "Attempt to assign a multicast ethernet address to a " 300 - "device disallowed\n"); 301 - goto random; 302 - } 303 - if (!is_valid_ether_addr(addr)) { 304 - printk(KERN_ERR 305 - "Attempt to assign an invalid ethernet address to a " 306 - "device disallowed\n"); 307 - goto random; 308 - } 309 - if (!is_local_ether_addr(addr)) { 310 - printk(KERN_WARNING 311 - "Warning: Assigning a globally valid ethernet " 312 - "address to a device\n"); 313 - printk(KERN_WARNING "You should set the 2nd rightmost bit in " 314 - "the first byte of the MAC,\n"); 315 - printk(KERN_WARNING "i.e. %02x:%02x:%02x:%02x:%02x:%02x\n", 316 - addr[0] | 0x02, addr[1], addr[2], addr[3], addr[4], 317 - addr[5]); 318 - } 319 - eth_hw_addr_set(dev, addr); 320 - return; 321 - 322 - random: 323 - printk(KERN_INFO 324 - "Choosing a random ethernet address for device %s\n", dev->name); 325 - eth_hw_addr_random(dev); 326 - } 327 - 328 - static DEFINE_SPINLOCK(devices_lock); 329 - static LIST_HEAD(devices); 330 - 331 - static struct platform_driver uml_net_driver = { 332 - .driver = { 333 - .name = DRIVER_NAME, 334 - }, 335 - }; 336 - 337 - static void net_device_release(struct device *dev) 338 - { 339 - struct uml_net *device = container_of(dev, struct uml_net, pdev.dev); 340 - struct net_device *netdev = device->dev; 341 - struct uml_net_private *lp = netdev_priv(netdev); 342 - 343 - if (lp->remove != NULL) 344 - (*lp->remove)(&lp->user); 345 - list_del(&device->list); 346 - kfree(device); 347 - free_netdev(netdev); 348 - } 349 - 350 - static const struct net_device_ops uml_netdev_ops = { 351 - .ndo_open = uml_net_open, 352 - .ndo_stop = uml_net_close, 353 - .ndo_start_xmit = uml_net_start_xmit, 354 - .ndo_set_rx_mode = uml_net_set_multicast_list, 355 - .ndo_tx_timeout = uml_net_tx_timeout, 356 - .ndo_set_mac_address = eth_mac_addr, 357 - 
.ndo_validate_addr = eth_validate_addr, 358 - #ifdef CONFIG_NET_POLL_CONTROLLER 359 - .ndo_poll_controller = uml_net_poll_controller, 360 - #endif 361 - }; 362 - 363 - /* 364 - * Ensures that platform_driver_register is called only once by 365 - * eth_configure. Will be set in an initcall. 366 - */ 367 - static int driver_registered; 368 - 369 - static void eth_configure(int n, void *init, char *mac, 370 - struct transport *transport, gfp_t gfp_mask) 371 - { 372 - struct uml_net *device; 373 - struct net_device *dev; 374 - struct uml_net_private *lp; 375 - int err, size; 376 - 377 - size = transport->private_size + sizeof(struct uml_net_private); 378 - 379 - device = kzalloc(sizeof(*device), gfp_mask); 380 - if (device == NULL) { 381 - printk(KERN_ERR "eth_configure failed to allocate struct " 382 - "uml_net\n"); 383 - return; 384 - } 385 - 386 - dev = alloc_etherdev(size); 387 - if (dev == NULL) { 388 - printk(KERN_ERR "eth_configure: failed to allocate struct " 389 - "net_device for eth%d\n", n); 390 - goto out_free_device; 391 - } 392 - 393 - INIT_LIST_HEAD(&device->list); 394 - device->index = n; 395 - 396 - /* If this name ends up conflicting with an existing registered 397 - * netdevice, that is OK, register_netdev{,ice}() will notice this 398 - * and fail. 399 - */ 400 - snprintf(dev->name, sizeof(dev->name), "eth%d", n); 401 - 402 - uml_net_setup_etheraddr(dev, mac); 403 - 404 - printk(KERN_INFO "Netdevice %d (%pM) : ", n, dev->dev_addr); 405 - 406 - lp = netdev_priv(dev); 407 - /* This points to the transport private data. It's still clear, but we 408 - * must memset it to 0 *now*. Let's help the drivers. 
*/ 409 - memset(lp, 0, size); 410 - INIT_WORK(&lp->work, uml_dev_close); 411 - 412 - /* sysfs register */ 413 - if (!driver_registered) { 414 - platform_driver_register(&uml_net_driver); 415 - driver_registered = 1; 416 - } 417 - device->pdev.id = n; 418 - device->pdev.name = DRIVER_NAME; 419 - device->pdev.dev.release = net_device_release; 420 - dev_set_drvdata(&device->pdev.dev, device); 421 - if (platform_device_register(&device->pdev)) 422 - goto out_free_netdev; 423 - SET_NETDEV_DEV(dev,&device->pdev.dev); 424 - 425 - device->dev = dev; 426 - 427 - /* 428 - * These just fill in a data structure, so there's no failure 429 - * to be worried about. 430 - */ 431 - (*transport->kern->init)(dev, init); 432 - 433 - *lp = ((struct uml_net_private) 434 - { .list = LIST_HEAD_INIT(lp->list), 435 - .dev = dev, 436 - .fd = -1, 437 - .mac = { 0xfe, 0xfd, 0x0, 0x0, 0x0, 0x0}, 438 - .max_packet = transport->user->max_packet, 439 - .protocol = transport->kern->protocol, 440 - .open = transport->user->open, 441 - .close = transport->user->close, 442 - .remove = transport->user->remove, 443 - .read = transport->kern->read, 444 - .write = transport->kern->write, 445 - .add_address = transport->user->add_address, 446 - .delete_address = transport->user->delete_address }); 447 - 448 - spin_lock_init(&lp->lock); 449 - memcpy(lp->mac, dev->dev_addr, sizeof(lp->mac)); 450 - 451 - if ((transport->user->init != NULL) && 452 - ((*transport->user->init)(&lp->user, dev) != 0)) 453 - goto out_unregister; 454 - 455 - dev->mtu = transport->user->mtu; 456 - dev->netdev_ops = &uml_netdev_ops; 457 - dev->ethtool_ops = &uml_net_ethtool_ops; 458 - dev->watchdog_timeo = (HZ >> 1); 459 - dev->irq = UM_ETH_IRQ; 460 - 461 - err = update_drop_skb(lp->max_packet); 462 - if (err) 463 - goto out_undo_user_init; 464 - 465 - rtnl_lock(); 466 - err = register_netdevice(dev); 467 - rtnl_unlock(); 468 - if (err) 469 - goto out_undo_user_init; 470 - 471 - spin_lock(&devices_lock); 472 - list_add(&device->list, 
&devices); 473 - spin_unlock(&devices_lock); 474 - 475 - return; 476 - 477 - out_undo_user_init: 478 - if (transport->user->remove != NULL) 479 - (*transport->user->remove)(&lp->user); 480 - out_unregister: 481 - platform_device_unregister(&device->pdev); 482 - return; /* platform_device_unregister frees dev and device */ 483 - out_free_netdev: 484 - free_netdev(dev); 485 - out_free_device: 486 - kfree(device); 487 - } 488 - 489 - static struct uml_net *find_device(int n) 490 - { 491 - struct uml_net *device; 492 - struct list_head *ele; 493 - 494 - spin_lock(&devices_lock); 495 - list_for_each(ele, &devices) { 496 - device = list_entry(ele, struct uml_net, list); 497 - if (device->index == n) 498 - goto out; 499 - } 500 - device = NULL; 501 - out: 502 - spin_unlock(&devices_lock); 503 - return device; 504 - } 505 - 506 - static int eth_parse(char *str, int *index_out, char **str_out, 507 - char **error_out) 508 - { 509 - char *end; 510 - int n, err = -EINVAL; 511 - 512 - n = simple_strtoul(str, &end, 0); 513 - if (end == str) { 514 - *error_out = "Bad device number"; 515 - return err; 516 - } 517 - 518 - str = end; 519 - if (*str != '=') { 520 - *error_out = "Expected '=' after device number"; 521 - return err; 522 - } 523 - 524 - str++; 525 - if (find_device(n)) { 526 - *error_out = "Device already configured"; 527 - return err; 528 - } 529 - 530 - *index_out = n; 531 - *str_out = str; 532 - return 0; 533 - } 534 - 535 - struct eth_init { 536 - struct list_head list; 537 - char *init; 538 - int index; 539 - }; 540 - 541 - static DEFINE_SPINLOCK(transports_lock); 542 - static LIST_HEAD(transports); 543 - 544 - /* Filled in during early boot */ 545 - static LIST_HEAD(eth_cmd_line); 546 - 547 - static int check_transport(struct transport *transport, char *eth, int n, 548 - void **init_out, char **mac_out, gfp_t gfp_mask) 549 - { 550 - int len; 551 - 552 - len = strlen(transport->name); 553 - if (strncmp(eth, transport->name, len)) 554 - return 0; 555 - 556 - eth += 
len; 557 - if (*eth == ',') 558 - eth++; 559 - else if (*eth != '\0') 560 - return 0; 561 - 562 - *init_out = kmalloc(transport->setup_size, gfp_mask); 563 - if (*init_out == NULL) 564 - return 1; 565 - 566 - if (!transport->setup(eth, mac_out, *init_out)) { 567 - kfree(*init_out); 568 - *init_out = NULL; 569 - } 570 - return 1; 571 - } 572 - 573 - void register_transport(struct transport *new) 574 - { 575 - struct list_head *ele, *next; 576 - struct eth_init *eth; 577 - void *init; 578 - char *mac = NULL; 579 - int match; 580 - 581 - spin_lock(&transports_lock); 582 - BUG_ON(!list_empty(&new->list)); 583 - list_add(&new->list, &transports); 584 - spin_unlock(&transports_lock); 585 - 586 - list_for_each_safe(ele, next, &eth_cmd_line) { 587 - eth = list_entry(ele, struct eth_init, list); 588 - match = check_transport(new, eth->init, eth->index, &init, 589 - &mac, GFP_KERNEL); 590 - if (!match) 591 - continue; 592 - else if (init != NULL) { 593 - eth_configure(eth->index, init, mac, new, GFP_KERNEL); 594 - kfree(init); 595 - } 596 - list_del(&eth->list); 597 - } 598 - } 599 - 600 - static int eth_setup_common(char *str, int index) 601 - { 602 - struct list_head *ele; 603 - struct transport *transport; 604 - void *init; 605 - char *mac = NULL; 606 - int found = 0; 607 - 608 - spin_lock(&transports_lock); 609 - list_for_each(ele, &transports) { 610 - transport = list_entry(ele, struct transport, list); 611 - if (!check_transport(transport, str, index, &init, 612 - &mac, GFP_ATOMIC)) 613 - continue; 614 - if (init != NULL) { 615 - eth_configure(index, init, mac, transport, GFP_ATOMIC); 616 - kfree(init); 617 - } 618 - found = 1; 619 - break; 620 - } 621 - 622 - spin_unlock(&transports_lock); 623 - return found; 624 - } 625 - 626 - static int __init eth_setup(char *str) 627 - { 628 - struct eth_init *new; 629 - char *error; 630 - int n, err; 631 - 632 - err = eth_parse(str, &n, &str, &error); 633 - if (err) { 634 - printk(KERN_ERR "eth_setup - Couldn't parse '%s' : 
%s\n", 635 - str, error); 636 - return 1; 637 - } 638 - 639 - new = memblock_alloc_or_panic(sizeof(*new), SMP_CACHE_BYTES); 640 - 641 - INIT_LIST_HEAD(&new->list); 642 - new->index = n; 643 - new->init = str; 644 - 645 - list_add_tail(&new->list, &eth_cmd_line); 646 - return 1; 647 - } 648 - 649 - __setup("eth", eth_setup); 650 - __uml_help(eth_setup, 651 - "eth[0-9]+=<transport>,<options>\n" 652 - " Configure a network device.\n\n" 653 - ); 654 - 655 - static int net_config(char *str, char **error_out) 656 - { 657 - int n, err; 658 - 659 - err = eth_parse(str, &n, &str, error_out); 660 - if (err) 661 - return err; 662 - 663 - /* This string is broken up and the pieces used by the underlying 664 - * driver. So, it is freed only if eth_setup_common fails. 665 - */ 666 - str = kstrdup(str, GFP_KERNEL); 667 - if (str == NULL) { 668 - *error_out = "net_config failed to strdup string"; 669 - return -ENOMEM; 670 - } 671 - err = !eth_setup_common(str, n); 672 - if (err) 673 - kfree(str); 674 - return err; 675 - } 676 - 677 - static int net_id(char **str, int *start_out, int *end_out) 678 - { 679 - char *end; 680 - int n; 681 - 682 - n = simple_strtoul(*str, &end, 0); 683 - if ((*end != '\0') || (end == *str)) 684 - return -1; 685 - 686 - *start_out = n; 687 - *end_out = n; 688 - *str = end; 689 - return n; 690 - } 691 - 692 - static int net_remove(int n, char **error_out) 693 - { 694 - struct uml_net *device; 695 - struct net_device *dev; 696 - struct uml_net_private *lp; 697 - 698 - device = find_device(n); 699 - if (device == NULL) 700 - return -ENODEV; 701 - 702 - dev = device->dev; 703 - lp = netdev_priv(dev); 704 - if (lp->fd > 0) 705 - return -EBUSY; 706 - unregister_netdev(dev); 707 - platform_device_unregister(&device->pdev); 708 - 709 - return 0; 710 - } 711 - 712 - static struct mc_device net_mc = { 713 - .list = LIST_HEAD_INIT(net_mc.list), 714 - .name = "eth", 715 - .config = net_config, 716 - .get_config = NULL, 717 - .id = net_id, 718 - .remove = net_remove, 
719 - }; 720 - 721 - #ifdef CONFIG_INET 722 - static int uml_inetaddr_event(struct notifier_block *this, unsigned long event, 723 - void *ptr) 724 - { 725 - struct in_ifaddr *ifa = ptr; 726 - struct net_device *dev = ifa->ifa_dev->dev; 727 - struct uml_net_private *lp; 728 - void (*proc)(unsigned char *, unsigned char *, void *); 729 - unsigned char addr_buf[4], netmask_buf[4]; 730 - 731 - if (dev->netdev_ops->ndo_open != uml_net_open) 732 - return NOTIFY_DONE; 733 - 734 - lp = netdev_priv(dev); 735 - 736 - proc = NULL; 737 - switch (event) { 738 - case NETDEV_UP: 739 - proc = lp->add_address; 740 - break; 741 - case NETDEV_DOWN: 742 - proc = lp->delete_address; 743 - break; 744 - } 745 - if (proc != NULL) { 746 - memcpy(addr_buf, &ifa->ifa_address, sizeof(addr_buf)); 747 - memcpy(netmask_buf, &ifa->ifa_mask, sizeof(netmask_buf)); 748 - (*proc)(addr_buf, netmask_buf, &lp->user); 749 - } 750 - return NOTIFY_DONE; 751 - } 752 - 753 - /* uml_net_init shouldn't be called twice on two CPUs at the same time */ 754 - static struct notifier_block uml_inetaddr_notifier = { 755 - .notifier_call = uml_inetaddr_event, 756 - }; 757 - 758 - static void inet_register(void) 759 - { 760 - struct list_head *ele; 761 - struct uml_net_private *lp; 762 - struct in_device *ip; 763 - struct in_ifaddr *in; 764 - 765 - register_inetaddr_notifier(&uml_inetaddr_notifier); 766 - 767 - /* Devices may have been opened already, so the uml_inetaddr_notifier 768 - * didn't get a chance to run for them. This fakes it so that 769 - * addresses which have already been set up get handled properly. 
770 - */ 771 - spin_lock(&opened_lock); 772 - list_for_each(ele, &opened) { 773 - lp = list_entry(ele, struct uml_net_private, list); 774 - ip = lp->dev->ip_ptr; 775 - if (ip == NULL) 776 - continue; 777 - in = ip->ifa_list; 778 - while (in != NULL) { 779 - uml_inetaddr_event(NULL, NETDEV_UP, in); 780 - in = in->ifa_next; 781 - } 782 - } 783 - spin_unlock(&opened_lock); 784 - } 785 - #else 786 - static inline void inet_register(void) 787 - { 788 - } 789 - #endif 790 - 791 - static int uml_net_init(void) 792 - { 793 - mconsole_register_dev(&net_mc); 794 - inet_register(); 795 - return 0; 796 - } 797 - 798 - __initcall(uml_net_init); 799 - 800 - static void close_devices(void) 801 - { 802 - struct list_head *ele; 803 - struct uml_net_private *lp; 804 - 805 - spin_lock(&opened_lock); 806 - list_for_each(ele, &opened) { 807 - lp = list_entry(ele, struct uml_net_private, list); 808 - um_free_irq(lp->dev->irq, lp->dev); 809 - if ((lp->close != NULL) && (lp->fd >= 0)) 810 - (*lp->close)(lp->fd, &lp->user); 811 - if (lp->remove != NULL) 812 - (*lp->remove)(&lp->user); 813 - } 814 - spin_unlock(&opened_lock); 815 - } 816 - 817 - __uml_exitcall(close_devices); 818 - 819 - void iter_addresses(void *d, void (*cb)(unsigned char *, unsigned char *, 820 - void *), 821 - void *arg) 822 - { 823 - struct net_device *dev = d; 824 - struct in_device *ip = dev->ip_ptr; 825 - struct in_ifaddr *in; 826 - unsigned char address[4], netmask[4]; 827 - 828 - if (ip == NULL) return; 829 - in = ip->ifa_list; 830 - while (in != NULL) { 831 - memcpy(address, &in->ifa_address, sizeof(address)); 832 - memcpy(netmask, &in->ifa_mask, sizeof(netmask)); 833 - (*cb)(address, netmask, arg); 834 - in = in->ifa_next; 835 - } 836 - } 837 - 838 - int dev_netmask(void *d, void *m) 839 - { 840 - struct net_device *dev = d; 841 - struct in_device *ip = dev->ip_ptr; 842 - struct in_ifaddr *in; 843 - __be32 *mask_out = m; 844 - 845 - if (ip == NULL) 846 - return 1; 847 - 848 - in = ip->ifa_list; 849 - if (in == 
NULL) 850 - return 1; 851 - 852 - *mask_out = in->ifa_mask; 853 - return 0; 854 - } 855 - 856 - void *get_output_buffer(int *len_out) 857 - { 858 - void *ret; 859 - 860 - ret = (void *) __get_free_pages(GFP_KERNEL, 0); 861 - if (ret) *len_out = PAGE_SIZE; 862 - else *len_out = 0; 863 - return ret; 864 - } 865 - 866 - void free_output_buffer(void *buffer) 867 - { 868 - free_pages((unsigned long) buffer, 0); 869 - } 870 - 871 - int tap_setup_common(char *str, char *type, char **dev_name, char **mac_out, 872 - char **gate_addr) 873 - { 874 - char *remain; 875 - 876 - remain = split_if_spec(str, dev_name, mac_out, gate_addr, NULL); 877 - if (remain != NULL) { 878 - printk(KERN_ERR "tap_setup_common - Extra garbage on " 879 - "specification : '%s'\n", remain); 880 - return 1; 881 - } 882 - 883 - return 0; 884 - } 885 - 886 - unsigned short eth_protocol(struct sk_buff *skb) 887 - { 888 - return eth_type_trans(skb, skb->dev); 889 - }
-271
arch/um/drivers/net_user.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #include <stdio.h> 7 - #include <unistd.h> 8 - #include <stdarg.h> 9 - #include <errno.h> 10 - #include <stddef.h> 11 - #include <string.h> 12 - #include <sys/socket.h> 13 - #include <sys/wait.h> 14 - #include <net_user.h> 15 - #include <os.h> 16 - #include <um_malloc.h> 17 - 18 - int tap_open_common(void *dev, char *gate_addr) 19 - { 20 - int tap_addr[4]; 21 - 22 - if (gate_addr == NULL) 23 - return 0; 24 - if (sscanf(gate_addr, "%d.%d.%d.%d", &tap_addr[0], 25 - &tap_addr[1], &tap_addr[2], &tap_addr[3]) != 4) { 26 - printk(UM_KERN_ERR "Invalid tap IP address - '%s'\n", 27 - gate_addr); 28 - return -EINVAL; 29 - } 30 - return 0; 31 - } 32 - 33 - void tap_check_ips(char *gate_addr, unsigned char *eth_addr) 34 - { 35 - int tap_addr[4]; 36 - 37 - if ((gate_addr != NULL) && 38 - (sscanf(gate_addr, "%d.%d.%d.%d", &tap_addr[0], 39 - &tap_addr[1], &tap_addr[2], &tap_addr[3]) == 4) && 40 - (eth_addr[0] == tap_addr[0]) && 41 - (eth_addr[1] == tap_addr[1]) && 42 - (eth_addr[2] == tap_addr[2]) && 43 - (eth_addr[3] == tap_addr[3])) { 44 - printk(UM_KERN_ERR "The tap IP address and the UML eth IP " 45 - "address must be different\n"); 46 - } 47 - } 48 - 49 - /* Do reliable error handling as this fails frequently enough. */ 50 - void read_output(int fd, char *output, int len) 51 - { 52 - int remain, ret, expected; 53 - char c; 54 - char *str; 55 - 56 - if (output == NULL) { 57 - output = &c; 58 - len = sizeof(c); 59 - } 60 - 61 - *output = '\0'; 62 - ret = read(fd, &remain, sizeof(remain)); 63 - 64 - if (ret != sizeof(remain)) { 65 - if (ret < 0) 66 - ret = -errno; 67 - expected = sizeof(remain); 68 - str = "length"; 69 - goto err; 70 - } 71 - 72 - while (remain != 0) { 73 - expected = (remain < len) ? 
remain : len; 74 - ret = read(fd, output, expected); 75 - if (ret != expected) { 76 - if (ret < 0) 77 - ret = -errno; 78 - str = "data"; 79 - goto err; 80 - } 81 - remain -= ret; 82 - } 83 - 84 - return; 85 - 86 - err: 87 - if (ret < 0) 88 - printk(UM_KERN_ERR "read_output - read of %s failed, " 89 - "errno = %d\n", str, -ret); 90 - else 91 - printk(UM_KERN_ERR "read_output - read of %s failed, read only " 92 - "%d of %d bytes\n", str, ret, expected); 93 - } 94 - 95 - int net_read(int fd, void *buf, int len) 96 - { 97 - int n; 98 - 99 - n = read(fd, buf, len); 100 - 101 - if ((n < 0) && (errno == EAGAIN)) 102 - return 0; 103 - else if (n == 0) 104 - return -ENOTCONN; 105 - return n; 106 - } 107 - 108 - int net_recvfrom(int fd, void *buf, int len) 109 - { 110 - int n; 111 - 112 - CATCH_EINTR(n = recvfrom(fd, buf, len, 0, NULL, NULL)); 113 - if (n < 0) { 114 - if (errno == EAGAIN) 115 - return 0; 116 - return -errno; 117 - } 118 - else if (n == 0) 119 - return -ENOTCONN; 120 - return n; 121 - } 122 - 123 - int net_write(int fd, void *buf, int len) 124 - { 125 - int n; 126 - 127 - n = write(fd, buf, len); 128 - 129 - if ((n < 0) && (errno == EAGAIN)) 130 - return 0; 131 - else if (n == 0) 132 - return -ENOTCONN; 133 - return n; 134 - } 135 - 136 - int net_send(int fd, void *buf, int len) 137 - { 138 - int n; 139 - 140 - CATCH_EINTR(n = send(fd, buf, len, 0)); 141 - if (n < 0) { 142 - if (errno == EAGAIN) 143 - return 0; 144 - return -errno; 145 - } 146 - else if (n == 0) 147 - return -ENOTCONN; 148 - return n; 149 - } 150 - 151 - int net_sendto(int fd, void *buf, int len, void *to, int sock_len) 152 - { 153 - int n; 154 - 155 - CATCH_EINTR(n = sendto(fd, buf, len, 0, (struct sockaddr *) to, 156 - sock_len)); 157 - if (n < 0) { 158 - if (errno == EAGAIN) 159 - return 0; 160 - return -errno; 161 - } 162 - else if (n == 0) 163 - return -ENOTCONN; 164 - return n; 165 - } 166 - 167 - struct change_pre_exec_data { 168 - int close_me; 169 - int stdout_fd; 170 - }; 171 - 172 
- static void change_pre_exec(void *arg) 173 - { 174 - struct change_pre_exec_data *data = arg; 175 - 176 - close(data->close_me); 177 - dup2(data->stdout_fd, 1); 178 - } 179 - 180 - static int change_tramp(char **argv, char *output, int output_len) 181 - { 182 - int pid, fds[2], err; 183 - struct change_pre_exec_data pe_data; 184 - 185 - err = os_pipe(fds, 1, 0); 186 - if (err < 0) { 187 - printk(UM_KERN_ERR "change_tramp - pipe failed, err = %d\n", 188 - -err); 189 - return err; 190 - } 191 - pe_data.close_me = fds[0]; 192 - pe_data.stdout_fd = fds[1]; 193 - pid = run_helper(change_pre_exec, &pe_data, argv); 194 - 195 - if (pid > 0) /* Avoid hang as we won't get data in failure case. */ 196 - read_output(fds[0], output, output_len); 197 - 198 - close(fds[0]); 199 - close(fds[1]); 200 - 201 - if (pid > 0) 202 - helper_wait(pid); 203 - return pid; 204 - } 205 - 206 - static void change(char *dev, char *what, unsigned char *addr, 207 - unsigned char *netmask) 208 - { 209 - char addr_buf[sizeof("255.255.255.255\0")]; 210 - char netmask_buf[sizeof("255.255.255.255\0")]; 211 - char version[sizeof("nnnnn\0")]; 212 - char *argv[] = { "uml_net", version, what, dev, addr_buf, 213 - netmask_buf, NULL }; 214 - char *output; 215 - int output_len, pid; 216 - 217 - sprintf(version, "%d", UML_NET_VERSION); 218 - sprintf(addr_buf, "%d.%d.%d.%d", addr[0], addr[1], addr[2], addr[3]); 219 - sprintf(netmask_buf, "%d.%d.%d.%d", netmask[0], netmask[1], 220 - netmask[2], netmask[3]); 221 - 222 - output_len = UM_KERN_PAGE_SIZE; 223 - output = uml_kmalloc(output_len, UM_GFP_KERNEL); 224 - if (output == NULL) 225 - printk(UM_KERN_ERR "change : failed to allocate output " 226 - "buffer\n"); 227 - 228 - pid = change_tramp(argv, output, output_len); 229 - if (pid < 0) { 230 - kfree(output); 231 - return; 232 - } 233 - 234 - if (output != NULL) { 235 - printk("%s", output); 236 - kfree(output); 237 - } 238 - } 239 - 240 - void open_addr(unsigned char *addr, unsigned char *netmask, void *arg) 
241 - { 242 - change(arg, "add", addr, netmask); 243 - } 244 - 245 - void close_addr(unsigned char *addr, unsigned char *netmask, void *arg) 246 - { 247 - change(arg, "del", addr, netmask); 248 - } 249 - 250 - char *split_if_spec(char *str, ...) 251 - { 252 - char **arg, *end, *ret = NULL; 253 - va_list ap; 254 - 255 - va_start(ap, str); 256 - while ((arg = va_arg(ap, char **)) != NULL) { 257 - if (*str == '\0') 258 - goto out; 259 - end = strchr(str, ','); 260 - if (end != str) 261 - *arg = str; 262 - if (end == NULL) 263 - goto out; 264 - *end++ = '\0'; 265 - str = end; 266 - } 267 - ret = str; 268 - out: 269 - va_end(ap); 270 - return ret; 271 - }
-21
arch/um/drivers/slip.h
···
1 - /* SPDX-License-Identifier: GPL-2.0 */
2 - #ifndef __UM_SLIP_H
3 - #define __UM_SLIP_H
4 -
5 - #include "slip_common.h"
6 -
7 - struct slip_data {
8 - 	void *dev;
9 - 	char name[sizeof("slnnnnn\0")];
10 - 	char *addr;
11 - 	char *gate_addr;
12 - 	int slave;
13 - 	struct slip_proto slip;
14 - };
15 -
16 - extern const struct net_user_info slip_user_info;
17 -
18 - extern int slip_user_read(int fd, void *buf, int len, struct slip_data *pri);
19 - extern int slip_user_write(int fd, void *buf, int len, struct slip_data *pri);
20 -
21 - #endif
-55
arch/um/drivers/slip_common.c
···
1 - // SPDX-License-Identifier: GPL-2.0
2 - #include <string.h>
3 - #include "slip_common.h"
4 - #include <net_user.h>
5 -
6 - int slip_proto_read(int fd, void *buf, int len, struct slip_proto *slip)
7 - {
8 - 	int i, n, size, start;
9 -
10 - 	if(slip->more > 0){
11 - 		i = 0;
12 - 		while(i < slip->more){
13 - 			size = slip_unesc(slip->ibuf[i++], slip->ibuf,
14 - 					  &slip->pos, &slip->esc);
15 - 			if(size){
16 - 				memcpy(buf, slip->ibuf, size);
17 - 				memmove(slip->ibuf, &slip->ibuf[i],
18 - 					slip->more - i);
19 - 				slip->more = slip->more - i;
20 - 				return size;
21 - 			}
22 - 		}
23 - 		slip->more = 0;
24 - 	}
25 -
26 - 	n = net_read(fd, &slip->ibuf[slip->pos],
27 - 		     sizeof(slip->ibuf) - slip->pos);
28 - 	if(n <= 0)
29 - 		return n;
30 -
31 - 	start = slip->pos;
32 - 	for(i = 0; i < n; i++){
33 - 		size = slip_unesc(slip->ibuf[start + i], slip->ibuf,&slip->pos,
34 - 				  &slip->esc);
35 - 		if(size){
36 - 			memcpy(buf, slip->ibuf, size);
37 - 			memmove(slip->ibuf, &slip->ibuf[start+i+1],
38 - 				n - (i + 1));
39 - 			slip->more = n - (i + 1);
40 - 			return size;
41 - 		}
42 - 	}
43 - 	return 0;
44 - }
45 -
46 - int slip_proto_write(int fd, void *buf, int len, struct slip_proto *slip)
47 - {
48 - 	int actual, n;
49 -
50 - 	actual = slip_esc(buf, slip->obuf, len);
51 - 	n = net_write(fd, slip->obuf, actual);
52 - 	if(n < 0)
53 - 		return n;
54 - 	else return len;
55 - }
-106
arch/um/drivers/slip_common.h
···
1 - /* SPDX-License-Identifier: GPL-2.0 */
2 - #ifndef __UM_SLIP_COMMON_H
3 - #define __UM_SLIP_COMMON_H
4 -
5 - #define BUF_SIZE 1500
6 - /* two bytes each for a (pathological) max packet of escaped chars + *
7 -  * terminating END char + initial END char */
8 - #define ENC_BUF_SIZE (2 * BUF_SIZE + 2)
9 -
10 - /* SLIP protocol characters. */
11 - #define SLIP_END 0300 /* indicates end of frame */
12 - #define SLIP_ESC 0333 /* indicates byte stuffing */
13 - #define SLIP_ESC_END 0334 /* ESC ESC_END means END 'data' */
14 - #define SLIP_ESC_ESC 0335 /* ESC ESC_ESC means ESC 'data' */
15 -
16 - static inline int slip_unesc(unsigned char c, unsigned char *buf, int *pos,
17 - 			     int *esc)
18 - {
19 - 	int ret;
20 -
21 - 	switch(c){
22 - 	case SLIP_END:
23 - 		*esc = 0;
24 - 		ret=*pos;
25 - 		*pos=0;
26 - 		return(ret);
27 - 	case SLIP_ESC:
28 - 		*esc = 1;
29 - 		return(0);
30 - 	case SLIP_ESC_ESC:
31 - 		if(*esc){
32 - 			*esc = 0;
33 - 			c = SLIP_ESC;
34 - 		}
35 - 		break;
36 - 	case SLIP_ESC_END:
37 - 		if(*esc){
38 - 			*esc = 0;
39 - 			c = SLIP_END;
40 - 		}
41 - 		break;
42 - 	}
43 - 	buf[(*pos)++] = c;
44 - 	return(0);
45 - }
46 -
47 - static inline int slip_esc(unsigned char *s, unsigned char *d, int len)
48 - {
49 - 	unsigned char *ptr = d;
50 - 	unsigned char c;
51 -
52 - 	/*
53 - 	 * Send an initial END character to flush out any
54 - 	 * data that may have accumulated in the receiver
55 - 	 * due to line noise.
56 - 	 */
57 -
58 - 	*ptr++ = SLIP_END;
59 -
60 - 	/*
61 - 	 * For each byte in the packet, send the appropriate
62 - 	 * character sequence, according to the SLIP protocol.
63 - 	 */
64 -
65 - 	while (len-- > 0) {
66 - 		switch(c = *s++) {
67 - 		case SLIP_END:
68 - 			*ptr++ = SLIP_ESC;
69 - 			*ptr++ = SLIP_ESC_END;
70 - 			break;
71 - 		case SLIP_ESC:
72 - 			*ptr++ = SLIP_ESC;
73 - 			*ptr++ = SLIP_ESC_ESC;
74 - 			break;
75 - 		default:
76 - 			*ptr++ = c;
77 - 			break;
78 - 		}
79 - 	}
80 - 	*ptr++ = SLIP_END;
81 - 	return (ptr - d);
82 - }
83 -
84 - struct slip_proto {
85 - 	unsigned char ibuf[ENC_BUF_SIZE];
86 - 	unsigned char obuf[ENC_BUF_SIZE];
87 - 	int more; /* more data: do not read fd until ibuf has been drained */
88 - 	int pos;
89 - 	int esc;
90 - };
91 -
92 - static inline void slip_proto_init(struct slip_proto * slip)
93 - {
94 - 	memset(slip->ibuf, 0, sizeof(slip->ibuf));
95 - 	memset(slip->obuf, 0, sizeof(slip->obuf));
96 - 	slip->more = 0;
97 - 	slip->pos = 0;
98 - 	slip->esc = 0;
99 - }
100 -
101 - extern int slip_proto_read(int fd, void *buf, int len,
102 - 			   struct slip_proto *slip);
103 - extern int slip_proto_write(int fd, void *buf, int len,
104 - 			    struct slip_proto *slip);
105 -
106 - #endif
-93
arch/um/drivers/slip_kern.c
···
1 - // SPDX-License-Identifier: GPL-2.0
2 - /*
3 -  * Copyright (C) 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
4 -  */
5 -
6 - #include <linux/if_arp.h>
7 - #include <linux/init.h>
8 - #include <linux/netdevice.h>
9 - #include <net_kern.h>
10 - #include "slip.h"
11 -
12 - struct slip_init {
13 - 	char *gate_addr;
14 - };
15 -
16 - static void slip_init(struct net_device *dev, void *data)
17 - {
18 - 	struct uml_net_private *private;
19 - 	struct slip_data *spri;
20 - 	struct slip_init *init = data;
21 -
22 - 	private = netdev_priv(dev);
23 - 	spri = (struct slip_data *) private->user;
24 -
25 - 	memset(spri->name, 0, sizeof(spri->name));
26 - 	spri->addr = NULL;
27 - 	spri->gate_addr = init->gate_addr;
28 - 	spri->slave = -1;
29 - 	spri->dev = dev;
30 -
31 - 	slip_proto_init(&spri->slip);
32 -
33 - 	dev->hard_header_len = 0;
34 - 	dev->header_ops = NULL;
35 - 	dev->addr_len = 0;
36 - 	dev->type = ARPHRD_SLIP;
37 - 	dev->tx_queue_len = 256;
38 - 	dev->flags = IFF_NOARP;
39 - 	printk("SLIP backend - SLIP IP = %s\n", spri->gate_addr);
40 - }
41 -
42 - static unsigned short slip_protocol(struct sk_buff *skbuff)
43 - {
44 - 	return htons(ETH_P_IP);
45 - }
46 -
47 - static int slip_read(int fd, struct sk_buff *skb, struct uml_net_private *lp)
48 - {
49 - 	return slip_user_read(fd, skb_mac_header(skb), skb->dev->mtu,
50 - 			      (struct slip_data *) &lp->user);
51 - }
52 -
53 - static int slip_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
54 - {
55 - 	return slip_user_write(fd, skb->data, skb->len,
56 - 			       (struct slip_data *) &lp->user);
57 - }
58 -
59 - static const struct net_kern_info slip_kern_info = {
60 - 	.init = slip_init,
61 - 	.protocol = slip_protocol,
62 - 	.read = slip_read,
63 - 	.write = slip_write,
64 - };
65 -
66 - static int slip_setup(char *str, char **mac_out, void *data)
67 - {
68 - 	struct slip_init *init = data;
69 -
70 - 	*init = ((struct slip_init) { .gate_addr = NULL });
71 -
72 - 	if (str[0] != '\0')
73 - 		init->gate_addr = str;
74 - 	return 1;
75 - }
76 -
77 - static struct transport slip_transport = {
78 - 	.list = LIST_HEAD_INIT(slip_transport.list),
79 - 	.name = "slip",
80 - 	.setup = slip_setup,
81 - 	.user = &slip_user_info,
82 - 	.kern = &slip_kern_info,
83 - 	.private_size = sizeof(struct slip_data),
84 - 	.setup_size = sizeof(struct slip_init),
85 - };
86 -
87 - static int register_slip(void)
88 - {
89 - 	register_transport(&slip_transport);
90 - 	return 0;
91 - }
92 -
93 - late_initcall(register_slip);
-252
arch/um/drivers/slip_user.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <errno.h>
-#include <fcntl.h>
-#include <string.h>
-#include <termios.h>
-#include <sys/wait.h>
-#include <net_user.h>
-#include <os.h>
-#include "slip.h"
-#include <um_malloc.h>
-
-static int slip_user_init(void *data, void *dev)
-{
-        struct slip_data *pri = data;
-
-        pri->dev = dev;
-        return 0;
-}
-
-static int set_up_tty(int fd)
-{
-        int i;
-        struct termios tios;
-
-        if (tcgetattr(fd, &tios) < 0) {
-                printk(UM_KERN_ERR "could not get initial terminal "
-                       "attributes\n");
-                return -1;
-        }
-
-        tios.c_cflag = CS8 | CREAD | HUPCL | CLOCAL;
-        tios.c_iflag = IGNBRK | IGNPAR;
-        tios.c_oflag = 0;
-        tios.c_lflag = 0;
-        for (i = 0; i < NCCS; i++)
-                tios.c_cc[i] = 0;
-        tios.c_cc[VMIN] = 1;
-        tios.c_cc[VTIME] = 0;
-
-        cfsetospeed(&tios, B38400);
-        cfsetispeed(&tios, B38400);
-
-        if (tcsetattr(fd, TCSAFLUSH, &tios) < 0) {
-                printk(UM_KERN_ERR "failed to set terminal attributes\n");
-                return -1;
-        }
-        return 0;
-}
-
-struct slip_pre_exec_data {
-        int stdin_fd;
-        int stdout_fd;
-        int close_me;
-};
-
-static void slip_pre_exec(void *arg)
-{
-        struct slip_pre_exec_data *data = arg;
-
-        if (data->stdin_fd >= 0)
-                dup2(data->stdin_fd, 0);
-        dup2(data->stdout_fd, 1);
-        if (data->close_me >= 0)
-                close(data->close_me);
-}
-
-static int slip_tramp(char **argv, int fd)
-{
-        struct slip_pre_exec_data pe_data;
-        char *output;
-        int pid, fds[2], err, output_len;
-
-        err = os_pipe(fds, 1, 0);
-        if (err < 0) {
-                printk(UM_KERN_ERR "slip_tramp : pipe failed, err = %d\n",
-                       -err);
-                goto out;
-        }
-
-        err = 0;
-        pe_data.stdin_fd = fd;
-        pe_data.stdout_fd = fds[1];
-        pe_data.close_me = fds[0];
-        err = run_helper(slip_pre_exec, &pe_data, argv);
-        if (err < 0)
-                goto out_close;
-        pid = err;
-
-        output_len = UM_KERN_PAGE_SIZE;
-        output = uml_kmalloc(output_len, UM_GFP_KERNEL);
-        if (output == NULL) {
-                printk(UM_KERN_ERR "slip_tramp : failed to allocate output "
-                       "buffer\n");
-                os_kill_process(pid, 1);
-                err = -ENOMEM;
-                goto out_close;
-        }
-
-        close(fds[1]);
-        read_output(fds[0], output, output_len);
-        printk("%s", output);
-
-        err = helper_wait(pid);
-        close(fds[0]);
-
-        kfree(output);
-        return err;
-
-out_close:
-        close(fds[0]);
-        close(fds[1]);
-out:
-        return err;
-}
-
-static int slip_open(void *data)
-{
-        struct slip_data *pri = data;
-        char version_buf[sizeof("nnnnn\0")];
-        char gate_buf[sizeof("nnn.nnn.nnn.nnn\0")];
-        char *argv[] = { "uml_net", version_buf, "slip", "up", gate_buf,
-                         NULL };
-        int sfd, mfd, err;
-
-        err = get_pty();
-        if (err < 0) {
-                printk(UM_KERN_ERR "slip-open : Failed to open pty, err = %d\n",
-                       -err);
-                goto out;
-        }
-        mfd = err;
-
-        err = open(ptsname(mfd), O_RDWR, 0);
-        if (err < 0) {
-                printk(UM_KERN_ERR "Couldn't open tty for slip line, "
-                       "err = %d\n", -err);
-                goto out_close;
-        }
-        sfd = err;
-
-        err = set_up_tty(sfd);
-        if (err)
-                goto out_close2;
-
-        pri->slave = sfd;
-        pri->slip.pos = 0;
-        pri->slip.esc = 0;
-        if (pri->gate_addr != NULL) {
-                sprintf(version_buf, "%d", UML_NET_VERSION);
-                strcpy(gate_buf, pri->gate_addr);
-
-                err = slip_tramp(argv, sfd);
-
-                if (err < 0) {
-                        printk(UM_KERN_ERR "slip_tramp failed - err = %d\n",
-                               -err);
-                        goto out_close2;
-                }
-                err = os_get_ifname(pri->slave, pri->name);
-                if (err < 0) {
-                        printk(UM_KERN_ERR "get_ifname failed, err = %d\n",
-                               -err);
-                        goto out_close2;
-                }
-                iter_addresses(pri->dev, open_addr, pri->name);
-        }
-        else {
-                err = os_set_slip(sfd);
-                if (err < 0) {
-                        printk(UM_KERN_ERR "Failed to set slip discipline "
-                               "encapsulation - err = %d\n", -err);
-                        goto out_close2;
-                }
-        }
-        return mfd;
-out_close2:
-        close(sfd);
-out_close:
-        close(mfd);
-out:
-        return err;
-}
-
-static void slip_close(int fd, void *data)
-{
-        struct slip_data *pri = data;
-        char version_buf[sizeof("nnnnn\0")];
-        char *argv[] = { "uml_net", version_buf, "slip", "down", pri->name,
-                         NULL };
-        int err;
-
-        if (pri->gate_addr != NULL)
-                iter_addresses(pri->dev, close_addr, pri->name);
-
-        sprintf(version_buf, "%d", UML_NET_VERSION);
-
-        err = slip_tramp(argv, pri->slave);
-
-        if (err != 0)
-                printk(UM_KERN_ERR "slip_tramp failed - errno = %d\n", -err);
-        close(fd);
-        close(pri->slave);
-        pri->slave = -1;
-}
-
-int slip_user_read(int fd, void *buf, int len, struct slip_data *pri)
-{
-        return slip_proto_read(fd, buf, len, &pri->slip);
-}
-
-int slip_user_write(int fd, void *buf, int len, struct slip_data *pri)
-{
-        return slip_proto_write(fd, buf, len, &pri->slip);
-}
-
-static void slip_add_addr(unsigned char *addr, unsigned char *netmask,
-                          void *data)
-{
-        struct slip_data *pri = data;
-
-        if (pri->slave < 0)
-                return;
-        open_addr(addr, netmask, pri->name);
-}
-
-static void slip_del_addr(unsigned char *addr, unsigned char *netmask,
-                          void *data)
-{
-        struct slip_data *pri = data;
-
-        if (pri->slave < 0)
-                return;
-        close_addr(addr, netmask, pri->name);
-}
-
-const struct net_user_info slip_user_info = {
-        .init = slip_user_init,
-        .open = slip_open,
-        .close = slip_close,
-        .remove = NULL,
-        .add_address = slip_add_addr,
-        .delete_address = slip_del_addr,
-        .mtu = BUF_SIZE,
-        .max_packet = BUF_SIZE,
-};
-34
arch/um/drivers/slirp.h
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __UM_SLIRP_H
-#define __UM_SLIRP_H
-
-#include "slip_common.h"
-
-#define SLIRP_MAX_ARGS 100
-/*
- * XXX this next definition is here because I don't understand why this
- * initializer doesn't work in slirp_kern.c:
- *
- *   argv : { init->argv[ 0 ... SLIRP_MAX_ARGS-1 ] },
- *
- * or why I can't typecast like this:
- *
- *   argv : (char* [SLIRP_MAX_ARGS])(init->argv),
- */
-struct arg_list_dummy_wrapper { char *argv[SLIRP_MAX_ARGS]; };
-
-struct slirp_data {
-        void *dev;
-        struct arg_list_dummy_wrapper argw;
-        int pid;
-        int slave;
-        struct slip_proto slip;
-};
-
-extern const struct net_user_info slirp_user_info;
-
-extern int slirp_user_read(int fd, void *buf, int len, struct slirp_data *pri);
-extern int slirp_user_write(int fd, void *buf, int len,
-                            struct slirp_data *pri);
-
-#endif
-120
arch/um/drivers/slirp_kern.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- */
-
-#include <linux/if_arp.h>
-#include <linux/init.h>
-#include <linux/netdevice.h>
-#include <linux/string.h>
-#include <net_kern.h>
-#include <net_user.h>
-#include "slirp.h"
-
-struct slirp_init {
-        struct arg_list_dummy_wrapper argw;  /* XXX should be simpler... */
-};
-
-static void slirp_init(struct net_device *dev, void *data)
-{
-        struct uml_net_private *private;
-        struct slirp_data *spri;
-        struct slirp_init *init = data;
-        int i;
-
-        private = netdev_priv(dev);
-        spri = (struct slirp_data *) private->user;
-
-        spri->argw = init->argw;
-        spri->pid = -1;
-        spri->slave = -1;
-        spri->dev = dev;
-
-        slip_proto_init(&spri->slip);
-
-        dev->hard_header_len = 0;
-        dev->header_ops = NULL;
-        dev->addr_len = 0;
-        dev->type = ARPHRD_SLIP;
-        dev->tx_queue_len = 256;
-        dev->flags = IFF_NOARP;
-        printk("SLIRP backend - command line:");
-        for (i = 0; spri->argw.argv[i] != NULL; i++)
-                printk(" '%s'",spri->argw.argv[i]);
-        printk("\n");
-}
-
-static unsigned short slirp_protocol(struct sk_buff *skbuff)
-{
-        return htons(ETH_P_IP);
-}
-
-static int slirp_read(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        return slirp_user_read(fd, skb_mac_header(skb), skb->dev->mtu,
-                               (struct slirp_data *) &lp->user);
-}
-
-static int slirp_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        return slirp_user_write(fd, skb->data, skb->len,
-                                (struct slirp_data *) &lp->user);
-}
-
-const struct net_kern_info slirp_kern_info = {
-        .init = slirp_init,
-        .protocol = slirp_protocol,
-        .read = slirp_read,
-        .write = slirp_write,
-};
-
-static int slirp_setup(char *str, char **mac_out, void *data)
-{
-        struct slirp_init *init = data;
-        int i=0;
-
-        *init = ((struct slirp_init) { .argw = { { "slirp", NULL } } });
-
-        str = split_if_spec(str, mac_out, NULL);
-
-        if (str == NULL) /* no command line given after MAC addr */
-                return 1;
-
-        do {
-                if (i >= SLIRP_MAX_ARGS - 1) {
-                        printk(KERN_WARNING "slirp_setup: truncating slirp "
-                               "arguments\n");
-                        break;
-                }
-                init->argw.argv[i++] = str;
-                while(*str && *str!=',') {
-                        if (*str == '_')
-                                *str=' ';
-                        str++;
-                }
-                if (*str != ',')
-                        break;
-                *str++ = '\0';
-        } while (1);
-
-        init->argw.argv[i] = NULL;
-        return 1;
-}
-
-static struct transport slirp_transport = {
-        .list = LIST_HEAD_INIT(slirp_transport.list),
-        .name = "slirp",
-        .setup = slirp_setup,
-        .user = &slirp_user_info,
-        .kern = &slirp_kern_info,
-        .private_size = sizeof(struct slirp_data),
-        .setup_size = sizeof(struct slirp_init),
-};
-
-static int register_slirp(void)
-{
-        register_transport(&slirp_transport);
-        return 0;
-}
-
-late_initcall(register_slirp);
-124
arch/um/drivers/slirp_user.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- */
-
-#include <unistd.h>
-#include <errno.h>
-#include <string.h>
-#include <sys/wait.h>
-#include <net_user.h>
-#include <os.h>
-#include "slirp.h"
-
-static int slirp_user_init(void *data, void *dev)
-{
-        struct slirp_data *pri = data;
-
-        pri->dev = dev;
-        return 0;
-}
-
-struct slirp_pre_exec_data {
-        int stdin_fd;
-        int stdout_fd;
-};
-
-static void slirp_pre_exec(void *arg)
-{
-        struct slirp_pre_exec_data *data = arg;
-
-        if (data->stdin_fd != -1)
-                dup2(data->stdin_fd, 0);
-        if (data->stdout_fd != -1)
-                dup2(data->stdout_fd, 1);
-}
-
-static int slirp_tramp(char **argv, int fd)
-{
-        struct slirp_pre_exec_data pe_data;
-        int pid;
-
-        pe_data.stdin_fd = fd;
-        pe_data.stdout_fd = fd;
-        pid = run_helper(slirp_pre_exec, &pe_data, argv);
-
-        return pid;
-}
-
-static int slirp_open(void *data)
-{
-        struct slirp_data *pri = data;
-        int fds[2], err;
-
-        err = os_pipe(fds, 1, 1);
-        if (err)
-                return err;
-
-        err = slirp_tramp(pri->argw.argv, fds[1]);
-        if (err < 0) {
-                printk(UM_KERN_ERR "slirp_tramp failed - errno = %d\n", -err);
-                goto out;
-        }
-
-        pri->slave = fds[1];
-        pri->slip.pos = 0;
-        pri->slip.esc = 0;
-        pri->pid = err;
-
-        return fds[0];
-out:
-        close(fds[0]);
-        close(fds[1]);
-        return err;
-}
-
-static void slirp_close(int fd, void *data)
-{
-        struct slirp_data *pri = data;
-        int err;
-
-        close(fd);
-        close(pri->slave);
-
-        pri->slave = -1;
-
-        if (pri->pid<1) {
-                printk(UM_KERN_ERR "slirp_close: no child process to shut "
-                       "down\n");
-                return;
-        }
-
-#if 0
-        if (kill(pri->pid, SIGHUP)<0) {
-                printk(UM_KERN_ERR "slirp_close: sending hangup to %d failed "
-                       "(%d)\n", pri->pid, errno);
-        }
-#endif
-        err = helper_wait(pri->pid);
-        if (err < 0)
-                return;
-
-        pri->pid = -1;
-}
-
-int slirp_user_read(int fd, void *buf, int len, struct slirp_data *pri)
-{
-        return slip_proto_read(fd, buf, len, &pri->slip);
-}
-
-int slirp_user_write(int fd, void *buf, int len, struct slirp_data *pri)
-{
-        return slip_proto_write(fd, buf, len, &pri->slip);
-}
-
-const struct net_user_info slirp_user_info = {
-        .init = slirp_user_init,
-        .open = slirp_open,
-        .close = slirp_close,
-        .remove = NULL,
-        .add_address = NULL,
-        .delete_address = NULL,
-        .mtu = BUF_SIZE,
-        .max_packet = BUF_SIZE,
-};
-27
arch/um/drivers/umcast.h
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- */
-
-#ifndef __DRIVERS_UMCAST_H
-#define __DRIVERS_UMCAST_H
-
-#include <net_user.h>
-
-struct umcast_data {
-        char *addr;
-        unsigned short lport;
-        unsigned short rport;
-        void *listen_addr;
-        void *remote_addr;
-        int ttl;
-        int unicast;
-        void *dev;
-};
-
-extern const struct net_user_info umcast_user_info;
-
-extern int umcast_user_write(int fd, void *buf, int len,
-                             struct umcast_data *pri);
-
-#endif
-188
arch/um/drivers/umcast_kern.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * user-mode-linux networking multicast transport
- * Copyright (C) 2001 by Harald Welte <laforge@gnumonks.org>
- * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- *
- * based on the existing uml-networking code, which is
- * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and
- * James Leu (jleu@mindspring.net).
- * Copyright (C) 2001 by various other people who didn't put their name here.
- *
- */
-
-#include <linux/init.h>
-#include <linux/netdevice.h>
-#include "umcast.h"
-#include <net_kern.h>
-
-struct umcast_init {
-        char *addr;
-        int lport;
-        int rport;
-        int ttl;
-        bool unicast;
-};
-
-static void umcast_init(struct net_device *dev, void *data)
-{
-        struct uml_net_private *pri;
-        struct umcast_data *dpri;
-        struct umcast_init *init = data;
-
-        pri = netdev_priv(dev);
-        dpri = (struct umcast_data *) pri->user;
-        dpri->addr = init->addr;
-        dpri->lport = init->lport;
-        dpri->rport = init->rport;
-        dpri->unicast = init->unicast;
-        dpri->ttl = init->ttl;
-        dpri->dev = dev;
-
-        if (dpri->unicast) {
-                printk(KERN_INFO "ucast backend address: %s:%u listen port: "
-                       "%u\n", dpri->addr, dpri->rport, dpri->lport);
-        } else {
-                printk(KERN_INFO "mcast backend multicast address: %s:%u, "
-                       "TTL:%u\n", dpri->addr, dpri->lport, dpri->ttl);
-        }
-}
-
-static int umcast_read(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        return net_recvfrom(fd, skb_mac_header(skb),
-                            skb->dev->mtu + ETH_HEADER_OTHER);
-}
-
-static int umcast_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        return umcast_user_write(fd, skb->data, skb->len,
-                                 (struct umcast_data *) &lp->user);
-}
-
-static const struct net_kern_info umcast_kern_info = {
-        .init = umcast_init,
-        .protocol = eth_protocol,
-        .read = umcast_read,
-        .write = umcast_write,
-};
-
-static int mcast_setup(char *str, char **mac_out, void *data)
-{
-        struct umcast_init *init = data;
-        char *port_str = NULL, *ttl_str = NULL, *remain;
-        char *last;
-
-        *init = ((struct umcast_init)
-                { .addr = "239.192.168.1",
-                  .lport = 1102,
-                  .ttl = 1 });
-
-        remain = split_if_spec(str, mac_out, &init->addr, &port_str, &ttl_str,
-                               NULL);
-        if (remain != NULL) {
-                printk(KERN_ERR "mcast_setup - Extra garbage on "
-                       "specification : '%s'\n", remain);
-                return 0;
-        }
-
-        if (port_str != NULL) {
-                init->lport = simple_strtoul(port_str, &last, 10);
-                if ((*last != '\0') || (last == port_str)) {
-                        printk(KERN_ERR "mcast_setup - Bad port : '%s'\n",
-                               port_str);
-                        return 0;
-                }
-        }
-
-        if (ttl_str != NULL) {
-                init->ttl = simple_strtoul(ttl_str, &last, 10);
-                if ((*last != '\0') || (last == ttl_str)) {
-                        printk(KERN_ERR "mcast_setup - Bad ttl : '%s'\n",
-                               ttl_str);
-                        return 0;
-                }
-        }
-
-        init->unicast = false;
-        init->rport = init->lport;
-
-        printk(KERN_INFO "Configured mcast device: %s:%u-%u\n", init->addr,
-               init->lport, init->ttl);
-
-        return 1;
-}
-
-static int ucast_setup(char *str, char **mac_out, void *data)
-{
-        struct umcast_init *init = data;
-        char *lport_str = NULL, *rport_str = NULL, *remain;
-        char *last;
-
-        *init = ((struct umcast_init)
-                { .addr = "",
-                  .lport = 1102,
-                  .rport = 1102 });
-
-        remain = split_if_spec(str, mac_out, &init->addr,
-                               &lport_str, &rport_str, NULL);
-        if (remain != NULL) {
-                printk(KERN_ERR "ucast_setup - Extra garbage on "
-                       "specification : '%s'\n", remain);
-                return 0;
-        }
-
-        if (lport_str != NULL) {
-                init->lport = simple_strtoul(lport_str, &last, 10);
-                if ((*last != '\0') || (last == lport_str)) {
-                        printk(KERN_ERR "ucast_setup - Bad listen port : "
-                               "'%s'\n", lport_str);
-                        return 0;
-                }
-        }
-
-        if (rport_str != NULL) {
-                init->rport = simple_strtoul(rport_str, &last, 10);
-                if ((*last != '\0') || (last == rport_str)) {
-                        printk(KERN_ERR "ucast_setup - Bad remote port : "
-                               "'%s'\n", rport_str);
-                        return 0;
-                }
-        }
-
-        init->unicast = true;
-
-        printk(KERN_INFO "Configured ucast device: :%u -> %s:%u\n",
-               init->lport, init->addr, init->rport);
-
-        return 1;
-}
-
-static struct transport mcast_transport = {
-        .list = LIST_HEAD_INIT(mcast_transport.list),
-        .name = "mcast",
-        .setup = mcast_setup,
-        .user = &umcast_user_info,
-        .kern = &umcast_kern_info,
-        .private_size = sizeof(struct umcast_data),
-        .setup_size = sizeof(struct umcast_init),
-};
-
-static struct transport ucast_transport = {
-        .list = LIST_HEAD_INIT(ucast_transport.list),
-        .name = "ucast",
-        .setup = ucast_setup,
-        .user = &umcast_user_info,
-        .kern = &umcast_kern_info,
-        .private_size = sizeof(struct umcast_data),
-        .setup_size = sizeof(struct umcast_init),
-};
-
-static int register_umcast(void)
-{
-        register_transport(&mcast_transport);
-        register_transport(&ucast_transport);
-        return 0;
-}
-
-late_initcall(register_umcast);
-184
arch/um/drivers/umcast_user.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * user-mode-linux networking multicast transport
- * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
- * Copyright (C) 2001 by Harald Welte <laforge@gnumonks.org>
- *
- * based on the existing uml-networking code, which is
- * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and
- * James Leu (jleu@mindspring.net).
- * Copyright (C) 2001 by various other people who didn't put their name here.
- *
- *
- */
-
-#include <unistd.h>
-#include <errno.h>
-#include <netinet/in.h>
-#include "umcast.h"
-#include <net_user.h>
-#include <um_malloc.h>
-
-static struct sockaddr_in *new_addr(char *addr, unsigned short port)
-{
-        struct sockaddr_in *sin;
-
-        sin = uml_kmalloc(sizeof(struct sockaddr_in), UM_GFP_KERNEL);
-        if (sin == NULL) {
-                printk(UM_KERN_ERR "new_addr: allocation of sockaddr_in "
-                       "failed\n");
-                return NULL;
-        }
-        sin->sin_family = AF_INET;
-        if (addr)
-                sin->sin_addr.s_addr = in_aton(addr);
-        else
-                sin->sin_addr.s_addr = INADDR_ANY;
-        sin->sin_port = htons(port);
-        return sin;
-}
-
-static int umcast_user_init(void *data, void *dev)
-{
-        struct umcast_data *pri = data;
-
-        pri->remote_addr = new_addr(pri->addr, pri->rport);
-        if (pri->unicast)
-                pri->listen_addr = new_addr(NULL, pri->lport);
-        else
-                pri->listen_addr = pri->remote_addr;
-        pri->dev = dev;
-        return 0;
-}
-
-static void umcast_remove(void *data)
-{
-        struct umcast_data *pri = data;
-
-        kfree(pri->listen_addr);
-        if (pri->unicast)
-                kfree(pri->remote_addr);
-        pri->listen_addr = pri->remote_addr = NULL;
-}
-
-static int umcast_open(void *data)
-{
-        struct umcast_data *pri = data;
-        struct sockaddr_in *lsin = pri->listen_addr;
-        struct sockaddr_in *rsin = pri->remote_addr;
-        struct ip_mreq mreq;
-        int fd, yes = 1, err = -EINVAL;
-
-
-        if ((!pri->unicast && lsin->sin_addr.s_addr == 0) ||
-            (rsin->sin_addr.s_addr == 0) ||
-            (lsin->sin_port == 0) || (rsin->sin_port == 0))
-                goto out;
-
-        fd = socket(AF_INET, SOCK_DGRAM, 0);
-
-        if (fd < 0) {
-                err = -errno;
-                printk(UM_KERN_ERR "umcast_open : data socket failed, "
-                       "errno = %d\n", errno);
-                goto out;
-        }
-
-        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0) {
-                err = -errno;
-                printk(UM_KERN_ERR "umcast_open: SO_REUSEADDR failed, "
-                       "errno = %d\n", errno);
-                goto out_close;
-        }
-
-        if (!pri->unicast) {
-                /* set ttl according to config */
-                if (setsockopt(fd, SOL_IP, IP_MULTICAST_TTL, &pri->ttl,
-                               sizeof(pri->ttl)) < 0) {
-                        err = -errno;
-                        printk(UM_KERN_ERR "umcast_open: IP_MULTICAST_TTL "
-                               "failed, error = %d\n", errno);
-                        goto out_close;
-                }
-
-                /* set LOOP, so data does get fed back to local sockets */
-                if (setsockopt(fd, SOL_IP, IP_MULTICAST_LOOP,
-                               &yes, sizeof(yes)) < 0) {
-                        err = -errno;
-                        printk(UM_KERN_ERR "umcast_open: IP_MULTICAST_LOOP "
-                               "failed, error = %d\n", errno);
-                        goto out_close;
-                }
-        }
-
-        /* bind socket to the address */
-        if (bind(fd, (struct sockaddr *) lsin, sizeof(*lsin)) < 0) {
-                err = -errno;
-                printk(UM_KERN_ERR "umcast_open : data bind failed, "
-                       "errno = %d\n", errno);
-                goto out_close;
-        }
-
-        if (!pri->unicast) {
-                /* subscribe to the multicast group */
-                mreq.imr_multiaddr.s_addr = lsin->sin_addr.s_addr;
-                mreq.imr_interface.s_addr = 0;
-                if (setsockopt(fd, SOL_IP, IP_ADD_MEMBERSHIP,
-                               &mreq, sizeof(mreq)) < 0) {
-                        err = -errno;
-                        printk(UM_KERN_ERR "umcast_open: IP_ADD_MEMBERSHIP "
-                               "failed, error = %d\n", errno);
-                        printk(UM_KERN_ERR "There appears not to be a "
-                               "multicast-capable network interface on the "
-                               "host.\n");
-                        printk(UM_KERN_ERR "eth0 should be configured in order "
-                               "to use the multicast transport.\n");
-                        goto out_close;
-                }
-        }
-
-        return fd;
-
-out_close:
-        close(fd);
-out:
-        return err;
-}
-
-static void umcast_close(int fd, void *data)
-{
-        struct umcast_data *pri = data;
-
-        if (!pri->unicast) {
-                struct ip_mreq mreq;
-                struct sockaddr_in *lsin = pri->listen_addr;
-
-                mreq.imr_multiaddr.s_addr = lsin->sin_addr.s_addr;
-                mreq.imr_interface.s_addr = 0;
-                if (setsockopt(fd, SOL_IP, IP_DROP_MEMBERSHIP,
-                               &mreq, sizeof(mreq)) < 0) {
-                        printk(UM_KERN_ERR "umcast_close: IP_DROP_MEMBERSHIP "
-                               "failed, error = %d\n", errno);
-                }
-        }
-
-        close(fd);
-}
-
-int umcast_user_write(int fd, void *buf, int len, struct umcast_data *pri)
-{
-        struct sockaddr_in *data_addr = pri->remote_addr;
-
-        return net_sendto(fd, buf, len, data_addr, sizeof(*data_addr));
-}
-
-const struct net_user_info umcast_user_info = {
-        .init = umcast_user_init,
-        .open = umcast_open,
-        .close = umcast_close,
-        .remove = umcast_remove,
-        .add_address = NULL,
-        .delete_address = NULL,
-        .mtu = ETH_MAX_PACKET,
-        .max_packet = ETH_MAX_PACKET + ETH_HEADER_OTHER,
-};
-32
arch/um/drivers/vde.h
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2007 Luca Bigliardi (shammash@artha.org).
- */
-
-#ifndef __UM_VDE_H__
-#define __UM_VDE_H__
-
-struct vde_data {
-        char *vde_switch;
-        char *descr;
-        void *args;
-        void *conn;
-        void *dev;
-};
-
-struct vde_init {
-        char *vde_switch;
-        char *descr;
-        int port;
-        char *group;
-        int mode;
-};
-
-extern const struct net_user_info vde_user_info;
-
-extern void vde_init_libstuff(struct vde_data *vpri, struct vde_init *init);
-
-extern int vde_user_read(void *conn, void *buf, int len);
-extern int vde_user_write(void *conn, void *buf, int len);
-
-#endif
-129
arch/um/drivers/vde_kern.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2007 Luca Bigliardi (shammash@artha.org).
- *
- * Transport usage:
- *  ethN=vde,<vde_switch>,<mac addr>,<port>,<group>,<mode>,<description>
- *
- */
-
-#include <linux/init.h>
-#include <linux/netdevice.h>
-#include <net_kern.h>
-#include <net_user.h>
-#include "vde.h"
-
-static void vde_init(struct net_device *dev, void *data)
-{
-        struct vde_init *init = data;
-        struct uml_net_private *pri;
-        struct vde_data *vpri;
-
-        pri = netdev_priv(dev);
-        vpri = (struct vde_data *) pri->user;
-
-        vpri->vde_switch = init->vde_switch;
-        vpri->descr = init->descr ? init->descr : "UML vde_transport";
-        vpri->args = NULL;
-        vpri->conn = NULL;
-        vpri->dev = dev;
-
-        printk("vde backend - %s, ", vpri->vde_switch ?
-               vpri->vde_switch : "(default socket)");
-
-        vde_init_libstuff(vpri, init);
-
-        printk("\n");
-}
-
-static int vde_read(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        struct vde_data *pri = (struct vde_data *) &lp->user;
-
-        if (pri->conn != NULL)
-                return vde_user_read(pri->conn, skb_mac_header(skb),
-                                     skb->dev->mtu + ETH_HEADER_OTHER);
-
-        printk(KERN_ERR "vde_read - we have no VDECONN to read from");
-        return -EBADF;
-}
-
-static int vde_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
-{
-        struct vde_data *pri = (struct vde_data *) &lp->user;
-
-        if (pri->conn != NULL)
-                return vde_user_write((void *)pri->conn, skb->data,
-                                      skb->len);
-
-        printk(KERN_ERR "vde_write - we have no VDECONN to write to");
-        return -EBADF;
-}
-
-static const struct net_kern_info vde_kern_info = {
-        .init = vde_init,
-        .protocol = eth_protocol,
-        .read = vde_read,
-        .write = vde_write,
-};
-
-static int vde_setup(char *str, char **mac_out, void *data)
-{
-        struct vde_init *init = data;
-        char *remain, *port_str = NULL, *mode_str = NULL, *last;
-
-        *init = ((struct vde_init)
-                { .vde_switch = NULL,
-                  .descr = NULL,
-                  .port = 0,
-                  .group = NULL,
-                  .mode = 0 });
-
-        remain = split_if_spec(str, &init->vde_switch, mac_out, &port_str,
-                               &init->group, &mode_str, &init->descr, NULL);
-
-        if (remain != NULL)
-                printk(KERN_WARNING "vde_setup - Ignoring extra data :"
-                       "'%s'\n", remain);
-
-        if (port_str != NULL) {
-                init->port = simple_strtoul(port_str, &last, 10);
-                if ((*last != '\0') || (last == port_str)) {
-                        printk(KERN_ERR "vde_setup - Bad port : '%s'\n",
-                               port_str);
-                        return 0;
-                }
-        }
-
-        if (mode_str != NULL) {
-                init->mode = simple_strtoul(mode_str, &last, 8);
-                if ((*last != '\0') || (last == mode_str)) {
-                        printk(KERN_ERR "vde_setup - Bad mode : '%s'\n",
-                               mode_str);
-                        return 0;
-                }
-        }
-
-        printk(KERN_INFO "Configured vde device: %s\n", init->vde_switch ?
-               init->vde_switch : "(default socket)");
-
-        return 1;
-}
-
-static struct transport vde_transport = {
-        .list = LIST_HEAD_INIT(vde_transport.list),
-        .name = "vde",
-        .setup = vde_setup,
-        .user = &vde_user_info,
-        .kern = &vde_kern_info,
-        .private_size = sizeof(struct vde_data),
-        .setup_size = sizeof(struct vde_init),
-};
-
-static int register_vde(void)
-{
-        register_transport(&vde_transport);
-        return 0;
-}
-
-late_initcall(register_vde);
-125
arch/um/drivers/vde_user.c
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2007 Luca Bigliardi (shammash@artha.org).
- */
-
-#include <stddef.h>
-#include <errno.h>
-#include <libvdeplug.h>
-#include <net_user.h>
-#include <um_malloc.h>
-#include "vde.h"
-
-static int vde_user_init(void *data, void *dev)
-{
-        struct vde_data *pri = data;
-        VDECONN *conn = NULL;
-        int err = -EINVAL;
-
-        pri->dev = dev;
-
-        conn = vde_open(pri->vde_switch, pri->descr, pri->args);
-
-        if (conn == NULL) {
-                err = -errno;
-                printk(UM_KERN_ERR "vde_user_init: vde_open failed, "
-                       "errno = %d\n", errno);
-                return err;
-        }
-
-        printk(UM_KERN_INFO "vde backend - connection opened\n");
-
-        pri->conn = conn;
-
-        return 0;
-}
-
-static int vde_user_open(void *data)
-{
-        struct vde_data *pri = data;
-
-        if (pri->conn != NULL)
-                return vde_datafd(pri->conn);
-
-        printk(UM_KERN_WARNING "vde_open - we have no VDECONN to open");
-        return -EINVAL;
-}
-
-static void vde_remove(void *data)
-{
-        struct vde_data *pri = data;
-
-        if (pri->conn != NULL) {
-                printk(UM_KERN_INFO "vde backend - closing connection\n");
-                vde_close(pri->conn);
-                pri->conn = NULL;
-                kfree(pri->args);
-                pri->args = NULL;
-                return;
-        }
-
-        printk(UM_KERN_WARNING "vde_remove - we have no VDECONN to remove");
-}
-
-const struct net_user_info vde_user_info = {
-        .init = vde_user_init,
-        .open = vde_user_open,
-        .close = NULL,
-        .remove = vde_remove,
-        .add_address = NULL,
-        .delete_address = NULL,
-        .mtu = ETH_MAX_PACKET,
-        .max_packet = ETH_MAX_PACKET + ETH_HEADER_OTHER,
-};
-
-void vde_init_libstuff(struct vde_data *vpri, struct vde_init *init)
-{
-        struct vde_open_args *args;
-
-        vpri->args = uml_kmalloc(sizeof(struct vde_open_args), UM_GFP_KERNEL);
-        if (vpri->args == NULL) {
-                printk(UM_KERN_ERR "vde_init_libstuff - vde_open_args "
-                       "allocation failed");
-                return;
-        }
-
-        args = vpri->args;
-
-        args->port = init->port;
-        args->group = init->group;
-        args->mode = init->mode ? init->mode : 0700;
-
-        args->port ? printk("port %d", args->port) :
-                printk("undefined port");
-}
-
-int vde_user_read(void *conn, void *buf, int len)
-{
-        VDECONN *vconn = conn;
-        int rv;
-
-        if (vconn == NULL)
-                return 0;
-
-        rv = vde_recv(vconn, buf, len, 0);
-        if (rv < 0) {
-                if (errno == EAGAIN)
-                        return 0;
-                return -errno;
-        }
-        else if (rv == 0)
-                return -ENOTCONN;
-
-        return rv;
-}
-
-int vde_user_write(void *conn, void *buf, int len)
-{
-        VDECONN *vconn = conn;
-
-        if (vconn == NULL)
-                return 0;
-
-        return vde_send(vconn, buf, len, 0);
-}
+40 -8
arch/um/drivers/vector_kern.c
··· 8 8 * Copyright (C) 2001 by various other people who didn't put their name here. 9 9 */ 10 10 11 + #define pr_fmt(fmt) "uml-vector: " fmt 12 + 11 13 #include <linux/memblock.h> 12 14 #include <linux/etherdevice.h> 13 15 #include <linux/ethtool.h> ··· 29 27 #include <init.h> 30 28 #include <irq_kern.h> 31 29 #include <irq_user.h> 32 - #include <net_kern.h> 33 30 #include <os.h> 34 31 #include "mconsole_kern.h" 35 32 #include "vector_user.h" ··· 1540 1539 napi_schedule(&vp->napi); 1541 1540 } 1542 1541 1542 + static void vector_setup_etheraddr(struct net_device *dev, char *str) 1543 + { 1544 + u8 addr[ETH_ALEN]; 1543 1545 1546 + if (str == NULL) 1547 + goto random; 1548 + 1549 + if (!mac_pton(str, addr)) { 1550 + netdev_err(dev, 1551 + "Failed to parse '%s' as an ethernet address\n", str); 1552 + goto random; 1553 + } 1554 + if (is_multicast_ether_addr(addr)) { 1555 + netdev_err(dev, 1556 + "Attempt to assign a multicast ethernet address to a device disallowed\n"); 1557 + goto random; 1558 + } 1559 + if (!is_valid_ether_addr(addr)) { 1560 + netdev_err(dev, 1561 + "Attempt to assign an invalid ethernet address to a device disallowed\n"); 1562 + goto random; 1563 + } 1564 + if (!is_local_ether_addr(addr)) { 1565 + netdev_warn(dev, "Warning: Assigning a globally valid ethernet address to a device\n"); 1566 + netdev_warn(dev, "You should set the 2nd rightmost bit in the first byte of the MAC,\n"); 1567 + netdev_warn(dev, "i.e. 
%02x:%02x:%02x:%02x:%02x:%02x\n", 1568 + addr[0] | 0x02, addr[1], addr[2], addr[3], addr[4], addr[5]); 1569 + } 1570 + eth_hw_addr_set(dev, addr); 1571 + return; 1572 + 1573 + random: 1574 + netdev_info(dev, "Choosing a random ethernet address\n"); 1575 + eth_hw_addr_random(dev); 1576 + } 1544 1577 1545 1578 static void vector_eth_configure( 1546 1579 int n, ··· 1588 1553 1589 1554 device = kzalloc(sizeof(*device), GFP_KERNEL); 1590 1555 if (device == NULL) { 1591 - printk(KERN_ERR "eth_configure failed to allocate struct " 1592 - "vector_device\n"); 1556 + pr_err("Failed to allocate struct vector_device for vec%d\n", n); 1593 1557 return; 1594 1558 } 1595 1559 dev = alloc_etherdev(sizeof(struct vector_private)); 1596 1560 if (dev == NULL) { 1597 - printk(KERN_ERR "eth_configure: failed to allocate struct " 1598 - "net_device for vec%d\n", n); 1561 + pr_err("Failed to allocate struct net_device for vec%d\n", n); 1599 1562 goto out_free_device; 1600 1563 } 1601 1564 ··· 1607 1574 * and fail. 1608 1575 */ 1609 1576 snprintf(dev->name, sizeof(dev->name), "vec%d", n); 1610 - uml_net_setup_etheraddr(dev, uml_vector_fetch_arg(def, "mac")); 1577 + vector_setup_etheraddr(dev, uml_vector_fetch_arg(def, "mac")); 1611 1578 vp = netdev_priv(dev); 1612 1579 1613 1580 /* sysfs register */ ··· 1723 1690 1724 1691 err = vector_parse(str, &n, &str, &error); 1725 1692 if (err) { 1726 - printk(KERN_ERR "vector_setup - Couldn't parse '%s' : %s\n", 1727 - str, error); 1693 + pr_err("Couldn't parse '%s': %s\n", str, error); 1728 1694 return 1; 1729 1695 } 1730 1696 new = memblock_alloc_or_panic(sizeof(*new), SMP_CACHE_BYTES);
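The new vector_setup_etheraddr() above validates a user-supplied MAC with kernel helpers (mac_pton(), is_multicast_ether_addr(), is_local_ether_addr()). A rough userspace equivalent of those checks, for illustration only; the parsing here is deliberately simpler and looser than the kernel's mac_pton():

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "aa:bb:cc:dd:ee:ff"; unlike the kernel's mac_pton() this
 * sketch does not reject trailing junk. */
static bool parse_mac(const char *str, uint8_t addr[6])
{
	return sscanf(str, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
		      &addr[0], &addr[1], &addr[2],
		      &addr[3], &addr[4], &addr[5]) == 6;
}

/* I/G bit: group (multicast) addresses may not be assigned to a device. */
static bool mac_is_multicast(const uint8_t addr[6])
{
	return addr[0] & 0x01;
}

/* U/L bit: locally administered addresses are what a VM should use,
 * hence the netdev_warn() above when this bit is clear. */
static bool mac_is_local(const uint8_t addr[6])
{
	return addr[0] & 0x02;
}
```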
+642
arch/um/drivers/vfio_kern.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2025 Ant Group 4 + * Author: Tiwei Bie <tiwei.btw@antgroup.com> 5 + */ 6 + 7 + #define pr_fmt(fmt) "vfio-uml: " fmt 8 + 9 + #include <linux/module.h> 10 + #include <linux/logic_iomem.h> 11 + #include <linux/mutex.h> 12 + #include <linux/list.h> 13 + #include <linux/string.h> 14 + #include <linux/unaligned.h> 15 + #include <irq_kern.h> 16 + #include <init.h> 17 + #include <os.h> 18 + 19 + #include "virt-pci.h" 20 + #include "vfio_user.h" 21 + 22 + #define to_vdev(_pdev) container_of(_pdev, struct uml_vfio_device, pdev) 23 + 24 + struct uml_vfio_intr_ctx { 25 + struct uml_vfio_device *dev; 26 + int irq; 27 + }; 28 + 29 + struct uml_vfio_device { 30 + const char *name; 31 + int group; 32 + 33 + struct um_pci_device pdev; 34 + struct uml_vfio_user_device udev; 35 + struct uml_vfio_intr_ctx *intr_ctx; 36 + 37 + int msix_cap; 38 + int msix_bar; 39 + int msix_offset; 40 + int msix_size; 41 + u32 *msix_data; 42 + 43 + struct list_head list; 44 + }; 45 + 46 + struct uml_vfio_group { 47 + int id; 48 + int fd; 49 + int users; 50 + struct list_head list; 51 + }; 52 + 53 + static struct { 54 + int fd; 55 + int users; 56 + } uml_vfio_container = { .fd = -1 }; 57 + static DEFINE_MUTEX(uml_vfio_container_mtx); 58 + 59 + static LIST_HEAD(uml_vfio_groups); 60 + static DEFINE_MUTEX(uml_vfio_groups_mtx); 61 + 62 + static LIST_HEAD(uml_vfio_devices); 63 + 64 + static int uml_vfio_set_container(int group_fd) 65 + { 66 + int err; 67 + 68 + guard(mutex)(&uml_vfio_container_mtx); 69 + 70 + err = uml_vfio_user_set_container(uml_vfio_container.fd, group_fd); 71 + if (err) 72 + return err; 73 + 74 + uml_vfio_container.users++; 75 + if (uml_vfio_container.users > 1) 76 + return 0; 77 + 78 + err = uml_vfio_user_setup_iommu(uml_vfio_container.fd); 79 + if (err) { 80 + uml_vfio_user_unset_container(uml_vfio_container.fd, group_fd); 81 + uml_vfio_container.users--; 82 + } 83 + return err; 84 + } 85 + 86 + static void 
uml_vfio_unset_container(int group_fd) 87 + { 88 + guard(mutex)(&uml_vfio_container_mtx); 89 + 90 + uml_vfio_user_unset_container(uml_vfio_container.fd, group_fd); 91 + uml_vfio_container.users--; 92 + } 93 + 94 + static int uml_vfio_open_group(int group_id) 95 + { 96 + struct uml_vfio_group *group; 97 + int err; 98 + 99 + guard(mutex)(&uml_vfio_groups_mtx); 100 + 101 + list_for_each_entry(group, &uml_vfio_groups, list) { 102 + if (group->id == group_id) { 103 + group->users++; 104 + return group->fd; 105 + } 106 + } 107 + 108 + group = kzalloc(sizeof(*group), GFP_KERNEL); 109 + if (!group) 110 + return -ENOMEM; 111 + 112 + group->fd = uml_vfio_user_open_group(group_id); 113 + if (group->fd < 0) { 114 + err = group->fd; 115 + goto free_group; 116 + } 117 + 118 + err = uml_vfio_set_container(group->fd); 119 + if (err) 120 + goto close_group; 121 + 122 + group->id = group_id; 123 + group->users = 1; 124 + 125 + list_add(&group->list, &uml_vfio_groups); 126 + 127 + return group->fd; 128 + 129 + close_group: 130 + os_close_file(group->fd); 131 + free_group: 132 + kfree(group); 133 + return err; 134 + } 135 + 136 + static int uml_vfio_release_group(int group_fd) 137 + { 138 + struct uml_vfio_group *group; 139 + 140 + guard(mutex)(&uml_vfio_groups_mtx); 141 + 142 + list_for_each_entry(group, &uml_vfio_groups, list) { 143 + if (group->fd == group_fd) { 144 + group->users--; 145 + if (group->users == 0) { 146 + uml_vfio_unset_container(group_fd); 147 + os_close_file(group_fd); 148 + list_del(&group->list); 149 + kfree(group); 150 + } 151 + return 0; 152 + } 153 + } 154 + 155 + return -ENOENT; 156 + } 157 + 158 + static irqreturn_t uml_vfio_interrupt(int unused, void *opaque) 159 + { 160 + struct uml_vfio_intr_ctx *ctx = opaque; 161 + struct uml_vfio_device *dev = ctx->dev; 162 + int index = ctx - dev->intr_ctx; 163 + int irqfd = dev->udev.irqfd[index]; 164 + int irq = dev->msix_data[index]; 165 + uint64_t v; 166 + int r; 167 + 168 + do { 169 + r = os_read_file(irqfd, &v, 
sizeof(v)); 170 + if (r == sizeof(v)) 171 + generic_handle_irq(irq); 172 + } while (r == sizeof(v) || r == -EINTR); 173 + WARN(r != -EAGAIN, "read returned %d\n", r); 174 + 175 + return IRQ_HANDLED; 176 + } 177 + 178 + static int uml_vfio_activate_irq(struct uml_vfio_device *dev, int index) 179 + { 180 + struct uml_vfio_intr_ctx *ctx = &dev->intr_ctx[index]; 181 + int err, irqfd; 182 + 183 + if (ctx->irq >= 0) 184 + return 0; 185 + 186 + irqfd = uml_vfio_user_activate_irq(&dev->udev, index); 187 + if (irqfd < 0) 188 + return irqfd; 189 + 190 + ctx->irq = um_request_irq(UM_IRQ_ALLOC, irqfd, IRQ_READ, 191 + uml_vfio_interrupt, 0, 192 + "vfio-uml", ctx); 193 + if (ctx->irq < 0) { 194 + err = ctx->irq; 195 + goto deactivate; 196 + } 197 + 198 + err = add_sigio_fd(irqfd); 199 + if (err) 200 + goto free_irq; 201 + 202 + return 0; 203 + 204 + free_irq: 205 + um_free_irq(ctx->irq, ctx); 206 + ctx->irq = -1; 207 + deactivate: 208 + uml_vfio_user_deactivate_irq(&dev->udev, index); 209 + return err; 210 + } 211 + 212 + static int uml_vfio_deactivate_irq(struct uml_vfio_device *dev, int index) 213 + { 214 + struct uml_vfio_intr_ctx *ctx = &dev->intr_ctx[index]; 215 + 216 + if (ctx->irq >= 0) { 217 + ignore_sigio_fd(dev->udev.irqfd[index]); 218 + um_free_irq(ctx->irq, ctx); 219 + uml_vfio_user_deactivate_irq(&dev->udev, index); 220 + ctx->irq = -1; 221 + } 222 + return 0; 223 + } 224 + 225 + static int uml_vfio_update_msix_cap(struct uml_vfio_device *dev, 226 + unsigned int offset, int size, 227 + unsigned long val) 228 + { 229 + /* 230 + * Here, we handle only the operations we care about, 231 + * ignoring the rest. 
232 + */ 233 + if (size == 2 && offset == dev->msix_cap + PCI_MSIX_FLAGS) { 234 + switch (val & ~PCI_MSIX_FLAGS_QSIZE) { 235 + case PCI_MSIX_FLAGS_ENABLE: 236 + case 0: 237 + return uml_vfio_user_update_irqs(&dev->udev); 238 + } 239 + } 240 + return 0; 241 + } 242 + 243 + static int uml_vfio_update_msix_table(struct uml_vfio_device *dev, 244 + unsigned int offset, int size, 245 + unsigned long val) 246 + { 247 + int index; 248 + 249 + /* 250 + * Here, we handle only the operations we care about, 251 + * ignoring the rest. 252 + */ 253 + offset -= dev->msix_offset + PCI_MSIX_ENTRY_DATA; 254 + 255 + if (size != 4 || offset % PCI_MSIX_ENTRY_SIZE != 0) 256 + return 0; 257 + 258 + index = offset / PCI_MSIX_ENTRY_SIZE; 259 + if (index >= dev->udev.irq_count) 260 + return -EINVAL; 261 + 262 + dev->msix_data[index] = val; 263 + 264 + return val ? uml_vfio_activate_irq(dev, index) : 265 + uml_vfio_deactivate_irq(dev, index); 266 + } 267 + 268 + static unsigned long __uml_vfio_cfgspace_read(struct uml_vfio_device *dev, 269 + unsigned int offset, int size) 270 + { 271 + u8 data[8]; 272 + 273 + memset(data, 0xff, sizeof(data)); 274 + 275 + if (uml_vfio_user_cfgspace_read(&dev->udev, offset, data, size)) 276 + return ULONG_MAX; 277 + 278 + switch (size) { 279 + case 1: 280 + return data[0]; 281 + case 2: 282 + return le16_to_cpup((void *)data); 283 + case 4: 284 + return le32_to_cpup((void *)data); 285 + #ifdef CONFIG_64BIT 286 + case 8: 287 + return le64_to_cpup((void *)data); 288 + #endif 289 + default: 290 + return ULONG_MAX; 291 + } 292 + } 293 + 294 + static unsigned long uml_vfio_cfgspace_read(struct um_pci_device *pdev, 295 + unsigned int offset, int size) 296 + { 297 + struct uml_vfio_device *dev = to_vdev(pdev); 298 + 299 + return __uml_vfio_cfgspace_read(dev, offset, size); 300 + } 301 + 302 + static void __uml_vfio_cfgspace_write(struct uml_vfio_device *dev, 303 + unsigned int offset, int size, 304 + unsigned long val) 305 + { 306 + u8 data[8]; 307 + 308 + switch 
(size) { 309 + case 1: 310 + data[0] = (u8)val; 311 + break; 312 + case 2: 313 + put_unaligned_le16(val, (void *)data); 314 + break; 315 + case 4: 316 + put_unaligned_le32(val, (void *)data); 317 + break; 318 + #ifdef CONFIG_64BIT 319 + case 8: 320 + put_unaligned_le64(val, (void *)data); 321 + break; 322 + #endif 323 + } 324 + 325 + WARN_ON(uml_vfio_user_cfgspace_write(&dev->udev, offset, data, size)); 326 + } 327 + 328 + static void uml_vfio_cfgspace_write(struct um_pci_device *pdev, 329 + unsigned int offset, int size, 330 + unsigned long val) 331 + { 332 + struct uml_vfio_device *dev = to_vdev(pdev); 333 + 334 + if (offset < dev->msix_cap + PCI_CAP_MSIX_SIZEOF && 335 + offset + size > dev->msix_cap) 336 + WARN_ON(uml_vfio_update_msix_cap(dev, offset, size, val)); 337 + 338 + __uml_vfio_cfgspace_write(dev, offset, size, val); 339 + } 340 + 341 + static void uml_vfio_bar_copy_from(struct um_pci_device *pdev, int bar, 342 + void *buffer, unsigned int offset, int size) 343 + { 344 + struct uml_vfio_device *dev = to_vdev(pdev); 345 + 346 + memset(buffer, 0xff, size); 347 + uml_vfio_user_bar_read(&dev->udev, bar, offset, buffer, size); 348 + } 349 + 350 + static unsigned long uml_vfio_bar_read(struct um_pci_device *pdev, int bar, 351 + unsigned int offset, int size) 352 + { 353 + u8 data[8]; 354 + 355 + uml_vfio_bar_copy_from(pdev, bar, data, offset, size); 356 + 357 + switch (size) { 358 + case 1: 359 + return data[0]; 360 + case 2: 361 + return le16_to_cpup((void *)data); 362 + case 4: 363 + return le32_to_cpup((void *)data); 364 + #ifdef CONFIG_64BIT 365 + case 8: 366 + return le64_to_cpup((void *)data); 367 + #endif 368 + default: 369 + return ULONG_MAX; 370 + } 371 + } 372 + 373 + static void uml_vfio_bar_copy_to(struct um_pci_device *pdev, int bar, 374 + unsigned int offset, const void *buffer, 375 + int size) 376 + { 377 + struct uml_vfio_device *dev = to_vdev(pdev); 378 + 379 + uml_vfio_user_bar_write(&dev->udev, bar, offset, buffer, size); 380 + } 381 + 382 
+ static void uml_vfio_bar_write(struct um_pci_device *pdev, int bar, 383 + unsigned int offset, int size, 384 + unsigned long val) 385 + { 386 + struct uml_vfio_device *dev = to_vdev(pdev); 387 + u8 data[8]; 388 + 389 + if (bar == dev->msix_bar && offset + size > dev->msix_offset && 390 + offset < dev->msix_offset + dev->msix_size) 391 + WARN_ON(uml_vfio_update_msix_table(dev, offset, size, val)); 392 + 393 + switch (size) { 394 + case 1: 395 + data[0] = (u8)val; 396 + break; 397 + case 2: 398 + put_unaligned_le16(val, (void *)data); 399 + break; 400 + case 4: 401 + put_unaligned_le32(val, (void *)data); 402 + break; 403 + #ifdef CONFIG_64BIT 404 + case 8: 405 + put_unaligned_le64(val, (void *)data); 406 + break; 407 + #endif 408 + } 409 + 410 + uml_vfio_bar_copy_to(pdev, bar, offset, data, size); 411 + } 412 + 413 + static void uml_vfio_bar_set(struct um_pci_device *pdev, int bar, 414 + unsigned int offset, u8 value, int size) 415 + { 416 + struct uml_vfio_device *dev = to_vdev(pdev); 417 + int i; 418 + 419 + for (i = 0; i < size; i++) 420 + uml_vfio_user_bar_write(&dev->udev, bar, offset + i, &value, 1); 421 + } 422 + 423 + static const struct um_pci_ops uml_vfio_um_pci_ops = { 424 + .cfgspace_read = uml_vfio_cfgspace_read, 425 + .cfgspace_write = uml_vfio_cfgspace_write, 426 + .bar_read = uml_vfio_bar_read, 427 + .bar_write = uml_vfio_bar_write, 428 + .bar_copy_from = uml_vfio_bar_copy_from, 429 + .bar_copy_to = uml_vfio_bar_copy_to, 430 + .bar_set = uml_vfio_bar_set, 431 + }; 432 + 433 + static u8 uml_vfio_find_capability(struct uml_vfio_device *dev, u8 cap) 434 + { 435 + u8 id, pos; 436 + u16 ent; 437 + int ttl = 48; /* PCI_FIND_CAP_TTL */ 438 + 439 + pos = __uml_vfio_cfgspace_read(dev, PCI_CAPABILITY_LIST, sizeof(pos)); 440 + 441 + while (pos && ttl--) { 442 + ent = __uml_vfio_cfgspace_read(dev, pos, sizeof(ent)); 443 + 444 + id = ent & 0xff; 445 + if (id == 0xff) 446 + break; 447 + if (id == cap) 448 + return pos; 449 + 450 + pos = ent >> 8; 451 + } 452 + 
453 + return 0; 454 + } 455 + 456 + static int uml_vfio_read_msix_table(struct uml_vfio_device *dev) 457 + { 458 + unsigned int off; 459 + u16 flags; 460 + u32 tbl; 461 + 462 + off = uml_vfio_find_capability(dev, PCI_CAP_ID_MSIX); 463 + if (!off) 464 + return -ENOTSUPP; 465 + 466 + dev->msix_cap = off; 467 + 468 + tbl = __uml_vfio_cfgspace_read(dev, off + PCI_MSIX_TABLE, sizeof(tbl)); 469 + flags = __uml_vfio_cfgspace_read(dev, off + PCI_MSIX_FLAGS, sizeof(flags)); 470 + 471 + dev->msix_bar = tbl & PCI_MSIX_TABLE_BIR; 472 + dev->msix_offset = tbl & PCI_MSIX_TABLE_OFFSET; 473 + dev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * PCI_MSIX_ENTRY_SIZE; 474 + 475 + dev->msix_data = kzalloc(dev->msix_size, GFP_KERNEL); 476 + if (!dev->msix_data) 477 + return -ENOMEM; 478 + 479 + return 0; 480 + } 481 + 482 + static void uml_vfio_open_device(struct uml_vfio_device *dev) 483 + { 484 + struct uml_vfio_intr_ctx *ctx; 485 + int err, group_id, i; 486 + 487 + group_id = uml_vfio_user_get_group_id(dev->name); 488 + if (group_id < 0) { 489 + pr_err("Failed to get group id (%s), error %d\n", 490 + dev->name, group_id); 491 + goto free_dev; 492 + } 493 + 494 + dev->group = uml_vfio_open_group(group_id); 495 + if (dev->group < 0) { 496 + pr_err("Failed to open group %d (%s), error %d\n", 497 + group_id, dev->name, dev->group); 498 + goto free_dev; 499 + } 500 + 501 + err = uml_vfio_user_setup_device(&dev->udev, dev->group, dev->name); 502 + if (err) { 503 + pr_err("Failed to setup device (%s), error %d\n", 504 + dev->name, err); 505 + goto release_group; 506 + } 507 + 508 + err = uml_vfio_read_msix_table(dev); 509 + if (err) { 510 + pr_err("Failed to read MSI-X table (%s), error %d\n", 511 + dev->name, err); 512 + goto teardown_udev; 513 + } 514 + 515 + dev->intr_ctx = kmalloc_array(dev->udev.irq_count, 516 + sizeof(struct uml_vfio_intr_ctx), 517 + GFP_KERNEL); 518 + if (!dev->intr_ctx) { 519 + pr_err("Failed to allocate interrupt context (%s)\n", 520 + dev->name); 521 + goto 
free_msix; 522 + } 523 + 524 + for (i = 0; i < dev->udev.irq_count; i++) { 525 + ctx = &dev->intr_ctx[i]; 526 + ctx->dev = dev; 527 + ctx->irq = -1; 528 + } 529 + 530 + dev->pdev.ops = &uml_vfio_um_pci_ops; 531 + 532 + err = um_pci_device_register(&dev->pdev); 533 + if (err) { 534 + pr_err("Failed to register UM PCI device (%s), error %d\n", 535 + dev->name, err); 536 + goto free_intr_ctx; 537 + } 538 + 539 + return; 540 + 541 + free_intr_ctx: 542 + kfree(dev->intr_ctx); 543 + free_msix: 544 + kfree(dev->msix_data); 545 + teardown_udev: 546 + uml_vfio_user_teardown_device(&dev->udev); 547 + release_group: 548 + uml_vfio_release_group(dev->group); 549 + free_dev: 550 + list_del(&dev->list); 551 + kfree(dev->name); 552 + kfree(dev); 553 + } 554 + 555 + static void uml_vfio_release_device(struct uml_vfio_device *dev) 556 + { 557 + int i; 558 + 559 + for (i = 0; i < dev->udev.irq_count; i++) 560 + uml_vfio_deactivate_irq(dev, i); 561 + uml_vfio_user_update_irqs(&dev->udev); 562 + 563 + um_pci_device_unregister(&dev->pdev); 564 + kfree(dev->intr_ctx); 565 + kfree(dev->msix_data); 566 + uml_vfio_user_teardown_device(&dev->udev); 567 + uml_vfio_release_group(dev->group); 568 + list_del(&dev->list); 569 + kfree(dev->name); 570 + kfree(dev); 571 + } 572 + 573 + static int uml_vfio_cmdline_set(const char *device, const struct kernel_param *kp) 574 + { 575 + struct uml_vfio_device *dev; 576 + int fd; 577 + 578 + if (uml_vfio_container.fd < 0) { 579 + fd = uml_vfio_user_open_container(); 580 + if (fd < 0) 581 + return fd; 582 + uml_vfio_container.fd = fd; 583 + } 584 + 585 + dev = kzalloc(sizeof(*dev), GFP_KERNEL); 586 + if (!dev) 587 + return -ENOMEM; 588 + 589 + dev->name = kstrdup(device, GFP_KERNEL); 590 + if (!dev->name) { 591 + kfree(dev); 592 + return -ENOMEM; 593 + } 594 + 595 + list_add_tail(&dev->list, &uml_vfio_devices); 596 + return 0; 597 + } 598 + 599 + static int uml_vfio_cmdline_get(char *buffer, const struct kernel_param *kp) 600 + { 601 + return 0; 602 + } 
603 + 604 + static const struct kernel_param_ops uml_vfio_cmdline_param_ops = { 605 + .set = uml_vfio_cmdline_set, 606 + .get = uml_vfio_cmdline_get, 607 + }; 608 + 609 + device_param_cb(device, &uml_vfio_cmdline_param_ops, NULL, 0400); 610 + __uml_help(uml_vfio_cmdline_param_ops, 611 + "vfio_uml.device=<domain:bus:slot.function>\n" 612 + " Pass through a PCI device to UML via VFIO. Currently, only MSI-X\n" 613 + " capable devices are supported, and it is assumed that drivers will\n" 614 + " use MSI-X. This parameter can be specified multiple times to pass\n" 615 + " through multiple PCI devices to UML.\n\n" 616 + ); 617 + 618 + static int __init uml_vfio_init(void) 619 + { 620 + struct uml_vfio_device *dev, *n; 621 + 622 + sigio_broken(); 623 + 624 + /* If the opening fails, the device will be released. */ 625 + list_for_each_entry_safe(dev, n, &uml_vfio_devices, list) 626 + uml_vfio_open_device(dev); 627 + 628 + return 0; 629 + } 630 + late_initcall(uml_vfio_init); 631 + 632 + static void __exit uml_vfio_exit(void) 633 + { 634 + struct uml_vfio_device *dev, *n; 635 + 636 + list_for_each_entry_safe(dev, n, &uml_vfio_devices, list) 637 + uml_vfio_release_device(dev); 638 + 639 + if (uml_vfio_container.fd >= 0) 640 + os_close_file(uml_vfio_container.fd); 641 + } 642 + module_exit(uml_vfio_exit);
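uml_vfio_find_capability() above walks the standard PCI capability linked list through config space, bounded by a TTL against malformed (looped) lists. The same walk over an in-memory copy of config space, as a self-contained sketch:

```c
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34	/* offset of the first-capability pointer */
#define PCI_CAP_ID_MSIX 0x11
#define PCI_FIND_CAP_TTL 48		/* bound the walk against looped lists */

/* Sketch of the capability walk uml_vfio_find_capability() performs,
 * here over a plain in-memory copy of config space instead of going
 * through the VFIO config-space region. */
static uint8_t find_capability(const uint8_t *cfg, uint8_t cap)
{
	uint8_t pos = cfg[PCI_CAPABILITY_LIST];
	int ttl = PCI_FIND_CAP_TTL;

	while (pos && ttl--) {
		uint8_t id = cfg[pos];		/* capability ID byte */

		if (id == 0xff)			/* device returned all-ones */
			break;
		if (id == cap)
			return pos;
		pos = cfg[pos + 1];		/* next-capability pointer */
	}
	return 0;
}
```

Returning 0 as "not found" is safe because offset 0 in config space is the vendor ID, never a capability.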
+327
arch/um/drivers/vfio_user.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2025 Ant Group 4 + * Author: Tiwei Bie <tiwei.btw@antgroup.com> 5 + */ 6 + #include <errno.h> 7 + #include <fcntl.h> 8 + #include <unistd.h> 9 + #include <stdio.h> 10 + #include <stdint.h> 11 + #include <stdlib.h> 12 + #include <string.h> 13 + #include <sys/ioctl.h> 14 + #include <sys/eventfd.h> 15 + #include <linux/limits.h> 16 + #include <linux/vfio.h> 17 + #include <linux/pci_regs.h> 18 + #include <as-layout.h> 19 + #include <um_malloc.h> 20 + 21 + #include "vfio_user.h" 22 + 23 + int uml_vfio_user_open_container(void) 24 + { 25 + int r, fd; 26 + 27 + fd = open("/dev/vfio/vfio", O_RDWR); 28 + if (fd < 0) 29 + return -errno; 30 + 31 + r = ioctl(fd, VFIO_GET_API_VERSION); 32 + if (r != VFIO_API_VERSION) { 33 + r = r < 0 ? -errno : -EINVAL; 34 + goto error; 35 + } 36 + 37 + r = ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU); 38 + if (r <= 0) { 39 + r = r < 0 ? -errno : -EINVAL; 40 + goto error; 41 + } 42 + 43 + return fd; 44 + 45 + error: 46 + close(fd); 47 + return r; 48 + } 49 + 50 + int uml_vfio_user_setup_iommu(int container) 51 + { 52 + /* 53 + * This is a bit tricky. See the big comment in 54 + * vhost_user_set_mem_table() in virtio_uml.c. 
55 + */ 56 + unsigned long reserved = uml_reserved - uml_physmem; 57 + struct vfio_iommu_type1_dma_map dma_map = { 58 + .argsz = sizeof(dma_map), 59 + .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE, 60 + .vaddr = uml_reserved, 61 + .iova = reserved, 62 + .size = physmem_size - reserved, 63 + }; 64 + 65 + if (ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0) 66 + return -errno; 67 + 68 + if (ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map) < 0) 69 + return -errno; 70 + 71 + return 0; 72 + } 73 + 74 + int uml_vfio_user_get_group_id(const char *device) 75 + { 76 + char *path, *buf, *end; 77 + const char *name; 78 + int r; 79 + 80 + path = uml_kmalloc(PATH_MAX, UM_GFP_KERNEL); 81 + if (!path) 82 + return -ENOMEM; 83 + 84 + sprintf(path, "/sys/bus/pci/devices/%s/iommu_group", device); 85 + 86 + buf = uml_kmalloc(PATH_MAX + 1, UM_GFP_KERNEL); 87 + if (!buf) { 88 + r = -ENOMEM; 89 + goto free_path; 90 + } 91 + 92 + r = readlink(path, buf, PATH_MAX); 93 + if (r < 0) { 94 + r = -errno; 95 + goto free_buf; 96 + } 97 + buf[r] = '\0'; 98 + 99 + name = basename(buf); 100 + 101 + r = strtoul(name, &end, 10); 102 + if (*end != '\0' || end == name) { 103 + r = -EINVAL; 104 + goto free_buf; 105 + } 106 + 107 + free_buf: 108 + kfree(buf); 109 + free_path: 110 + kfree(path); 111 + return r; 112 + } 113 + 114 + int uml_vfio_user_open_group(int group_id) 115 + { 116 + char *path; 117 + int fd; 118 + 119 + path = uml_kmalloc(PATH_MAX, UM_GFP_KERNEL); 120 + if (!path) 121 + return -ENOMEM; 122 + 123 + sprintf(path, "/dev/vfio/%d", group_id); 124 + 125 + fd = open(path, O_RDWR); 126 + if (fd < 0) { 127 + fd = -errno; 128 + goto out; 129 + } 130 + 131 + out: 132 + kfree(path); 133 + return fd; 134 + } 135 + 136 + int uml_vfio_user_set_container(int container, int group) 137 + { 138 + if (ioctl(group, VFIO_GROUP_SET_CONTAINER, &container) < 0) 139 + return -errno; 140 + return 0; 141 + } 142 + 143 + int uml_vfio_user_unset_container(int container, int group) 144 + { 145 + if 
(ioctl(group, VFIO_GROUP_UNSET_CONTAINER, &container) < 0) 146 + return -errno; 147 + return 0; 148 + } 149 + 150 + static int vfio_set_irqs(int device, int start, int count, int *irqfd) 151 + { 152 + struct vfio_irq_set *irq_set; 153 + int argsz = sizeof(*irq_set) + sizeof(*irqfd) * count; 154 + int err = 0; 155 + 156 + irq_set = uml_kmalloc(argsz, UM_GFP_KERNEL); 157 + if (!irq_set) 158 + return -ENOMEM; 159 + 160 + irq_set->argsz = argsz; 161 + irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER; 162 + irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX; 163 + irq_set->start = start; 164 + irq_set->count = count; 165 + memcpy(irq_set->data, irqfd, sizeof(*irqfd) * count); 166 + 167 + if (ioctl(device, VFIO_DEVICE_SET_IRQS, irq_set) < 0) { 168 + err = -errno; 169 + goto out; 170 + } 171 + 172 + out: 173 + kfree(irq_set); 174 + return err; 175 + } 176 + 177 + int uml_vfio_user_setup_device(struct uml_vfio_user_device *dev, 178 + int group, const char *device) 179 + { 180 + struct vfio_device_info device_info = { .argsz = sizeof(device_info) }; 181 + struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) }; 182 + int err, i; 183 + 184 + dev->device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, device); 185 + if (dev->device < 0) 186 + return -errno; 187 + 188 + if (ioctl(dev->device, VFIO_DEVICE_GET_INFO, &device_info) < 0) { 189 + err = -errno; 190 + goto close_device; 191 + } 192 + 193 + dev->num_regions = device_info.num_regions; 194 + if (dev->num_regions > VFIO_PCI_CONFIG_REGION_INDEX + 1) 195 + dev->num_regions = VFIO_PCI_CONFIG_REGION_INDEX + 1; 196 + 197 + dev->region = uml_kmalloc(sizeof(*dev->region) * dev->num_regions, 198 + UM_GFP_KERNEL); 199 + if (!dev->region) { 200 + err = -ENOMEM; 201 + goto close_device; 202 + } 203 + 204 + for (i = 0; i < dev->num_regions; i++) { 205 + struct vfio_region_info region = { 206 + .argsz = sizeof(region), 207 + .index = i, 208 + }; 209 + if (ioctl(dev->device, VFIO_DEVICE_GET_REGION_INFO, &region) < 0) { 210 
+ err = -errno; 211 + goto free_region; 212 + } 213 + dev->region[i].size = region.size; 214 + dev->region[i].offset = region.offset; 215 + } 216 + 217 + /* Only MSI-X is supported currently. */ 218 + irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX; 219 + if (ioctl(dev->device, VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0) { 220 + err = -errno; 221 + goto free_region; 222 + } 223 + 224 + dev->irq_count = irq_info.count; 225 + 226 + dev->irqfd = uml_kmalloc(sizeof(int) * dev->irq_count, UM_GFP_KERNEL); 227 + if (!dev->irqfd) { 228 + err = -ENOMEM; 229 + goto free_region; 230 + } 231 + 232 + memset(dev->irqfd, -1, sizeof(int) * dev->irq_count); 233 + 234 + err = vfio_set_irqs(dev->device, 0, dev->irq_count, dev->irqfd); 235 + if (err) 236 + goto free_irqfd; 237 + 238 + return 0; 239 + 240 + free_irqfd: 241 + kfree(dev->irqfd); 242 + free_region: 243 + kfree(dev->region); 244 + close_device: 245 + close(dev->device); 246 + return err; 247 + } 248 + 249 + void uml_vfio_user_teardown_device(struct uml_vfio_user_device *dev) 250 + { 251 + kfree(dev->irqfd); 252 + kfree(dev->region); 253 + close(dev->device); 254 + } 255 + 256 + int uml_vfio_user_activate_irq(struct uml_vfio_user_device *dev, int index) 257 + { 258 + int irqfd; 259 + 260 + irqfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); 261 + if (irqfd < 0) 262 + return -errno; 263 + 264 + dev->irqfd[index] = irqfd; 265 + return irqfd; 266 + } 267 + 268 + void uml_vfio_user_deactivate_irq(struct uml_vfio_user_device *dev, int index) 269 + { 270 + close(dev->irqfd[index]); 271 + dev->irqfd[index] = -1; 272 + } 273 + 274 + int uml_vfio_user_update_irqs(struct uml_vfio_user_device *dev) 275 + { 276 + return vfio_set_irqs(dev->device, 0, dev->irq_count, dev->irqfd); 277 + } 278 + 279 + static int vfio_region_read(struct uml_vfio_user_device *dev, unsigned int index, 280 + uint64_t offset, void *buf, uint64_t size) 281 + { 282 + if (index >= dev->num_regions || offset + size > dev->region[index].size) 283 + return -EINVAL; 284 + 285 + if 
(pread(dev->device, buf, size, dev->region[index].offset + offset) < 0) 286 + return -errno; 287 + 288 + return 0; 289 + } 290 + 291 + static int vfio_region_write(struct uml_vfio_user_device *dev, unsigned int index, 292 + uint64_t offset, const void *buf, uint64_t size) 293 + { 294 + if (index >= dev->num_regions || offset + size > dev->region[index].size) 295 + return -EINVAL; 296 + 297 + if (pwrite(dev->device, buf, size, dev->region[index].offset + offset) < 0) 298 + return -errno; 299 + 300 + return 0; 301 + } 302 + 303 + int uml_vfio_user_cfgspace_read(struct uml_vfio_user_device *dev, 304 + unsigned int offset, void *buf, int size) 305 + { 306 + return vfio_region_read(dev, VFIO_PCI_CONFIG_REGION_INDEX, 307 + offset, buf, size); 308 + } 309 + 310 + int uml_vfio_user_cfgspace_write(struct uml_vfio_user_device *dev, 311 + unsigned int offset, const void *buf, int size) 312 + { 313 + return vfio_region_write(dev, VFIO_PCI_CONFIG_REGION_INDEX, 314 + offset, buf, size); 315 + } 316 + 317 + int uml_vfio_user_bar_read(struct uml_vfio_user_device *dev, int bar, 318 + unsigned int offset, void *buf, int size) 319 + { 320 + return vfio_region_read(dev, bar, offset, buf, size); 321 + } 322 + 323 + int uml_vfio_user_bar_write(struct uml_vfio_user_device *dev, int bar, 324 + unsigned int offset, const void *buf, int size) 325 + { 326 + return vfio_region_write(dev, bar, offset, buf, size); 327 + }
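The irqfd plumbing above (eventfd() in uml_vfio_user_activate_irq(), the read loop in uml_vfio_interrupt()) relies on eventfd semantics: each write adds to a 64-bit counter, and a read returns the accumulated value and resets it. A standalone sketch of the drain side (`drain_irqfd` is our name; the kernel handler additionally retries on -EINTR):

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Drain a non-blocking eventfd the way uml_vfio_interrupt() does:
 * keep reading until EAGAIN; each successful read returns the
 * accumulated counter and resets it to zero. */
static uint64_t drain_irqfd(int irqfd)
{
	uint64_t v, fired = 0;

	while (read(irqfd, &v, sizeof(v)) == sizeof(v))
		fired += v;
	return fired;
}
```

Because signals coalesce into the counter, one wakeup can account for several device interrupts, which is why the real handler calls generic_handle_irq() per successful read rather than per eventfd write.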
+44
arch/um/drivers/vfio_user.h
···
1 + /* SPDX-License-Identifier: GPL-2.0 */
2 + #ifndef __UM_VFIO_USER_H
3 + #define __UM_VFIO_USER_H
4 + 
5 + struct uml_vfio_user_device {
6 + 	int device;
7 + 
8 + 	struct {
9 + 		uint64_t size;
10 + 		uint64_t offset;
11 + 	} *region;
12 + 	int num_regions;
13 + 
14 + 	int32_t *irqfd;
15 + 	int irq_count;
16 + };
17 + 
18 + int uml_vfio_user_open_container(void);
19 + int uml_vfio_user_setup_iommu(int container);
20 + 
21 + int uml_vfio_user_get_group_id(const char *device);
22 + int uml_vfio_user_open_group(int group_id);
23 + int uml_vfio_user_set_container(int container, int group);
24 + int uml_vfio_user_unset_container(int container, int group);
25 + 
26 + int uml_vfio_user_setup_device(struct uml_vfio_user_device *dev,
27 + 			       int group, const char *device);
28 + void uml_vfio_user_teardown_device(struct uml_vfio_user_device *dev);
29 + 
30 + int uml_vfio_user_activate_irq(struct uml_vfio_user_device *dev, int index);
31 + void uml_vfio_user_deactivate_irq(struct uml_vfio_user_device *dev, int index);
32 + int uml_vfio_user_update_irqs(struct uml_vfio_user_device *dev);
33 + 
34 + int uml_vfio_user_cfgspace_read(struct uml_vfio_user_device *dev,
35 + 				unsigned int offset, void *buf, int size);
36 + int uml_vfio_user_cfgspace_write(struct uml_vfio_user_device *dev,
37 + 				 unsigned int offset, const void *buf, int size);
38 + 
39 + int uml_vfio_user_bar_read(struct uml_vfio_user_device *dev, int bar,
40 + 			   unsigned int offset, void *buf, int size);
41 + int uml_vfio_user_bar_write(struct uml_vfio_user_device *dev, int bar,
42 + 			    unsigned int offset, const void *buf, int size);
43 + 
44 + #endif /* __UM_VFIO_USER_H */

+5 -10
arch/um/drivers/virt-pci.c
··· 538 538 539 539 static int __init um_pci_init(void) 540 540 { 541 - struct irq_domain_info inner_domain_info = { 542 - .size = MAX_MSI_VECTORS, 543 - .hwirq_max = MAX_MSI_VECTORS, 544 - .ops = &um_pci_inner_domain_ops, 545 - }; 546 541 int err, i; 547 542 548 543 WARN_ON(logic_iomem_add_region(&virt_cfgspace_resource, ··· 559 564 goto free; 560 565 } 561 566 562 - inner_domain_info.fwnode = um_pci_fwnode; 563 - um_pci_inner_domain = irq_domain_instantiate(&inner_domain_info); 564 - if (IS_ERR(um_pci_inner_domain)) { 565 - err = PTR_ERR(um_pci_inner_domain); 567 + um_pci_inner_domain = irq_domain_create_linear(um_pci_fwnode, MAX_MSI_VECTORS, 568 + &um_pci_inner_domain_ops, NULL); 569 + if (!um_pci_inner_domain) { 570 + err = -ENOMEM; 566 571 goto free; 567 572 } 568 573 ··· 597 602 return 0; 598 603 599 604 free: 600 - if (!IS_ERR_OR_NULL(um_pci_inner_domain)) 605 + if (um_pci_inner_domain) 601 606 irq_domain_remove(um_pci_inner_domain); 602 607 if (um_pci_fwnode) 603 608 irq_domain_free_fwnode(um_pci_fwnode);
+4 -7
arch/um/drivers/xterm.c
··· 81 81 " '<switch> command arg1 arg2 ...'.\n" 82 82 " The default values are 'xterm=" CONFIG_XTERM_CHAN_DEFAULT_EMULATOR 83 83 ",-T,-e'.\n" 84 - " Values for gnome-terminal are 'xterm=gnome-terminal,-t,-x'.\n\n" 84 + " Values for gnome-terminal are 'xterm=gnome-terminal,-t,--'.\n\n" 85 85 ); 86 86 87 87 static int xterm_open(int input, int output, int primary, void *d, ··· 97 97 if (access(argv[4], X_OK) < 0) 98 98 argv[4] = "port-helper"; 99 99 100 - /* 101 - * Check that DISPLAY is set, this doesn't guarantee the xterm 102 - * will work but w/o it we can be pretty sure it won't. 103 - */ 104 - if (getenv("DISPLAY") == NULL) { 105 - printk(UM_KERN_ERR "xterm_open: $DISPLAY not set.\n"); 100 + /* Ensure we are running on Xorg or Wayland. */ 101 + if (!getenv("DISPLAY") && !getenv("WAYLAND_DISPLAY")) { 102 + printk(UM_KERN_ERR "xterm_open : neither $DISPLAY nor $WAYLAND_DISPLAY is set.\n"); 106 103 return -ENODEV; 107 104 } 108 105
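The updated check in xterm_open() above treats either environment variable as evidence of a graphical session. The equivalent userspace test, as a sketch (`have_display` is our name):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Mirror of the patched xterm_open() test: an X11 session exports
 * DISPLAY, a Wayland session exports WAYLAND_DISPLAY. Neither being
 * set makes launching a terminal emulator pointless. */
static bool have_display(void)
{
	return getenv("DISPLAY") != NULL || getenv("WAYLAND_DISPLAY") != NULL;
}
```

As the original comment noted for DISPLAY alone, this does not guarantee that spawning an xterm will succeed; it only rules out the case where it certainly cannot.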
+5
arch/um/include/asm/asm-prototypes.h
··· 1 1 #include <asm-generic/asm-prototypes.h> 2 + #include <asm/checksum.h> 3 + 4 + #ifdef CONFIG_UML_X86 5 + extern void cmpxchg8b_emu(void); 6 + #endif
+3 -2
arch/um/include/asm/irq.h
··· 13 13 #define TELNETD_IRQ 8 14 14 #define XTERM_IRQ 9 15 15 #define RANDOM_IRQ 10 16 + #define SIGCHLD_IRQ 11 16 17 17 18 #ifdef CONFIG_UML_NET_VECTOR 18 19 19 - #define VECTOR_BASE_IRQ (RANDOM_IRQ + 1) 20 + #define VECTOR_BASE_IRQ (SIGCHLD_IRQ + 1) 20 21 #define VECTOR_IRQ_SPACE 8 21 22 22 23 #define UM_FIRST_DYN_IRQ (VECTOR_IRQ_SPACE + VECTOR_BASE_IRQ) 23 24 24 25 #else 25 26 26 - #define UM_FIRST_DYN_IRQ (RANDOM_IRQ + 1) 27 + #define UM_FIRST_DYN_IRQ (SIGCHLD_IRQ + 1) 27 28 28 29 #endif 29 30
+3
arch/um/include/asm/mmu.h
··· 6 6 #ifndef __ARCH_UM_MMU_H 7 7 #define __ARCH_UM_MMU_H 8 8 9 + #include "linux/types.h" 9 10 #include <mm_id.h> 10 11 11 12 typedef struct mm_context { 12 13 struct mm_id id; 14 + 15 + struct list_head list; 13 16 14 17 /* Address range in need of a TLB sync */ 15 18 unsigned long sync_tlb_range_from;
+4
arch/um/include/shared/common-offsets.h
··· 14 14 15 15 DEFINE(UM_NSEC_PER_SEC, NSEC_PER_SEC); 16 16 DEFINE(UM_NSEC_PER_USEC, NSEC_PER_USEC); 17 + 18 + DEFINE(UM_KERN_GDT_ENTRY_TLS_ENTRIES, GDT_ENTRY_TLS_ENTRIES); 19 + 20 + DEFINE(UM_SECCOMP_ARCH_NATIVE, SECCOMP_ARCH_NATIVE);
+2
arch/um/include/shared/irq_user.h
··· 17 17 struct siginfo; 18 18 extern void sigio_handler(int sig, struct siginfo *unused_si, 19 19 struct uml_pt_regs *regs, void *mc); 20 + extern void sigchld_handler(int sig, struct siginfo *unused_si, 21 + struct uml_pt_regs *regs, void *mc); 20 22 void sigio_run_timetravel_handlers(void); 21 23 extern void free_irq_by_fd(int fd); 22 24 extern void deactivate_fd(int fd, int irqnum);
-69
arch/um/include/shared/net_kern.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 2 - /* 3 - * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #ifndef __UM_NET_KERN_H 7 - #define __UM_NET_KERN_H 8 - 9 - #include <linux/netdevice.h> 10 - #include <linux/platform_device.h> 11 - #include <linux/skbuff.h> 12 - #include <linux/socket.h> 13 - #include <linux/list.h> 14 - #include <linux/workqueue.h> 15 - 16 - struct uml_net { 17 - struct list_head list; 18 - struct net_device *dev; 19 - struct platform_device pdev; 20 - int index; 21 - }; 22 - 23 - struct uml_net_private { 24 - struct list_head list; 25 - spinlock_t lock; 26 - struct net_device *dev; 27 - struct timer_list tl; 28 - 29 - struct work_struct work; 30 - int fd; 31 - unsigned char mac[ETH_ALEN]; 32 - int max_packet; 33 - unsigned short (*protocol)(struct sk_buff *); 34 - int (*open)(void *); 35 - void (*close)(int, void *); 36 - void (*remove)(void *); 37 - int (*read)(int, struct sk_buff *skb, struct uml_net_private *); 38 - int (*write)(int, struct sk_buff *skb, struct uml_net_private *); 39 - 40 - void (*add_address)(unsigned char *, unsigned char *, void *); 41 - void (*delete_address)(unsigned char *, unsigned char *, void *); 42 - char user[]; 43 - }; 44 - 45 - struct net_kern_info { 46 - void (*init)(struct net_device *, void *); 47 - unsigned short (*protocol)(struct sk_buff *); 48 - int (*read)(int, struct sk_buff *skb, struct uml_net_private *); 49 - int (*write)(int, struct sk_buff *skb, struct uml_net_private *); 50 - }; 51 - 52 - struct transport { 53 - struct list_head list; 54 - const char *name; 55 - int (* const setup)(char *, char **, void *); 56 - const struct net_user_info *user; 57 - const struct net_kern_info *kern; 58 - const int private_size; 59 - const int setup_size; 60 - }; 61 - 62 - extern int tap_setup_common(char *str, char *type, char **dev_name, 63 - char **mac_out, char **gate_addr); 64 - extern void register_transport(struct transport *new); 65 - extern unsigned short eth_protocol(struct sk_buff *skb); 66 - extern void uml_net_setup_etheraddr(struct net_device *dev, char *str); 67 - 68 - 69 - #endif
-52
arch/um/include/shared/net_user.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 2 - /* 3 - * Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #ifndef __UM_NET_USER_H__ 7 - #define __UM_NET_USER_H__ 8 - 9 - #define ETH_ADDR_LEN (6) 10 - #define ETH_HEADER_ETHERTAP (16) 11 - #define ETH_HEADER_OTHER (26) /* 14 for ethernet + VLAN + MPLS for crazy people */ 12 - #define ETH_MAX_PACKET (1500) 13 - 14 - #define UML_NET_VERSION (4) 15 - 16 - struct net_user_info { 17 - int (*init)(void *, void *); 18 - int (*open)(void *); 19 - void (*close)(int, void *); 20 - void (*remove)(void *); 21 - void (*add_address)(unsigned char *, unsigned char *, void *); 22 - void (*delete_address)(unsigned char *, unsigned char *, void *); 23 - int max_packet; 24 - int mtu; 25 - }; 26 - 27 - extern void iter_addresses(void *d, void (*cb)(unsigned char *, 28 - unsigned char *, void *), 29 - void *arg); 30 - 31 - extern void *get_output_buffer(int *len_out); 32 - extern void free_output_buffer(void *buffer); 33 - 34 - extern int tap_open_common(void *dev, char *gate_addr); 35 - extern void tap_check_ips(char *gate_addr, unsigned char *eth_addr); 36 - 37 - extern void read_output(int fd, char *output_out, int len); 38 - 39 - extern int net_read(int fd, void *buf, int len); 40 - extern int net_recvfrom(int fd, void *buf, int len); 41 - extern int net_write(int fd, void *buf, int len); 42 - extern int net_send(int fd, void *buf, int len); 43 - extern int net_sendto(int fd, void *buf, int len, void *to, int sock_len); 44 - 45 - extern void open_addr(unsigned char *addr, unsigned char *netmask, void *arg); 46 - extern void close_addr(unsigned char *addr, unsigned char *netmask, void *arg); 47 - 48 - extern char *split_if_spec(char *str, ...); 49 - 50 - extern int dev_netmask(void *d, void *m); 51 - 52 - #endif
+2 -2
arch/um/include/shared/os.h
··· 143 143 extern int os_set_exec_close(int fd); 144 144 extern int os_ioctl_generic(int fd, unsigned int cmd, unsigned long arg); 145 145 extern int os_get_ifname(int fd, char *namebuf); 146 - extern int os_set_slip(int fd); 147 146 extern int os_mode_fd(int fd, int mode); 148 147 149 148 extern int os_seek_file(int fd, unsigned long long offset); ··· 197 198 extern void report_enomem(void); 198 199 199 200 /* process.c */ 201 + pid_t os_reap_child(void); 200 202 extern void os_alarm_process(int pid); 201 203 extern void os_kill_process(int pid, int reap_child); 202 204 extern void os_kill_ptraced_process(int pid, int reap_child); ··· 286 286 287 287 /* skas/process.c */ 288 288 extern int is_skas_winch(int pid, int fd, void *data); 289 - extern int start_userspace(unsigned long stub_stack); 289 + extern int start_userspace(struct mm_id *mm_id); 290 290 extern void userspace(struct uml_pt_regs *regs); 291 291 extern void new_thread(void *stack, jmp_buf *buf, void (*handler)(void)); 292 292 extern void switch_threads(jmp_buf *me, jmp_buf *you);
+9
arch/um/include/shared/skas/mm_id.h
··· 6 6 #ifndef __MM_ID_H 7 7 #define __MM_ID_H 8 8 9 + #define STUB_MAX_FDS 4 10 + 9 11 struct mm_id { 10 12 int pid; 11 13 unsigned long stack; 12 14 int syscall_data_len; 15 + 16 + /* Only used with SECCOMP mode */ 17 + int sock; 18 + int syscall_fd_num; 19 + int syscall_fd_map[STUB_MAX_FDS]; 13 20 }; 14 21 15 22 void __switch_mm(struct mm_id *mm_idp); 23 + 24 + void notify_mm_kill(int pid); 16 25 17 26 #endif
+1
arch/um/include/shared/skas/skas.h
··· 8 8 9 9 #include <sysdep/ptrace.h> 10 10 11 + extern int using_seccomp; 11 12 extern int userspace_pid[]; 12 13 13 14 extern void new_thread_handler(void);
+19 -1
arch/um/include/shared/skas/stub-data.h
··· 11 11 #include <linux/compiler_types.h> 12 12 #include <as-layout.h> 13 13 #include <sysdep/tls.h> 14 + #include <sysdep/stub-data.h> 15 + #include <mm_id.h> 16 + 17 + #define FUTEX_IN_CHILD 0 18 + #define FUTEX_IN_KERN 1 14 19 15 20 struct stub_init_data { 21 + int seccomp; 22 + 16 23 unsigned long stub_start; 17 24 18 25 int stub_code_fd; ··· 27 20 int stub_data_fd; 28 21 unsigned long stub_data_offset; 29 22 30 - unsigned long segv_handler; 23 + unsigned long signal_handler; 24 + unsigned long signal_restorer; 31 25 }; 32 26 33 27 #define STUB_NEXT_SYSCALL(s) \ ··· 59 51 int syscall_data_len; 60 52 /* 128 leaves enough room for additional fields in the struct */ 61 53 struct stub_syscall syscall_data[(UM_KERN_PAGE_SIZE - 128) / sizeof(struct stub_syscall)] __aligned(16); 54 + 55 + /* data shared with signal handler (only used in seccomp mode) */ 56 + short restart_wait; 57 + unsigned int futex; 58 + int signal; 59 + unsigned short si_offset; 60 + unsigned short mctx_offset; 61 + 62 + /* seccomp architecture specific state restore */ 63 + struct stub_data_arch arch_data; 62 64 63 65 /* Stack for our signal handlers and for calling into . */ 64 66 unsigned char sigstack[UM_KERN_PAGE_SIZE] __aligned(UM_KERN_PAGE_SIZE);
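The new futex word is the handshake cell between the stub and the UML kernel: the stub parks in FUTEX_WAIT while the word reads FUTEX_IN_KERN, restarts on EINTR, and treats EAGAIN (the word already changed) as an immediate release. That error handling can be exercised in isolation with the host futex syscall; this sketch uses libc syscall() semantics (-1 plus errno rather than the stub's raw -errno returns), and wait_for_stub_release is an illustrative name:

```c
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#define FUTEX_IN_CHILD 0
#define FUTEX_IN_KERN  1

/* Wait until *futex_word leaves FUTEX_IN_KERN. FUTEX_WAIT fails with
 * EAGAIN if the word no longer holds the expected value, which callers
 * treat as "already released"; EINTR simply restarts the wait. */
static int wait_for_stub_release(uint32_t *futex_word)
{
	long res;

	do {
		res = syscall(SYS_futex, futex_word, FUTEX_WAIT,
			      FUTEX_IN_KERN, NULL, NULL, 0);
	} while (res == -1 && errno == EINTR);

	if (res == -1 && errno != EAGAIN)
		return -errno;
	return 0;
}
```

With the word at FUTEX_IN_CHILD the call returns immediately via the EAGAIN path instead of blocking, which is exactly why the stub's loops tolerate that error.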
-1
arch/um/kernel/Makefile
··· 25 25 obj-$(CONFIG_OF) += dtb.o 26 26 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o 27 27 obj-$(CONFIG_STACKTRACE) += stacktrace.o 28 - obj-$(CONFIG_GENERIC_PCI_IOMAP) += ioport.o 29 28 30 29 USER_OBJS := config.o 31 30
-13
arch/um/kernel/ioport.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2021 Intel Corporation 4 - * Author: Johannes Berg <johannes@sipsolutions.net> 5 - */ 6 - #include <asm/iomap.h> 7 - #include <asm-generic/pci_iomap.h> 8 - 9 - void __iomem *__pci_ioport_map(struct pci_dev *dev, unsigned long port, 10 - unsigned int nr) 11 - { 12 - return NULL; 13 - }
+6
arch/um/kernel/irq.c
··· 690 690 /* Initialize EPOLL Loop */ 691 691 os_setup_epoll(); 692 692 } 693 + 694 + void sigchld_handler(int sig, struct siginfo *unused_si, 695 + struct uml_pt_regs *regs, void *mc) 696 + { 697 + do_IRQ(SIGCHLD_IRQ, regs); 698 + }
+82 -9
arch/um/kernel/skas/mmu.c
··· 8 8 #include <linux/sched/signal.h> 9 9 #include <linux/slab.h> 10 10 11 + #include <shared/irq_kern.h> 11 12 #include <asm/pgalloc.h> 12 13 #include <asm/sections.h> 13 14 #include <asm/mmu_context.h> ··· 19 18 20 19 /* Ensure the stub_data struct covers the allocated area */ 21 20 static_assert(sizeof(struct stub_data) == STUB_DATA_PAGES * UM_KERN_PAGE_SIZE); 21 + 22 + spinlock_t mm_list_lock; 23 + struct list_head mm_list; 22 24 23 25 int init_new_context(struct task_struct *task, struct mm_struct *mm) 24 26 { ··· 35 31 36 32 new_id->stack = stack; 37 33 38 - block_signals_trace(); 39 - new_id->pid = start_userspace(stack); 40 - unblock_signals_trace(); 41 - 42 - if (new_id->pid < 0) { 43 - ret = new_id->pid; 44 - goto out_free; 34 + scoped_guard(spinlock_irqsave, &mm_list_lock) { 35 + /* Insert into list, used for lookups when the child dies */ 36 + list_add(&mm->context.list, &mm_list); 45 37 } 38 + 39 + ret = start_userspace(new_id); 40 + if (ret < 0) 41 + goto out_free; 46 42 47 43 /* Ensure the new MM is clean and nothing unwanted is mapped */ 48 44 unmap(new_id, 0, STUB_START); ··· 64 60 * zero, resulting in a kill(0), which will result in the 65 61 * whole UML suddenly dying. Also, cover negative and 66 62 * 1 cases, since they shouldn't happen either. 63 + * 64 + * Negative cases happen if the child died unexpectedly. 
67 65 */ 68 - if (mmu->id.pid < 2) { 66 + if (mmu->id.pid >= 0 && mmu->id.pid < 2) { 69 67 printk(KERN_ERR "corrupt mm_context - pid = %d\n", 70 68 mmu->id.pid); 71 69 return; 72 70 } 73 - os_kill_ptraced_process(mmu->id.pid, 1); 71 + 72 + if (mmu->id.pid > 0) { 73 + os_kill_ptraced_process(mmu->id.pid, 1); 74 + mmu->id.pid = -1; 75 + } 76 + 77 + if (using_seccomp && mmu->id.sock) 78 + os_close_file(mmu->id.sock); 74 79 75 80 free_pages(mmu->id.stack, ilog2(STUB_DATA_PAGES)); 81 + 82 + guard(spinlock_irqsave)(&mm_list_lock); 83 + 84 + list_del(&mm->context.list); 76 85 } 86 + 87 + static irqreturn_t mm_sigchld_irq(int irq, void* dev) 88 + { 89 + struct mm_context *mm_context; 90 + pid_t pid; 91 + 92 + guard(spinlock)(&mm_list_lock); 93 + 94 + while ((pid = os_reap_child()) > 0) { 95 + /* 96 + * A child died, check if we have an MM with the PID. This is 97 + * only relevant in SECCOMP mode (as ptrace will fail anyway). 98 + * 99 + * See wait_stub_done_seccomp for more details. 100 + */ 101 + list_for_each_entry(mm_context, &mm_list, list) { 102 + if (mm_context->id.pid == pid) { 103 + struct stub_data *stub_data; 104 + printk("Unexpectedly lost MM child! Affected tasks will segfault."); 105 + 106 + /* Marks the MM as dead */ 107 + mm_context->id.pid = -1; 108 + 109 + /* 110 + * NOTE: If SMP is implemented, a futex_wake 111 + * needs to be added here. 112 + */ 113 + stub_data = (void *)mm_context->id.stack; 114 + stub_data->futex = FUTEX_IN_KERN; 115 + 116 + /* 117 + * NOTE: Currently executing syscalls by 118 + * affected tasks may finish normally. 
119 + */ 120 + break; 121 + } 122 + } 123 + } 124 + 125 + return IRQ_HANDLED; 126 + } 127 + 128 + static int __init init_child_tracking(void) 129 + { 130 + int err; 131 + 132 + spin_lock_init(&mm_list_lock); 133 + INIT_LIST_HEAD(&mm_list); 134 + 135 + err = request_irq(SIGCHLD_IRQ, mm_sigchld_irq, 0, "SIGCHLD", NULL); 136 + if (err < 0) 137 + panic("Failed to register SIGCHLD IRQ: %d", err); 138 + 139 + return 0; 140 + } 141 + early_initcall(init_child_tracking)
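The IRQ handler above drains every dead child before matching PIDs against the MM list. The os_reap_child() helper it relies on is essentially a non-blocking waitpid() loop; a minimal userspace analogue (reap_one_child is an illustrative name, and the real helper in os-Linux/process.c may differ in flags):

```c
#include <assert.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking reap of a single exited child: returns its PID, 0 when
 * children exist but none has died yet, or -1 when nothing is left. */
static pid_t reap_one_child(void)
{
	int status;

	return waitpid(-1, &status, WNOHANG);
}
```

Calling this in a loop until it returns <= 0, as mm_sigchld_irq() does, handles the case where one SIGCHLD coalesces several exits.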
+127 -5
arch/um/kernel/skas/stub.c
··· 5 5 6 6 #include <sysdep/stub.h> 7 7 8 - static __always_inline int syscall_handler(struct stub_data *d) 8 + #include <linux/futex.h> 9 + #include <sys/socket.h> 10 + #include <errno.h> 11 + 12 + /* 13 + * Known security issues 14 + * 15 + * Userspace can jump to this address to execute *any* syscall that is 16 + * permitted by the stub. As we will return afterwards, it can do 17 + * whatever it likes, including: 18 + * - Tricking the kernel into handing out the memory FD 19 + * - Using this memory FD to read/write all physical memory 20 + * - Running in parallel to the kernel processing a syscall 21 + * (possibly creating data races?) 22 + * - Blocking e.g. SIGALRM to avoid time based scheduling 23 + * 24 + * To avoid this, the permitted location for each syscall needs to be 25 + * checked for in the SECCOMP filter (which is reasonably simple). Also, 26 + * more care will need to go into considerations how the code might be 27 + * tricked by using a prepared stack (or even modifying the stack from 28 + * another thread in case SMP support is added). 29 + * 30 + * As for the SIGALRM, the best counter measure will be to check in the 31 + * kernel that the process is reporting back the SIGALRM in a timely 32 + * fashion. 
33 + */ 34 + static __always_inline int syscall_handler(int fd_map[STUB_MAX_FDS]) 9 35 { 36 + struct stub_data *d = get_stub_data(); 10 37 int i; 11 38 unsigned long res; 39 + int fd; 12 40 13 41 for (i = 0; i < d->syscall_data_len; i++) { 14 42 struct stub_syscall *sc = &d->syscall_data[i]; 15 43 16 44 switch (sc->syscall) { 17 45 case STUB_SYSCALL_MMAP: 46 + if (fd_map) 47 + fd = fd_map[sc->mem.fd]; 48 + else 49 + fd = sc->mem.fd; 50 + 18 51 res = stub_syscall6(STUB_MMAP_NR, 19 52 sc->mem.addr, sc->mem.length, 20 53 sc->mem.prot, 21 54 MAP_SHARED | MAP_FIXED, 22 - sc->mem.fd, sc->mem.offset); 55 + fd, sc->mem.offset); 23 56 if (res != sc->mem.addr) { 24 57 d->err = res; 25 58 d->syscall_data_len = i; ··· 84 51 void __section(".__syscall_stub") 85 52 stub_syscall_handler(void) 86 53 { 87 - struct stub_data *d = get_stub_data(); 88 - 89 - syscall_handler(d); 54 + syscall_handler(NULL); 90 55 91 56 trap_myself(); 57 + } 58 + 59 + void __section(".__syscall_stub") 60 + stub_signal_interrupt(int sig, siginfo_t *info, void *p) 61 + { 62 + struct stub_data *d = get_stub_data(); 63 + char rcv_data; 64 + union { 65 + char data[CMSG_SPACE(sizeof(int) * STUB_MAX_FDS)]; 66 + struct cmsghdr align; 67 + } ctrl = {}; 68 + struct iovec iov = { 69 + .iov_base = &rcv_data, 70 + .iov_len = 1, 71 + }; 72 + struct msghdr msghdr = { 73 + .msg_iov = &iov, 74 + .msg_iovlen = 1, 75 + .msg_control = &ctrl, 76 + .msg_controllen = sizeof(ctrl), 77 + }; 78 + ucontext_t *uc = p; 79 + struct cmsghdr *fd_msg; 80 + int *fd_map; 81 + int num_fds; 82 + long res; 83 + 84 + d->signal = sig; 85 + d->si_offset = (unsigned long)info - (unsigned long)&d->sigstack[0]; 86 + d->mctx_offset = (unsigned long)&uc->uc_mcontext - (unsigned long)&d->sigstack[0]; 87 + 88 + restart_wait: 89 + d->futex = FUTEX_IN_KERN; 90 + do { 91 + res = stub_syscall3(__NR_futex, (unsigned long)&d->futex, 92 + FUTEX_WAKE, 1); 93 + } while (res == -EINTR); 94 + 95 + do { 96 + res = stub_syscall4(__NR_futex, (unsigned long)&d->futex, 97 + FUTEX_WAIT, FUTEX_IN_KERN, 0); 98 + } while (res == -EINTR || d->futex == FUTEX_IN_KERN); 99 + 100 + if (res < 0 && res != -EAGAIN) 101 + stub_syscall1(__NR_exit_group, 1); 102 + 103 + if (d->syscall_data_len) { 104 + /* Read passed FDs (if any) */ 105 + do { 106 + res = stub_syscall3(__NR_recvmsg, 0, (unsigned long)&msghdr, 0); 107 + } while (res == -EINTR); 108 + 109 + /* We should never have a receive error (other than -EAGAIN) */ 110 + if (res < 0 && res != -EAGAIN) 111 + stub_syscall1(__NR_exit_group, 1); 112 + 113 + /* Receive the FDs */ 114 + num_fds = 0; 115 + fd_msg = msghdr.msg_control; 116 + fd_map = (void *)&CMSG_DATA(fd_msg); 117 + if (res == iov.iov_len && msghdr.msg_controllen > sizeof(struct cmsghdr)) 118 + num_fds = (fd_msg->cmsg_len - CMSG_LEN(0)) / sizeof(int); 119 + 120 + /* Try running queued syscalls. */ 121 + res = syscall_handler(fd_map); 122 + 123 + while (num_fds) 124 + stub_syscall2(__NR_close, fd_map[--num_fds], 0); 125 + } else { 126 + res = 0; 127 + } 128 + 129 + if (res < 0 || d->restart_wait) { 130 + /* Report SIGSYS if we restart. */ 131 + d->signal = SIGSYS; 132 + d->restart_wait = 0; 133 + 134 + goto restart_wait; 135 + } 136 + 137 + /* Restore arch dependent state that is not part of the mcontext */ 138 + stub_seccomp_restore_state(&d->arch_data); 139 + 140 + /* Return so that the host modified mcontext is restored. */ 141 + } 142 + 143 + void __section(".__syscall_stub") 144 + stub_signal_restorer(void) 145 + { 146 + /* We must not have anything on the stack when doing rt_sigreturn */ 147 + stub_syscall0(__NR_rt_sigreturn); 92 148 }
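stub_signal_interrupt() receives the FDs needed for queued mmap calls as SCM_RIGHTS ancillary data on FD 0, sizing the map as (cmsg_len - CMSG_LEN(0)) / sizeof(int). The same pairing can be sketched over a socketpair with plain libc calls (the helper names here are illustrative, not UML's):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define STUB_MAX_FDS 4

/* Hand one file descriptor to the peer as SCM_RIGHTS ancillary data. */
static int send_one_fd(int sock, int fd)
{
	char byte = 0;
	union {
		char data[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl = {};
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = &ctrl,
		.msg_controllen = sizeof(ctrl),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive up to STUB_MAX_FDS descriptors; returns how many landed in
 * fd_map, using the same cmsg_len arithmetic as the stub. */
static int recv_fd_map(int sock, int fd_map[STUB_MAX_FDS])
{
	char byte;
	union {
		char data[CMSG_SPACE(sizeof(int) * STUB_MAX_FDS)];
		struct cmsghdr align;
	} ctrl = {};
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = &ctrl,
		.msg_controllen = sizeof(ctrl),
	};
	struct cmsghdr *cmsg;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS)
		return 0;

	/* Payload length divided by FD size, as in stub_signal_interrupt() */
	memcpy(fd_map, CMSG_DATA(cmsg), cmsg->cmsg_len - CMSG_LEN(0));
	return (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int);
}
```

The received descriptors are fresh dups in the receiver, which is why the stub closes every entry of fd_map once the queued syscalls have run.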
+147 -12
arch/um/kernel/skas/stub_exe.c
··· 1 1 #include <sys/ptrace.h> 2 2 #include <sys/prctl.h> 3 + #include <sys/fcntl.h> 3 4 #include <asm/unistd.h> 4 5 #include <sysdep/stub.h> 5 6 #include <stub-data.h> 7 + #include <linux/filter.h> 8 + #include <linux/seccomp.h> 9 + #include <generated/asm-offsets.h> 6 10 7 11 void _start(void); 8 12 ··· 29 25 } sa = { 30 26 /* Need to set SA_RESTORER (but the handler never returns) */ 31 27 .sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO | 0x04000000, 32 - /* no need to mask any signals */ 33 - .sa_mask = 0, 34 28 }; 35 29 36 30 /* set a nice name */ ··· 37 35 /* Make sure this process dies if the kernel dies */ 38 36 stub_syscall2(__NR_prctl, PR_SET_PDEATHSIG, SIGKILL); 39 37 38 + /* Needed in SECCOMP mode (and safe to do anyway) */ 39 + stub_syscall5(__NR_prctl, PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); 40 + 40 41 /* read information from STDIN and close it */ 41 42 res = stub_syscall3(__NR_read, 0, 42 43 (unsigned long)&init_data, sizeof(init_data)); 43 44 if (res != sizeof(init_data)) 44 45 stub_syscall1(__NR_exit, 10); 45 46 46 - stub_syscall1(__NR_close, 0); 47 + /* In SECCOMP mode, FD 0 is a socket and is later used for FD passing */ 48 + if (!init_data.seccomp) 49 + stub_syscall1(__NR_close, 0); 50 + else 51 + stub_syscall3(__NR_fcntl, 0, F_SETFL, O_NONBLOCK); 47 52 48 53 /* map stub code + data */ 49 54 res = stub_syscall6(STUB_MMAP_NR, ··· 68 59 if (res != init_data.stub_start + UM_KERN_PAGE_SIZE) 69 60 stub_syscall1(__NR_exit, 12); 70 61 62 + /* In SECCOMP mode, we only need the signalling FD from now on */ 63 + if (init_data.seccomp) { 64 + res = stub_syscall3(__NR_close_range, 1, ~0U, 0); 65 + if (res != 0) 66 + stub_syscall1(__NR_exit, 13); 67 + } 68 + 71 69 /* setup signal stack inside stub data */ 72 70 stack.ss_sp = (void *)init_data.stub_start + UM_KERN_PAGE_SIZE; 73 71 stub_syscall2(__NR_sigaltstack, (unsigned long)&stack, 0); 74 72 75 - /* register SIGSEGV handler */ 76 - sa.sa_handler_ = (void *) init_data.segv_handler; 77 - res = stub_syscall4(__NR_rt_sigaction, SIGSEGV, (unsigned long)&sa, 0, 78 - sizeof(sa.sa_mask)); 79 - if (res != 0) 80 - stub_syscall1(__NR_exit, 13); 73 + /* register signal handlers */ 74 + sa.sa_handler_ = (void *) init_data.signal_handler; 75 + sa.sa_restorer = (void *) init_data.signal_restorer; 76 + if (!init_data.seccomp) { 77 + /* In ptrace mode, the SIGSEGV handler never returns */ 78 + sa.sa_mask = 0; 81 79 82 - stub_syscall4(__NR_ptrace, PTRACE_TRACEME, 0, 0, 0); 80 + res = stub_syscall4(__NR_rt_sigaction, SIGSEGV, 81 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 82 + if (res != 0) 83 + stub_syscall1(__NR_exit, 14); 84 + } else { 85 + /* SECCOMP mode uses rt_sigreturn, need to mask all signals */ 86 + sa.sa_mask = ~0ULL; 83 87 84 - stub_syscall2(__NR_kill, stub_syscall0(__NR_getpid), SIGSTOP); 88 + res = stub_syscall4(__NR_rt_sigaction, SIGSEGV, 89 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 90 + if (res != 0) 91 + stub_syscall1(__NR_exit, 15); 85 92 86 - stub_syscall1(__NR_exit, 14); 93 + res = stub_syscall4(__NR_rt_sigaction, SIGSYS, 94 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 95 + if (res != 0) 96 + stub_syscall1(__NR_exit, 16); 97 + 98 + res = stub_syscall4(__NR_rt_sigaction, SIGALRM, 99 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 100 + if (res != 0) 101 + stub_syscall1(__NR_exit, 17); 102 + 103 + res = stub_syscall4(__NR_rt_sigaction, SIGTRAP, 104 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 105 + if (res != 0) 106 + stub_syscall1(__NR_exit, 18); 107 + 108 + res = stub_syscall4(__NR_rt_sigaction, SIGILL, 109 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 110 + if (res != 0) 111 + stub_syscall1(__NR_exit, 19); 112 + 113 + res = stub_syscall4(__NR_rt_sigaction, SIGFPE, 114 + (unsigned long)&sa, 0, sizeof(sa.sa_mask)); 115 + if (res != 0) 116 + stub_syscall1(__NR_exit, 20); 117 + } 118 + 119 + /* 120 + * If in seccomp mode, install the SECCOMP filter and trigger a syscall. 121 + * Otherwise set PTRACE_TRACEME and do a SIGSTOP.
122 + */ 123 + if (init_data.seccomp) { 124 + struct sock_filter filter[] = { 125 + #if __BITS_PER_LONG > 32 126 + /* [0] Load upper 32bit of instruction pointer from seccomp_data */ 127 + BPF_STMT(BPF_LD | BPF_W | BPF_ABS, 128 + (offsetof(struct seccomp_data, instruction_pointer) + 4)), 129 + 130 + /* [1] Jump forward 3 instructions if the upper address is not identical */ 131 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (init_data.stub_start) >> 32, 0, 3), 132 + #endif 133 + /* [2] Load lower 32bit of instruction pointer from seccomp_data */ 134 + BPF_STMT(BPF_LD | BPF_W | BPF_ABS, 135 + (offsetof(struct seccomp_data, instruction_pointer))), 136 + 137 + /* [3] Mask out lower bits */ 138 + BPF_STMT(BPF_ALU | BPF_AND | BPF_K, 0xfffff000), 139 + 140 + /* [4] Jump to [6] if the lower bits are not on the expected page */ 141 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (init_data.stub_start) & 0xfffff000, 1, 0), 142 + 143 + /* [5] Trap call, allow */ 144 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP), 145 + 146 + /* [6,7] Check architecture */ 147 + BPF_STMT(BPF_LD | BPF_W | BPF_ABS, 148 + offsetof(struct seccomp_data, arch)), 149 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 150 + UM_SECCOMP_ARCH_NATIVE, 1, 0), 151 + 152 + /* [8] Kill (for architecture check) */ 153 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS), 154 + 155 + /* [9] Load syscall number */ 156 + BPF_STMT(BPF_LD | BPF_W | BPF_ABS, 157 + offsetof(struct seccomp_data, nr)), 158 + 159 + /* [10-16] Check against permitted syscalls */ 160 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_futex, 161 + 7, 0), 162 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_recvmsg, 163 + 6, 0), 164 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_close, 165 + 5, 0), 166 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, STUB_MMAP_NR, 167 + 4, 0), 168 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_munmap, 169 + 3, 0), 170 + #ifdef __i386__ 171 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_set_thread_area, 172 + 2, 0), 173 + #else 174 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_arch_prctl, 175 + 2, 0), 176 + #endif 177 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_rt_sigreturn, 178 + 1, 0), 179 + 180 + /* [17] Not one of the permitted syscalls */ 181 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS), 182 + 183 + /* [18] Permitted call for the stub */ 184 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW), 185 + }; 186 + struct sock_fprog prog = { 187 + .len = sizeof(filter) / sizeof(filter[0]), 188 + .filter = filter, 189 + }; 190 + 191 + if (stub_syscall3(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 192 + SECCOMP_FILTER_FLAG_TSYNC, 193 + (unsigned long)&prog) != 0) 194 + stub_syscall1(__NR_exit, 21); 195 + 196 + /* Fall through, the exit syscall will cause SIGSYS */ 197 + } else { 198 + stub_syscall4(__NR_ptrace, PTRACE_TRACEME, 0, 0, 0); 199 + 200 + stub_syscall2(__NR_kill, stub_syscall0(__NR_getpid), SIGSTOP); 201 + } 202 + 203 + stub_syscall1(__NR_exit, 30); 87 204 88 205 __builtin_unreachable(); 89 206 }
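The filter's control flow is easier to audit when restated as straight-line C. The sketch below models the decision only, not the BPF program itself: the RET_* values are stand-ins for the SECCOMP_RET_* action codes, and the x86-64 __NR_* names are assumed. A syscall issued off the stub's code page traps (SIGSYS) so it can be forwarded as a guest syscall, while on-page syscalls must match the native architecture and the short allowlist:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>

enum { RET_TRAP, RET_ALLOW, RET_KILL };

/* Plain-C model of the policy stub_exe installs (64-bit host names;
 * STUB_MMAP_NR is __NR_mmap there). */
static int stub_filter_decision(uint64_t ip, uint64_t stub_start,
				uint32_t arch, uint32_t native_arch, int nr)
{
	/* Off-page syscalls are guest syscalls: trap and forward them */
	if ((ip >> 32) != (stub_start >> 32) ||
	    ((uint32_t)ip & 0xfffff000u) != ((uint32_t)stub_start & 0xfffff000u))
		return RET_TRAP;

	/* A foreign ABI (e.g. x32) is never acceptable */
	if (arch != native_arch)
		return RET_KILL;

	/* From the stub page, only the stub's own syscalls may run */
	switch (nr) {
	case __NR_futex:
	case __NR_recvmsg:
	case __NR_close:
	case __NR_mmap:		/* STUB_MMAP_NR on 64-bit hosts */
	case __NR_munmap:
	case __NR_arch_prctl:	/* __NR_set_thread_area on i386 */
	case __NR_rt_sigreturn:
		return RET_ALLOW;
	default:
		return RET_KILL;
	}
}
```

Note the ordering matches the BPF program: the instruction-pointer test comes first, so even a wrong-architecture syscall from outside the stub page traps rather than kills, and the allowlist is only ever consulted for the stub's own page, which is what the "Known security issues" comment in stub.c is about.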
+9 -4
arch/um/kernel/time.c
··· 856 856 857 857 static irqreturn_t um_timer(int irq, void *dev) 858 858 { 859 - if (get_current()->mm != NULL) 860 - { 861 - /* userspace - relay signal, results in correct userspace timers */ 859 + /* 860 + * Interrupt the (possibly) running userspace process, technically this 861 + * should only happen if userspace is currently executing. 862 + * With infinite CPU time-travel, we can only get here when userspace 863 + * is not executing. Do not notify there and avoid spurious scheduling. 864 + */ 865 + if (time_travel_mode != TT_MODE_INFCPU && 866 + time_travel_mode != TT_MODE_EXTERNAL && 867 + get_current()->mm) 862 868 os_alarm_process(get_current()->mm->context.id.pid); 863 - } 864 869 865 870 (*timer_clockevent.event_handler)(&timer_clockevent); 866 871
+117 -13
arch/um/kernel/trap.c
··· 16 16 #include <kern_util.h> 17 17 #include <os.h> 18 18 #include <skas.h> 19 - #include <arch.h> 19 + 20 + /* 21 + * NOTE: UML does not have exception tables. As such, this is almost a copy 22 + * of the code in mm/memory.c, only adjusting the logic to simply check whether 23 + * we are coming from the kernel instead of doing an additional lookup in the 24 + * exception table. 25 + * We can do this simplification because we never get here if the exception was 26 + * fixable. 27 + */ 28 + static inline bool get_mmap_lock_carefully(struct mm_struct *mm, bool is_user) 29 + { 30 + if (likely(mmap_read_trylock(mm))) 31 + return true; 32 + 33 + if (!is_user) 34 + return false; 35 + 36 + return !mmap_read_lock_killable(mm); 37 + } 38 + 39 + static inline bool mmap_upgrade_trylock(struct mm_struct *mm) 40 + { 41 + /* 42 + * We don't have this operation yet. 43 + * 44 + * It should be easy enough to do: it's basically an 45 + * atomic_long_try_cmpxchg_acquire() 46 + * from RWSEM_READER_BIAS -> RWSEM_WRITER_LOCKED, but 47 + * it also needs the proper lockdep magic etc. 48 + */ 49 + return false; 50 + } 51 + 52 + static inline bool upgrade_mmap_lock_carefully(struct mm_struct *mm, bool is_user) 53 + { 54 + mmap_read_unlock(mm); 55 + if (!is_user) 56 + return false; 57 + 58 + return !mmap_write_lock_killable(mm); 59 + } 60 + 61 + /* 62 + * Helper for page fault handling. 63 + * 64 + * This is kind of equivalent to "mmap_read_lock()" followed 65 + * by "find_extend_vma()", except it's a lot more careful about 66 + * the locking (and will drop the lock on failure). 67 + * 68 + * For example, if we have a kernel bug that causes a page 69 + * fault, we don't want to just use mmap_read_lock() to get 70 + * the mm lock, because that would deadlock if the bug were 71 + * to happen while we're holding the mm lock for writing.
72 + * 73 + * So this checks the exception tables on kernel faults in 74 + * order to only do this all for instructions that are actually 75 + * expected to fault. 76 + * 77 + * We can also actually take the mm lock for writing if we 78 + * need to extend the vma, which helps the VM layer a lot. 79 + */ 80 + static struct vm_area_struct * 81 + um_lock_mm_and_find_vma(struct mm_struct *mm, 82 + unsigned long addr, bool is_user) 83 + { 84 + struct vm_area_struct *vma; 85 + 86 + if (!get_mmap_lock_carefully(mm, is_user)) 87 + return NULL; 88 + 89 + vma = find_vma(mm, addr); 90 + if (likely(vma && (vma->vm_start <= addr))) 91 + return vma; 92 + 93 + /* 94 + * Well, dang. We might still be successful, but only 95 + * if we can extend a vma to do so. 96 + */ 97 + if (!vma || !(vma->vm_flags & VM_GROWSDOWN)) { 98 + mmap_read_unlock(mm); 99 + return NULL; 100 + } 101 + 102 + /* 103 + * We can try to upgrade the mmap lock atomically, 104 + * in which case we can continue to use the vma 105 + * we already looked up. 106 + * 107 + * Otherwise we'll have to drop the mmap lock and 108 + * re-take it, and also look up the vma again, 109 + * re-checking it. 
110 + */ 111 + if (!mmap_upgrade_trylock(mm)) { 112 + if (!upgrade_mmap_lock_carefully(mm, is_user)) 113 + return NULL; 114 + 115 + vma = find_vma(mm, addr); 116 + if (!vma) 117 + goto fail; 118 + if (vma->vm_start <= addr) 119 + goto success; 120 + if (!(vma->vm_flags & VM_GROWSDOWN)) 121 + goto fail; 122 + } 123 + 124 + if (expand_stack_locked(vma, addr)) 125 + goto fail; 126 + 127 + success: 128 + mmap_write_downgrade(mm); 129 + return vma; 130 + 131 + fail: 132 + mmap_write_unlock(mm); 133 + return NULL; 134 + } 20 135 21 136 /* 22 137 * Note this is constrained to return 0, -EFAULT, -EACCES, -ENOMEM by ··· 159 44 if (is_user) 160 45 flags |= FAULT_FLAG_USER; 161 46 retry: 162 - mmap_read_lock(mm); 163 - vma = find_vma(mm, address); 164 - if (!vma) 165 - goto out; 166 - if (vma->vm_start <= address) 167 - goto good_area; 168 - if (!(vma->vm_flags & VM_GROWSDOWN)) 169 - goto out; 170 - if (is_user && !ARCH_IS_STACKGROW(address)) 171 - goto out; 172 - vma = expand_stack(mm, address); 47 + vma = um_lock_mm_and_find_vma(mm, address, is_user); 173 48 if (!vma) 174 49 goto out_nosemaphore; 175 50 176 - good_area: 177 51 *code_out = SEGV_ACCERR; 178 52 if (is_write) { 179 53 if (!(vma->vm_flags & VM_WRITE))
+1 -1
arch/um/os-Linux/Makefile
··· 8 8 9 9 obj-y = execvp.o file.o helper.o irq.o main.o mem.o process.o \ 10 10 registers.o sigio.o signal.o start_up.o time.o tty.o \ 11 - umid.o user_syms.o util.o drivers/ skas/ 11 + umid.o user_syms.o util.o skas/ 12 12 13 13 CFLAGS_signal.o += -Wframe-larger-than=4096 14 14
-13
arch/um/os-Linux/drivers/Makefile
··· 1 - # SPDX-License-Identifier: GPL-2.0 2 - # 3 - # Copyright (C) 2000, 2002 Jeff Dike (jdike@karaya.com) 4 - # 5 - 6 - ethertap-objs := ethertap_kern.o ethertap_user.o 7 - tuntap-objs := tuntap_kern.o tuntap_user.o 8 - 9 - obj-y = 10 - obj-$(CONFIG_UML_NET_ETHERTAP) += ethertap.o 11 - obj-$(CONFIG_UML_NET_TUNTAP) += tuntap.o 12 - 13 - include $(srctree)/arch/um/scripts/Makefile.rules
-21
arch/um/os-Linux/drivers/etap.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #ifndef __DRIVERS_ETAP_H 7 - #define __DRIVERS_ETAP_H 8 - 9 - #include <net_user.h> 10 - 11 - struct ethertap_data { 12 - char *dev_name; 13 - char *gate_addr; 14 - int data_fd; 15 - int control_fd; 16 - void *dev; 17 - }; 18 - 19 - extern const struct net_user_info ethertap_user_info; 20 - 21 - #endif
-100
arch/um/os-Linux/drivers/ethertap_kern.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and 4 - * James Leu (jleu@mindspring.net). 5 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 6 - * Copyright (C) 2001 by various other people who didn't put their name here. 7 - */ 8 - 9 - #include <linux/init.h> 10 - #include <linux/netdevice.h> 11 - #include "etap.h" 12 - #include <net_kern.h> 13 - 14 - struct ethertap_init { 15 - char *dev_name; 16 - char *gate_addr; 17 - }; 18 - 19 - static void etap_init(struct net_device *dev, void *data) 20 - { 21 - struct uml_net_private *pri; 22 - struct ethertap_data *epri; 23 - struct ethertap_init *init = data; 24 - 25 - pri = netdev_priv(dev); 26 - epri = (struct ethertap_data *) pri->user; 27 - epri->dev_name = init->dev_name; 28 - epri->gate_addr = init->gate_addr; 29 - epri->data_fd = -1; 30 - epri->control_fd = -1; 31 - epri->dev = dev; 32 - 33 - printk(KERN_INFO "ethertap backend - %s", epri->dev_name); 34 - if (epri->gate_addr != NULL) 35 - printk(KERN_CONT ", IP = %s", epri->gate_addr); 36 - printk(KERN_CONT "\n"); 37 - } 38 - 39 - static int etap_read(int fd, struct sk_buff *skb, struct uml_net_private *lp) 40 - { 41 - int len; 42 - 43 - len = net_recvfrom(fd, skb_mac_header(skb), 44 - skb->dev->mtu + 2 + ETH_HEADER_ETHERTAP); 45 - if (len <= 0) 46 - return(len); 47 - 48 - skb_pull(skb, 2); 49 - len -= 2; 50 - return len; 51 - } 52 - 53 - static int etap_write(int fd, struct sk_buff *skb, struct uml_net_private *lp) 54 - { 55 - skb_push(skb, 2); 56 - return net_send(fd, skb->data, skb->len); 57 - } 58 - 59 - const struct net_kern_info ethertap_kern_info = { 60 - .init = etap_init, 61 - .protocol = eth_protocol, 62 - .read = etap_read, 63 - .write = etap_write, 64 - }; 65 - 66 - static int ethertap_setup(char *str, char **mac_out, void *data) 67 - { 68 - struct ethertap_init *init = data; 69 - 70 - *init = ((struct ethertap_init) 71 - { .dev_name = NULL, 72 - .gate_addr = NULL }); 73 - if (tap_setup_common(str, "ethertap", &init->dev_name, mac_out, 74 - &init->gate_addr)) 75 - return 0; 76 - if (init->dev_name == NULL) { 77 - printk(KERN_ERR "ethertap_setup : Missing tap device name\n"); 78 - return 0; 79 - } 80 - 81 - return 1; 82 - } 83 - 84 - static struct transport ethertap_transport = { 85 - .list = LIST_HEAD_INIT(ethertap_transport.list), 86 - .name = "ethertap", 87 - .setup = ethertap_setup, 88 - .user = &ethertap_user_info, 89 - .kern = &ethertap_kern_info, 90 - .private_size = sizeof(struct ethertap_data), 91 - .setup_size = sizeof(struct ethertap_init), 92 - }; 93 - 94 - static int register_ethertap(void) 95 - { 96 - register_transport(&ethertap_transport); 97 - return 0; 98 - } 99 - 100 - late_initcall(register_ethertap);
-248
arch/um/os-Linux/drivers/ethertap_user.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - * Copyright (C) 2001 Lennert Buytenhek (buytenh@gnu.org) and 5 - * James Leu (jleu@mindspring.net). 6 - * Copyright (C) 2001 by various other people who didn't put their name here. 7 - */ 8 - 9 - #include <stdio.h> 10 - #include <unistd.h> 11 - #include <errno.h> 12 - #include <string.h> 13 - #include <sys/socket.h> 14 - #include <sys/wait.h> 15 - #include "etap.h" 16 - #include <os.h> 17 - #include <net_user.h> 18 - #include <um_malloc.h> 19 - 20 - #define MAX_PACKET ETH_MAX_PACKET 21 - 22 - static int etap_user_init(void *data, void *dev) 23 - { 24 - struct ethertap_data *pri = data; 25 - 26 - pri->dev = dev; 27 - return 0; 28 - } 29 - 30 - struct addr_change { 31 - enum { ADD_ADDR, DEL_ADDR } what; 32 - unsigned char addr[4]; 33 - unsigned char netmask[4]; 34 - }; 35 - 36 - static void etap_change(int op, unsigned char *addr, unsigned char *netmask, 37 - int fd) 38 - { 39 - struct addr_change change; 40 - char *output; 41 - int n; 42 - 43 - change.what = op; 44 - memcpy(change.addr, addr, sizeof(change.addr)); 45 - memcpy(change.netmask, netmask, sizeof(change.netmask)); 46 - CATCH_EINTR(n = write(fd, &change, sizeof(change))); 47 - if (n != sizeof(change)) { 48 - printk(UM_KERN_ERR "etap_change - request failed, err = %d\n", 49 - errno); 50 - return; 51 - } 52 - 53 - output = uml_kmalloc(UM_KERN_PAGE_SIZE, UM_GFP_KERNEL); 54 - if (output == NULL) 55 - printk(UM_KERN_ERR "etap_change : Failed to allocate output " 56 - "buffer\n"); 57 - read_output(fd, output, UM_KERN_PAGE_SIZE); 58 - if (output != NULL) { 59 - printk("%s", output); 60 - kfree(output); 61 - } 62 - } 63 - 64 - static void etap_open_addr(unsigned char *addr, unsigned char *netmask, 65 - void *arg) 66 - { 67 - etap_change(ADD_ADDR, addr, netmask, *((int *) arg)); 68 - } 69 - 70 - static void etap_close_addr(unsigned char *addr, unsigned char *netmask, 71 - void *arg) 72 
- { 73 - etap_change(DEL_ADDR, addr, netmask, *((int *) arg)); 74 - } 75 - 76 - struct etap_pre_exec_data { 77 - int control_remote; 78 - int control_me; 79 - int data_me; 80 - }; 81 - 82 - static void etap_pre_exec(void *arg) 83 - { 84 - struct etap_pre_exec_data *data = arg; 85 - 86 - dup2(data->control_remote, 1); 87 - close(data->data_me); 88 - close(data->control_me); 89 - } 90 - 91 - static int etap_tramp(char *dev, char *gate, int control_me, 92 - int control_remote, int data_me, int data_remote) 93 - { 94 - struct etap_pre_exec_data pe_data; 95 - int pid, err, n; 96 - char version_buf[sizeof("nnnnn\0")]; 97 - char data_fd_buf[sizeof("nnnnnn\0")]; 98 - char gate_buf[sizeof("nnn.nnn.nnn.nnn\0")]; 99 - char *setup_args[] = { "uml_net", version_buf, "ethertap", dev, 100 - data_fd_buf, gate_buf, NULL }; 101 - char *nosetup_args[] = { "uml_net", version_buf, "ethertap", 102 - dev, data_fd_buf, NULL }; 103 - char **args, c; 104 - 105 - sprintf(data_fd_buf, "%d", data_remote); 106 - sprintf(version_buf, "%d", UML_NET_VERSION); 107 - if (gate != NULL) { 108 - strscpy(gate_buf, gate); 109 - args = setup_args; 110 - } 111 - else args = nosetup_args; 112 - 113 - err = 0; 114 - pe_data.control_remote = control_remote; 115 - pe_data.control_me = control_me; 116 - pe_data.data_me = data_me; 117 - pid = run_helper(etap_pre_exec, &pe_data, args); 118 - 119 - if (pid < 0) 120 - err = pid; 121 - close(data_remote); 122 - close(control_remote); 123 - CATCH_EINTR(n = read(control_me, &c, sizeof(c))); 124 - if (n != sizeof(c)) { 125 - err = -errno; 126 - printk(UM_KERN_ERR "etap_tramp : read of status failed, " 127 - "err = %d\n", -err); 128 - return err; 129 - } 130 - if (c != 1) { 131 - printk(UM_KERN_ERR "etap_tramp : uml_net failed\n"); 132 - err = helper_wait(pid); 133 - } 134 - return err; 135 - } 136 - 137 - static int etap_open(void *data) 138 - { 139 - struct ethertap_data *pri = data; 140 - char *output; 141 - int data_fds[2], control_fds[2], err, output_len; 142 - 143 
- err = tap_open_common(pri->dev, pri->gate_addr); 144 - if (err) 145 - return err; 146 - 147 - err = socketpair(AF_UNIX, SOCK_DGRAM, 0, data_fds); 148 - if (err) { 149 - err = -errno; 150 - printk(UM_KERN_ERR "etap_open - data socketpair failed - " 151 - "err = %d\n", errno); 152 - return err; 153 - } 154 - 155 - err = socketpair(AF_UNIX, SOCK_STREAM, 0, control_fds); 156 - if (err) { 157 - err = -errno; 158 - printk(UM_KERN_ERR "etap_open - control socketpair failed - " 159 - "err = %d\n", errno); 160 - goto out_close_data; 161 - } 162 - 163 - err = etap_tramp(pri->dev_name, pri->gate_addr, control_fds[0], 164 - control_fds[1], data_fds[0], data_fds[1]); 165 - output_len = UM_KERN_PAGE_SIZE; 166 - output = uml_kmalloc(output_len, UM_GFP_KERNEL); 167 - read_output(control_fds[0], output, output_len); 168 - 169 - if (output == NULL) 170 - printk(UM_KERN_ERR "etap_open : failed to allocate output " 171 - "buffer\n"); 172 - else { 173 - printk("%s", output); 174 - kfree(output); 175 - } 176 - 177 - if (err < 0) { 178 - printk(UM_KERN_ERR "etap_tramp failed - err = %d\n", -err); 179 - goto out_close_control; 180 - } 181 - 182 - pri->data_fd = data_fds[0]; 183 - pri->control_fd = control_fds[0]; 184 - iter_addresses(pri->dev, etap_open_addr, &pri->control_fd); 185 - return data_fds[0]; 186 - 187 - out_close_control: 188 - close(control_fds[0]); 189 - close(control_fds[1]); 190 - out_close_data: 191 - close(data_fds[0]); 192 - close(data_fds[1]); 193 - return err; 194 - } 195 - 196 - static void etap_close(int fd, void *data) 197 - { 198 - struct ethertap_data *pri = data; 199 - 200 - iter_addresses(pri->dev, etap_close_addr, &pri->control_fd); 201 - close(fd); 202 - 203 - if (shutdown(pri->data_fd, SHUT_RDWR) < 0) 204 - printk(UM_KERN_ERR "etap_close - shutdown data socket failed, " 205 - "errno = %d\n", errno); 206 - 207 - if (shutdown(pri->control_fd, SHUT_RDWR) < 0) 208 - printk(UM_KERN_ERR "etap_close - shutdown control socket " 209 - "failed, errno = %d\n", 
errno); 210 - 211 - close(pri->data_fd); 212 - pri->data_fd = -1; 213 - close(pri->control_fd); 214 - pri->control_fd = -1; 215 - } 216 - 217 - static void etap_add_addr(unsigned char *addr, unsigned char *netmask, 218 - void *data) 219 - { 220 - struct ethertap_data *pri = data; 221 - 222 - tap_check_ips(pri->gate_addr, addr); 223 - if (pri->control_fd == -1) 224 - return; 225 - etap_open_addr(addr, netmask, &pri->control_fd); 226 - } 227 - 228 - static void etap_del_addr(unsigned char *addr, unsigned char *netmask, 229 - void *data) 230 - { 231 - struct ethertap_data *pri = data; 232 - 233 - if (pri->control_fd == -1) 234 - return; 235 - 236 - etap_close_addr(addr, netmask, &pri->control_fd); 237 - } 238 - 239 - const struct net_user_info ethertap_user_info = { 240 - .init = etap_user_init, 241 - .open = etap_open, 242 - .close = etap_close, 243 - .remove = NULL, 244 - .add_address = etap_add_addr, 245 - .delete_address = etap_del_addr, 246 - .mtu = ETH_MAX_PACKET, 247 - .max_packet = ETH_MAX_PACKET + ETH_HEADER_ETHERTAP, 248 - };
-21
arch/um/os-Linux/drivers/tuntap.h
···
   1    -	/* SPDX-License-Identifier: GPL-2.0 */
   2    -	/*
   3    -	 * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
   4    -	 */
   5    -	
   6    -	#ifndef __UM_TUNTAP_H
   7    -	#define __UM_TUNTAP_H
   8    -	
   9    -	#include <net_user.h>
  10    -	
  11    -	struct tuntap_data {
  12    -		char *dev_name;
  13    -		int fixed_config;
  14    -		char *gate_addr;
  15    -		int fd;
  16    -		void *dev;
  17    -	};
  18    -	
  19    -	extern const struct net_user_info tuntap_user_info;
  20    -	
  21    -	#endif
-86
arch/um/os-Linux/drivers/tuntap_kern.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #include <linux/netdevice.h> 7 - #include <linux/init.h> 8 - #include <linux/skbuff.h> 9 - #include <asm/errno.h> 10 - #include <net_kern.h> 11 - #include "tuntap.h" 12 - 13 - struct tuntap_init { 14 - char *dev_name; 15 - char *gate_addr; 16 - }; 17 - 18 - static void tuntap_init(struct net_device *dev, void *data) 19 - { 20 - struct uml_net_private *pri; 21 - struct tuntap_data *tpri; 22 - struct tuntap_init *init = data; 23 - 24 - pri = netdev_priv(dev); 25 - tpri = (struct tuntap_data *) pri->user; 26 - tpri->dev_name = init->dev_name; 27 - tpri->fixed_config = (init->dev_name != NULL); 28 - tpri->gate_addr = init->gate_addr; 29 - tpri->fd = -1; 30 - tpri->dev = dev; 31 - 32 - printk(KERN_INFO "TUN/TAP backend - "); 33 - if (tpri->gate_addr != NULL) 34 - printk(KERN_CONT "IP = %s", tpri->gate_addr); 35 - printk(KERN_CONT "\n"); 36 - } 37 - 38 - static int tuntap_read(int fd, struct sk_buff *skb, struct uml_net_private *lp) 39 - { 40 - return net_read(fd, skb_mac_header(skb), 41 - skb->dev->mtu + ETH_HEADER_OTHER); 42 - } 43 - 44 - static int tuntap_write(int fd, struct sk_buff *skb, struct uml_net_private *lp) 45 - { 46 - return net_write(fd, skb->data, skb->len); 47 - } 48 - 49 - const struct net_kern_info tuntap_kern_info = { 50 - .init = tuntap_init, 51 - .protocol = eth_protocol, 52 - .read = tuntap_read, 53 - .write = tuntap_write, 54 - }; 55 - 56 - static int tuntap_setup(char *str, char **mac_out, void *data) 57 - { 58 - struct tuntap_init *init = data; 59 - 60 - *init = ((struct tuntap_init) 61 - { .dev_name = NULL, 62 - .gate_addr = NULL }); 63 - if (tap_setup_common(str, "tuntap", &init->dev_name, mac_out, 64 - &init->gate_addr)) 65 - return 0; 66 - 67 - return 1; 68 - } 69 - 70 - static struct transport tuntap_transport = { 71 - .list = LIST_HEAD_INIT(tuntap_transport.list), 72 - .name = "tuntap", 73 - .setup = 
tuntap_setup, 74 - .user = &tuntap_user_info, 75 - .kern = &tuntap_kern_info, 76 - .private_size = sizeof(struct tuntap_data), 77 - .setup_size = sizeof(struct tuntap_init), 78 - }; 79 - 80 - static int register_tuntap(void) 81 - { 82 - register_transport(&tuntap_transport); 83 - return 0; 84 - } 85 - 86 - late_initcall(register_tuntap);
-215
arch/um/os-Linux/drivers/tuntap_user.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 - */ 5 - 6 - #include <stdio.h> 7 - #include <unistd.h> 8 - #include <errno.h> 9 - #include <string.h> 10 - #include <linux/if_tun.h> 11 - #include <net/if.h> 12 - #include <sys/ioctl.h> 13 - #include <sys/socket.h> 14 - #include <sys/wait.h> 15 - #include <sys/uio.h> 16 - #include <kern_util.h> 17 - #include <os.h> 18 - #include "tuntap.h" 19 - 20 - static int tuntap_user_init(void *data, void *dev) 21 - { 22 - struct tuntap_data *pri = data; 23 - 24 - pri->dev = dev; 25 - return 0; 26 - } 27 - 28 - static void tuntap_add_addr(unsigned char *addr, unsigned char *netmask, 29 - void *data) 30 - { 31 - struct tuntap_data *pri = data; 32 - 33 - tap_check_ips(pri->gate_addr, addr); 34 - if ((pri->fd == -1) || pri->fixed_config) 35 - return; 36 - open_addr(addr, netmask, pri->dev_name); 37 - } 38 - 39 - static void tuntap_del_addr(unsigned char *addr, unsigned char *netmask, 40 - void *data) 41 - { 42 - struct tuntap_data *pri = data; 43 - 44 - if ((pri->fd == -1) || pri->fixed_config) 45 - return; 46 - close_addr(addr, netmask, pri->dev_name); 47 - } 48 - 49 - struct tuntap_pre_exec_data { 50 - int stdout_fd; 51 - int close_me; 52 - }; 53 - 54 - static void tuntap_pre_exec(void *arg) 55 - { 56 - struct tuntap_pre_exec_data *data = arg; 57 - 58 - dup2(data->stdout_fd, 1); 59 - close(data->close_me); 60 - } 61 - 62 - static int tuntap_open_tramp(char *gate, int *fd_out, int me, int remote, 63 - char *buffer, int buffer_len, int *used_out) 64 - { 65 - struct tuntap_pre_exec_data data; 66 - char version_buf[sizeof("nnnnn\0")]; 67 - char *argv[] = { "uml_net", version_buf, "tuntap", "up", gate, 68 - NULL }; 69 - char buf[CMSG_SPACE(sizeof(*fd_out))]; 70 - struct msghdr msg; 71 - struct cmsghdr *cmsg; 72 - struct iovec iov; 73 - int pid, n, err; 74 - 75 - sprintf(version_buf, "%d", UML_NET_VERSION); 76 - 77 - data.stdout_fd = remote; 78 - 
data.close_me = me; 79 - 80 - pid = run_helper(tuntap_pre_exec, &data, argv); 81 - 82 - if (pid < 0) 83 - return pid; 84 - 85 - close(remote); 86 - 87 - msg.msg_name = NULL; 88 - msg.msg_namelen = 0; 89 - if (buffer != NULL) { 90 - iov = ((struct iovec) { buffer, buffer_len }); 91 - msg.msg_iov = &iov; 92 - msg.msg_iovlen = 1; 93 - } 94 - else { 95 - msg.msg_iov = NULL; 96 - msg.msg_iovlen = 0; 97 - } 98 - msg.msg_control = buf; 99 - msg.msg_controllen = sizeof(buf); 100 - msg.msg_flags = 0; 101 - n = recvmsg(me, &msg, 0); 102 - *used_out = n; 103 - if (n < 0) { 104 - err = -errno; 105 - printk(UM_KERN_ERR "tuntap_open_tramp : recvmsg failed - " 106 - "errno = %d\n", errno); 107 - return err; 108 - } 109 - helper_wait(pid); 110 - 111 - cmsg = CMSG_FIRSTHDR(&msg); 112 - if (cmsg == NULL) { 113 - printk(UM_KERN_ERR "tuntap_open_tramp : didn't receive a " 114 - "message\n"); 115 - return -EINVAL; 116 - } 117 - if ((cmsg->cmsg_level != SOL_SOCKET) || 118 - (cmsg->cmsg_type != SCM_RIGHTS)) { 119 - printk(UM_KERN_ERR "tuntap_open_tramp : didn't receive a " 120 - "descriptor\n"); 121 - return -EINVAL; 122 - } 123 - *fd_out = ((int *) CMSG_DATA(cmsg))[0]; 124 - os_set_exec_close(*fd_out); 125 - return 0; 126 - } 127 - 128 - static int tuntap_open(void *data) 129 - { 130 - struct ifreq ifr; 131 - struct tuntap_data *pri = data; 132 - char *output, *buffer; 133 - int err, fds[2], len, used; 134 - 135 - err = tap_open_common(pri->dev, pri->gate_addr); 136 - if (err < 0) 137 - return err; 138 - 139 - if (pri->fixed_config) { 140 - pri->fd = os_open_file("/dev/net/tun", 141 - of_cloexec(of_rdwr(OPENFLAGS())), 0); 142 - if (pri->fd < 0) { 143 - printk(UM_KERN_ERR "Failed to open /dev/net/tun, " 144 - "err = %d\n", -pri->fd); 145 - return pri->fd; 146 - } 147 - memset(&ifr, 0, sizeof(ifr)); 148 - ifr.ifr_flags = IFF_TAP | IFF_NO_PI; 149 - strscpy(ifr.ifr_name, pri->dev_name); 150 - if (ioctl(pri->fd, TUNSETIFF, &ifr) < 0) { 151 - err = -errno; 152 - printk(UM_KERN_ERR "TUNSETIFF 
failed, errno = %d\n", 153 - errno); 154 - close(pri->fd); 155 - return err; 156 - } 157 - } 158 - else { 159 - err = socketpair(AF_UNIX, SOCK_DGRAM, 0, fds); 160 - if (err) { 161 - err = -errno; 162 - printk(UM_KERN_ERR "tuntap_open : socketpair failed - " 163 - "errno = %d\n", errno); 164 - return err; 165 - } 166 - 167 - buffer = get_output_buffer(&len); 168 - if (buffer != NULL) 169 - len--; 170 - used = 0; 171 - 172 - err = tuntap_open_tramp(pri->gate_addr, &pri->fd, fds[0], 173 - fds[1], buffer, len, &used); 174 - 175 - output = buffer; 176 - if (err < 0) { 177 - printk("%s", output); 178 - free_output_buffer(buffer); 179 - printk(UM_KERN_ERR "tuntap_open_tramp failed - " 180 - "err = %d\n", -err); 181 - return err; 182 - } 183 - 184 - pri->dev_name = uml_strdup(buffer); 185 - output += IFNAMSIZ; 186 - printk("%s", output); 187 - free_output_buffer(buffer); 188 - 189 - close(fds[0]); 190 - iter_addresses(pri->dev, open_addr, pri->dev_name); 191 - } 192 - 193 - return pri->fd; 194 - } 195 - 196 - static void tuntap_close(int fd, void *data) 197 - { 198 - struct tuntap_data *pri = data; 199 - 200 - if (!pri->fixed_config) 201 - iter_addresses(pri->dev, close_addr, pri->dev_name); 202 - close(fd); 203 - pri->fd = -1; 204 - } 205 - 206 - const struct net_user_info tuntap_user_info = { 207 - .init = tuntap_user_init, 208 - .open = tuntap_open, 209 - .close = tuntap_close, 210 - .remove = NULL, 211 - .add_address = tuntap_add_addr, 212 - .delete_address = tuntap_del_addr, 213 - .mtu = ETH_MAX_PACKET, 214 - .max_packet = ETH_MAX_PACKET + ETH_HEADER_OTHER, 215 - };
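The removed `tuntap_open_tramp()` above receives its tap FD from the `uml_net` helper as `SCM_RIGHTS` ancillary data over a UNIX-domain socket, and the new SECCOMP stub path in this series sends its syscall FD map the same way. A minimal, self-contained sketch of that descriptor-passing pattern — `send_fd()`/`recv_fd()` are illustrative names, not helpers from the tree:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected UNIX-domain socket.
 * A one-byte payload is required: ancillary data cannot travel alone. */
static int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; the kernel installs a fresh FD in this process. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The validity checks on `cmsg_level`/`cmsg_type` mirror what the removed tuntap code did before trusting the descriptor.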
-15
arch/um/os-Linux/file.c
···
 106 106	 	return 0;
 107 107	 }
 108 108	 
 109    -	int os_set_slip(int fd)
 110    -	{
 111    -		int disc, sencap;
 112    -	
 113    -		disc = N_SLIP;
 114    -		if (ioctl(fd, TIOCSETD, &disc) < 0)
 115    -			return -errno;
 116    -	
 117    -		sencap = 0;
 118    -		if (ioctl(fd, SIOCSIFENCAP, &sencap) < 0)
 119    -			return -errno;
 120    -	
 121    -		return 0;
 122    -	}
 123    -	
 124 109	 int os_mode_fd(int fd, int mode)
 125 110	 {
 126 111	 	int err;
+4 -1
arch/um/os-Linux/internal.h
···
   2   2	 #ifndef __UM_OS_LINUX_INTERNAL_H
   3   3	 #define __UM_OS_LINUX_INTERNAL_H
   4   4	 
       5 +	#include <mm_id.h>
       6 +	#include <stub-data.h>
       7 +	
   5   8	 /*
   6   9	  * elf_aux.c
   7  10	  */
···
  19  16	  * skas/process.c
  20  17	  */
  21  18	 void wait_stub_done(int pid);
  22    -	
      19 +	void wait_stub_done_seccomp(struct mm_id *mm_idp, int running, int wait_sigsys);
  23  20	 #endif /* __UM_OS_LINUX_INTERNAL_H */
+31
arch/um/os-Linux/process.c
···
  18  18	 #include <init.h>
  19  19	 #include <longjmp.h>
  20  20	 #include <os.h>
      21 +	#include <skas/skas.h>
  21  22	 
  22  23	 void os_alarm_process(int pid)
  23  24	 {
      25 +		if (pid <= 0)
      26 +			return;
      27 +	
  24  28	 	kill(pid, SIGALRM);
  25  29	 }
  26  30	 
  27  31	 void os_kill_process(int pid, int reap_child)
  28  32	 {
      33 +		if (pid <= 0)
      34 +			return;
      35 +	
      36 +		/* Block signals until child is reaped */
      37 +		block_signals();
      38 +	
  29  39	 	kill(pid, SIGKILL);
  30  40	 	if (reap_child)
  31  41	 		CATCH_EINTR(waitpid(pid, NULL, __WALL));
      42 +	
      43 +		unblock_signals();
  32  44	 }
  33  45	 
  34  46	 /* Kill off a ptraced child by all means available. kill it normally first,
···
  50  38	 
  51  39	 void os_kill_ptraced_process(int pid, int reap_child)
  52  40	 {
      41 +		if (pid <= 0)
      42 +			return;
      43 +	
      44 +		/* Block signals until child is reaped */
      45 +		block_signals();
      46 +	
  53  47	 	kill(pid, SIGKILL);
  54  48	 	ptrace(PTRACE_KILL, pid);
  55  49	 	ptrace(PTRACE_CONT, pid);
  56  50	 	if (reap_child)
  57  51	 		CATCH_EINTR(waitpid(pid, NULL, __WALL));
      52 +	
      53 +		unblock_signals();
      54 +	}
      55 +	
      56 +	pid_t os_reap_child(void)
      57 +	{
      58 +		int status;
      59 +	
      60 +		/* Try to reap a child */
      61 +		return waitpid(-1, &status, WNOHANG);
  58  62	 }
  59  63	 
  60  64	 /* Don't use the glibc version, which caches the result in TLS. It misses some
···
 179 151	 	set_handler(SIGBUS);
 180 152	 	signal(SIGHUP, SIG_IGN);
 181 153	 	set_handler(SIGIO);
     154 +		/* We (currently) only use the child reaper IRQ in seccomp mode */
     155 +		if (using_seccomp)
     156 +			set_handler(SIGCHLD);
 182 157	 	signal(SIGWINCH, SIG_IGN);
 183 158	 
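The new `os_reap_child()` above polls for already-dead children with `WNOHANG` so the SIGCHLD-driven reaper never blocks. The same loop in isolation — `reap_exited_children()` is an illustrative name, not a function from the tree:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * waitpid(-1, ..., WNOHANG) returns 0 once no exited child is left,
 * and -1 with ECHILD once there are no children at all. */
static int reap_exited_children(void)
{
	int status, reaped = 0;

	while (waitpid(-1, &status, WNOHANG) > 0)
		reaped++;

	return reaped;
}
```

Because the call never sleeps, it is safe to run from a deferred signal handler, which is exactly how the series uses it.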
+2 -2
arch/um/os-Linux/registers.c
···
  14  14	 
  15  15	 /* This is set once at boot time and not changed thereafter */
  16  16	 
  17    -	static unsigned long exec_regs[MAX_REG_NR];
  18    -	static unsigned long *exec_fp_regs;
      17 +	unsigned long exec_regs[MAX_REG_NR];
      18 +	unsigned long *exec_fp_regs;
  19  19	 
  20  20	 int init_pid_registers(int pid)
  21  21	 {
+2 -1
arch/um/os-Linux/sigio.c
···
  12  12	 #include <signal.h>
  13  13	 #include <string.h>
  14  14	 #include <sys/epoll.h>
      15 +	#include <asm/unistd.h>
  15  16	 #include <kern_util.h>
  16  17	 #include <init.h>
  17  18	 #include <os.h>
···
  47  46	 			__func__, errno);
  48  47	 	}
  49  48	 
  50    -		CATCH_EINTR(r = tgkill(pid, pid, SIGIO));
      49 +		CATCH_EINTR(r = syscall(__NR_tgkill, pid, pid, SIGIO));
  51  50	 	if (r < 0)
  52  51	 		printk(UM_KERN_ERR "%s: tgkill failed, errno = %d\n",
  53  52	 			__func__, errno);
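The sigio.c change above (from "um: Fix tgkill compile error on old host OSes") sidesteps hosts whose glibc predates the `tgkill()` wrapper, added in glibc 2.30, by issuing the raw syscall. A standalone sketch of that workaround — `raw_tgkill()` is an illustrative name:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke tgkill(2) directly, bypassing the glibc wrapper that only
 * exists since glibc 2.30; older host toolchains need this spelling. */
static int raw_tgkill(pid_t tgid, pid_t tid, int sig)
{
	return syscall(SYS_tgkill, tgid, tid, sig);
}
```

In a single-threaded process the main thread's TID equals the PID, so `raw_tgkill(getpid(), getpid(), sig)` targets the calling thread; passing signal 0 probes for existence without delivering anything.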
+18 -1
arch/um/os-Linux/signal.c
···
  29  29	 	[SIGBUS] = relay_signal,
  30  30	 	[SIGSEGV] = segv_handler,
  31  31	 	[SIGIO] = sigio_handler,
      32 +		[SIGCHLD] = sigchld_handler,
  32  33	 };
  33  34	 
  34  35	 static void sig_handler_common(int sig, struct siginfo *si, mcontext_t *mc)
···
  45  44	 	}
  46  45	 
  47  46	 	/* enable signals if sig isn't IRQ signal */
  48    -		if ((sig != SIGIO) && (sig != SIGWINCH))
      47 +		if ((sig != SIGIO) && (sig != SIGWINCH) && (sig != SIGCHLD))
  49  48	 		unblock_signals_trace();
  50  49	 
  51  50	 	(*sig_info[sig])(sig, si, &r, mc);
···
  64  63	 
  65  64	 #define SIGALRM_BIT 1
  66  65	 #define SIGALRM_MASK (1 << SIGALRM_BIT)
      67 +	
      68 +	#define SIGCHLD_BIT 2
      69 +	#define SIGCHLD_MASK (1 << SIGCHLD_BIT)
  67  70	 
  68  71	 int signals_enabled;
  69  72	 #if IS_ENABLED(CONFIG_UML_TIME_TRAVEL_SUPPORT)
···
 103  99	 		sigio_run_timetravel_handlers();
 104 100	 	else
 105 101	 		signals_pending |= SIGIO_MASK;
     102 +			return;
     103 +		}
     104 +	
     105 +		if (!enabled && (sig == SIGCHLD)) {
     106 +			signals_pending |= SIGCHLD_MASK;
 106 107	 		return;
 107 108	 	}
 108 109	 
···
 190 181	 
 191 182	 	[SIGIO] = sig_handler,
 192 183	 	[SIGWINCH] = sig_handler,
     184 +		/* SIGCHLD is only actually registered in seccomp mode. */
     185 +		[SIGCHLD] = sig_handler,
 193 186	 	[SIGALRM] = timer_alarm_handler,
 194 187	 
 195 188	 	[SIGUSR1] = sigusr1_handler,
···
 319 308	 	 */
 320 309	 	if (save_pending & SIGIO_MASK)
 321 310	 		sig_handler_common(SIGIO, NULL, NULL);
     311 +	
     312 +		if (save_pending & SIGCHLD_MASK) {
     313 +			struct uml_pt_regs regs = {};
     314 +	
     315 +			sigchld_handler(SIGCHLD, NULL, &regs, NULL);
     316 +		}
 322 317	 
 323 318	 	/* Do not reenter the handler */
 324 319	 
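The signal.c changes extend UML's soft-blocking scheme to SIGCHLD: while `signals_enabled` is clear, the handler only records a pending bit, and `unblock_signals()` replays the backlog. Here is that control flow reduced to plain, testable state — `deliver()` and `on_signal()` are stand-ins for the real handlers, only the mask values mirror the kernel side:

```c
#include <assert.h>

#define SIGIO_MASK	(1 << 0)
#define SIGALRM_MASK	(1 << 1)
#define SIGCHLD_MASK	(1 << 2)

static int signals_enabled;		/* soft-block flag, starts blocked */
static unsigned int signals_pending;	/* bits accumulated while blocked */
static unsigned int delivered;		/* what reached the "handlers" */

static void deliver(unsigned int mask)
{
	delivered |= mask;
}

/* Handler entry: defer while blocked, deliver immediately otherwise. */
static void on_signal(unsigned int mask)
{
	if (!signals_enabled) {
		signals_pending |= mask;
		return;
	}
	deliver(mask);
}

/* Unblock path: atomically take the backlog, then replay it in a
 * fixed order, like the real unblock_signals(). */
static void unblock_signals(void)
{
	unsigned int save_pending = signals_pending;

	signals_pending = 0;
	signals_enabled = 1;

	if (save_pending & SIGIO_MASK)
		deliver(SIGIO_MASK);
	if (save_pending & SIGCHLD_MASK)
		deliver(SIGCHLD_MASK);
	if (save_pending & SIGALRM_MASK)
		deliver(SIGALRM_MASK);
}
```

The point of snapshotting `save_pending` before re-enabling is that a handler running during replay can only touch the fresh `signals_pending`, never the set being replayed.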
+83 -18
arch/um/os-Linux/skas/mem.c
··· 43 43 44 44 print_hex_dump(UM_KERN_ERR, " syscall data: ", 0, 45 45 16, 4, sc, sizeof(*sc), 0); 46 + 47 + if (using_seccomp) { 48 + printk(UM_KERN_ERR "%s: FD map num: %d", __func__, 49 + mm_idp->syscall_fd_num); 50 + print_hex_dump(UM_KERN_ERR, 51 + " FD map: ", 0, 16, 52 + sizeof(mm_idp->syscall_fd_map[0]), 53 + mm_idp->syscall_fd_map, 54 + sizeof(mm_idp->syscall_fd_map), 0); 55 + } 46 56 } 47 57 48 58 static inline unsigned long *check_init_stack(struct mm_id * mm_idp, ··· 90 80 int n, i; 91 81 int err, pid = mm_idp->pid; 92 82 93 - n = ptrace_setregs(pid, syscall_regs); 94 - if (n < 0) { 95 - printk(UM_KERN_ERR "Registers - \n"); 96 - for (i = 0; i < MAX_REG_NR; i++) 97 - printk(UM_KERN_ERR "\t%d\t0x%lx\n", i, syscall_regs[i]); 98 - panic("%s : PTRACE_SETREGS failed, errno = %d\n", 99 - __func__, -n); 100 - } 101 - 102 83 /* Inform process how much we have filled in. */ 103 84 proc_data->syscall_data_len = mm_idp->syscall_data_len; 104 85 105 - err = ptrace(PTRACE_CONT, pid, 0, 0); 106 - if (err) 107 - panic("Failed to continue stub, pid = %d, errno = %d\n", pid, 108 - errno); 86 + if (using_seccomp) { 87 + proc_data->restart_wait = 1; 88 + wait_stub_done_seccomp(mm_idp, 0, 1); 89 + } else { 90 + n = ptrace_setregs(pid, syscall_regs); 91 + if (n < 0) { 92 + printk(UM_KERN_ERR "Registers -\n"); 93 + for (i = 0; i < MAX_REG_NR; i++) 94 + printk(UM_KERN_ERR "\t%d\t0x%lx\n", i, syscall_regs[i]); 95 + panic("%s : PTRACE_SETREGS failed, errno = %d\n", 96 + __func__, -n); 97 + } 109 98 110 - wait_stub_done(pid); 99 + err = ptrace(PTRACE_CONT, pid, 0, 0); 100 + if (err) 101 + panic("Failed to continue stub, pid = %d, errno = %d\n", 102 + pid, errno); 103 + 104 + wait_stub_done(pid); 105 + } 111 106 112 107 /* 113 - * proc_data->err will be non-zero if there was an (unexpected) error. 108 + * proc_data->err will be negative if there was an (unexpected) error. 
114 109 * In that case, syscall_data_len points to the last executed syscall, 115 110 * otherwise it will be zero (but we do not need to rely on that). 116 111 */ ··· 127 112 } else { 128 113 mm_idp->syscall_data_len = 0; 129 114 } 115 + 116 + if (using_seccomp) 117 + mm_idp->syscall_fd_num = 0; 130 118 131 119 return mm_idp->syscall_data_len; 132 120 } ··· 193 175 return NULL; 194 176 } 195 177 178 + static int get_stub_fd(struct mm_id *mm_idp, int fd) 179 + { 180 + int i; 181 + 182 + /* Find an FD slot (or flush and use first) */ 183 + if (!using_seccomp) 184 + return fd; 185 + 186 + /* Already crashed, value does not matter */ 187 + if (mm_idp->syscall_data_len < 0) 188 + return 0; 189 + 190 + /* Find existing FD in map if we can allocate another syscall */ 191 + if (mm_idp->syscall_data_len < 192 + ARRAY_SIZE(((struct stub_data *)NULL)->syscall_data)) { 193 + for (i = 0; i < mm_idp->syscall_fd_num; i++) { 194 + if (mm_idp->syscall_fd_map[i] == fd) 195 + return i; 196 + } 197 + 198 + if (mm_idp->syscall_fd_num < STUB_MAX_FDS) { 199 + i = mm_idp->syscall_fd_num; 200 + mm_idp->syscall_fd_map[i] = fd; 201 + 202 + mm_idp->syscall_fd_num++; 203 + 204 + return i; 205 + } 206 + } 207 + 208 + /* FD map full or no syscall space available, continue after flush */ 209 + do_syscall_stub(mm_idp); 210 + mm_idp->syscall_fd_map[0] = fd; 211 + mm_idp->syscall_fd_num = 1; 212 + 213 + return 0; 214 + } 215 + 196 216 int map(struct mm_id *mm_idp, unsigned long virt, unsigned long len, int prot, 197 217 int phys_fd, unsigned long long offset) 198 218 { ··· 238 182 239 183 /* Compress with previous syscall if that is possible */ 240 184 sc = syscall_stub_get_previous(mm_idp, STUB_SYSCALL_MMAP, virt); 241 - if (sc && sc->mem.prot == prot && sc->mem.fd == phys_fd && 185 + if (sc && sc->mem.prot == prot && 242 186 sc->mem.offset == MMAP_OFFSET(offset - sc->mem.length)) { 243 - sc->mem.length += len; 244 - return 0; 187 + int prev_fd = sc->mem.fd; 188 + 189 + if (using_seccomp) 190 + 
prev_fd = mm_idp->syscall_fd_map[sc->mem.fd]; 191 + 192 + if (phys_fd == prev_fd) { 193 + sc->mem.length += len; 194 + return 0; 195 + } 245 196 } 197 + 198 + phys_fd = get_stub_fd(mm_idp, phys_fd); 246 199 247 200 sc = syscall_stub_alloc(mm_idp); 248 201 sc->syscall = STUB_SYSCALL_MMAP;
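The new `get_stub_fd()` above batches host descriptors into a small per-mm map so queued stub syscalls can name them by slot index, flushing the batch when the map fills. The slot-allocation logic in isolation — names shortened, and `flush_syscalls()` is a counting stand-in for `do_syscall_stub()`:

```c
#include <assert.h>

#define STUB_MAX_FDS 4

static int syscall_fd_map[STUB_MAX_FDS];
static int syscall_fd_num;
static int flushes;

/* Stand-in for flushing the queued syscalls to the stub process. */
static void flush_syscalls(void)
{
	syscall_fd_num = 0;
	flushes++;
}

/* Return the map slot for fd, reusing an existing slot when the same
 * fd is already queued; when the map is full, flush and start over. */
static int map_fd(int fd)
{
	int i;

	for (i = 0; i < syscall_fd_num; i++)
		if (syscall_fd_map[i] == fd)
			return i;

	if (syscall_fd_num == STUB_MAX_FDS)
		flush_syscalls();

	syscall_fd_map[syscall_fd_num] = fd;

	return syscall_fd_num++;
}
```

Deduplication matters because consecutive `map()` calls against the same physical FD are common, and each distinct FD costs an `SCM_RIGHTS` slot when the batch is shipped to the stub.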
+357 -131
arch/um/os-Linux/skas/process.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 /* 3 + * Copyright (C) 2021 Benjamin Berg <benjamin@sipsolutions.net> 3 4 * Copyright (C) 2015 Thomas Meyer (thomas@m3y3r.de) 4 5 * Copyright (C) 2002- 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 5 6 */ ··· 16 15 #include <sys/mman.h> 17 16 #include <sys/wait.h> 18 17 #include <sys/stat.h> 18 + #include <sys/socket.h> 19 19 #include <asm/unistd.h> 20 20 #include <as-layout.h> 21 21 #include <init.h> ··· 27 25 #include <registers.h> 28 26 #include <skas.h> 29 27 #include <sysdep/stub.h> 28 + #include <sysdep/mcontext.h> 29 + #include <linux/futex.h> 30 30 #include <linux/threads.h> 31 31 #include <timetravel.h> 32 + #include <asm-generic/rwonce.h> 32 33 #include "../internal.h" 33 34 34 35 int is_skas_winch(int pid, int fd, void *data) ··· 147 142 fatal_sigsegv(); 148 143 } 149 144 145 + void wait_stub_done_seccomp(struct mm_id *mm_idp, int running, int wait_sigsys) 146 + { 147 + struct stub_data *data = (void *)mm_idp->stack; 148 + int ret; 149 + 150 + do { 151 + const char byte = 0; 152 + struct iovec iov = { 153 + .iov_base = (void *)&byte, 154 + .iov_len = sizeof(byte), 155 + }; 156 + union { 157 + char data[CMSG_SPACE(sizeof(mm_idp->syscall_fd_map))]; 158 + struct cmsghdr align; 159 + } ctrl; 160 + struct msghdr msgh = { 161 + .msg_iov = &iov, 162 + .msg_iovlen = 1, 163 + }; 164 + 165 + if (!running) { 166 + if (mm_idp->syscall_fd_num) { 167 + unsigned int fds_size = 168 + sizeof(int) * mm_idp->syscall_fd_num; 169 + struct cmsghdr *cmsg; 170 + 171 + msgh.msg_control = ctrl.data; 172 + msgh.msg_controllen = CMSG_SPACE(fds_size); 173 + cmsg = CMSG_FIRSTHDR(&msgh); 174 + cmsg->cmsg_level = SOL_SOCKET; 175 + cmsg->cmsg_type = SCM_RIGHTS; 176 + cmsg->cmsg_len = CMSG_LEN(fds_size); 177 + memcpy(CMSG_DATA(cmsg), mm_idp->syscall_fd_map, 178 + fds_size); 179 + 180 + CATCH_EINTR(syscall(__NR_sendmsg, mm_idp->sock, 181 + &msgh, 0)); 182 + } 183 + 184 + data->signal = 0; 185 + data->futex = FUTEX_IN_CHILD; 186 + 
CATCH_EINTR(syscall(__NR_futex, &data->futex, 187 + FUTEX_WAKE, 1, NULL, NULL, 0)); 188 + } 189 + 190 + do { 191 + /* 192 + * We need to check whether the child is still alive 193 + * before and after the FUTEX_WAIT call. Before, in 194 + * case it just died but we still updated data->futex 195 + * to FUTEX_IN_CHILD. And after, in case it died while 196 + * we were waiting (and SIGCHLD woke us up, see the 197 + * IRQ handler in mmu.c). 198 + * 199 + * Either way, if PID is negative, then we have no 200 + * choice but to kill the task. 201 + */ 202 + if (__READ_ONCE(mm_idp->pid) < 0) 203 + goto out_kill; 204 + 205 + ret = syscall(__NR_futex, &data->futex, 206 + FUTEX_WAIT, FUTEX_IN_CHILD, 207 + NULL, NULL, 0); 208 + if (ret < 0 && errno != EINTR && errno != EAGAIN) { 209 + printk(UM_KERN_ERR "%s : FUTEX_WAIT failed, errno = %d\n", 210 + __func__, errno); 211 + goto out_kill; 212 + } 213 + } while (data->futex == FUTEX_IN_CHILD); 214 + 215 + if (__READ_ONCE(mm_idp->pid) < 0) 216 + goto out_kill; 217 + 218 + running = 0; 219 + 220 + /* We may receive a SIGALRM before SIGSYS, iterate again. 
*/ 221 + } while (wait_sigsys && data->signal == SIGALRM); 222 + 223 + if (data->mctx_offset > sizeof(data->sigstack) - sizeof(mcontext_t)) { 224 + printk(UM_KERN_ERR "%s : invalid mcontext offset", __func__); 225 + goto out_kill; 226 + } 227 + 228 + if (wait_sigsys && data->signal != SIGSYS) { 229 + printk(UM_KERN_ERR "%s : expected SIGSYS but got %d", 230 + __func__, data->signal); 231 + goto out_kill; 232 + } 233 + 234 + return; 235 + 236 + out_kill: 237 + printk(UM_KERN_ERR "%s : failed to wait for stub, pid = %d, errno = %d\n", 238 + __func__, mm_idp->pid, errno); 239 + /* This is not true inside start_userspace */ 240 + if (current_mm_id() == mm_idp) 241 + fatal_sigsegv(); 242 + } 243 + 150 244 extern unsigned long current_stub_stack(void); 151 245 152 246 static void get_skas_faultinfo(int pid, struct faultinfo *fi) ··· 267 163 memcpy(fi, (void *)current_stub_stack(), sizeof(*fi)); 268 164 } 269 165 270 - static void handle_segv(int pid, struct uml_pt_regs *regs) 271 - { 272 - get_skas_faultinfo(pid, &regs->faultinfo); 273 - segv(regs->faultinfo, 0, 1, NULL, NULL); 274 - } 275 - 276 166 static void handle_trap(int pid, struct uml_pt_regs *regs) 277 167 { 278 168 if ((UPT_IP(regs) >= STUB_START) && (UPT_IP(regs) < STUB_END)) ··· 279 181 280 182 static int stub_exe_fd; 281 183 184 + struct tramp_data { 185 + struct stub_data *stub_data; 186 + /* 0 is inherited, 1 is the kernel side */ 187 + int sockpair[2]; 188 + }; 189 + 282 190 #ifndef CLOSE_RANGE_CLOEXEC 283 191 #define CLOSE_RANGE_CLOEXEC (1U << 2) 284 192 #endif 285 193 286 - static int userspace_tramp(void *stack) 194 + static int userspace_tramp(void *data) 287 195 { 196 + struct tramp_data *tramp_data = data; 288 197 char *const argv[] = { "uml-userspace", NULL }; 289 - int pipe_fds[2]; 290 198 unsigned long long offset; 291 199 struct stub_init_data init_data = { 200 + .seccomp = using_seccomp, 292 201 .stub_start = STUB_START, 293 - .segv_handler = STUB_CODE + 294 - (unsigned long) stub_segv_handler 
- 295 - (unsigned long) __syscall_stub_start, 296 202 }; 297 203 struct iomem_region *iomem; 298 204 int ret; 205 + 206 + if (using_seccomp) { 207 + init_data.signal_handler = STUB_CODE + 208 + (unsigned long) stub_signal_interrupt - 209 + (unsigned long) __syscall_stub_start; 210 + init_data.signal_restorer = STUB_CODE + 211 + (unsigned long) stub_signal_restorer - 212 + (unsigned long) __syscall_stub_start; 213 + } else { 214 + init_data.signal_handler = STUB_CODE + 215 + (unsigned long) stub_segv_handler - 216 + (unsigned long) __syscall_stub_start; 217 + init_data.signal_restorer = 0; 218 + } 299 219 300 220 init_data.stub_code_fd = phys_mapping(uml_to_phys(__syscall_stub_start), 301 221 &offset); 302 222 init_data.stub_code_offset = MMAP_OFFSET(offset); 303 223 304 - init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset); 224 + init_data.stub_data_fd = phys_mapping(uml_to_phys(tramp_data->stub_data), 225 + &offset); 305 226 init_data.stub_data_offset = MMAP_OFFSET(offset); 306 227 307 228 /* ··· 331 214 syscall(__NR_close_range, 0, ~0U, CLOSE_RANGE_CLOEXEC); 332 215 333 216 fcntl(init_data.stub_data_fd, F_SETFD, 0); 334 - for (iomem = iomem_regions; iomem; iomem = iomem->next) 335 - fcntl(iomem->fd, F_SETFD, 0); 336 217 337 - /* Create a pipe for init_data (no CLOEXEC) and dup2 to STDIN */ 338 - if (pipe(pipe_fds)) 339 - exit(2); 218 + /* In SECCOMP mode, these FDs are passed when needed */ 219 + if (!using_seccomp) { 220 + for (iomem = iomem_regions; iomem; iomem = iomem->next) 221 + fcntl(iomem->fd, F_SETFD, 0); 222 + } 340 223 341 - if (dup2(pipe_fds[0], 0) < 0) 224 + /* dup2 signaling FD/socket to STDIN */ 225 + if (dup2(tramp_data->sockpair[0], 0) < 0) 342 226 exit(3); 343 - close(pipe_fds[0]); 227 + close(tramp_data->sockpair[0]); 344 228 345 229 /* Write init_data and close write side */ 346 - ret = write(pipe_fds[1], &init_data, sizeof(init_data)); 347 - close(pipe_fds[1]); 230 + ret = write(tramp_data->sockpair[1], &init_data, 
sizeof(init_data)); 231 + close(tramp_data->sockpair[1]); 348 232 349 233 if (ret != sizeof(init_data)) 350 234 exit(4); ··· 433 315 } 434 316 __initcall(init_stub_exe_fd); 435 317 318 + int using_seccomp; 436 319 int userspace_pid[NR_CPUS]; 437 320 438 321 /** 439 322 * start_userspace() - prepare a new userspace process 440 - * @stub_stack: pointer to the stub stack. 323 + * @mm_id: The corresponding struct mm_id 441 324 * 442 325 * Setups a new temporary stack page that is used while userspace_tramp() runs 443 326 * Clones the kernel process into a new userspace process, with FDs only. ··· 447 328 * when negative: an error number. 448 329 * FIXME: can PIDs become negative?! 449 330 */ 450 - int start_userspace(unsigned long stub_stack) 331 + int start_userspace(struct mm_id *mm_id) 451 332 { 333 + struct stub_data *proc_data = (void *)mm_id->stack; 334 + struct tramp_data tramp_data = { 335 + .stub_data = proc_data, 336 + }; 452 337 void *stack; 453 338 unsigned long sp; 454 - int pid, status, n, err; 339 + int status, n, err; 455 340 456 341 /* setup a temporary stack page */ 457 342 stack = mmap(NULL, UM_KERN_PAGE_SIZE, ··· 471 348 /* set stack pointer to the end of the stack page, so it can grow downwards */ 472 349 sp = (unsigned long)stack + UM_KERN_PAGE_SIZE; 473 350 474 - /* clone into new userspace process */ 475 - pid = clone(userspace_tramp, (void *) sp, 476 - CLONE_VFORK | CLONE_VM | SIGCHLD, 477 - (void *)stub_stack); 478 - if (pid < 0) { 351 + /* socket pair for init data and SECCOMP FD passing (no CLOEXEC here) */ 352 + if (socketpair(AF_UNIX, SOCK_STREAM, 0, tramp_data.sockpair)) { 479 353 err = -errno; 480 - printk(UM_KERN_ERR "%s : clone failed, errno = %d\n", 354 + printk(UM_KERN_ERR "%s : socketpair failed, errno = %d\n", 481 355 __func__, errno); 482 356 return err; 483 357 } 484 358 485 - do { 486 - CATCH_EINTR(n = waitpid(pid, &status, WUNTRACED | __WALL)); 487 - if (n < 0) { 359 + if (using_seccomp) 360 + proc_data->futex = FUTEX_IN_CHILD; 
361 + 362 + mm_id->pid = clone(userspace_tramp, (void *) sp, 363 + CLONE_VFORK | CLONE_VM | SIGCHLD, 364 + (void *)&tramp_data); 365 + if (mm_id->pid < 0) { 366 + err = -errno; 367 + printk(UM_KERN_ERR "%s : clone failed, errno = %d\n", 368 + __func__, errno); 369 + goto out_close; 370 + } 371 + 372 + if (using_seccomp) { 373 + wait_stub_done_seccomp(mm_id, 1, 1); 374 + } else { 375 + do { 376 + CATCH_EINTR(n = waitpid(mm_id->pid, &status, 377 + WUNTRACED | __WALL)); 378 + if (n < 0) { 379 + err = -errno; 380 + printk(UM_KERN_ERR "%s : wait failed, errno = %d\n", 381 + __func__, errno); 382 + goto out_kill; 383 + } 384 + } while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM)); 385 + 386 + if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) { 387 + err = -EINVAL; 388 + printk(UM_KERN_ERR "%s : expected SIGSTOP, got status = %d\n", 389 + __func__, status); 390 + goto out_kill; 391 + } 392 + 393 + if (ptrace(PTRACE_SETOPTIONS, mm_id->pid, NULL, 394 + (void *) PTRACE_O_TRACESYSGOOD) < 0) { 488 395 err = -errno; 489 - printk(UM_KERN_ERR "%s : wait failed, errno = %d\n", 396 + printk(UM_KERN_ERR "%s : PTRACE_SETOPTIONS failed, errno = %d\n", 490 397 __func__, errno); 491 398 goto out_kill; 492 399 } 493 - } while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM)); 494 - 495 - if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) { 496 - err = -EINVAL; 497 - printk(UM_KERN_ERR "%s : expected SIGSTOP, got status = %d\n", 498 - __func__, status); 499 - goto out_kill; 500 - } 501 - 502 - if (ptrace(PTRACE_SETOPTIONS, pid, NULL, 503 - (void *) PTRACE_O_TRACESYSGOOD) < 0) { 504 - err = -errno; 505 - printk(UM_KERN_ERR "%s : PTRACE_SETOPTIONS failed, errno = %d\n", 506 - __func__, errno); 507 - goto out_kill; 508 400 } 509 401 510 402 if (munmap(stack, UM_KERN_PAGE_SIZE) < 0) { ··· 529 391 goto out_kill; 530 392 } 531 393 532 - return pid; 394 + close(tramp_data.sockpair[0]); 395 + if (using_seccomp) 396 + mm_id->sock = tramp_data.sockpair[1]; 397 + else 398 
+ close(tramp_data.sockpair[1]); 533 399 534 - out_kill: 535 - os_kill_ptraced_process(pid, 1); 400 + return 0; 401 + 402 + out_kill: 403 + os_kill_ptraced_process(mm_id->pid, 1); 404 + out_close: 405 + close(tramp_data.sockpair[0]); 406 + close(tramp_data.sockpair[1]); 407 + 408 + mm_id->pid = -1; 409 + 536 410 return err; 537 411 } 538 412 ··· 554 404 void userspace(struct uml_pt_regs *regs) 555 405 { 556 406 int err, status, op, pid = userspace_pid[0]; 557 - siginfo_t si; 407 + siginfo_t si_ptrace; 408 + siginfo_t *si; 409 + int sig; 558 410 559 411 /* Handle any immediate reschedules or signals */ 560 412 interrupt_end(); ··· 589 437 590 438 current_mm_sync(); 591 439 592 - /* Flush out any pending syscalls */ 593 - err = syscall_stub_flush(current_mm_id()); 594 - if (err) { 595 - if (err == -ENOMEM) 596 - report_enomem(); 440 + if (using_seccomp) { 441 + struct mm_id *mm_id = current_mm_id(); 442 + struct stub_data *proc_data = (void *) mm_id->stack; 443 + int ret; 597 444 598 - printk(UM_KERN_ERR "%s - Error flushing stub syscalls: %d", 599 - __func__, -err); 600 - fatal_sigsegv(); 601 - } 445 + ret = set_stub_state(regs, proc_data, singlestepping()); 446 + if (ret) { 447 + printk(UM_KERN_ERR "%s - failed to set regs: %d", 448 + __func__, ret); 449 + fatal_sigsegv(); 450 + } 602 451 603 - /* 604 - * This can legitimately fail if the process loads a 605 - * bogus value into a segment register. It will 606 - * segfault and PTRACE_GETREGS will read that value 607 - * out of the process. However, PTRACE_SETREGS will 608 - * fail. In this case, there is nothing to do but 609 - * just kill the process. 
610 - */ 611 - if (ptrace(PTRACE_SETREGS, pid, 0, regs->gp)) { 612 - printk(UM_KERN_ERR "%s - ptrace set regs failed, errno = %d\n", 613 - __func__, errno); 614 - fatal_sigsegv(); 615 - } 452 + /* Must have been reset by the syscall caller */ 453 + if (proc_data->restart_wait != 0) 454 + panic("Programming error: Flag to only run syscalls in child was not cleared!"); 616 455 617 - if (put_fp_registers(pid, regs->fp)) { 618 - printk(UM_KERN_ERR "%s - ptrace set fp regs failed, errno = %d\n", 619 - __func__, errno); 620 - fatal_sigsegv(); 621 - } 456 + /* Mark pending syscalls for flushing */ 457 + proc_data->syscall_data_len = mm_id->syscall_data_len; 622 458 623 - if (singlestepping()) 624 - op = PTRACE_SYSEMU_SINGLESTEP; 625 - else 626 - op = PTRACE_SYSEMU; 459 + wait_stub_done_seccomp(mm_id, 0, 0); 627 460 628 - if (ptrace(op, pid, 0, 0)) { 629 - printk(UM_KERN_ERR "%s - ptrace continue failed, op = %d, errno = %d\n", 630 - __func__, op, errno); 631 - fatal_sigsegv(); 632 - } 461 + sig = proc_data->signal; 633 462 634 - CATCH_EINTR(err = waitpid(pid, &status, WUNTRACED | __WALL)); 635 - if (err < 0) { 636 - printk(UM_KERN_ERR "%s - wait failed, errno = %d\n", 637 - __func__, errno); 638 - fatal_sigsegv(); 639 - } 463 + if (sig == SIGTRAP && proc_data->err != 0) { 464 + printk(UM_KERN_ERR "%s - Error flushing stub syscalls", 465 + __func__); 466 + syscall_stub_dump_error(mm_id); 467 + mm_id->syscall_data_len = proc_data->err; 468 + fatal_sigsegv(); 469 + } 640 470 641 - regs->is_user = 1; 642 - if (ptrace(PTRACE_GETREGS, pid, 0, regs->gp)) { 643 - printk(UM_KERN_ERR "%s - PTRACE_GETREGS failed, errno = %d\n", 644 - __func__, errno); 645 - fatal_sigsegv(); 646 - } 471 + mm_id->syscall_data_len = 0; 472 + mm_id->syscall_fd_num = 0; 647 473 648 - if (get_fp_registers(pid, regs->fp)) { 649 - printk(UM_KERN_ERR "%s - get_fp_registers failed, errno = %d\n", 650 - __func__, errno); 651 - fatal_sigsegv(); 474 + ret = get_stub_state(regs, proc_data, NULL); 475 + if (ret) { 
476 + printk(UM_KERN_ERR "%s - failed to get regs: %d", 477 + __func__, ret); 478 + fatal_sigsegv(); 479 + } 480 + 481 + if (proc_data->si_offset > sizeof(proc_data->sigstack) - sizeof(*si)) 482 + panic("%s - Invalid siginfo offset from child", 483 + __func__); 484 + si = (void *)&proc_data->sigstack[proc_data->si_offset]; 485 + 486 + regs->is_user = 1; 487 + 488 + /* Fill in ORIG_RAX and extract fault information */ 489 + PT_SYSCALL_NR(regs->gp) = si->si_syscall; 490 + if (sig == SIGSEGV) { 491 + mcontext_t *mcontext = (void *)&proc_data->sigstack[proc_data->mctx_offset]; 492 + 493 + GET_FAULTINFO_FROM_MC(regs->faultinfo, mcontext); 494 + } 495 + } else { 496 + /* Flush out any pending syscalls */ 497 + err = syscall_stub_flush(current_mm_id()); 498 + if (err) { 499 + if (err == -ENOMEM) 500 + report_enomem(); 501 + 502 + printk(UM_KERN_ERR "%s - Error flushing stub syscalls: %d", 503 + __func__, -err); 504 + fatal_sigsegv(); 505 + } 506 + 507 + /* 508 + * This can legitimately fail if the process loads a 509 + * bogus value into a segment register. It will 510 + * segfault and PTRACE_GETREGS will read that value 511 + * out of the process. However, PTRACE_SETREGS will 512 + * fail. In this case, there is nothing to do but 513 + * just kill the process. 
514 + */ 515 + if (ptrace(PTRACE_SETREGS, pid, 0, regs->gp)) { 516 + printk(UM_KERN_ERR "%s - ptrace set regs failed, errno = %d\n", 517 + __func__, errno); 518 + fatal_sigsegv(); 519 + } 520 + 521 + if (put_fp_registers(pid, regs->fp)) { 522 + printk(UM_KERN_ERR "%s - ptrace set fp regs failed, errno = %d\n", 523 + __func__, errno); 524 + fatal_sigsegv(); 525 + } 526 + 527 + if (singlestepping()) 528 + op = PTRACE_SYSEMU_SINGLESTEP; 529 + else 530 + op = PTRACE_SYSEMU; 531 + 532 + if (ptrace(op, pid, 0, 0)) { 533 + printk(UM_KERN_ERR "%s - ptrace continue failed, op = %d, errno = %d\n", 534 + __func__, op, errno); 535 + fatal_sigsegv(); 536 + } 537 + 538 + CATCH_EINTR(err = waitpid(pid, &status, WUNTRACED | __WALL)); 539 + if (err < 0) { 540 + printk(UM_KERN_ERR "%s - wait failed, errno = %d\n", 541 + __func__, errno); 542 + fatal_sigsegv(); 543 + } 544 + 545 + regs->is_user = 1; 546 + if (ptrace(PTRACE_GETREGS, pid, 0, regs->gp)) { 547 + printk(UM_KERN_ERR "%s - PTRACE_GETREGS failed, errno = %d\n", 548 + __func__, errno); 549 + fatal_sigsegv(); 550 + } 551 + 552 + if (get_fp_registers(pid, regs->fp)) { 553 + printk(UM_KERN_ERR "%s - get_fp_registers failed, errno = %d\n", 554 + __func__, errno); 555 + fatal_sigsegv(); 556 + } 557 + 558 + if (WIFSTOPPED(status)) { 559 + sig = WSTOPSIG(status); 560 + 561 + /* 562 + * These signal handlers need the si argument 563 + * and SIGSEGV needs the faultinfo. 564 + * The SIGIO and SIGALARM handlers which constitute 565 + * the majority of invocations, do not use it. 
566 + */ 567 + switch (sig) { 568 + case SIGSEGV: 569 + get_skas_faultinfo(pid, 570 + &regs->faultinfo); 571 + fallthrough; 572 + case SIGTRAP: 573 + case SIGILL: 574 + case SIGBUS: 575 + case SIGFPE: 576 + case SIGWINCH: 577 + ptrace(PTRACE_GETSIGINFO, pid, 0, 578 + (struct siginfo *)&si_ptrace); 579 + si = &si_ptrace; 580 + break; 581 + default: 582 + si = NULL; 583 + break; 584 + } 585 + } else { 586 + sig = 0; 587 + } 652 588 } 653 589 654 590 UPT_SYSCALL_NR(regs) = -1; /* Assume: It's not a syscall */ 655 591 656 - if (WIFSTOPPED(status)) { 657 - int sig = WSTOPSIG(status); 658 - 659 - /* These signal handlers need the si argument. 660 - * The SIGIO and SIGALARM handlers which constitute the 661 - * majority of invocations, do not use it. 662 - */ 592 + if (sig) { 663 593 switch (sig) { 664 594 case SIGSEGV: 665 - case SIGTRAP: 666 - case SIGILL: 667 - case SIGBUS: 668 - case SIGFPE: 669 - case SIGWINCH: 670 - ptrace(PTRACE_GETSIGINFO, pid, 0, (struct siginfo *)&si); 671 - break; 672 - } 673 - 674 - switch (sig) { 675 - case SIGSEGV: 676 - if (PTRACE_FULL_FAULTINFO) { 677 - get_skas_faultinfo(pid, 678 - &regs->faultinfo); 679 - (*sig_info[SIGSEGV])(SIGSEGV, (struct siginfo *)&si, 595 + if (using_seccomp || PTRACE_FULL_FAULTINFO) 596 + (*sig_info[SIGSEGV])(SIGSEGV, 597 + (struct siginfo *)si, 680 598 regs, NULL); 681 - } 682 - else handle_segv(pid, regs); 599 + else 600 + segv(regs->faultinfo, 0, 1, NULL, NULL); 601 + 602 + break; 603 + case SIGSYS: 604 + handle_syscall(regs); 683 605 break; 684 606 case SIGTRAP + 0x80: 685 607 handle_trap(pid, regs); 686 608 break; 687 609 case SIGTRAP: 688 - relay_signal(SIGTRAP, (struct siginfo *)&si, regs, NULL); 610 + relay_signal(SIGTRAP, (struct siginfo *)si, regs, NULL); 689 611 break; 690 612 case SIGALRM: 691 613 break; ··· 769 543 case SIGFPE: 770 544 case SIGWINCH: 771 545 block_signals_trace(); 772 - (*sig_info[sig])(sig, (struct siginfo *)&si, regs, NULL); 546 + (*sig_info[sig])(sig, (struct siginfo *)si, regs, 
NULL); 773 547 unblock_signals_trace(); 774 548 break; 775 549 default:
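In seccomp mode the kernel side and the stub no longer rendezvous through ptrace stops; `wait_stub_done_seccomp()` and the stub instead hand a shared `stub_data` page back and forth through a futex word (`proc_data->futex`, initialized to `FUTEX_IN_CHILD` above). The sketch below models that ownership handshake in plain C. Only `FUTEX_IN_CHILD` is named in the diff; `FUTEX_IN_KERNEL`, the numeric values, and the field names besides `futex`/`syscall_data_len` are assumptions for illustration, and the real code blocks with `futex_wait()`/`futex_wake()` instead of returning.

```c
#include <assert.h>

/* Values are assumptions for this sketch; only FUTEX_IN_CHILD is named
 * in the diff. */
enum { FUTEX_IN_KERNEL = 0, FUTEX_IN_CHILD = 1 };

struct stub_data_sketch {
	int futex;            /* who owns the shared page right now   */
	int syscall_data_len; /* queued syscalls for the stub to run  */
	int completed;        /* how many the stub has executed       */
};

/* Kernel side of the handshake: queue work, flip ownership.
 * (The real code then futex_wait()s until the stub flips it back.) */
static void kernel_hand_over(struct stub_data_sketch *d, int n_syscalls)
{
	d->syscall_data_len = n_syscalls;
	d->futex = FUTEX_IN_CHILD;	/* real code: futex_wake() */
}

/* Stub side: only acts while it owns the page, then hands it back. */
static void stub_run(struct stub_data_sketch *d)
{
	if (d->futex != FUTEX_IN_CHILD)
		return;			/* not our turn */
	d->completed += d->syscall_data_len;
	d->syscall_data_len = 0;
	d->futex = FUTEX_IN_KERNEL;	/* real code: futex_wake() the kernel */
}
```

The point of the design is visible even in this toy: neither side ever touches the page while the other owns it, so no ptrace stop is needed to serialize access.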
+193 -2
arch/um/os-Linux/start_up.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 /* 3 + * Copyright (C) 2021 Benjamin Berg <benjamin@sipsolutions.net> 3 4 * Copyright (C) 2000 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) 4 5 */ 5 6 ··· 25 24 #include <kern_util.h> 26 25 #include <mem_user.h> 27 26 #include <ptrace_user.h> 27 + #include <stdbool.h> 28 + #include <stub-data.h> 29 + #include <sys/prctl.h> 30 + #include <linux/seccomp.h> 31 + #include <linux/filter.h> 32 + #include <sysdep/mcontext.h> 33 + #include <sysdep/stub.h> 28 34 #include <registers.h> 29 35 #include <skas.h> 30 36 #include "internal.h" ··· 232 224 check_sysemu(); 233 225 } 234 226 227 + extern unsigned long host_fp_size; 228 + extern unsigned long exec_regs[MAX_REG_NR]; 229 + extern unsigned long *exec_fp_regs; 230 + 231 + __initdata static struct stub_data *seccomp_test_stub_data; 232 + 233 + static void __init sigsys_handler(int sig, siginfo_t *info, void *p) 234 + { 235 + ucontext_t *uc = p; 236 + 237 + /* Stow away the location of the mcontext in the stack */ 238 + seccomp_test_stub_data->mctx_offset = (unsigned long)&uc->uc_mcontext - 239 + (unsigned long)&seccomp_test_stub_data->sigstack[0]; 240 + 241 + /* Prevent libc from clearing memory (mctx_offset in particular) */ 242 + syscall(__NR_exit, 0); 243 + } 244 + 245 + static int __init seccomp_helper(void *data) 246 + { 247 + static struct sock_filter filter[] = { 248 + BPF_STMT(BPF_LD | BPF_W | BPF_ABS, 249 + offsetof(struct seccomp_data, nr)), 250 + BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clock_nanosleep, 1, 0), 251 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW), 252 + BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP), 253 + }; 254 + static struct sock_fprog prog = { 255 + .len = ARRAY_SIZE(filter), 256 + .filter = filter, 257 + }; 258 + struct sigaction sa; 259 + 260 + /* close_range is needed for the stub */ 261 + if (stub_syscall3(__NR_close_range, 1, ~0U, 0)) 262 + exit(1); 263 + 264 + set_sigstack(seccomp_test_stub_data->sigstack, 265 + 
sizeof(seccomp_test_stub_data->sigstack)); 266 + 267 + sa.sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO; 268 + sa.sa_sigaction = (void *) sigsys_handler; 269 + sa.sa_restorer = NULL; 270 + if (sigaction(SIGSYS, &sa, NULL) < 0) 271 + exit(2); 272 + 273 + prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); 274 + if (syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 275 + SECCOMP_FILTER_FLAG_TSYNC, &prog) != 0) 276 + exit(3); 277 + 278 + sleep(0); 279 + 280 + /* Never reached. */ 281 + _exit(4); 282 + } 283 + 284 + static bool __init init_seccomp(void) 285 + { 286 + int pid; 287 + int status; 288 + int n; 289 + unsigned long sp; 290 + 291 + /* 292 + * We check that we can install a seccomp filter and then exit(0) 293 + * from a trapped syscall. 294 + * 295 + * Note that we cannot verify that no seccomp filter already exists 296 + * for a syscall that results in the process/thread to be killed. 297 + */ 298 + 299 + os_info("Checking that seccomp filters can be installed..."); 300 + 301 + seccomp_test_stub_data = mmap(0, sizeof(*seccomp_test_stub_data), 302 + PROT_READ | PROT_WRITE, 303 + MAP_SHARED | MAP_ANON, 0, 0); 304 + 305 + /* Use the syscall data area as stack, we just need something */ 306 + sp = (unsigned long)&seccomp_test_stub_data->syscall_data + 307 + sizeof(seccomp_test_stub_data->syscall_data) - 308 + sizeof(void *); 309 + pid = clone(seccomp_helper, (void *)sp, CLONE_VFORK | CLONE_VM, NULL); 310 + 311 + if (pid < 0) 312 + fatal_perror("check_seccomp : clone failed"); 313 + 314 + CATCH_EINTR(n = waitpid(pid, &status, __WCLONE)); 315 + if (n < 0) 316 + fatal_perror("check_seccomp : waitpid failed"); 317 + 318 + if (WIFEXITED(status) && WEXITSTATUS(status) == 0) { 319 + struct uml_pt_regs *regs; 320 + unsigned long fp_size; 321 + int r; 322 + 323 + /* Fill in the host_fp_size from the mcontext. 
*/ 324 + regs = calloc(1, sizeof(struct uml_pt_regs)); 325 + get_stub_state(regs, seccomp_test_stub_data, &fp_size); 326 + host_fp_size = fp_size; 327 + free(regs); 328 + 329 + /* Repeat with the correct size */ 330 + regs = calloc(1, sizeof(struct uml_pt_regs) + host_fp_size); 331 + r = get_stub_state(regs, seccomp_test_stub_data, NULL); 332 + 333 + /* Store as the default startup registers */ 334 + exec_fp_regs = malloc(host_fp_size); 335 + memcpy(exec_regs, regs->gp, sizeof(exec_regs)); 336 + memcpy(exec_fp_regs, regs->fp, host_fp_size); 337 + 338 + munmap(seccomp_test_stub_data, sizeof(*seccomp_test_stub_data)); 339 + 340 + free(regs); 341 + 342 + if (r) { 343 + os_info("failed to fetch registers: %d\n", r); 344 + return false; 345 + } 346 + 347 + os_info("OK\n"); 348 + return true; 349 + } 350 + 351 + if (WIFEXITED(status) && WEXITSTATUS(status) == 2) 352 + os_info("missing\n"); 353 + else 354 + os_info("error\n"); 355 + 356 + munmap(seccomp_test_stub_data, sizeof(*seccomp_test_stub_data)); 357 + return false; 358 + } 359 + 360 + 235 361 static void __init check_coredump_limit(void) 236 362 { 237 363 struct rlimit lim; ··· 420 278 } 421 279 } 422 280 281 + static int seccomp_config __initdata; 282 + 283 + static int __init uml_seccomp_config(char *line, int *add) 284 + { 285 + *add = 0; 286 + 287 + if (strcmp(line, "off") == 0) 288 + seccomp_config = 0; 289 + else if (strcmp(line, "auto") == 0) 290 + seccomp_config = 1; 291 + else if (strcmp(line, "on") == 0) 292 + seccomp_config = 2; 293 + else 294 + fatal("Invalid seccomp option '%s', expected on/auto/off\n", 295 + line); 296 + 297 + return 0; 298 + } 299 + 300 + __uml_setup("seccomp=", uml_seccomp_config, 301 + "seccomp=<on/auto/off>\n" 302 + " Configure whether or not SECCOMP is used. With SECCOMP, userspace\n" 303 + " processes work collaboratively with the kernel instead of being\n" 304 + " traced using ptrace. All syscalls from the application are caught and\n" 305 + " redirected using a signal. 
This signal handler in turn is permitted to\n" 306 + " do the selected set of syscalls to communicate with the UML kernel and\n" 307 + " do the required memory management.\n" 308 + "\n" 309 + " This method is overall faster than the ptrace based userspace, primarily\n" 310 + " because it reduces the number of context switches for (minor) page faults.\n" 311 + "\n" 312 + " However, the SECCOMP filter is not (yet) restrictive enough to prevent\n" 313 + " userspace from reading and writing all physical memory. Userspace\n" 314 + " processes could also trick the stub into disabling SIGALRM which\n" 315 + " prevents it from being interrupted for scheduling purposes.\n" 316 + "\n" 317 + " This is insecure and should only be used with a trusted userspace\n\n" 318 + ); 423 319 424 320 void __init os_early_checks(void) 425 321 { ··· 466 286 /* Print out the core dump limits early */ 467 287 check_coredump_limit(); 468 288 469 - check_ptrace(); 470 - 471 289 /* Need to check this early because mmapping happens before the 472 290 * kernel is running. 473 291 */ 474 292 check_tmpexec(); 293 + 294 + if (seccomp_config) { 295 + if (init_seccomp()) { 296 + using_seccomp = 1; 297 + return; 298 + } 299 + 300 + if (seccomp_config == 2) 301 + fatal("SECCOMP userspace requested but not functional!\n"); 302 + } 303 + 304 + using_seccomp = 0; 305 + check_ptrace(); 475 306 476 307 pid = start_ptraced_child(); 477 308 if (init_pid_registers(pid))
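The four-instruction probe filter in `init_seccomp()` is worth tracing by hand, because its polarity is inverted from a typical allowlist: the one matched syscall (`__NR_clock_nanosleep`, which `sleep(0)` uses) is the one that *traps*, so the helper's `sleep(0)` raises SIGSYS and lands in `sigsys_handler()`, while every other syscall the helper needs (sigaltstack, sigaction, ...) passes through. The following sketch is a tiny interpreter for exactly this program, not a general BPF VM; the `insn` struct is a hand-rolled stand-in for `struct sock_filter`, and the return values are the real `SECCOMP_RET_*` constants.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for <linux/seccomp.h> return values (real numeric values). */
#define RET_ALLOW 0x7fff0000u	/* SECCOMP_RET_ALLOW */
#define RET_TRAP  0x00030000u	/* SECCOMP_RET_TRAP  */

struct insn { char op; uint32_t k; uint8_t jt, jf; };

/* Evaluate the probe from init_seccomp() for a given syscall number.
 * BPF jump semantics: after insn i, the next pc is i + 1 + (jt or jf),
 * so "JEQ k, 1, 0" skips the ALLOW and lands on the TRAP when equal. */
static uint32_t run_probe(uint32_t nr, uint32_t trapped_nr)
{
	const struct insn prog[] = {
		{ 'L', 0, 0, 0 },		/* A = seccomp_data.nr     */
		{ 'J', trapped_nr, 1, 0 },	/* if (A == k) skip 1 insn */
		{ 'R', RET_ALLOW, 0, 0 },
		{ 'R', RET_TRAP, 0, 0 },
	};
	uint32_t acc = 0;

	for (unsigned int pc = 0; pc < 4; pc++) {
		const struct insn *i = &prog[pc];

		switch (i->op) {
		case 'L': acc = nr; break;
		case 'J': pc += (acc == i->k) ? i->jt : i->jf; break;
		case 'R': return i->k;
		}
	}
	return RET_TRAP; /* unreachable for this program */
}
```

With 230 (`__NR_clock_nanosleep` on x86-64) as the trapped number, `run_probe(230, 230)` returns `SECCOMP_RET_TRAP` and any other number returns `SECCOMP_RET_ALLOW`, which is exactly why `seccomp_helper()` ends in `sleep(0)` followed by an unreachable `_exit(4)`.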
+3
arch/x86/um/asm/checksum.h
··· 20 20 */ 21 21 extern __wsum csum_partial(const void *buff, int len, __wsum sum); 22 22 23 + /* Do not call this directly. Declared for export type visibility. */ 24 + extern __visible __wsum csum_partial_copy_generic(const void *src, void *dst, int len); 25 + 23 26 /** 24 27 * csum_fold - Fold and invert a 32bit checksum. 25 28 * sum: 32bit unfolded sum
+4 -4
arch/x86/um/asm/processor.h
··· 21 21 22 22 #include <asm/user.h> 23 23 24 - /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */ 25 - static __always_inline void rep_nop(void) 24 + /* PAUSE is a good thing to insert into busy-wait loops. */ 25 + static __always_inline void native_pause(void) 26 26 { 27 - __asm__ __volatile__("rep;nop": : :"memory"); 27 + __asm__ __volatile__("pause": : :"memory"); 28 28 } 29 29 30 30 static __always_inline void cpu_relax(void) ··· 33 33 time_travel_mode == TT_MODE_EXTERNAL) 34 34 time_travel_ndelay(1); 35 35 else 36 - rep_nop(); 36 + native_pause(); 37 37 } 38 38 39 39 #define task_pt_regs(t) (&(t)->thread.regs)
+217 -1
arch/x86/um/os-Linux/mcontext.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 - #include <sys/ucontext.h> 3 2 #define __FRAME_OFFSETS 3 + #include <linux/errno.h> 4 + #include <linux/string.h> 5 + #include <sys/ucontext.h> 4 6 #include <asm/ptrace.h> 7 + #include <asm/sigcontext.h> 5 8 #include <sysdep/ptrace.h> 6 9 #include <sysdep/mcontext.h> 7 10 #include <arch.h> ··· 21 18 COPY2(UESP, ESP); /* sic */ 22 19 COPY(EBX); COPY(EDX); COPY(ECX); COPY(EAX); 23 20 COPY(EIP); COPY_SEG_CPL3(CS); COPY(EFL); COPY_SEG_CPL3(SS); 21 + #undef COPY2 22 + #undef COPY 23 + #undef COPY_SEG 24 + #undef COPY_SEG_CPL3 24 25 #else 25 26 #define COPY2(X,Y) regs->gp[X/sizeof(unsigned long)] = mc->gregs[REG_##Y] 26 27 #define COPY(X) regs->gp[X/sizeof(unsigned long)] = mc->gregs[REG_##X] ··· 36 29 COPY2(EFLAGS, EFL); 37 30 COPY2(CS, CSGSFS); 38 31 regs->gp[SS / sizeof(unsigned long)] = mc->gregs[REG_CSGSFS] >> 48; 32 + #undef COPY2 33 + #undef COPY 39 34 #endif 40 35 } 41 36 ··· 50 41 #else 51 42 mc->gregs[REG_RIP] = (unsigned long)target; 52 43 #endif 44 + } 45 + 46 + /* Same thing, but the copy macros are turned around. 
*/ 47 + void get_mc_from_regs(struct uml_pt_regs *regs, mcontext_t *mc, int single_stepping) 48 + { 49 + #ifdef __i386__ 50 + #define COPY2(X,Y) mc->gregs[REG_##Y] = regs->gp[X] 51 + #define COPY(X) mc->gregs[REG_##X] = regs->gp[X] 52 + #define COPY_SEG(X) mc->gregs[REG_##X] = regs->gp[X] & 0xffff; 53 + #define COPY_SEG_CPL3(X) mc->gregs[REG_##X] = (regs->gp[X] & 0xffff) | 3; 54 + COPY_SEG(GS); COPY_SEG(FS); COPY_SEG(ES); COPY_SEG(DS); 55 + COPY(EDI); COPY(ESI); COPY(EBP); 56 + COPY2(UESP, ESP); /* sic */ 57 + COPY(EBX); COPY(EDX); COPY(ECX); COPY(EAX); 58 + COPY(EIP); COPY_SEG_CPL3(CS); COPY(EFL); COPY_SEG_CPL3(SS); 59 + #else 60 + #define COPY2(X,Y) mc->gregs[REG_##Y] = regs->gp[X/sizeof(unsigned long)] 61 + #define COPY(X) mc->gregs[REG_##X] = regs->gp[X/sizeof(unsigned long)] 62 + COPY(R8); COPY(R9); COPY(R10); COPY(R11); 63 + COPY(R12); COPY(R13); COPY(R14); COPY(R15); 64 + COPY(RDI); COPY(RSI); COPY(RBP); COPY(RBX); 65 + COPY(RDX); COPY(RAX); COPY(RCX); COPY(RSP); 66 + COPY(RIP); 67 + COPY2(EFLAGS, EFL); 68 + mc->gregs[REG_CSGSFS] = mc->gregs[REG_CSGSFS] & 0xffffffffffffl; 69 + mc->gregs[REG_CSGSFS] |= (regs->gp[SS / sizeof(unsigned long)] & 0xffff) << 48; 70 + #endif 71 + 72 + if (single_stepping) 73 + mc->gregs[REG_EFL] |= X86_EFLAGS_TF; 74 + else 75 + mc->gregs[REG_EFL] &= ~X86_EFLAGS_TF; 76 + } 77 + 78 + #ifdef CONFIG_X86_32 79 + struct _xstate_64 { 80 + struct _fpstate_64 fpstate; 81 + struct _header xstate_hdr; 82 + struct _ymmh_state ymmh; 83 + /* New processor state extensions go here: */ 84 + }; 85 + 86 + /* Not quite the right structures as these contain more information */ 87 + int um_i387_from_fxsr(struct _fpstate_32 *i387, 88 + const struct _fpstate_64 *fxsave); 89 + int um_fxsr_from_i387(struct _fpstate_64 *fxsave, 90 + const struct _fpstate_32 *from); 91 + #else 92 + #define _xstate_64 _xstate 93 + #endif 94 + 95 + static struct _fpstate *get_fpstate(struct stub_data *data, 96 + mcontext_t *mcontext, 97 + int *fp_size) 98 + { 99 + struct 
_fpstate *res; 100 + 101 + /* Assume floating point registers are on the same page */ 102 + res = (void *)(((unsigned long)mcontext->fpregs & 103 + (UM_KERN_PAGE_SIZE - 1)) + 104 + (unsigned long)&data->sigstack[0]); 105 + 106 + if ((void *)res + sizeof(struct _fpstate) > 107 + (void *)data->sigstack + sizeof(data->sigstack)) 108 + return NULL; 109 + 110 + if (res->sw_reserved.magic1 != FP_XSTATE_MAGIC1) { 111 + *fp_size = sizeof(struct _fpstate); 112 + } else { 113 + char *magic2_addr; 114 + 115 + magic2_addr = (void *)res; 116 + magic2_addr += res->sw_reserved.extended_size; 117 + magic2_addr -= FP_XSTATE_MAGIC2_SIZE; 118 + 119 + /* We still need to be within our stack */ 120 + if ((void *)magic2_addr > 121 + (void *)data->sigstack + sizeof(data->sigstack)) 122 + return NULL; 123 + 124 + /* If we do not read MAGIC2, then we did something wrong */ 125 + if (*(__u32 *)magic2_addr != FP_XSTATE_MAGIC2) 126 + return NULL; 127 + 128 + /* Remove MAGIC2 from the size, we do not save/restore it */ 129 + *fp_size = res->sw_reserved.extended_size - 130 + FP_XSTATE_MAGIC2_SIZE; 131 + } 132 + 133 + return res; 134 + } 135 + 136 + int get_stub_state(struct uml_pt_regs *regs, struct stub_data *data, 137 + unsigned long *fp_size_out) 138 + { 139 + mcontext_t *mcontext; 140 + struct _fpstate *fpstate_stub; 141 + struct _xstate_64 *xstate_stub; 142 + int fp_size, xstate_size; 143 + 144 + /* mctx_offset is verified by wait_stub_done_seccomp */ 145 + mcontext = (void *)&data->sigstack[data->mctx_offset]; 146 + 147 + get_regs_from_mc(regs, mcontext); 148 + 149 + fpstate_stub = get_fpstate(data, mcontext, &fp_size); 150 + if (!fpstate_stub) 151 + return -EINVAL; 152 + 153 + #ifdef CONFIG_X86_32 154 + xstate_stub = (void *)&fpstate_stub->_fxsr_env; 155 + xstate_size = fp_size - offsetof(struct _fpstate_32, _fxsr_env); 156 + #else 157 + xstate_stub = (void *)fpstate_stub; 158 + xstate_size = fp_size; 159 + #endif 160 + 161 + if (fp_size_out) 162 + *fp_size_out = xstate_size; 163 + 164 + 
if (xstate_size > host_fp_size) 165 + return -ENOSPC; 166 + 167 + memcpy(&regs->fp, xstate_stub, xstate_size); 168 + 169 + /* We do not need to read the x86_64 FS_BASE/GS_BASE registers as 170 + * we do not permit userspace to set them directly. 171 + */ 172 + 173 + #ifdef CONFIG_X86_32 174 + /* Read the i387 legacy FP registers */ 175 + if (um_fxsr_from_i387((void *)&regs->fp, fpstate_stub)) 176 + return -EINVAL; 177 + #endif 178 + 179 + return 0; 180 + } 181 + 182 + /* Copied because we cannot include regset.h here. */ 183 + struct task_struct; 184 + struct user_regset; 185 + struct membuf { 186 + void *p; 187 + size_t left; 188 + }; 189 + 190 + int fpregs_legacy_get(struct task_struct *target, 191 + const struct user_regset *regset, 192 + struct membuf to); 193 + 194 + int set_stub_state(struct uml_pt_regs *regs, struct stub_data *data, 195 + int single_stepping) 196 + { 197 + mcontext_t *mcontext; 198 + struct _fpstate *fpstate_stub; 199 + struct _xstate_64 *xstate_stub; 200 + int fp_size, xstate_size; 201 + 202 + /* mctx_offset is verified by wait_stub_done_seccomp */ 203 + mcontext = (void *)&data->sigstack[data->mctx_offset]; 204 + 205 + if ((unsigned long)mcontext < (unsigned long)data->sigstack || 206 + (unsigned long)mcontext > 207 + (unsigned long) data->sigstack + 208 + sizeof(data->sigstack) - sizeof(*mcontext)) 209 + return -EINVAL; 210 + 211 + get_mc_from_regs(regs, mcontext, single_stepping); 212 + 213 + fpstate_stub = get_fpstate(data, mcontext, &fp_size); 214 + if (!fpstate_stub) 215 + return -EINVAL; 216 + 217 + #ifdef CONFIG_X86_32 218 + xstate_stub = (void *)&fpstate_stub->_fxsr_env; 219 + xstate_size = fp_size - offsetof(struct _fpstate_32, _fxsr_env); 220 + #else 221 + xstate_stub = (void *)fpstate_stub; 222 + xstate_size = fp_size; 223 + #endif 224 + 225 + memcpy(xstate_stub, &regs->fp, xstate_size); 226 + 227 + #ifdef __i386__ 228 + /* 229 + * On x86, the GDT entries are updated by arch_set_tls. 
230 + */ 231 + 232 + /* Store the i387 legacy FP registers which the host will use */ 233 + if (um_i387_from_fxsr(fpstate_stub, (void *)&regs->fp)) 234 + return -EINVAL; 235 + #else 236 + /* 237 + * On x86_64, we need to sync the FS_BASE/GS_BASE registers using the 238 + * arch specific data. 239 + */ 240 + if (data->arch_data.fs_base != regs->gp[FS_BASE / sizeof(unsigned long)]) { 241 + data->arch_data.fs_base = regs->gp[FS_BASE / sizeof(unsigned long)]; 242 + data->arch_data.sync |= STUB_SYNC_FS_BASE; 243 + } 244 + if (data->arch_data.gs_base != regs->gp[GS_BASE / sizeof(unsigned long)]) { 245 + data->arch_data.gs_base = regs->gp[GS_BASE / sizeof(unsigned long)]; 246 + data->arch_data.sync |= STUB_SYNC_GS_BASE; 247 + } 248 + #endif 249 + 250 + return 0; 53 251 }
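The size logic in `get_fpstate()` above hinges on the x86 signal-frame convention: a frame without `FP_XSTATE_MAGIC1` in `sw_reserved` carries only the legacy FP area, while a frame with it reports an `extended_size` that includes a trailing `FP_XSTATE_MAGIC2` word which is not part of the save/restore payload. This sketch reproduces just that size decision as a pure function; the magic constants are the real `<asm/sigcontext.h>` values, but the malformed-frame check here is a simplified stand-in for the kernel's in-stack bounds and MAGIC2 validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real values from the x86 uapi <asm/sigcontext.h>. */
#define FP_XSTATE_MAGIC1      0x46505853u
#define FP_XSTATE_MAGIC2_SIZE 4u

/* Mirror the size decision from get_fpstate(): legacy-only frames get
 * the fixed legacy size; extended frames get extended_size minus the
 * trailing MAGIC2 word, which is never saved/restored. */
static int xstate_payload_size(uint32_t magic1, uint32_t extended_size,
			       size_t legacy_size)
{
	if (magic1 != FP_XSTATE_MAGIC1)
		return (int)legacy_size;

	/* Simplified sanity check standing in for the kernel's in-stack
	 * bounds check and MAGIC2 verification. */
	if (extended_size < legacy_size + FP_XSTATE_MAGIC2_SIZE)
		return -1;

	return (int)(extended_size - FP_XSTATE_MAGIC2_SIZE);
}
```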
+57 -19
arch/x86/um/ptrace.c
··· 25 25 return tmp; 26 26 } 27 27 28 - static inline unsigned long twd_fxsr_to_i387(struct user_fxsr_struct *fxsave) 28 + static inline unsigned long 29 + twd_fxsr_to_i387(const struct user_fxsr_struct *fxsave) 29 30 { 30 31 struct _fpxreg *st = NULL; 31 32 unsigned long twd = (unsigned long) fxsave->twd; ··· 70 69 return ret; 71 70 } 72 71 73 - /* Get/set the old 32bit i387 registers (pre-FPX) */ 74 - static int fpregs_legacy_get(struct task_struct *target, 75 - const struct user_regset *regset, 76 - struct membuf to) 72 + /* 73 + * Get/set the old 32bit i387 registers (pre-FPX) 74 + * 75 + * We provide simple wrappers for mcontext.c, they are only defined locally 76 + * because mcontext.c is userspace facing and needs to a different definition 77 + * of the structures. 78 + */ 79 + static int _um_i387_from_fxsr(struct membuf to, 80 + const struct user_fxsr_struct *fxsave) 77 81 { 78 - struct user_fxsr_struct *fxsave = (void *)target->thread.regs.regs.fp; 79 82 int i; 80 83 81 84 membuf_store(&to, (unsigned long)fxsave->cwd | 0xffff0000ul); ··· 96 91 return 0; 97 92 } 98 93 99 - static int fpregs_legacy_set(struct task_struct *target, 94 + int um_i387_from_fxsr(struct user_i387_struct *i387, 95 + const struct user_fxsr_struct *fxsave); 96 + 97 + int um_i387_from_fxsr(struct user_i387_struct *i387, 98 + const struct user_fxsr_struct *fxsave) 99 + { 100 + struct membuf to = { 101 + .p = i387, 102 + .left = sizeof(*i387), 103 + }; 104 + 105 + return _um_i387_from_fxsr(to, fxsave); 106 + } 107 + 108 + static int fpregs_legacy_get(struct task_struct *target, 100 109 const struct user_regset *regset, 101 - unsigned int pos, unsigned int count, 102 - const void *kbuf, const void __user *ubuf) 110 + struct membuf to) 103 111 { 104 112 struct user_fxsr_struct *fxsave = (void *)target->thread.regs.regs.fp; 105 - const struct user_i387_struct *from; 106 - struct user_i387_struct buf; 107 - int i; 108 113 109 - if (ubuf) { 110 - if (copy_from_user(&buf, ubuf, sizeof(buf))) 
111 - return -EFAULT; 112 - from = &buf; 113 - } else { 114 - from = kbuf; 115 - } 114 + return _um_i387_from_fxsr(to, fxsave); 115 + } 116 + 117 + int um_fxsr_from_i387(struct user_fxsr_struct *fxsave, 118 + const struct user_i387_struct *from); 119 + 120 + int um_fxsr_from_i387(struct user_fxsr_struct *fxsave, 121 + const struct user_i387_struct *from) 122 + { 123 + int i; 116 124 117 125 fxsave->cwd = (unsigned short)(from->cwd & 0xffff); 118 126 fxsave->swd = (unsigned short)(from->swd & 0xffff); ··· 142 124 } 143 125 144 126 return 0; 127 + } 128 + 129 + static int fpregs_legacy_set(struct task_struct *target, 130 + const struct user_regset *regset, 131 + unsigned int pos, unsigned int count, 132 + const void *kbuf, const void __user *ubuf) 133 + { 134 + struct user_fxsr_struct *fxsave = (void *)target->thread.regs.regs.fp; 135 + const struct user_i387_struct *from; 136 + struct user_i387_struct buf; 137 + 138 + if (ubuf) { 139 + if (copy_from_user(&buf, ubuf, sizeof(buf))) 140 + return -EFAULT; 141 + from = &buf; 142 + } else { 143 + from = kbuf; 144 + } 145 + 146 + return um_fxsr_from_i387(fxsave, from); 145 147 } 146 148 #endif 147 149
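The ptrace.c hunk above locally re-declares `struct membuf` "because we cannot include regset.h here"; the helper `membuf_store()` the regset getters rely on is just a bounds-checked cursor over a caller-supplied buffer. This sketch re-creates the pattern for illustration; the real kernel helper is a type-dispatching macro over `membuf_write()` in `<linux/regset.h>`, so names and the exact return contract here are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal re-creation of the kernel's membuf cursor (linux/regset.h):
 * a pointer plus the space remaining, never overflowing. */
struct membuf {
	void *p;
	size_t left;
};

/* Copy up to n bytes, advance the cursor, return the space left.
 * A stand-in for membuf_write(); the real macro membuf_store() wraps
 * this for typed values. */
static size_t membuf_store_bytes(struct membuf *to, const void *src, size_t n)
{
	if (n > to->left)
		n = to->left;	/* silently truncate at the end */
	memcpy(to->p, src, n);
	to->p = (char *)to->p + n;
	to->left -= n;
	return to->left;
}
```

The design choice this buys the kernel is visible in `_um_i387_from_fxsr()`: the conversion code emits fields one by one without ever tracking `pos`/`count` itself, which is exactly why the old `fpregs_legacy_get()` signature with explicit offsets could be dropped.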
+2
arch/x86/um/shared/sysdep/kernel-offsets.h
··· 4 4 #include <linux/elf.h> 5 5 #include <linux/crypto.h> 6 6 #include <linux/kbuild.h> 7 + #include <linux/audit.h> 7 8 #include <asm/mman.h> 9 + #include <asm/seccomp.h> 8 10 9 11 /* workaround for a warning with -Wmissing-prototypes */ 10 12 void foo(void);
+9
arch/x86/um/shared/sysdep/mcontext.h
··· 6 6 #ifndef __SYS_SIGCONTEXT_X86_H 7 7 #define __SYS_SIGCONTEXT_X86_H 8 8 9 + #include <stub-data.h> 10 + 9 11 extern void get_regs_from_mc(struct uml_pt_regs *, mcontext_t *); 12 + extern void get_mc_from_regs(struct uml_pt_regs *regs, mcontext_t *mc, 13 + int single_stepping); 14 + 15 + extern int get_stub_state(struct uml_pt_regs *regs, struct stub_data *data, 16 + unsigned long *fp_size_out); 17 + extern int set_stub_state(struct uml_pt_regs *regs, struct stub_data *data, 18 + int single_stepping); 10 19 11 20 #ifdef __i386__ 12 21
+23
arch/x86/um/shared/sysdep/stub-data.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef __ARCH_STUB_DATA_H 3 + #define __ARCH_STUB_DATA_H 4 + 5 + #ifdef __i386__ 6 + #include <generated/asm-offsets.h> 7 + #include <asm/ldt.h> 8 + 9 + struct stub_data_arch { 10 + int sync; 11 + struct user_desc tls[UM_KERN_GDT_ENTRY_TLS_ENTRIES]; 12 + }; 13 + #else 14 + #define STUB_SYNC_FS_BASE (1 << 0) 15 + #define STUB_SYNC_GS_BASE (1 << 1) 16 + struct stub_data_arch { 17 + int sync; 18 + unsigned long fs_base; 19 + unsigned long gs_base; 20 + }; 21 + #endif 22 + 23 + #endif /* __ARCH_STUB_DATA_H */
+2
arch/x86/um/shared/sysdep/stub.h
··· 13 13 14 14 extern void stub_segv_handler(int, siginfo_t *, void *); 15 15 extern void stub_syscall_handler(void); 16 + extern void stub_signal_interrupt(int, siginfo_t *, void *); 17 + extern void stub_signal_restorer(void);
+13
arch/x86/um/shared/sysdep/stub_32.h
··· 131 131 "call *%%eax ;" \ 132 132 :: "i" ((1 + STUB_DATA_PAGES) * UM_KERN_PAGE_SIZE), \ 133 133 "i" (&fn)) 134 + 135 + static __always_inline void 136 + stub_seccomp_restore_state(struct stub_data_arch *arch) 137 + { 138 + for (int i = 0; i < sizeof(arch->tls) / sizeof(arch->tls[0]); i++) { 139 + if (arch->sync & (1 << i)) 140 + stub_syscall1(__NR_set_thread_area, 141 + (unsigned long) &arch->tls[i]); 142 + } 143 + 144 + arch->sync = 0; 145 + } 146 + 134 147 #endif
+17
arch/x86/um/shared/sysdep/stub_64.h
··· 10 10 #include <sysdep/ptrace_user.h> 11 11 #include <generated/asm-offsets.h> 12 12 #include <linux/stddef.h> 13 + #include <asm/prctl.h> 13 14 14 15 #define STUB_MMAP_NR __NR_mmap 15 16 #define MMAP_OFFSET(o) (o) ··· 135 134 "call *%%rax ;" \ 136 135 :: "i" ((1 + STUB_DATA_PAGES) * UM_KERN_PAGE_SIZE), \ 137 136 "i" (&fn)) 137 + 138 + static __always_inline void 139 + stub_seccomp_restore_state(struct stub_data_arch *arch) 140 + { 141 + /* 142 + * We could use _writefsbase_u64/_writegsbase_u64 if the host reports 143 + * support in the hwcaps (HWCAP2_FSGSBASE). 144 + */ 145 + if (arch->sync & STUB_SYNC_FS_BASE) 146 + stub_syscall2(__NR_arch_prctl, ARCH_SET_FS, arch->fs_base); 147 + if (arch->sync & STUB_SYNC_GS_BASE) 148 + stub_syscall2(__NR_arch_prctl, ARCH_SET_GS, arch->gs_base); 149 + 150 + arch->sync = 0; 151 + } 152 + 138 153 #endif
+19 -7
arch/x86/um/tls_32.c
··· 12 12 #include <skas.h> 13 13 #include <sysdep/tls.h> 14 14 #include <asm/desc.h> 15 + #include <stub-data.h> 15 16 16 17 /* 17 18 * If needed we can detect when it's uninitialized. ··· 22 21 static int host_supports_tls = -1; 23 22 int host_gdt_entry_tls_min; 24 23 25 - static int do_set_thread_area(struct user_desc *info) 24 + static int do_set_thread_area(struct task_struct *task, struct user_desc *info) 26 25 { 27 26 int ret; 28 - u32 cpu; 29 27 30 - cpu = get_cpu(); 31 - ret = os_set_thread_area(info, userspace_pid[cpu]); 32 - put_cpu(); 28 + if (info->entry_number < host_gdt_entry_tls_min || 29 + info->entry_number >= host_gdt_entry_tls_min + GDT_ENTRY_TLS_ENTRIES) 30 + return -EINVAL; 31 + 32 + if (using_seccomp) { 33 + int idx = info->entry_number - host_gdt_entry_tls_min; 34 + struct stub_data *data = (void *)task->mm->context.id.stack; 35 + 36 + data->arch_data.tls[idx] = *info; 37 + data->arch_data.sync |= BIT(idx); 38 + 39 + return 0; 40 + } 41 + 42 + ret = os_set_thread_area(info, task->mm->context.id.pid); 33 43 34 44 if (ret) 35 45 printk(KERN_ERR "PTRACE_SET_THREAD_AREA failed, err = %d, " ··· 109 97 if (!(flags & O_FORCE) && curr->flushed) 110 98 continue; 111 99 112 - ret = do_set_thread_area(&curr->tls); 100 + ret = do_set_thread_area(current, &curr->tls); 113 101 if (ret) 114 102 goto out; 115 103 ··· 287 275 return -EFAULT; 288 276 } 289 277 290 - ret = do_set_thread_area(&info); 278 + ret = do_set_thread_area(current, &info); 291 279 if (ret) 292 280 return ret; 293 281 return set_tls_entry(current, &info, idx, 1);
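In the seccomp path above, do_set_thread_area() no longer performs the update itself: it stages the descriptor in the shared stub_data page and sets the slot's sync bit, and the stub's stub_seccomp_restore_state() (earlier in this series) applies every flagged slot and clears the mask before the child resumes. The two halves of that handshake can be sketched generically as follows (struct and function names are illustrative, not the kernel's):

```c
#include <assert.h>

#define N_TLS_SLOTS 3

/* Simplified shape of the i386 stub_data_arch: one pending-update bit
 * per TLS slot, plus the staged values. */
struct tls_sync {
	int sync;
	unsigned long tls[N_TLS_SLOTS];
};

/* Kernel side: stage the new descriptor and flag the slot dirty. */
static void stage_tls(struct tls_sync *s, int idx, unsigned long desc)
{
	s->tls[idx] = desc;
	s->sync |= 1 << idx;
}

/* Stub side: apply every dirty slot (here: copy into 'live', where the
 * real stub issues set_thread_area), then clear the mask. */
static void restore_tls(struct tls_sync *s, unsigned long *live)
{
	int i;

	for (i = 0; i < N_TLS_SLOTS; i++) {
		if (s->sync & (1 << i))
			live[i] = s->tls[i];
	}
	s->sync = 0;
}
```

Batching updates this way lets the kernel side avoid a per-change syscall into the child; everything pending is applied in one pass when the child next runs.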