commits
* anonvma:
anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
anon_vma: clone the anon_vma chain in the right order
vma_adjust: fix the copying of anon_vma chains
Simplify and comment on anon_vma re-use for anon_vma_prepare()
* master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
ARM: Fix ioremap_cached()/ioremap_wc() for SMP platforms
ARM: 6043/1: AT91 slow-clock resume: Don't wait for a disabled PLL to lock
ARM: 6031/1: fix Thumb-2 decompressor
ARM: 6029/1: ep93xx: gpio.c: local functions should be static
ARM: 6028/1: ARM: add MAINTAINERS for U300
ARM: 6024/1: bcmring: fix missing down on semaphore in dma.c
MXC: mach_armadillo5x0: Add USB Host support.
ARM mach-mx3: duplicated include
ARM mach-mx3: duplicated include
imx31: add watchdog device on litekit board.
imx3: Add watchdog platform device support
MXC: mach-mx31_3ds: add support for freescale mc13783 power management device.
MXC: mach-mx31_3ds: Add SPI1 device support.
MXC: mach-mx31_3ds: Add support for on board NAND Flash.
MXC: mach-mx31_3ds: Update variable names over recent mach name modification.
imx31: fix parent clock for rtc
i.MX51: remove NFC AXI static mapping
i.MX51: determine silicon revision dynamically
i.MX51: map TZIC dynamically
i.MX51: Use correct clock for gpt
...
Otherwise we might be mapping in a page in a new mapping, but that page
(through the swapcache) would later be mapped into an old mapping too.
The page->mapping must be the case that works for everybody, not just
the mapping that happened to page it in first.
Here's the scenario:
- page gets allocated/mapped by process A. Let's call the anon_vma we
associate the page with 'A' to keep it easy to track.
- Process A forks, creating process B. The anon_vma in B is 'B', and has
a chain that looks like 'B' -> 'A'. Everything is fine.
- Swapping happens. The page (with mapping pointing to 'A') gets swapped
out (perhaps not to disk - it's enough to assume that it's just not
mapped any more, and lives entirely in the swap-cache)
- Process B pages it in, which goes like this:
do_swap_page ->
page = lookup_swap_cache(entry);
...
set_pte_at(mm, address, page_table, pte);
page_add_anon_rmap(page, vma, address);
And think about what happens here!
In particular, what happens is that this will now be the "first"
mapping of that page, so page_add_anon_rmap() used to do
if (first)
__page_set_anon_rmap(page, vma, address);
and notice what anon_vma it will use? It will use the anon_vma for
process B!
What happens then? Trivial: process 'A' also pages it in (nothing
happens, it's not the first mapping), and then process 'B' execve's
or exits or unmaps, making anon_vma B go away.
End result: process A has a page that points to anon_vma B, but
anon_vma B does not exist any more. This can go on forever. Forget
about RCU grace periods, forget about locking, forget anything like
that. The bug is simply that page->mapping points to an anon_vma
that was correct at one point, but was _not_ the one that was shared
by all users of that possible mapping.
Changing it to always use the deepest anon_vma in the anonvma chain gets
us to the safest model.
This can be improved in certain cases: if we know the page is private to
just this particular mapping (for example, it's a new page, or it is the
only swapcache entry), we could pick the top (most specific) anon_vma.
But that's a future optimization. Make it _work_ reliably first.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "What do you know, I think you fixed it!" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: make sure the chunk allocator doesn't create zero length chunks
Btrfs: fix data enospc check overflow
Write combining/cached device mappings are not setting the shared bit,
which could potentially cause problems on SMP systems since the cache
lines won't participate in the cache coherency protocol.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
We want to walk the chain in reverse order when cloning it, so that the
order of the result chain will be the same as the order in the source
chain. When we add entries to the chain, they go at the head of the
chain, so we want to add the source head last.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "No, it still oopses" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
quota: Fix possible dq_flags corruption
quota: Hide warnings about writes to the filesystem before quota was turned on
ext3: symlink must be handled via filesystem specific operation
ext2: symlink must be handled via filesystem specific operation
A recent commit allowed for smaller chunks to be created, but didn't
make sure they were always bigger than a stripe. After some divides,
this led to zero length stripes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
at91 slow-clock resume: Don't wait for a disabled PLL to lock.
We run into this problem with the PLLB on the at91: ohci-at91 disables
the PLLB when going to suspend. The slowclock code however tries to do
the same: It saves the PLLB register value and when restoring the value
during resume, it waits for the PLLB to lock again. However the PLL will
never lock and the loop would run into its timeout because the slowclock
code just stored and restored an empty register.
This fixes the problem by only restoring PLLA/PLLB when they were enabled
at suspend time.
Cc: Andrew Victor <avictor.za@gmail.com>
Signed-off-by: Anders Larsen <al@alarsen.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we move the boundaries between two vma's due to things like
mprotect, we need to make sure that the anon_vma of the pages that got
moved from one vma to another gets properly copied around. And that was
not always the case, in this rather hard-to-follow code sequence.
Clarify the code, and fix it so that it copies the anon_vma from the
right source.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "Yeah, not so much this one either" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
udf: add speciffic ->setattr callback
udf: potential integer overflow
dq_flags are modified non-atomically in do_set_dqblk via __set_bit calls and
atomically for example in mark_dquot_dirty or clear_dquot_dirty. Hence a
change done by an atomic operation can be overwritten by a change done by a
non-atomic one. Fix the problem by using atomic bitops even in do_set_dqblk.
Signed-off-by: Andrew Perepechko <andrew.perepechko@sun.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Because we account for reserved space we get from the allocator before we
actually account for allocating delalloc space, we can have a small window where
the amount of "used" space in a space_info is more than the total amount of
space in the space_info. This will cause a overflow in our check, so it will
seem like we have _tons_ of free space, and we'll allow reservations to occur
that will end up larger than the amount of space we have. I've seen users
report ENOSPC panic's in cow_file_range a few times recently, so I tried to
reproduce this problem and found I could reproduce it if I ran one of my tests
in a loop for like 20 minutes. With this patch my test ran all night without
issues. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Conflicts:
arch/arm/mach-mx3/mach-pcm037.c
This changes the anon_vma reuse case to require that we only reuse
simple anon_vma's - ie the case when the vma only has a single anon_vma
associated with it.
This means that a reuse of an anon_vma from an adjacent vma will always
guarantee that both vma's are associated not only with the same
anon_vma, they will also have the same anon_vma chain (of just a single
entry in this case).
And since anon_vma re-use was the only case where the same anon_vma
might be associated with different chains of anon_vma's, we now have the
case that every vma that shares the same anon_vma will always also have
the same chain. That makes it much easier to think about merging vma's
that share the same anon_vma's: you can always just drop the other
anon_vma chain in anon_vma_merge() since you know that they are always
identical.
This also splits up the function to validate the anon_vma re-use, and
adds a lot of commentary about the possible races.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "That didn't fix it" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (36 commits)
MIPS: Calculate proper ebase value for 64-bit kernels
MIPS: Alchemy: DB1200: Remove custom wait implementation
MIPS: Big Sur: Make defconfig more useful.
MIPS: Fix __vmalloc() etc. on MIPS for non-GPL modules
MIPS: Sibyte: Fix M3 TLB exception handler workaround.
MIPS: BCM63xx: Fix build failure in board_bcm963xx.c
MIPS: uasm: Add OR instruction.
MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions.
MIPS: BCM63xx: Initialize gpio_out_low & out_high to current value at boot.
MIPS: BCM63xx: Register SSB SPROM fallback in board's first stage callback
MIPS: BCM63xx: Fix typo in cpu-feature-overrides file.
MIPS: BCM63xx: Add support for second uart.
MIPS: BCM63xx: Fix double gpio registration.
MIPS: BCM63xx: Add DWVS0 board
MIPS: BCM63xx: Add the RTA1025W-16 BCM6348-based board to suppported boards.
MIPS: BCM63xx: Fix BCM6338 and BCM6345 gpio count
MIPS: libgcc.h: Checkpatch cleanup
MIPS: Loongson-2F: Flush the branch target history in BTB and RAS
MIPS: Move signal trampolines off of the stack.
MIPS: Preliminary VDSO
...
generic setattr not longer responsible for quota transfer.
use udf_setattr for all udf's inodes.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
For a root filesystem write to the filesystem before quota is turned on happens
regularly and there's no way around it because of writes to syslog, /etc/mtab,
and similar. So the warning is rather pointless for ordinary users. It's
still useful during development so we just hide the warning behind
__DQUOT_PARANOIA config option.
Signed-off-by: Jan Kara <jack@suse.cz>
setup_leaf_for_split needs to drop the path and search again, and has
checks to see if the item we want to split changed size. But, it misses
the case where the leaf changed and now has enough room for the item
we want to insert.
This adds an extra check to make sure the leaf really needs splitting
before we call btrfs_split_leaf(), which keeps us from trying to split
a leaf with a single item.
btrfs_split_leaf() will blindly split the single item leaf, leaving us
with one good leaf and one empty leaf and then a crash.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
98e12b5a6e05413 ("ARM: Fix decompressor's kernel size estimation for
ROM=y") broke the Thumb-2 decompressor because it added an entry in the
LC0 table but didn't adjust the offset the Thumb-2 code uses to load the
SP from that table. Fix it.
Cc: stable <stable@kernel.org>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This add USB Host capability. The Armadillo 500 board is supplied
with two USB Host connectors driven by the USB OTG and USB Host 2
ports, through two NXP isp 1504 transceivers.
Signed-off-by: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Check correct variable for allocation failure
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
RDMA/cm: Set num_paths when manually assigning path records
IB/cm: Fix device_create() return value check
* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux:
svcrdma: RDMA support not yet compatible with RPC6
The ebase is relative to CKSEG0 not CAC_BASE. On a 32-bit kernel they
are the same thing, for a 64-bit kernel they are not.
It happens to kind of work on a 64-bit kernel as they both reference
the same physical memory. However since the CPU uses the CKSEG0 base,
determining if a J instruction will reach always gives the wrong result
unless we use the same number the CPU uses.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1093/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
bloc->logicalBlockNum is unsigned so it's never less than zero.
When I saw that, it made me worry that "bloc->logicalBlockNum + count"
could overflow. That's why I changed the check for less than zero
to an overflow check. (The test works because "count" is also
unsigned.)
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
generic setattr implementation is no longer responsible for
quota transfer so synlinks must be handled via ext3_setattr.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
This creates the reference to a new snapshot in the same commit as the
snapshot itself. This avoids the need for a second commit in order for a
snapshot to be persistent, and also avoids the problem of "leaking" a
new snapshot tree root if the host crashes before the second commit takes
place.
It is not at all clear to me why it wasn't always done this way. If there
is still a reason for the two-stage {create,finish}_pending_snapshots()
approach I'm missing something! :)
I've been running this for a couple weeks under pretty heavy usage (a few
snapshots per minute) without obvious problems.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The functions ep93xx_gpio_update_int_params and ep93xx_gpio_int_mask
are not exported and should be static. This was overlooked when
moving the code from core.c.
Also, change a comment to better indicate what the code is for.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/mach-mx3/mx31lite-db.c: linux/platform_device.h is included more than once.
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] Update default configuration.
[S390] nss: add missing .previous statement to asm function
[S390] increase default size of vmalloc area
[S390] s390: disable change bit override
[S390] fix io_return critical section cleanup
[S390] sclp_async: potential buffer overflow
[S390] arch/s390/kernel: Add missing unlock
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix typo "numer" -> "number" in alloc.c
nilfs2: Remove an uninitialization warning in nilfs_btree_propagate_v()
nilfs2: fix a wrong type conversion in nilfs_ioctl()
RPC6 requires that it be possible to create endpoints that listen
exclusively for IPv4 or IPv6 connection requests. This is not currently
supported by the RDMA API.
This fixes a server RDMA regression introduced by 37498292a "NFSD:
Create PF_INET6 listener in write_ports".
Signed-off-by: Tom Tucker<tom@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
While playing with the out-of-tree MAE driver module, the system would
panic after a while in the db1200 custom wait code after wakeup due to
a clobbered k0 register being used as target address of a store op.
Remove the custom wait implementation and revert back to the Alchemy-
recommended implementation already set as default.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1092/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: fix up alignf issues
generic setattr implementation is no longer responsible for
quota transfer so synlinks must be handled via ext2_setattr.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Everytime we start a new flushing thread, we init the waitqueue if there isn't a
flushing thread running. The problem with this is we check
space_info->flushing, which we clear right before doing a wake_up on the
flushing waitqueue, which causes problems if we init the waitqueue in the middle
of clearing the flushing flagh and calling wake_up. This is hard to hit, but
the code is wrong anyway, so init the flushing/allocating waitqueue when
creating the space info and let it be. I haven't seen the panic since I've been
using this patch. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This adds myself as maintainer of the U300 machine and associated
system-on-chip drivers.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/mach-mx3/mach-pcm037.c: linux/fsl_devices.h is included more than once.
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
loop: Update mtime when writing using aops
block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
backing-dev: Handle class_create() failure
Block: Fix block/elevator.c elevator_get() off-by-one error
drbd: lc_element_by_index() never returns NULL
cciss: unlock on error path
cfq-iosched: Do not merge queues of BE and IDLE classes
cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
i2o: Remove the dangerous kobj_to_i2o_device macro
block: remove 16 bytes of padding from struct request on 64bits
cfq-iosched: fix a kbuild regression
block: make CONFIG_BLK_CGROUP visible
Remove GENHD_FL_DRIVERFS
block: Export max number of segments and max segment size in sysfs
block: Finalize conversion of block limits functions
block: Fix overrun in lcm() and move it to lib
vfs: improve writeback_inodes_wb()
paride: fix off-by-one test
drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
...
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When manually assigning the path records to use for a connection, save
the number of paths that were set. Otherwise, checks against num_path
will show 0, even though path record data is available.
This was discovered by manually setting the path records from user
space, then querying the kernel to see if the correct path records
were assigned, only to discover that the kernel returned 0 path
records to the query.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Use IS_ERR() instead of comparing to NULL.
Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The intent here is to check the "mfrpl->mapped_page_list" allocation.
We checked "mfrpl->ibfrpl.page_list" earlier.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cap.max_inline_data is incorrectly set in init_attr instead of attr.
Set it in attr so subsequent init_attr.cap assignment will get the
correct value.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fixes the typo found in a warning message of a persistent object
allocator function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Force MSI irq handlers to run with interrupts disabled
- pcmcia_align() used a "start" variable twice. That's obviously a bad
idea.
- pcmcia_common_resource() needs the current "start" parameter being
passed, instead of res->start.
- pcmcia_common_resource() doesn't use the size and align parameters,
so get rid of those.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Pagecache pages should be allocated with __page_cache_alloc, so they
obey pagecache memory policies.
add_to_page_cache_lru is exported, so it should be used. Benefits over
using a private pagevec: neater code, 128 bytes fewer stack used, percpu
lru ordering is preserved, and finally don't need to flush pagevec
before returning so batching may be shared with other LRU insertions.
Signed-off-by: Nick Piggin <npiggin@suse.de>:
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Added missing down on the memMap->lock semaphore. Also fixed a return
statement so that we always exit with an up (i.e. early exit via return
is not allowed)
Signed-off-by: Leo Hao Chen <leochen@broadcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds support for SoC build-in watchdog device on litekit
board.
Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (29 commits)
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
drm/nv50: implement gpio set/get routines
drm/nv50: parse/use some more de-magiced parts of gpio table entries
drm/nouveau: store raw gpio table entry in bios gpio structs
drm/nv40: Init some tiling-related PGRAPH state.
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
drm/nv50: another dodgy DP hack
drm/nv50: punt hotplug irq handling out to workqueue
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
drm/nv50: Allow using the NVA3 new compute class.
drm/nv50: cleanup properly if PDISPLAY init fails
drm/nouveau: fixup the init failure paths some more
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
drm/nv40: add LVDS table quirk for Dell Latitude D620
drm/nv40: rework lvds table parsing
drm/nouveau: detect vram amount once, and save the value
drm/nouveau: remove some unused members from drm_nouveau_private
drm/nouveau: Make use of TTM busy_placements.
drm/nv50: add more 0x100c80 flushy magic
drm/nv50: fix fbcon when framebuffer above 4GiB mark
...
When CFQ dispatches requests forcefully due to a barrier or changing iosched,
it runs through all cfqq's dispatching requests and then expires each queue.
However, it does not activate a cfqq before flushing its IOs resulting in
using stale values for computing slice_used.
This patch fixes it by calling activate queue before flushing reuqests from
each queue.
This is useful mostly for barrier requests because when the iosched is changing
it really doesnt matter if we have incorrect accounting since we're going to
break down all structures anyway.
We also now expire the current timeslice before moving on with the dispatch
to accurately account slice used for that cfqq.
Signed-off-by: Divyesh Shah<dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The savesys_ipl_nss asm function is put into the .init.text section
however it is missing a ".previous" section which would restore the
previous section.
Luckily all functions in early.c are init functions so it doesn't
matter currently.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
x86/PCI: for host bridge address space collisions, show conflicting resource
frv/PCI: remove redundant warnings
x86/PCI: remove redundant warnings
PCI: don't say we claimed a resource if we failed
PCI quirk: Disable MSI on VIA K8T890 systems
PCI quirk: RS780/RS880: work around missing MSI initialization
PCI quirk: only apply CX700 PCI bus parking quirk if external VT6212L is present
PCI: complain about devices that seem to be broken
PCI: print resources consistently with %pR
PCI: make disabled window printk style match the enabled ones
PCI: break out primary/secondary/subordinate for readability
PCI: for address space collisions, show conflicting resource
resources: add interfaces that return conflict information
PCI: cleanup error return for pcix get and set mmrbc functions
PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functions
PCI: kill off pci_register_set_vga_state() symbol export.
PCI: fix return value from pcix_get_max_mmrbc()
`make CONFIG_NILFS2_FS=m M=fs/nilfs2/` will give the following warnings:
fs/nilfs2/btree.c: In function 'nilfs_btree_propagate':
fs/nilfs2/btree.c:1882: warning: 'maxlevel' may be used uninitialized in this function
fs/nilfs2/btree.c:1882: note: 'maxlevel' was declared here
Set maxlevel = 0 to fix it.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Add a MAINTAINERS record for the key management facility.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
ARM: Fix ioremap_cached()/ioremap_wc() for SMP platforms
ARM: 6043/1: AT91 slow-clock resume: Don't wait for a disabled PLL to lock
ARM: 6031/1: fix Thumb-2 decompressor
ARM: 6029/1: ep93xx: gpio.c: local functions should be static
ARM: 6028/1: ARM: add MAINTAINERS for U300
ARM: 6024/1: bcmring: fix missing down on semaphore in dma.c
MXC: mach_armadillo5x0: Add USB Host support.
ARM mach-mx3: duplicated include
ARM mach-mx3: duplicated include
imx31: add watchdog device on litekit board.
imx3: Add watchdog platform device support
MXC: mach-mx31_3ds: add support for freescale mc13783 power management device.
MXC: mach-mx31_3ds: Add SPI1 device support.
MXC: mach-mx31_3ds: Add support for on board NAND Flash.
MXC: mach-mx31_3ds: Update variable names over recent mach name modification.
imx31: fix parent clock for rtc
i.MX51: remove NFC AXI static mapping
i.MX51: determine silicon revision dynamically
i.MX51: map TZIC dynamically
i.MX51: Use correct clock for gpt
...
Otherwise we might be mapping in a page in a new mapping, but that page
(through the swapcache) would later be mapped into an old mapping too.
The page->mapping must be the case that works for everybody, not just
the mapping that happened to page it in first.
Here's the scenario:
- page gets allocated/mapped by process A. Let's call the anon_vma we
associate the page with 'A' to keep it easy to track.
- Process A forks, creating process B. The anon_vma in B is 'B', and has
a chain that looks like 'B' -> 'A'. Everything is fine.
- Swapping happens. The page (with mapping pointing to 'A') gets swapped
out (perhaps not to disk - it's enough to assume that it's just not
mapped any more, and lives entirely in the swap-cache)
- Process B pages it in, which goes like this:
do_swap_page ->
page = lookup_swap_cache(entry);
...
set_pte_at(mm, address, page_table, pte);
page_add_anon_rmap(page, vma, address);
And think about what happens here!
In particular, what happens is that this will now be the "first"
mapping of that page, so page_add_anon_rmap() used to do
if (first)
__page_set_anon_rmap(page, vma, address);
and notice what anon_vma it will use? It will use the anon_vma for
process B!
What happens then? Trivial: process 'A' also pages it in (nothing
happens, it's not the first mapping), and then process 'B' execve's
or exits or unmaps, making anon_vma B go away.
End result: process A has a page that points to anon_vma B, but
anon_vma B does not exist any more. This can go on forever. Forget
about RCU grace periods, forget about locking, forget anything like
that. The bug is simply that page->mapping points to an anon_vma
that was correct at one point, but was _not_ the one that was shared
by all users of that possible mapping.
Changing it to always use the deepest anon_vma in the anonvma chain gets
us to the safest model.
This can be improved in certain cases: if we know the page is private to
just this particular mapping (for example, it's a new page, or it is the
only swapcache entry), we could pick the top (most specific) anon_vma.
But that's a future optimization. Make it _work_ reliably first.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "What do you know, I think you fixed it!" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Write combining/cached device mappings are not setting the shared bit,
which could potentially cause problems on SMP systems since the cache
lines won't participate in the cache coherency protocol.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
We want to walk the chain in reverse order when cloning it, so that the
order of the result chain will be the same as the order in the source
chain. When we add entries to the chain, they go at the head of the
chain, so we want to add the source head last.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "No, it still oopses" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
quota: Fix possible dq_flags corruption
quota: Hide warnings about writes to the filesystem before quota was turned on
ext3: symlink must be handled via filesystem specific operation
ext2: symlink must be handled via filesystem specific operation
at91 slow-clock resume: Don't wait for a disabled PLL to lock.
We run into this problem with the PLLB on the at91: ohci-at91 disables
the PLLB when going to suspend. The slowclock code however tries to do
the same: It saves the PLLB register value and when restoring the value
during resume, it waits for the PLLB to lock again. However the PLL will
never lock and the loop would run into its timeout because the slowclock
code just stored and restored an empty register.
This fixes the problem by only restoring PLLA/PLLB when they were enabled
at suspend time.
Cc: Andrew Victor <avictor.za@gmail.com>
Signed-off-by: Anders Larsen <al@alarsen.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we move the boundaries between two vma's due to things like
mprotect, we need to make sure that the anon_vma of the pages that got
moved from one vma to another gets properly copied around. And that was
not always the case, in this rather hard-to-follow code sequence.
Clarify the code, and fix it so that it copies the anon_vma from the
right source.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "Yeah, not so much this one either" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dq_flags are modified non-atomically in do_set_dqblk via __set_bit calls and
atomically for example in mark_dquot_dirty or clear_dquot_dirty. Hence a
change done by an atomic operation can be overwritten by a change done by a
non-atomic one. Fix the problem by using atomic bitops even in do_set_dqblk.
Signed-off-by: Andrew Perepechko <andrew.perepechko@sun.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Because we account for reserved space we get from the allocator before we
actually account for allocating delalloc space, we can have a small window where
the amount of "used" space in a space_info is more than the total amount of
space in the space_info. This will cause a overflow in our check, so it will
seem like we have _tons_ of free space, and we'll allow reservations to occur
that will end up larger than the amount of space we have. I've seen users
report ENOSPC panic's in cow_file_range a few times recently, so I tried to
reproduce this problem and found I could reproduce it if I ran one of my tests
in a loop for like 20 minutes. With this patch my test ran all night without
issues. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This changes the anon_vma reuse case to require that we only reuse
simple anon_vma's - ie the case when the vma only has a single anon_vma
associated with it.
This means that a reuse of an anon_vma from an adjacent vma will always
guarantee that both vma's are associated not only with the same
anon_vma, they will also have the same anon_vma chain (of just a single
entry in this case).
And since anon_vma re-use was the only case where the same anon_vma
might be associated with different chains of anon_vma's, we now have the
case that every vma that shares the same anon_vma will always also have
the same chain. That makes it much easier to think about merging vma's
that share the same anon_vma's: you can always just drop the other
anon_vma chain in anon_vma_merge() since you know that they are always
identical.
This also splits up the function to validate the anon_vma re-use, and
adds a lot of commentary about the possible races.
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "That didn't fix it" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (36 commits)
MIPS: Calculate proper ebase value for 64-bit kernels
MIPS: Alchemy: DB1200: Remove custom wait implementation
MIPS: Big Sur: Make defconfig more useful.
MIPS: Fix __vmalloc() etc. on MIPS for non-GPL modules
MIPS: Sibyte: Fix M3 TLB exception handler workaround.
MIPS: BCM63xx: Fix build failure in board_bcm963xx.c
MIPS: uasm: Add OR instruction.
MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions.
MIPS: BCM63xx: Initialize gpio_out_low & out_high to current value at boot.
MIPS: BCM63xx: Register SSB SPROM fallback in board's first stage callback
MIPS: BCM63xx: Fix typo in cpu-feature-overrides file.
MIPS: BCM63xx: Add support for second uart.
MIPS: BCM63xx: Fix double gpio registration.
MIPS: BCM63xx: Add DWVS0 board
MIPS: BCM63xx: Add the RTA1025W-16 BCM6348-based board to suppported boards.
MIPS: BCM63xx: Fix BCM6338 and BCM6345 gpio count
MIPS: libgcc.h: Checkpatch cleanup
MIPS: Loongson-2F: Flush the branch target history in BTB and RAS
MIPS: Move signal trampolines off of the stack.
MIPS: Preliminary VDSO
...
For a root filesystem write to the filesystem before quota is turned on happens
regularly and there's no way around it because of writes to syslog, /etc/mtab,
and similar. So the warning is rather pointless for ordinary users. It's
still useful during development so we just hide the warning behind
__DQUOT_PARANOIA config option.
Signed-off-by: Jan Kara <jack@suse.cz>
setup_leaf_for_split needs to drop the path and search again, and has
checks to see if the item we want to split changed size. But, it misses
the case where the leaf changed and now has enough room for the item
we want to insert.
This adds an extra check to make sure the leaf really needs splitting
before we call btrfs_split_leaf(), which keeps us from trying to split
a leaf with a single item.
btrfs_split_leaf() will blindly split the single item leaf, leaving us
with one good leaf and one empty leaf and then a crash.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
98e12b5a6e05413 ("ARM: Fix decompressor's kernel size estimation for
ROM=y") broke the Thumb-2 decompressor because it added an entry in the
LC0 table but didn't adjust the offset the Thumb-2 code uses to load the
SP from that table. Fix it.
Cc: stable <stable@kernel.org>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Check correct variable for allocation failure
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
RDMA/cm: Set num_paths when manually assigning path records
IB/cm: Fix device_create() return value check
The ebase is relative to CKSEG0 not CAC_BASE. On a 32-bit kernel they
are the same thing, for a 64-bit kernel they are not.
It happens to kind of work on a 64-bit kernel as they both reference
the same physical memory. However since the CPU uses the CKSEG0 base,
determining if a J instruction will reach always gives the wrong result
unless we use the same number the CPU uses.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1093/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
bloc->logicalBlockNum is unsigned so it's never less than zero.
When I saw that, it made me worry that "bloc->logicalBlockNum + count"
could overflow. That's why I changed the check for less than zero
to an overflow check. (The test works because "count" is also
unsigned.)
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This creates the reference to a new snapshot in the same commit as the
snapshot itself. This avoids the need for a second commit in order for a
snapshot to be persistent, and also avoids the problem of "leaking" a
new snapshot tree root if the host crashes before the second commit takes
place.
It is not at all clear to me why it wasn't always done this way. If there
is still a reason for the two-stage {create,finish}_pending_snapshots()
approach I'm missing something! :)
I've been running this for a couple weeks under pretty heavy usage (a few
snapshots per minute) without obvious problems.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The functions ep93xx_gpio_update_int_params and ep93xx_gpio_int_mask
are not exported and should be static. This was overlooked when
moving the code from core.c.
Also, change a comment to better indicate what the code is for.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] Update default configuration.
[S390] nss: add missing .previous statement to asm function
[S390] increase default size of vmalloc area
[S390] s390: disable change bit override
[S390] fix io_return critical section cleanup
[S390] sclp_async: potential buffer overflow
[S390] arch/s390/kernel: Add missing unlock
RPC6 requires that it be possible to create endpoints that listen
exclusively for IPv4 or IPv6 connection requests. This is not currently
supported by the RDMA API.
This fixes a server RDMA regression introduced by 37498292a "NFSD:
Create PF_INET6 listener in write_ports".
Signed-off-by: Tom Tucker<tom@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
While playing with the out-of-tree MAE driver module, the system would
panic after a while in the db1200 custom wait code after wakeup due to
a clobbered k0 register being used as target address of a store op.
Remove the custom wait implementation and revert back to the Alchemy-
recommended implementation already set as default.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
To: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/1092/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Everytime we start a new flushing thread, we init the waitqueue if there isn't a
flushing thread running. The problem with this is we check
space_info->flushing, which we clear right before doing a wake_up on the
flushing waitqueue, which causes problems if we init the waitqueue in the middle
of clearing the flushing flagh and calling wake_up. This is hard to hit, but
the code is wrong anyway, so init the flushing/allocating waitqueue when
creating the space info and let it be. I haven't seen the panic since I've been
using this patch. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
loop: Update mtime when writing using aops
block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
backing-dev: Handle class_create() failure
Block: Fix block/elevator.c elevator_get() off-by-one error
drbd: lc_element_by_index() never returns NULL
cciss: unlock on error path
cfq-iosched: Do not merge queues of BE and IDLE classes
cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
i2o: Remove the dangerous kobj_to_i2o_device macro
block: remove 16 bytes of padding from struct request on 64bits
cfq-iosched: fix a kbuild regression
block: make CONFIG_BLK_CGROUP visible
Remove GENHD_FL_DRIVERFS
block: Export max number of segments and max segment size in sysfs
block: Finalize conversion of block limits functions
block: Fix overrun in lcm() and move it to lib
vfs: improve writeback_inodes_wb()
paride: fix off-by-one test
drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
...
When manually assigning the path records to use for a connection, save
the number of paths that were set. Otherwise, checks against num_path
will show 0, even though path record data is available.
This was discovered by manually setting the path records from user
space, then querying the kernel to see if the correct path records
were assigned, only to discover that the kernel returned 0 path
records to the query.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Force MSI irq handlers to run with interrupts disabled
- pcmcia_align() used a "start" variable twice. That's obviously a bad
idea.
- pcmcia_common_resource() needs the current "start" parameter being
passed, instead of res->start.
- pcmcia_common_resource() doesn't use the size and align parameters,
so get rid of those.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Pagecache pages should be allocated with __page_cache_alloc, so they
obey pagecache memory policies.
add_to_page_cache_lru is exported, so it should be used. Benefits over
using a private pagevec: neater code, 128 bytes fewer stack used, percpu
lru ordering is preserved, and finally don't need to flush pagevec
before returning so batching may be shared with other LRU insertions.
Signed-off-by: Nick Piggin <npiggin@suse.de>:
Signed-off-by: Chris Mason <chris.mason@oracle.com>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (29 commits)
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
drm/nv50: implement gpio set/get routines
drm/nv50: parse/use some more de-magiced parts of gpio table entries
drm/nouveau: store raw gpio table entry in bios gpio structs
drm/nv40: Init some tiling-related PGRAPH state.
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
drm/nv50: another dodgy DP hack
drm/nv50: punt hotplug irq handling out to workqueue
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
drm/nv50: Allow using the NVA3 new compute class.
drm/nv50: cleanup properly if PDISPLAY init fails
drm/nouveau: fixup the init failure paths some more
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
drm/nv40: add LVDS table quirk for Dell Latitude D620
drm/nv40: rework lvds table parsing
drm/nouveau: detect vram amount once, and save the value
drm/nouveau: remove some unused members from drm_nouveau_private
drm/nouveau: Make use of TTM busy_placements.
drm/nv50: add more 0x100c80 flushy magic
drm/nv50: fix fbcon when framebuffer above 4GiB mark
...
When CFQ dispatches requests forcefully due to a barrier or changing iosched,
it runs through all cfqq's dispatching requests and then expires each queue.
However, it does not activate a cfqq before flushing its IOs resulting in
using stale values for computing slice_used.
This patch fixes it by calling activate queue before flushing reuqests from
each queue.
This is useful mostly for barrier requests because when the iosched is changing
it really doesnt matter if we have incorrect accounting since we're going to
break down all structures anyway.
We also now expire the current timeslice before moving on with the dispatch
to accurately account slice used for that cfqq.
Signed-off-by: Divyesh Shah<dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The savesys_ipl_nss asm function is put into the .init.text section
however it is missing a ".previous" section which would restore the
previous section.
Luckily all functions in early.c are init functions so it doesn't
matter currently.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
x86/PCI: for host bridge address space collisions, show conflicting resource
frv/PCI: remove redundant warnings
x86/PCI: remove redundant warnings
PCI: don't say we claimed a resource if we failed
PCI quirk: Disable MSI on VIA K8T890 systems
PCI quirk: RS780/RS880: work around missing MSI initialization
PCI quirk: only apply CX700 PCI bus parking quirk if external VT6212L is present
PCI: complain about devices that seem to be broken
PCI: print resources consistently with %pR
PCI: make disabled window printk style match the enabled ones
PCI: break out primary/secondary/subordinate for readability
PCI: for address space collisions, show conflicting resource
resources: add interfaces that return conflict information
PCI: cleanup error return for pcix get and set mmrbc functions
PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functions
PCI: kill off pci_register_set_vga_state() symbol export.
PCI: fix return value from pcix_get_max_mmrbc()
`make CONFIG_NILFS2_FS=m M=fs/nilfs2/` will give the following warnings:
fs/nilfs2/btree.c: In function 'nilfs_btree_propagate':
fs/nilfs2/btree.c:1882: warning: 'maxlevel' may be used uninitialized in this function
fs/nilfs2/btree.c:1882: note: 'maxlevel' was declared here
Set maxlevel = 0 to fix it.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>