commits
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
[x86 setup] Don't rely on the VESA BIOS being register-clean
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: SRQ fixes to enable IPoIB CM
IB/ehca: Fix Small QP regressions
The VESA BIOS is specified to be register-clean. However, we have now
found at least one system which violates that. Thus, be as paranoid
about VESA calls as about everything else.
Huge thanks to Will Simoneau for reporting, diagnosing, and testing
this out on Dell Inspiron 5150.
Cc: Will Simoneau <simoneau@ele.uri.edu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This avoids the recent NFS mount regression (returning EBUSY when
mounting the same filesystem twice with different parameters).
The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails.
Note that this is not the same as specifying nosharecache everywhere
since nosharecache will never attempt to match an existing superblock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix ehca SRQ support so that IPoIB connected mode works:
- Report max_srq > 0 if SRQ is supported
- Report "last wqe reached" asynchronous event when base QP dies;
this is required by the IB spec and IPoIB CM relies on receiving it
when cleaning up.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Page migration currently does not check if the target of the move contains
nodes that that are invalid (if root attempts to migrate pages)
and may try to allocate from invalid nodes if these are specified
leading to oopses.
Return -EINVAL if an offline node is specified.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lguest didn't initialize the kernel stack the way a real i386 kernel
does, and ended up triggering a corner-case in the stack frame checking
that doesn't happen on naive i386, and that the stack dumping didn't
handle quite right.
This makes the frame handling more correct, and tries to clarify the
code at the same time so that it's a bit more obvious what is going on.
Thanks to Rusty Russell for debugging the lguest failure-
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The new Small QP code had a few bugs that would also make it trigger
for non-Small QPs. Fix them.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
For hugepage mappings, the file offset, like the address and size, needs to
be aligned to the size of a hugepage.
In commit 68589bc353037f233fe510ad9ff432338c95db66, the check for this was
moved into prepare_hugepage_range() along with the address and size checks.
But since BenH's rework of the get_unmapped_area() paths leading up to
commit 4b1d89290b62bb2db476c94c82cf7442aab440c8, prepare_hugepage_range()
is only called for MAP_FIXED mappings, not for other mappings. This means
we're no longer ever checking for an aligned offset - I've confirmed that
mmap() will (apparently) succeed with a misaligned offset on both powerpc
and i386 at least.
This patch restores the check, removing it from prepare_hugepage_range()
and putting it back into hugetlbfs_file_mmap(). I'm putting it there,
rather than in the get_unmapped_area() path so it only needs to go in one
place, than separately in the half-dozen or so arch-specific
implementations of hugetlb_get_unmapped_area().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The HPET clocksource in drivers/char/hpet.c was written as generic code
for ia64, but it is not yet ready to replace the native HPET clocksource
implementations that the i386/x86-64 architectures use.
On x86[-64], trying to register this clocksource results in potentially
multiple hpet-based clocksources being registered, and if the ia64 one
is chosen on x86_64 some users have experienced hangs.
Eventually all three architectures may end up using the same code, but
that is not the case right now.
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Ornati <ornati@fastwebnet.it>
Cc: Bob Picco <bob.picco@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4549/1: KS8695: Fix build errors
[ARM] 4546/1: s3c2410: fix architecture typo for s3c2442
[ARM] 4544/1: arm: fix section mismatch in pxa fb
MAINTAINERS, order NETERION alphabetically
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
- cxgb3 engine microcode load
cxgb3 - Fix dev->priv usage
qeth: Drop ARP packages on HiperSockets interface with NOARP attribute.
qeth: provide specific message for OSA-adapters exclusively used
qeth: crash during reboot after failing online setting
qeth: Announce tx checksumming for qeth devices in TSO/EDDP mode
qeth: dont return the return values of void functions.
qeth: enforce a rate limit for inbound scatter gather messages
qeth: ungrouping a device must not be interruptible
netxen: fix crashes during module unload
netxen: Avoid firmware load in PCI probe
PS3: fix the bug that 'ifconfig down' would hang
IOC3: Program UART predividers.
Very old 64bit binutils have .cfi_startproc/endproc, but
no .cfi_rel_offset. Check for .cfi_rel_offset too.
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PCI driver has not been merged yet, so comment out call to
ks8695_init_pci() for now.
Also fix some incorrectly marked __init and __initdata sections.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We find that SB700 and SB800 use the same SMBus device ID as SB600, which is
0x4385, instead of the already submitted 0x4395.
Besides removing the wrong SB700 device ID, add SB800 support to kernel, by
renaming the PCI_DEVICE_ID_ATI_IXP600_SMBUS into
PCI_DEVICE_ID_ATI_SBX00_SMBUS.
Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: clean up task_new_fair()
sched: small schedstat fix
sched: fix wait_start_fair condition in update_stats_wait_end()
sched: call update_curr() in task_tick_fair()
sched: make the scheduler converge to the ideal latency
sched: fix sleeper bonus limit
Load the engine microcode when an interface
is brought up, instead of of doing it when the module
is loaded.
Loosen up tight binding between the driver and the
engine microcode version.
There is no need for microcode update with T3A boards.
Fix the file naming.
Do a better job at logging the loading activity.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Very old binutils (2.12.90...) seem to have trouble with newlines
in assembler macro invocation. They put them into the resulting
argument expansion. In this case this lead to a parse error because
a .rept expression ended up spread over multiple lines. Change the PMDS()
invocation to a single line.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
From: Krzysztof Helt <krzysztof.h1@wp.pl>
This patch fixes a typo in architecture constant name.
The kernel for s3c2442 machines does not build without
this fix.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Drop the URL for DCO (URL is invalid). Also, point to SubmittingPatches
for the current DCO.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] Bump driver versions
ata_piix: implement IOCFG bit18 quirk
libata: implement BROKEN_HPA horkage and apply it to affected drives
sata_promise: FastTrack TX4200 is a second-generation chip
pata_marvell: Add more identifiers
ata_piix: add Satellite U200 to broken suspend list
ata: add ATA_MWDMA* and ATA_SWDMA* defines
ata_piix: IDE mode SATA patch for Intel Tolapai
libata-core: Allow translation setting to fail
cleanup: we have the 'se' and 'curr' entity-pointers already,
no need to use p->se and current->se.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
cxgb3 used netdev_priv() and dev->priv for different purposes.
In 2.6.23, netdev_priv() == dev->priv, cxgb3 needs a fix.
This patch is a partial backport of Dave Miller's changes in the
net-2.6.24 git branch.
Without this fix, cxgb3 crashes on 2.6.23.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fixed wrong expression which enabled watchdogs even if nmi_watchdog kernel
parameter wasn't set. This regression got slightly introduced with commit
b7471c6da94d30d3deadc55986cc38d1ff57f9ca.
Introduced NMI_DISABLED (-1) which allows to switch the value of NMI_DEFAULT
without breaking the APIC NMI watchdog code (again).
Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.
Signed-off-by: Daniel Gollub <dgollub@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix following section mismatch warning:
WARNING: drivers/built-in.o(.text+0x73d0): Section mismatch: reference to .init.data: (between 'pxafb_setup' and 'pxafb_init')
The warning are caused by __devinit pxafb_setup() that refers to a
variable marked __initdata. In a hotplug scenario we would have a
reference to the freed .init.data section. Fix this by declaring
g_options __devinitdata.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
collect_signal: sigqueue_free:
list_del_init(&first->list);
if (!list_empty(&q->list)) {
// not taken
}
q->flags &= ~SIGQUEUE_PREALLOC;
__sigqueue_free(first); __sigqueue_free(q);
Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
obviously bad implications.
In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
Hopefully this can explain these mysterious bug-reports, see
http://marc.info/?t=118766926500003
http://marc.info/?t=118466273000005
Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has an access to the application which opened the problems originally.
Also, this patch removes tasklist lock/unlock, ->siglock is enough.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: taoyue <yue.tao@windriver.com>
Cc: Jeremy Katz <jeremy.katz@windriver.com>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bump the versions for drivers that were modified, but had not already
had a version number bump.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes'
statistics counters missed newly forked tasks and thus had a constant
negative skew. Fix this.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
A network interface can get ARP packets even when the interface has
NOARP specified. In a HiperSockets environment this disturbs receiving
systems when packets are sent on the multicast queue. (E.g. TCP/IP on
z/VM issues messages reporting invalid data on the HiperSockets
interface.)
Qeth will no longer send ARP packets on HiperSockets interface when
interface has the NOARP attribute.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This should fix an oops with PCMCIA PATA devices
http://bugzilla.kernel.org/show_bug.cgi?id=8424
This is not a full fix for the problem, but probably
still the right thing to do.
[ I'm almost certain it's *not* the right thing to do, but it avoids an
oops, and I want comments from others on what the right thing would
actually be.. I suspect we should just remove the use of dma_mask
entirely in this function, and just use coherent_dma_mask. - Linus ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a little problem in Documentation/vm/slabinfo.c
The code is using "%d" in a printf() call to print an 'unsigned long'.
This patch corrects it to use "%lu" instead.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some notebooks need bit18 of IOCFG to be cleared for the drive bay to
work even though the bit is NOOP according to the datasheet. This
patch implement IOCFG bit18 quirk and apply it to Clevo M570U.
http://bugzilla.kernel.org/show_bug.cgi?id=8051
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: D. Angelis <dangelis@beta-cae.gr>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which
is disabled by default at the moment): it relies on se.wait_start_fair
being 0 while update_stats_wait_end() did not recognize a 0 value,
so instead of 'skipping' the initial interval we gave the new child
a maximum boost of +runtime-limit ...
(No impact on the default kernel, but nice to fix for completeness.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Exclusive usage of OSA-cards has been introduced. Even though Linux
does not make use of it, qeth should be prepared to receive a bad RC
for some initialization steps. A meaningful message is now given,
if an OSA-device is set online, even though the OSA-adapter is already
exclusively used by another host.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If the forcedeth driver receives too much work in an interrupt, it
assumes it has a broken hardware with stuck IRQ. It works around the
problem by disabling interrupts on the nic but makes a printk while
holding device spinlog - which isn't smart thing to do if you have
netconsole on the same nic.
This patch moves the printk's out of the spinlock protected area.
Without this patch the machine hangs hard. With this patch everything
still works even when there is significant increase on CPU usage while
using the nic.
Signed-off-by: Timo Jantunen <jeti@iki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The dynamic dma kmalloc creation can run into trouble if a
GFP_ATOMIC allocation is the first one performed for a certain size
of dma kmalloc slab.
- Move the adding of the slab to sysfs into a workqueue
(sysfs does GFP_KERNEL allocations)
- Do not call kmem_cache_destroy() (uses slub_lock)
- Only acquire the slub_lock once and--if we cannot wait--do a trylock.
This introduces a slight risk of the first kmalloc(x, GFP_DMA|GFP_ATOMIC)
for a range of sizes failing due to another process holding the slub_lock.
However, we only need to acquire the spinlock once in order to establish
each power of two DMA kmalloc cache. The possible conflict is with the
slub_lock taken during slab management actions (create / remove slab cache).
It is rather typical that a driver will first fill its buffers using
GFP_KERNEL allocations which will wait until the slub_lock can be acquired.
Drivers will also create its slab caches first outside of an atomic
context before starting to use atomic kmalloc from an interrupt context.
If there are any failures then they will occur early after boot or when
loading of multiple drivers concurrently. Drivers can already accomodate
failures of GFP_ATOMIC for other reasons. Retries will then create the slab.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Mariusz Kozlowski reported lockdep's warning:
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> ---------------------------------
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
> [<c0138eeb>] __lock_acquire+0x949/0x11ac
> [<c01397e7>] lock_acquire+0x99/0xb2
> [<c0452ff3>] _spin_lock+0x35/0x42
> [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> [<c0147a5d>] handle_IRQ_event+0x28/0x59
> [<c01493ca>] handle_level_irq+0xad/0x10b
> [<c0105a13>] do_IRQ+0x93/0xd0
> [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
> #0: (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
>
> stack backtrace:
...
> [<c0452ff3>] _spin_lock+0x35/0x42
> [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> [<c01480fd>] free_irq+0x11b/0x146
> [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
> [<c03bde63>] dev_close+0x57/0x74
...
This shows that a driver's irq handler was running both in hard interrupt
and process contexts with irqs enabled. The latter was done during
free_irq() call and was possible only with CONFIG_DEBUG_SHIRQ enabled.
This was fixed by another patch.
But similar problem is possible with request_irq(): any locks taken from
irq handler could be vulnerable - especially with soft interrupts. This
patch fixes it by disabling local interrupts during handler's run. (It
seems, disabling softirqs should be enough, but it needs more checking
on possible races or other special cases).
Reported-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some drives choke on READ_NATIVE_MAX_ADDRESS[_EXT]. Implement
ATA_HORKAGE_BROKEN_HPA and apply it to affected drives.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
update the fair-clock before using it for the key value.
[ mingo@elte.hu: small cleanups. ]
Signed-off-by: Ting Yang <tingy@cs.umass.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Online setting of a qeth device may fail for instance because of:
- out-of-memory condition when allocating qdio queues
- IDX ACTIVATE problem
- ...
Such a device is still returned in a driver_for_each_device loop
processed in qeth_reboot_event(), which calls
qeth_clear_qdio_buffers(). Make sure qeth_clear_output_buffer() is
called only, if the qdio queues have been successfully allocated
during initialization of a qeth device.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself that already
uses cpu_relax().
For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.
[ I test-built a kernel -- smp_callin() itself got inlined in its only
callsite, smpboot.c:start_secondary() -- and the relevant piece of
code disassembles to the following:
0xc1019704 <start_secondary+12>: mov 0xc144c4c8,%eax
0xc1019709 <start_secondary+17>: test %eax,%eax
0xc101970b <start_secondary+19>: je 0xc1019709 <start_secondary+17>
init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
then we loop over the test of the stale value in the register only,
so these look like real bugs to me. With the fix below, this becomes:
0xc1019706 <start_secondary+14>: pause
0xc1019708 <start_secondary+16>: cmpl $0x0,0xc144c4c8
0xc101970f <start_secondary+23>: je 0xc1019706 <start_secondary+14>
which looks nice and healthy. ]
Thanks to Heiko Carstens for noticing this.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The MAX_PARTIAL checks were supposed to be an optimization. However, slab
shrinking is a manually triggered process either through running slabinfo
or by the kernel calling kmem_cache_shrink.
If one really wants to shrink a slab then all operations should be done
regardless of the size of the partial list. This also fixes an issue that
could surface if the number of partial slabs was initially above MAX_PARTIAL
in kmem_cache_shrink and later drops below MAX_PARTIAL through the
elimination of empty slabs on the partial list (rare). In that case a few
slabs may be left off the partial list (and only be put back when they
are empty).
Signed-off-by: Christoph Lameter <clameter@sgi.com>
This will avoid a possible fault in ecryptfs_sync_page().
In the function, eCryptfs calls sync_page() method of a lower filesystem
without checking its existence. However, there are many filesystems that
don't have this method including network filesystems such as NFS, AFS, and
so forth. They may fail when an eCryptfs page is waiting for lock.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch corrects sata_promise to classify FastTrack TX4200
(DID 3515/3519) as a second-generation chip. Promise's partial-
source FT TX4200 driver confirms this classification.
Treating it as a first-generation chip causes several problems:
1. Detection failures. This is a recent regression triggered by
the hotplug-enabling changes in 2.6.23-rc1.
2. Various "failed to resume link for reset" warnings.
This patch fixes <http://bugzilla.kernel.org/show_bug.cgi?id=8936>.
Thanks to Stephen Ziemba for reporting the bug and for testing the fix.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Tested-by: Stephen Ziemba <sziemba@ecn.purdue.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of
[-gran ... 0 ... +gran].
With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 36 . 40
out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.
to fix this we add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.
( this adds a new field to task_struct, but we can eliminate that
overhead in 2.6.24 by putting all the scheduler timestamps into an
anonymous union. )
with this change in place, chew-max output is fluctuation-less all
around:
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
TSO requires tx checksumming. For non GSO frames in TSO/EDDP mode we
have to manually calculate the checksum.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (edited MACINTOSH_DRIVERS per Geert Uytterhoeven's remark)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
UIO currently contains a rather dubious statement which wants removing.
The actual questions around whether user space code that depends tightly
on kernel GPL code designed to co-work with it are derivative works of
the kernel is extremely complex, and since we don't have space for either
a masters length essay on legal issues or need to start flamewars lets
simply remove the comment and leave law to lawyers
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <tovalds@linux-foundation.org>
Fix cut 'n paste bug in Atmel SPI driver.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Acked-by: David Brownell <david-b@pacbell.net>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This replaces the patch which incorrectly removed the 6145
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There is an Amarok song switch time increase (regression) under
hefty load.
What is happening is that sleeper_bonus is never consumed, and only
rarely goes below runtime_limit, so for the most part, Amarok isn't
getting any bonus at all. We're keeping sleeper_bonus right at
runtime_limit (sched_latency == sched_runtime_limit == 40ms) forever, ie
we don't consume if we're lower that that, and don't add if we're above
it. One Amarok thread waking (or anybody else) will push us past the
threshold, so the next thread waking gets nada, but will reap pain from
the previous thread waking until we drop back to runtime_limit. It
looks to me like under load, some random task gets a bonus, and
everybody else pays, whether deserving or not.
This diff fixed the regression for me at any load rate.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The lguest block device only requests one minor, which means
partitions don't work (eg "root=/dev/lgba1").
Let's follow the crowd and ask for 16.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The default definition in asm-generic conflicts with Alpha's O_DIRECT,
so, like several other arches, it needs to be redefined.
Signed-off-by: Richard Hendersion <rth@twiddle.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is triggered if PCI && !HOTPLUG.
MODPOST vmlinux.o
WARNING: vmlinux.o(.data+0xc910): Section mismatch: reference to .init.text:pci_ite887x_init (between 'pci_serial_quirks' and 'serial_pci_tbl')
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Satellite U200 also shares the problem. Add it to the broken suspend
list. Original patch from John Schember.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: John Schember <john@nachtimwald.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The VESA BIOS is specified to be register-clean. However, we have now
found at least one system which violates that. Thus, be as paranoid
about VESA calls as about everything else.
Huge thanks to Will Simoneau for reporting, diagnosing, and testing
this out on Dell Inspiron 5150.
Cc: Will Simoneau <simoneau@ele.uri.edu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This avoids the recent NFS mount regression (returning EBUSY when
mounting the same filesystem twice with different parameters).
The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails.
Note that this is not the same as specifying nosharecache everywhere
since nosharecache will never attempt to match an existing superblock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix ehca SRQ support so that IPoIB connected mode works:
- Report max_srq > 0 if SRQ is supported
- Report "last wqe reached" asynchronous event when base QP dies;
this is required by the IB spec and IPoIB CM relies on receiving it
when cleaning up.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Page migration currently does not check if the target of the move contains
nodes that that are invalid (if root attempts to migrate pages)
and may try to allocate from invalid nodes if these are specified
leading to oopses.
Return -EINVAL if an offline node is specified.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lguest didn't initialize the kernel stack the way a real i386 kernel
does, and ended up triggering a corner-case in the stack frame checking
that doesn't happen on naive i386, and that the stack dumping didn't
handle quite right.
This makes the frame handling more correct, and tries to clarify the
code at the same time so that it's a bit more obvious what is going on.
Thanks to Rusty Russell for debugging the lguest failure-
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For hugepage mappings, the file offset, like the address and size, needs to
be aligned to the size of a hugepage.
In commit 68589bc353037f233fe510ad9ff432338c95db66, the check for this was
moved into prepare_hugepage_range() along with the address and size checks.
But since BenH's rework of the get_unmapped_area() paths leading up to
commit 4b1d89290b62bb2db476c94c82cf7442aab440c8, prepare_hugepage_range()
is only called for MAP_FIXED mappings, not for other mappings. This means
we're no longer ever checking for an aligned offset - I've confirmed that
mmap() will (apparently) succeed with a misaligned offset on both powerpc
and i386 at least.
This patch restores the check, removing it from prepare_hugepage_range()
and putting it back into hugetlbfs_file_mmap(). I'm putting it there,
rather than in the get_unmapped_area() path so it only needs to go in one
place, than separately in the half-dozen or so arch-specific
implementations of hugetlb_get_unmapped_area().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The HPET clocksource in drivers/char/hpet.c was written as generic code
for ia64, but it is not yet ready to replace the native HPET clocksource
implementations that the i386/x86-64 architectures use.
On x86[-64], trying to register this clocksource results in potentially
multiple hpet-based clocksources being registered, and if the ia64 one
is chosen on x86_64 some users have experienced hangs.
Eventually all three architectures may end up using the same code, but
that is not the case right now.
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paolo Ornati <ornati@fastwebnet.it>
Cc: Bob Picco <bob.picco@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
- cxgb3 engine microcode load
cxgb3 - Fix dev->priv usage
qeth: Drop ARP packages on HiperSockets interface with NOARP attribute.
qeth: provide specific message for OSA-adapters exclusively used
qeth: crash during reboot after failing online setting
qeth: Announce tx checksumming for qeth devices in TSO/EDDP mode
qeth: dont return the return values of void functions.
qeth: enforce a rate limit for inbound scatter gather messages
qeth: ungrouping a device must not be interruptible
netxen: fix crashes during module unload
netxen: Avoid firmware load in PCI probe
PS3: fix the bug that 'ifconfig down' would hang
IOC3: Program UART predividers.
We find that SB700 and SB800 use the same SMBus device ID as SB600, which is
0x4385, instead of the already submitted 0x4395.
Besides removing the wrong SB700 device ID, add SB800 support to kernel, by
renaming the PCI_DEVICE_ID_ATI_IXP600_SMBUS into
PCI_DEVICE_ID_ATI_SBX00_SMBUS.
Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: clean up task_new_fair()
sched: small schedstat fix
sched: fix wait_start_fair condition in update_stats_wait_end()
sched: call update_curr() in task_tick_fair()
sched: make the scheduler converge to the ideal latency
sched: fix sleeper bonus limit
Load the engine microcode when an interface
is brought up, instead of of doing it when the module
is loaded.
Loosen up tight binding between the driver and the
engine microcode version.
There is no need for microcode update with T3A boards.
Fix the file naming.
Do a better job at logging the loading activity.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Very old binutils (2.12.90...) seem to have trouble with newlines
in assembler macro invocation. They put them into the resulting
argument expansion. In this case this lead to a parse error because
a .rept expression ended up spread over multiple lines. Change the PMDS()
invocation to a single line.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
From: Krzysztof Helt <krzysztof.h1@wp.pl>
This patch fixes a typo in architecture constant name.
The kernel for s3c2442 machines does not build without
this fix.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] Bump driver versions
ata_piix: implement IOCFG bit18 quirk
libata: implement BROKEN_HPA horkage and apply it to affected drives
sata_promise: FastTrack TX4200 is a second-generation chip
pata_marvell: Add more identifiers
ata_piix: add Satellite U200 to broken suspend list
ata: add ATA_MWDMA* and ATA_SWDMA* defines
ata_piix: IDE mode SATA patch for Intel Tolapai
libata-core: Allow translation setting to fail
cxgb3 used netdev_priv() and dev->priv for different purposes.
In 2.6.23, netdev_priv() == dev->priv, cxgb3 needs a fix.
This patch is a partial backport of Dave Miller's changes in the
net-2.6.24 git branch.
Without this fix, cxgb3 crashes on 2.6.23.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fixed wrong expression which enabled watchdogs even if nmi_watchdog kernel
parameter wasn't set. This regression got slightly introduced with commit
b7471c6da94d30d3deadc55986cc38d1ff57f9ca.
Introduced NMI_DISABLED (-1) which allows to switch the value of NMI_DEFAULT
without breaking the APIC NMI watchdog code (again).
Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.
Signed-off-by: Daniel Gollub <dgollub@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix following section mismatch warning:
WARNING: drivers/built-in.o(.text+0x73d0): Section mismatch: reference to .init.data: (between 'pxafb_setup' and 'pxafb_init')
The warning are caused by __devinit pxafb_setup() that refers to a
variable marked __initdata. In a hotplug scenario we would have a
reference to the freed .init.data section. Fix this by declaring
g_options __devinitdata.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
collect_signal: sigqueue_free:
list_del_init(&first->list);
if (!list_empty(&q->list)) {
// not taken
}
q->flags &= ~SIGQUEUE_PREALLOC;
__sigqueue_free(first); __sigqueue_free(q);
Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
obviously bad implications.
In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
Hopefully this can explain these mysterious bug-reports, see
http://marc.info/?t=118766926500003
http://marc.info/?t=118466273000005
Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has an access to the application which opened the problems originally.
Also, this patch removes tasklist lock/unlock, ->siglock is enough.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: taoyue <yue.tao@windriver.com>
Cc: Jeremy Katz <jeremy.katz@windriver.com>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes'
statistics counters missed newly forked tasks and thus had a constant
negative skew. Fix this.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
A network interface can get ARP packets even when the interface has
NOARP specified. In a HiperSockets environment this disturbs receiving
systems when packets are sent on the multicast queue. (E.g. TCP/IP on
z/VM issues messages reporting invalid data on the HiperSockets
interface.)
Qeth will no longer send ARP packets on HiperSockets interface when
interface has the NOARP attribute.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This should fix an oops with PCMCIA PATA devices
http://bugzilla.kernel.org/show_bug.cgi?id=8424
This is not a full fix for the problem, but probably
still the right thing to do.
[ I'm almost certain it's *not* the right thing to do, but it avoids an
oops, and I want comments from others on what the right thing would
actually be.. I suspect we should just remove the use of dma_mask
entirely in this function, and just use coherent_dma_mask. - Linus ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some notebooks need bit18 of IOCFG to be cleared for the drive bay to
work even though the bit is NOOP according to the datasheet. This
patch implement IOCFG bit18 quirk and apply it to Clevo M570U.
http://bugzilla.kernel.org/show_bug.cgi?id=8051
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: D. Angelis <dangelis@beta-cae.gr>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which
is disabled by default at the moment): it relies on se.wait_start_fair
being 0 while update_stats_wait_end() did not recognize a 0 value,
so instead of 'skipping' the initial interval we gave the new child
a maximum boost of +runtime-limit ...
(No impact on the default kernel, but nice to fix for completeness.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Exclusive usage of OSA-cards has been introduced. Even though Linux
does not make use of it, qeth should be prepared to receive a bad RC
for some initialization steps. A meaningful message is now given,
if an OSA-device is set online, even though the OSA-adapter is already
exclusively used by another host.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If the forcedeth driver receives too much work in an interrupt, it
assumes it has a broken hardware with stuck IRQ. It works around the
problem by disabling interrupts on the nic but makes a printk while
holding device spinlog - which isn't smart thing to do if you have
netconsole on the same nic.
This patch moves the printk's out of the spinlock protected area.
Without this patch the machine hangs hard. With this patch everything
still works even when there is significant increase on CPU usage while
using the nic.
Signed-off-by: Timo Jantunen <jeti@iki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The dynamic dma kmalloc creation can run into trouble if a
GFP_ATOMIC allocation is the first one performed for a certain size
of dma kmalloc slab.
- Move the adding of the slab to sysfs into a workqueue
(sysfs does GFP_KERNEL allocations)
- Do not call kmem_cache_destroy() (uses slub_lock)
- Only acquire the slub_lock once and--if we cannot wait--do a trylock.
This introduces a slight risk of the first kmalloc(x, GFP_DMA|GFP_ATOMIC)
for a range of sizes failing due to another process holding the slub_lock.
However, we only need to acquire the spinlock once in order to establish
each power of two DMA kmalloc cache. The possible conflict is with the
slub_lock taken during slab management actions (create / remove slab cache).
It is rather typical that a driver will first fill its buffers using
GFP_KERNEL allocations which will wait until the slub_lock can be acquired.
Drivers will also create its slab caches first outside of an atomic
context before starting to use atomic kmalloc from an interrupt context.
If there are any failures then they will occur early after boot or when
loading of multiple drivers concurrently. Drivers can already accomodate
failures of GFP_ATOMIC for other reasons. Retries will then create the slab.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Mariusz Kozlowski reported lockdep's warning:
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> ---------------------------------
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
> [<c0138eeb>] __lock_acquire+0x949/0x11ac
> [<c01397e7>] lock_acquire+0x99/0xb2
> [<c0452ff3>] _spin_lock+0x35/0x42
> [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> [<c0147a5d>] handle_IRQ_event+0x28/0x59
> [<c01493ca>] handle_level_irq+0xad/0x10b
> [<c0105a13>] do_IRQ+0x93/0xd0
> [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
> #0: (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
>
> stack backtrace:
...
> [<c0452ff3>] _spin_lock+0x35/0x42
> [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> [<c01480fd>] free_irq+0x11b/0x146
> [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
> [<c03bde63>] dev_close+0x57/0x74
...
This shows that a driver's irq handler was running both in hard interrupt
and process contexts with irqs enabled. The latter was done during
free_irq() call and was possible only with CONFIG_DEBUG_SHIRQ enabled.
This was fixed by another patch.
But similar problem is possible with request_irq(): any locks taken from
irq handler could be vulnerable - especially with soft interrupts. This
patch fixes it by disabling local interrupts during handler's run. (It
seems, disabling softirqs should be enough, but it needs more checking
on possible races or other special cases).
Reported-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Online setting of a qeth device may fail for instance because of:
- out-of-memory condition when allocating qdio queues
- IDX ACTIVATE problem
- ...
Such a device is still returned in a driver_for_each_device loop
processed in qeth_reboot_event(), which calls
qeth_clear_qdio_buffers(). Make sure qeth_clear_output_buffer() is
called only, if the qdio queues have been successfully allocated
during initialization of a qeth device.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself that already
uses cpu_relax().
For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.
[ I test-built a kernel -- smp_callin() itself got inlined in its only
callsite, smpboot.c:start_secondary() -- and the relevant piece of
code disassembles to the following:
0xc1019704 <start_secondary+12>: mov 0xc144c4c8,%eax
0xc1019709 <start_secondary+17>: test %eax,%eax
0xc101970b <start_secondary+19>: je 0xc1019709 <start_secondary+17>
init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
then we loop over the test of the stale value in the register only,
so these look like real bugs to me. With the fix below, this becomes:
0xc1019706 <start_secondary+14>: pause
0xc1019708 <start_secondary+16>: cmpl $0x0,0xc144c4c8
0xc101970f <start_secondary+23>: je 0xc1019706 <start_secondary+14>
which looks nice and healthy. ]
Thanks to Heiko Carstens for noticing this.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The MAX_PARTIAL checks were supposed to be an optimization. However, slab
shrinking is a manually triggered process either through running slabinfo
or by the kernel calling kmem_cache_shrink.
If one really wants to shrink a slab then all operations should be done
regardless of the size of the partial list. This also fixes an issue that
could surface if the number of partial slabs was initially above MAX_PARTIAL
in kmem_cache_shrink and later drops below MAX_PARTIAL through the
elimination of empty slabs on the partial list (rare). In that case a few
slabs may be left off the partial list (and only be put back when they
are empty).
Signed-off-by: Christoph Lameter <clameter@sgi.com>
This will avoid a possible fault in ecryptfs_sync_page().
In the function, eCryptfs calls sync_page() method of a lower filesystem
without checking its existence. However, there are many filesystems that
don't have this method including network filesystems such as NFS, AFS, and
so forth. They may fail when an eCryptfs page is waiting for lock.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch corrects sata_promise to classify FastTrack TX4200
(DID 3515/3519) as a second-generation chip. Promise's partial-
source FT TX4200 driver confirms this classification.
Treating it as a first-generation chip causes several problems:
1. Detection failures. This is a recent regression triggered by
the hotplug-enabling changes in 2.6.23-rc1.
2. Various "failed to resume link for reset" warnings.
This patch fixes <http://bugzilla.kernel.org/show_bug.cgi?id=8936>.
Thanks to Stephen Ziemba for reporting the bug and for testing the fix.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Tested-by: Stephen Ziemba <sziemba@ecn.purdue.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of
[-gran ... 0 ... +gran].
With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40
out: 27 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 36 . 40
out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40
out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.
to fix this we add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.
( this adds a new field to task_struct, but we can eliminate that
overhead in 2.6.24 by putting all the scheduler timestamps into an
anonymous union. )
with this change in place, chew-max output is fluctuation-less all
around:
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
UIO currently contains a rather dubious statement which wants removing.
The actual questions around whether user space code that depends tightly
on kernel GPL code designed to co-work with it are derivative works of
the kernel is extremely complex, and since we don't have space for either
a masters length essay on legal issues or need to start flamewars lets
simply remove the comment and leave law to lawyers
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <tovalds@linux-foundation.org>
Fix cut 'n paste bug in Atmel SPI driver.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Acked-by: David Brownell <david-b@pacbell.net>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is an Amarok song switch time increase (regression) under
hefty load.
What is happening is that sleeper_bonus is never consumed, and only
rarely goes below runtime_limit, so for the most part, Amarok isn't
getting any bonus at all. We're keeping sleeper_bonus right at
runtime_limit (sched_latency == sched_runtime_limit == 40ms) forever, ie
we don't consume if we're lower that that, and don't add if we're above
it. One Amarok thread waking (or anybody else) will push us past the
threshold, so the next thread waking gets nada, but will reap pain from
the previous thread waking until we drop back to runtime_limit. It
looks to me like under load, some random task gets a bonus, and
everybody else pays, whether deserving or not.
This diff fixed the regression for me at any load rate.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
This is triggered if PCI && !HOTPLUG.
MODPOST vmlinux.o
WARNING: vmlinux.o(.data+0xc910): Section mismatch: reference to .init.text:pci_ite887x_init (between 'pci_serial_quirks' and 'serial_pci_tbl')
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>