commits

I got a division-by-zero error when disabling forced idle
injection in the Intel powerclamp driver. I ran

echo 0 >/sys/class/thermal/cooling_device48/cur_state

and got

[ 986.072632] divide error: 0000 [#1] PREEMPT SMP
[ 986.078989] Modules linked in:
[ 986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted 4.7.0-1-default+ #3055
[ 986.093781] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
[ 986.106227] task: ffff880430e1c080 task.stack: ffff880427ef0000
[ 986.114122] RIP: 0010:[<ffffffff81794859>] [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[ 986.124609] RSP: 0018:ffff880427ef3e20 EFLAGS: 00010246
[ 986.131860] RAX: 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
[ 986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
[ 986.150478] RBP: ffff880427ef3ec8 R08: ffff880427ef0000 R09: 0000000000000002
[ 986.159779] R10: 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
[ 986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: ffff880427ef0000
[ 986.178388] FS: 0000000000000000(0000) GS:ffff880435940000(0000) knlGS:0000000000000000
[ 986.188785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 986.196559] CR2: 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
[ 986.205909] Stack:
[ 986.209524] ffff8802be897b00 ffff880430e1c080 0000000000000011 0000006a35959780
[ 986.219236] 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
[ 986.228966] 0000000100029d93 ffffffff81794140 0000000000000000 ffffffff05000011
[ 986.238686] Call Trace:
[ 986.242825] [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80
[ 986.250866] [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
[ 986.259797] [<ffffffff8111d1a9>] kthread+0xc9/0xe0
[ 986.266682] [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
[ 986.274142] [<ffffffff8111d0e0>] ? kthread_create_on_node+0x180/0x180
[ 986.282869] Code: d1 ea 48 89 d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8
[ 986.307806] RIP [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[ 986.315871] RSP <ffff880427ef3e20>

RIP points to the following lines:

compensation = get_compensation(target_ratio);
interval = duration_jiffies*100/(target_ratio+compensation);

One fix would be to swap the following two statements in
powerclamp_set_cur_state():

set_target_ratio = 0;
end_power_clamp();

But I think that the division by zero might also happen when target_ratio
is non-zero, because the compensation might be negative. Therefore
we also check the sum of target_ratio and compensation explicitly.

Also the compensated_ratio variable is always set. Therefore there
is no need to initialize it.
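
The guard described above can be sketched in a few lines of standalone C. This is only an illustration, not the kernel code: the helper name guarded_interval() is hypothetical, and the choice to clamp the divisor to 1 is an assumption about how the sum of target_ratio and compensation could be checked.

```c
/*
 * Hypothetical helper, not the driver's actual code: compute the clamp
 * interval while guarding against a zero or negative divisor.
 */
static unsigned int guarded_interval(unsigned int duration_jiffies,
                                     int target_ratio, int compensation)
{
    /* The sum can reach zero (or go below) when compensation is negative. */
    int compensated_ratio = target_ratio + compensation;

    /* Assumption: fall back to the smallest valid divisor. */
    if (compensated_ratio <= 0)
        compensated_ratio = 1;

    return duration_jiffies * 100 / (unsigned int)compensated_ratio;
}
```

With this shape, a target_ratio of 0 (the case hit when cur_state is set to 0) or a negative compensation that cancels the ratio both yield a finite interval instead of a divide error.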

Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>

9y ago

Wei Yongjun

165989a5

thermal: clock_cooling: Fix missing mutex_init()

9y ago

Linus Torvalds

120c5475

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

9y ago

Guenter Roeck

2b05980d

h8300: Add missing include file to asm/io.h

9y ago

Linus Torvalds

29b4817d

Linux 4.8-rc1 v4.8-rc1

9y ago

Srinivas Pandruvada

176b1ec2

thermal: intel_pch_thermal: Add suspend/resume callback

9y ago

Kuninori Morimoto

f4c59243

thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs

9y ago

Linus Torvalds

329f4152

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

9y ago

Masahiro Yamada

53fb45d3

arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO

9y ago

Guenter Roeck

783011b1

unicore32: mm: Add missing parameter to arch_vma_access_permitted

9y ago

Linus Torvalds

857953d7

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

9y ago

Michele Di Giorgio

d0b7306d

thermal: fix race condition when updating cooling device

When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

  Thread k1                       Thread k2
                                  ||
                                  ||  call thermal_cdev_update()
                                  ||      ...
                                  ||  set_cur_state(cdev, target);
  call power_actor_set_power()    ||
      ...                         ||
  instance->target = state;       ||
  cdev->updated = false;          ||
                                  ||      cdev->updated = true;
                                  ||      // completes execution
  call thermal_cdev_update()      ||
      // cdev->updated == true    ||
      return;                     ||
                                  \/
                                 time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.
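
The locking idea can be sketched in userspace C with a pthread mutex. This is a simplified model under stated assumptions, not the thermal core code: struct cdev_sketch and the sketch_* functions are hypothetical names, and the device is reduced to two instance targets. The point is that the target write plus "updated = false", and the scan plus "updated = true", each run as a single critical section, so the interleaving shown in the diagram cannot occur.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a cooling device. */
struct cdev_sketch {
    pthread_mutex_t lock;      /* protects every field below */
    bool updated;              /* false while a new target is pending */
    unsigned long targets[2];  /* one requested state per thermal instance */
    unsigned long cur_state;   /* state last applied to the device */
};

static void sketch_cdev_init(struct cdev_sketch *cdev)
{
    pthread_mutex_init(&cdev->lock, NULL);
    cdev->updated = true;
    cdev->targets[0] = cdev->targets[1] = 0;
    cdev->cur_state = 0;
}

/* Governor side: record a new target and mark the device stale. */
static void sketch_set_target(struct cdev_sketch *cdev, int idx,
                              unsigned long state)
{
    pthread_mutex_lock(&cdev->lock);
    cdev->targets[idx] = state;
    cdev->updated = false;
    pthread_mutex_unlock(&cdev->lock);
}

/* Update side: pick the deepest target and apply it atomically. */
static void sketch_cdev_update(struct cdev_sketch *cdev)
{
    unsigned long target = 0;

    pthread_mutex_lock(&cdev->lock);
    if (!cdev->updated) {
        for (int i = 0; i < 2; i++)
            if (cdev->targets[i] > target)
                target = cdev->targets[i];
        cdev->cur_state = target;  /* stands in for set_cur_state() */
        cdev->updated = true;
    }
    pthread_mutex_unlock(&cdev->lock);
}
```

Because both paths take the same lock, a target written while another thread is mid-scan is either seen by that scan or leaves updated == false for the next call.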

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>

9y ago

Linus Torvalds

a1e21033

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

9y ago

Radim Krčmář

89a1d43e

Merge tag 'kvm-s390-master-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

9y ago

Riku Voipio

2323439f

arm64: defconfig: add options for virtualization and containers

9y ago

Linus Torvalds

f31494bd

Merge tag 'vfio-v4.8-rc2' of git://github.com/awilliam/linux-vfio

9y ago

Linus Torvalds

635a4ba1

Merge tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux

9y ago

Jens Axboe

1eff9d32

block: rename bio bi_rw to bi_opf

9y ago

Johannes Berg

1ea049b2

bvec: avoid variable shadowing warning

9y ago

James Hogan

9b731bcf

MIPS: KVM: Propagate kseg0/mapped tlb fault errors

9y ago

Julius Niedworok

aca411a4

KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed

9y ago

Mark Rutland

dfbca61a

arm64: hibernate: handle allocation failures

9y ago

Linus Torvalds

b112324c

Merge tag 'nfsd-4.8-1' of git://linux-nfs.org/~bfields/linux

9y ago

Alex Williamson

c8952a70

vfio/pci: Fix NULL pointer oops in error interrupt setup handling

9y ago

Linus Torvalds

52ddb7e9

Merge tag 'doc-4.8-fixes' of git://git.lwn.net/linux

9y ago

Dave Airlie

586efded

Merge branch 'generic-zpos-v8' of http://git.linaro.org/people/benjamin.gaignard/kernel into drm-next

9y ago

Jens Axboe

31c64f78

target: iblock_execute_sync_cache() should use bio_set_op_attrs()

9y ago

Joe Lawrence

005411ea

doc: update block/queue-sysfs.txt entries

9y ago

James Hogan

0741f52d

MIPS: KVM: Fix gfn range check in kseg0 tlb faults

9y ago

Julius Niedworok

75a4615c

KVM: s390: set the prefix initially properly

9y ago

Mark Rutland

0194e760

arm64: hibernate: avoid potential TLB conflict

9y ago

Linus Torvalds

9710cb66

Merge tag 'pm-4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

9y ago

Jeff Layton

dd257933

nfsd: don't return an unhashed lock stateid after taking mutex

9y ago

Linus Torvalds

e9d488c3

Merge tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc

9y ago

Jani Nikula

bdf107d8

DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1

9y ago

Ville Syrjälä

dfd2e9ab

drm/i915: Check PSR setup time vs. vblank length

9y ago

Benjamin Gaignard

2fc4d838

drm: rcar: use generic code for managing zpos plane property

9y ago

Jens Axboe

ba13e83e

mm: make __swap_writepage() use bio_set_op_attrs()

9y ago

Gabriel Krisman Bertazi

c21377f8

nvme: Suspend all queues before deletion

9y ago

James Hogan

8985d503

MIPS: KVM: Add missing gfn range check

9y ago

Laura Abbott

9adeb8e7

arm64: Handle el1 synchronous instruction aborts cleanly

9y ago

Linus Torvalds

01ea4439

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

9y ago

Rafael J. Wysocki

0aeeb3e7

Merge branches 'pm-sleep' and 'pm-cpufreq'

9y ago

Chuck Lever

42691398

nfsd: Fix race between FREE_STATEID and LOCK

9y ago

Eryu Guan

337684a1

fs: return EPERM on immutable inode

9y ago

James Bottomley

4af75df6

binfmt_misc: add F option description to documentation

9y ago

seokhoon.yoon

09c3bcce

Documenation: update cgroup's document path

9y ago

Ville Syrjälä

6608804b

drm/dp: Add drm_dp_psr_setup_time()

9y ago

Marek Szyprowski

e47726a1

drm/exynos: use generic code for managing zpos plane property

9y ago

Jens Axboe

c11f0c0b

block/mm: make bdev_ops->rw_page() take a bool for read/write

9y ago

Konstantin Khlebnikov

51350ea0

mm, writeback: flush plugged IO in wakeup_flusher_threads()

9y ago

James Hogan

c604cffa

MIPS: KVM: Fix mapped fault broken commpage handling

9y ago

David A. Long

ad05711c

arm64: Remove stack duplicating code from jprobes

9y ago

Linus Torvalds

3bc6d8c1

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

9y ago

Linux 4.8-rc2 v4.8-rc2

694d0d0b

Linus Torvalds

Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux

0043ee40

Linus Torvalds

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

4ef870e3

Linus Torvalds

Merge branches 'thermal-intel' and 'thermal-core' into next

1577ddfa

Zhang Rui

Merge tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

118253a5

Linus Torvalds

m68knommu: fix user a5 register being overwritten

0b980271

Greg Ungerer

thermal/powerclamp: Prevent division by zero when counting interval

70c50ee7

Petr Mladek

thermal: clock_cooling: Fix missing mutex_init()

165989a5

Wei Yongjun

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

120c5475

Linus Torvalds

h8300: Add missing include file to asm/io.h

2b05980d

Guenter Roeck

Linux 4.8-rc1 v4.8-rc1

29b4817d

Linus Torvalds

thermal: intel_pch_thermal: Add suspend/resume callback

176b1ec2

Srinivas Pandruvada

thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs

f4c59243

Kuninori Morimoto

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

329f4152

Linus Torvalds

arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO

53fb45d3

Masahiro Yamada

unicore32: mm: Add missing parameter to arch_vma_access_permitted

783011b1

Guenter Roeck

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

857953d7

Linus Torvalds

thermal: fix race condition when updating cooling device

d0b7306d

Michele Di Giorgio

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

a1e21033

Linus Torvalds

Merge tag 'kvm-s390-master-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

89a1d43e

Radim Krčmář

arm64: defconfig: add options for virtualization and containers

2323439f

Riku Voipio

Merge tag 'vfio-v4.8-rc2' of git://github.com/awilliam/linux-vfio

f31494bd

Linus Torvalds

Merge tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux

635a4ba1

Linus Torvalds

block: rename bio bi_rw to bi_opf

1eff9d32

Jens Axboe

bvec: avoid variable shadowing warning

1ea049b2

Johannes Berg

MIPS: KVM: Propagate kseg0/mapped tlb fault errors

9b731bcf

James Hogan

KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed

aca411a4

Julius Niedworok

arm64: hibernate: handle allocation failures

dfbca61a

Mark Rutland

Merge tag 'nfsd-4.8-1' of git://linux-nfs.org/~bfields/linux

b112324c

Linus Torvalds

vfio/pci: Fix NULL pointer oops in error interrupt setup handling

c8952a70

Alex Williamson

Merge tag 'doc-4.8-fixes' of git://git.lwn.net/linux

52ddb7e9

Linus Torvalds

Merge branch 'generic-zpos-v8' of http://git.linaro.org/people/benjamin.gaignard/kernel into drm-next

586efded

Dave Airlie

target: iblock_execute_sync_cache() should use bio_set_op_attrs()

31c64f78

Jens Axboe

doc: update block/queue-sysfs.txt entries

005411ea

Joe Lawrence

MIPS: KVM: Fix gfn range check in kseg0 tlb faults

0741f52d

James Hogan

KVM: s390: set the prefix initially properly

75a4615c

Julius Niedworok

arm64: hibernate: avoid potential TLB conflict

0194e760

Mark Rutland

Merge tag 'pm-4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

9710cb66

Linus Torvalds

nfsd: don't return an unhashed lock stateid after taking mutex

dd257933

Jeff Layton

Merge tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc

e9d488c3

Linus Torvalds

DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1

bdf107d8

Jani Nikula

drm/i915: Check PSR setup time vs. vblank length

dfd2e9ab

Ville Syrjälä

drm: rcar: use generic code for managing zpos plane property

2fc4d838

Benjamin Gaignard

mm: make __swap_writepage() use bio_set_op_attrs()

ba13e83e

Jens Axboe

nvme: Suspend all queues before deletion

When nvme_delete_queue fails in the first pass of the
nvme_disable_io_queues() loop, we return early, failing to suspend all
of the IO queues. Later, on the nvme_pci_disable path, this causes us
to disable MSI without actually having freed all the IRQs, which
triggers the BUG_ON in free_msi_irqs(), as shown below.

This patch refactors nvme_disable_io_queues() to suspend all queues before
it starts submitting delete queue commands. This way, we ensure that we
have at least returned every IRQ before continuing with the removal
path.

[ 487.529200] kernel BUG at ../drivers/pci/msi.c:368!
cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650]
pc: c000000000627a50: free_msi_irqs+0x90/0x200
lr: c000000000627a40: free_msi_irqs+0x80/0x200
sp: c0000078c5b838d0
msr: 9000000100029033
current = 0xc0000078c5b40000
paca = 0xc000000002bd7600 softe: 0 irq_happened: 0x01
pid = 1376, comm = kworker/70:1H
kernel BUG at ../drivers/pci/msi.c:368!
Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413
(Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016
enter ? for help
[c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme]
[c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme]
[c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110
[c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170
[c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110
[c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0
[c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590
[c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660
[c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130
[c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c
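
The two-pass ordering described in this message can be modelled with a small standalone C sketch. It is not the nvme driver code: struct queue_sketch and the sketch_* functions are hypothetical, and "suspending" a queue is reduced to releasing a flag that stands in for its IRQ. It only illustrates why an early return in the delete pass no longer leaves IRQs held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an IO queue. */
struct queue_sketch {
    bool irq_held;      /* true until the queue is suspended */
    bool delete_fails;  /* simulate a failing delete command */
    bool deleted;
};

/* Suspending a queue gives its IRQ back. */
static void sketch_suspend(struct queue_sketch *q)
{
    q->irq_held = false;
}

static int sketch_delete(struct queue_sketch *q)
{
    if (q->delete_fails)
        return -1;
    q->deleted = true;
    return 0;
}

static int sketch_disable_io_queues(struct queue_sketch *qs, size_t n)
{
    /* Pass 1: suspend every queue so that all IRQs are returned. */
    for (size_t i = 0; i < n; i++)
        sketch_suspend(&qs[i]);

    /* Pass 2: submit deletes; bailing out early is now harmless. */
    for (size_t i = 0; i < n; i++)
        if (sketch_delete(&qs[i]) != 0)
            return -1;

    return 0;
}

/* Count the queues that still hold an IRQ. */
static size_t sketch_irqs_held(const struct queue_sketch *qs, size_t n)
{
    size_t held = 0;

    for (size_t i = 0; i < n; i++)
        if (qs[i].irq_held)
            held++;
    return held;
}
```

Even when the first delete fails and the function returns early, sketch_irqs_held() reports zero, which is the property the later MSI teardown relies on.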

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-nvme@lists.infradead.org
Signed-off-by: Jens Axboe <axboe@fb.com>