Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 29f6738609e40227dabcc63bfb3b84b3726a75bd upstream.
memblock_free_reserved_regions() calls memblock_free(), but
memblock_free() would double reserved.regions too, so we could free the
old range for reserved.regions.
Also tj said there is another bug which could be related to this.
| I don't think we're saving any noticeable
| amount by doing this "free - give it to page allocator - reserve
| again" dancing. We should just allocate regions aligned to page
| boundaries and free them later when memblock is no longer in use.
in that case, when DEBUG_PAGEALLOC, will get panic:
memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39
BUG: unable to handle kernel paging request at ffff88102febd948
IP: [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU 0
Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447
RIP: 0010:[<ffffffff836a5774>] [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
See the discussion at https://lkml.org/lkml/2012/6/13/469
So try to allocate with PAGE_SIZE alignment and free it later.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d8adde17e5f858427504725218c56aef90e90fc7 upstream.
kswapd_stop() is called to destroy the kswapd work thread when all memory
of a NUMA node has been offlined. But kswapd_stop() only terminates the
work thread without resetting NODE_DATA(nid)->kswapd to NULL. The stale
pointer will prevent kswapd_run() from creating a new work thread when
adding memory to the memory-less NUMA node again. Eventually the stale
pointer may cause invalid memory access.
An example stack dump as below. It's reproduced with 2.6.32, but latest
kernel has the same issue.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81051a94>] exit_creds+0x12/0x78
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/memory/memory391/state
CPU 11
Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode fuse loop dm_mod tpm_tis rtc_cmos i2c_i801 rtc_core tpm serio_raw pcspkr sg tpm_bios igb i2c_core iTCO_wdt rtc_lib mptctl iTCO_vendor_support button dca bnx2 usbhid hid uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix libata thermal processor thermal_sys hwmon mptsas mptscsih mptbase scsi_transport_sas scsi_mod
Pid: 7949, comm: sh Not tainted 2.6.32.12-qiuxishi-5-default #92 Tecal RH2285
RIP: 0010:exit_creds+0x12/0x78
RSP: 0018:ffff8806044f1d78 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff880604f22140 RCX: 0000000000019502
RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000
RBP: ffff880604f22150 R08: 0000000000000000 R09: ffffffff81a4dc10
R10: 00000000000032a0 R11: ffff880006202500 R12: 0000000000000000
R13: 0000000000c40000 R14: 0000000000008000 R15: 0000000000000001
FS: 00007fbc03d066f0(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000060f029000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 7949, threadinfo ffff8806044f0000, task ffff880603d7c600)
Stack:
ffff880604f22140 ffffffff8103aac5 ffff880604f22140 ffffffff8104d21e
ffff880006202500 0000000000008000 0000000000c38000 ffffffff810bd5b1
0000000000000000 ffff880603d7c600 00000000ffffdd29 0000000000000003
Call Trace:
__put_task_struct+0x5d/0x97
kthread_stop+0x50/0x58
offline_pages+0x324/0x3da
memory_block_change_state+0x179/0x1db
store_mem_state+0x9e/0xbb
sysfs_write_file+0xd0/0x107
vfs_write+0xad/0x169
sys_write+0x45/0x6e
system_call_fastpath+0x16/0x1b
Code: ff 4d 00 0f 94 c0 84 c0 74 08 48 89 ef e8 1f fd ff ff 5b 5d 31 c0 41 5c c3 53 48 8b 87 20 06 00 00 48 89 fb 48 8b bf 18 06 00 00 <8b> 00 48 c7 83 18 06 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
RIP exit_creds+0x12/0x78
RSP <ffff8806044f1d78>
CR2: 0000000000000000
[akpm@linux-foundation.org: add pglist_data.kswapd locking comments]
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 047fe3605235888f3ebcda0c728cb31937eadfe6 upstream.
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()
commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.
Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.
Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.
splice_shrink_spd() loses its struct pipe_inode_info argument.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[bwh: Backported to 3.2:
- Adjust context in vmsplice_to_pipe()
- Update one more call to splice_shrink_spd(), from skb_splice_bits()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e4eed03fd06578571c01d4f1478c874bb432c815 upstream.
In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the
mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under
Xen.
So instead of dealing only with "consistent" pmdvals in
pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
where the low 32bit and high 32bit could be inconsistent (to avoid having
to use cmpxchg8b).
The only guarantee we get from pmd_read_atomic is that if the low part of
the pmd was found null, the high part will be null too (so the pmd will be
considered unstable). And if the low part of the pmd is found "stable"
later, then it means the whole pmd was read atomically (because after a
pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
and we read the high part after the low part).
In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
atomically to declare the pmd as "stable" and that's true for THP and no
THP, furthermore in the THP case we also have a barrier() that will
prevent any inconsistent pmdvals to be cached by a later re-read of the
*pmd.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 26c191788f18129af0eb32a358cdaea0c7479626 upstream.
When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.
PID: 11679 TASK: f06e8000 CPU: 3 COMMAND: "do_race_2_panic"
#0 [f06a9dd8] crash_kexec at c049b5ec
#1 [f06a9e2c] oops_end at c083d1c2
#2 [f06a9e40] no_context at c0433ded
#3 [f06a9e64] bad_area_nosemaphore at c043401a
#4 [f06a9e6c] __do_page_fault at c0434493
#5 [f06a9eec] do_page_fault at c083eb45
#6 [f06a9f04] error_code (via page_fault) at c083c5d5
EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
00000000
DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0
CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
#7 [f06a9f38] _spin_lock at c083bc14
#8 [f06a9f44] sys_mincore at c0507b7d
#9 [f06a9fb0] system_call at c083becd
start len
EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f
DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00
SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033
CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286
This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.
With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.
This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.
Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.
NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.
This bug was discovered and fully debugged by Ulrich, quote:
----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.
496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
*pmd)
497 {
498 /* depend on compiler for an atomic pmd read */
499 pmd_t pmdval = *pmd;
// edi = pmd pointer
0xc0507a74 <sys_mincore+548>: mov 0x8(%esp),%edi
...
// edx = PTE page table high address
0xc0507a84 <sys_mincore+564>: mov 0x4(%edi),%edx
...
// eax = PTE page table low address
0xc0507a8e <sys_mincore+574>: mov (%edi),%eax
[..]
Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.
- The PMD entry {high|low} is 0x0000000000000000.
The "mov" at 0xc0507a84 loads 0x00000000 into edx.
- A page fault (on another CPU) sneaks in between the two "mov"
instructions and instantiates the PMD.
- The PMD entry {high|low} is now 0x00000003fda38067.
The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----
Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
upstream commit 9793f7c88937e7ac07305ab1af1a519225836823.
This new routine is responsible for service registration in a specified
network context.
The idea is to separate service creation from per-net operations.
Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
upstream commit e3f70eadb7dddfb5a2bb9afff7abfc6ee17a29d0.
v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().
These routines are called from locks reclaimer() kernel thread. This thread
works in "init_net" network context and currently relays on persence on lockd
thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay
on current network context. So let's pass corrent one into them.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dbf0e4c7257f8d684ec1a3c919853464293de66e upstream.
Quite a few ASUS computers experience a nasty problem, related to the
EHCI controllers, when going into system suspend. It was observed
that the problem didn't occur if the controllers were not put into the
D3 power state before starting the suspend, and commit
151b61284776be2d6f02d48c23c3625678960b97 (USB: EHCI: fix crash during
suspend on ASUS computers) was created to do this.
It turned out this approach messed up other computers that didn't have
the problem -- it prevented USB wakeup from working. Consequently
commit c2fb8a3fa25513de8fedb38509b1f15a5bbee47b (USB: add
NO_D3_DURING_SLEEP flag and revert 151b61284776be2) was merged; it
reverted the earlier commit and added a whitelist of known good board
names.
Now we know the actual cause of the problem. Thanks to AceLan Kao for
tracking it down.
According to him, an engineer at ASUS explained that some of their
BIOSes contain a bug that was added in an attempt to work around a
problem in early versions of Windows. When the computer goes into S3
suspend, the BIOS tries to verify that the EHCI controllers were first
quiesced by the OS. Nothing's wrong with this, but the BIOS does it
by checking that the PCI COMMAND registers contain 0 without checking
the controllers' power state. If the register isn't 0, the BIOS
assumes the controller needs to be quiesced and tries to do so. This
involves making various MMIO accesses to the controller, which don't
work very well if the controller is already in D3. The end result is
a system hang or memory corruption.
Since the value in the PCI COMMAND register doesn't matter once the
controller has been suspended, and since the value will be restored
anyway when the controller is resumed, we can work around the BIOS bug
simply by setting the register to 0 during system suspend. This patch
(as1590) does so and also reverts the second commit mentioned above,
which is now unnecessary.
In theory we could do this for every PCI device. However to avoid
introducing new problems, the patch restricts itself to EHCI host
controllers.
Finally the affected systems can suspend with USB wakeup working
properly.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=37632
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=42728
Based-on-patch-by: AceLan Kao <acelan.kao@canonical.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Javier Marcet <jmarcet@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Oleksij Rempel <bug-track@fisher-privat.net>
Tested-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ef1b512f4e6f936d89aa20be3d97a7ec7c290ac upstream.
fill_result_tf() grabs the taskfile flags from the originating qc which
sas_ata_qc_fill_rtf() promptly overwrites. The presence of an
ata_taskfile in the sata_device makes it tempting to just copy the full
contents in sas_ata_qc_fill_rtf(). However, libata really only wants
the fis contents and expects the other portions of the taskfile to not
be touched by ->qc_fill_rtf. To that end store a fis buffer in the
sata_device and use ata_tf_from_fis() like every other ->qc_fill_rtf()
implementation.
Reported-by: Praveen Murali <pmurali@logicube.com>
Tested-by: Praveen Murali <pmurali@logicube.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 222a806af830fda34ad1f6bc991cd226916de060 upstream.
Avoid crashing if the private_data pointer happens to be NULL. This has
been seen sometimes when a host reset happens, notably when there are
many LUNs:
host3: Assigned Port ID 0c1601
scsi host3: libfc: Host reset succeeded on port (0c1601)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000350
IP: [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0
<snip>
Process scsi_eh_3 (pid: 4144, threadinfo ffff88030920c000, task ffff880326b160c0)
Stack:
000000010372e6ba 0000000000000282 000027100920dca0 ffffffffa0038ee0
0000000000000000 0000000000030003 ffff88030920dc80 ffff88030920dc80
00000002000e0000 0000000a00004000 ffff8803242f7760 ffff88031326ed80
Call Trace:
[<ffffffff8105b590>] ? lock_timer_base+0x70/0x70
[<ffffffff81352fbe>] scsi_eh_tur+0x3e/0xc0
[<ffffffff81353a36>] scsi_eh_test_devices+0x76/0x170
[<ffffffff81354125>] scsi_eh_host_reset+0x85/0x160
[<ffffffff81354291>] scsi_eh_ready_devs+0x91/0x110
[<ffffffff813543fd>] scsi_unjam_host+0xed/0x1f0
[<ffffffff813546a8>] scsi_error_handler+0x1a8/0x200
[<ffffffff81354500>] ? scsi_unjam_host+0x1f0/0x1f0
[<ffffffff8106ec3e>] kthread+0x9e/0xb0
[<ffffffff81509264>] kernel_thread_helper+0x4/0x10
[<ffffffff8106eba0>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff81509260>] ? gs_change+0x13/0x13
Code: 25 28 00 00 00 48 89 45 c8 31 c0 48 8b 87 80 00 00 00 48 8d b5 60 ff ff ff 89 d1 48 89 fb 41 89 d6 4c 89 fa 48 8b 80 b8 00 00 00
<48> 8b 80 50 03 00 00 48 8b 00 48 89 85 38 ff ff ff 48 8b 07 4c
RIP [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0
RSP <ffff88030920dc50>
CR2: 0000000000000350
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2dfd06036ba7ae8e7be2daf5a2fff1dac42390bf upstream.
Ocfs2 uses kiocb.*private as a flag of unsigned long size. In
commit a11f7e6 ocfs2: serialize unaligned aio, the unaligned
io flag is involved in it to serialize the unaligned aio. As
*private is not initialized in init_sync_kiocb() of do_sync_write(),
this unaligned io flag may be unexpectly set in an aligned dio.
And this will cause OCFS2_I(inode)->ip_unaligned_aio decreased
to -1 in ocfs2_dio_end_io(), thus the following unaligned dio
will hang forever at ocfs2_aiodio_wait() in ocfs2_file_aio_write().
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6a0bdffa0073857870a4ed1b4489762146359eb4 upstream.
Several bug reports have been received recently for USB mass-storage
devices that don't handle READ CAPACITY(16) commands properly. They
report bogus sizes, in some cases becoming unusable as a result.
The bugs were triggered by commit
09b6b51b0b6c1b9bb61815baf205e4d74c89ff04 (SCSI & usb-storage: add
flags for VPD pages and REPORT LUNS), which caused usb-storage to stop
overriding the SCSI level reported by devices. By default, the sd
driver will try READ CAPACITY(16) first for any device whose level is
above SCSI_SPC_2.
It seems likely that any device large enough to require the use of
READ CAPACITY(16) (i.e., 2 TB or more) would be able to handle READ
CAPACITY(10) commands properly. Indeed, I don't know of any devices
that don't handle READ CAPACITY(10) properly.
Therefore this patch (as1559) adds a new flag telling the sd driver
to try READ CAPACITY(10) before READ CAPACITY(16), and sets this flag
for every USB mass-storage device. If a device really is larger than
2 TB, sd will fall back to READ CAPACITY(16) just as it used to.
This fixes Bugzilla #43391.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
CC: "James E.J. Bottomley" <JBottomley@parallels.com>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 15fd943af50dbc5f7f4de33835795c72595f7bf4 upstream.
When inbound messages arrive, rpmsg core looks up their associated
endpoint (by destination address) and then invokes their callback.
We've made sure that endpoints will never be de-allocated after they
were found by rpmsg core, but we also need to protect against the
(rare) scenario where the rpmsg driver was just removed, and its
callback function isn't available anymore.
This is achieved by introducing a callback mutex, which must be taken
before the callback is invoked, and, obviously, before it is removed.
Reported-by: Fernando Guzman Lugo <fernando.lugo@ti.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5a081caa0414b9bbb82c17ffab9d6fe66edbb72f upstream.
When an inbound message arrives, the rpmsg core looks up its
associated endpoint and invokes the registered callback.
If a message arrives while its endpoint is being removed (because
the rpmsg driver was removed, or a recovery of a remote processor
has kicked in) we must ensure atomicity, i.e.:
- Either the ept is removed before it is found
or
- The ept is found but will not be freed until the callback returns
This is achieved by maintaining a per-ept reference count, which,
when drops to zero, will trigger deallocation of the ept.
With this in hand, it is now forbidden to directly deallocate
epts once they have been added to the endpoints idr.
Reported-by: Fernando Guzman Lugo <fernando.lugo@ti.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit abca7c4965845924f65d40e0aa1092bdd895e314 upstream.
On arches that do not support this_cpu_cmpxchg_double() slab_lock is used
to do atomic cmpxchg() on double word which contains page->_count. The
page count can be changed from get_page() or put_page() without taking
slab_lock. That corrupts page counter.
Fix it by moving page->_count out of cmpxchg_double data. So that slub
does no change it while updating slub meta-data in struct page.
[akpm@linux-foundation.org: use standard comment layout, tweak comment text]
Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 62b1a8ab9b3660bb820d8dfe23148ed6cda38574 ]
Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big)
We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.
Reverts commits :
fc6055a5ba31 net: Introduce skb_orphan_try()
87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag
Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5ee31c6898ea5537fcea160999d60dc63bc0c305 ]
In the transmit path of the bonding driver, skb->cb is used to
stash the skb->queue_mapping so that the bonding device can set its
own queue mapping. This value becomes corrupted since the skb->cb is
also used in __dev_xmit_skb.
When transmitting through bonding driver, bond_select_queue is
called from dev_queue_xmit. In bond_select_queue the original
skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
and skb->queue_mapping is overwritten with the bond driver queue.
Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
the packet length into skb->cb, thereby overwriting the stashed
queue mappping. In bond_dev_queue_xmit (called from hard_start_xmit),
the queue mapping for the skb is set to the stashed value which is now
the skb length and hence is an invalid queue for the slave device.
If we want to save skb->queue_mapping into skb->cb[], best place is to
add a field in struct qdisc_skb_cb, to make sure it wont conflict with
other layers (eg : Qdiscc, Infiniband...)
This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8
bytes :
netem qdisc for example assumes it can store an u64 in it, without
misalignment penalty.
Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
The largest user is CHOKe and it fills it.
Based on a previous patch from Tom Herbert.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tom Herbert <therbert@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]
commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :
Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.
inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 20e2a86485967c385d7c7befc1646e4d1d39362e ]
When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system
receives a CIPSO tagged packet it is dropped (cipso_v4_validate()
returns non-zero). In most cases this is the correct and desired
behavior, however, in the case where we are simply forwarding the
traffic, e.g. acting as a network bridge, this becomes a problem.
This patch fixes the forwarding problem by providing the basic CIPSO
validation code directly in ip_options_compile() without the need for
the NetLabel or CIPSO code. The new validation code can not perform
any of the CIPSO option label/value verification that
cipso_v4_validate() does, but it can verify the basic CIPSO option
format.
The behavior when NetLabel is enabled is unchanged.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c2fb8a3fa25513de8fedb38509b1f15a5bbee47b upstream.
This patch (as1558) fixes a problem affecting several ASUS computers:
The machine crashes or corrupts memory when going into suspend if the
ehci-hcd driver is bound to any controllers. Users have been forced
to unbind or unload ehci-hcd before putting their systems to sleep.
After extensive testing, it was determined that the machines don't
like going into suspend when any EHCI controllers are in the PCI D3
power state. Presumably this is a firmware bug, but there's nothing
we can do about it except to avoid putting the controllers in D3
during system sleep.
The patch adds a new flag to indicate whether the problem is present,
and avoids changing the controller's power state if the flag is set.
Runtime suspend is unaffected; this matters only for system suspend.
However as a side effect, the controller will not respond to remote
wakeup requests while the system is asleep. Hence USB wakeup is not
functional -- but of course, this is already true in the current state
of affairs.
A similar patch has already been applied as commit
151b61284776be2d6f02d48c23c3625678960b97 (USB: EHCI: fix crash during
suspend on ASUS computers). The patch supersedes that one and reverts
it. There are two differences:
The old patch added the flag at the USB level; this patch
adds it at the PCI level.
The old patch applied to all chipsets with the same vendor,
subsystem vendor, and product IDs; this patch makes an
exception for a known-good system (based on DMI information).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9b15b817f3d62409290fd56fe3cbb076a931bb0a upstream.
Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.
Whoops. Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there. Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.
Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().
This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE. It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.
Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check. Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.
Reported-and-Reviewed-and-Tested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db63a4c8115a0bb904496e1cdd3e7488e68b0d06 upstream.
Where devices are visible via more than one host we sometimes wish to
indicate that cirtain devices should be ignored on a specific host. Add a
host flag indicating that this host wishes to ignore ATA specific devices.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Victor Miasnikov <vvm@tut.by>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ae82fdb1406ad41d68f07027fe31f2d35ba22a90 upstream.
Commit 026cee0086fe1df4cf74691cf273062cc769617d "params:
<level>_initcall-like kernel parameters" set old-style module
parameters to level 0. And we call those level 0 calls where we used
to, early in start_kernel().
We also loop through the initcall levels and call the levelled
module_params before the corresponding initcall. Unfortunately level
0 is early_init(), so we call the standard module_param calls twice.
(Turns out most things don't care, but at least ubi.mtd does).
Change the level to -1 for standard module_param calls.
Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7aaa61b3476462b69f1ac7669fcca8d608ce3cb5 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a2bef8ce826dd1e787fd8ad9b6e0566ba59dab43 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4a6991cc1fad514745b79181df3ace72d561e7aa upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d430f7dbf7bd6aaaa40c0660b3204df8cf07b22b upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fffaee365fded09f9ebf2db19066065fa54323c3 upstream.
This patch fixes bug in macro radix_tree_for_each_contig().
If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following
radix_tree_next_chunk() switches iterating into next chunk. As result iterating
becomes non-contiguous and breaks vfs "splice" and all its users.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org>
Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lkml.org/lkml/2012/6/5/64
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 617c8c11236716dcbda877e764b7bf37c6fd8063 ]
At the beginning of __skb_cow, headroom gets set to a minimum of
NET_SKB_PAD. This causes unnecessary reallocations if the buffer was not
cloned and the headroom is just below NET_SKB_PAD, but still more than the
amount requested by the caller.
This was showing up frequently in my tests on VLAN tx, where
vlan_insert_tag calls skb_cow_head(skb, VLAN_HLEN).
Locally generated packets should have enough headroom, and for forward
paths, we already have NET_SKB_PAD bytes of headroom, so we don't need to
add any extra space here.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0c1833797a5a6ec23ea9261d979aa18078720b74 ]
Since commit ad0081e43a
"ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
the fragment of packets is incorrect.
because tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.
so modify mtu and maxfraglen base on ipsec mode and if fragment is first
or last.
with my test,it work well(every fragment's size is the mtu)
and does not trigger slow fragment path.
Changes from v1:
though optimization, mtu_prev and maxfraglen_prev can be delete.
replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
add fuction ip6_append_data_mtu to make codes clearer.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9295b7a07c859a42346221b5839be0ae612333b0 upstream.
Programs using /proc/kpageflags need to know about the various flags. The
<linux/kernel-page-flags.h> provides them and the comments in the file
indicate that it is supposed to be used by user-level code. But the file
is not installed.
Install the headers and mark the unstable flags as out-of-bounds. The
page-type tool is also adjusted to not duplicate the definitions
Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bbbc4c4d8c5face097d695f9bf3a39647ba6b7e7 upstream.
Commit 06e8935feb ("optimized SDIO IRQ handling for single irq")
introduced some spurious calls to SDIO function interrupt handlers,
such as when the SDIO IRQ thread is started, or the safety check
performed upon a system resume. Let's add a flag to perform the
optimization only when a real interrupt is signaled by the host
driver and we know there is no point confirming it.
Reported-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 68c2c39a76b094e9b2773e5846424ea674bf2c46 upstream.
PV on HVM guests map GSIs into event channels. At restore time the
event channels are resumed by restore_pirqs.
Device drivers might try to register the same GSI again through ACPI at
restore time, but the GSI has already been mapped and bound by
restore_pirqs. This patch detects these situations and avoids
mapping the same GSI multiple times.
Without this patch we get:
(XEN) irq.c:2235: dom4: pirq 23 or emuirq 28 already mapped
and waste a pirq.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8815bb09af21316aeb5f8948b24ac62181670db2 upstream.
On some HCDs usb_unlink_urb() can directly call the
completion handler. That limits the spinlocks that can
be taken in the handler to locks not held while calling
usb_unlink_urb()
To prevent a race with resubmission, this patch exposes
usbcore's infrastructure for blocking submission, uses it
and so drops the lock without causing a race in usbhid.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull block layer fixes from Jens Axboe:
"A few small, but important fixes. Most of them are marked for stable
as well
- Fix failure to release a semaphore on error path in mtip32xx.
- Fix crashable condition in bio_get_nr_vecs().
- Don't mark end-of-disk buffers as mapped, limit it to i_size.
- Fix for build problem with CONFIG_BLOCK=n on arm at least.
- Fix for a buffer overlow on UUID partition printing.
- Trivial removal of unused variables in dac960."
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix buffer overflow when printing partition UUIDs
Fix blkdev.h build errors when BLOCK=n
bio allocation failure due to bio_get_nr_vecs()
block: don't mark buffers beyond end of disk as mapped
mtip32xx: release the semaphore on an error path
dac960: Remove unused variables from DAC960_CreateProcEntries()
|
|
'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf, x86 and scheduler updates from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing: Do not enable function event with enable
perf stat: handle ENXIO error for perf_event_open
perf: Turn off compiler warnings for flex and bison generated files
perf stat: Fix case where guest/host monitoring is not supported by kernel
perf build-id: Fix filename size calculation
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable
x86: Fix section annotation of acpi_map_cpu2node()
x86/microcode: Ensure that module is only loaded on supported Intel CPUs
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption
|
|
Pull networking tree from David Miller:
1) ptp_pch driver build broke during this merge window due to missing
slab.h header, fix from Geery Uytterhoeven.
2) If ipset passes in a bogus hash table size we crash because the size
is not validated properly. Compounding this, gcc-4.7 can miscompile
ipset such that even when the user specifies legitimate parameters
the tool passes in an out-of-range size to the kernel.
Fix from Jozsef Kadlecsik.
3) Users have reported that the netdev watchdog can trigger with pch_gbe
devices, and it turns out this is happening because of races in the
TX path of the driver leading to the transmitter hanging. Fix from
Eric Dumazet, reported and tested by Andy Cress.
4) Novatel USB551L devices match the generic class entries for the cdc
ethernet USB driver, but they don't work because they have generic
descriptors and thus need FLAG_WWAN to function properly.
Add the necessary ID table entry to fix this, from Dan Williams.
5) A recursive locking fix in the USBNET driver added a new problem, in
that packet list traversal is now racy and we can thus access
unlinked SKBs and crash.
Avoid this situation by adding some extra state tracking, from Ming
Lei.
6) The rtlwifi conversion to asynchronous firmware loading is racy, fix
by reordering the probe procedure. From Larry Finger.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43187
7) Fix regressions with bluetooth keyboards by notifying userland
properly when the security level changes, from Gustavo Padovan.
8) Bluetooth needs to make sure device connected events are emitted
before other kinds of events, otherwise userspace will think there is
no baseband link yet and therefore abort the sockets associated with
that connection.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netfilter: ipset: fix hash size checking in kernel
ptp_pch: Add missing #include <linux/slab.h>
pch_gbe: fix transmit races
cdc_ether: add Novatel USB551L device IDs for FLAG_WWAN
usbnet: fix skb traversing races during unlink(v2)
Bluetooth: mgmt: Fix device_connected sending order
Bluetooth: notify userspace of security level change
rtlwifi: fix for race condition when firmware is cached
|
|
The hash size must fit both into u32 (jhash) and the max value of
size_t. The missing checking could lead to kernel crash, bug reported
by Seblu.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John Linville says:
Here are three more fixes that some of my developers are desperate to
see included in 3.4...
Johan Hedberg went to some length justifyng the inclusion of these two
Bluetooth fixes:
"The device_connected fix should be quite self-explanatory, but it's
actually a wider issue than just for keyboards. All profiles that do
incoming connection authorization (e.g. headsets) will break without it
with specific hardware. The reason it wasn't caught earlier is that it
only occurs with specific Bluetooth adapters.
As for the security level patch, this fixes L2CAP socket based security
level elevation during a connection. The HID profile needs this (for
keyboards) and it is the only way to achieve the security level
elevation when using the management interface to talk to the kernel
(hence the management enabling patch being the one that exposes this"
The rtlwifi fix addresses a regression related to firmware loading,
as described in kernel.org bug 43187. It basically just moves a hunk
of code to a more appropriate place.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
Commit 4231d47e6fe69f061f96c98c30eaf9fb4c14b96d(net/usbnet: avoid
recursive locking in usbnet_stop()) fixes the recursive locking
problem by releasing the skb queue lock before unlink, but may
cause skb traversing races:
- after URB is unlinked and the queue lock is released,
the refered skb and skb->next may be moved to done queue,
even be released
- in skb_queue_walk_safe, the next skb is still obtained
by next pointer of the last skb
- so maybe trigger oops or other problems
This patch extends the usage of entry->state to describe 'start_unlink'
state, so always holding the queue(rx/tx) lock to change the state if
the referd skb is in rx or tx queue because we need to know if the
refered urb has been started unlinking in unlink_urbs.
The other part of this patch is based on Huajun's patch:
always traverse from head of the tx/rx queue to get skb which is
to be unlinked but not been started unlinking.
Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
6d1d8050b4bc8 "block, partition: add partition_meta_info to hd_struct"
added part_unpack_uuid() which assumes that the passed in buffer has
enough space for sprintfing "%pU" - 37 characters including '\0'.
Unfortunately, b5af921ec0233 "init: add support for root devices
specified by partition UUID" supplied 33 bytes buffer to the function
leading to the following panic with stackprotector enabled.
Kernel panic - not syncing: stack-protector: Kernel stack corrupted in: ffffffff81b14c7e
[<ffffffff815e226b>] panic+0xba/0x1c6
[<ffffffff81b14c7e>] ? printk_all_partitions+0x259/0x26xb
[<ffffffff810566bb>] __stack_chk_fail+0x1b/0x20
[<ffffffff81b15c7e>] printk_all_paritions+0x259/0x26xb
[<ffffffff81aedfe0>] mount_block_root+0x1bc/0x27f
[<ffffffff81aee0fa>] mount_root+0x57/0x5b
[<ffffffff81aee23b>] prepare_namespace+0x13d/0x176
[<ffffffff8107eec0>] ? release_tgcred.isra.4+0x330/0x30
[<ffffffff81aedd60>] kernel_init+0x155/0x15a
[<ffffffff81087b97>] ? schedule_tail+0x27/0xb0
[<ffffffff815f4d24>] kernel_thread_helper+0x5/0x10
[<ffffffff81aedc0b>] ? start_kernel+0x3c5/0x3c5
[<ffffffff815f4d20>] ? gs_change+0x13/0x13
Increase the buffer size, remove the dangerous part_unpack_uuid() and
use snprintf() directly from printk_all_partitions().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Szymon Gruszczynski <sz.gruszczynski@googlemail.com>
Cc: Will Drewry <wad@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"For a some fix patches for v3.4, including a regression fix at DVB core"
Fix up trivial conflicts in Documentation/feature-removal-schedule.txt
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] gspca - sonixj: Fix a zero divide in isoc interrupt
[media] media: videobuf2-dma-contig: include header for exported symbols
[media] media: videobuf2-dma-contig: quiet sparse noise about plain integer as NULL pointer
[media] media: vb2-memops: Export vb2_get_vma symbol
[media] s5p-fimc: Correct memory allocation for VIDIOC_CREATE_BUFS
[media] s5p-fimc: Fix locking in subdev set_crop op
[media] dvb_frontend: fix a regression with DVB-S zig-zag
[media] fintek-cir: change || to &&
[media] V4L: Schedule V4L2_CID_HCENTER, V4L2_CID_VCENTER controls for removal
[media] rc: Postpone ISR registration
[media] marvell-cam: fix an ARM build error
[media] V4L: soc-camera: protect hosts during probing from overzealous user-space
|
|
It fixes L2CAP socket based security level elevation during a
connection. The HID profile needs this (for keyboards) and it is the only
way to achieve the security level elevation when using the management
interface to talk to the kernel (hence the management enabling patch
being the one that exposes this issue).
It enables the userspace a security level change when the socket is
already connected and create a way to notify the socket the result of the
request. At the moment of the request the socket is made non writable, if
the request fails the connections closes, otherwise the socket is made
writable again, POLL_OUT is emmited.
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
I see builds failing with:
CC [M] drivers/mmc/host/dw_mmc.o
In file included from drivers/mmc/host/dw_mmc.c:15:
include/linux/blkdev.h:1404: warning: 'struct task_struct' declared inside parameter list
include/linux/blkdev.h:1404: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/blkdev.h:1408: warning: 'struct task_struct' declared inside parameter list
include/linux/blkdev.h:1413: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'blk_needs_flush_plug'
make[4]: *** [drivers/mmc/host/dw_mmc.o] Error 1
This is because dw_mmc.c includes linux/blkdev.h as the very first file,
and when CONFIG_BLOCK=n, blkdev.h omits all includes.
As it requires linux/sched.h even when CONFIG_BLOCK=n, move this out of
the #ifdef.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull networking fixes from David S. Miller:
1) Since we do RCU lookups on ipv4 FIB entries, we have to test if the
entry is dead before returning it to our caller.
2) openvswitch locking and packet validation fixes from Ansis Atteka,
Jesse Gross, and Pravin B Shelar.
3) Fix PM resume locking in IGB driver, from Benjamin Poirier.
4) Fix VLAN header handling in vhost-net and macvtap, from Basil Gor.
5) Revert a bogus network namespace isolation change that was causing
regressions on S390 networking devices.
6) If bonding decides to process and handle a LACPDU frame, we
shouldn't bump the rx_dropped counter. From Jiri Bohac.
7) Fix mis-calculation of available TX space in r8169 driver when doing
TSO, which can lead to crashes and/or hung device. From Julien
Ducourthial.
8) SCTP does not validate cached routes properly in all cases, from
Nicolas Dichtel.
9) Link status interrupt needs to be handled in ks8851 driver, from
Stephen Boyd.
10) Use capable(), not cap_raised(), in connector/userns netlink code.
From Eric W. Biederman via Andrew Morton.
11) Fix pktgen OOPS on module unload, from Eric Dumazet.
12) iwlwifi under-estimates SKB truesizes, also from Eric Dumazet.
13) Cure division by zero in SFC driver, from Ben Hutchings.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
ks8851: Update link status during link change interrupt
macvtap: restore vlan header on user read
vhost-net: fix handle_rx buffer size
bonding: don't increase rx_dropped after processing LACPDUs
connector/userns: replace netlink uses of cap_raised() with capable()
sctp: check cached dst before using it
pktgen: fix crash at module unload
Revert "net: maintain namespace isolation between vlan and real device"
ehea: fix losing of NEQ events when one event occurred early
igb: fix rtnl race in PM resume path
ipv4: Do not use dead fib_info entries.
r8169: fix unsigned int wraparound with TSO
sfc: Fix division by zero when using one RX channel and no SR-IOV
openvswitch: Validation of IPv6 set port action uses IPv4 header
net: compare_ether_addr[_64bits]() has no ordering
cdc_ether: Ignore bogus union descriptor for RNDIS devices
bnx2x: bug fix when loading after SAN boot
e1000: Silence sparse warnings by correcting type
igb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path
openvswitch: Release rtnl_lock if ovs_vport_cmd_build_info() failed.
...
|
|
Hi,
We have a bug report open where a squashfs image mounted on ppc64 would
exhibit errors due to trying to read beyond the end of the disk. It can
easily be reproduced by doing the following:
[root@ibm-p750e-02-lp3 ~]# ls -l install.img
-rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
[root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
[root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
dd: reading `/dev/loop0': Input/output error
277376+0 records in
277376+0 records out
142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s
In dmesg, you'll find the following:
squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 43.106012] attempt to access beyond end of device
[ 43.106029] loop0: rw=0, want=277410, limit=277408
[ 43.106039] Buffer I/O error on device loop0, logical block 138704
[ 43.106053] attempt to access beyond end of device
[ 43.106057] loop0: rw=0, want=277412, limit=277408
[ 43.106061] Buffer I/O error on device loop0, logical block 138705
[ 43.106066] attempt to access beyond end of device
[ 43.106070] loop0: rw=0, want=277414, limit=277408
[ 43.106073] Buffer I/O error on device loop0, logical block 138706
[ 43.106078] attempt to access beyond end of device
[ 43.106081] loop0: rw=0, want=277416, limit=277408
[ 43.106085] Buffer I/O error on device loop0, logical block 138707
[ 43.106089] attempt to access beyond end of device
[ 43.106093] loop0: rw=0, want=277418, limit=277408
[ 43.106096] Buffer I/O error on device loop0, logical block 138708
[ 43.106101] attempt to access beyond end of device
[ 43.106104] loop0: rw=0, want=277420, limit=277408
[ 43.106108] Buffer I/O error on device loop0, logical block 138709
[ 43.106112] attempt to access beyond end of device
[ 43.106116] loop0: rw=0, want=277422, limit=277408
[ 43.106120] Buffer I/O error on device loop0, logical block 138710
[ 43.106124] attempt to access beyond end of device
[ 43.106128] loop0: rw=0, want=277424, limit=277408
[ 43.106131] Buffer I/O error on device loop0, logical block 138711
[ 43.106135] attempt to access beyond end of device
[ 43.106139] loop0: rw=0, want=277426, limit=277408
[ 43.106143] Buffer I/O error on device loop0, logical block 138712
[ 43.106147] attempt to access beyond end of device
[ 43.106151] loop0: rw=0, want=277428, limit=277408
[ 43.106154] Buffer I/O error on device loop0, logical block 138713
[ 43.106158] attempt to access beyond end of device
[ 43.106162] loop0: rw=0, want=277430, limit=277408
[ 43.106166] attempt to access beyond end of device
[ 43.106169] loop0: rw=0, want=277432, limit=277408
...
[ 43.106307] attempt to access beyond end of device
[ 43.106311] loop0: rw=0, want=277470, limit=2774
Squashfs manages to read in the end block(s) of the disk during the
mount operation. Then, when dd reads the block device, it leads to
block_read_full_page being called with buffers that are beyond end of
disk, but are marked as mapped. Thus, it would end up submitting read
I/O against them, resulting in the errors mentioned above. I fixed the
problem by modifying init_page_buffers to only set the buffer mapped if
it fell inside of i_size.
Cheers,
Jeff
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Nick Piggin <npiggin@kernel.dk>
--
Changes from v1->v2: re-used max_block, as suggested by Nick Piggin.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
dst_check() will take care of SA (and obsolete field), hence
IPsec rekeying scenario is taken into account.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Vlad Yaseivch <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 8a83a00b0735190384a348156837918271034144.
It causes regressions for S390 devices, because it does an
unconditional DST drop on SKBs for vlans and the QETH device
needs the neighbour entry hung off the DST for certain things
on transmit.
Arnd can't remember exactly why he even needed this change.
Conflicts:
drivers/net/macvlan.c
net/8021q/vlan_dev.c
net/core/dev.c
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With the adding of function tracing event to perf, it caused a
side effect that produces the following warning when enabling all
events in ftrace:
# echo 1 > /sys/kernel/debug/tracing/events/enable
[console]
event trace: Could not enable event function
This is because when enabling all events via the debugfs system
it ignores events that do not have a ->reg() function assigned.
This was to skip over the ftrace internal events (as they are
not TRACE_EVENTs). But as the ftrace function event now has
a ->reg() function attached to it for use with perf, it is no
longer ignored.
Worse yet, this ->reg() function is being called when it should
not be. It returns an error and causes the above warning to
be printed.
By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE)
and have all ftrace internel event structures have it set,
setting the events/enable will no longe try to incorrectly enable
the function event and does not warn.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|