summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2014-11-18net: rfkill-gpio: Add handle clock for rfkill gpioBeomho Seo1-3/+4
WORKAROUND: Temporary workaround for bluetooth enable. This patch add clk parse and clk get function for using bluetooth. Change-Id: Id72770e73c8b5cc895ad6a2e331899380beca7b7 Signed-off-by: Beomho Seo <beomho.seo@samsung.com>
2014-11-18net: rfkill-gpio: add low power mode for bluetooth chipSeung-Woo Kim1-0/+110
WORKAROUND: Temporary workaround for bluetooth enable. This patch add LP mode for bluetooth chip. Change-Id: I1badf49085e58cec08704581beb0458ffbeaee64 Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
2014-11-18Bluetooth: add hci event notificationSeung-Woo Kim1-0/+17
WORKAROUND: Temporary workaround for bluetooth enable. For specific hci events, operations are needed. So this patch adds event notification and its notifier registration. Change-Id: I6f8fe0b3b90c12d7f4fe35823d25b9fbdcd98d06 Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
2014-11-18net: rfkill-gpio: add host wake and wake pinsSeung-Woo Kim1-0/+99
WORKAROUND: Temporary workaround for bluetooth enable. The wake pin is used for waking up the hw and host wake pin is used for waking up host processor during sleep. Additionally, This patch add reset-gpio property and devm_gpio_request_one function. Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com> [This patch rebased by Beomho Seo] Signed-off-by: Beomho Seo <beomho.seo@samsung.com> Change-Id: Ibb2f74cae6e1d11ae84172d0f49a10563fc57e7f
2014-11-18net: rfkill-gpio: fix initial state as blockedSeung-Woo Kim1-0/+4
WORKAROUND: Temporary worgaround for bluetooth enable. Initial state is not set as blocked, so it is always unblocked after booting. This patch fixes initual state of rfkill-gpio as blocked. Change-Id: Ide67763d10e2ef3ea75fe58d917cd725b26e2cab Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
2014-11-18net: rfkill: add line for parsing shutdown gpio prooertyBeomho Seo1-0/+1
This patch add line in rfkill_gpio_dt_probe function. "shutdown-gpio" property is added on device node already. Change-Id: I4c6861178cafa64d2017a26f5dbe9e311db2004c Signed-off-by: Beomho Seo <beomho.seo@samsung.com>
2014-11-18net: rfkill: gpio: add device tree supportChen-Yu Tsai1-1/+24
Change-Id: Ic179fcf929d3555a29addd39024011537ea041fa Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2014-11-18net: rfkill: gpio: fix reversed clock enable stateChen-Yu Tsai1-1/+1
rfkill-gpio has clk_enabled = blocked, which is true when rfkill blocks the device. This results in calling clock enable/disable at the wrong time. Reversing the value fixes this. Change-Id: Id13a2bd8d7314b29753d2cdedca62b27fa9fd1d9 Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2014-11-18net: rfkill: gpio: use clk_prepare_enable/clk_disable_unprepareChen-Yu Tsai1-2/+2
rfkill-gpio calls clk_enable() without first calling clk_prepare(), resulting in a warning and no effect. Switch to clk_prepare_enable() and clk_disable_unprepare. Change-Id: I4cb7ed7ff551daab034deb300f6def1c44721fe3 Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2014-11-18net: rfkill: gpio: prepare for DT and ACPI supportHeikki Krogerus1-41/+51
This will add the relevant values like the gpios and the type in rfkill_gpio_platform_data to the rfkill_gpio_data structure. It will allow those values to be easily picked from DT and ACPI tables later. Change-Id: I0851561b648c52be17dbe9a7bb01b0b62ff621a6 Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Rhyland Klein <rklein@nvidia.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-18net: rfkill: gpio: spinlock-safe GPIO accessHeikki Krogerus1-8/+8
This sets the direction of the gpio once when it's requested, and uses the spinlock-safe gpio_set_state() to change the state. Change-Id: I1e9e9035dc9451d7531d230645d5f1572130a77b Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Rhyland Klein <rklein@nvidia.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-18net: rfkill: gpio: clean up clock handlingHeikki Krogerus2-28/+9
Use a simple flag to see the state of the clock, and make the clock available even without a name. Also, get rid of HAVE_CLK dependency. Change-Id: Ie113c861a296a57ff32f5a3ee95ef0b9d9fff7e4 Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Rhyland Klein <rklein@nvidia.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-18net: rfkill: gpio: convert to resource managed allocationHeikki Krogerus1-55/+20
And remove now unneeded resource freeing. Change-Id: Ie6baa61361effe095de95c5ac19cd19f6ddf4475 Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Rhyland Klein <rklein@nvidia.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-18netfilter: x_tables: lightweight process control group matchingDaniel Borkmann3-0/+82
It would be useful e.g. in a server or desktop environment to have a facility in the notion of fine-grained "per application" or "per application group" firewall policies. Probably, users in the mobile, embedded area (e.g. Android based) with different security policy requirements for application groups could have great benefit from that as well. For example, with a little bit of configuration effort, an admin could whitelist well-known applications, and thus block otherwise unwanted "hard-to-track" applications like [1] from a user's machine. Blocking is just one example, but it is not limited to that, meaning we can have much different scenarios/policies that netfilter allows us than just blocking, e.g. fine grained settings where applications are allowed to connect/send traffic to, application traffic marking/conntracking, application-specific packet mangling, and so on. Implementation of PID-based matching would not be appropriate as they frequently change, and child tracking would make that even more complex and ugly. Cgroups would be a perfect candidate for accomplishing that as they associate a set of tasks with a set of parameters for one or more subsystems, in our case the netfilter subsystem, which, of course, can be combined with other cgroup subsystems into something more complex if needed. As mentioned, to overcome this constraint, such processes could be placed into one or multiple cgroups where different fine-grained rules can be defined depending on the application scenario, while e.g. everything else that is not part of that could be dropped (or vice versa), thus making life harder for unwanted processes to communicate to the outside world. So, we make use of cgroups here to track jobs and limit their resources in terms of iptables policies; in other words, limiting, tracking, etc what they are allowed to communicate. In our case we're working on outgoing traffic based on which local socket that originated from. Also, one doesn't even need to have an a-prio knowledge of the application internals regarding their particular use of ports or protocols. Matching is *extremly* lightweight as we just test for the sk_classid marker of sockets, originating from net_cls. net_cls and netfilter do not contradict each other; in fact, each construct can live as standalone or they can be used in combination with each other, which is perfectly fine, plus it serves Tejun's requirement to not introduce a new cgroups subsystem. Through this, we result in a very minimal and efficient module, and don't add anything except netfilter code. One possible, minimal usage example (many other iptables options can be applied obviously): 1) Configuring cgroups if not already done, e.g.: mkdir /sys/fs/cgroup/net_cls mount -t cgroup -o net_cls net_cls /sys/fs/cgroup/net_cls mkdir /sys/fs/cgroup/net_cls/0 echo 1 > /sys/fs/cgroup/net_cls/0/net_cls.classid (resp. a real flow handle id for tc) 2) Configuring netfilter (iptables-nftables), e.g.: iptables -A OUTPUT -m cgroup ! --cgroup 1 -j DROP 3) Running applications, e.g.: ping 208.67.222.222 <pid:1799> echo 1799 > /sys/fs/cgroup/net_cls/0/tasks 64 bytes from 208.67.222.222: icmp_seq=44 ttl=49 time=11.9 ms [...] ping 208.67.220.220 <pid:1804> ping: sendmsg: Operation not permitted [...] echo 1804 > /sys/fs/cgroup/net_cls/0/tasks 64 bytes from 208.67.220.220: icmp_seq=89 ttl=56 time=19.0 ms [...] Of course, real-world deployments would make use of cgroups user space toolsuite, or own custom policy daemons dynamically moving applications from/to various cgroups. [1] http://www.blackhat.com/presentations/bh-europe-06/bh-eu-06-biondi/bh-eu-06-biondi-up.pdf Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: cgroups@vger.kernel.org Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [backport from upstream commit 82a37132f300ea53bdcd812917af5a6329ec80c3] Signed-off-by: Przemyslaw Kedzierski <p.kedzierski@samsung.com> Change-Id: Iac82ecef5b31a50f52ad9329bdd0403c667f154d
2014-11-18upstream: treewide: Fix typo in KconfigMasanari Iida1-1/+1
Correct spelling typo in Kconfig. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Conflicts: arch/arm/mach-sti/Kconfig drivers/fmc/Kconfig
2014-11-18upstream: treewide: Fix common typo in "identify"Maxime Jayat2-3/+3
Correct common misspelling of "identify" as "indentify" throughout the kernel Signed-off-by: Maxime Jayat <maxime@artisandeveloppeur.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-11-18upstream: drivers: avoid parsing names as kthread_run() format stringsKees Cook1-1/+1
Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-18upstream: drivers: avoid format strings in names passed to alloc_workqueue()Kees Cook2-6/+5
For the workqueue creation interfaces that do not expect format strings, make sure they cannot accidently be parsed that way. Additionally, clean up calls made with a single parameter that would be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-14libceph: ceph-msgr workqueue needs a resque workerIlya Dryomov1-1/+2
commit f9865f06f7f18c6661c88d0511f05c48612319cc upstream. Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant") effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq. This is wrong - libceph is very much a memory reclaim path, so restore it. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Tested-by: Micha Krause <micha@krausam.de> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14mac80211: fix typo in starting baserate for rts_cts_rate_idxKarl Beldan1-1/+1
commit c7abf25af0f41be4b50d44c5b185d52eea360cb8 upstream. It affects non-(V)HT rates and can lead to selecting an rts_cts rate that is not a basic rate or way superior to the reference rate (ATM rates[0] used for the 1st attempt of the protected frame data). E.g, assuming drivers register growing (bitrate) sorted tables of ieee80211_rate-s, having : - rates[0].idx == d'2 and basic_rates == b'10100 will select rts_cts idx b'10011 & ~d'(BIT(2)-1), i.e. 1, likewise - rates[0].idx == d'2 and basic_rates == b'10001 will select rts_cts idx b'10000 The first is not a basic rate and the second is > rates[0]. Also, wrt severity of the addressed misbehavior, ATM we only have one rts_cts_rate_idx rather than one per rate table entry, so this idx might still point to bitrates > rates[1..MAX_RATES]. Fixes: 5253ffb8c9e1 ("mac80211: always pick a basic rate to tx RTS/CTS for pre-HT rates") Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14fix misuses of f_count() in ppp and netlinkAl Viro1-1/+1
commit 24dff96a37a2ca319e75a74d3929b2de22447ca6 upstream. we used to check for "nobody else could start doing anything with that opened file" by checking that refcount was 2 or less - one for descriptor table and one we'd acquired in fget() on the way to wherever we are. That was race-prone (somebody else might have had a reference to descriptor table and do fget() just as we'd been checking) and it had become flat-out incorrect back when we switched to fget_light() on those codepaths - unlike fget(), it doesn't grab an extra reference unless the descriptor table is shared. The same change allowed a race-free check, though - we are safe exactly when refcount is less than 2. It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading to ppp one) and 2.6.17 for sendmsg() (netlink one). OTOH, netlink hadn't grown that check until 3.9 and ppp used to live in drivers/net, not drivers/net/ppp until 3.1. The bug existed well before that, though, and the same fix used to apply in old location of file. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ipv4: dst_entry leak in ip_send_unicast_reply()Vasily Averin1-3/+9
[ Upstream commit 4062090e3e5caaf55bed4523a69f26c3265cc1d2 ] ip_setup_cork() called inside ip_append_data() steals dst entry from rt to cork and in case errors in __ip_append_data() nobody frees stolen dst entry Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()") Signed-off-by: Vasily Averin <vvs@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ipv4: fix nexthop attlen check in fib_nh_matchJiri Pirko1-1/+1
[ Upstream commit f76936d07c4eeb36d8dbb64ebd30ab46ff85d9f7 ] fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15ipv6: reallocate addrconf router for ipv6 address when lo device upGao feng1-2/+12
[ Upstream commit 33d99113b1102c2d2f8603b9ba72d89d915c13f5 ] commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f "net IPv6 : Fix broken IPv6 routing table after loopback down-up" allocates addrconf router for ipv6 address when lo device up. but commit a881ae1f625c599b460cc8f8a7fcb1c438f699ad "ipv6:don't call addrconf_dst_alloc again when enable lo" breaks this behavior. Since the addrconf router is moved to the garbage list when lo device down, we should release this router and rellocate a new one for ipv6 address when lo device up. This patch solves bug 67951 on bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=67951 change from v1: use ip6_rt_put to repleace ip6_del_rt, thanks Hannes! change code style, suggested by Sergei. CC: Sabrina Dubroca <sd@queasysnail.net> CC: Hannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15tcp: fixing TLP's FIN recoveryPer Hurtig1-3/+1
[ Upstream commit bef1909ee3ed1ca39231b260a8d3b4544ecd0c8f ] Fix to a problem observed when losing a FIN segment that does not contain data. In such situations, TLP is unable to recover from *any* tail loss and instead adds at least PTO ms to the retransmission process, i.e., RTO = RTO + PTO. Signed-off-by: Per Hurtig <per.hurtig@kau.se> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15sctp: handle association restarts when the socket is closed.Vlad Yasevich1-3/+16
[ Upstream commit bdf6fa52f01b941d4a80372d56de465bdbbd1d23 ] Currently association restarts do not take into consideration the state of the socket. When a restart happens, the current assocation simply transitions into established state. This creates a condition where a remote system, through a the restart procedure, may create a local association that is no way reachable by user. The conditions to trigger this are as follows: 1) Remote does not acknoledge some data causing data to remain outstanding. 2) Local application calls close() on the socket. Since data is still outstanding, the association is placed in SHUTDOWN_PENDING state. However, the socket is closed. 3) The remote tries to create a new association, triggering a restart on the local system. The association moves from SHUTDOWN_PENDING to ESTABLISHED. At this point, it is no longer reachable by any socket on the local system. This patch addresses the above situation by moving the newly ESTABLISHED association into SHUTDOWN-SENT state and bundling a SHUTDOWN after the COOKIE-ACK chunk. This way, the restarted associate immidiately enters the shutdown procedure and forces the termination of the unreachable association. Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15ip6_gre: fix flowi6_proto value in xmit pathNicolas Dichtel1-2/+2
[ Upstream commit 3be07244b7337760a3269d56b2f4a63e72218648 ] In xmit path, we build a flowi6 which will be used for the output route lookup. We are sending a GRE packet, neither IPv4 nor IPv6 encapsulated packet, thus the protocol should be IPPROTO_GRE. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15l2tp: fix race while getting PMTU on PPP pseudo-wireGuillaume Nault1-1/+2
[ Upstream commit eed4d839b0cdf9d84b0a9bc63de90fd5e1e886fb ] Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU. The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get() could return NULL if tunnel->sock->sk_dst_cache was reset just before the call, thus making dst_mtu() dereference a NULL pointer: [ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0 [ 1937.664005] Oops: 0000 [#1] SMP [ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core] [ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G O 3.17.0-rc1 #1 [ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008 [ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000 [ 1937.664005] RIP: 0010:[<ffffffffa049db88>] [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP: 0018:ffff8800c43c7de8 EFLAGS: 00010282 [ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5 [ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000 [ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000 [ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000 [ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009 [ 1937.664005] FS: 00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 1937.664005] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0 [ 1937.664005] Stack: [ 1937.664005] ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009 [ 1937.664005] ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57 [ 1937.664005] ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000 [ 1937.664005] Call Trace: [ 1937.664005] [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp] [ 1937.664005] [<ffffffff81109b57>] ? might_fault+0x9e/0xa5 [ 1937.664005] [<ffffffff81109b0e>] ? might_fault+0x55/0xa5 [ 1937.664005] [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26 [ 1937.664005] [<ffffffff81309196>] SYSC_connect+0x87/0xb1 [ 1937.664005] [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56 [ 1937.664005] [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1 [ 1937.664005] [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1937.664005] [<ffffffff8114c262>] ? spin_lock+0x9/0xb [ 1937.664005] [<ffffffff813092b4>] SyS_connect+0x9/0xb [ 1937.664005] [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b [ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89 [ 1937.664005] RIP [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP <ffff8800c43c7de8> [ 1937.664005] CR2: 0000000000000020 [ 1939.559375] ---[ end trace 82d44500f28f8708 ]--- Fixes: f34c4a35d879 ("l2tp: take PMTU from tunnel UDP socket") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15openvswitch: fix panic with multiple vlan headersJiri Benc1-0/+5
[ Upstream commit 2ba5af42a7b59ef01f9081234d8855140738defd ] When there are multiple vlan headers present in a received frame, the first one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the skb beyond the VLAN TPID may be still non-linear, including the inner TCI and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is executed, __pop_vlan_tci pulls the next vlan header into vlan_tci. This leads to two things: 1. Part of the resulting ethernet header is in the non-linear part of the skb. When eth_type_trans is called later as the result of OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci is in fact accessing random data when it reads past the TPID. 2. network_header points into the ethernet header instead of behind it. mac_len is set to a wrong value (10), too. Reported-by: Yulong Pei <ypei@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15packet: handle too big packets for PACKET_V3Eric Dumazet2-0/+18
[ Upstream commit dc808110bb62b64a448696ecac3938902c92e1ab ] af_packet can currently overwrite kernel memory by out of bound accesses, because it assumed a [new] block can always hold one frame. This is not generally the case, even if most existing tools do it right. This patch clamps too long frames as API permits, and issue a one time error on syslog. [ 394.357639] tpacket_rcv: packet too big, clamped from 5042 to 3966. macoff=82 In this example, packet header tp_snaplen was set to 3966, and tp_len was set to 5042 (skb->len) Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()Neal Cardwell3-4/+6
[ Upstream commit 4fab9071950c2021d846e18351e0f46a1cffd67b ] Make sure we use the correct address-family-specific function for handling MTU reductions from within tcp_release_cb(). Previously AF_INET6 sockets were incorrectly always using the IPv6 code path when sometimes they were handling IPv4 traffic and thus had an IPv4 dst. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Diagnosed-by: Willem de Bruijn <willemb@google.com> Fixes: 563d34d057862 ("tcp: dont drop MTU reduction indications") Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15sit: Fix ipip6_tunnel_lookup device matching criteriaShmulik Ladkani1-3/+3
[ Upstream commit bc8fc7b8f825ef17a0fb9e68c18ce94fa66ab337 ] As of 4fddbf5d78 ("sit: strictly restrict incoming traffic to tunnel link device"), when looking up a tunnel, tunnel's underlying interface (t->parms.link) is verified to match incoming traffic's ingress device. However the comparison was incorrectly based on skb->dev->iflink. Instead, dev->ifindex should be used, which correctly represents the interface from which the IP stack hands the ipip6 packets. This allows setting up sit tunnels bound to vlan interfaces (otherwise incoming ipip6 traffic on the vlan interface was dropped due to ipip6_tunnel_lookup match failure). Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-09nl80211: clear skb cb before passing to netlinkJohannes Berg1-0/+6
commit bd8c78e78d5011d8111bc2533ee73b13a3bd6c42 upstream. In testmode and vendor command reply/event SKBs we use the skb cb data to store nl80211 parameters between allocation and sending. This causes the code for CONFIG_NETLINK_MMAP to get confused, because it takes ownership of the skb cb data when the SKB is handed off to netlink, and it doesn't explicitly clear it. Clear the skb cb explicitly when we're done and before it gets passed to netlink to avoid this issue. Reported-by: Assaf Azulay <assaf.azulay@intel.com> Reported-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05netfilter: nf_conntrack: avoid large timeout for mid-stream pickupFlorian Westphal1-0/+6
commit 6547a221871f139cc56328a38105d47c14874cbe upstream. When loose tracking is enabled (default), non-syn packets cause creation of new conntracks in established state with default timeout for established state (5 days). This causes the table to fill up with UNREPLIED when the 'new ack' packet happened to be the last-ack of a previous, already timed-out connection. Consider: A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255 B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123 <61 second pause> C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123 D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255 B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout, C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout. Use UNACK timeout (5 minutes) instead to get rid of these entries sooner when in ESTABLISHED state without having seen traffic in both directions. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Florian Koch <florian.koch1981@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05ipvs: fix ipv6 hook registration for local repliesJulian Anastasov1-1/+1
commit eb90b0c734ad793d5f5bf230a9e9a4dcc48df8aa upstream. commit fc604767613b6d2036cdc35b660bc39451040a47 ("ipvs: changes for local real server") from 2.6.37 introduced DNAT support to local real server but the IPv6 LOCAL_OUT handler ip_vs_local_reply6() is registered incorrectly as IPv4 hook causing any outgoing IPv4 traffic to be dropped depending on the IP header values. Chris tracked down the problem to CONFIG_IP_VS_IPV6=y Bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768 Reported-by: Chris J Arges <chris.j.arges@canonical.com> Tested-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05ipvs: Maintain all DSCP and ECN bits for ipv6 tun forwardingAlex Gartrell1-1/+1
commit 76f084bc10004b3050b2cff9cfac29148f1f6088 upstream. Previously, only the four high bits of the tclass were maintained in the ipv6 case. This matches the behavior of ipv4, though whether or not we should reflect ECN bits may be up for debate. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrackJulian Anastasov1-1/+0
commit 2627b7e15c5064ddd5e578e4efd948d48d531a3f upstream. commit 8f4e0a18682d91 ("IPVS netns exit causes crash in conntrack") added second ip_vs_conn_drop_conntrack call instead of just adding the needed check. As result, the first call still can cause crash on netns exit. Remove it. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05Revert "mac80211: disable uAPSD if all ACs are under ACM"Johannes Berg1-2/+1
commit bb512ad0732232f1d2693bb68f31a76bed8f22ae upstream. This reverts commit 24aa11ab8ae03292d38ec0dbd9bc2ac49fe8a6dd. That commit was wrong since it uses data that hasn't even been set up yet, but might be a hold-over from a previous connection. Additionally, it seems like a driver-specific workaround that shouldn't have been in mac80211 to start with. Fixes: 24aa11ab8ae0 ("mac80211: disable uAPSD if all ACs are under ACM") Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17libceph: gracefully handle large reply messages from the monSage Weil1-0/+8
commit 73c3d4812b4c755efeca0140f606f83772a39ce4 upstream. We preallocate a few of the message types we get back from the mon. If we get a larger message than we are expecting, fall back to trying to allocate a new one instead of blindly using the one we have. Signed-off-by: Sage Weil <sage@redhat.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17libceph: rename ceph_msg::front_max to front_alloc_lenIlya Dryomov2-7/+7
commit 3cea4c3071d4e55e9d7356efe9d0ebf92f0c2204 upstream. Rename front_max field of struct ceph_msg to front_alloc_len to make its purpose more clear. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17libceph: do not hard code max auth ticket lenIlya Dryomov1-35/+29
commit c27a3e4d667fdcad3db7b104f75659478e0c68d8 upstream. We hard code cephx auth ticket buffer size to 256 bytes. This isn't enough for any moderate setups and, in case tickets themselves are not encrypted, leads to buffer overflows (ceph_x_decrypt() errors out, but ceph_decode_copy() doesn't - it's just a memcpy() wrapper). Since the buffer is allocated dynamically anyway, allocated it a bit later, at the point where we know how much is going to be needed. Fixes: http://tracker.ceph.com/issues/8979 Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17libceph: add process_one_ticket() helperIlya Dryomov1-104/+124
commit 597cda357716a3cf8d994cb11927af917c8d71fa upstream. Add a helper for processing individual cephx auth tickets. Needed for the next commit, which deals with allocating ticket buffers. (Most of the diff here is whitespace - view with git diff -b). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctlyIlya Dryomov1-1/+1
commit 5f740d7e1531099b888410e6bab13f68da9b1a4d upstream. Determining ->last_piece based on the value of ->page_offset + length is incorrect because length here is the length of the entire message. ->last_piece set to false even if page array data item length is <= PAGE_SIZE, which results in invalid length passed to ceph_tcp_{send,recv}page() and causes various asserts to fire. # cat pages-cursor-init.sh #!/bin/bash rbd create --size 10 --image-format 2 foo FOO_DEV=$(rbd map foo) dd if=/dev/urandom of=$FOO_DEV bs=1M &>/dev/null rbd snap create foo@snap rbd snap protect foo@snap rbd clone foo@snap bar # rbd_resize calls librbd rbd_resize(), size is in bytes ./rbd_resize bar $(((4 << 20) + 512)) rbd resize --size 10 bar BAR_DEV=$(rbd map bar) # trigger a 512-byte copyup -- 512-byte page array data item dd if=/dev/urandom of=$BAR_DEV bs=1M count=1 seek=5 The problem exists only in ceph_msg_data_pages_cursor_init(), ceph_msg_data_pages_advance() does the right thing. The size_t cast is unnecessary. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17Bluetooth: Avoid use of session socket after the session gets freedVignesh Raman1-2/+5
commit 32333edb82fb2009980eefc5518100068147ab82 upstream. The commits 08c30aca9e698faddebd34f81e1196295f9dc063 "Bluetooth: Remove RFCOMM session refcnt" and 8ff52f7d04d9cc31f1e81dcf9a2ba6335ed34905 "Bluetooth: Return RFCOMM session ptrs to avoid freed session" allow rfcomm_recv_ua and rfcomm_session_close to delete the session (and free the corresponding socket) and propagate NULL session pointer to the upper callers. Additional fix is required to terminate the loop in rfcomm_process_rx function to avoid use of freed 'sk' memory. The issue is only reproducible with kernel option CONFIG_PAGE_POISONING enabled making freed memory being changed and filled up with fixed char value used to unmask use-after-free issues. Signed-off-by: Vignesh Raman <Vignesh_Raman@mentor.com> Signed-off-by: Vitaly Kuzmichev <Vitaly_Kuzmichev@mentor.com> Acked-by: Dean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-17Bluetooth: never linger on process exitVladimir Davydov3-4/+8
commit 093facf3634da1b0c2cc7ed106f1983da901bbab upstream. If the current process is exiting, lingering on socket close will make it unkillable, so we should avoid it. Reproducer: #include <sys/types.h> #include <sys/socket.h> #define BTPROTO_L2CAP 0 #define BTPROTO_SCO 2 #define BTPROTO_RFCOMM 3 int main() { int fd; struct linger ling; fd = socket(PF_BLUETOOTH, SOCK_STREAM, BTPROTO_RFCOMM); //or: fd = socket(PF_BLUETOOTH, SOCK_DGRAM, BTPROTO_L2CAP); //or: fd = socket(PF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_SCO); ling.l_onoff = 1; ling.l_linger = 1000000000; setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling)); return 0; } Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-05svcrdma: Select NFSv4.1 backchannel transport based on forward channelChuck Lever3-1/+4
commit 3c45ddf823d679a820adddd53b52c6699c9a05ac upstream. The current code always selects XPRT_TRANSPORT_BC_TCP for the back channel, even when the forward channel was not TCP (eg, RDMA). When a 4.1 mount is attempted with RDMA, the server panics in the TCP BC code when trying to send CB_NULL. Instead, construct the transport protocol number from the forward channel transport or'd with XPRT_TRANSPORT_BC. Transports that do not support bi-directional RPC will not have registered a "BC" transport, causing create_backchannel_client() to fail immediately. Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14sctp: fix possible seqlock seadlock in sctp_packet_transmit()Eric Dumazet1-1/+1
[ Upstream commit 757efd32d5ce31f67193cc0e6a56e4dffcc42fb1 ] Dave reported following splat, caused by improper use of IP_INC_STATS_BH() in process context. BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551 caller is __this_cpu_preempt_check+0x13/0x20 CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33 ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207 0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40 ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3 Call Trace: [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100 [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20 [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp] [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp] [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0 [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80 [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp] [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp] [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp] [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100 [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp] [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp] [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp] [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160 [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30 [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220 [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220 [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0 [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0 [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0 [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0 [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330 [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10 [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2 This is a followup of commits f1d8cba61c3c4b ("inet: fix possible seqlock deadlocks") and 7f88c6b23afbd315 ("ipv6: fix possible seqlock deadlock in ip6_finish_output2") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: Dave Jones <davej@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14iovec: make sure the caller actually wants anything in memcpy_fromiovecendSasha Levin1-0/+4
[ Upstream commit 06ebb06d49486676272a3c030bfeef4bd969a8e6 ] Check for cases when the caller requests 0 bytes instead of running off and dereferencing potentially invalid iovecs. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14net: Correctly set segment mac_len in skb_segment().Vlad Yasevich1-1/+1
[ Upstream commit fcdfe3a7fa4cb74391d42b6a26dc07c20dab1d82 ] When performing segmentation, the mac_len value is copied right out of the original skb. However, this value is not always set correctly (like when the packet is VLAN-tagged) and we'll end up copying a bad value. One way to demonstrate this is to configure a VM which tags packets internally and turn off VLAN acceleration on the forwarding bridge port. The packets show up corrupt like this: 16:18:24.985548 52:54:00:ab:be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q (0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0, 0x0000: 8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d. 0x0010: 0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0.. 0x0020: 29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3 0x0030: 000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp 0x0040: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp 0x0050: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp 0x0060: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp ... This also leads to awful throughput as GSO packets are dropped and cause retransmissions. The solution is to set the mac_len using the values already available in then new skb. We've already adjusted all of the header offset, so we might as well correctly figure out the mac_len using skb_reset_mac_len(). After this change, packets are segmented correctly and performance is restored. CC: Eric Dumazet <edumazet@google.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14net: sctp: inherit auth_capable on INIT collisionsDaniel Borkmann1-0/+1
[ Upstream commit 1be9a950c646c9092fb3618197f7b6bfb50e82aa ] Jason reported an oops caused by SCTP on his ARM machine with SCTP authentication enabled: Internal error: Oops: 17 [#1] ARM CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1 task: c6eefa40 ti: c6f52000 task.ti: c6f52000 PC is at sctp_auth_calculate_hmac+0xc4/0x10c LR is at sg_init_table+0x20/0x38 pc : [<c024bb80>] lr : [<c00f32dc>] psr: 40000013 sp : c6f538e8 ip : 00000000 fp : c6f53924 r10: c6f50d80 r9 : 00000000 r8 : 00010000 r7 : 00000000 r6 : c7be4000 r5 : 00000000 r4 : c6f56254 r3 : c00c8170 r2 : 00000001 r1 : 00000008 r0 : c6f1e660 Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005397f Table: 06f28000 DAC: 00000015 Process sctp-test (pid: 104, stack limit = 0xc6f521c0) Stack: (0xc6f538e8 to 0xc6f54000) [...] Backtrace: [<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8) [<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844) [<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28) [<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220) [<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4) [<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160) [<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74) [<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888) While we already had various kind of bugs in that area ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable") and b14878ccb7fa ("net: sctp: cache auth_enable per endpoint"), this one is a bit of a different kind. Giving a bit more background on why SCTP authentication is needed can be found in RFC4895: SCTP uses 32-bit verification tags to protect itself against blind attackers. These values are not changed during the lifetime of an SCTP association. Looking at new SCTP extensions, there is the need to have a method of proving that an SCTP chunk(s) was really sent by the original peer that started the association and not by a malicious attacker. To cause this bug, we're triggering an INIT collision between peers; normal SCTP handshake where both sides intent to authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO parameters that are being negotiated among peers: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- RFC4895 says that each endpoint therefore knows its own random number and the peer's random number *after* the association has been established. The local and peer's random number along with the shared key are then part of the secret used for calculating the HMAC in the AUTH chunk. Now, in our scenario, we have 2 threads with 1 non-blocking SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling sctp_bindx(3), listen(2) and connect(2) against each other, thus the handshake looks similar to this, e.g.: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------- -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------> ... Since such collisions can also happen with verification tags, the RFC4895 for AUTH rather vaguely says under section 6.1: In case of INIT collision, the rules governing the handling of this Random Number follow the same pattern as those for the Verification Tag, as explained in Section 5.2.4 of RFC 2960 [5]. Therefore, each endpoint knows its own Random Number and the peer's Random Number after the association has been established. In RFC2960, section 5.2.4, we're eventually hitting Action B: B) In this case, both sides may be attempting to start an association at about the same time but the peer endpoint started its INIT after responding to the local endpoint's INIT. Thus it may have picked a new Verification Tag not being aware of the previous Tag it had sent this endpoint. The endpoint should stay in or enter the ESTABLISHED state but it MUST update its peer's Verification Tag from the State Cookie, stop any init or cookie timers that may running and send a COOKIE ACK. In other words, the handling of the Random parameter is the same as behavior for the Verification Tag as described in Action B of section 5.2.4. Looking at the code, we exactly hit the sctp_sf_do_dupcook_b() case which triggers an SCTP_CMD_UPDATE_ASSOC command to the side effect interpreter, and in fact it properly copies over peer_{random, hmacs, chunks} parameters from the newly created association to update the existing one. Also, the old asoc_shared_key is being released and based on the new params, sctp_auth_asoc_init_active_key() updated. However, the issue observed in this case is that the previous asoc->peer.auth_capable was 0, and has *not* been updated, so that instead of creating a new secret, we're doing an early return from the function sctp_auth_asoc_init_active_key() leaving asoc->asoc_shared_key as NULL. However, we now have to authenticate chunks from the updated chunk list (e.g. COOKIE-ACK). That in fact causes the server side when responding with ... <------------------ AUTH; COOKIE-ACK ----------------- ... to trigger a NULL pointer dereference, since in sctp_packet_transmit(), it discovers that an AUTH chunk is being queued for xmit, and thus it calls sctp_auth_calculate_hmac(). Since the asoc->active_key_id is still inherited from the endpoint, and the same as encoded into the chunk, it uses asoc->asoc_shared_key, which is still NULL, as an asoc_key and dereferences it in ... crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len) ... causing an oops. All this happens because sctp_make_cookie_ack() called with the *new* association has the peer.auth_capable=1 and therefore marks the chunk with auth=1 after checking sctp_auth_send_cid(), but it is *actually* sent later on over the then *updated* association's transport that didn't initialize its shared key due to peer.auth_capable=0. Since control chunks in that case are not sent by the temporary association which are scheduled for deletion, they are issued for xmit via SCTP_CMD_REPLY in the interpreter with the context of the *updated* association. peer.auth_capable was 0 in the updated association (which went from COOKIE_WAIT into ESTABLISHED state), since all previous processing that performed sctp_process_init() was being done on temporary associations, that we eventually throw away each time. The correct fix is to update to the new peer.auth_capable value as well in the collision case via sctp_assoc_update(), so that in case the collision migrated from 0 -> 1, sctp_auth_asoc_init_active_key() can properly recalculate the secret. This therefore fixes the observed server panic. Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>