summaryrefslogtreecommitdiff
path: root/net/xfrm
AgeCommit message (Collapse)AuthorFilesLines
2014-10-15xfrm: Generate queueing routes only from route lookup functionsSteffen Klassert1-8/+24
[ Upstream commit b8c203b2d2fc961bafd53b41d5396bbcdec55998 ] Currently we genarate a queueing route if we have matching policies but can not resolve the states and the sysctl xfrm_larval_drop is disabled. Here we assume that dst_output() is called to kill the queued packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating queueing routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: a0073fe18e71 ("xfrm: Add a state resolution packet queue") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15xfrm: Generate blackhole routes only from route lookup functionsSteffen Klassert1-1/+17
[ Upstream commit f92ee61982d6da15a9e49664ecd6405a15a2ee56 ] Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: 2774c131b1d ("xfrm: Handle blackhole route creation via afinfo.") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14xfrm: Fix installation of AH IPsec SAsTobias Brunner1-4/+3
[ Upstream commit a0e5ef53aac8e5049f9344857d8ec5237d31e58b ] The SPI check introduced in ea9884b3acf3311c8a11db67bfab21773f6f82ba was intended for IPComp SAs but actually prevented AH SAs from getting installed (depending on the SPI). Fixes: ea9884b3acf3 ("xfrm: check user specified spi for IPComp") Cc: Fan Du <fan.du@windriver.com> Signed-off-by: Tobias Brunner <tobias@strongswan.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-26net: Use netlink_ns_capable to verify the permisions of netlink messagesEric W. Biederman1-1/+1
[ Upstream commit 90f62cf30a78721641e08737bda787552428061e ] It is possible by passing a netlink socket to a more privileged executable and then to fool that executable into writing to the socket data that happens to be valid netlink message to do something that privileged executable did not intend to do. To keep this from happening replace bare capable and ns_capable calls with netlink_capable, netlink_net_calls and netlink_ns_capable calls. Which act the same as the previous calls except they verify that the opener of the socket had the desired permissions as well. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-10selinux: add gfp argument to security_xfrm_policy_alloc and fix callersNikolay Aleksandrov1-3/+3
security_xfrm_policy_alloc can be called in atomic context so the allocation should be done with GFP_ATOMIC. Add an argument to let the callers choose the appropriate way. In order to do so a gfp argument needs to be added to the method xfrm_policy_alloc_security in struct security_operations and to the internal function selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic callers and leave GFP_KERNEL as before for the rest. The path that needed the gfp argument addition is: security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security -> all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) -> selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only) Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also add it to security_context_to_sid which is used inside and prior to this patch did only GFP_KERNEL allocation. So add gfp argument to security_context_to_sid and adjust all of its callers as well. CC: Paul Moore <paul@paul-moore.com> CC: Dave Jones <davej@redhat.com> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Fan Du <fan.du@windriver.com> CC: David S. Miller <davem@davemloft.net> CC: LSM list <linux-security-module@vger.kernel.org> CC: SELinux list <selinux@tycho.nsa.gov> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-26xfrm: Fix unlink race when policies are deleted.Steffen Klassert1-1/+1
When a policy is unlinked from the lists in thread context, the xfrm timer can fire before we can mark this policy as dead. So reinitialize the bydst hlist, then hlist_unhashed() will notice that this policy is not linked and will avoid a doulble unlink of that policy. Reported-by: Xianpeng Zhao <673321875@qq.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Clone states properly on migrationSteffen Klassert2-5/+8
We loose a lot of information of the original state if we clone it with xfrm_state_clone(). In particular, there is no crypto algorithm attached if the original state uses an aead algorithm. This patch add the missing information to the clone state. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Take xfrm_state_lock in xfrm_migrate_state_findSteffen Klassert1-5/+8
A comment on xfrm_migrate_state_find() says that xfrm_state_lock is held. This is apparently not the case, but we need it to traverse through the state lists. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20xfrm: Fix NULL pointer dereference on sub policy usageSteffen Klassert1-1/+1
xfrm_state_sort() takes the unsorted states from the src array and stores them into the dst array. We try to get the namespace from the dst array which is empty at this time, so take the namespace from the src array instead. Fixes: 283bc9f35bbbc ("xfrm: Namespacify xfrm state/policy locks") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds5-217/+239
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-23Merge git://git.infradead.org/users/eparis/auditLinus Torvalds3-13/+13
Pull audit update from Eric Paris: "Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready. I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc rgb@redhat.com. He'll make fixing things his top priority" * git://git.infradead.org/users/eparis/audit: (50 commits) audit: whitespace fix in kernel-parameters.txt audit: fix location of __net_initdata for audit_net_ops audit: remove pr_info for every network namespace audit: Modify a set of system calls in audit class definitions audit: Convert int limit uses to u32 audit: Use more current logging style audit: Use hex_byte_pack_upper audit: correct a type mismatch in audit_syscall_exit() audit: reorder AUDIT_TTY_SET arguments audit: rework AUDIT_TTY_SET to only grab spin_lock once audit: remove needless switch in AUDIT_SET audit: use define's for audit version audit: documentation of audit= kernel parameter audit: wait_for_auditd rework for readability audit: update MAINTAINERS audit: log task info on feature change audit: fix incorrect set of audit_sock audit: print error message when fail to create audit socket audit: fix dangling keywords in audit_log_set_loginuid() output audit: log on errors from filter user rules ...
2014-01-14net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane1-1/+1
This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13Merge branch 'master' of ↵David S. Miller5-34/+56
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: net/xfrm/xfrm_policy.c Steffen Klassert says: ==================== This pull request has a merge conflict between commits be7928d20bab ("net: xfrm: xfrm_policy: fix inline not at beginning of declaration") and da7c224b1baa ("net: xfrm: xfrm_policy: silence compiler warning") from the net-next tree and commit 2f3ea9a95c58 ("xfrm: checkpatch erros with inline keyword position") from the ipsec-next tree. The version from net-next can be used, like it is done in linux-next. 1) Checkpatch cleanups, from Weilong Chen. 2) Fix lockdep complaints when pktgen is used with IPsec, from Fan Du. 3) Update pktgen to allow any combination of IPsec transport/tunnel mode and AH/ESP/IPcomp type, from Fan Du. 4) Make pktgen_dst_metrics static, Fengguang Wu. 5) Compile fix for pktgen when CONFIG_XFRM is not set, from Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13audit: convert all sessionid declaration to unsigned intEric Paris3-13/+13
Right now the sessionid value in the kernel is a combination of u32, int, and unsigned int. Just use unsigned int throughout. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-07net: xfrm: xfrm_policy: silence compiler warningYing Xue1-0/+2
Fix below compiler warning: net/xfrm/xfrm_policy.c:1644:12: warning: ‘xfrm_dst_alloc_copy’ defined but not used [-Wunused-function] Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07net: xfrm: xfrm_policy: fix inline not at beginning of declarationDaniel Borkmann1-6/+6
Fix three warnings related to: net/xfrm/xfrm_policy.c:1644:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] net/xfrm/xfrm_policy.c:1656:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] net/xfrm/xfrm_policy.c:1668:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] Just removing the inline keyword is sufficient as the compiler will decide on its own about inlining or not. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03{pktgen, xfrm} Introduce xfrm_state_lookup_byspi for pktgenFan Du1-0/+22
Introduce xfrm_state_lookup_byspi to find user specified by custom from "pgset spi xxx". Using this scheme, any flow regardless its saddr/daddr could be transform by SA specified with configurable spi. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Correct xfrm_state_lock usage in xfrm_stateonly_findFan Du1-2/+2
Acquiring xfrm_state_lock in process context is expected to turn BH off, as this lock is also used in BH context, namely xfrm state timer handler. Otherwise it surprises LOCKDEP with below messages. [ 81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 81.725194] [ 81.725211] ========================================================= [ 81.725212] [ INFO: possible irq lock inversion dependency detected ] [ 81.725215] 3.13.0-rc2+ #92 Not tainted [ 81.725216] --------------------------------------------------------- [ 81.725218] kpktgend_0/2780 just changed the state of lock: [ 81.725220] (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past: [ 81.725232] (&(&x->lock)->rlock){+.-...} [ 81.725232] [ 81.725232] and interrupts could create inverse lock ordering between them. [ 81.725232] [ 81.725235] [ 81.725235] other info that might help us debug this: [ 81.725237] Possible interrupt unsafe locking scenario: [ 81.725237] [ 81.725238] CPU0 CPU1 [ 81.725240] ---- ---- [ 81.725241] lock(xfrm_state_lock); [ 81.725243] local_irq_disable(); [ 81.725244] lock(&(&x->lock)->rlock); [ 81.725246] lock(xfrm_state_lock); [ 81.725248] <Interrupt> [ 81.725249] lock(&(&x->lock)->rlock); [ 81.725251] [ 81.725251] *** DEADLOCK *** [ 81.725251] [ 81.725254] no locks held by kpktgend_0/2780. [ 81.725255] [ 81.725255] the shortest dependencies between 2nd lock and 1st lock: [ 81.725269] -> (&(&x->lock)->rlock){+.-...} ops: 8 { [ 81.725274] HARDIRQ-ON-W at: [ 81.725276] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725282] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725284] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725289] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725292] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725300] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725303] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725305] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725308] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725313] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725316] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725329] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725333] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725338] IN-SOFTIRQ-W at: [ 81.725340] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 81.725342] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725344] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725347] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725349] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725352] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725355] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725358] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725360] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725363] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725365] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725368] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725370] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725373] INITIAL USE at: [ 81.725375] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725385] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725388] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725390] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725394] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725398] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725401] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725404] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725407] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725409] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725412] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725415] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725417] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725420] } [ 81.725421] ... key at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8 [ 81.725445] ... acquired at: [ 81.725446] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725449] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725452] [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140 [ 81.725454] [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50 [ 81.725456] [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0 [ 81.725458] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725465] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725468] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725471] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725476] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725479] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725482] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725484] [ 81.725486] -> (xfrm_state_lock){+.+...} ops: 11 { [ 81.725490] HARDIRQ-ON-W at: [ 81.725493] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725504] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725507] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725510] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725513] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725516] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725519] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725522] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725525] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725527] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725530] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725533] SOFTIRQ-ON-W at: [ 81.725534] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725537] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725539] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725541] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725544] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725547] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725550] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725555] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725565] INITIAL USE at: [ 81.725567] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725569] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725572] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725574] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725576] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725580] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725583] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725586] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725589] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725594] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725597] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725599] } [ 81.725600] ... key at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50 [ 81.725606] ... acquired at: [ 81.725607] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725609] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725611] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725614] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725616] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725627] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725629] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725632] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725635] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725637] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725640] [ 81.725641] [ 81.725641] stack backtrace: [ 81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92 [ 81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 81.725649] ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007 [ 81.725652] ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80 [ 81.725655] ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8 [ 81.725659] Call Trace: [ 81.725664] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 81.725667] [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0 [ 81.725670] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725672] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725675] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 81.725685] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725691] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725694] [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120 [ 81.725697] [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70 [ 81.725699] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725702] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725704] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725707] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725710] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725712] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725715] [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0 [ 81.725717] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725721] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725724] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725727] [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen] [ 81.725729] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 81.725733] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 81.725745] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 81.725751] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725753] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725757] [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 81.725759] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725762] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 81.725765] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725768] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with inline keyword positionWeilong Chen1-3/+3
Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: fix checkpatch errorWeilong Chen1-2/+2
Fix that "else should follow close brace '}'". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with space prohibitedWeilong Chen2-4/+4
Fix checkpatch error "space prohibited xxx". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with foo * barWeilong Chen3-13/+13
This patch clean up some checkpatch errors like this: ERROR: "foo * bar" should be "foo *bar" ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with spaceWeilong Chen4-13/+13
This patch cleanup some space errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-16xfrm: export verify_userspi_info for pkfey and netlink interfaceFan Du2-24/+25
In order to check against valid IPcomp spi range, export verify_userspi_info for both pfkey and netlink interface. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-16xfrm: check user specified spi for IPCompFan Du1-1/+3
IPComp connection between two hosts is broken if given spi bigger than 0xffff. OUTSPI=0x87 INSPI=0x11112 ip xfrm policy update dst 192.168.1.101 src 192.168.1.109 dir out action allow \ tmpl dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI ip xfrm policy update src 192.168.1.101 dst 192.168.1.109 dir in action allow \ tmpl src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI ip xfrm state add src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI \ comp deflate ip xfrm state add dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI \ comp deflate tcpdump can capture outbound ping packet, but inbound packet is dropped with XfrmOutNoStates errors. It looks like spi value used for IPComp is expected to be 16bits wide only. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Remove ancient sleeping when the SA is in acquire stateSteffen Klassert2-39/+3
We now queue packets to the policy if the states are not yet resolved, this replaces the ancient sleeping code. Also the sleeping can cause indefinite task hangs if the needed state does not get resolved. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Namespacify xfrm state/policy locksFan Du3-103/+119
By semantics, xfrm layer is fully name space aware, so will the locks, e.g. xfrm_state/pocliy_lock. Ensure exclusive access into state/policy link list for different name space with one global lock is not right in terms of semantics aspect at first place, as they are indeed mutually independent with each other, but also more seriously causes scalability problem. One practical scenario is on a Open Network Stack, more than hundreds of lxc tenants acts as routers within one host, a global xfrm_state/policy_lock becomes the bottleneck. But onces those locks are decoupled in a per-namespace fashion, locks contend is just with in specific name space scope, without causing additional SPD/SAD access delay for other name space. Also this patch improve scalability while as without changing original xfrm behavior. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Using the right namespace to migrate key infoFan Du2-6/+7
because the home agent could surely be run on a different net namespace other than init_net. The original behavior could lead into inconsistent of key info. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-06xfrm: Try to honor policy index if it's supplied by userFan Du2-6/+20
xfrm code always searches for unused policy index for newly created policy regardless whether or not user space policy index hint supplied. This patch enables such feature so that using "ip xfrm ... index=xxx" can be used by user to set specific policy index. Currently this beahvior is broken, so this patch make it happen as expected. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-11-07net: move pskb_put() to core codeMathias Krause1-13/+0
This function has usage beside IPsec so move it to the core skbuff code. While doing so, give it some documentation and change its return type to 'unsigned char *' to be in line with skb_put(). Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-6/+6
Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/netconsole.c net/bridge/br_private.h Three mostly trivial conflicts. The net/bridge/br_private.h conflict was a function signature (argument addition) change overlapping with the extern removals from Joe Perches. In drivers/net/netconsole.c we had one change adjusting a printk message whilst another changed "printk(KERN_INFO" into "pr_info(". Lastly, the emulex change was a new inline function addition overlapping with Joe Perches's extern removals. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02Merge branch 'master' of ↵David S. Miller2-2/+11
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: net/xfrm/xfrm_policy.c Minor merge conflict in xfrm_policy.c, consisting of overlapping changes which were trivial to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-35/+52
Conflicts: drivers/net/usb/qmi_wwan.c include/net/dst.h Trivial merge conflicts, both were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21xfrm: Don't queue retransmitted packets if the original is still on the hostSteffen Klassert1-0/+7
It does not make sense to queue retransmitted packets if the original packet is still in some queue of this host. So add a check to xdst_queue_output() and drop the packet if the original packet is not yet sent. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com>
2013-10-21xfrm: use vmalloc_node() for percpu scratchesEric Dumazet1-2/+4
scratches are per cpu, we can use vmalloc_node() for proper NUMA affinity. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-19net: misc: Remove extern from function prototypesJoe Perches1-2/+2
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-18xfrm: prevent ipcomp scratch buffer race conditionMichal Kubecek1-6/+6
In ipcomp_compress(), sortirq is enabled too early, allowing the per-cpu scratch buffer to be rewritten by ipcomp_decompress() (called on the same CPU in softirq context) between populating the buffer and copying the compressed data to the skb. v2: as pointed out by Steffen Klassert, if we also move the local_bh_disable() before reading the per-cpu pointers, we can get rid of get_cpu()/put_cpu(). v3: removed ipcomp_decompress part (as explained by Herbert Xu, it cannot be called from process context), get rid of cpu variable (thanks to Eric Dumazet) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-08xfrm: check for a vaild skb in xfrm_policy_queue_processSteffen Klassert1-0/+4
We might dreference a NULL pointer if the hold_queue is empty, so add a check to avoid this. Bug was introduced with git commit a0073fe18 ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-08xfrm: Add refcount handling to queued policiesSteffen Klassert1-7/+17
We need to ensure that policies can't go away as long as the hold timer is armed, so take a refcont when we arm the timer and drop one if we delete it. Bug was introduced with git commit a0073fe18 ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-01xfrm: Simplify SA looking up when using wildcard sourceFan Du1-1/+1
__xfrm4/6_state_addr_check is a four steps check, all we need to do is checking whether the destination address match when looking SA using wildcard source address. Passing saddr from flow is worst option, as the checking needs to reach the fourth step while actually only one time checking will do the work. So, simplify this process by only checking destination address when using wildcard source address for looking up SAs. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-10-01xfrm: Force SA to be lookup again if SA in acquire stateFan Du1-1/+1
If SA is in the process of acquiring, which indicates this SA is more promising and precise than the fall back option, i.e. using wild card source address for searching less suitable SA. So, here bail out, and try again. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-30Merge branch 'master' of ↵David S. Miller1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: include/net/xfrm.h Simple conflict between Joe Perches "extern" removal for function declarations in header files and the changes in Steffen's tree. Steffen Klassert says: ==================== Two patches that are left from the last development cycle. Manual merging of include/net/xfrm.h is needed. The conflict can be solved as it is currently done in linux-next. 1) We announce the creation of temporary acquire state via an asyc event, so the deletion should be annunced too. From Nicolas Dichtel. 2) The VTI tunnels do not real tunning, they just provide a routable IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-25xfrm: Fix aevent generation for each received packetThomas Egerer1-24/+27
If asynchronous events are enabled for a particular netlink socket, the notify function is called by the advance function. The notify function creates and dispatches a km_event if a replay timeout occurred, or at least replay_maxdiff packets have been received since the last asynchronous event has been sent. The function is supposed to return if neither of the two events were detected for a state, or replay_maxdiff is equal to zero. Replay_maxdiff is initialized in xfrm_state_construct to the value of the xfrm.sysctl_aevent_rseqth (2 by default), and updated if for a state if the netlink attribute XFRMA_REPLAY_THRESH is set. If, however, replay_maxdiff is set to zero, then all of the three notify implementations perform a break from the switch statement instead of checking whether a timeout occurred, and -- if not -- return. As a result an asynchronous event is generated for every replay update of a state that has a zero replay_maxdiff value. This patch modifies the notify functions such that they immediately return if replay_maxdiff has the value zero, unless a timeout occurred. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-17xfrm: Guard IPsec anti replay window against replay bitmapFan Du2-3/+3
For legacy IPsec anti replay mechanism: bitmap in struct xfrm_replay_state could only provide a 32 bits window size limit in current design, thus user level parameter sadb_sa_replay should honor this limit, otherwise misleading outputs("replay=244") by setkey -D will be: 192.168.25.2 192.168.22.2 esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000) E: aes-cbc 9a8d7468 7655cf0b 719d27be b0ddaac2 A: hmac-sha1 2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331 seq=0x00000000 replay=244 flags=0x00000000 state=mature created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013 diff: 22(s) hard: 30(s) soft: 26(s) last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s) current: 1408(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 22 hard: 0 soft: 0 sadb_seq=1 pid=4854 refcnt=0 192.168.22.2 192.168.25.2 esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000) E: aes-cbc 6485d990 f61a6bd5 e5660252 608ad282 A: hmac-sha1 0cca811a eb4fa893 c47ae56c 98f6e413 87379a88 seq=0x00000000 replay=244 flags=0x00000000 state=mature created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013 diff: 22(s) hard: 30(s) soft: 26(s) last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s) current: 1408(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 22 hard: 0 soft: 0 sadb_seq=0 pid=4854 refcnt=0 And also, optimizing xfrm_replay_check window checking by setting the desirable x->props.replay_window with only doing the comparison once for all when xfrm_state is first born. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-16xfrm: Fix replay size checking on async eventsSteffen Klassert1-1/+1
We pass the wrong netlink attribute to xfrm_replay_verify_len(). It should be XFRMA_REPLAY_ESN_VAL and not XFRMA_REPLAY_VAL as we currently doing. This causes memory corruptions if the replay esn attribute has incorrect length. Fix this by passing the right attribute to xfrm_replay_verify_len(). Reported-by: Michael Rossberg <michael.rossberg@tu-ilmenau.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-13/+24
Conflicts: drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c net/bridge/br_multicast.c net/ipv6/sit.c The conflicts were minor: 1) sit.c changes overlap with change to ip_tunnel_xmit() signature. 2) br_multicast.c had an overlap between computing max_delay using msecs_to_jiffies and turning MLDV2_MRC() into an inline function with a name using lowercase instead of uppercase letters. 3) stmmac had two overlapping changes, one which conditionally allocated and hooked up a dma_cfg based upon the presence of the pbl OF property, and another one handling store-and-forward DMA made. The latter of which should not go into the new of_find_property() basic block. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-28xfrm: Fix potential null pointer dereference in xdst_queue_outputSteffen Klassert1-8/+1
The net_device might be not set on the skb when we try refcounting. This leads to a null pointer dereference in xdst_queue_output(). It turned out that the refcount to the net_device is not needed after all. The dst_entry has a refcount to the net_device before we queue the skb, so it can't go away. Therefore we can remove the refcount on queueing to fix the null pointer dereference. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-26xfrm: announce deleation of temporary SANicolas Dichtel1-1/+1
Creation of temporary SA are announced by netlink, but there is no notification for the deletion. This patch fix this asymmetric situation. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-19xfrm: remove irrelevant comment in xfrm_input().Rami Rosen1-2/+0
This patch removes a comment in xfrm_input() which became irrelevant due to commit 2774c13, "xfrm: Handle blackhole route creation via afinfo". That commit removed returning -EREMOTE in the xfrm_lookup() method when the packet should be discarded and also removed the correspoinding -EREMOTE handlers. This was replaced by calling the make_blackhole() method. Therefore the comment about -EREMOTE is not relevant anymore. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-08-19xfrm: choose protocol family by skb protocolHannes Frederic Sowa1-1/+9
We need to choose the protocol family by skb->protocol. Otherwise we call the wrong xfrm{4,6}_local_error handler in case an ipv6 sockets is used in ipv4 mode, in which case we should call down to xfrm4_local_error (ip6 sockets are a superset of ip4 ones). We are called before before ip_output functions, so skb->protocol is not reset. Cc: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>