summaryrefslogtreecommitdiff
path: root/net/ipv4/af_inet.c
AgeCommit message (Collapse)AuthorFilesLines
2010-02-16percpu: add __percpu sparse annotations to netTejun Heo1-23/+23
Add __percpu sparse annotations to net. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. The macro and type tricks around snmp stats make things a bit interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All snmp_mib_*() users which used to cast the argument to (void **) are updated to cast it to (void __percpu **). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05net: check kern before calling security subsystemEric Paris1-1/+1
Before calling capable(CAP_NET_RAW) check if this operations is on behalf of the kernel or on behalf of userspace. Do not do the security check if it is on behalf of the kernel. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05net: pass kern to net_proto_family create functionEric Paris1-1/+2
The generic __sock_create function has a kern argument which allows the security system to make decisions based on if a socket is being created by the kernel or by userspace. This patch passes that flag to the net_proto_family specific create function, so it can do the same thing. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05net: drop capability from protocol definitionsEric Paris1-4/+1
struct can_proto had a capability field which wasn't ever used. It is dropped entirely. struct inet_protosw had a capability field which can be more clearly expressed in the code by just checking if sock->type = SOCK_RAW. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29net,socket: introduce DECLARE_SOCKADDR helper to catch overflow at build timeCyrill Gorcunov1-1/+1
proto_ops->getname implies copying protocol specific data into storage unit (particulary to __kernel_sockaddr_storage). So when we implement new protocol support we should keep such a detail in mind (which is easy to forget about). Lets introduce DECLARE_SOCKADDR helper which check if storage unit is not overfowed at build time. Eventually inet_getname is switched to use DECLARE_SOCKADDR (to show example of usage). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-18inet: rename some inet_sock fieldsEric Dumazet1-31/+31
In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07net: mark net_proto_ops as constStephen Hemminger1-1/+1
All usages of structure net_proto_ops should be declared const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01net: Use sk_mark for routing lookup in more placesAtis Elsts1-0/+1
This patch against v2.6.31 adds support for route lookup using sk_mark in some more places. The benefits from this patch are the following. First, SO_MARK option now has effect on UDP sockets too. Second, ip_queue_xmit() and inet_sk_rebuild_header() could fail to do routing lookup correctly if TCP sockets with SO_MARK were used. Signed-off-by: Atis Elsts <atis@mikrotik.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2009-09-14net: constify struct net_protocolAlexey Dobriyan1-9/+9
Remove long removed "inet_protocol_base" declaration. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28ipv4: af_inet.c cleanupsEric Dumazet1-57/+55
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12udpv4: Handle large incoming UDP/IPv4 packets and support software UFO.Sridhar Samudrala1-1/+11
- validate and forward GSO UDP/IPv4 packets from untrusted sources. - do software UFO if the outgoing device doesn't support UFO. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03ipv4: remove ip_mc_drop_socket() declaration from af_inet.c.Rami Rosen1-1/+0
ip_mc_drop_socket() method is declared in linux/igmp.h, which is included anyhow in af_inet.c. So there is no need for this declaration. This patch removes it from af_inet.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-02ipv4: New multicast-all socket optionNivedita Singhvi1-0/+1
After some discussion offline with Christoph Lameter and David Stevens regarding multicast behaviour in Linux, I'm submitting a slightly modified patch from the one Christoph submitted earlier. This patch provides a new socket option IP_MULTICAST_ALL. In this case, default behaviour is _unchanged_ from the current Linux standard. The socket option is set by default to provide original behaviour. Sockets wishing to receive data only from multicast groups they join explicitly will need to clear this socket option. Signed-off-by: Nivedita Singhvi <niv@us.ibm.com> Signed-off-by: Christoph Lameter<cl@linux.com> Acked-by: David Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-27ipv4: Use 32-bit loads for ID and length in GROHerbert Xu1-4/+4
This patch optimises the IPv4 GRO code by using 32-bit loads (instead of 16-bit ones) on the ID and length checks in the receive function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-27gro: Avoid unnecessary comparison after skb_gro_headerHerbert Xu1-3/+10
For the overwhelming majority of cases, skb_gro_header's return value cannot be NULL. Yet we must check it because of its current form. This patch splits it up into multiple functions in order to avoid this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-17[PATCH] net: remove superfluous call to synchronize_net()Eric Dumazet1-2/+0
inet_register_protosw() function is responsible for adding a new inet protocol into a global table (inetsw[]) that is used with RCU rules. As soon as the store of the pointer is done, other cpus might see this new protocol in inetsw[], so we have to make sure new protocol is ready for use. All pending memory updates should thus be committed to memory before setting the pointer. This is correctly done using rcu_assign_pointer() synchronize_net() is typically used at unregister time, after unsetting the pointer, to make sure no other cpu is still using the object we want to dismantle. Using it at register time is only adding an artificial delay that could hide a real bug, and this bug could popup if/when synchronize_rcu() can proceed faster than now. This saves about 13 ms on boot time on a HZ=1000 8 cpus machine ;) (4 calls to inet_register_protosw(), and about 3200 us per call) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-27Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2Ingo Molnar1-2/+2
Conflicts: arch/parisc/kernel/irq.c arch/x86/include/asm/fixmap_64.h arch/x86/include/asm/setup.h kernel/irq/handle.c Semantic merge: arch/x86/include/asm/fixmap.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-10net: convert usage of packet_type to read_mostlyStephen Hemminger1-1/+1
Protocols that use packet_type can be __read_mostly section for better locality. Elminate any unnecessary initializations of NULL. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20alloc_percpu: add align argument to __alloc_percpu.Rusty Russell1-2/+2
This prepares for a real __alloc_percpu, by adding an alignment argument. Only one place uses __alloc_percpu directly, and that's for a string. tj: af_inet also uses __alloc_percpu(), update it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk>
2009-02-08gro: Optimise IPv4 packet receptionHerbert Xu1-6/+7
As this function can be called more than half a million times for 10GbE, it's important to optimise it as much as we can. This patch does some obvious changes to use 2-byte and 4-byte operations instead of byte-oriented ones where possible. Bit ops are also used to replace logical ops to reduce branching. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-01ipv4: Delete redundant sk_family assignmentHerbert Xu1-1/+0
sk_alloc now sets sk_family so this is redundant. In fact it caught my eye because sock_init_data already uses sk_family so this is too late anyway. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-01net: replace uses of __constant_{endian}Harvey Harrison1-1/+1
Base versions handle constant folding now. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-29gro: Avoid copying headers of unmerged packetsHerbert Xu1-5/+5
Unfortunately simplicity isn't always the best. The fraginfo interface turned out to be suboptimal. The problem was quite obvious. For every packet, we have to copy the headers from the frags structure into skb->head, even though for 99% of the packets this part is immediately thrown away after the merge. LRO didn't have this problem because it directly read the headers from the frags structure. This patch attempts to address this by creating an interface that allows GRO to access the headers in the first frag without having to copy it. Because all drivers that use frags place the headers in the first frag this optimisation should be enough. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25netns: igmp: allow IPPROTO_IGMP sockets in netnsAlexey Dobriyan1-0/+1
Looks like everything is already ready. Required for ebtables(8) for one thing. Also, required for ipmr per-netns (coming soon). (Benjamin) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-15tcp: Add GRO supportHerbert Xu1-0/+2
This patch adds the TCP-specific portion of GRO. The criterion for merging is extremely strict (the TCP header must match exactly apart from the checksum) so as to allow refragmentation. Otherwise this is pretty much identical to LRO, except that we support the merging of ECN packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-15ipv4: Add GRO infrastructureHerbert Xu1-0/+97
This patch adds GRO support for IPv4. The criteria for merging is more stringent than LRO, in particular, we require all fields in the IP header to be identical except for the length, ID and checksum. In addition, the ID must form an arithmetic sequence with a difference of one. The ID requirement might seem overly strict, however, most hardware TSO solutions already obey this rule. Linux itself also obeys this whether GSO is in use or not. In future we could relax this rule by storing the IDs (or rather making sure that we don't drop them when pulling the aggregate skb's tail). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-23net: some optimizations in af_inetEric Dumazet1-5/+4
1) Use eq_net() in inet_netns_ok() to speedup socket creation if !CONFIG_NET_NS 2) Reorder the tests about inet_ehash_secret generation (once only) Use the unlikely() macro when testing if inet_ehash_secret already generated. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-20Merge branch 'master' of ↵David S. Miller1-0/+1
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c
2008-11-20TPROXY: supply a struct flowi->flags argument in inet_sk_rebuild_header()Balazs Scheidler1-0/+1
inet_sk_rebuild_header() does a new route lookup if the dst_entry associated with a socket becomes stale. However inet_sk_rebuild_header() didn't use struct flowi->flags, causing the route lookup to fail for foreign-bound IP_TRANSPARENT sockets, causing an error state to be set for the sockets in question. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31net: replace NIPQUAD() in net/ipv4/ net/ipv6/Harvey Harrison1-5/+2
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01ipv4: Allow binding to non-local addresses if IP_TRANSPARENT is setTóth László Attila1-1/+1
Setting IP_TRANSPARENT is not really useful without allowing non-local binds for the socket. To make user-space code simpler we allow these binds even if IP_TRANSPARENT is set but IP_FREEBIND is not. Signed-off-by: Tóth László Attila <panther@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-26Merge branch 'for-linus' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits) [PATCH] fix RLIM_NOFILE handling [PATCH] get rid of corner case in dup3() entirely [PATCH] remove remaining namei_{32,64}.h crap [PATCH] get rid of indirect users of namei.h [PATCH] get rid of __user_path_lookup_open [PATCH] f_count may wrap around [PATCH] dup3 fix [PATCH] don't pass nameidata to __ncp_lookup_validate() [PATCH] don't pass nameidata to gfs2_lookupi() [PATCH] new (local) helper: user_path_parent() [PATCH] sanitize __user_walk_fd() et.al. [PATCH] preparation to __user_walk_fd cleanup [PATCH] kill nameidata passing to permission(), rename to inode_permission() [PATCH] take noexec checks to very few callers that care Re: [PATCH 3/6] vfs: open_exec cleanup [patch 4/4] vfs: immutable inode checking cleanup [patch 3/4] fat: dont call notify_change [patch 2/4] vfs: utimes cleanup [patch 1/4] vfs: utimes: move owner check into inode_change_ok() [PATCH] vfs: use kstrdup() and check failing allocation ...
2008-07-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds1-7/+7
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix ip_rt_frag_needed rt_is_expired netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences netfilter: fix double-free and use-after free netfilter: arptables in netns for real netfilter: ip{,6}tables_security: fix future section mismatch selinux: use nf_register_hooks() netfilter: ebtables: use nf_register_hooks() Revert "pkt_sched: sch_sfq: dump a real number of flows" qeth: use dev->ml_priv instead of dev->priv syncookies: Make sure ECN is disabled net: drop unused BUG_TRAP() net: convert BUG_TRAP to generic WARN_ON drivers/net: convert BUG_TRAP to generic WARN_ON
2008-07-26[PATCH] sysctl: make sure that /proc/sys/net/ipv4 appears before per-ns onesAl Viro1-0/+4
Massage ipv4 initialization - make sure that net.ipv4 appears as non-per-net-namespace before it shows up in per-net-namespace sysctls. That's the only change outside of sysctl.c needed to get sane ordering rules and data structures for sysctls (esp. for procfs side of that mess). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-25net: convert BUG_TRAP to generic WARN_ONIlpo Järvinen1-7/+7
Removes legacy reinvent-the-wheel type thing. The generic machinery integrates much better to automated debugging aids such as kerneloops.org (and others), and is unambiguous due to better naming. Non-intuively BUG_TRAP() is actually equal to WARN_ON() rather than BUG_ON() though some might actually be promoted to BUG_ON() but I left that to future. I could make at least one BUILD_BUG_ON conversion. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25list_for_each_rcu must die: networkingPaul E. McKenney1-6/+3
All uses of list_for_each_rcu() can be profitably replaced by the easier-to-use list_for_each_entry_rcu(). This patch makes this change for networking, in preparation for removing the list_for_each_rcu() API entirely. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-18ipv4: clean the init_ipv4_mibs error pathsPavel Emelyanov1-7/+1
After moving all the stuff outside this function it looks a bit ugly - make it look better. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmpmsg statistics on struct netPavel Emelyanov1-6/+6
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmp statistics on struct netPavel Emelyanov1-5/+6
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udplite statistics on struct netPavel Emelyanov1-5/+6
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udp statistics on struct netPavel Emelyanov1-5/+6
Similar to... ouch, I repeat myself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put net statistics on struct netPavel Emelyanov1-8/+6
Similar to ip and tcp ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put ip statistics on struct netPavel Emelyanov1-5/+6
Similar to tcp one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put tcp statistics on struct netPavel Emelyanov1-7/+9
Proc temporary uses stats from init_net. BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18ipv4: add pernet mib operationsPavel Emelyanov1-0/+20
These ones are currently empty, but stuff from init_ipv4_mibs will sequentially migrate there. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16tcp: add net to tcp_mib_initPavel Emelyanov1-1/+1
This one sets TCP MIBs after zeroing them, and thus requires the net. The existing single caller can use init_net (temporarily). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-03ipv4: Do cleanup for ip_mr_initWang Chen1-2/+3
Same as ip6_mr_init(), make ip_mr_init() return errno if fails. But do not do error handling in inet_init(), just print a msg. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-11net: remove CVS keywordsAdrian Bunk1-2/+0
This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29Remove duplicated unlikely() in IS_ERR()Hirofumi Nakagawa1-1/+1
Some drivers have duplicated unlikely() macros. IS_ERR() already has unlikely() in itself. This patch cleans up such pointless code. Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jeff@garzik.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-14[IPV4]: Use NIPQUAD_FMT to format ipv4 addresses.YOSHIFUJI Hideaki1-1/+1
And use %u to format port. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>