The comparison in ip_route_input is a hot path, by recoding the C
"and" as bit operations, fewer conditional branches get generated
so the code should be faster. Maybe someday Gcc will be smart
enough to do this?
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid unneeded test in the case where object to be freed
has to be a leaf. Don't need to use the generic tnode_free()
function, instead just setup leaf to be freed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trie pointer is passed down to flush_list and flush_leaf
but never used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.
Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.
The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.
changes since v1:
don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
and have ipv6/syncookies.c use it.
Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()
Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vmalloc rather than alloc_pages to avoid wasting memory.
The problem is that tnode structure has a power of 2 sized array,
plus a header. So the current code wastes almost half the memory
allocated because it always needs the next bigger size to hold
that small header.
This is similar to an earlier patch by Eric, but instead of a list
and lock, I used a workqueue to handle the fact that vfree can't
be done in interrupt context.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we register the iucv bus after the infrastructure is ready,
userspace can start relying on it when it receives the uevent
for the bus.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This BUG_ON is not needed, since all (debug) checks are also done
in smp_call_function() which gets called by this function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.
A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):
...
{
/* A = offset of first attribute */
.code = BPF_LD | BPF_IMM,
.k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
},
{
/* X = CTA_PROTOINFO */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
{
/* Exit if not found */
.code = BPF_JMP | BPF_JEQ | BPF_K,
.k = 0,
.jt = <error>
},
...
A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:
...
{
/* A += sizeof(struct nlattr) */
.code = BPF_ALU | BPF_ADD | BPF_K,
.k = sizeof(struct nlattr),
},
{
/* X = CTA_PROTOINFO_TCP */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO_TCP,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
...
The data of an attribute can be loaded into 'A' like this:
...
{
/* X = A (attribute offset) */
.code = BPF_MISC | BPF_TAX,
},
{
/* A = skb->data[X + k] */
.code = BPF_LD | BPF_B | BPF_IND,
.k = sizeof(struct nlattr),
},
...
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Should not count it if the allocation of the object
is failed.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Should not count it if the allocation of this object
failed.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No urgency on the rehash interval timer, so mark it as deferrable.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since route hash is a triple, use jhash_3words rather doing the mixing
directly. This should be as fast and give better distribution.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't mark functions that are large as inline, let compiler decide.
Also, use inline rather than __inline__.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor style cleanups:
* Move __KERNEL__ definitions to one place in filter.h
* Use const for sk_filter_len
* Line wrapping
* Put EXPORT_SYMBOL next to function definition
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes kernel bugzilla 10371.
As reported by M.Piechaczek@osmosys.tv, if we try to grab a
char sized socket option value, as in:
unsigned char ttl = 255;
socklen_t len = sizeof(ttl);
setsockopt(socket, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len);
getsockopt(socket, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len);
The ttl returned will be wrong on big-endian, and on both little-
endian and big-endian the next three bytes in userspace are written
with garbage.
It's because of this test in do_ip_getsockopt():
if (len < sizeof(int) && len > 0 && val>=0 && val<255) {
It should allow a 'val' of 255 to pass here, but it doesn't so it
copies a full 'int' back to userspace.
On little-endian that will write the correct value into the location
but it spams on the next three bytes in userspace. On big endian it
writes the wrong value into the location and spams the next three
bytes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this patch, the generic L3 tracker would kick in
if nf_conntrack_ipv4 was not loaded before nf_nat, which
would lead to translation problems with ICMP errors.
NAT does not make sense without IPv4 connection tracking
anyway, so just add a call to need_ipv4_conntrack().
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shifts larger than the data type are undefined, don't try to shift
an u32 by 32. Also remove some special-casing of bitmasks divisible
by 32.
Based on patch by Jan Engelhardt <jengelh@computergmbh.de>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit df9dcb45 ([IPSEC]: Fix inter address family IPsec tunnel handling)
broke openswan by removing the selector initialization for tunnel mode
in case it is uninitialized.
This patch restores the initialization, fixing openswan, but probably
breaking inter-family tunnels again (unknown since the patch author
disappeared). The correct thing for inter-family tunnels is probably
to simply initialize the selector family explicitly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When associating to a b-only AP where there is no ERP IE, short preamble
mode is left at previous state (probably also protection mode). In this
case, disable protection and use short preamble mode as specified in
capability field. The same is done if capability field is changed on-the-fly.
Signed-off-by: Vladimir Koutny <vlado@ksp.sk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit 510deb0d was supposed to move the xprt_create_transport() call in
rpc_create(), but neglected to remove the old call site. This resulted in
a transport leak after every rpc_create() call.
This leak is present in 2.6.24 and 2.6.25.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix a problem in _copy_to_pages(), whereby it may call flush_dcache_page()
with an invalid pointer due to the fact that 'pgto' gets incremented
beyond the end of the page array. Fix is to exit the loop without this
unnecessary increment of pgto.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If print_mac() is used inside of a pr_debug() the compiler
can't see that the call is redundant so still performs it
even of pr_debug() ends up being a nop.
So don't use print_mac() in such cases in hot code paths,
use MAC_FMT et al. instead.
As noted by Joe Perches, pr_debug() could be modified to
handle this better, but that is a change to an interface
used by the entire kernel and thus needs to be validated
carefully. This here is thus the less risky fix for
2.6.25
Signed-off-by: David S. Miller <davem@davemloft.net>
The default_key symlink points to the key index rather than
they key counter, fix it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch renames all mac80211 files (except ieee80211_i.h) to get rid
of the useless ieee80211_ prefix.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Up to now, key manipulation is supposed to run under RTNL to
avoid concurrent manipulations and also allow the set_key()
hardware callback to sleep. This is not feasible because STA
structs are rcu-protected and thus a lot of operations there
cannot take the RTNL. Also, key references are rcu-protected
so we cannot do things atomically.
This patch changes key locking completely:
* key operations are now atomic
* hardware crypto offload is enabled and disabled from
a workqueue, due to that key freeing is also delayed
* debugfs code is also run from a workqueue
* keys reference STAs (and vice versa!) so during STA
unlink the STAs key reference is removed but not the
keys STA reference, to avoid races key todo work is
run before STA destruction.
* fewer STA operations now need the RTNL which was
required due to key operations
This fixes the locking problems lockdep pointed out and also
makes things more light-weight because the rtnl isn't required
as much.
Note that the key todo lock/key mutex are global locks, this
is not required, of course, they could be per-hardware instead.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When a STA is supposed to be unlinked but is pinned, it still needs
to be unlinked from all structures. Only at the end of the unlink
process should we check for pin status and invalidate the callers
reference if it is pinned. Move the pin status check down.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
These two symbols are used only in ifdeffed function. Move them to that
section too.
net/mac80211/sta_info.c:387: warning: `__sta_info_pin' defined but not used
net/mac80211/sta_info.c:397: warning: `__sta_info_unpin' defined but not used
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Michael Wu <flamingice@sourmilk.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch contains next issues:
1 - prevents "stop BA session" multiple warnings
2 - adds debug print to stop Rx BA session flow
3 - adds EOL in one debug print
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new API to MAC80211 to allow low level driver to
notify MAC with driver status.
Signed-off-by: Mohamed Abbas <mabbas@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The ieee80211_ioctl_giwrate() ioctl handler doesn't rcu_read_lock()
its access to the sta table, fix it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Unfortunately, debugfs can be made to access invalid memory by
open()ing a file and then waiting until the corresponding debugfs
file has been removed (and, probably, the underlying object.)
That could be exploited by any user if the user is able to open
debugfs files and can cause networking devices, STA entries or
similar to disappear which is quite easy to do.
Hence, all debugfs files should be root-only.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When drivers receive change notification they may do work that
will enable the changes to take effect. For example, if new association
the device needs to be programmed with this information.
Give the driver chance to make the changes before notifying the
upper layer - thus preventing race condition where upper layer
attempts to utilize state that may not be configured yet.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Really doesn't need to be defined four times.
Also, while at it, remove a useless macro (IEEE80211_ALIGN32_PAD)
and a function prototype for a function we don't actually have
(ieee80211_set_compression.)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Because we queue the sta-debugfs-adding work on our mac80211
workqueue (which needs to be flushed under RTNL) and that work
needs the RTNL, it can currently deadlock, thanks to Reinette
Chatre for pointing out the lockdep warning about this.
This patch fixes it by moving this work to the common kernel
workqueue (using schedule_work) and canceling it as appropriate.
It also fixes a related problem: When a STA is pinned by the
debugfs adding work and sta_info_flush() runs concurrently
it is not guaranteed that all STAs are removed from the driver
before the corresponding interface is removed which may lead
to bugs.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch is necessary for the upcoming Accesspoint patch for p54.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds assocation capability, timestamp (tsf) and beacon interval
to bss_conf. This is required for successful assocation of iwlwifi drivers
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch eliminates the use of conf_ht, replacing it with
bss_info_changed.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit 6c4711b469.
That patch breaks mesh config comparison between beacons/probe reponses, so
every beacon from a mesh network would be added as a new bss. Since the
comparison has to be performed for every received beacon I believe it is best to
save the mesh config in a format easy to compare, rather than do a bunch of
unaligned accesses to compare field by field.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
MTU probe can cause some remedies for FRTO because the normal
packet ordering may be violated allowing FRTO to make a wrong
decision (it might not be that serious threat for anything
though). Thus it's safer to not run FRTO while MTU probe is
underway.
It seems that the basic FRTO variant should also look for an
skb at probe_seq.start to check if that's retransmitted one
but I didn't implement it now (plain seqno in window check
isn't robust against wraparounds).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes Bugzilla #10384
tcp_simple_retransmit does L increment without any checking
whatsoever for overflowing S+L when Reno is in use.
The simplest scenario I can currently think of is rather
complex in practice (there might be some more straightforward
cases though). Ie., if mss is reduced during mtu probing, it
may end up marking everything lost and if some duplicate ACKs
arrived prior to that sacked_out will be non-zero as well,
leading to S+L > packets_out, tcp_clean_rtx_queue on the next
cumulative ACK or tcp_fastretrans_alert on the next duplicate
ACK will fix the S counter.
More straightforward (but questionable) solution would be to
just call tcp_reset_reno_sack() in tcp_simple_retransmit but
it would negatively impact the probe's retransmission, ie.,
the retransmissions would not occur if some duplicate ACKs
had arrived.
So I had to add reno sacked_out reseting to CA_Loss state
when the first cumulative ACK arrives (this stale sacked_out
might actually be the explanation for the reports of left_out
overflows in kernel prior to 2.6.23 and S+L overflow reports
of 2.6.24). However, this alone won't be enough to fix kernel
before 2.6.24 because it is building on top of the commit
1b6d427bb7 ([TCP]: Reduce sacked_out with reno when purging
write_queue) to keep the sacked_out from overflowing.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes a long-standing bug which makes NewReno recovery crippled.
With GSO the whole head skb was marked as LOST which is in
violation of NewReno procedure that only wants to mark one packet
and ended up breaking our TCP code by causing counter overflow
because our code was built on top of assumption about valid
NewReno procedure. This manifested as triggering a WARN_ON for
the overflow in a number of places.
It seems relatively safe alternative to just do nothing if
tcp_fragment fails due to oom because another duplicate ACK is
likely to be received soon and the fragmentation will be retried.
Special thanks goes to Soeren Sonnenburg <kernel@nn7.de> who was
lucky enough to be able to reproduce this so that the warning
for the overflow was hit. It's not as easy task as it seems even
if this bug happens quite often because the amount of outstanding
data is pretty significant for the mismarkings to lead to an
overflow.
Because it's very late in 2.6.25-rc cycle (if this even makes in
time), I didn't want to touch anything with SACK enabled here.
Fragmenting might be useful for it as well but it's more or less
a policy decision rather than mandatory fix. Thus there's no need
to rush and we can postpone considering tcp_fragment with SACK
for 2.6.26.
In 2.6.24 and earlier, this very same bug existed but the effect
is slightly different because of a small changes in the if
conditions that fit to the patch's context. With them nothing
got lost marker and thus no retransmissions happened.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fast retransmission can be forced locally to the rfc3517
branch in tcp_update_scoreboard instead of making such fragile
constructs deeper in tcp_mark_head_lost.
This is necessary for the next patch which must not have
loopholes for cnt > packets check. As one can notice,
readability got some improvements too because of this :-).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the STA AID setting and actually makes hostapd/mac80211
work properly in presence of power-saving stations.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>