open(2) must always include one of O_RDONLY, O_WRONLY, or O_RDWR. No need
for any O_APPEND special case.
Passing O_WRONLY|O_RDWR is undefined according to the man page, but the
Linux VFS interprets this as O_RDWR, so we'll do the same.
This fixes open(2) with flags O_RDWR|O_APPEND, which was incorrectly being
translated to readonly.
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Sage Weil <sage@newdream.net>
Another regression fix considering incomming l2cap connections with
defer_setup enabled. In situations when incomming connection is
extracted with l2cap_sock_accept, it's bt_sock info will have
'parent' member zerroed, but 'parent' may be used unconditionally
in l2cap_conn_start() and l2cap_security_cfm() when defer_setup
is enabled.
Backtrace:
[<bf02d5ac>] (l2cap_security_cfm+0x0/0x2ac [bluetooth]) from [<bf01f01c>] (hci_event_pac
ket+0xc2c/0x4aa4 [bluetooth])
[<bf01e3f0>] (hci_event_packet+0x0/0x4aa4 [bluetooth]) from [<bf01a844>] (hci_rx_task+0x
cc/0x27c [bluetooth])
[<bf01a778>] (hci_rx_task+0x0/0x27c [bluetooth]) from [<c008eee4>] (tasklet_action+0xa0/
0x15c)
[<c008ee44>] (tasklet_action+0x0/0x15c) from [<c008f38c>] (__do_softirq+0x98/0x130)
r7:00000101 r6:00000018 r5:00000001 r4:efc46000
[<c008f2f4>] (__do_softirq+0x0/0x130) from [<c008f524>] (do_softirq+0x4c/0x58)
[<c008f4d8>] (do_softirq+0x0/0x58) from [<c008f5e0>] (run_ksoftirqd+0xb0/0x1b4)
r4:efc46000 r3:00000001
[<c008f530>] (run_ksoftirqd+0x0/0x1b4) from [<c009f2a8>] (kthread+0x84/0x8c)
r7:00000000 r6:c008f530 r5:efc47fc4 r4:efc41f08
[<c009f224>] (kthread+0x0/0x8c) from [<c008cc84>] (do_exit+0x0/0x5f0)
Signed-off-by: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Caused by the following commit, partially revert it.
commit 9fa7e4f76f
Author: Gustavo F. Padovan <padovan@profusion.mobi>
Date: Thu Jun 30 16:11:30 2011 -0300
Bluetooth: Fix regression with incoming L2CAP connections
PTS test A2DP/SRC/SRC_SET/TC_SRC_SET_BV_02_I revealed that
( probably after the df3c3931e commit ) the l2cap connection
could not be established in case when the "Auth Complete" HCI
event does not arive before the initiator send "Configuration
request", in which case l2cap replies with "Command rejected"
since the channel is still in BT_CONNECT2 state.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.
For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.
This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.
NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets to devices without NETIF_F_SCTP_CSUM (including NETIF_F_NO_CSUM)
should be properly checksummed because the packets can be diverted or
rerouted after construction. This still leaves packets diverted from
NETIF_F_SCTP_CSUM-enabled devices with broken checksums. Fixing this
needs implementing software offload fallback in networking core.
For users of sctp_checksum_disable, skb->ip_summed should be left as
CHECKSUM_NONE and not CHECKSUM_UNNECESSARY as per include/linux/skbuff.h.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because struct rpcbind_args *map was declared static, if two
threads entered this method at the same time, the values
assigned to map could be sent two two differen tasks.
This could cause all sorts of problems, include use-after-free
and double-free of memory.
Fix this by removing the static declaration so that the map
pointer is on the stack.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trigger user ABORT if application closes a socket which has data
queued on the socket receive queue or chunks waiting on the
reassembly or ordering queue as this would imply data being lost
which defeats the point of a graceful shutdown.
This behavior is already practiced in TCP.
We do not check the input queue because that would mean to parse
all chunks on it to look for unacknowledged data which seems too
much of an effort. Control chunks or duplicated chunks may also
be in the input queue and should not be stopping a graceful
shutdown.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon "ip xfrm state update ..", xfrm_add_sa() takes an extra reference on
the user-supplied SA and forgets to drop the reference when
xfrm_state_update() returns 0. This leads to a memory leak as the
parameter SA is never freed. This change attempts to fix the leak by
calling __xfrm_state_put() when xfrm_state_update() updates a valid SA
(err = 0). The parameter SA is added to the gc list when the final
reference is dropped by xfrm_add_sa() upon completion.
Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since rpc_killall_tasks may modify the rpc_task's tk_action field
without any locking, we need to be careful when dereferencing it.
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
When initiating a graceful shutdown while having data chunks
on the retransmission queue with a peer which is in zero
window mode the shutdown is never completed because the
retransmission error count is reset periodically by the
following two rules:
- Do not timeout association while doing zero window probe.
- Reset overall error count when a heartbeat request has
been acknowledged.
The graceful shutdown will wait for all outstanding TSN to
be acknowledged before sending the SHUTDOWN request. This
never happens due to the peer's zero window not acknowledging
the continuously retransmitted data chunks. Although the
error counter is incremented for each failed retransmission,
the receiving of the SACK announcing the zero window clears
the error count again immediately. Also heartbeat requests
continue to be sent periodically. The peer acknowledges these
requests causing the error counter to be reset as well.
This patch changes behaviour to only reset the overall error
counter for the above rules while not in shutdown. After
reaching the maximum number of retransmission attempts, the
T5 shutdown guard timer is scheduled to give the receiver
some additional time to recover. The timer is stopped as soon
as the receiver acknowledges any data.
The issue can be easily reproduced by establishing a sctp
association over the loopback device, constantly queueing
data at the sender while not reading any at the receiver.
Wait for the window to reach zero, then initiate a shutdown
by killing both processes simultaneously. The association
will never be freed and the chunks on the retransmission
queue will be retransmitted indefinitely.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike CCMP, the presence or absence of the QoS
field doesn't change the encryption, only the
TID is used. When no QoS field is present, zero
is used as the TID value. This means that it is
possible for an attacker to take a QoS packet
with TID 0 and replay it as a non-QoS packet.
Unfortunately, mac80211 uses different IVs for
checking the validity of the packet's TKIP IV
when it checks TID 0 and when it checks non-QoS
packets. This means it is vulnerable to this
replay attack.
To fix this, use the same replay counter for
TID 0 and non-QoS packets by overriding the
rx->queue value to 0 if it is 16 (non-QoS).
This is a minimal fix for now. I caused this
issue in
commit 1411f9b531
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Thu Jul 10 10:11:02 2008 +0200
mac80211: fix RX sequence number check
while fixing a sequence number issue (there,
a separate counter needs to be used).
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We were not allocating memory for the IEs passed in the scheduled_scan
request and this was causing memory corruption (buffer overflow).
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We forgot to send up SCTP_SENDER_DRY_EVENT notification when
user app subscribes to this event, and there is no data to be
sent or retransmit.
This is required by the Socket API and used by the DTLS/SCTP
implementation.
Reported-by: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Tested-by: Robin Seggelmann <seggelmann@fh-muenster.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current tcp/udp/sctp global memory limits are not taking into account
hugepages allocations, and allow 50% of ram to be used by buffers of a
single protocol [ not counting space used by sockets / inodes ...]
Lets use nr_free_buffer_pages() and allow a default of 1/8 of kernel ram
per protocol, and a minimum of 128 pages.
Heavy duty machines sysadmins probably need to tweak limits anyway.
References: https://bugzilla.stlinux.com/show_bug.cgi?id=38032
Reported-by: starlight <starlight@binnacle.cx>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If gso/gro feature of underlying device is turned off,
then new created vlan device never can turn gso/gro on.
Although underlying device don't support TSO, we still
should use software segments for vlan device.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As is_multicast_ether_addr returns true on broadcast packets as
well, we need to explicitly exclude broadcast packets so that
they're always flooded. This wasn't an issue before as broadcast
packets were considered to be an unregistered multicast group,
which were always flooded. However, as we now only flood such
packets to router ports, this is no longer acceptable.
Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a deadlock when rfkill-blocking a wireless interface,
because we were locking the rdev mutex on NETDEV_GOING_DOWN to stop
sched_scans that were eventually running. The rfkill block code was
already holding a mutex under rdev:
kernel: =======================================================
kernel: [ INFO: possible circular locking dependency detected ]
kernel: 3.0.0-rc1-00049-g1fa7b6a #57
kernel: -------------------------------------------------------
kernel: kworker/0:1/4525 is trying to acquire lock:
kernel: (&rdev->mtx){+.+.+.}, at: [<ffffffff8164c831>] cfg80211_netdev_notifier_call+0x131/0x5b0
kernel:
kernel: but task is already holding lock:
kernel: (&rdev->devlist_mtx){+.+.+.}, at: [<ffffffff8164dcef>] cfg80211_rfkill_set_block+0x4f/0xa0
kernel:
kernel: which lock already depends on the new lock.
To fix this, add a new mutex specifically for sched_scan, to protect
the sched_scan_req element in the rdev struct, instead of using the
global rdev mutex.
Reported-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Hi,
Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.
The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.
Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip
Ciao, Marcus
Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Reinhard Max <max@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmp_send() on a local message size error leads to
an incorrect update of the path mtu. So use ip_local_error()
instead to notify the socket about the error.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We might call ip_ufo_append_data() for packets that will be IPsec
transformed later. This function should be used just for real
udp packets. So we check for rt->dst.header_len which is only
nonzero on IPsec handling and call ip_ufo_append_data() just
if rt->dst.header_len is zero.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The family arg is not used any more, so remove it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6, unlike IPV4, doesn't have a routing cache.
Routing table entries, as well as clones made in response
to route lookup requests, all live in the same table. And
all of these things are together collected in the destination
cache table for ipv6.
This means that routing table entries count against the garbage
collection limits, even though such entries cannot ever be reclaimed
and are added explicitly by the administrator (rather than being
created in response to lookups).
Therefore it makes no sense to count ipv6 routing table entries
against the GC limits.
Add a DST_NOCOUNT destination cache entry flag, and skip the counting
if it is set. Use this flag bit in ipv6 when adding routing table
entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
If the remote device is not present, the connections attemp fails and
the struct hci_conn was not freed
Signed-off-by: Tomas Targownik <ttargownik@geicp.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
PTS test A2DP/SRC/SRC_SET/TC_SRC_SET_BV_02_I revealed that
( probably after the df3c3931e commit ) the l2cap connection
could not be established in case when the "Auth Complete" HCI
event does not arive before the initiator send "Configuration
request", in which case l2cap replies with "Command rejected"
since the channel is still in BT_CONNECT2 state.
Based on patch from: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Partial revert of commit aabf6f89. When the hidp session thread
was converted from kernel_thread to kthread, the atomic/wakeups
were replaced with kthread_stop. kthread_stop has blocking semantics
which are inappropriate for the hidp session kthread. In addition,
the kthread signals itself to terminate in hidp_process_hid_control()
- it cannot do this with kthread_stop().
Lastly, a wakeup can be lost if the wakeup happens between checking
for the loop exit condition and setting the current state to
TASK_INTERRUPTIBLE. (Without appropriate synchronization mechanisms,
the task state should not be changed between the condition test and
the yield - via schedule() - as this creates a race between the
wakeup and resetting the state back to interruptible.)
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Avoid creating input routes with ip_route_me_harder.
It does not work for locally generated packets. Instead,
restrict sockets to provide valid saddr for output route (or
unicast saddr for transparent proxy). For other traffic
allow saddr to be unicast or local but if callers forget
to check saddr type use 0 for the output route.
The resulting handling should be:
- REJECT TCP:
- in INPUT we can provide addr_type = RTN_LOCAL but
better allow rejecting traffic delivered with
local route (no IP address => use RTN_UNSPEC to
allow also RTN_UNICAST).
- FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST
saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST
- OUTPUT: RTN_UNSPEC
- NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT
- IPVS:
- use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT
to restrict saddr to be local
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
A remote user can provide a small value for the command size field in
the command header of an l2cap configuration request, resulting in an
integer underflow when subtracting the size of the configuration request
header. This results in copying a very large amount of data via
memcpy() and destroying the kernel heap. Check for underflow.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
ip_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git commit 59104f06 (ip: take care of last fragment in ip_append_data)
added a check to see if we exceed the mtu when we add trailer_len.
However, the mtu is already subtracted by the trailer length when the
xfrm transfomation bundles are set up. So IPsec packets with mtu
size get fragmented, or if the DF bit is set the packets will not
be send even though they match the mtu perfectly fine. This patch
actually reverts commit 59104f06.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes when reporting a MIC failure rx->key may be unset. This
code path is hit when receiving a packet meant for a multicast
address, and decryption is performed in HW.
Fortunately, the failing key_idx is not used for anything up to
(and including) usermode, so we allow ourselves to drop it on the
way up when a key cannot be retrieved.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The bridge currently floods packets to groups that we have never
seen before to all ports. This is not required by RFC4541 and
in fact it is not desirable in environment where traffic to
unregistered group is always present.
This patch changes the behaviour so that we only send traffic
to unregistered groups to ports marked as routers.
The user can always force flooding behaviour to any given port
by marking it as a router.
Note that this change does not apply to traffic to 224.0.0.X
as traffic to those groups must always be flooded to all ports.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consider this scenario: When the size of the first received udp packet
is bigger than the receive buffer, MSG_TRUNC bit is set in msg->msg_flags.
However, if checksum error happens and this is a blocking socket, it will
goto try_again loop to receive the next packet. But if the size of the
next udp packet is smaller than receive buffer, MSG_TRUNC flag should not
be set, but because MSG_TRUNC bit is not cleared in msg->msg_flags before
receive the next packet, MSG_TRUNC is still set, which is wrong.
Fix this problem by clearing MSG_TRUNC flag when starting over for a
new packet.
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
udpv6_recvmsg() function is not using the correct variable to determine
whether or not the socket is in non-blocking operation, this will lead
to unexpected behavior when a UDP checksum error occurs.
Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is
called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in
udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:
err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
i.e. with udpv6_recvmsg() getting these values:
int noblock = flags & MSG_DONTWAIT
int flags = flags & ~MSG_DONTWAIT
So, when udp checksum error occurs, the execution will go to
csum_copy_err, and then the problem happens:
csum_copy_err:
...............
if (flags & MSG_DONTWAIT)
return -EAGAIN;
goto try_again;
...............
But it will always go to try_again as MSG_DONTWAIT has been cleared
from flags at call time -- only noblock contains the original value
of MSG_DONTWAIT, so the test should be:
if (noblock)
return -EAGAIN;
This is also consistent with what the ipv4/udp code does.
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the duplicate inclusion of net/icmp.h from net/ipv4/ping.c
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we will not see the name of the slave dev in error
message:
[ 388.469446] (null): doesn't support polling, aborting.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Knut Tidemann found that first packet of a multicast flow was not
correctly received, and bisected the regression to commit b23dd4fe42
(Make output route lookup return rtable directly.)
Special thanks to Knut, who provided a very nice bug report, including
sample programs to demonstrate the bug.
Reported-and-bisectedby: Knut Tidemann <knut.andre.tidemann@jotron.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A malicious user or buggy application can inject code and trigger an
infinite loop in inet_diag_bc_audit()
Also make sure each instruction is aligned on 4 bytes boundary, to avoid
unaligned accesses.
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Fri, 17 Jun 2011 00:50:46 +0100
>
> > On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
> >> @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> >> goto discard;
> >>
> >> if (nsk != sk) {
> >> + sock_rps_save_rxhash(nsk, skb->rxhash);
> >> if (tcp_child_process(sk, nsk, skb)) {
> >> rsk = nsk;
> >> goto reset;
> >>
> >
> > I haven't tried this, but it looks reasonable to me.
> >
> > What about IPv6? The logic in tcp_v6_do_rcv() looks very similar.
>
> Indeed ipv6 side needs the same fix.
>
> Eric please add that part and resubmit. And in fact I might stick
> this into net-2.6 instead of net-next-2.6
>
OK, here is the net-2.6 based one then, thanks !
[PATCH v2] net: rfs: enable RFS before first data packet is received
First packet received on a passive tcp flow is not correctly RFS
steered.
One sock_rps_record_flow() call is missing in inet_accept()
But before that, we also must record rxhash when child socket is setup.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Fix a couple of instances where we were exiting the RPC client on
arbitrary signals. We should only do so on fatal signals.
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch removes the call to ndo_vlan_rx_register if the underlying
device doesn't have hardware support for VLAN.
Signed-off-by: Antoine Reversat <a.reversat@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
XOFF was mixed up with DOWN indication, causing causing CAIF channel to be
removed from mux and all incoming traffic to be lost after receiving flow-off.
Fix this by replacing FLOW_OFF with DOWN notification.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Upon reception of a MGM report packet the kernel sets the mrouters_only flag
in a skb that is a clone of the original skb, which means that the bridge
loses track of MGM packets (cb buffers are tied to a specific skb and not
shared) and it ends up forwading join requests to the bridge interface.
This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Upon reception of a IGMP/IGMPv2 membership report the kernel sets the
mrouters_only flag in a skb that may be a clone of the original skb, which
means that sometimes the bridge loses track of membership report packets (cb
buffers are tied to a specific skb and not shared) and it ends up forwading
join requests to the bridge interface.
This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Avoid double seq adjustment for loopback traffic
because it causes silent repetition of TCP data. One
example is passive FTP with DNAT rule and difference in the
length of IP addresses.
This patch adds check if packet is sent and
received via loopback device. As the same conntrack is
used both for outgoing and incoming direction, we restrict
seq adjustment to happen only in POSTROUTING.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
By default, when broadcast or multicast packet are sent from a local
application, they are sent to the interface then looped by the kernel
to other local applications, going throught netfilter hooks in the
process.
These looped packet have their MAC header removed from the skb by the
kernel looping code. This confuse various netfilter's netlink queue,
netlink log and the legacy ip_queue, because they try to extract a
hardware address from these packets, but extracts a part of the IP
header instead.
This patch prevent NFQUEUE, NFLOG and ip_QUEUE to include a MAC header
if there is none in the packet.
Signed-off-by: Nicolas Cavallari <cavallar@lri.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Userspace allows to specify inversion for IP header ECN matches, the
kernel silently accepts it, but doesn't invert the match result.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Check for protocol inversion in ecn_mt_check() and remove the
unnecessary runtime check for IPPROTO_TCP in ecn_mt().
Signed-off-by: Patrick McHardy <kaber@trash.net>
In hci_conn_security ( which is used during L2CAP connection
establishment ) test for HCI_CONN_ENCRYPT_PEND state also
sets this state, which is bogus and leads to connection time-out
on L2CAP sockets in certain situations (especially when
using non-ssp devices )
Signed-off-by: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>