The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3,
and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3
will load with firmware versions that iw_cxgb3 can't handle. The FW
major number indicates a specific interface between the FW and
iw_cxgb3. Thus if the major number of the running firmware does not
match the required version compiled into iw_cxgb3, then iw_cxgb3 must
not register that device.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Also, removed unnecessary memset() since alloc_netdev returns
zeroed memory.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert this driver to new net_device_ops infrastructure.
Also use default net_device get-stats infrastructure
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the ConnectX programmer's reference manual, all
operations should be stopped, all QPs should be torn down and all WQEs
flushed before the CLOSE_PORT command is invoked. In some cases
reversing the order of operations (as implemented now) could cause
a loss of completions.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
STag zero is a special STag that allows consumers to access any bus
address without registering memory. The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system). Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.
The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.
Cc: <stable@kernel.org>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While doing testing, there are failures as MPA Reject call is not
handled. To handle MPA Reject call, following changes are done:
*Handle inbound/outbound MPA Reject response message.
When nes_reject() is called for pending MPA request reply,
send the MPA Reject message to its peer (active
side)cm_node. The peer cm_node (active side) will indicate
Reject message event for the pending Connect Request.
*Handle MPA Reject response message for loopback connections and listener.
When MPA Request is rejected, check if it is a loopback
connection and if it is then it will send Reject message event
to its peer loopback node. Also when destroying listener,
check if the cm_nodes for that listener are loopback or not.
*Add gracefull connection close with the MPA Reject response message.
Send gracefull close (FIN, FIN ACK..) to terminate the cm_nodes.
*Some code re-org while making the above changes.
Removed recv_list and recv_list_lock from the cm_node
structure as there can be only one receive close entry on the
timer. Also implemented handle_recv_entry() as receive close
entry is processed from both nes_rem_ref_cm_node() as well as
nes_cm_timer_tick().
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Two level 256 byte PBLs was not implemented so the driver could report
out of memory when in fact there were PBLs still available.
This solution prefers to use 4KB PBLs over two level 256B PBLs until
the number of 4KB PBLs falls below a threshold. At this point the 4KB
PBL structure is converted to use 256B PBLs which prevents the driver
from running out of 4KB PBLs too quickly.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
and remove the NETIF_F_LLTX feature flag. This also fixes a warning
in some configs that comes from doing skb_linearize() call in the
hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
are disabled in that context). By getting rid of LLTX, we do not
disable IRQs when skb_linearize() is called.
Remove the sq_lock as it is not needed for non-LLTX. Fix ethtool not
to show the counter for sq_lock.
Reported-by: aluno3@poczta.onet.pl
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When asynchronous events are processed by software, it is necessary
to let the hardware know that software has handled the event. This
frees up the entry in the asynchronous event queue.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In find_node(), tmp_addr causes an "unused variable" warning when
INFINIBAND_NES_DEBUG is not defined. It's only used in a nes_debug()
and the print does not make sense. So take out the whole thing.
Reported-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ibv_devinfo displays 0 for vendor_id and vendor_part_id. Fill in OUI
and device_id for those two fields.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Update copyright to the new legal entity, Intel-NE, Inc., an Intel
company. Update copyright for the new year.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix occurrences where the software PBL counts were changed before the
hardware was updated. This bug allowed another thread to overallocate
the hardware resources.
Add proper PBL accounting in case nes_reg_mr() fails.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_release_user_pages_on_close() just allocated a structure to
schedule work with but just returned (leaking the structure) rather than
actually doing schedule_work(). Fix the logic to what was intended.
This was spotted by the Coverity checker (CID 2700).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the second vmalloc() fails, the wrong pointer is pased to vfree(), so
the first vmalloc() ends up getting leaked.
This was spotted by the Coverity checker (CID 2709).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Impact: new timer API
Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:
introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.
(regular mod_timer() re-activates non-pending timers.)
This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.
Also while at it:
- optimize the regular mod_timer() path some more, the
timer-stat and a debug check was needlessly duplicated
in __mod_timer().
- make the exports come straight after the function, as
most other exports in timer.c already did.
- eliminate __mod_timer() as an external API, change the
users to mod_timer().
The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.
Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove modulo usage to avoid a divide in the fast path (not all
gcc versions do strength reduction here).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The poll and flush code needs to handle all send opcodes: SEND,
SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.
Ignore TERM indications if the connection already gone.
Ignore HW receive completions if the RQ is empty.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The variable 'offset' in iwch_sgl2pbl_map() needs to be a u64.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched. This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).
This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.
Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID. If and only if they are not identical then a
LID_CHANGE event is dispatched.
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched. This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).
This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.
Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID. If and only if they are not identical then a
LID_CHANGE event is dispatched.
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Freeze activity when notified that the underlying chip
is getting reset on a EEH event or fatal error.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The base versions handle constant folding just fine, use them
directly. The replacements are OK in the include/ files as they are
not exported to userspace so we don't need the __ prefixed versions.
This patch does not affect code generation at all.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ehca_plpar_hcall9() takes an unsigned long array, so make all callers
pass that in. This fixes warnings introduced by commit fe333321
("powerpc: Change u64/s64 to a long long integer type"), which changed
u64 from unsigned long to unsigned long long.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit fe333321 ("powerpc: Change u64/s64 to a long long integer
type") changed u64 from unsigned long to unsigned long long, which
means that printk formats for printing u64 values should use "ll"
instead of "l" to avoid warnings. Fix all the places affected by this
in ehca.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The current work request posting code writes the LSO segment before
writing any data segments. This leaves a window where the LSO segment
overwrites the stamping in one cacheline that the HCA prefetches
before the rest of the cacheline is filled with the correct data
segments. When the HCA processes this work request, a local
protection error may result.
Fix this by saving the LSO header size field off and writing it only
after all data segments are written. This fix is a cleaned-up version
of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>.
Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 63779436 ("drivers: replace NIPQUAD()") accidentally replaced
some HIPQUAD()s, causing IP addresses to be printed in reverse order.
Add temporary local vars until the byteswapping can be pushed further
up the stack.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the mlx4_ib driver finds an adapter that has only ethernet ports, the
current code will register an IB device with 0 ports. Nothing useful or
sensible can be done with such a device, so just skip registering it.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When I review ocfs2 code, find there are 2 typos to "successfull". After
doing grep "successfull " in kernel tree, 22 typos found totally -- great
minds always think alike :)
This patch fixes all the similar typos. Thanks for Randy's ack and comments.
Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The flags argument to spin_lock_irqsave() should really be unsigned
long. This will also help prevent some warnings when we change u64 to
unsigned long long.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
... and don't bother in callers. Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.
i_mode is not worth the effort; it has no common default value.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit f780a9f1 ("mlx4_core: Add ethernet fields to CQE struct")
introduced a bug in how wc->sl is set in mlx4_ib_poll_one() -- since
cqe->sl_vid is a big-endian value, the shift must be done after
converting to host endianness.
This bug was found using sparse endianness checking.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Impact: cleanup
We're moving from handing around cpumask_t's to handing around struct
cpumask *'s. cpus_*, cpumask_t and cpu_*_map are deprecated: convert
to cpumask_*, cpu_*_mask.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ralph Campbell <infinipath@qlogic.com>
Impact: cleanup
We're moving from handing around cpumask_t's to handing around struct
cpumask *'s. cpus_*, cpumask_t and cpu_*_map are deprecated: convert
to cpumask_*, cpu_*_mask.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Tested-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Impact: cleanup
In future, accessing cpu numbers beyond nr_cpu_ids (the runtime limit)
will be undefined. We can avoid future problems by using
for_each_online_cpu() here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Tested-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
When we removed the network device argument from several
NAPI interfaces in 908a7a16b8
("net: Remove unused netdev arg from some NAPI interfaces.")
several drivers now started getting unused variable warnings.
This fixes those up.
Signed-off-by: David S. Miller <davem@davemloft.net>
When resizing a CQ, when copying over unpolled CQEs from the old CQE
buffer to the new buffer, the ownership bit must be set appropriately
for the new buffer, or the ownership bit in the new buffer gets
corrupted.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is no lock protecting tx_free_list thus causing a system crash
when skb_dequeue() is called and the list is empty. Since it did not give
any performance boost under heavy load, remove it to simplify the code.
Replace get_free_pkt() with dev_alloc_skb() to allocate MAX_CM_BUFFER skb
for connection establishment/teardown as well as MPA request/response.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using MSI-X mode, create a completion event queue for each CPU.
Report the number of completion EQs in a new struct mlx4_caps member,
num_comp_vectors, and extend the mlx4_cq_alloc() interface with a
vector parameter so that consumers can specify which completion EQ
should be used to report events for the CQ being created.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
vpage is checked not to be NULL just after it is initialized at the
beginning of each loop iteration.
A simplified version of the semantic patch that makes this change is
as follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r exists@
local idexpression x;
expression E;
position p1,p2;
@@
if (x@p1 == NULL || ...) { ... when forall
return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
x@p2 == NULL
|
x@p2 != NULL
)
// another path to the test that is not through p1?
@s exists@
local idexpression r.x;
position r.p1,r.p2;
@@
... when != x@p1
(
x@p2 == NULL
|
x@p2 != NULL
)
@fix depends on !s@
position r.p1,r.p2;
expression x,E;
statement S1,S2;
@@
(
- if ((x@p2 != NULL) || ...)
S1
|
- if ((x@p2 == NULL) && ...) S1
|
- BUG_ON(x@p2 == NULL);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With the latest flush error completion patch we introduced modulus
operation to calculate the next index within a qmap. Based on
comments from other mailing lists we decided to optimize this
operation by using an addition and an if-statement instead of modulus,
even though this is on the error path.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fixes timing race resulting in panic. Not a performance sensitive path.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_piobufbase was a single value offset, but is multiple values on
newer chips, so use only the 32 bits for the 2K buffers (4K buffers
are currently used only by the driver).
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Implement the ignoring of ibsymbol errors and linkrecover errors while
the link is at less than INIT (long needed), to get accurate counts.
Particularly an issue when doing non-IBTA DDR negotiation with chips
from vendors that do not support IBTA mode negotiation. If the driver
is unloaded, and there is a delta, the adjusted counters are written
back to the chip, so they stay adjusted across driver reload.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This fixes an obvious oversight where the return value is not checked
for error.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The PSN of the first packet after an RDMA read is based on the size of
the RDMA read request. This is calculated correctly for the WQE sent
after the first request message but not on subsequent requests if the
RDMA read is resent.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>