This patch addresses a long-standing bug where the get_user_pages_fast()
write parameter used for setting the underlying page table entry permission
bits was incorrectly set to write=1 for data_direction=DMA_TO_DEVICE, and
passed into get_user_pages_fast() via vhost_scsi_map_iov_to_sgl().
However, this parameter is intended to signal WRITEs to pinned userspace
PTEs for the virtio-scsi DMA_FROM_DEVICE -> READ payload case, and *not*
for the virtio-scsi DMA_TO_DEVICE -> WRITE payload case.
This bug would manifest itself as random process segmentation faults on
KVM host after repeated vhost starts + stops and/or with lots of vhost
endpoints + LUNs.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Asias He <asias@redhat.com>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Remove a lingering macro that just hid a dereference.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix GFP_KERNEL -> GFP_ATOMIC usage of percpu_ida_alloc() within
vhost_scsi_get_tag(), as this code is expected to be called directly
from interrupt context.
v2 changes:
- Handle possible tag < 0 failure with GFP_ATOMIC
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Asias He <asias@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
As vhost scsi device struct is large, if the device is
created on a busy system, kzalloc() might fail, so this patch does a
fallback to vzalloc().
As vmalloc() adds overhead on data-path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.
Reviewed-by: Asias He <asias@redhat.com>
Reported-by: Dan Aloni <alonid@stratoscale.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Update copyright ownership/year information for target-core,
loopback, iscsi-target, tcm_qla2xx, vhost and iser-target.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds support for pre-allocation of per tv_cmd descriptor
scatterlist + user-space page pointer memory using se_sess->sess_cmd_map
within tcm_vhost_make_nexus() code.
This includes sanity checks within vhost_scsi_map_to_sgl()
to reject I/O that exceeds these initial hardcoded values, and
the necessary cleanup in tcm_vhost_make_nexus() failure path +
tcm_vhost_drop_nexus().
v3 changes:
- Rebase to v3.11-rc5 code
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Asias He <asias@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Reviewed-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes vhost/scsi to use transport_init_session_tags()
pre-allocation logic for per-cpu session tag pooling with internal
ida_alloc() + ida_free() calls based upon the saved se_cmd->map_tag id.
FIXME: Make transport_init_session_tags() number of tags setup
configurable per vring client setting via configfs
v5 changes:
- Convert to percpu_ida.h include
v3 changes:
- Update to percpu-ida usage
- Rebase to v3.11-rc5 code
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Asias He <asias@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Now, vq->private_data is always accessed under vq mutex. No need to play
the vhost rcu trick.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The return value wasn't checked by any of the callers. Assuming this is
correct behaviour, we can simplify some code by not bothering to
generate it.
nab: Add srpt_queue_data_in() + srpt_queue_tm_rsp() nops around
srpt_queue_response() void return
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
$ make C=1 M=drivers/vhost
drivers/vhost/net.c:168:5: warning: symbol 'vhost_net_set_ubuf_info' was not declared. Should it be static?
drivers/vhost/net.c:194:6: warning: symbol 'vhost_net_vq_reset' was not declared. Should it be static?
drivers/vhost/scsi.c:219:6: warning: symbol 'tcm_vhost_done_inflight' was not declared. Should it be static?
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, vhost-net and vhost-scsi are sharing the vhost core code.
However, vhost-scsi shares the code by including the vhost.c file
directly.
Making vhost a separate module makes it is easier to share code with
other vhost devices.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This way, we use cmd for struct tcm_vhost_cmd and evt for struct
tcm_vhost_cmd.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It was needed when struct tcm_vhost_tpg is in tcm_vhost.h
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch coverts vhost/scsi to se_cmd->cmd_kref TARGET_SCF_ACK_KREF
usage, instead of assuming that vhost_scsi_free_cmd() is always called
before TCM processing is completed in the response fast path.
This includes adding vhost_scsi_check_stop_free() -> target_put_sess_cmd()
to perform the second se_cmd->cmd_kref put, and moving vhost_scsi_free_cmd()
resource release into tcm_vhost_release_cmd() that is invoked once the last
se_cmd->cmd_kref put occurs.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@kernel.org>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Asias He <asias@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Moussa Ba <moussaba@micron.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes vhost_scsi_free_cmd() to call transport_generic_free_cmd()
with wait_for_tasks=false in order to avoid the extra se_cmd->t_state_lock
access for the wait_for_tasks=true case.
This is unnecessary because vhost_scsi_free_cmd() is only ever called by
vhost_scsi_complete_cmd_work() after TCM completion handoff, and by
vhost_scsi_handle_vq() exception code before TCM submission handoff, so
there is never a case where se_cmd is still active from TCM's perspective
when transport_generic_free_cmd() is called.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@kernel.org>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Asias He <asias@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Moussa Ba <moussaba@micron.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
It was disabled as a workaround. Now userspace bits work fine with it.
The broken version was not ever committed to QEMU, I guess the same is
true for nlkt.
So, let's enable it.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename module and update Kconfig and Makefile.
Add alias for compatibility with old userspace
scripts if any.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
move uapi parts to vhost.h
move .c private parts to .c itself
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Move tcm_vhost.c -> scsi.c
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Unlike tcm_vhost_evt requests, tcm_vhost_cmd requests are passed to the
target core system, we can not make sure all the pending requests will
be finished by flushing the virt queue.
In this patch, we do refcount for every tcm_vhost_cmd requests to make
vhost_scsi_flush() wait for all the pending requests issued before the
flush operation to be finished.
This is useful when we call vhost_scsi_clear_endpoint() to stop
tcm_vhost. No new requests will be passed to target core system because
we clear the endpoint by setting vs_tpg to NULL. And we wait for all the
old requests. These guarantee no requests will be leaked and existing
requests will be completed.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is useful for any device who wants device specific fields per vq.
For example, tcm_vhost wants a per vq field to track requests which are
in flight on the vq. Also, on top of this we can add patches to move
things like ubufs from vhost.h out to net.c.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Everything for hotplug is ready. Let's enable the feature bit.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In commit 365a715009 ([SCSI] virtio-scsi: hotplug support for
virtio-scsi), hotplug support is added to virtio-scsi.
This patch adds hotplug and hotunplug support to tcm_vhost.
You can create or delete a LUN in targetcli to hotplug or hotunplug a
LUN in guest.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We want to use tcm_vhost_mutex to make sure hotplug/hotunplug will not
happen when set_endpoint/clear_endpoint is in process.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Send bad target to guest in case:
1) we can not allocate the cmd
2) fail to submit the cmd
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Share the send bad target code with other use cases.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If we fail to submit the allocated tv_vmd to tcm_vhost_submission_work,
we will leak the tv_vmd. Free tv_vmd on fail path.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We did the length of response check twice.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes guest hang when booting seabios and guest.
[ 0.576238] scsi0 : Virtio SCSI HBA
[ 0.616754] virtio_scsi virtio1: request:id 0 is not a head!
vq->last_used_idx is initialized only when /dev/vhost-scsi is
opened or closed.
vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()
So, when guest talks to tcm_vhost after seabios does, vq->last_used_idx
still contains the old valule for seabios. This confuses guest.
Fix this by calling vhost_init_used() to init vq->last_used_idx when
we set endpoint.
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
not. It is set or cleared in vhost_scsi_set_endpoint() or
vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
we check it in vhost_scsi_handle_vq(), we ignored the lock.
Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
indicate the status of the endpoint, we use per virtqueue
vq->private_data to indicate it. In this way, we can only take the
vq->mutex lock which is per queue and make the concurrent multiqueue
process having less lock contention. Further, in the read side of
vq->private_data, we can even do not take the lock if it is accessed in
the vhost worker thread, because it is protected by "vhost rcu".
(nab: Do s/VHOST_FEATURES/~VHOST_SCSI_FEATURES)
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In vhost_scsi_handle_vq:
tv_tpg = vs->vs_tpg[target];
if (!tv_tpg) {
....
return
}
tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,
1) vs->vs_tpg[target] might change after the NULL check and 2) the above
line might access tv_tpg from vs->vs_tpg[target]. To prevent 2), use
ACCESS_ONCE. Thanks mst for catching this up!
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a VHOST_SCSI_FEATURES mask minus VIRTIO_RING_F_EVENT_IDX
so that vhost-scsi-pci userspace will strip this feature bit once
GET_FEATURES reports it as being unsupported on the host.
This is to avoid a bug where ->handle_kicks() are missed when EVENT_IDX
is enabled by default in userspace code.
(mst: Rename to VHOST_SCSI_FEATURES + add comment)
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We also need to flush the vhost_works. It is the completion vhost_work
currently.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
tv_tpg->tv_tpg_vhost_count should be protected by tv_tpg->tv_tpg_mutex.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This adds virtio-scsi multi-queue support to tcm_vhost. In order to use
multi-queue, guest side multi-queue support is need. It can
be found here:
https://lkml.org/lkml/2012/12/18/166
Currently, only one thread is created by vhost core code for each
vhost_scsi instance. Even if there are multi-queues, all the handling of
guest kick (vhost_scsi_handle_kick) are processed in one thread. This is
not optimal. Luckily, most of the work is offloaded to the tcm_vhost
workqueue.
Some initial perf numbers:
1 queue, 4 targets, 1 lun per target
4K request size, 50% randread + 50% randwrite: 127K/127k IOPS
4 queues, 4 targets, 1 lun per target
4K request size, 50% randread + 50% randwrite: 181K/181k IOPS
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In order to take advantages of Paolo's multi-queue virito-scsi, we need
multi-target support in tcm_vhost first. Otherwise all the requests go
to one queue and other queues are idle.
This patch makes:
1. All the targets under the wwpn is seen and can be used by guest.
2. No need to pass the tpgt number in struct vhost_scsi_target to
tcm_vhost.ko. Only wwpn is needed.
3. We can always pass max_target = 255 to guest now, since we abort the
request who's target id does not exist.
Changes in v2:
- Handle non-contiguous tpgt
Changes in v3:
- Simplfy lock in vhost_scsi_set_endpoint
- Return -EEXIST when does not match
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We can get all the pages in one time instead of calling
gup N times.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add a helper to calculate the number of pages needed for a iov entry.
(nab: Drop unnecessary inline)
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This drops the cmd completion list spin lock and makes the cmd
completion queue lock-less.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
It's OK to get kick before backend is set or after
it is cleared, we can just ignore it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The variable se_sess is initialized but never used
otherwise, so remove the unused variable.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>