The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
CC: Avi Kivity <avi@redhat.com>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Switching to or from guest context is done on ioctl context.
So by the time we call kvm_guest_enter() or kvm_guest_exit()
we know we are not running the idle task.
As a result, we can directly account the cputime using
vtime_account_system().
There are two good reasons to do this:
* We avoid some useless checks on guest switch. It optimizes
a bit this fast path.
* In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
checks for irq time to account. This is pointless since we know
we are not in an irq on guest switch. This is wasting cpu cycles
for no good reason. vtime_account_system() OTOH is a no-op in
this config option.
* We can remove the irq disable/enable around kvm guest switch in s390.
A further optimization may consist in introducing a vtime_account_guest()
that directly calls account_guest_time().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Newer kernels (linux-next with the transparent huge page patches)
use rrbm if the feature is announced via feature bit 66.
RRBM will cause intercepts, so KVM does not handle it right now,
causing an illegal instruction in the guest.
The easy solution is to disable the feature bit for the guest.
This fixes bugs like:
Kernel BUG at 0000000000124c2a [verbose debug info unavailable]
illegal operation: 0001 [#1] SMP
Modules linked in: virtio_balloon virtio_net ipv6 autofs4
CPU: 0 Not tainted 3.5.4 #1
Process fmempig (pid: 659, task: 000000007b712fd0, ksp: 000000007bed3670)
Krnl PSW : 0704d00180000000 0000000000124c2a (pmdp_clear_flush_young+0x5e/0x80)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
00000000003cc000 0000000000000004 0000000000000000 0000000079800000
0000000000040000 0000000000000000 000000007bed3918 000000007cf40000
0000000000000001 000003fff7f00000 000003d281a94000 000000007bed383c
000000007bed3918 00000000005ecbf8 00000000002314a6 000000007bed36e0
Krnl Code:>0000000000124c2a: b9810025 ogr %r2,%r5
0000000000124c2e: 41343000 la %r3,0(%r4,%r3)
0000000000124c32: a716fffa brct %r1,124c26
0000000000124c36: b9010022 lngr %r2,%r2
0000000000124c3a: e3d0f0800004 lg %r13,128(%r15)
0000000000124c40: eb22003f000c srlg %r2,%r2,63
[ 2150.713198] Call Trace:
[ 2150.713223] ([<00000000002312c4>] page_referenced_one+0x6c/0x27c)
[ 2150.713749] [<0000000000233812>] page_referenced+0x32a/0x410
[...]
CC: stable@vger.kernel.org
CC: Alex Graf <agraf@suse.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
EXTERNAL_CALL and EMERGENCY type interrupts need to preserve their interrupt
code parameter when being injected from user space.
Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Recent changes (KVM: make processes waiting on vcpu mutex killable)
now requires to check the return value of vcpu_load. This triggered
a warning in s390 specific kvm code. Turns out that we can actually
remove the put/load, since schedule will do the right thing via
the preempt notifiers.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Change return code handling of the stsi() function:
In case function code 0 was specified the return value is the
current configuration level (already shifted). That way all
the code that actually copied the stsi_0() function can go
away.
Otherwise the return value is 0 (success) or negative to
indicate an error (currently only -EOPNOTSUPP).
Also stsi() is no longer an inline function. The function is
not performance critical, but every caller would generate an
exception table entry for this function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split the base menu into more descriptive categories like processor,
memory, etc. and move virtualization related entries to the
Virtualization menu.
Also select KEXEC unconditionally.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce a new trace system, kvm-s390, for some kvm/s390 specific
trace points:
- injection of interrupts
- delivery of interrupts to the guest
- creation/destruction of kvm machines and vcpus
- stop actions for vcpus
- reset requests for userspace
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add trace events for several s390 architecture specifics:
- SIE entry/exit
- common intercepts
- common instructions (sigp/diagnose)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x
Signed-off-by: Avi Kivity <avi@redhat.com>
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If sigp sense doesn't have any status bits to report, it should set
cc 0 and leave the register as-is.
Since we know about the external call pending bit, we should report
it if it is set as well.
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Just use the defines instead of using plain numbers and adding
a comment behind each line.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If an invalid parameter is passed or the addressed cpu is in an
incorrect state sigp set prefix will store a status.
This status must only have bits set as defined by the architecture.
The current kvm implementation missed to clear bits and also did
not set the intended status bit ("and" instead of "or" operation).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Only if the sensed cpu is not running a status is stored, which
is reflected by condition code 1. If the cpu is running, condition
code 0 should be returned.
Just the opposite of what the code is doing.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The smp and the kvm code have different defines for the sigp order codes.
Let's just have a single place where these are defined.
Also move the sigp condition code and sigp cpu status bits to the new
sigp.h header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The initial cpu reset sets the cpu in the stopped state.
Several places check for the cpu state (e.g. sigp set prefix) and
not setting the STOPPED state triggered errors with newer guest
kernels after reboot.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The smp and the kvm code have different defines for the sigp order codes.
Let's just have a single place where these are defined.
Also move the sigp condition code and sigp cpu status bits to the new
sigp.h header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Enhance the KVM ONE_REG capability within S390 to allow
getting/setting the following special cpu registers: clock comparator
and the cpu timer. These are needed for migration.
Signed-off-by: Jason J. herne <jjherne@us.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch makes vcpu epoch difference and the TOD programmable
field accessible from userspace. This is needed in order to
implement a couple of instructions that deal with the time of
day clock on s390, such as SET CLOCK and for migration.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch enables KVM_CAP_ONE_REG for s390 and implements stubs
for KVM_GET/SET_ONE_REG. This is based on the ppc implementation.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Currently qemu/kvm on s390 uses a guest mapping that does not
allow the guest backing page table to be write-protected to
support older systems. On those older systems a host write
protection fault will be delivered to the guest.
Newer systems allow to write-protect the guest backing memory
and let the fault be delivered to the host, thus allowing COW.
Use a capability bit to tell qemu if that is possible.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Let userspace know the number of max and supported cpus for kvm on s390.
Return KVM_MAX_VCPUS (currently 64) for both values.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Lets replace the old open coded version of diag 0x44 (which relied on
compat_sched_yield) with kvm_vcpu_on_spin.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch implements the directed yield hypercall found on other
System z hypervisors. It delegates execution time to the virtual cpu
specified in the instruction's parameter.
Useful to avoid long spinlock waits in the guest.
Christian Borntraeger: moved common code in virt/kvm/
Signed-off-by: Konstantin Weitz <WEITZKON@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The kvm_vcpu_kick function performs roughly the same funcitonality on
most all architectures, so we shouldn't have separate copies.
PowerPC keeps a pointer to interchanging waitqueues on the vcpu_arch
structure and to accomodate this special need a
__KVM_HAVE_ARCH_VCPU_GET_WQ define and accompanying function
kvm_arch_vcpu_wq have been defined. For all other architectures this
is a generic inline that just returns &vcpu->wq;
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The 16 bit value at the lowcore location with offset 0x84 is the
cpu address that is associated with an external interrupt. Rename
the field from cpu_addr to ext_cpu_addr to make that clear.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some members of kvm_memory_slot are not used by every architecture.
This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch; lpage_info is moved into it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
There are several cases were we need the control registers for
userspace. Lets also provide those in kvm_run.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When we do a stop and store status we need to pass ACTION_STOP_ON_STOP
flag to __sigp_stop().
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
In __inject_sigp_stop() do nothing when the CPU is already in stopped
state.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
On reboot the guest sends in smp_send_stop() a sigp stop to all CPUs
except for current CPU. Then the guest switches to the IPL cpu by
sending a restart to the IPL CPU, followed by a sigp stop to the
current cpu. Since restart is handled by userspace it's possible that
the restart is delivered before the old stop. This means that the IPL
CPU isn't restarted and we have no running CPUs. So let's make sure
that there is no stop action pending when we do the restart.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
In handle_stop() handle the stop bit before doing the store status as
described for "Stop and Store Status" in the Principles of Operation.
We have to give up the local_int.lock before calling kvm store status
since it calls gmap_fault() which might sleep. Since local_int.lock
only protects local_int.* and not guest memory we can give up the lock.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
commit 7eef87dc99 (KVM: s390: fix
register setting) added a load of the floating point control register
to the KVM_SET_FPU path. Lets make sure that the fpc is valid.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch adds the access registers to the kvm_run structure.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch adds the general purpose registers to the kvm_run structure.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add the prefix register to the synced register field in kvm_run.
While we need the prefix register most of the time read-only, this
patch also adds handling for guest dirtying of the prefix register.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
There are several places in the kvm module, which set the prefix register.
Since we need to flush the cpu, lets combine this operation into a helper
function. This helper will also explicitely mask out the unused bits.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch fixes the return code of kvm_arch_vcpu_ioctl in case
of an unkown ioctl number.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch announces a new capability KVM_CAP_S390_UCONTROL that
indicates that kvm can now support virtual machines that are
controlled by userspace.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch makes sure user controlled virtual machines do not use a
system control area (sca). This is needed in order to create
virtual machines with more cpus than the size of the sca [64].
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch allows the user to fault in pages on a virtual cpus
address space for user controlled virtual machines. Typically this
is superfluous because userspace can just create a mapping and
let the kernel's page fault logic take are of it. There is one
exception: SIE won't start if the lowcore is not present. Normally
the kernel takes care of this [handle_validity() in
arch/s390/kvm/intercept.c] but since the kernel does not handle
intercepts for user controlled virtual machines, userspace needs to
be able to handle this condition.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch disables the in-kernel interrupt stack for KVM virtual
machines that are controlled by user. Userspace has to take care
of handling interrupts on its own.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch disables in-kernel handling of SIE intercepts for user
controlled virtual machines. All intercepts are passed to userspace
via KVM_EXIT_SIE exit reason just like SIE intercepts that cannot be
handled in-kernel for regular KVM guests.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch exports the s390 SIE hardware control block to userspace
via the mapping of the vcpu file descriptor. In order to do so,
a new arch callback named kvm_arch_vcpu_fault is introduced for all
architectures. It allows to map architecture specific pages.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch introduces a new exit reason in the kvm_run structure
named KVM_EXIT_S390_UCONTROL. This exit indicates, that a virtual cpu
has regognized a fault on the host page table. The idea is that
userspace can handle this fault by mapping memory at the fault
location into the cpu's address space and then continue to run the
virtual cpu.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>