This reverts commit 00e37bdb01.
During shutdown of PVHVM guests with more than 2VCPUs on certain
machines we can hit the race where the replaced shared_info is not
replaced fast enough and the PV time clock retries reading the same
area over and over without any any success and is stuck in an
infinite loop.
Acked-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into
MMIO space before the kexec boot.
The pfn containing the shared_info is located somewhere in RAM. This
will cause trouble if the current kernel is doing a kexec boot into a
new kernel. The new kernel (and its startup code) can not know where the
pfn is, so it can not reserve the page. The hypervisor will continue to
update the pfn, and as a result memory corruption occours in the new
kernel.
One way to work around this issue is to allocate a page in the
xen-platform pci device's BAR memory range. But pci init is done very
late and the shared_info page is already in use very early to read the
pvclock. So moving the pfn from RAM to MMIO is racy because some code
paths on other vcpus could access the pfn during the small window when
the old pfn is moved to the new pfn. There is even a small window were
the old pfn is not backed by a mfn, and during that time all reads
return -1.
Because it is not known upfront where the MMIO region is located it can
not be used right from the start in xen_hvm_init_shared_info.
To minimise trouble the move of the pfn is done shortly before kexec.
This does not eliminate the race because all vcpus are still online when
the syscore_ops will be called. But hopefully there is no work pending
at this point in time. Also the syscore_op is run last which reduces the
risk further.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
While debugging kexec issues in a PVonHVM guest I modified
xen_hvm_platform() to return false to disable all PV drivers. This
caused a crash in platform_pci_init() because it expects certain data
structures to be initialized properly.
To avoid such a crash make sure the driver is initialized only if
running in a Xen guest.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add xs_reset_watches function to shutdown watches from old kernel after
kexec boot. The old kernel does not unregister all watches in the
shutdown path. They are still active, the double registration can not
be detected by the new kernel. When the watches fire, unexpected events
will arrive and the xenwatch thread will crash (jumps to NULL). An
orderly reboot of a hvm guest will destroy the entire guest with all its
resources (including the watches) before it is rebuilt from scratch, so
the missing unregister is not an issue in that case.
With this change the xenstored is instructed to wipe all active watches
for the guest. However, a patch for xenstored is required so that it
accepts the XS_RESET_WATCHES request from a client (see changeset
23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored
the registration of watches will fail and some features of a PVonHVM
guest are not available. The guest is still able to boot, but repeated
kexec boots will fail.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Coverity points out that we do not free in one case the
pr_backup - and sure enough we forgot.
Found by Coverity (CID 401970)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If a driver leaves its poll method NULL, the device is assumed to
be both readable and writable without blocking.
This patch add .poll method to xen mcelog device driver, so that
when mcelog use system calls like ppoll or select, it would be
blocked when no data available, and avoid spinning at CPU.
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
copy_to_user might sleep and print a stack trace if it is executed
in an atomic spinlock context. Like this:
(XEN) CMCI: send CMCI to DOM0 through virq
BUG: sleeping function called from invalid context at /home/konradinux/kernel.h:199
in_atomic(): 1, irqs_disabled(): 0, pid: 4581, name: mcelog
Pid: 4581, comm: mcelog Tainted: G O 3.5.0-rc1upstream-00003-g149000b-dirty #1
[<ffffffff8109ad9a>] __might_sleep+0xda/0x100
[<ffffffff81329b0b>] xen_mce_chrdev_read+0xab/0x140
[<ffffffff81148945>] vfs_read+0xc5/0x190
[<ffffffff81148b0c>] sys_read+0x4c/0x90
[<ffffffff815bd039>] system_call_fastpath+0x16
This patch schedule a workqueue for IRQ handler to poll the data,
and use mutex instead of spinlock, so copy_to_user sleep in atomic
context would not occur.
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This patch provide Xen physical cpus online/offline sys interface.
User can use it for their own purpose, like power saving:
by offlining some cpus when light workload it save power greatly.
Its basic workflow is, user online/offline cpu via sys interface,
then hypercall xen to implement, after done xen inject virq back to dom0,
and then dom0 sync cpu status.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When MCA error occurs, it would be handled by Xen hypervisor first,
and then the error information would be sent to initial domain for logging.
This patch gets error information from Xen hypervisor and convert
Xen format error into Linux format mcelog. This logic is basically
self-contained, not touching other kernel components.
By using tools like mcelog tool users could read specific error information,
like what they did under native Linux.
To test follow directions outlined in Documentation/acpi/apei/einj.txt
Acked-and-tested-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Ke, Liping <liping.ke@intel.com>
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
For returning errors out to non-PCI code. Re-name xen's version.
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some SR-IOV devices may use more than one bus number, but there is no real bridges
because that have internal routing mechanism. So need to check whether the bridge is
existing before using it.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
All of the bind_XYZ_to_irq do a quick lookup to see if the
event exists. And if it does, then the initialized IRQ number
is returned instead of initializing a new IRQ number.
This patch adds an extra logic to check that the type returned
is proper one and that there is an IRQ handler setup for it.
This patch has the benefit of being able to find drivers that
are doing something naught.
[v1: Enhanced based on Stefano's review]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
PV on HVM guests map GSIs into event channels. At restore time the
event channels are resumed by restore_pirqs.
Device drivers might try to register the same GSI again through ACPI at
restore time, but the GSI has already been mapped and bound by
restore_pirqs. This patch detects these situations and avoids
mapping the same GSI multiple times.
Without this patch we get:
(XEN) irq.c:2235: dom4: pirq 23 or emuirq 28 already mapped
and waste a pirq.
CC: stable@kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Currently, the memory target in the Xen selfballooning driver is mainly
driven by the value of "Committed_AS". However, there are cases in
which it is desirable to assign additional memory to be available for
the kernel, e.g. for local caches (which are not covered by cleancache),
e.g. dcache and inode caches.
This adds an additional tunable in the selfballooning driver (accessible
via sysfs) which allows the user to specify an additional constant
amount of memory to be reserved by the selfballoning driver for the
local domain.
Signed-off-by: Jana Saout <jana@saout.de>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add an ioctl to the /dev/xen/xenbus_backend device allowing the xenbus
backend to be started after the kernel has booted. This allows xenstore
to run in a different domain from the dom0.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This file depends on <xen/xen.h>, but the dependency was hidden due
to: <asm/acpi.h> -> <asm/trampoline.h> -> <asm/io.h> -> <xen/xen.h>
With the removal of <asm/trampoline.h>, this exposed the missing
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch is a significant performance improvement for the
m2p_override: about 6% using the gntdev device.
Each m2p_add/remove_override call issues a MULTI_grant_table_op and a
__flush_tlb_single if kmap_op != NULL. Batching all the calls together
is a great performance benefit because it means issuing one hypercall
total rather than two hypercall per page.
If paravirt_lazy_mode is set PARAVIRT_LAZY_MMU, all these calls are
going to be batched together, otherwise they are issued one at a time.
Adding arch_enter_lazy_mmu_mode/arch_leave_lazy_mmu_mode around the
m2p_add/remove_override calls forces paravirt_lazy_mode to
PARAVIRT_LAZY_MMU, therefore makes sure that they are always batched.
However it is not safe to call arch_enter_lazy_mmu_mode if we are in
interrupt context or if we are already in PARAVIRT_LAZY_MMU mode, so
check for both conditions before doing so.
Changes in v4:
- rebased on 3.4-rc4: all the m2p_override users call gnttab_unmap_refs
and gnttab_map_refs;
- check whether we are in interrupt context and the lazy_mode we are in
before calling arch_enter/leave_lazy_mmu_mode.
Changes in v3:
- do not call arch_enter/leave_lazy_mmu_mode in xen_blkbk_unmap, that
can be called in interrupt context.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v5: s/int lazy/bool lazy/]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Provide the registration callback to call in the Xen's
ACPI sleep functionality. This means that during S3/S5
we make a hypercall XENPF_enter_acpi_sleep with the
proper PM1A/PM1B registers.
Based of Ke Yu's <ke.yu@intel.com> initial idea.
[ From http://xenbits.xensource.com/linux-2.6.18-xen.hg
change c68699484a65 ]
[v1: Added Copyright and license]
[v2: Added check if PM1A/B the 16-bits MSB contain something. The spec
only uses 16-bits but might have more in future]
Signed-off-by: Liang Tang <liang.tang@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fit it into 80 columns so that it is readable in menuconfig.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We did a similar check for the P-states but did not do it for
the C-states. What we want to do is ignore cases where the DSDT
has definition for sixteen CPUs, but the machine only has eight
CPUs and we get:
xen-acpi-processor: (CX): Hypervisor error (-22) for ACPI CPU14
Reported-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In pirq_check_eoi_map use the pirq number rather than the Linux irq
number to check whether an eoi is needed in the pirq_eoi_map.
The reason is that the irq number is not always identical to the
pirq number so if we wrongly use the irq number to check the
pirq_eoi_map we are going to check for the wrong pirq to EOI.
As a consequence some interrupts might not be EOI'ed by the
guest correctly.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Tobias Geiger <tobias.geiger@vido.info>
[v1: Added some extra wording to git commit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
linux/drivers/xen/manage.c: In function 'do_suspend':
linux/drivers/xen/manage.c:160:5: warning: 'si.cancelled' may be used uninitialized in this function
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
A rather annoying and common case is when booting a PVonHVM guest
and exposing the PV KBD and PV VFB - as broken toolstacks don't
always initialize the backends correctly.
Normally The HVM guest is using the VGA driver and the emulated
keyboard for this (though upstream version of QEMU implements
PV KBD, but still uses a VGA driver). We provide a very basic
two-stage wait mechanism - where we wait for 30 seconds for all
devices, and then for 270 for all them except the two mentioned.
That allows us to wait for the essential devices, like network
or disk for the full 6 minutes.
To trigger this, put this in your guest config:
vfb = [ 'vnc=1, vnclisten=0.0.0.0 ,vncunused=1']
instead of this:
vnc=1
vnclisten="0.0.0.0"
CC: stable@kernel.org
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v3: Split delay in non-essential (30 seconds) and essential
devices per Ian and Stefano suggestion]
[v4: Added comments per Stefano suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Rather than just leaking pages that can't be freed at the point where
access permission for the backend domain gets revoked, put them on a
list and run a timer to (infrequently) retry freeing them. (This can
particularly happen when unloading a frontend driver when devices are
still present, and the backend still has them in non-closed state or
hasn't finished closing them yet.)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Since we are using the m2p_override we do have struct pages
corresponding to the user vma mmap'ed by gntdev.
Removing the VM_PFNMAP flag makes get_user_pages work on that vma.
An example test case would be using a Xen userspace block backend
(QDISK) on a file on NFS using O_DIRECT.
CC: stable@kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jump to the label ini_nomem as done on the failure of the page allocations
above.
The code at ini_nomem is modified to accommodate different return values.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a
positive value to indicate the number of vectors (less than the amount
requested) that can be set up for a given device. Returning this as an
operation value (secondary result) is fine, but (primary) operation
results are expected to be negative (error) or zero (success) according
to the protocol. With the frontend fixed to match the XenoLinux
behavior, the backend can now validly return zero (success) here,
passing the upper limit on the number of vectors in op->value.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Adapt core x86 and IA64 architecture code for dma_map_ops changes: replace
alloc/free_coherent with generic alloc/free methods.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
[removed swiotlb related changes and replaced it with wrappers,
merged with IA64 patch to avoid inter-patch dependences in intel-iommu code]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Luck <tony.luck@intel.com>
The functions: "acpi_processor_*" sound like they depend on CONFIG_ACPI_PROCESSOR
but in reality they are exposed when CONFIG_CPU_FREQ=[y|m]. As such
update the Kconfig to have this dependency and fix compile issues:
ERROR: "acpi_processor_unregister_performance" [drivers/xen/xen-acpi-processor.ko] undefined!
ERROR: "acpi_processor_notify_smm" [drivers/xen/xen-acpi-processor.ko] undefined!
ERROR: "acpi_processor_register_performance" [drivers/xen/xen-acpi-processor.ko] undefined!
ERROR: "acpi_processor_preregister_performance" [drivers/xen/xen-acpi-processor.ko] undefined!
Note: We still need the CONFIG_ACPI
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When xen_emul_unplug=never is specified on kernel command line
reading files from /sys/hypervisor is broken (returns -EBUSY).
It is caused by xen_bus dependency on platform-pci and
platform-pci isn't initialized when xen_emul_unplug=never is
specified.
Fix it by allowing platform-pci to ignore xen_emul_unplug=never,
and do not intialize xen_[blk|net]front instead.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When booting the kernel under machines that do not have P-states
we would end up with:
------------[ cut here ]------------
WARNING: at drivers/xen/xen-acpi-processor.c:504
xen_acpi_processor_init+0x286/0
x2e0()
Hardware name: ProLiant BL460c G6
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.39-200.0.3.el5uek #1
Call Trace:
[<ffffffff8191d056>] ? xen_acpi_processor_init+0x286/0x2e0
[<ffffffff81068300>] warn_slowpath_common+0x90/0xc0
[<ffffffff8191cdd0>] ? check_acpi_ids+0x1e0/0x1e0
[<ffffffff8106834a>] warn_slowpath_null+0x1a/0x20
[<ffffffff8191d056>] xen_acpi_processor_init+0x286/0x2e0
[<ffffffff8191cdd0>] ? check_acpi_ids+0x1e0/0x1e0
[<ffffffff81002168>] do_one_initcall+0xe8/0x130
.. snip..
Which is OK - the machines do not have P-states, so we fail to register
to process the _PXX states. But there is no need to WARN the user
of it.
Oracle BZ# 13871288
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use 'bool' for boolean variables. Do proper section placement.
Eliminate an unnecessary export.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The pirq_eoi_map is a bitmap offered by Xen to check which pirqs need to
be EOI'd without having to issue an hypercall every time.
We use PHYSDEVOP_pirq_eoi_gmfn_v2 to map the bitmap, then if we
succeed we use pirq_eoi_map to check whether pirqs need eoi.
Changes in v3:
- explicitly use PHYSDEVOP_pirq_eoi_gmfn_v2 rather than
PHYSDEVOP_pirq_eoi_gmfn;
- introduce pirq_check_eoi_map, a function to check if a pirq needs an
eoi using the map;
-rename pirq_needs_eoi into pirq_needs_eoi_flag;
- introduce a function pointer called pirq_needs_eoi that is going to be
set to the right implementation depending on the availability of
PHYSDEVOP_pirq_eoi_gmfn_v2.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
With patch "xen/cpufreq: Disable the cpu frequency scaling drivers
from loading." we do not have to worry about said drivers loading
themselves before the xen-acpi-processor driver. Hence we can remove
the default selection (=y if CPU frequency drivers were built-in, or
=m if CPU frequency drivers were built as modules), and just
select =m for the default case.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This driver solves three problems:
1). Parse and upload ACPI0007 (or PROCESSOR_TYPE) information to the
hypervisor - aka P-states (cpufreq data).
2). Upload the the Cx state information (cpuidle data).
3). Inhibit CPU frequency scaling drivers from loading.
The reason for wanting to solve 1) and 2) is such that the Xen hypervisor
is the only one that knows the CPU usage of different guests and can
make the proper decision of when to put CPUs and packages in proper states.
Unfortunately the hypervisor has no support to parse ACPI DSDT tables, hence it
needs help from the initial domain to provide this information. The reason
for 3) is that we do not want the initial domain to change P-states while the
hypervisor is doing it as well - it causes rather some funny cases of P-states
transitions.
For this to work, the driver parses the Power Management data and uploads said
information to the Xen hypervisor. It also calls acpi_processor_notify_smm()
to inhibit the other CPU frequency scaling drivers from being loaded.
Everything revolves around the 'struct acpi_processor' structure which
gets updated during the bootup cycle in different stages. At the startup, when
the ACPI parser starts, the C-state information is processed (processor_idle)
and saved in said structure as 'power' element. Later on, the CPU frequency
scaling driver (powernow-k8 or acpi_cpufreq), would call the the
acpi_processor_* (processor_perflib functions) to parse P-states information
and populate in the said structure the 'performance' element.
Since we do not want the CPU frequency scaling drivers from loading
we have to call the acpi_processor_* functions to parse the P-states and
call "acpi_processor_notify_smm" to stop them from loading.
There is also one oddity in this driver which is that under Xen, the
physical online CPU count can be different from the virtual online CPU count.
Meaning that the macros 'for_[online|possible]_cpu' would process only
up to virtual online CPU count. We on the other hand want to process
the full amount of physical CPUs. For that, the driver checks if the ACPI IDs
count is different from the APIC ID count - which can happen if the user
choose to use dom0_max_vcpu argument. In such a case a backup of the PM
structure is used and uploaded to the hypervisor.
[v1-v2: Initial RFC implementations that were posted]
[v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>]
[v4: Added vCPU != pCPU support - aka dom0_max_vcpus support]
[v5: Cleaned up the driver, fix bug under Athlon XP]
[v6: Changed the driver to a CPU frequency governor]
[v7: Jan Beulich <jbeulich@suse.com> suggestion to make it a cpufreq scaling driver
made me rework it as driver that inhibits cpufreq scaling driver]
[v8: Per Jan's review comments, fixed up the driver]
[v9: Allow to continue even if acpi_processor_preregister_perf.. fails]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The functions these get passed to have been taking pointers to const
since at least 2.6.16.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Unfortunately xend creates a bogus console/0 frotend/backend entry pair
on xenstore that console backends cannot properly cope with.
Any guest behavior that is not completely ignoring console/0 is going
to either cause problems with xenconsoled or qemu.
Returning 0 or -ENODEV from xencons_probe is not enough because it is
going to cause the frontend state to become 4 or 6 respectively.
The best possible thing we can do here is just ignore the entry from
xenbus_probe_frontend.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The individual drivers' remove functions could legitimately attempt to
access this information (for logging messages if nothing else). Note
that I did not in fact observe a problem anywhere, but I came across
this while looking into the reasons for what turned out to need the
fix at https://lkml.org/lkml/2012/3/5/336 to vsprintf().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
- casting pointers to integer types of different size is being warned on
- an uninitialized variable warning occurred on certain gcc versions
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We use the __pci_reset_function_locked to perform the action.
Also on attaching ("bind") and detaching ("unbind") we save and
restore the configuration states. When the device is disconnected
from a guest we use the "pci_reset_function" to also reset the
device before being passed to another guest.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
So far only the watch path was checked to be zero terminated, while
the watch token was merely assumed to be.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
.. as the rest of the kernel is using that format.
Suggested-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When the initial domain starts, it prints (depending on the
amount of CPUs) a slew of
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
XENBUS: Unable to read cpu state
which provide no useful information - as the error is a valid
issue - but not on the initial domain. The reason is that the
XenStore is not accessible at that time (it is after all the
first guest) so the CPU hotplug watch cannot parse "availability/cpu"
attribute.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The current device suspend/resume phases during system-wide power
transitions appear to be insufficient for some platforms that want
to use the same callback routines for saving device states and
related operations during runtime suspend/resume as well as during
system suspend/resume. In principle, they could point their
.suspend_noirq() and .resume_noirq() to the same callback routines
as their .runtime_suspend() and .runtime_resume(), respectively,
but at least some of them require device interrupts to be enabled
while the code in those routines is running.
It also makes sense to have device suspend-resume callbacks that will
be executed with runtime PM disabled and with device interrupts
enabled in case someone needs to run some special code in that
context during system-wide power transitions.
Apart from this, .suspend_noirq() and .resume_noirq() were introduced
as a workaround for drivers using shared interrupts and failing to
prevent their interrupt handlers from accessing suspended hardware.
It appears to be better not to use them for other porposes, or we may
have to deal with some serious confusion (which seems to be happening
already).
For the above reasons, introduce new device suspend/resume phases,
"late suspend" and "early resume" (and analogously for hibernation)
whose callback will be executed with runtime PM disabled and with
device interrupts enabled and whose callback pointers generally may
point to runtime suspend/resume routines.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Complete the renaming from "flush" to "invalidate" across
both tmem frontends (cleancache and frontswap) and both tmem backends
(Xen and zcache), as required by akpm.
This change is completely cosmetic.
[v10: no change]
[v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 3]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[v11: Remove the frontswap part]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fix these warnings:
drivers/xen/biomerge.c:14:1: warning: data definition has no type or storage class [enabled by default]
drivers/xen/biomerge.c:14:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int]
drivers/xen/biomerge.c:14:1: warning: parameter names (without types) in function declaration [enabled by default]
And this build error:
ERROR: "xen_biovec_phys_mergeable" [drivers/block/nvme.ko] undefined!
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>