This function will initialize everthing necessary so that
devices can do DMA. This includes dma_ops and iommu_ops.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This function will be called before the PCI subsystem is
initialized. Therefore dev_name doen't work and IOMMU
information can't be printed to the klog as before. Move the
code to print that information to a later point where PCI
initializtion has already happened.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
For interrupt remapping the relevant IOMMU initialization
needs to run earlier at boot when the PCI subsystem is not
yet initialized. To support that this patch splits the parts
of IOMMU initialization which need PCI accesses out of the
initial setup path so that this can be done later.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This makes it easier to propagate errors while parsing the
IVRS table and makes the amd_iommu_init_err hack obsolete.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
A few sparse warnings fire in drivers/iommu/amd_iommu_init.c.
Fix most of them with this patch. Also fix the sparse
warnings in drivers/iommu/irq_remapping.c while at it.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
write_file_bool() modifies 32 bits of data, so "amd_iommu_unmap_flush"
needs to be 32 bits as well or we'll corrupt memory. Fortunately it
looks like the data is aligned with a gap after the declaration so this
is harmless in production.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The iommu_shutdown callback is not initialized when the AMD
IOMMU driver runs in passthrough mode. Fix that by moving
the callback initialization before the check for
passthrough mode.
Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
At some point pci_get_bus_and_slot started to enable
interrupts. Since this function is used in the
amd_iommu_resume path it will enable interrupts on resume
which causes a warning. The fix will use a cached pointer
to the root-bridge to re-enable the IOMMU in case the BIOS
is broken.
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
For interrupt remapping the enablement of the IOMMU MSI
interrupt needs to be deferred because the IOMMU itself will
be initialized before the io-apics are up and running. So
the code to setup the MSI is moved seperated from the
hardware-setup routine now.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Unfortunatly the interrupts for the event log and the
peripheral page-faults are only enabled at boot but not
re-enabled at resume. Fix that.
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
When ioremap_nocache fails in iommu initialization the code
calls release_mem_region immediatly. But the function is
called again when the propagates into the upper init
functions leading to a double-free. Fix that.
Reported-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This function is called from enable_iommus(), which in turn is used
from amd_iommu_resume().
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The hardware-initializtion part of the AMD IOMMU driver is
split out into a seperate function. This function can now be
called either from amd_iommu_init() itself or any other
place if the hardware needs to be ready earlier. This will
be used to implement interrupt remapping for AMD.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
To protect the command buffer from hanging when a device
does not respond to an IOTLB invalidation, set a timeout of
1s for outstanding IOTLB invalidations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
If the device starts to use IOMMUv2 features the dma handles
need to stay valid. The only sane way to do this is to use a
identity mapping for the device and not translate it by the
iommu. This is implemented with this patch. Since this lifts
the device-isolation there is also a new kernel parameter
which allows to disable that feature.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
In mixed IOMMU setups this flag inidicates whether an IOMMU
supports the v2 features or not. This patch also adds a
global flag together with a function to query that flag from
other code. The flag shows if at least one IOMMUv2 is in the
system.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Read the number of PASIDs supported by each IOMMU in the
system and take the smallest number as the maximum value
supported by the IOMMU driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Convert the contents of 'struct dev_table_entry' to u64 to
allow updating the DTE wit 64bit writes as required by the
spec.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
A few parts of the driver were missing in drivers/iommu.
Move them there to have the complete driver in that
directory.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The driver contains several loops counting on an u16 value
where the exit-condition is checked against variables that
can have values up to 0xffff. In this case the loops will
never exit. This patch fixed 3 such loops.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Move the interupt handling for the iommu into the interupt
thread to reduce latencies and prepare interupt handling for
pri handling.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds detection of the extended features of an
AMD IOMMU. The available features are printed to dmesg on
boot.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds a flag to the AMD IOMMU driver to indicate
that all IOMMUs present in the system support device IOTLBs.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The old code only flushed a DTE or a domain TLB before it is
actually used by the IOMMU driver. While this is efficient
and works when done right it is more likely to introduce new
bugs when changing code (which happened in the past).
This patch adds code to flush all DTEs and all domain TLBs
in each IOMMU right after it is enabled (at boot and after
resume). This reduces the complexity of the driver and makes
it less likely to introduce stale-TLB bugs in the future.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Some subsystems in the x86 tree need to carry out suspend/resume and
shutdown operations with one CPU on-line and interrupts disabled and
they define sysdev classes and sysdevs or sysdev drivers for this
purpose. This leads to unnecessarily complicated code and excessive
memory usage, so switch them to using struct syscore_ops objects for
this purpose instead.
Generally, there are three categories of subsystems that use
sysdevs for implementing PM operations: (1) subsystems whose
suspend/resume callbacks ignore their arguments entirely (the
majority), (2) subsystems whose suspend/resume callbacks use their
struct sys_device argument, but don't really need to do that,
because they can be implemented differently in an arguably simpler
way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
use their struct sys_device argument, but the value of that argument
is always the same and could be ignored (microcode_core.c). In all
of these cases the subsystems in question may be readily converted to
using struct syscore_ops objects for power management and shutdown.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
AMD's reference BIOS code had a bug that could result in the
firmware failing to reenable the iommu on resume. It
transpires that this causes certain less than desirable
behaviour when it comes to PCI accesses, to whit them ending
up somewhere near Bristol when the more desirable outcome
was Edinburgh. Sadness ensues, perhaps along with filesystem
corruption. Let's make sure that it gets turned back on,
and that we restore its configuration so decisions it makes
bear some resemblance to those made by reasonable people
rather than crack-addled lemurs who spent all your DMA on
Thunderbird.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds a workaround for an IOMMU BIOS problem to
the AMD IOMMU driver. The result of the bug is that the
IOMMU does not execute commands anymore when the system
comes out of the S3 state resulting in system failure. The
bug in the BIOS is that is does not restore certain hardware
specific registers correctly. This workaround reads out the
contents of these registers at boot time and restores them
on resume from S3. The workaround is limited to the specific
IOMMU chipset where this problem occurs.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch moves the setting of the configuration and
feature flags out out the acpi table parsing path and moves
it into the iommu-enable path. This is needed to reliably
fix resume-from-s3.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
We utilize the IOMMU_INIT macros to create this dependency:
[null]
|
[pci_xen_swiotlb_detect]
|
[pci_swiotlb_detect_override]
|
[pci_swiotlb_detect_4gb]
|
+-------+--------+
/ \
[detect_calgary] [gart_iommu_hole_init]
|
[amd_iommu_detect]
Meaning that 'amd_iommu_detect' will be called after
'gart_iommu_hole_init'.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
LKML-Reference: <1282845485-8991-9-git-send-email-konrad.wilk@oracle.com>
CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
CC: Joerg Roedel <joerg.roedel@amd.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We return 1 if the IOMMU has been detected. Zero or an error number
if we failed to find it. This is in preperation of using the IOMMU_INIT
so that we can detect whether an IOMMU is present. I have not
tested this for regression on Calgary, nor on AMD Vi chipsets as
I don't have that hardware.
CC: Muli Ben-Yehuda <muli@il.ibm.com>
CC: "Jon D. Mason" <jdmason@kudzu.us>
CC: "Darrick J. Wong" <djwong@us.ibm.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: David Woodhouse <David.Woodhouse@intel.com>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Yinghai Lu <yinghai@kernel.org>
CC: Joerg Roedel <joerg.roedel@amd.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
LKML-Reference: <1282845485-8991-3-git-send-email-konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch implements a fallback to the GART IOMMU if this
is possible and the AMD IOMMU initialization failed.
Otherwise the fallback would be nommu which is very
problematic on machines with more than 4GB of memory or
swiotlb which hurts io-performance.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
When request_mem_region fails the error path tries to
disable the IOMMUs. This accesses the mmio-region which was
not allocated leading to a kernel crash. This patch fixes
the issue.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds a command line option to tell the AMD IOMMU
driver to not initialize any IOMMU it finds.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
To catch future potential issues we can add a warning whenever we issue
a command before the command buffer is fully initialized.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Hit another kdump problem as reported by Neil Horman. When initializaing
the IOMMU, we attach devices to their domains before the IOMMU is
fully (re)initialized. Attaching a device will issue some important
invalidations. In the context of the newly kexec'd kdump kernel, the
IOMMU may have stale cached data from the original kernel. Because we
do the attach too early, the invalidation commands are placed in the new
command buffer before the IOMMU is updated w/ that buffer. This leaves
the stale entries in the kdump context and can renders device unusable.
Simply enable the IOMMU before we do the attach.
Cc: stable@kernel.org
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Since acpi_table_parse ignores the return values of the
parsing function this patch introduces a workaround and
reports these errors upstream via a global variable.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch moves the initialization of the iommu-api out of
the dma-ops initialization code. This ensures that the
iommu-api is initialized even with iommu=pt.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The assumption that acpi_table_parse passes the return value
of the hanlder function to the caller proved wrong
recently. The return value of the handler function is
totally ignored. This makes the initialization code for AMD
IOMMU buggy in a way that could cause a kernel panic on
initialization. This patch fixes the issue in the AMD IOMMU
driver.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The device change notifier is initialized in the dma_ops
initialization path. But this path is never executed for
iommu=pt. Move the notifier initialization to IOMMU hardware
init code to fix this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The data structure changes to use dev->archdata.iommu field
broke the iommu=pt mode because in this case the
dev->archdata.iommu was left uninitialized. This moves the
inititalization of the devices into the main init function
and fixes the problem.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>