This adds code to handle alignment traps generated by the following
SPE (signal processing engine) load/store instructions, by emulating
the instruction in the kernel (as is done for other instructions that
generate alignment traps):
evldd[x] Vector Load Double Word into Double Word [Indexed]
evldw[x] Vector Load Double into Two Words [Indexed]
evldh[x] Vector Load Double into Four Half Words [Indexed]
evlhhesplat[x] Vector Load Half Word into Half Words Even and Splat [Indexed]
evlhhousplat[x] Vector Load Half Word into Half Word Odd Unsigned and Splat [Indexed]
evlhhossplat[x] Vector Load Half Word into Half Word Odd Signed and Splat [Indexed]
evlwhe[x] Vector Load Word into Two Half Words Even [Indexed]
evlwhou[x] Vector Load Word into Two Half Words Odd Unsigned (zero-extended) [Indexed]
evlwhos[x] Vector Load Word into Two Half Words Odd Signed (with sign extension) [Indexed]
evlwwsplat[x] Vector Load Word into Word and Splat [Indexed]
evlwhsplat[x] Vector Load Word into Two Half Words and Splat [Indexed]
evstdd[x] Vector Store Double of Double [Indexed]
evstdw[x] Vector Store Double of Two Words [Indexed]
evstdh[x] Vector Store Double of Four Half Words [Indexed]
evstwhe[x] Vector Store Word of Two Half Words from Even [Indexed]
evstwho[x] Vector Store Word of Two Half Words from Odd [Indexed]
evstwwe[x] Vector Store Word of Word from Even [Indexed]
evstwwo[x] Vector Store Word of Word from Odd [Indexed]
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This adds code to handle alignment traps generated by the following
new floating-point load/store instructions, by emulating the
instruction in the kernel (as is done for other instructions that
generate alignment traps):
lfiwax load floating-point as integer word algebraic indexed
stfiwx store floating-point as integer word indexed
lfdp load floating-point double pair
lfdpx load floating-point double pair indexed
stfdp store floating-point double pair
stfdpx store floating-point double pair indexed
All these except stfiwx are new in POWER6.
lfdp/lfdpx/stfdp/stfdpx load and store 16 bytes of memory into an
even/odd FP register pair. In little-endian mode each 8-byte value is
byte-reversed separately (i.e. not as a 16-byte unit). lfiwax/stfiwx
load or store the lower 4 bytes of a floating-point register from/to
memory; lfiwax sets the upper 4 bytes of the FP register to the sign
extension of the value loaded.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make the alignment exception handler use the new _inatomic variants
of __get/put_user. This fixes erroneous warnings in the very rare
cases where we manage to have copy_tofrom_user_inatomic() trigger
an alignment exception.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/kernel/align.c | 56 ++++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 25 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the PowerPC part of the code to allow processes to change
their endian mode via prctl.
This also extends the alignment exception handler to be able to fix up
alignment exceptions that occur in little-endian mode, both for
"PowerPC" little-endian and true little-endian.
We always enter signal handlers in big-endian mode -- the support for
little-endian mode does not amount to the creation of a little-endian
user/kernel ABI. If the signal handler returns, the endian mode is
restored to what it was when the signal was delivered.
We have two new kernel CPU feature bits, one for PPC little-endian and
one for true little-endian. Most of the classic 32-bit processors
support PPC little-endian, and this is reflected in the CPU feature
table. There are two corresponding feature bits reported to userland
in the AT_HWCAP aux vector entry.
This is based on an earlier patch by Anton Blanchard.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch merges align.c, the result isn't quite what was in ppc64 nor
what was in ppc32 :) It should implement all the functionalities of both
though. Kumar, since you played with that in the past, I suppose you
have some test cases for verifying that it works properly before I dig
out the 601 machine ? :)
Since it's likely that I won't be able to test all scenario, code
inspection is much welcome.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The recent merge of fpu.S broken the handling of fpscr for
ARCH=powerpc and CONFIG_PPC64=y. FP registers could be corrupted,
leading to strange random application crashes.
The confusion arises, because the thread_struct has (and requires) a
64-bit area to save the fpscr, because we use load/store double
instructions to get it in to/out of the FPU. However, only the low
32-bits are actually used, so we want to treat it as a 32-bit quantity
when manipulating its bits to avoid extra load/stores on 32-bit. This
patch replaces the current definition with a structure of two 32-bit
quantities (pad and val), to clarify things as much as is possible.
The 'val' field is used when manipulating bits, the structure itself
is used when obtaining the address for loading/unloading the value
from the FPU.
While we're at it, consolidate the 4 (!) almost identical versions of
cvt_fd() and cvt_df() (arch/ppc/kernel/misc.S,
arch/ppc64/kernel/misc.S, arch/powerpc/kernel/misc_32.S,
arch/powerpc/kernel/misc_64.S) into a single version in fpu.S. The
new version takes a pointer to thread_struct and applies the correct
offset itself, rather than a pointer to the fpscr field itself, again
to avoid confusion as to which is the correct field to use.
Finally, this patch makes ARCH=ppc64 also use the consolidated fpu.S
code, which it previously did not.
Built for G5 (ARCH=ppc64 and ARCH=powerpc), 32-bit powermac (ARCH=ppc
and ARCH=powerpc) and Walnut (ARCH=ppc, CONFIG_MATH_EMULATION=y).
Booted on G5 (ARCH=powerpc) and things which previously fell over no
longer do.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!