commit 746357d (x86: Prevent GCC 4.4.x (pentium-mmx et al) function
prologue wreckage) uses -mtune=generic to work around the function
prologue problem with mcount on -march=pentium-mmx and others.
Jakub pointed out that we can use -maccumulate-outgoing-args instead
which is selected by -mtune=generic and prevents the problem without
losing the -march specific optimizations.
Pointed-out-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
When the kernel is compiled with -pg for tracing GCC 4.4.x inserts
stack alignment of a function _before_ the mcount prologue if the
-march=pentium-mmx is set and -mtune=generic is not set. This breaks
the assumption of the function graph tracer which expects that the
mcount prologue
push %ebp
mov %esp, %ebp
is the first stack operation in a function because it needs to modify
the function return address on the stack to trap into the tracer
before returning to the real caller.
The generated code is:
push %edi
lea 0x8(%esp),%edi
and $0xfffffff0,%esp
pushl -0x4(%edi)
push %ebp
mov %esp,%ebp
so the tracer modifies the copy of the return address which is stored
after the stack alignment and therefor does not trap the return which
in turn breaks the call chain logic of the tracer and leads to a
kernel panic.
Aside of the fact that the generated code is horrible for no good
reason other -march -mtune options generate the expected:
push %ebp
mov %esp,%ebp
and $0xfffffff0,%esp
which does the same and keeps everything intact.
After some experimenting we found out that this problem is restricted
to gcc4.4.x and to the following -march settings:
i586, pentium, pentium-mmx, k6, k6-2, k6-3, winchip-c6, winchip2, c3,
geode
By adding -mtune=generic the code generator produces always the
expected code.
So forcing -mtune=generic when CONFIG_FUNCTION_GRAPH_TRACER=y is not
pretty, but at the moment the only way to prevent that the kernel
trips over gcc-shrooms induced code madness.
Most distro kernels have CONFIG_X86_GENERIC=y anyway which forces
-mtune=generic as well so it will not impact those.
References: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109http://lkml.org/lkml/2009/11/19/17
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.0911200206570.24119@localhost.localdomain>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
Cc: Jeff Law <law@redhat.com>
Cc: gcc@gcc.gnu.org
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Andrew Haley <aph@redhat.com>
Cc: Richard Guenther <richard.guenther@gmail.com>
Cc: stable@kernel.org
Add CPU optimizations for AMD Geode LX.
Signed-off-by: Matteo Croce <technoboy85@gmail.com>
LKML-Reference: <40101cc30910010811v5d15ff4cx9dd57c9cc9b4b045@mail.gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add another option when selecting CPU family so the kernel can be
optimized for Intel Atom CPUs. If GCC supports tuning options for
Intel Atom they will be used.
Signed-off-by: Tobias Doerffel <tobias.doerffel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1251018457-19157-1-git-send-email-tobias.doerffel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Winchip-2 and Winchip-2A cpu choices select the
same options for kernel and compiler.
Merge them to save few bytes and reduce confusion.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
binutils, contrary to documented behaviour, will generate long NOPs (a
P6-or-higher instruction which is broken on at least some VIA chips,
Virtual PC/Virtual Server, and some versions of Qemu) depending on the
-mtune= option, which is not supposed to change architectural
behaviour.
Pass an explicit override to the assembler, in case ends up passing
the -mtune= parameter to gas (gcc 4.3.0 does not appear to.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Moving the ARCH specific Makefiles for i386 and x86_64
required a litle bit tweaking in the top-lvel Makefile.
SRCARCH is now set in the top-level Makefile
because we need this info to include the correct
arch Makefile.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Hi!
I sent this simple patch to lkml about two weeks ago and also cc'ed
to Linus, but seems that the patch got ignored. I decided to write to
you, because you have modified the relevant file most recently.
Below is a copy of the mail that is also available at
<http://lkml.org/lkml/2007/2/28/230>.
Signed-off-by: Andi Kleen <ak@suse.de>
The VIA C7 is a 686 (with TSC) that supports MMX, SSE and SSE2, it also has
a cache line length of 64 according to
http://www.digit-life.com/articles2/cpu/rmma-via-c7.html. This patch sets
gcc to -march=686 and select s the correct cache shift.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
gcc doesn't support -mtune=core2 yet, but will be soon. Use -mtune=generic or -mtune=i686
as fallback
TBD need benchmarking for INTEL_USERCOPY etc. So far I used the same defaults as MPENTIUMM
Signed-off-by: Andi Kleen <ak@suse.de>
The upcomming gcc 4.2 got a new option -mtune=generic to tune
code for both common AMD and Intel CPUs. Use this option
when available for generic kernels.
On x86-64 it is used with CONFIG_GENERIC_CPU. On i386 it is
enabled with CONFIG_X86_GENERIC. It won't affect the base
line CPU support in any ways and also not the minimum supported CPU.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove various things which were checking for gcc-1.x and gcc-2.x compilers.
From: Adrian Bunk <bunk@stusta.de>
Some documentation updates and removes some code paths for gcc < 3.2.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I just noted that -mtune is used, which is only supported on recent GCCs; by
reading http://gcc.gnu.org/gcc-3.4/changes.html, you see "-mcpu has been
renamed to -mtune.", so for GCC < 3.4 we're not using any specific tuning in
the appropriate cases. However -mcpu is deprecated, so use -mtune when
possible.
This was introduced by commit e9d4dce954a60dc23dd1d967766ca2347b780e54 of the
old tree (between 2.6.10-rc3 and 2.6.10) by Linus Torvalds, to remove the use
of -march, since that could trigger gcc using SSE on its own. But no
attention was used about using -mcpu vs. -mtune.
And btw, the old 2.6.4 code (for instance) was:
cflags-$(CONFIG_MPENTIUMII) += $(call check_gcc,-march=pentium2,-march=i686)
cflags-$(CONFIG_MPENTIUMIII) += $(call check_gcc,-march=pentium3,-march=i686)
cflags-$(CONFIG_MPENTIUMM) += $(call check_gcc,-march=pentium3,-march=i686)
cflags-$(CONFIG_MPENTIUM4) += $(call check_gcc,-march=pentium4,-march=i686)
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make UML share the underlying cpu-specific tuning done on i386.
Actually, for now many config options aren't used a lot - but that can be done
later. Also, UML relies on GCC optimization for things like memcpy and such
more than i386, so specifying the correct -march and -mtune should be enough.
Later, we may want to correct some other stuff.
For instance, since FPU context switching, for us, is done (at least
partially, i.e. between our kernelspace and userspace) by the host, we may
allow usage of FPU operations by GCC. This doesn't hold for kernelspace vs.
kernelspace, but we don't support preemption.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>