Need this in the next patch in time_init and that happens early.
This includes a minor fix on i386 where early_intel_workarounds()
[which is now called early_init_intel] really executes early as
the comments say.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
map vsyscalls early enough. This is important if a __vsyscall_fn
function is used by other kernel code too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
According to Intel RDTSC can be always synchronized with LFENCE
on all current CPUs. Implement the necessary CPUID bit for that.
It is unclear yet if that is true for all future CPUs too,
but if there's another way the kernel can be always updated.
Cc: asit.k.mallick@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
According to AMD RDTSC can be synchronized through MFENCE.
Implement the necessary CPUID bit for that.
Cc: andreas.herrmann3@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch moves definitions that are present in only one of the files
(between processor_32.h and processor_64.h), to processor.h. They're mostly
structures and function definitions.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds basic runtime services support for EFI x86_64 system. The
main file of the patch is the addition of efi_64.c for x86_64. This file is
modeled after the EFI IA32 avatar. EFI runtime services initialization are
implemented in efi_64.c. Some x86_64 specifics are worth noting here. On
x86_64, parameters passed to EFI firmware services need to follow the EFI
calling convention. For this purpose, a set of functions named efi_call<x>
(<x> is the number of parameters) are implemented. EFI function calls are
wrapped before calling the firmware service. The duplicated code between
efi_32.c and efi_64.c is placed in efi.c to remove them from efi_32.c.
Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch prepares the x86_64 architecture initialization for
paravirt. It requires a memory initialization step, which is done
by implementing 64-bit version for machine_specific_memory_setup,
and putting an ARCH_SETUP hook, for guest-dependent initialization.
This last step is done akin to i386
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Resend using different mail client
Changes to the last version:
- split implementation into two layers: ds/bts and ptrace
- renamed TIF's
- save/restore ds save area msr in __switch_to_xtra()
- make block-stepping only look at BTF bit
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The patch to suppress bitops-related warnings added a pile of ugly
casts. Many of these were related to the management of x86 CPU
capabilities. Clean these up by adding specific set/clear_cpu_cap
macros, and use them consistently.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We need to store core id bits to cpuinfo_x86 in early_identify_cpu. So we
use it to create acpiid_to_node array in k8topolgy.c
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch removes the extern struct resource declarations for
data_resource, code_resource and bss_resource on x86 and declares that
three structures as static as done on other architectures like IA64.
On i386, these structures are moved to setup_32.c (from e820_32.c) because
that's code that is not specific to e820 and also required on EFI systems.
That makes the "extern" reference superfluous.
On x86_64, data_resource, code_resource and bss_resource are passed to
e820_reserve_resources() as arguments just as done on i386 and IA64. That
also avoids the "extern" reference and it's possible to make it static.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The commit 399287229c hacked the
ioapic resource mapping into apic.c for no good reason.
Move the code into io_apic_64.c where it belongs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the mce related declarations where they belong, fix the
users and remove 32bit dependency in mce.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move acpi/pci related declarations to the correct headers
and remove the duplicate.
Build fix from: Andrew Morton
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86: provide a DMI based port 0x80 I/O delay override.
Certain (HP) laptops experience trouble from our port 0x80 I/O delay
writes. This patch provides for a DMI based switch to the "alternate
diagnostic port" 0xed (as used by some BIOSes as well) for these.
David P. Reed confirmed that port 0xed works for him and provides a
proper delay. The symptoms of _not_ working are a hanging machine,
with "hwclock" use being a direct trigger.
Earlier versions of this attempted to simply use udelay(2), with the
2 being a value tested to be a nicely conservative upper-bound with
help from many on the linux-kernel mailinglist but that approach has
two problems.
First, pre-loops_per_jiffy calibration (which is post PIT init while
some implementations of the PIT are actually one of the historically
problematic devices that need the delay) udelay() isn't particularly
well-defined. We could initialise loops_per_jiffy conservatively (and
based on CPU family so as to not unduly delay old machines) which
would sort of work, but...
Second, delaying isn't the only effect that a write to port 0x80 has.
It's also a PCI posting barrier which some devices may be explicitly
or implicitly relying on. Alan Cox did a survey and found evidence
that additionally some drivers may be racy on SMP without the bus
locking outb.
Switching to an inb() makes the timing too unpredictable and as such,
this DMI based switch should be the safest approach for now. Any more
invasive changes should get more rigid testing first. It's moreover
only very few machines with the problem and a DMI based hack seems
to fit that situation.
This also introduces a command-line parameter "io_delay" to override
the DMI based choice again:
io_delay=<standard|alternate>
where "standard" means using the standard port 0x80 and "alternate"
port 0xed.
This retains the udelay method as a config (CONFIG_UDELAY_IO_DELAY) and
command-line ("io_delay=udelay") choice for testing purposes as well.
This does not change the io_delay() in the boot code which is using
the same port 0x80 I/O delay but those do not appear to be a problem
as David P. Reed reported the problem was already gone after using the
udelay version. He moreover reported that booting with "acpi=off" also
fixed things and seeing as how ACPI isn't touched until after this DMI
based I/O port switch I believe it's safe to leave the ones in the boot
code be.
The DMI strings from David's HP Pavilion dv9000z are in there already
and we need to get/verify the DMI info from other machines with the
problem, notably the HP Pavilion dv6000z.
This patch is partly based on earlier patches from Pavel Machek and
David P. Reed.
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
92cb7612ae sets cpu_info->cpu_index to zero
for no reason. Referencing cpu_info->cpu_index now points always to CPU#0,
which is apparently not what we want.
Remove it.
Spotted-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix regressions introduced with 92cb7612ae.
It can happen that cpuinfo is displayed for CPUs that are not online or
even worse for CPUs not present at all. As an example, following was
shown for a "second" CPU of a single core K8 variant:
processor : 0
vendor_id : unknown
cpu family : 0
model : 0
model name : unknown
stepping : 0
cache size : 0 KB
fpu : yes
fpu_exception : yes
cpuid level : 0
wp : yes
flags :
bogomips : 0.00
clflush size : 0
cache_alignment : 0
address sizes : 0 bits physical, 0 bits virtual
power management:
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Additional CPUID strings (sse4_1, sse4_2, sse5, skinit, wdt); fix the
positioning of the AMD ecx strings (cr8_legacy was duplicated under
two different names, so the alignment of all the other strings were
off by one.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add the BSS to the resource tree just as kernel text and kernel data are in
the resource tree. The main reason behind this is to avoid crashkernel
reservation in that area.
While it's not strictly necessary to have the BSS in the resource tree (the
actual collision detection is done in the reserve_bootmem() function before),
the usage of the BSS resource should be presented to the user in /proc/iomem
just as Kernel data and Kernel code.
Note: The patch currently is only implemented for x86 and ia64 (because
efi_initialize_iomem_resources() has the same signature on i386 and ia64).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the crashkernel parsing from
arch/x86_64/kernel/machine_kexec.c and calls the generic function, introduced
in the last patch, in setup_bootmem_allocator().
This is necessary because the amount of System RAM must be known in this
function now because of the new syntax.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cpu_data is currently an array defined using NR_CPUS. This means that
we overallocate since we will rarely really use maximum configured cpus.
When NR_CPU count is raised to 4096 the size of cpu_data becomes
3,145,728 bytes.
These changes were adopted from the sparc64 (and ia64) code. An
additional field was added to cpuinfo_x86 to be a non-ambiguous cpu
index. This corresponds to the index into a cpumask_t as well as the
per_cpu index. It's used in various places like show_cpuinfo().
cpu_data is defined to be the boot_cpu_data structure for the NON-SMP
case.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch converts the x86_cpu_to_apicid array to be a per cpu
variable. This saves sizeof(apicid) * NR unused cpus. Access is mostly
from startup and CPU HOTPLUG functions.
MP_processor_info() is one of the functions that require access to the
x86_cpu_to_apicid array before the per_cpu data area is setup. For this
case, a pointer to the __initdata array is initialized in setup_arch()
and removed in smp_prepare_cpus() after the per_cpu data area is
initialized.
A second change is included to change the initial array value of ARCH
i386 from 0xff to BAD_APICID to be consistent with ARCH x86_64.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Miscellaneous x86 stuff that can live in .rodata.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is from an earlier message from 'Christoph Lameter':
cpu_core_map is currently an array defined using NR_CPUS. This means that
we overallocate since we will rarely really use maximum configured cpu.
If we put the cpu_core_map into the per cpu area then it will be allocated
for each processor as it comes online.
This means that the core map cannot be accessed until the per cpu area
has been allocated. Xen does a weird thing here looping over all processors
and zeroing the masks that are not yet allocated and that will be zeroed
when they are allocated. I commented the code out.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since the x86 merge, lots of files that referenced their own filenames
are no longer correct. Rather than keep them up to date, just delete
them, as they add no real value.
Additionally:
- fix up comment formatting in scx200_32.c
- Remove a credit from myself in setup_64.c from a time when we had no SCM
- remove longwinded history from tsc_32.c which can be figured out from
git.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
AMDs C1E enabled CPUs stop the local apic timer, when both cores are
idle. This is a hardware feature which breaks highres/dynticks.
Add the same quirk as we have for 32 bit already.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
This reverts commit e66485d747, since
Rafael Wysocki noticed that the change only works for his in -mm, not in
mainline (and that both "noapictimer" _and_ "apicmaintimer" are broken
on his hardware, but that's apparently not a regression, just a symptom
of the same issue that causes the automatic apic timer disable to not
work).
It turns out that it really doesn't work correctly on x86-64, since
x86-64 doesn't use the generic clock events for timers yet.
Thanks to Rafal for testing, and here's the ugly details on x86-64 as
per Thomas:
"I just looked into the code and the logic vs. noapictimer on SMP is
completely broken.
On i386 the noapictimer option not only disables the local APIC
timer, it also registers the CPUs for broadcasting via IPI on SMP
systems.
The x86-64 code uses the broadcast only when the local apic timer is
active, i.e. "noapictimer" is not on the command line. This defeats
the whole purpose of "noapictimer". It should be there to make boxen
work, where the local APIC timer actually has a hardware problem,
e.g. the nx6325.
The current implementation of x86_64 only fixes the ACPI c-states
related problem where the APIC timer stops in C3(2), nothing else.
On nx6325 and other AMD X2 equipped systems which have the C1E
enabled we run into the following:
PIT keeps jiffies (and the system) running, but the local APIC timer
interrupts can get out of sync due to this C1E effect.
I don't think this is a critical problem, but it is wrong
nevertheless.
I think it's safe to revert the C1E patch and postpone the fix to the
clock events conversion."
On further reflection, Thomas noted:
"It's even worse than I thought on the first check:
"noapictimer" on the command line of an SMP box prevents _ONLY_ the
boot CPU apic timer from being used. But the secondary CPU is still
unconditionally setting up the APIC timer and uses the non
calibrated variable calibration_result, which is of course 0, to
setup the APIC timer. Wreckage guaranteed."
so we'll just have to wait for the x86 merge to hopefully fix this up
for x86-64.
Tested-and-requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit 3556ddfa92 titled
[PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E
solves a problem with AMD dual core laptops e.g. HP nx6325 (Turion 64
X2) with C1E enabled:
When both cores go into idle at the same time, then the system switches
into C1E state, which is basically the same as C3. This stops the local
apic timer.
This was debugged right after the dyntick merge on i386 and despite the
patch title it fixes only the 32 bit path.
x86_64 is still missing this fix. It seems that mainline is not really
affected by this issue, as the PIT is running and keeps jiffies
incrementing, but that's just waiting for trouble.
-mm suffers from this problem due to the x86_64 high resolution timer
patches.
This is a quick and dirty port of the i386 code to x86_64.
I spent quite a time with Rafael to debug the -mm / hrt wreckage until
someone pointed us to this. I really had forgotten that we debugged this
half a year ago already.
Sigh, is it just me or is there something yelling arch/x86 into my ear?
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Restore the 2.6.22 CONFIG_ACPI_SLEEP build option, but now shadowing the
new CONFIG_PM_SLEEP option.
Signed-off-by: Len Brown <len.brown@intel.com>
[ Modified to work with the PM config setup changes. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As it was a synonym for (CONFIG_ACPI && CONFIG_X86),
the ifdefs for it were more clutter than they were worth.
For ia64, just add a few stubs in anticipation of future
S3 or S4 support.
Signed-off-by: Len Brown <len.brown@intel.com>
Previously this flag was only used on 32bit, but some shared code can use
it now.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With that an L3 cache is correctly reported in the cache information in /sys
With fixes from Andreas Herrmann and Dean Gaudet and Joachim Deguara
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some Intel features are spread around in different CPUID leafs like 0x5,
0x6 and 0xA. Make this feature detection code common across i386 and
x86_64.
Display Intel Dynamic Acceleration feature in /proc/cpuinfo. This feature
will be enabled automatically by current acpi-cpufreq driver.
Refer to Intel Software Developer's Manual for more details about the feature.
Thanks to hpa (H Peter Anvin) for the making the actual code detecting the
scattered features data-driven.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unify the handling of the CPU features vectors between i386 and x86-64.
This also adopts the collapsing of features which are required at
compile-time into constant tests from x86-64 to i386.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code was ok, but triggered warnings for calling __init from
__cpuinit. Instead call it from check_bugs instead.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was broken. It adds complexity, for no good reason. Rather than
separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
and preferably __pa() too - and just use "virt_to_phys()" instead, which
is more readable and has nicer semantics.
However, right now, just undo the separation, and make __pa_symbol() be
the exact same as __pa(). That fixes the bugs this patch introduced,
and we can do the fairly obvious cleanups later.
Do the new __phys_addr() function (which is now the actual workhorse for
the unified __pa()/__pa_symbol()) as a real external function, that way
all the potential issues with compile/link-time optimizations of
constant symbol addresses go away, and we can also, if we choose to, add
more sanity-checking of the argument.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It doesn't put the CPU into deeper sleep states, so it's better to use the standard
idle loop to save power. But allow to reenable it anyways for benchmarking.
I also removed the obsolete idle=halt on i386
Cc: andreas.herrmann@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
remove the reporting of the constant_tsc flag from the "power management"
field in /proc/cpuinfo. The NULL value there was replaced by "" because
the former would result in a printout of [8] if the flag is set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Currently __pa_symbol is for use with symbols in the kernel address
map and __pa is for use with pointers into the physical memory map.
But the code is implemented so you can usually interchange the two.
__pa which is much more common can be implemented much more cheaply
if it is it doesn't have to worry about any other kernel address
spaces. This is especially true with a relocatable kernel as
__pa_symbol needs to peform an extra variable read to resolve
the address.
There is a third macro that is added for the vsyscall data
__pa_vsymbol for finding the physical addesses of vsyscall pages.
Most of this patch is simply sorting through the references to
__pa or __pa_symbol and using the proper one. A little of
it is continuing to use a physical address when we have it
instead of recalculating it several times.
swapper_pgd is now NULL. leave_mm now uses init_mm.pgd
and init_mm.pgd is initialized at boot (instead of compile time)
to the physmem virtual mapping of init_level4_pgd. The
physical address changed.
Except for the for EMPTY_ZERO page all of the remaining references
to __pa_symbol appear to be during kernel initialization. So this
should reduce the cost of __pa in the common case, even on a relocated
kernel.
As this is technically a semantic change we need to be on the lookout
for anything I missed. But it works for me (tm).
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
With the rewrite of the SMP trampoline and the early page
allocator there is nothing that needs identity mapped pages,
once we start executing C code.
So add zap_identity_mappings into head64.c and remove
zap_low_mappings() from much later in the code. The functions
are subtly different thus the name change.
This also kills boot_level4_pgt which was from an earlier
attempt to move the identity mappings as early as possible,
and is now no longer needed. Essentially I have replaced
boot_level4_pgt with trampoline_level4_pgt in trampoline.S
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This modifies the SMP trampoline and all of the associated code so
it can jump to a 64bit kernel loaded at an arbitrary address.
The dependencies on having an idenetity mapped page in the kernel
page tables for SMP bootup have all been removed.
In addition the trampoline has been modified to verify
that long mode is supported. Asking if long mode is implemented is
down right silly but we have traditionally had some of these checks,
and they can't hurt anything. So when the totally ludicrous happens
we just might handle it correctly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>