The ARM thread struct allocator is racy on SMP systems. Fix it by turning it into a per-cpu based allocator. This also allows keeps the cache cache warm for thread structs and kernel stacks. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>