Combine all shash algs to be registered and use new crypto_[un]register_shashes
functions. This simplifies init/exit code.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current code only increments the upper 64 bits of the SHA-512 byte
counter when the number of bytes hashed happens to hit 2^64 exactly.
This patch increments the upper 64 bits whenever the lower 64 bits
overflows.
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use standard ror64() instead of hand-written.
There is no standard ror64, so create it.
The difference is shift value being "unsigned int" instead of uint64_t
(for which there is no reason). gcc starts to emit native ROR instructions
which it doesn't do for some reason currently. This should make the code
faster.
Patch survives in-tree crypto test and ping flood with hmac(sha512) on.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unfortunately in reducing W from 80 to 16 we ended up unrolling
the loop twice. As gcc has issues dealing with 64-bit ops on
i386 this means that we end up using even more stack space (>1K).
This patch solves the W reduction by moving LOAD_OP/BLEND_OP
into the loop itself, thus avoiding the need to duplicate it.
While the stack space still isn't great (>0.5K) it is at least
in the same ball park as the amount of stack used for our C sha1
implementation.
Note that this patch basically reverts to the original code so
the diff looks bigger than it really is.
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The previous patch used the modulus operator over a power of 2
unnecessarily which may produce suboptimal binary code. This
patch changes changes them to binary ands instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use standard ror64() instead of hand-written.
There is no standard ror64, so create it.
The difference is shift value being "unsigned int" instead of uint64_t
(for which there is no reason). gcc starts to emit native ROR instructions
which it doesn't do for some reason currently. This should make the code
faster.
Patch survives in-tree crypto test and ping flood with hmac(sha512) on.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For rounds 16--79, W[i] only depends on W[i - 2], W[i - 7], W[i - 15] and W[i - 16].
Consequently, keeping all W[80] array on stack is unnecessary,
only 16 values are really needed.
Using W[16] instead of W[80] greatly reduces stack usage
(~750 bytes to ~340 bytes on x86_64).
Line by line explanation:
* BLEND_OP
array is "circular" now, all indexes have to be modulo 16.
Round number is positive, so remainder operation should be
without surprises.
* initial full message scheduling is trimmed to first 16 values which
come from data block, the rest is calculated before it's needed.
* original loop body is unrolled version of new SHA512_0_15 and
SHA512_16_79 macros, unrolling was done to not do explicit variable
renaming. Otherwise it's the very same code after preprocessing.
See sha1_transform() code which does the same trick.
Patch survives in-tree crypto test and original bugreport test
(ping flood with hmac(sha512).
See FIPS 180-2 for SHA-512 definition
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
commit f9e2bca6c2
aka "crypto: sha512 - Move message schedule W[80] to static percpu area"
created global message schedule area.
If sha512_update will ever be entered twice, hash will be silently
calculated incorrectly.
Probably the easiest way to notice incorrect hashes being calculated is
to run 2 ping floods over AH with hmac(sha512):
#!/usr/sbin/setkey -f
flush;
spdflush;
add IP1 IP2 ah 25 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000025;
add IP2 IP1 ah 52 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000052;
spdadd IP1 IP2 any -P out ipsec ah/transport//require;
spdadd IP2 IP1 any -P in ipsec ah/transport//require;
XfrmInStateProtoError will start ticking with -EBADMSG being returned
from ah_input(). This never happens with, say, hmac(sha1).
With patch applied (on BOTH sides), XfrmInStateProtoError does not tick
with multiple bidirectional ping flood streams like it doesn't tick
with SHA-1.
After this patch sha512_transform() will start using ~750 bytes of stack on x86_64.
This is OK for simple loads, for something more heavy, stack reduction will be done
separatedly.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces the 32-bit counters in sha512_generic with
64-bit counters. It also switches the bit count to the simpler
byte count.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch renames struct sha512_ctx and exports it as struct
sha512_state so that other sha512 implementations can use it
as the reference structure for exporting their state.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch changes sha512 and sha384 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The message schedule W (u64[80]) is too big for the stack. In order
for this algorithm to be used with shash it is moved to a static
percpu area.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On Thu, Mar 27, 2008 at 03:40:36PM +0100, Bodo Eggert wrote:
> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
>
> > This patch cleanups the crypto code, replaces the init() and fini()
> > with the <algorithm name>_init/_fini
>
> This part ist OK.
>
> > or init/fini_<algorithm name> (if the
> > <algorithm name>_init/_fini exist)
>
> Having init_foo and foo_init won't be a good thing, will it? I'd start
> confusing them.
>
> What about foo_modinit instead?
Thanks for the suggestion, the init() is replaced with
<algorithm name>_mod_init ()
and fini () is replaced with <algorithm name>_mod_fini.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rename sha512 to sha512_generic and add a MODULE_ALIAS for sha512
so all sha512 implementations can be loaded automatically.
Keep the broken tabs so git recognizes this as a rename.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Not architecture specific code should not #include <asm/scatterlist.h>.
This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There are currently several SHA implementations that all define their own
initialization vectors and size values. Since this values are idential
move them to a header file under include/crypto.
Signed-off-by: Jan Glauber <jang@de.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The SHA384 block size should be 128 bytes, not 96 bytes. This was
spotted by Andrew Donofrio.
Fortunately the block size isn't actually used anywhere so this typo
has had no real impact.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now algorithms have been happy to get a context pointer since
they know everything that's in the tfm already (e.g., alignment, block
size).
However, once we have parameterised algorithms, such information will
be specific to each tfm. So the algorithm API needs to be changed to
pass the tfm structure instead of the context pointer.
This patch is basically a text substitution. The only tricky bit is
the assembly routines that need to get the context pointer offset
through asm-offsets.h.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Various digest algorithms operate one block at a time and therefore
keep a temporary buffer of partial blocks. This buffer does not need
to be initialised since there is a counter which indicates what is and
isn't valid in it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some hash modules load/store data words directly. The digest layer
should pass properly aligned buffer to update()/final() method. This
patch also add cra_alignmask to some hash modules.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A lot of crypto code needs to read/write a 32-bit/64-bit words in a
specific gender. Many of them open code them by reading/writing one
byte at a time. This patch converts all the applicable usages over
to use the standard byte order macros.
This is based on a previous patch by Denis Vlasenko.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!