AEAD key parsing is duplicated to multiple places in the kernel. Add a
common helper function to consolidate that functionality.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When comparing MAC hashes, AEAD authentication tags, or other hash
values in the context of authentication or integrity checking, it
is important not to leak timing information to a potential attacker,
i.e. when communication happens over a network.
Bytewise memory comparisons (such as memcmp) are usually optimized so
that they return a nonzero value as soon as a mismatch is found. E.g,
on x86_64/i5 for 512 bytes this can be ~50 cyc for a full mismatch
and up to ~850 cyc for a full match (cold). This early-return behavior
can leak timing information as a side channel, allowing an attacker to
iteratively guess the correct result.
This patch adds a new method crypto_memneq ("memory not equal to each
other") to the crypto API that compares memory areas of the same length
in roughly "constant time" (cache misses could change the timing, but
since they don't reveal information about the content of the strings
being compared, they are effectively benign). Iow, best and worst case
behaviour take the same amount of time to complete (in contrast to
memcmp).
Note that crypto_memneq (unlike memcmp) can only be used to test for
equality or inequality, NOT for lexicographical order. This, however,
is not an issue for its use-cases within the crypto API.
We tried to locate all of the places in the crypto API where memcmp was
being used for authentication or integrity checking, and convert them
over to crypto_memneq.
crypto_memneq is declared noinline, placed in its own source file,
and compiled with optimizations that might increase code size disabled
("Os") because a smart compiler (or LTO) might notice that the return
value is always compared against zero/nonzero, and might then
reintroduce the same early-return optimization that we are trying to
avoid.
Using #pragma or __attribute__ optimization annotations of the code
for disabling optimization was avoided as it seems to be considered
broken or unmaintained for long time in GCC [1]. Therefore, we work
around that by specifying the compile flag for memneq.o directly in
the Makefile. We found that this seems to be most appropriate.
As we use ("Os"), this patch also provides a loop-free "fast-path" for
frequently used 16 byte digests. Similarly to kernel library string
functions, leave an option for future even further optimized architecture
specific assembler implementations.
This was a joint work of James Yonan and Daniel Borkmann. Also thanks
for feedback from Florian Weimer on this and earlier proposals [2].
[1] http://gcc.gnu.org/ml/gcc/2012-07/msg00211.html
[2] https://lkml.org/lkml/2013/2/10/131
Signed-off-by: James Yonan <james@openvpn.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption
and around 25% for decryption. This implementation of the AES algorithm
does not rely on any lookup tables so it is believed to be invulnerable
to cache timing attacks.
This algorithm processes up to 8 blocks in parallel in constant time. This
means that it is not usable by chaining modes that are strictly sequential
in nature, such as CBC encryption. CBC decryption, however, can benefit from
this implementation and runs about 25% faster. The other chaining modes
implemented in this module, XTS and CTR, can execute fully in parallel in
both directions.
The core code has been adopted from the OpenSSL project (in collaboration
with the original author, on cc). For ease of maintenance, this version is
identical to the upstream OpenSSL code, i.e., all modifications that were
required to make it suitable for inclusion into the kernel have been made
upstream. The original can be found here:
http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130
Note to integrators:
While this implementation is significantly faster than the existing table
based ones (generic or ARM asm), especially in CTR mode, the effects on
power efficiency are unclear as of yet. This code does fundamentally more
work, by calculating values that the table based code obtains by a simple
lookup; only by doing all of that work in a SIMD fashion, it manages to
perform better.
Cc: Andy Polyakov <appro@openssl.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This patch fixes lack of license, otherwise x509_key_parser.ko taints kernel.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Only public keys, with certificates signed by an existing
'trusted' key on the system trusted keyring, should be added
to a trusted keyring. This patch adds support for verifying
a certificate's signature.
This is derived from David Howells pkcs7_request_asymmetric_key() patch.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
The keyring expansion patches introduces a new search method by which
key_search() attempts to walk directly to the key that has exactly the same
description as the requested one.
However, this causes inexact matching of asymmetric keys to fail. The
solution to this is to select iterative rather than direct search as the
default search type for asymmetric keys.
As an example, the kernel might have a key like this:
Magrathea: Glacier signing key: 6a2a0f82bad7e396665f465e4e3e1f9bd24b1226
and:
keyctl search <keyring-ID> asymmetric id:d24b1226
should find the key, despite that not being its exact description.
Signed-off-by: David Howells <dhowells@redhat.com>
Remove the certificate date checks that are performed when a certificate is
parsed. There are two checks: a valid from and a valid to. The first check is
causing a lot of problems with system clocks that don't keep good time and the
second places an implicit expiry date upon the kernel when used for module
signing, so do we really need them?
Signed-off-by: David Howells <dhowells@redhat.com>
cc: David Woodhouse <dwmw2@infradead.org>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Josh Boyer <jwboyer@redhat.com>
cc: Alexander Holler <holler@ahsoftware.de>
cc: stable@vger.kernel.org
Handle certificates that lack an authorityKeyIdentifier field by assuming
they're self-signed and checking their signatures against themselves.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Check that the algorithm IDs obtained from the ASN.1 parse by OID lookup
corresponds to algorithms that are available to us.
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Embed a public_key_signature struct in struct x509_certificate, eliminating
now unnecessary fields, and split x509_check_signature() to create a filler
function for it that attaches a digest of the signed data and an MPI that
represents the signature data. x509_free_certificate() is then modified to
deal with these.
Whilst we're at it, export both x509_check_signature() and the new
x509_get_sig_params().
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Modify public_key_verify_signature() so that it now takes a public_key struct
rather than a key struct and supply a wrapper that takes a key struct. The
wrapper is then used by the asymmetric key subtype and the modified function is
used by X.509 self-signature checking and can be used by other things also.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Store public key algo ID in public_key struct for reference purposes. This
allows it to be removed from the x509_certificate struct and used to find a
default in public_key_verify_signature().
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Move the public-key algorithm pointer array from x509_public_key.c to
public_key.c as it isn't X.509 specific.
Note that to make this configure correctly, the public key part must be
dependent on the RSA module rather than the other way round. This needs a
further patch to make use of the crypto module loading stuff rather than using
a fixed table.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Rename the arrays of public key parameters (public key algorithm names, hash
algorithm names and ID type names) so that the array name ends in "_name".
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Move all users of ablk_helper under x86/ to the generic version
and delete the x86 specific version.
Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Create a generic version of ablk_helper so it can be reused
by other architectures.
Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stephan Mueller reported to me recently a error in random number generation in
the ansi cprng. If several small requests are made that are less than the
instances block size, the remainder for loop code doesn't increment
rand_data_valid in the last iteration, meaning that the last bytes in the
rand_data buffer gets reused on the subsequent smaller-than-a-block request for
random data.
The fix is pretty easy, just re-code the for loop to make sure that
rand_data_valid gets incremented appropriately
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Stephan Mueller <stephan.mueller@atsec.com>
CC: Stephan Mueller <stephan.mueller@atsec.com>
CC: Petr Matousek <pmatouse@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unfortunately, even with a softdep some distros fail to include
the necessary modules in the initrd. Therefore this patch adds
a fallback path to restore existing behaviour where we cannot
load the new crypto crct10dif algorithm.
In order to do this, the underlying crct10dif has been split out
from the crypto implementation so that it can be used on the
fallback path.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_larval_lookup should only return a larval if it created one.
Any larval created by another entity must be processed through
crypto_larval_wait before being returned.
Otherwise this will lead to a larval being killed twice, which
will most likely lead to a crash.
Cc: stable@vger.kernel.org
Reported-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch reinstates commits
67822649d739761214ee0b95a7f85731d939625a2d31e518a4
Now that module softdeps are in the kernel we can use that to resolve
the boot issue which cause the revert.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Crypto layer only passes nbytes to encrypt but in omap-aes driver we need to
know number of SG elements to pass to dmaengine slave API. We add function for
the same to scatterwalk library.
Signed-off-by: Joel Fernandes <joelf@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adjust alignment and replace commas by semicolons in automatically
generated code.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tables used from assembler should be marked __visible to let
the compiler know.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes redundant execution of the same test suite in cases
where alg and driver variables are the same (e.g. when alg_test is
called from tcrypt_test)
Signed-off-by: Cristian Stoica <cristian.stoica@freescale.com>
Reviewed-by: Horia Geanta <horia.geanta@freescale.com>
Reviewed-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This reverts commits
67822649d739761214ee0b95a7f85731d939625a2d31e518a4
Unfortunately this change broke boot on some systems that used an
initrd which does not include the newly created crct10dif modules.
As these modules are required by sd_mod under certain configurations
this is a serious problem.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for lz4 and lz4hc compression algorithm using the lib/lz4/*
codebase.
[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Chanho Min <chanho.min@lge.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Herbert Xu <herbert@gondor.hengli.com.au>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There have never been any real users of MEMSET operations since they
have been introduced in January 2007 by commit 7405f74bad ("dmaengine:
refactor dmaengine around dma_async_tx_descriptor"). Therefore remove
support for them for now, it can be always brought back when needed.
[sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For the workqueue creation interfaces that do not expect format strings,
make sure they cannot accidently be parsed that way. Additionally, clean
up calls made with a single parameter that would be handled as a format
string. Many callers are passing potentially dynamic string content, so
use "%s" in those cases to avoid any potential accidents.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The template lookup interface does not provide a way to use format
strings, so make sure that the interface cannot be abused accidentally.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On Thu, Jun 20, 2013 at 10:00:21AM +0200, Daniel Borkmann wrote:
> After having fixed a NULL pointer dereference in SCTP 1abd165e ("net:
> sctp: fix NULL pointer dereference in socket destruction"), I ran into
> the following NULL pointer dereference in the crypto subsystem with
> the same reproducer, easily hit each time:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff81070321>] __wake_up_common+0x31/0x90
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: padlock_sha(F-) sha256_generic(F) sctp(F) libcrc32c(F) [..]
> CPU: 6 PID: 3326 Comm: cryptomgr_probe Tainted: GF 3.10.0-rc5+ #1
> Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
> task: ffff88007b6cf4e0 ti: ffff88007b7cc000 task.ti: ffff88007b7cc000
> RIP: 0010:[<ffffffff81070321>] [<ffffffff81070321>] __wake_up_common+0x31/0x90
> RSP: 0018:ffff88007b7cde08 EFLAGS: 00010082
> RAX: ffffffffffffffe8 RBX: ffff88003756c130 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88003756c130
> RBP: ffff88007b7cde48 R08: 0000000000000000 R09: ffff88012b173200
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000282
> R13: ffff88003756c138 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88012fc60000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000001a0b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Stack:
> ffff88007b7cde28 0000000300000000 ffff88007b7cde28 ffff88003756c130
> 0000000000000282 ffff88003756c128 ffffffff81227670 0000000000000000
> ffff88007b7cde78 ffffffff810722b7 ffff88007cdcf000 ffffffff81a90540
> Call Trace:
> [<ffffffff81227670>] ? crypto_alloc_pcomp+0x20/0x20
> [<ffffffff810722b7>] complete_all+0x47/0x60
> [<ffffffff81227708>] cryptomgr_probe+0x98/0xc0
> [<ffffffff81227670>] ? crypto_alloc_pcomp+0x20/0x20
> [<ffffffff8106760e>] kthread+0xce/0xe0
> [<ffffffff81067540>] ? kthread_freezable_should_stop+0x70/0x70
> [<ffffffff815450dc>] ret_from_fork+0x7c/0xb0
> [<ffffffff81067540>] ? kthread_freezable_should_stop+0x70/0x70
> Code: 41 56 41 55 41 54 53 48 83 ec 18 66 66 66 66 90 89 75 cc 89 55 c8
> 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e
> RIP [<ffffffff81070321>] __wake_up_common+0x31/0x90
> RSP <ffff88007b7cde08>
> CR2: 0000000000000000
> ---[ end trace b495b19270a4d37e ]---
>
> My assumption is that the following is happening: the minimal SCTP
> tool runs under ``echo 1 > /proc/sys/net/sctp/auth_enable'', hence
> it's making use of crypto_alloc_hash() via sctp_auth_init_hmacs().
> It forks itself, heavily allocates, binds, listens and waits in
> accept on sctp sockets, and then randomly kills some of them (no
> need for an actual client in this case to hit this). Then, again,
> allocating, binding, etc, and then killing child processes.
>
> The problem that might be happening here is that cryptomgr requests
> the module to probe/load through cryptomgr_schedule_probe(), but
> before the thread handler cryptomgr_probe() returns, we return from
> the wait_for_completion_interruptible() function and probably already
> have cleared up larval, thus we run into a NULL pointer dereference
> when in cryptomgr_probe() complete_all() is being called.
>
> If we wait with wait_for_completion() instead, this panic will not
> occur anymore. This is valid, because in case a signal is pending,
> cryptomgr_probe() returns from probing anyway with properly calling
> complete_all().
The use of wait_for_completion_interruptible is intentional so that
we don't lock up the thread if a bug causes us to never wake up.
This bug is caused by the helper thread using the larval without
holding a reference count on it. If the helper thread completes
after the original thread requesting for help has gone away and
destroyed the larval, then we get the crash above.
So the fix is to hold a reference count on the larval.
Cc: <stable@vger.kernel.org> # 3.6+
Reported-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds unaligned buffer tests for hashes.
The first new test is with one byte offset and the second test checks if
cra_alignmask for driver is big enough; for example, for testing a case
where cra_alignmask is set to 7, but driver really needs buffers to be
aligned to 16 bytes.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds unaligned buffer tests for AEADs.
The first new test is with one byte offset and the second test checks if
cra_alignmask for driver is big enough; for example, for testing a case
where cra_alignmask is set to 7, but driver really needs buffers to be
aligned to 16 bytes.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds unaligned buffer tests for blkciphers.
The first new test is with one byte offset and the second test checks if
cra_alignmask for driver is big enough; for example, for testing a case
where cra_alignmask is set to 7, but driver really needs buffers to be
aligned to 16 bytes.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds check for alg_test_descs list order, so that accidentically
misplaced entries are found quicker. Duplicate entries are also checked for.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This reverts commit cf1521a1a5.
Instruction (vpgatherdd) that this implementation relied on turned out to be
slow performer on real hardware (i5-4570). The previous 8-way twofish/AVX
implementation is therefore faster and this implementation should be removed.
Converting this implementation to use the same method as in twofish/AVX for
table look-ups would give additional ~3% speed up vs twofish/AVX, but would
hardly be worth of the added code and binary size.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This reverts commit 6048801070.
Instruction (vpgatherdd) that this implementation relied on turned out to be
slow performer on real hardware (i5-4570). The previous 4-way blowfish
implementation is therefore faster and this implementation should be removed.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It appears that the performance of 'vpgatherdd' is suboptimal for this kind of
workload (tested on Core i5-4570) and causes blowfish-avx2 to be significantly
slower than blowfish-amd64. So disable the AVX2 implementation to avoid
performance regressions.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It appears that the performance of 'vpgatherdd' is suboptimal for this kind of
workload (tested on Core i5-4570) and causes twofish_avx2 to be significantly
slower than twofish_avx. So disable the AVX2 implementation to avoid
performance regressions.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
'sha512_generic' should set driver name now that there is alternative sha512
provider (sha512_ssse3).
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
These are simple tests to do sanity check of CRC T10 DIF hash. The
correctness of the transform can be checked with the command
modprobe tcrypt mode=47
The speed of the transform can be evaluated with the command
modprobe tcrypt mode=320
Set the cpu frequency to constant and turn turbo off when running the
speed test so the frequency governor will not tweak the frequency and
affects the measurements.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Glue code that plugs the PCLMULQDQ accelerated CRC T10 DIF hash into the
crypto framework. The config CRYPTO_CRCT10DIF_PCLMUL should be turned
on to enable the feature. The crc_t10dif crypto library function will
use this faster algorithm when crct10dif_pclmul module is loaded.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When CRC T10 DIF is calculated using the crypto transform framework, we
wrap the crc_t10dif function call to utilize it. This allows us to
take advantage of any accelerated CRC T10 DIF transform that is
plugged into the crypto framework.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use prandom_bytes() to generate random bytes for test data.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dan Williams <djbw@fb.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch adds AVX2/AES-NI/x86-64 implementation of Camellia cipher, requiring
32 parallel blocks for input (512 bytes). Compared to AVX implementation, this
version is extended to use the 256-bit wide YMM registers. For AES-NI
instructions data is split to two 128-bit registers and merged afterwards.
Even with this additional handling, performance should be higher compared
to the AES-NI/AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds AVX2/x86-64 implementation of Serpent cipher, requiring 16 parallel
blocks for input (256 bytes). Implementation is based on the AVX implementation
and extends to use the 256-bit wide YMM registers. Since serpent does not use
table look-ups, this implementation should be close to two times faster than
the AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds AVX2/x86-64 implementation of Twofish cipher, requiring 16 parallel
blocks for input (256 bytes). Table look-ups are performed using vpgatherdd
instruction directly from vector registers and thus should be faster than
earlier implementations. Implementation also uses 256-bit wide YMM registers,
which should give additional speed up compared to the AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>