On Wed, Jan 19, 2022 at 8:48 PM Niels Möller nisse@lysator.liu.se wrote:
Maamoun TK maamoun.tk@googlemail.com writes:
I created merge requests that improve ChaCha20 performance on the arm64 and s390x architectures, following the approach used in the powerpc implementation. https://git.lysator.liu.se/nettle/nettle/-/merge_requests/37 https://git.lysator.liu.se/nettle/nettle/-/merge_requests/40 The patches give a speedup of 80.85% on arm64 and 284.79% on s390x.
Nice, I've had a quick first look.
It would be nice if the arm64 patch could be tested in big-endian mode, since I don't have access to any big-endian variant for testing.
I've merged the arm64 code to a branch, for CI testing.
For the ARM code, which instructions are provided by the asimd extension? Basic simd is always available, if I've understood correctly.
As far as I understand, SIMD is called Advanced SIMD on AArch64, and it is standard for this architecture. GCC enables simd by default, but it can be disabled with the nosimd option, as documented at https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html, which is why I made a specific config option for it.
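To illustrate the point about nosimd, here is a minimal sketch of a compile-time check. It assumes the compiler follows the ACLE convention of defining __ARM_NEON when Advanced SIMD is enabled, and leaving it undefined when the target is built with nosimd. The function name have_asimd is hypothetical and is not part of Nettle or of either merge request.

```c
/* Sketch only: have_asimd() is a hypothetical helper, not Nettle code.
   Assumption: the compiler defines __ARM_NEON (ACLE feature macro)
   when Advanced SIMD is enabled, and omits it when the target is
   built with the nosimd modifier. */
static int
have_asimd (void)
{
#if defined(__aarch64__) && defined(__ARM_NEON)
  return 1;   /* AArch64 target with Advanced SIMD enabled */
#else
  return 0;   /* not AArch64, or SIMD disabled at compile time */
#endif
}
```

With this assumption, a default AArch64 build would return 1, while a build with simd disabled would return 0 and would need to fall back to the plain C implementation.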
regards, Mamone
Regards, /Niels
-- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance.