Maamoun TK <maamoun.tk@googlemail.com> writes:
I created merge requests that improve ChaCha20 performance on the arm64 and s390x architectures, following the approach used in the powerpc implementation.
https://git.lysator.liu.se/nettle/nettle/-/merge_requests/37
https://git.lysator.liu.se/nettle/nettle/-/merge_requests/40
The patches give an 80.85% speedup on arm64 and a 284.79% speedup on s390x.
Nice, I've had a quick first look.
It would be nice if the arm64 patch could be tested in big-endian mode, since I don't have access to any big-endian variant for testing.
I've merged the arm64 code to a branch, for CI testing.
For the ARM code, which instructions are provided by the asimd extension? Basic SIMD is always available, if I've understood correctly.
Regards, /Niels