[cfarm-users] CPU speeds of GCC135?

Andy Polyakov appro at openssl.org
Sun Oct 28 13:33:04 CET 2018


> The initial benchmarks are kind of flat when using 3.8 GHz as the
> frequency. I think the problem is that we are not working the machine hard
> enough, so the CPUs are reluctant to move from a low energy state.

I'd say it's more likely because POWER9 appears to be "allergic" to
mixtures of vector and scalar instructions. And since you are likely to
reference memory, you will always have scalar instructions, at least to
calculate effective addresses. The normalized[!] difference to POWER8 can
be anywhere from "little" to "a lot". An example of "little" is ~15% in
SHA512(*) and VSX ChaCha20. An example of "a lot" is ~50% for the pre-VSX
ChaCha20 implementation, where one interleaves scalar and vector
instructions in more or less equal proportion. On the other hand, pure
scalar code is normally faster...
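
To make "normalized" concrete, here is a minimal sketch in C. It assumes
that normalizing means comparing cycles per byte, i.e. dividing the clock
frequency by the measured throughput, so that machines running at
different clocks can be compared. The function names are hypothetical,
and no measured figures from either machine are implied.

```c
/* Cycles spent per processed byte: clock frequency divided by throughput.
   Assumption: "normalized" in the text means per clock cycle. */
double cycles_per_byte(double freq_hz, double bytes_per_sec)
{
    return freq_hz / bytes_per_sec;
}

/* Relative per-clock difference of machine B against machine A.
   A positive result means B needs more cycles per byte than A. */
double normalized_difference(double cpb_a, double cpb_b)
{
    return (cpb_b - cpb_a) / cpb_a;
}
```

Note that the result depends on the frequency you plug in. If the CPU is
not actually running at the nominal clock during the measurement, the
cycles-per-byte figure will be off by the same factor.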

(*) this is *after* optimization, which gave ~7% simply by omitting
Ktable++ in each round and switching to Ktable+=8 every 8th round.
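
The shape of that change can be sketched in C as follows. This is only an
illustration of the two addressing patterns, not the actual SHA512 code:
toy_round is a hypothetical stand-in for the real round function, and a C
compiler may well transform one variant into the other on its own. The
point is that the second variant performs one scalar pointer update per
eight rounds instead of one per round.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one round: mixes one table constant into the
   state. It is NOT the SHA-512 round function. */
static uint64_t toy_round(uint64_t state, uint64_t k)
{
    state += k;
    return (state << 1) | (state >> 63);   /* rotate left by one bit */
}

/* Variant A: the table pointer is advanced in every round (Ktable++). */
uint64_t rounds_increment_each(uint64_t state, const uint64_t *Ktable,
                               size_t nrounds)
{
    for (size_t i = 0; i < nrounds; i++) {
        state = toy_round(state, *Ktable);
        Ktable++;                           /* scalar update every round */
    }
    return state;
}

/* Variant B: eight constants are read at fixed offsets, then the pointer
   is advanced once (Ktable += 8). nrounds must be a multiple of 8. */
uint64_t rounds_increment_every_8th(uint64_t state, const uint64_t *Ktable,
                                    size_t nrounds)
{
    for (size_t i = 0; i < nrounds; i += 8) {
        state = toy_round(state, Ktable[0]);
        state = toy_round(state, Ktable[1]);
        state = toy_round(state, Ktable[2]);
        state = toy_round(state, Ktable[3]);
        state = toy_round(state, Ktable[4]);
        state = toy_round(state, Ktable[5]);
        state = toy_round(state, Ktable[6]);
        state = toy_round(state, Ktable[7]);
        Ktable += 8;                        /* one scalar update per 8 rounds */
    }
    return state;
}
```

Both functions return the same state for the same table whenever nrounds
is a multiple of 8; only the number of pointer updates differs.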

