[cfarm-users] CPU speeds of GCC135?
chuck.atkins at kitware.com
Wed Jun 5 21:08:58 CEST 2019
It could also be that you're hitting thread migration across cores and
thus missing out on the clock ramp-up. I saw quite a few issues with that
on the p8 when running benchmarks a while back. If that's the case, you
can try pinning your benchmark to a single CPU core (or a set of them if
you'd like) by running it under "numactl -C <CORE_NUMBER> ./benchmark".
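
If wrapping the binary with numactl is awkward, the same pinning can be
done from inside the process. A minimal sketch in Python, assuming Linux
(os.sched_setaffinity is not available on other platforms). The names
pin_to_core, best_time and workload are hypothetical, made up for
illustration, and workload is only a stand-in for the real benchmark:

```python
import os
import time


def pin_to_core(core):
    """Restrict the calling process to a single CPU core (Linux only)."""
    os.sched_setaffinity(0, {core})  # pid 0 means "this process"
    return os.sched_getaffinity(0)


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def workload():
    # Hypothetical stand-in for the real benchmark kernel.
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    core = min(os.sched_getaffinity(0))  # pick a core we are allowed to use
    print("pinned to", pin_to_core(core))
    print("best of 5: %.6f s" % best_time(workload))
```

Taking the best of several runs after pinning gives the core a chance to
ramp up before the measurement that counts.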
On Sun, Oct 28, 2018 at 9:23 AM Jeffrey Walton via cfarm-users <
cfarm-users at lists.tetaneutral.net> wrote:
> On Sun, Oct 28, 2018 at 8:42 AM Andy Polyakov via cfarm-users
> <cfarm-users at lists.tetaneutral.net> wrote:
> > > The initial benchmarks are kind of flat when using 3.8 GHz as the
> > > frequency. I think the problem is that we are not working the machine
> > > hard enough, so the CPUs are reluctant to move from a low energy state.
> > I'd say it's more likely because POWER9 appears to be "allergic" to
> > mixtures of vector and scalar instructions. And since you are likely to
> > reference memory you will always have scalar instructions at least to
> > calculate effective addresses. Normalized[!] difference to POWER8 can be
> > anywhere from "little" to a "lot". Example of "little" can be ~15% in
> > SHA512(*) and VSX Chacha20. Example of "lot" is ~50% for pre-VSX
> > Chacha20 implementation where one interleaves scalar and vector in more
> > or less equal proportion. Though on the other hand pure scalar code is
> > normally faster...
> Thanks Andy.
> That's what we are seeing. AES and SHA slowed down, and ChaCha
> sped up (even the SIMD version of ChaCha benefited).