[cfarm-users] Fixing CPU/core/threads count

Brice Goglin Brice.Goglin at free.fr
Sat Apr 4 21:21:19 CEST 2020


On 04/04/2020 at 20:28, Baptiste Jonglez wrote:
> Hi,
>
> Following up on a thread from 2018 [1], we still have issues counting
> the number of CPUs, cores and threads on "exotic" machines.
>
> I tried hwloc on all farm machines; the results are below, where:
>
> M = machine
> N = node (NUMA node)
> P = package
> C = core
> T = pu (thread)
>
>
> gcc10   M1 N4 P2 C24 T24
> gcc13   M1 N2 P2 C4 T4
> gcc14   M1 N P2 C8 T8
> gcc22   M1 N P C2 T2
> gcc23   M1 N P C2 T2
> gcc45   M1 N P C4 T4
> gcc67   M1 N P1 C4 T8
> gcc70   M1 N P1 C1 T2
> gcc110  M1 N2 P16 C16 T64
> gcc112  M1 N4 P4 C20 T160
> gcc113  M1 N P C8 T8
> gcc114  M1 N P C8 T8
> gcc115  M1 N P C8 T8
> gcc116  M1 N P C8 T8
> gcc117  M1 N P4 C8 T8
> gcc120  M1 N2 P2 C16 T32
> gcc121  M1 N2 P2 C16 T32
> gcc122  M1 N2 P2 C16 T32
> gcc123  M1 N2 P2 C16 T32
> gcc135  M1 N2 P2 C32 T128
> gcc202  M1 N1 P1 C8 T64
> gcc203  M1 N2 P4 C4 T32
>
> Can someone with experience with each kind of machine make sense of this
> data, and determine which field we should use for "CPU" (sockets), "cores"
> and "thread" in https://cfarm.tetaneutral.net/machines/list/ ?
>
> From a first look, cores and threads seem to be correctly detected.
> To get the number of CPU sockets, "NUMA node" seems rather unreliable
> compared to "package", but both sometimes give strange results
> (e.g. gcc110).


Hello

Old POWER machines (e.g. gcc110 POWER7) are known to report strange or
invalid topology information, for instance by reporting one CPU package
per core. I don't remember the reason but IBM developers didn't want to
fix their firmware because it would break something else. Things are
reported correctly on modern POWER8/9 afaik.

Some ARM platforms (gcc117, gcc118) have a similar issue, but things are
improving now that vendors are implementing the PPTT ACPI table (provided
you run a recent Linux kernel).

Using NUMA nodes for CPU sockets is indeed unreliable these days. Most
vendors can expose multiple NUMA nodes per CPU package.
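
As a rough illustration of how the table above could be screened, here is a
small sketch. It is not part of hwloc: the function name check_topology and
the rule it applies (more than one package, and exactly as many packages as
cores) are assumptions chosen only to catch the "one package per core"
pattern described for gcc110. A real multi-socket machine with single-core
CPUs would trip the same rule, so a flagged line only means "check by hand".

```shell
#!/bin/sh
# Hypothetical helper: read summary lines such as
#   gcc110 M1 N2 P16 C16 T64
# on stdin and print one verdict per machine.
check_topology() {
    awk '{
        p = c = t = ""
        # Fields 2..NF are a one-letter type followed by an optional count.
        for (i = 2; i <= NF; i++) {
            k = substr($i, 1, 1)
            v = substr($i, 2)
            if (k == "P") p = v
            else if (k == "C") c = v
            else if (k == "T") t = v
        }
        if (p == "")
            status = "package-count-missing"
        else if (p + 0 > 1 && p + 0 == c + 0)
            status = "suspicious-one-package-per-core"   # heuristic only
        else
            status = "ok"
        # Threads per core, 0 when the core count is unknown.
        smt = (c + 0 > 0) ? (t + 0) / (c + 0) : 0
        printf "%s %s threads-per-core=%d\n", $1, status, smt
    }'
}
```

For example, feeding it the gcc110 line:

    printf 'gcc110 M1 N2 P16 C16 T64\n' | check_topology

prints "gcc110 suspicious-one-package-per-core threads-per-core=4".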

Brice (the main hwloc developer)


>
> Maybe we should just stop trying to determine the number of CPU sockets
> except on x86 systems?  Does anybody need this kind of data?
>
>
> The data above was obtained with "hwloc-calc -N $type all", the full command is:
>
>     # echo M$(hwloc-calc -N machine all) N$(hwloc-calc -N numanode all 2>/dev/null) P$(hwloc-calc -N package all 2>/dev/null) C$(hwloc-calc -N core all) T$(hwloc-calc -N pu all)
>
> Thanks,
> Baptiste
>
> [1] https://lists.tetaneutral.net/pipermail/cfarm-users/2018-November/000424.html
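
The one-line command quoted above leaves a bare "N" or "P" when hwloc-calc
prints nothing for a type. A possible variant, sketched below, prints "?"
in that case so that a missing count is explicit in the summary. The
function names count_objs and topology_summary and the HWLOC_CALC override
variable are hypothetical; the only hwloc invocation used is the
"hwloc-calc -N $type all" form from the quoted message.

```shell
#!/bin/sh
# count_objs TYPE: print the number of hwloc objects of TYPE, or "?" when
# hwloc-calc fails or prints nothing for that type.
# HWLOC_CALC is a hypothetical override for the command name.
count_objs() {
    n=$("${HWLOC_CALC:-hwloc-calc}" -N "$1" all 2>/dev/null) || n=""
    printf '%s' "${n:-?}"
}

# Print a summary in the same "M N P C T" layout as the table.
topology_summary() {
    printf 'M%s N%s P%s C%s T%s\n' \
        "$(count_objs machine)" "$(count_objs numanode)" \
        "$(count_objs package)" "$(count_objs core)" "$(count_objs pu)"
}
```

Running "echo $(hostname) $(topology_summary)" on each machine would then
give lines in the same shape as the table, with "?" in place of a blank.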
