[cfarm-users] Fixing CPU/core/threads count

Brice Goglin Brice.Goglin at free.fr
Mon Jul 6 07:46:39 CEST 2020


On 06/07/2020 at 01:08, Zach van Rijn wrote:
> On Sun, 2020-07-05 at 23:24 +0200, Baptiste Jonglez via cfarm-users
> wrote:
>> Hi Brice,
>>
>> On 04-04-20, Brice Goglin wrote:
>>> On 04/04/2020 at 20:28, Baptiste Jonglez wrote:
>>>> Following up on a thread from 2018 [1], we still have issues with
>>>> counting the number of CPUs, cores and threads on "exotic" machines.
>> We encountered another interesting case: on the new gcc102 machine
>> with a Sparc T3 CPU, hwloc reports 64 cores, while there are actually
>> only 32 cores (2 sockets x 16 cores).
>>
>> hwloc apparently thinks that there are 4 threads per core, instead of
>> the correct value 8.


Hello

If thread_siblings in sysfs reports 4 threads per core as shown below,
that's what hwloc will use.

This is indeed inconsistent with the "core_id" sysfs information which
says that 8 threads have the same core ID.

It's not clear to me whether this is a firmware bug or a kernel bug. The
kernel code seems fairly simple and generic here, so maybe the firmware
is reporting inconsistent info (and hwloc unfortunately picks the wrong
source because sibling maps are easier to use :/).
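
To see the disagreement directly, something like the rough sketch below
could count cores both ways. It assumes the usual sysfs topology files
(core_id, physical_package_id, thread_siblings_list) are present; the
function names and the root-directory argument are made up for
illustration and are not what hwloc does internally.

```shell
#!/bin/sh
# Sketch: count cores two ways from a sysfs-style CPU tree.
# Argument: root directory, normally /sys/devices/system/cpu.
# (Function names are hypothetical, chosen for this example.)

# Cores counted as distinct (physical_package_id, core_id) pairs.
cores_by_core_id() {
    for d in "$1"/cpu[0-9]*; do
        [ -r "$d/topology/core_id" ] || continue
        # Fall back to package 0 if physical_package_id is unreadable.
        pkg=$(cat "$d/topology/physical_package_id" 2>/dev/null || echo 0)
        echo "$pkg:$(cat "$d/topology/core_id")"
    done | sort -u | wc -l | tr -d ' '
}

# Cores counted as distinct thread_siblings_list values.
cores_by_siblings() {
    for d in "$1"/cpu[0-9]*; do
        [ -r "$d/topology/thread_siblings_list" ] || continue
        cat "$d/topology/thread_siblings_list"
    done | sort -u | wc -l | tr -d ' '
}

root=${1:-/sys/devices/system/cpu}
echo "cores by core_id:  $(cores_by_core_id "$root")"
echo "cores by siblings: $(cores_by_siblings "$root")"
```

If the two numbers differ on a given machine, sysfs itself is
inconsistent there, and any tool relying on only one of the two sources
can end up with a wrong core count.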

Brice





> On Sun, 2020-07-05 at 23:11 +0200, Baptiste Jonglez wrote:
>> ...
>> I am fixing the core count and will ask hwloc's author about
>> this.
> One funny thing I've observed previously:
>
> for k in $(seq 0 $(($(nproc)-1))); do
>     cat /sys/devices/system/cpu/cpu$k/topology/thread_siblings;
> done
>
> prints:
>
> ...,00000000,00000000,00000000,0000000f
> ...,00000000,00000000,00000000,0000000f
> ...,00000000,00000000,00000000,0000000f
> ...,00000000,00000000,00000000,0000000f
> ...,00000000,00000000,00000000,000000f0
> ...,00000000,00000000,00000000,000000f0
> ...,00000000,00000000,00000000,000000f0
> ...,00000000,00000000,00000000,000000f0
> ...
>
> (or friendlier thread_siblings_list):
>
> 0-3
> 0-3
> 0-3
> 0-3
> 4-7
> 4-7
> 4-7
> 4-7
> 8-11
> 8-11
> ...
>
> instead of what I would expect:
>
> 0-7
> 0-7
> 0-7
> 0-7
> 0-7
> 0-7
> 0-7
> 0-7
> 8-15
> 8-15
> 8-15
> 8-15
> 8-15
> 8-15
> 8-15
> 8-15
> ...
>
>
> perhaps this is related?
>
>
> ZV
>


