[cfarm-users] cfarm109 GPU access problem (was: cfarm109 additional details)

Thomas Schwinge tschwinge at baylibre.com
Wed Mar 11 16:36:07 CET 2026


Hi!

As maintainer of GCC's NVIDIA GPU support, I'd like to thank NVIDIA/David
Edelsohn for providing these modern NVIDIA systems, and also thank you,
Zach (and team?), for administration!


Is there some (permissions?) problem on cfarm109?  With 'nvidia-smi',
there are some errors, but the GPU shows up:

    $ nvidia-smi
    NvRmMemInitNvmap failed: error Permission denied
    NvRmMemMgrInit failed: Memory Manager Not supported, line 333
    NvRmMemMgrInit failed: error type 196626
    libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196625
    NvRmMemInitNvmap failed: error Permission denied
    NvRmMemMgrInit failed: Memory Manager Not supported, line 333
    NvRmMemMgrInit failed: error type 196626
    libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196625
    Wed Mar 11 07:19:17 2026       
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 580.00                 Driver Version: 580.00         CUDA Version: 13.0     |
    +-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA Thor                    Off |   00000000:01:00.0 Off |                  N/A |
    | N/A   43C  N/A               2W /  N/A  | Not Supported          |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    [...]

However, when I attempt to actually use the GPU, I get similar errors,
and this time they are fatal:

    $ [...]/nvptx-none-run [...]
    NvRmMemInitNvmap failed: error Permission denied
    NvRmMemMgrInit failed: Memory Manager Not supported, line 333
    NvRmMemMgrInit failed: error type 196626
    nvptx-run: cuInit failed: no CUDA-capable device is detected (CUDA_ERROR_NO_DEVICE, 100)
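
The 'Permission denied' from NvRmMemInitNvmap suggests that the calling
user may lack access to a GPU device node.  A minimal diagnostic sketch
follows.  The node paths, in particular /dev/nvmap, are assumptions
inferred from the error text, and 'check_nodes' is a hypothetical helper
name.  It only prints the current user's groups and whether each node is
readable and writable by that user:

```shell
#!/bin/sh
# Hypothetical helper: for each path given, report whether it exists and
# whether the current user can open it for reading and writing.
check_nodes() {
    for node in "$@"; do
        if [ ! -e "$node" ]; then
            echo "$node: missing"
        elif [ -r "$node" ] && [ -w "$node" ]; then
            echo "$node: ok"
        else
            # Show owner, group, and mode to help spot a group mismatch.
            echo "$node: no access ($(ls -ld "$node"))"
        fi
    done
}

# Groups of the current user, for comparison with the node ownership.
id

# ASSUMPTION: these node names are guesses; the actual nodes on cfarm109
# may differ.
check_nodes /dev/nvmap /dev/nvidiactl /dev/nvidia0
```

If a node is reported as 'no access', comparing its group with the 'id'
output would show whether a missing group membership could be the cause.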


Grüße
 Thomas


On 2026-03-06T13:53:35-0600, Zach van Rijn via cfarm-users <cfarm-users at lists.tetaneutral.net> wrote:
> Dear Compile Farm Users,
>
>
> We are happy to announce the immediate availability of another
> NVIDIA machine: a Jetson AGX Thor with 14C and 128GB memory; it
> is available at cfarm109:2109. The announcement is here:
>
>     https://portal.cfarm.net/news/58
>
> This email is to provide additional information about how this
> machine has been configured and how it should (and should not)
> be used. It is slightly more specialized than cfarm107/cfarm108.
>
> In addition to some CPU and GPU differences, the base operating
> system is nominally Ubuntu 24.04 but with different kernels. For
> most GPU-accelerated tasks, please prefer cfarm107/cfarm108 if
> exact hardware doesn't matter as those are generally faster.
>
> cfarm109 should be reserved for users needing the specialized
> CPU features or specific kernel/software versions. Of course,
> everyone is free to use any machine for all legitimate purposes.
>
> Please report any issues to the mailing list(s) or privately. We
> welcome discussion about how to best configure these machines so
> that they are of maximum use to all users.
>
>
> Sincerely,
>
> Compile Farm Admins
>
> _______________________________________________
> cfarm-users mailing list
> cfarm-users at lists.tetaneutral.net
> https://lists.tetaneutral.net/listinfo/cfarm-users

