[cfarm-users] cfarm427 and cfarm430 crashing
Pierre Muller
pierre at freepascal.org
Mon Jun 23 10:30:11 CEST 2025
On 23/06/2025 at 04:07, Jacob Bachmeyer via cfarm-users wrote:
> On 6/22/25 18:12, Segher Boessenkool via cfarm-users wrote:
>> Hi!
>> On Mon, Jun 23, 2025 at 12:01:59AM +0900, Luke Yasuda via cfarm-users wrote:
>>> On 2025-06-20 14:19, Pierre Muller via cfarm-users wrote:
>>> [...]
>>>> Do you suspect a link between the CPU exceeded time messages and the
>>>> crashes you are reporting?
>>>> Should I disable the testsuite on gcc430?
>>>>
>>>> I sincerely hope that the troubles are not due
>>>> to the scripts that I run on these test machines.
>>> I hope not too, I don't believe the free pascal test suites are the
>>> cause of those kernel crashes. You don't have to disable all of them
>>> (maybe just those very lengthy ones as you can tell).
>> No matter what they aren't the cause, the bug is elsewhere, but sure it
>> might be possible to get the machines more stable if we do not run such
>> testsuites on them anymore. We have no indication that would be true
>> though :-)
>
> Is the testsuite exercising network functions? The crash backtraces pointed the proverbial bloody finger at the network code.
None of the tests should rely on external network communication.
A few might check a local connection, but I don't think that this is the case
for the ones listed in dmesg.
On cfarm427, in the /var/log directory, I found a lot of lines like:
daemon.log:10935:Mar 14 03:28:34 cfarm427 fsck[2061]: /dev/vtbd1p1: UNREF FILE I=5128250 OWNER=muller MODE=100644
But nothing since March 14.
On cfarm430, I do get two tests that are killed due to the limit of 1 minute per test:
Jun 23 05:08:01 cfarm430 kernel: pid 79158 (theapthread), jid 0, uid 61083, was killed: exceeded maximum CPU limit
pid 80850 (tissurrogatepair2), jid 0, uid 61083, was killed: exceeded maximum CPU limit
Jun 23 05:09:07 cfarm430 kernel: pid 80850 (tissurrogatepair2), jid 0, uid 61083, was killed: exceeded maximum CPU limit
pid 93116 (theapthread), jid 0, uid 61083, was killed: exceeded maximum CPU limit
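For anyone who wants to tally these messages from a saved log, here is a minimal Python sketch. The function name count_cpu_kills and the SAMPLE list are made up for illustration. The pattern only covers the exact message format shown above, and counting each pid once is my assumption, since the same kill appears to show up both with and without the syslog prefix.

```python
import re
from collections import Counter

# Matches the kernel kill message quoted above. search() is used so that an
# optional syslog prefix (timestamp, host, "kernel:") is simply skipped.
KILL_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), jid (?P<jid>\d+), "
    r"uid (?P<uid>\d+), was killed: exceeded maximum CPU limit"
)


def count_cpu_kills(lines):
    """Count CPU-limit kills per program name, counting each pid once."""
    seen = set()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            # Assumption: a repeated (pid, name) pair is the same kill
            # logged twice, not two separate kills.
            seen.add((int(match.group("pid")), match.group("name")))
    return Counter(name for _pid, name in seen)


# The four log lines from this message, used as sample input.
SAMPLE = [
    "Jun 23 05:08:01 cfarm430 kernel: pid 79158 (theapthread), jid 0, uid 61083, was killed: exceeded maximum CPU limit",
    "pid 80850 (tissurrogatepair2), jid 0, uid 61083, was killed: exceeded maximum CPU limit",
    "Jun 23 05:09:07 cfarm430 kernel: pid 80850 (tissurrogatepair2), jid 0, uid 61083, was killed: exceeded maximum CPU limit",
    "pid 93116 (theapthread), jid 0, uid 61083, was killed: exceeded maximum CPU limit",
]

counts = count_cpu_kills(SAMPLE)
for name, kills in sorted(counts.items()):
    print(name, kills)
```

With the four lines above, this reports two kills for theapthread and one for tissurrogatepair2.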
For tissurrogatepair2, it might be that this comprehensive test is simply too lengthy.
https://gitlab.com/freepascal.org/fpc/source/-/blob/main/tests/test/units/character/tissurrogatepair2.pp?ref_type=heads
I added a counter locally: the test does 66 million checks, but normally lasts about 0.6 s!
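As a rough sanity check on those numbers, here is the arithmetic as a small Python sketch. It assumes that the 0.6 s figure is close to the CPU time the test actually uses, which has not been verified on cfarm430. The variable names are only illustrative.

```python
# Figures taken from this message.
CHECKS = 66_000_000        # checks performed by tissurrogatepair2
NORMAL_SECONDS = 0.6       # usual run time (assumed to be close to CPU time)
CPU_LIMIT_SECONDS = 60     # per-test CPU limit of 1 minute

# Throughput of a normal run, in checks per second.
checks_per_second = CHECKS / NORMAL_SECONDS

# Factor by which the run would have to slow down to reach the limit.
slowdown_needed = CPU_LIMIT_SECONDS / NORMAL_SECONDS

print(f"about {checks_per_second:,.0f} checks per second")
print(f"slowdown needed to reach the limit: about {slowdown_needed:.0f}x")
```

So the test would have to run roughly a hundred times slower than normal to reach the limit.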
The other test, theapthread.pp, usually takes about 8 seconds,
so I also have no idea why it can sometimes use more than 60 seconds of CPU time...
But this test is about threads, and it can sometimes deadlock...
This might be the result of bad code generation in the Free Pascal compiler...
But the first test doesn't seem to use multithreading...
Pierre