[cfarm-users] gcc104: rosetta and disk space concerns

Zach van Rijn me at zv.io
Mon Oct 31 06:14:48 CET 2022


On Sun, 2022-10-30 at 21:43 -0500, Jacob Bachmeyer wrote:
> Zach van Rijn via cfarm-users wrote:
> > On Mon, 2022-10-24 at 17:29 +0200, Pierre Muller wrote:
> >   
> > > ...
> 
> > I've just cleared the cache:
> > 
> > gcc104:~ root# ./rosetta_check 
> > 131G    /System/Volumes/Data/private/var/db/oah/
> > gcc104:~ root# ./rosetta_clear 
> > gcc104:~ root# ./rosetta_check 
> > 4.0K    /System/Volumes/Data/private/var/db/oah/
> >   
> 
> Are those scripts simple enough that you could post them
> here?  I might have some ideas that could be helpful, or I
> might be able to write a quick cleanup tool that will purge any
> cache entries not used recently.

It's just (as a quick temporary measure):

    # Report the total size of the Rosetta translation cache.
    d=/System/Volumes/Data/private/var/db/oah/
    du -sh "$d"

and

    # Remove every top-level cache directory (and the .aot files within).
    d=/System/Volumes/Data/private/var/db/oah/
    find "$d" -mindepth 1 -maxdepth 1 -type d -exec rm -rf {} +

The latter could be easily modified to find files modified more
than a few minutes (or hours) ago, which would probably mitigate
the side effect of killing actively-running tests.
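
A minimal sketch of that modification, assuming stale translations
can be identified by modification time alone (I have not verified
this against a live Rosetta cache; the function name 'purge_stale'
and the 60-minute threshold are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper, not part of the scripts above.
# Usage: purge_stale DIR MINUTES
purge_stale() {
    dir=$1
    minutes=$2
    # Refuse to run if the directory argument is empty or missing.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # Delete regular files last modified more than $minutes minutes ago.
    find "$dir" -mindepth 1 -type f -mmin +"$minutes" -delete
    # Then remove any directories that are now empty (children first).
    find "$dir" -mindepth 1 -type d -empty -delete
}

# Example call (assumed threshold of one hour):
# purge_stale /System/Volumes/Data/private/var/db/oah/ 60
```

Whether 'oahd' copes with an entry disappearing between translation
and load is still an open question, so the threshold should be
chosen conservatively.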

Run via 'cron' it would reduce the maintenance burden on my end,
though I opine that a better solution should be sought before
this one is made to be "good enough" to render a real fix moot.


> If the tests are as I suspect (many binaries built/run once and
> then discarded) then I ask how Rosetta knows when to use a
> cached artifact

From what I can tell, after the 'exec' system call, the 'oahd'
daemon checks for the existence of a corresponding .aot file [0].

If it does not exist, it is created, else it is executed. E.g.:

$ ~/foo
hello
$ pwd && ls -l && file foo.aot 
/System/Volumes/Data/private/var/db/oah/be848028e6c67c55a1e77a09d
95e94553f86b7de4784e7669ac3fb380cd89e32/2d96d9e84db99761782556b30
668cad5ad1c0f5caf31740396f3d5a42b38384b
total 24
-rwxr-xr-x  1 _oahd  _oahd  8424 Oct 30 22:32 foo.aot
foo.aot: Mach-O 64-bit executable arm64


> and could we thereby link the cached artifacts back to their
> original binaries and then quickly remove artifacts that derive
> from binaries that are no longer on the system?  (An hourly
> cron job ought to serve nicely here.)  Purging pre-translated
> "shadow" copies of erased binaries also sounds like something
> Rosetta should do on its own;

Are you suggesting scanning the disk for x86_64 binaries, then
matching them to translated .aot files, and then removing .aot
files which do not have a corresponding parent binary?

That would, in my opinion, be unnecessarily expensive/invasive.

A safer bet would be to remove all cached binaries that are not
currently being translated or loaded. An age threshold of perhaps
24 hours should be safe for sure.

What happens if the cache is cleared while a process is running?

Are Pierre's tests aborting because the translation is deleted
before one is executed or while it is being executed?


> have you tried complaining to Apple about this?

No, but I will reach out to a few contacts there to see if there
is a known workaround since this issue is sure to affect more
than a few people. With Rosetta and SIP enabled, won't your disk
fill up if you don't do this type of maintenance? By design???

I'll write to the list if I hear back from any of my contacts.


ZV


[0]: https://ffri.github.io/ProjectChampollion/part1/ <-- juicy!


