[Scons-dev] proposed improvement to temp file names used by scons cache

Mats Wichmann mats at wichmann.us
Mon Aug 31 16:17:53 EDT 2020


On 8/31/20 1:06 PM, Raven Kopelman wrote:
> Hi there,
> 
> We have a CI build framework configured such that many machines are
> concurrently building and sharing a scons cache.  This cache lives on an
> Amazon EFS filesystem, mounted as NFS.
> 
> In general this has been spectacularly successful, but every once in a
> while corrupted files start coming out of the cache.  Our theory is that
> the EFS + NFS locking guarantees aren't good enough for the SCons temp
> name collision detection algorithm - attached is a patch we are going to
> try running with to see if it improves things.
> 
> In addition to hoping a formalized version of this will be considered
> for SCons, I'm curious if anyone sees a more likely explanation for the
> symptoms described above.
> 
> --- CacheDir.py 2020-08-19 12:59:25.790302000 -0700
> +++ CacheDir.py.uuid 2020-08-19 14:00:29.693749695 -0700
> @@ -32,6 +32,7 @@
>  import os
>  import stat
>  import sys
> +import uuid
> 
>  import SCons.Action
>  import SCons.Warnings
> @@ -100,7 +101,11 @@
> 
>      cd.CacheDebug('CachePush(%s):  pushing to %s\n', t, cachefile)
> 
> -    tempfile = cachefile+'.tmp'+str(os.getpid())
> +    # UUID in case filesystem doesn't support file operations well enough to deal with multiple
> +    # machines sharing a cache and attempting to write the same file at the same time (NFS mount of
> +    # AWS EFS?).
> +    # TODO: Long filename concern on Windows?
> +    tempfile = cachefile+'.tmp'+str(os.getpid()) + '_' + str(uuid.uuid1())
>      errfmt = "Unable to copy %s to cache. Cache file is %s"

There's probably not much reason to keep the getpid() part. A PID is a
pretty weak way to generate a "unique" filename when multiple machines
are in play, since two hosts can easily hand out the same PID.
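
For illustration only, here is a minimal sketch of a temp name that drops
the PID entirely. This is not the SCons implementation: unique_temp_name
and push_to_cache are hypothetical helpers, and uuid4 (purely random) is
swapped in for the uuid1 used in the patch, on the assumption that random
bits are sufficient here. Whether the final rename is atomic on an NFS
mount of EFS is also an assumption that this sketch does not verify.

```python
import os
import shutil
import uuid


def unique_temp_name(cachefile):
    """Return a temp path next to cachefile that does not depend on the PID."""
    # uuid4 is random, so two machines that happen to share a PID
    # still get different names. .hex is 32 characters, slightly
    # shorter than str(uuid), which helps a little with long paths.
    return cachefile + '.tmp' + uuid.uuid4().hex


def push_to_cache(source, cachefile):
    """Hypothetical helper: copy source into the cache via a unique temp file."""
    tmpname = unique_temp_name(cachefile)
    try:
        shutil.copy2(source, tmpname)
        # Move the finished copy into place in a single rename so readers
        # never see a partially written cache file (assumes the filesystem
        # honours rename semantics).
        os.replace(tmpname, cachefile)
    except OSError:
        # Do not leave a stray temp file behind if the copy or rename fails.
        if os.path.exists(tmpname):
            os.remove(tmpname)
        raise
```

With something like this, each writer gets its own temp file regardless of
which host it runs on, and the last rename simply wins.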



