[Scons-dev] Eliding content signature calculation during cache extraction

Andrew C. Morrow andrew.c.morrow at gmail.com
Sat Feb 3 17:52:40 EST 2018


Moving this to the developers list; perhaps it was a little too
internals-focused for the users group.


On Sat, Jun 3, 2017 at 1:50 PM, Andrew C. Morrow <andrew.c.morrow at gmail.com>
wrote:

>
> Currently, if you extract a file from the cache, the content signature for
> that file must be computed. But presumably we knew what the content
> signature was when we placed the object in the cache in the first place.
>
> It seems like it should be possible to write the content signature to the
> cache along with the object itself. Then, when a needed object is found in
> the cache, we can also read back its content signature, and avoid the extra
> overhead of recomputing the content signature.
>
> I'm imagining a system where if an object is stored to the cache as
> $CACHE_DIR/00/00ea7297f00248e292eaca1bba9999c8, then we would also write
> a file $CACHE_DIR/00/00ea7297f00248e292eaca1bba9999c8.csig, containing
> the content signature of the object we just inserted.
>
> Then when we go to retrieve an object from the cache, once we have
> computed the build signature which points us to a specific object, if we
> find a .csig file along with the object itself, we just set the contents of
> that .csig file as the content signature for the object, and elide the
> computation.
>
> I think this would have the potential to significantly speed up extracting
> files from the cache.
>
> One nice feature of this would be that I think it wouldn't require an
> upgrade to the existing cache schema, since if you don't find the .csig
> file, you just compute the content signature as you always would have.
> Similarly, if you have a .csig file, but the version of SCons you are using
> is unaware of the feature, it will ignore it, and again compute the
> signature.
>
> The only thing that seems somewhat risky to me is that SCons would now be
> reading and writing two files rather than one for each object, so perhaps
> there are some issues around atomicity and consistency with multiple
> consumers and producers using the same cache? I'd think that is probably
> tractable though, with some thought and careful coding.
>
> Does this seem like a useful feature? Does the suggested implementation
> seem plausible? Any risks or adverse consequences?
>
> Thanks,
> Andrew
>
>

I've pushed a repo that demonstrates the issue I'd like to solve:

https://github.com/acmorrow/scons-cache-csig-demo

The repo sets up a little C++ project that uses the MD5-Timestamp decider
and MaxDrift(1) to minimize signature computations. It also configures a
CacheDir, and patches in a signature computation function that logs.
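
As a rough illustration of what "patches in a signature computation function
that logs" can look like, here is a minimal sketch. It is not the code from
the demo repo: StandInFile and log_csig are hypothetical names, and the
stand-in class only mimics a node with a get_csig method. Unlike a real
patch, this wrapper logs on every call rather than only when a signature is
actually computed.

```python
import functools
import hashlib


def log_csig(method):
    """Wrap a get_csig-style method so that each call is logged."""
    @functools.wraps(method)
    def wrapper(self):
        csig = method(self)
        print("Computed csig for %s %s" % (self, csig))
        return csig
    return wrapper


class StandInFile:
    """Hypothetical stand-in for a file node; not an SCons class."""

    def __init__(self, path, contents):
        self.path = path
        self.contents = contents

    def __str__(self):
        return self.path

    def get_csig(self):
        # MD5 of the contents, the same kind of digest shown in the logs.
        return hashlib.md5(self.contents).hexdigest()


# Patch the class in place, the way an SConstruct could patch a node class.
StandInFile.get_csig = log_csig(StandInFile.get_csig)

node = StandInFile("hello_world.cpp", b"int main() { return 0; }\n")
digest = node.get_csig()
```

In an actual SConstruct the same wrapper would presumably be applied to the
node class's get_csig method rather than to a stand-in.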

When you build it initially, you get output like this:

$ scons --implicit-cache
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: variant
Computed csig for hello_world.cpp f171722740523b90468ffaf11d971c38
Computed csig for /usr/bin/g++ a40c4fb32d258fcf4c8d3bc08e19911a
g++ -o variant/hello_world.o -c hello_world.cpp
Computed csig for variant/hello_world.o 60bf0dd78f95741711096157b4fab395
g++ -o variant/hello_world variant/hello_world.o
Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c
scons: done building targets.


All well and good. Now, make a simple but meaningful edit to
hello_world.cpp (I like to add a space in the "Hello World" string), and
rebuild:

$ scons --implicit-cache
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: variant
Computed csig for hello_world.cpp cdbe23ce39e496da7eb619daa26705ba
g++ -o variant/hello_world.o -c hello_world.cpp
Computed csig for variant/hello_world.o 7588f63cd96e0c4319ea68fbc631e641
g++ -o variant/hello_world variant/hello_world.o
Computed csig for variant/hello_world 9799cc026834284de30f85142f5a1bf1
scons: done building targets.


Again, this all makes sense. We need to compute the signatures for everything
again. Now, restore hello_world.cpp to its original form, and rebuild:

$ scons --implicit-cache
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: variant
Computed csig for hello_world.cpp f171722740523b90468ffaf11d971c38
Retrieved `variant/hello_world.o' from cache
Computed csig for variant/hello_world.o 60bf0dd78f95741711096157b4fab395
Retrieved `variant/hello_world' from cache
Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c
scons: done building targets.

The good news is that the objects come back from the cache, but the bad
news is that we need to recompute their signatures. That shouldn't really
be necessary - we could have easily stashed the signatures in the cache,
along with the files, when we pushed them. We definitely knew the
signatures at cache push time, after all.
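
To make the idea concrete, here is a minimal standalone sketch of the sidecar
scheme, written outside of SCons. The function names (cache_push,
cache_retrieve, _atomic_write) are hypothetical and are not part of
CacheDir.py. It assumes the layout described in the quoted message: the
object lives at <cache>/<first two characters of bsig>/<bsig>, with the
signature next to it in <bsig>.csig. Each file is written to a temporary
name and renamed into place, which is one possible way to approach the
atomicity concern.

```python
import hashlib
import os
import tempfile


def _atomic_write(path, data):
    """Write bytes to a temp file in the same directory, then rename it."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def cache_push(cache_dir, bsig, contents, csig):
    """Store an object under its build signature, plus a .csig sidecar."""
    subdir = os.path.join(cache_dir, bsig[:2])
    os.makedirs(subdir, exist_ok=True)
    obj = os.path.join(subdir, bsig)
    _atomic_write(obj, contents)
    _atomic_write(obj + ".csig", csig.encode("ascii"))


def cache_retrieve(cache_dir, bsig):
    """Return (contents, csig, csig_was_cached), or None on a cache miss."""
    obj = os.path.join(cache_dir, bsig[:2], bsig)
    try:
        with open(obj, "rb") as f:
            contents = f.read()
    except FileNotFoundError:
        return None
    try:
        with open(obj + ".csig", "r", encoding="ascii") as f:
            return contents, f.read().strip(), True
    except FileNotFoundError:
        # Entry without a sidecar: fall back to computing the signature.
        return contents, hashlib.md5(contents).hexdigest(), False


cache = tempfile.mkdtemp()
data = b"object file bytes"
bsig = "00ea7297f00248e292eaca1bba9999c8"
cache_push(cache, bsig, data, hashlib.md5(data).hexdigest())
hit = cache_retrieve(cache, bsig)
```

Note that the two renames are each atomic but not atomic as a pair, so a
reader can still observe an object whose sidecar belongs to a different
writer's push. Whether that matters depends on whether two builds with the
same build signature can produce different contents.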

I started working on implementing just that, and a preliminary version is
posted here: https://github.com/acmorrow/scons/tree/cache-csig

However, I haven't quite been able to get it to work, and my efforts to
debug it have just led me to a confusing conclusion. I'm hoping someone
can shed some light.

Starting clean again, the first issue is a puzzle. It seems like we compute
the signature an extra time:

scons.py --implicit-cache --cache-debug=-
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: variant
Computed csig for hello_world.cpp f171722740523b90468ffaf11d971c38
Computed csig for /usr/bin/g++ a40c4fb32d258fcf4c8d3bc08e19911a
CacheRetrieve(variant/hello_world.o):  89a00ced672ec25c4317d808f220f20b not in cache
g++ -o variant/hello_world.o -c hello_world.cpp
CachePush(variant/hello_world.o):  pushing to 89a00ced672ec25c4317d808f220f20b
Computed csig for variant/hello_world.o 60bf0dd78f95741711096157b4fab395
CachePush(variant/hello_world.o):  pushing signature 60bf0dd78f95741711096157b4fab395
Computed csig for variant/hello_world.o 60bf0dd78f95741711096157b4fab395
CacheRetrieve(variant/hello_world):  e3599414ec8697fc5bfc89dcd6d3c72a not in cache
g++ -o variant/hello_world variant/hello_world.o
CachePush(variant/hello_world):  pushing to e3599414ec8697fc5bfc89dcd6d3c72a
*Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c*
CachePush(variant/hello_world):  pushing signature fefbbbb259bcf3eec74e8549f513a30c
*Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c*
scons: done building targets.

The first bolded line shows us computing the csig so that we can push the
.csig file into the cache. Since we have already called get_csig on the
node, I would have expected the memoized value to be re-used. But something
wants that signature later on, and finds that it needs to recompute it. Why
is that? I suspect that somewhere the signature info in the node's ninfo is
getting dropped, but I'm not sure when or why. It seems silly to drop it if
we are just going to need it again later.

The second, and I suspect related, issue is that I'm not sure where to
attach the signature information I've read out of the cache. You can see
the logic here:

https://github.com/acmorrow/scons/blob/638e8fae96be9a6f448411fb6699d8fd92062975/src/engine/SCons/CacheDir.py#L72

If we start moving the state of hello_world.cpp back and forth, we can see
that eventually, we do start getting "hits" on the cached csig file:

scons.py --implicit-cache --cache-debug=-
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: variant
Computed csig for hello_world.cpp f171722740523b90468ffaf11d971c38
Retrieved `variant/hello_world.o' from cache
CacheRetrieve(variant/hello_world.o):  retrieving from 89a00ced672ec25c4317d808f220f20b
CacheRetrieve(variant/hello_world.o):  cached signature hit 60bf0dd78f95741711096157b4fab395
Computed csig for variant/hello_world.o 60bf0dd78f95741711096157b4fab395
Retrieved `variant/hello_world' from cache
CacheRetrieve(variant/hello_world):  retrieving from e3599414ec8697fc5bfc89dcd6d3c72a
CacheRetrieve(variant/hello_world):  cached signature hit fefbbbb259bcf3eec74e8549f513a30c
Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c
scons: done building targets.

Now it isn't a surprise that we also again computed the csig for the
binary: after all, we haven't wired the data we read out of the cache
anywhere. But, unfortunately, doing the obvious thing of setting

            t.get_ninfo().csig = cached_csig

in my modified CacheDir.py doesn't have the intended effect - we still end
up computing the csig for things we pull out of the cache.

So, my first question is: should setting t.get_ninfo().csig, as I am
attempting to do, work? Or is there another (less ephemeral?) place I should
be setting it after pulling the signature out of the cache?

Having gotten this far, I got curious about where it was that the csig in
the ninfo was being reset, so I added a stack traceback immediately after
my log in my patched get_csig. The result is interesting:

Computed csig for variant/hello_world fefbbbb259bcf3eec74e8549f513a30c
  File "../../../oss/scons/src/script/scons.py", line 201, in <module>
    SCons.Script.main()
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Script/Main.py", line 1361, in main
    _exec_main(parser, values)
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Script/Main.py", line 1324, in _exec_main
    _main(parser)
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Script/Main.py", line 1103, in _main
    nodes = _build_targets(fs, options, targets, target_top)
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Script/Main.py", line 1298, in _build_targets
    jobs.run(postfunc = jobs_postfunc)
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Job.py", line 111, in run
    self.job.start()
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Job.py", line 216, in start
    task.executed()
  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Script/Main.py", line 237, in executed
    SCons.Taskmaster.OutOfDateTask.executed(self)
*  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Taskmaster.py", line 313, in executed_with_callbacks*
*    t.built()*
*  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Node/FS.py", line 3197, in built*
*    SCons.Node.Node.built(self)*
*  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Node/__init__.py", line 772, in built*
*    self.ninfo.update(self)*
*  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/oss/scons/src/engine/SCons/Node/__init__.py", line 369, in update*
*    setattr(self, f, func())*
*  File "/Users/acm/Documents/from-retina/Documents/10gen/dev/src/experiments/scons-csig-cache/SConstruct", line 43, in monkey_get_csig*
*    traceback.print_stack()*

The interesting part is in bold. Apparently, after the node is built, we
ask the ninfo to update, which discards all state and immediately
re-computes it, including the csig.

I'd been assuming that the reason we computed a signature after extracting
something from the cache was because some other subsystem required it. But
apparently that is not the case. Instead, it seems that we are discarding
and then re-creating the signature unconditionally. If I turn off the
"re-create" part of the NodeInfoBase.update method, then I don't see the
signature being recomputed!
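
The behaviour described above can be reproduced in a toy model. This is not
the SCons source. ToyInfo and ToyNode are hypothetical classes that assume
only what the traceback suggests: update() drops each field and then calls
the node's get_<field> method again. The update_keep_existing variant is one
way of picturing "turning off the re-create part"; it is an assumption for
illustration, not a proposed patch.

```python
import hashlib


class ToyInfo:
    """Toy stand-in for a node-info record; field_list names its fields."""
    field_list = ["csig"]

    def update(self, node):
        # Drop each field, then unconditionally ask the node for it again.
        for f in self.field_list:
            if hasattr(self, f):
                delattr(self, f)
            setattr(self, f, getattr(node, "get_" + f)())

    def update_keep_existing(self, node):
        # Variant: only fill in fields that are not already set.
        for f in self.field_list:
            if not hasattr(self, f):
                setattr(self, f, getattr(node, "get_" + f)())


class ToyNode:
    def __init__(self, contents):
        self.contents = contents
        self.ninfo = ToyInfo()
        self.computations = 0

    def get_csig(self):
        # Memoized in ninfo: recompute only when ninfo has no csig.
        try:
            return self.ninfo.csig
        except AttributeError:
            pass
        self.computations += 1
        self.ninfo.csig = hashlib.md5(self.contents).hexdigest()
        return self.ninfo.csig


# A signature stored before update() is thrown away and recomputed.
node = ToyNode(b"hello")
node.ninfo.csig = "value-read-from-sidecar"
node.ninfo.update(node)
after_update = (node.ninfo.csig, node.computations)

# With the keep-existing variant the stored signature survives.
other = ToyNode(b"hello")
other.ninfo.csig = "value-read-from-sidecar"
other.ninfo.update_keep_existing(other)
after_keep = (other.ninfo.csig, other.computations)
```

In this model, a csig assigned before update() runs is always lost, which
would be consistent with the assignment in the modified CacheDir.py having no
visible effect.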

Can anyone offer some insight into why it throws away all of the NodeInfo
after building the node? And, given that it appears that nothing *does*
need the signature again after pulling from the cache, since it never gets
called again even when unset, is there still utility to caching the
signatures as I'm attempting to do? The extra signature computation had
convinced me that something needed it, but maybe not?

Thanks,
Andrew