[Scons-dev] RFC: Proposal for remote caching

Mats Wichmann mats at wichmann.us
Tue Dec 10 10:50:50 EST 2019


On 12/10/19 6:50 AM, Adam Gross via Scons-dev wrote:
> The random Python comparisons I have seen online suggest that, with
> Python 3, SHA-1 is faster than MD5. With Python 2 I don’t think that
> was the case. Remote caching wouldn’t come earlier than 4.0, so it’s
> reasonable to assume py3 anyway.
> 
> I am not sure about SHA-256.

SHA-256 is a bit slower, roughly 25% behind md5/sha1 in the numbers
below. Here are some random relative timings from a Linux host (three
runs per algorithm, lower is faster), excluding the sha3* family (which
are all slower):

==== md5        [4.895055413013324, 4.794695725897327, 4.788602610118687]
==== sha1       [4.8523716980125755, 4.790286544011906, 4.779321678215638]
==== sha256     [5.980340243084356, 6.025372178060934, 5.983012914890423]
==== sha512     [7.033806097926572, 7.030697694979608, 7.019461386138573]
==== blake2b_20 [4.735298512969166, 4.768527464941144, 4.754039308987558]
==== blake2b    [4.883447706932202, 4.856357607059181, 4.866529860068113]
==== blake2s_20 [4.393687708070502, 4.392260011984035, 4.4033572028856725]
==== blake2s    [4.266355097061023, 4.233708509942517, 4.228863182011992]
==== sha224     [5.833340571029112, 5.849389663897455, 5.8033752529881895]
==== sha384     [6.833396520931274, 6.844970899866894, 6.844409989891574]
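
For anyone who wants to produce comparable numbers, a minimal sketch
along these lines would do. It is not the exact script behind the table
above: the 1 MiB buffer, the loop counts, the helper names, and the
reading of the "_20" rows as blake2 with a 20-byte digest are all
assumptions.

```python
import hashlib
import timeit

# Assumed workload: hash the same 1 MiB buffer over and over.
DATA = b"\0" * (1024 * 1024)

# Name -> zero-argument factory returning a fresh hash object.
ALGORITHMS = {
    "md5": hashlib.md5,
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
    # Assumption: the "_20" rows mean a 20-byte (sha1-sized) digest.
    "blake2b_20": lambda: hashlib.blake2b(digest_size=20),
    "blake2b": hashlib.blake2b,
    "blake2s_20": lambda: hashlib.blake2s(digest_size=20),
    "blake2s": hashlib.blake2s,
    "sha224": hashlib.sha224,
    "sha384": hashlib.sha384,
}


def hash_once(factory, data=DATA):
    """Hash the buffer once and return the hex digest."""
    h = factory()
    h.update(data)
    return h.hexdigest()


def time_algorithms(number=200, repeat=3):
    """Return {name: [seconds, ...]} with one entry per repeat."""
    results = {}
    for name, factory in ALGORITHMS.items():
        try:
            factory()
        except ValueError:
            # Some hosts disable an algorithm outright (md5 under FIPS, say).
            continue
        results[name] = timeit.repeat(
            lambda f=factory: hash_once(f), number=number, repeat=repeat
        )
    return results


if __name__ == "__main__":
    for name, timings in time_algorithms().items():
        print("==== %-10s %s" % (name, timings))
```

The absolute values depend entirely on the buffer size and loop count,
so only the relative ordering is worth comparing between hosts.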


On a Windows host the timings fall into roughly the same ranking,
although blake2s wins by more over md5/sha1 (it's more like 40% faster
there, versus about 12% faster on Linux).

This isn't particularly important, I'm just following up.

It just seems odd to me to add md5 support anywhere that doesn't have 
it, when you usually get flamed by someone when you mention md5 these 
days :)


> 
> >> Also there needs to be a reasonable solution to (de)serializing
> >> which hash is used to sconsign.
> 
> Doing it based on length would be fine. The Bazel remote cache change we 
> are going to upstream (adding support for MD5 and SHA-1) does that 
> because the maintainer preferred its simplicity.
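
To make the length-based idea concrete, here is a minimal sketch,
assuming only md5, sha1 and sha256 are in play. The table and function
names are hypothetical and are not taken from SCons or from the Bazel
remote cache change.

```python
import hashlib
import string

# Hex digest length -> algorithm, for the three hashes under discussion.
HEX_LENGTH_TO_ALGORITHM = {32: "md5", 40: "sha1", 64: "sha256"}


def guess_algorithm(hexdigest):
    """Infer the hash algorithm from the length of a hex digest."""
    if not set(hexdigest) <= set(string.hexdigits):
        raise ValueError("not a hex digest: %r" % (hexdigest,))
    try:
        return HEX_LENGTH_TO_ALGORITHM[len(hexdigest)]
    except KeyError:
        raise ValueError(
            "unrecognized digest length: %d" % len(hexdigest)
        ) from None


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        digest = hashlib.new(name, b"hello").hexdigest()
        print(name, "->", guess_algorithm(digest))
```

This only works while every supported algorithm has a distinct digest
length. A default blake2s digest is 64 hex characters, the same as
sha256, and a 20-byte blake2 digest is the same length as sha1, so
adding any of those would need an explicit marker instead.
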
> 
> *From:* Scons-dev <scons-dev-bounces at scons.org> *On Behalf Of *Bill Deegan
> *Sent:* Monday, December 9, 2019 6:51 PM
> *To:* SCons developer list <scons-dev at scons.org>
> *Subject:* Re: [Scons-dev] RFC: Proposal for remote caching
> 
> re switching hashes.
> 
> Do we have any perf comparisons for MD5 vs SHA-256 in general and in SCons?
> 
> While I think adding SHA-256 has value, I'd be hesitant to make it the 
> default and/or remove MD5.
> 
> Also there needs to be a reasonable solution to (de)serializing which 
> hash is used to sconsign.
> 
> On Mon, Dec 9, 2019 at 1:34 PM Mats Wichmann <mats at wichmann.us> wrote:
> 
> 
>      > 2.1.3 Changes Needed To Bazel Remote Cache Server
>      >
>      > Currently the Bazel remote cache server only supports SHA-256 for
>      > requests (e.g. GET http://bazel-cache.corp.int/cache/ac/<sha_256_hash>),
>      > while SCons by default uses MD5. As part of this project, VMware
>      > will be contributing code to the upstream Bazel remote cache server
>      > project to support MD5 and SHA-1. We have received confirmation from
>      > the project maintainer that (1) it is acceptable to do this and (2)
>      > no prefix is needed for these alternative hashing formats. As a
>      > result, the requests SCons would make would be of the form
>      > http://bazel-cache.corp.int/cache/ac/<md5_hash>
>      > or http://bazel-cache.corp.int/cache/ac/<sha1_hash>.
>      > As mentioned before, see the Threat Modeling section at the end of
>      > this page for more discussion on hash formats.
> 
>     I'm not sure we should push md5 any further.  While it's not
>     intended to be used for security purposes (and yes, I read the
>     section on that), we've already run into users who are not allowed
>     to use it no matter what (there's a pending patch to fall back to
>     sha1 to address one of those users' concerns)... and there are fast
>     algorithms in the SHA-2 family, as well as ones that didn't quite
>     make the SHA-3 choice (namely, Blake), which are quite fast on
>     Python.  It may be time to transition?
> 
>     _______________________________________________
>     Scons-dev mailing list
>     Scons-dev at scons.org
>     https://pairlist2.pair.net/mailman/listinfo/scons-dev
> 
> 


