[Scons-dev] Proposal for a more efficient version of the Parallel scheduler

Bill Deegan bill at baddogconsulting.com
Wed Dec 7 14:25:52 EST 2022


Adam,

We just pushed a new parallel job implementation by Andrew Morrow which
might be worth trying in your environment.
It's currently an experimental feature, enabled with --experimental=tm_v2 on
the command line or SetOption('experimental', 'tm_v2') in an SConscript file.

-Bill

On Tue, Jul 9, 2019 at 11:35 AM Adam Gross via Scons-dev <
scons-dev at scons.org> wrote:

> In VMware builds, the sheer number of tasks (33000 leaf tasks in the first
> iteration of ESX builds, for example) means that Parallel.start can take up
> to 20 minutes simply collecting and preparing tasks over the course of a
> build. I started a project to look at batch task handling in order to
> support remote caching (e.g. asking the cache for 1000 nodes at a time
> instead of 1) and realized that the approach that I want to take actually
> makes normal builds more efficient as well. In this e-mail, I’d like to
> explain my proposal so I can get your thoughts on it.
>
>
>
> --- Current Parallel.start performance problems ---
>
>
>
> Reference:
> https://github.com/SCons/scons/blob/master/src/engine/SCons/Job.py#L369
>
>
>
> In the current implementation, SCons collects just enough tasks to
> dispatch to the thread pool such that the number of active jobs is equal to
> the max number of jobs. It then waits for at least one job to be done,
> gathers all finished jobs, then repeats the process of collecting enough
> tasks to have jobs==self.maxjobs.
>
>
>
> Waiting on at least one job to be done misses an opportunity to keep
> calling taskmaster.next_task() and task.prepare() while jobs are active.
> These calls are not cheap for many reasons, including that they initiate
> scanning of source nodes.
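
The fill-then-wait loop described above can be sketched roughly as follows. This is a toy model, not the code in Job.py: ListTaskmaster, run_fill_then_wait, and the execute callback are hypothetical names, and the standard library's concurrent.futures stands in for SCons' own thread pool. The point to notice is that the main thread sits idle in wait() whenever all job slots are full.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


class ListTaskmaster:
    """Hypothetical stand-in for the taskmaster: hands out tasks until empty."""

    def __init__(self, tasks):
        self._tasks = list(tasks)

    def next_task(self):
        # Returns None when no task is available, as described in the proposal.
        return self._tasks.pop(0) if self._tasks else None


def run_fill_then_wait(taskmaster, maxjobs, execute):
    """Collect just enough tasks to fill the job slots, then block."""
    finished = []
    active = set()  # futures for jobs currently running
    with ThreadPoolExecutor(max_workers=maxjobs) as pool:
        while True:
            # Collect tasks only until jobs == maxjobs.
            while len(active) < maxjobs:
                task = taskmaster.next_task()
                if task is None:
                    break
                active.add(pool.submit(execute, task))
            if not active:
                break  # no tasks left and no jobs running
            # Block until at least one job is done, then gather every job
            # that has finished. No next_task() calls happen while waiting.
            done, active = wait(active, return_when=FIRST_COMPLETED)
            finished.extend(f.result() for f in done)
    return finished
```

In this sketch, any per-task preparation cost would be paid only after a slot frees up, which is the missed opportunity the paragraph above points out.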
>
>
>
> --- Proposal ---
>
>
>
> A first rough draft is contained in draft pull request
> https://github.com/SCons/scons/pull/3404 . In this form it is an
> alternative child class of Parallel; it could simply replace Parallel if
> people felt strongly.
>
>
>
> I would like to implement an alternative to the Parallel class that only
> waits for jobs to complete if there are no tasks left (i.e.
> taskmaster.next_task() returns None). It gives priority to keeping
> jobs==self.maxjobs, but otherwise keeps looking for more tasks. If
> there are no more tasks left, it waits for a job to complete and then
> rechecks whether there are any tasks left, in case other tasks were
> unblocked by its completion.
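
A minimal sketch of the proposed loop, under the same toy assumptions: this is not the code from the pull request. DepTaskmaster, mark_done, run_eager, and the prepare/execute callbacks are hypothetical names chosen for illustration. The loop keeps the job slots full first, keeps pulling and preparing tasks while jobs run, and blocks only when next_task() returns None.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


class DepTaskmaster:
    """Hypothetical taskmaster: a task is available once its deps are done."""

    def __init__(self, deps):
        self._deps = {name: set(d) for name, d in deps.items()}
        self._handed_out = set()
        self._done = set()

    def next_task(self):
        for name, needed in self._deps.items():
            if name not in self._handed_out and needed <= self._done:
                self._handed_out.add(name)
                return name
        return None  # nothing available right now

    def mark_done(self, name):
        self._done.add(name)


def run_eager(taskmaster, maxjobs, prepare, execute):
    """Keep collecting and preparing tasks while jobs are active."""
    ready = deque()  # prepared tasks waiting for a free job slot
    active = {}      # future -> task name
    completed = []
    with ThreadPoolExecutor(max_workers=maxjobs) as pool:
        while True:
            # First priority: keep jobs == maxjobs.
            while ready and len(active) < maxjobs:
                task = ready.popleft()
                active[pool.submit(execute, task)] = task
            task = taskmaster.next_task()
            if task is not None:
                prepare(task)   # done while other jobs are running
                ready.append(task)
                timeout = 0     # just poll for finished jobs, do not block
            elif not active and not ready:
                break           # nothing running, nothing left to run
            else:
                timeout = None  # no tasks left: wait for a job to complete
            if active:
                done, _ = wait(list(active), timeout=timeout,
                               return_when=FIRST_COMPLETED)
                for fut in done:
                    name = active.pop(fut)
                    fut.result()  # re-raise any failure from the job
                    # Completion may unblock tasks, so the next iteration
                    # rechecks next_task().
                    taskmaster.mark_done(name)
                    completed.append(name)
    return completed
```

One design choice to note: the ready queue here is unbounded, so on a very large build the sketch would prepare every available task up front. Whether and how to cap that is left open.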
>
>
>
> One very useful side effect is that this class will be collecting lists of
> tasks instead of operating on one at a time, so it serves as a useful
> building block towards remote caching. The current one-at-a-time cache
> retrieval approach wouldn’t work for remote caching due to network latency
> but this approach can.
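
To illustrate why collecting lists of tasks helps with network latency, here is a small sketch of batched cache lookups. FakeRemoteCache, lookup_many, and fetch_cached are hypothetical names, not part of SCons or of the pull request; the batch size of 1000 is taken from the example figure in the e-mail.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


class FakeRemoteCache:
    """Hypothetical remote cache that counts network round trips."""

    def __init__(self, stored):
        self.stored = set(stored)
        self.round_trips = 0

    def lookup_many(self, keys):
        # One simulated request answers for a whole batch of keys.
        self.round_trips += 1
        return {k for k in keys if k in self.stored}


def fetch_cached(cache, keys, batch_size=1000):
    """Ask the cache about `keys` in batches instead of one at a time."""
    hits = set()
    for chunk in batched(keys, batch_size):
        hits |= cache.lookup_many(chunk)
    return hits
```

With 2500 keys this makes 3 requests rather than 2500, so the per-request latency is paid once per batch instead of once per node.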
>
>
>
> Please let me know what you think either over e-mail or on the
> aforementioned pull request.
>
>
>
> Thanks,
>
> Adam Gross