[Scons-dev] Toolchain requirements

Kenny, Jason L jason.l.kenny at intel.com
Tue May 27 11:59:01 EDT 2014


Let me add some thoughts I have, based on work trying to address this in my add-on.

First off, we need to clarify terminology. The one really annoying thing I have with SCons is the current language around tool/toolchain/library/platform.

I would like to suggest a few distinctions that I have found useful.


·        Toolchain – this is a collection of information about the tools to be used.

·        Tool – this is the logic for setting up a command-line based tool. This means:

o   Setting up the environment in which the tool will run

o   Setting up any variables, with possible default values, that are needed to set up the command line (stuff like CC or CCFLAGS, etc., depending on the tool)

o   Setting up the builder

·        Library – I have not addressed this in Parts. But the idea is solid, and I have found that calling it a tool causes confusion; after some thought I believe it is different enough from a tool that it needs to be viewed as such. The main notion to address here is that a library does not set up a tool, but configures a library to be used by a tool. This includes environment setup, variables to be expanded, and appending values to other variables so it is configured correctly for a given tool. We need to separate stuff like Qt, which has tools in it, from the library part. Another example might be the Intel compiler, which ships with MKL, TBB, and other libraries. You don’t always want to use these libraries, but when you do it would be nice to make them easy to set up, without just assuming we want everything.

·        Platforms – This is really a base notion that is needed to make tools work better. Some tools can work without it, but most really need it to help with finding the tool on the system and to support cross builds. It is also generally useful for the user, who needs to state what special flags or values to add when building on a given system. For me the bare minimum we need here is to define OS and Architecture (general CPU type). What I believe would be useful to add to such an object is CPU (the more exact CPU type, useful for adding special flags when targeting optimizations for certain platforms) and some more detail about the OS. Currently we only support posix/Darwin/sunos/windows. This is fine for lots of people, but at times posix is too general; separating out BSD, or Fedora from Ubuntu, can be useful. This also needs to be extendable. I found that very helpful when supporting Android and Intel PHI (K1OM) systems; not having it makes it hard to deliver the value of SCons to users on new or emerging platforms. We need to have two different platforms defined at a time: the Host platform (what we are building on) and the Target (what we are building for, which is often the same as the host). I use a value of “any” to deal with platform wildcards for cases when we don’t care about the value. What I have is defined in the Parts user guide on the Parts site. We don’t need, and I would suggest against, doing what autoconf does, which is to define build, host, and target. In that scheme build is what I call host, and host is what I call target; target in autoconf defines what the tool you are going to build will output for. That is only useful for compilers, and honestly, working with compilers there is a much better and easier way to control this. Most people understand host as the system we are on and target as what we build for. We need to make this easy for normal developers to use and understand. (A minimal sketch of such a platform object follows this list.)

·        Tool setup – In Parts I call this ToolSetting/ToolInfo, and they work with some objects called finders. For me this object is infrastructure that makes it much easier to define how to find a given version of a tool and what environment to set up for it. It also helps define default error messages and lets us query what versions of a tool exist for some target platform. The system also gets version information about the tool, which is very useful when you have base tool requirements. While my version in Parts could use a little more cleanup, the basic design allows us to modify it behind the covers for better caching and startup behavior. This is all documented in the Parts user guide.
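To make the platform notion above concrete, here is a minimal Python sketch. It is illustrative only, not the actual Parts or SCons API; the PlatformInfo name, its fields, and the matches() helper are assumptions I am making for the example.

    # Illustrative sketch only -- not the actual Parts API.
    class PlatformInfo(object):
        """Describes one platform as OS + architecture, with optional finer detail."""

        def __init__(self, os_name, arch, cpu=None, distro=None):
            self.os = os_name        # e.g. 'posix', 'win32', 'darwin', 'android'
            self.arch = arch         # general CPU type, e.g. 'x86_64', 'arm'
            self.cpu = cpu           # optional exact CPU, for optimization flags
            self.distro = distro     # optional OS refinement, e.g. 'ubuntu', 'bsd'

        def matches(self, other):
            """Compare OS and architecture; 'any' acts as a wildcard on either side."""
            def field_ok(a, b):
                return 'any' in (a, b) or a == b
            return field_ok(self.os, other.os) and field_ok(self.arch, other.arch)


    # Two platforms are always in play: what we build on and what we build for.
    host = PlatformInfo('posix', 'x86_64', distro='ubuntu')
    target = PlatformInfo('android', 'arm')       # a cross build
    native = host.matches(target)                 # False here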

With that stated, let me share a few other experiences.

Tools:


·        It works best when we can separate tools into independent vs. dependent ones. For example, the Intel compiler is a tool that requires the GCC/G++/Clang or MSVC tools to be set up in order to run correctly. The version of that tool needs to be configured independently of the Intel tool. It is a lot easier to have the intelc tool assume that the toolchain will be set up correctly than to have the intelc tool try to configure a dependent tool. Another case of this I have seen a lot on Linux, with the GNU compiler, is the version of the autotools being used. While normally this is built in when building your own compiler, on many system setups I have seen, users build on an old Linux system and need to make a build use a different autotools version than the “default” for some technical reason. The main point I have learned is that there is never a reason to have one tool set up another tool. Always have a tool set up what it needs to get its own piece working, and let the “toolchain” logic deal with getting all the different tools defined.

·        The current tool logic in SCons requires that every tool define a generate(env,**kw) and an exists(env,**kw) function. I talked long and hard with Steve Knight years ago about this. I was originally in the camp that wanted to add more stuff, as Gary had suggested, as defined arguments to these functions. I have since come around to Steve Knight’s view, which is that the environment should hold these values. There are a few reasons why this is better (a sketch of a tool module along these lines follows this list):

o   Simpler API

o   Transparent flow for the user. It is easy to see which values were used to set up the tool in the environment versus what the system set up on its own. This really helps debug issues the user has locally and makes SCons seem less of a magic black box.

o   All the values stated above are easy to pass in an environment object. If anything, we should standardize on certain variable forms to normalize common setup cases.

·        Tools should always set up a full path to the tool. I found this to be important as the toolchain gets more complex, more so on posix systems, where it is easy to get a PATH mix-up that redirects a tool to a different one than was expected. This basically means that stuff like CC, when expanded, does not expand to gcc but to /opt/gcc-4.1/bin/gcc or /usr/bin/gcc, depending on which should be used.

·        Variables such as CC or AS also need to be testable for type. I looked at other ways to deal with this, and made a toolvar object to handle it (look at parts.tools.Common.toolvar). What this allows, which is very important, is the ability to test whether a certain toolchain is being used. For example, if icc is being used, one might want to add a special option that would not work with gcc, or maybe add a special CPPDEFINE if the Microsoft compiler is being used vs. MinGW. The ability to test which toolchain and version of a tool is being used is important. What I did solves the problem without breaking existing build files, while allowing for safer full paths to the tools being used when doing a build (which increases the reliability of the build a great deal).

·        All tools need to have input notions to control the setup of the tool correctly. (This can make it easy to have a ToolSetup object for easy addition of new tools; a sketch of such an object also follows this list.) These are:

o   Version. Most tools have different versions that can be installed at the same time; some tools don’t really care, or normally only allow one version to be installed. In either case, being able to state that you need a base level of a tool is important.

o   What host platform they can run on. There is little point in a tool testing to see if it exists if it cannot run on Linux or Windows. This is also needed to allow for more efficient setup.

o   What target platform. Many tools such as compilers have cross-build configurations installed. We need this to easily manage what we are setting up and to test whether we can do what is requested. On some systems, like Android, you really can’t build natively on the system itself; in those cases you are always defining some sort of cross build. In the Android NDK case it is x86/x86_64/ARM/MIPS, and these details are important, as the tools have very different setups depending on the version of the NDK being used. Having this also makes cross building a breeze for other tools. Some tools, such as documentation tools, don’t really care as much here, as they output text data in some format; in those cases they just ignore this value.

o   Install root, which as an input is a place to look for the tool, and as an output is the place the tool was found. Normally this is what the ToolSetting object will fill in.

o   Use script, which says one of three things:

§  Use the provided script to set up the environment. Would be set to the string path of the script to run.

§  Use the tool’s default script. Would be set to true in this case; the ToolSetting object would have metadata on what to call.

§  Use some predefined default, i.e. don’t use a script (because it does not exist or we don’t want to use it). Would be set to false.

o   Use_env – I have to add this back into Parts again. This is basically the punt to use the shell environment. Normally I have found this to be a bad, and abused, thing. There is however one very good use case for it: when a user is setting up a new tool version or a new platform that SCons does not know about yet, this can be an easy way to bootstrap the build and get it working. It is also useful for a tool writer to test and tweak a tool by validating what is needed to get something to work.
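To tie the points above together, here is a rough sketch of what a tool module in this style could look like. The generate(env,**kw)/exists(env,**kw) layout is the existing SCons tool interface; everything else (the GCC_INSTALL_ROOT and GCC_VERSION variable names and the ToolVar class) is an illustrative assumption of mine, not the real parts.tools.Common.toolvar.

    # Sketch of a tool module in the style described above. The variable names
    # GCC_INSTALL_ROOT / GCC_VERSION and the ToolVar class are illustrative
    # assumptions, not existing SCons or Parts APIs.
    import os


    class ToolVar(str):
        """A string that still expands like $CC but carries tool metadata we can test."""
        def __new__(cls, path, tool=None, version=None):
            obj = str.__new__(cls, path)
            obj.tool = tool          # e.g. 'gcc', 'icc', 'msvc'
            obj.version = version    # e.g. '4.8.2'
            return obj


    def exists(env, **kw):
        """Report whether the tool can be found, using values already in the environment."""
        root = env.get('GCC_INSTALL_ROOT', '/usr')
        return os.path.isfile(os.path.join(root, 'bin', 'gcc'))


    def generate(env, **kw):
        """Set up variables (builders omitted); setup inputs come from env, not **kw."""
        root = env.get('GCC_INSTALL_ROOT', '/usr')
        version = env.get('GCC_VERSION', None)
        # Record the full path, so a PATH mix-up cannot silently redirect the
        # build to a different compiler than the one that was configured.
        env['CC'] = ToolVar(os.path.join(root, 'bin', 'gcc'),
                            tool='gcc', version=version)
        env.SetDefault(CCFLAGS=[])


    # A build file can then branch on which toolchain was configured, e.g.:
    #   if getattr(env['CC'], 'tool', None) == 'icc':
    #       env.Append(CCFLAGS=['<some icc-only flag>'])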
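And here is a sketch of the setup inputs listed above gathered into one object. The ToolSetup name and field spellings are assumptions for illustration, not the actual Parts ToolSetting/ToolInfo classes.

    # Illustrative sketch of the setup inputs listed above.
    class ToolSetup(object):
        def __init__(self,
                     version=None,            # required base version, None = don't care
                     host_platform='any',     # what we can run on
                     target_platform='any',   # what we can generate output for
                     install_root=None,       # in: where to look; out: where it was found
                     use_script=False,        # str path = run that script,
                                              # True = the tool's default script,
                                              # False = use predefined defaults
                     use_env=False):          # punt and trust the user's shell environment
            self.version = version
            self.host_platform = host_platform
            self.target_platform = target_platform
            self.install_root = install_root
            self.use_script = use_script
            self.use_env = use_env


    # Example: ask for an Android NDK cross compiler for ARM targets.
    android_gcc = ToolSetup(version='4.8',
                            host_platform='posix-x86_64',
                            target_platform='android-arm',
                            install_root='/opt/android-ndk')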


General stuff
Unless your build is small, it is always best to say at startup that we cannot set up a tool. Being lazy about it is always worse and tends to upset people, as they may have waited 10 minutes to find out a tool was missing. People are happy when they know they can go get a cup of coffee or whatever and not have to worry that as soon as they leave, the build might fail because of a missing tool. They also don’t like feeling chained to the desk while the build happens, because it might fail due to some tool not being on the system while they are trying to set up a box.

Because of cross builds we have to make different environments. This means that not all tools can be loaded in the same environment: setting up GCC for Android targets is different from GCC for Linux targets. Likewise, we cannot set up a single environment that allows building with GCC and intelc at the same time. The whole point of the environment is to separate which type of tool is used.
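A small SConstruct-style sketch of that separation; 'gcc'/'g++'/'gnulink' are real SCons tool names, but the compiler paths are illustrative placeholders.

    # One Environment per (toolchain, target) pair; paths are placeholders.
    env_linux = Environment(tools=['gcc', 'g++', 'gnulink'],
                            CC='/usr/bin/gcc')

    env_android = Environment(tools=['gcc', 'g++', 'gnulink'],
                              CC='/opt/android-ndk/bin/arm-linux-androideabi-gcc',
                              CXX='/opt/android-ndk/bin/arm-linux-androideabi-g++')

    # A GCC-for-Linux setting never leaks into the Android build; an intelc
    # setup would live in yet another Environment.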

The default toolchain has to be easy to state on the command line (i.e. something like --toolchain=xyz,abc).
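For example, using SCons’s existing AddOption/GetOption in an SConstruct; the --toolchain spelling and its default value are just assumptions for illustration.

    # SConstruct-style sketch: let the user pick toolchains on the command line.
    AddOption('--toolchain',
              dest='toolchain',
              type='string',
              nargs=1,
              action='store',
              metavar='NAME[,NAME...]',
              default='default',
              help='comma-separated list of toolchains to use')

    requested = GetOption('toolchain').split(',')
    # e.g. `scons --toolchain=intelc,msvc` yields ['intelc', 'msvc']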

We want to separate the values used for tool setup from the values that say what was actually set up. I added a namespace object (i.e. a dictionary-of-dictionaries object) in Parts to help with this, and I have found it to be very useful.
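A minimal sketch of such a namespace object (illustrative only, not the Parts implementation):

    # A dictionary of dictionaries that keeps "what was requested" apart from
    # "what was actually set up".
    class Namespace(dict):
        def __missing__(self, key):
            self[key] = Namespace()
            return self[key]


    ns = Namespace()
    ns['gcc']['requested']['version'] = '4.8'       # values given for tool setup
    ns['gcc']['resolved']['path'] = '/usr/bin/gcc'  # values describing what was set up
    ns['gcc']['resolved']['version'] = '4.8.2'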

Honestly, setting up tools is fast, but when we set up or check for a lot of tools, it can take a second. This can easily be corrected by some simple cache logic that does a quick md5/timestamp check for changes on the system, plus an option to rebuild the cache. It need not be fancy. I know that for the build of VTune we have cross builds with many different platforms, tools, and compilers all on the same box, and it takes longer for Python to import Parts than to do the check on the tools most of the time.
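A rough sketch of the kind of "not fancy" cache I mean; the file format and function names are made up for illustration.

    # Fingerprint the probed tool paths by size/timestamp and reuse the saved
    # results while nothing changed.
    import hashlib
    import json
    import os


    def system_fingerprint(paths):
        h = hashlib.md5()
        for p in sorted(paths):
            try:
                st = os.stat(p)
                h.update(('%s:%d:%d' % (p, st.st_size, int(st.st_mtime))).encode())
            except OSError:
                h.update(('%s:missing' % p).encode())
        return h.hexdigest()


    def load_tool_cache(cache_file, paths):
        """Return cached tool info if the fingerprint still matches, else None."""
        if not os.path.isfile(cache_file):
            return None
        with open(cache_file) as f:
            data = json.load(f)
        if data.get('fingerprint') != system_fingerprint(paths):
            return None                   # something changed; redo the probing
        return data.get('tools')


    def save_tool_cache(cache_file, paths, tools):
        with open(cache_file, 'w') as f:
            json.dump({'fingerprint': system_fingerprint(paths), 'tools': tools}, f)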

I have not finished it yet, but I was addressing this in the Parts Settings object (based on Greg’s IAPAT thoughts). For toolchains we need easy-to-define all-of/one-of/etc. statements to make it easy to set up complex chains.
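Something along these lines, perhaps; the all_of/one_of names and the string tool specs are purely illustrative assumptions, not anything existing in Parts or SCons.

    # Hypothetical sketch of all-of / one-of style toolchain statements.
    def all_of(*specs):
        """Every listed tool must be available for the chain to be usable."""
        def resolve(available):
            return list(specs) if all(s in available for s in specs) else None
        return resolve


    def one_of(*specs):
        """Use the first listed tool that is available."""
        def resolve(available):
            for s in specs:
                if s in available:
                    return [s]
            return None
        return resolve


    # A toolchain is then just a list of such statements:
    cxx_chain = [one_of('intelc', 'gcc', 'clang'), all_of('ar', 'gnulink')]


    def resolve_chain(chain, available):
        tools = []
        for stmt in chain:
            picked = stmt(available)
            if picked is None:
                return None               # chain cannot be satisfied on this box
            tools.extend(picked)
        return tools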

Getting new external tools should be independent from the build; i.e. it should be a separate action to query and control where you get them.

The issue with lazy tools, I believe, has more to do with the notion of allowing a toolchain to be defined but only testing a tool when, while reading the files, something looks up or calls something that tool would provide. For example, if I don’t use G++, I don’t want messages that a 64-bit version is not installed, unless some SCons/Parts file queried or set a C++-specific variable in the environment or called a builder with a C++ source file. I get this. I think it needs some infrastructure to make it easy, such as a type of variable object we can add to the environment, or some tweak to a builder to load stuff on demand. For me this is not really a speed issue, but more a memory and ease-of-use issue. Adding stuff to the environment takes memory, and this can add up in large builds; lots of memory affects speed in the end much more than a second of lookup to get some environment stuff set up. (This might be better addressed by tweaks to make the environment more aggressive about sharing and copy-on-write logic.) As for ease of use, we don’t want the build system to complain about a missing tool unless it is used.
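A hypothetical sketch of that on-first-use idea; none of this is existing SCons API, and the LazyToolValue name and loader hook are assumptions for illustration.

    # A placeholder value that only sets up its tool when a tool-specific
    # variable is actually read, so unused tools cost no memory and produce no
    # "missing tool" noise.
    class LazyToolValue(object):
        def __init__(self, var_name, loader):
            self.var_name = var_name
            self.loader = loader      # callable doing the real (possibly failing) setup
            self._value = None
            self._loaded = False

        def __str__(self):
            # Expanding the variable is the "something actually used it" signal;
            # only now do we pay for setup, and only now may we complain.
            if not self._loaded:
                self._value = self.loader()
                self._loaded = True
            return self._value


    def find_gxx():
        # placeholder probe; raise a "g++ not installed" error here only if needed
        return '/usr/bin/g++'


    env_vars = {'CXX': LazyToolValue('CXX', find_gxx)}
    print('%s' % env_vars['CXX'])   # triggers the probe only when CXX is expanded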

Sure, I have more thoughts and details I could share based on what I have learned… but this mail is long enough as is ☺

Jason

From: Scons-dev [mailto:scons-dev-bounces at scons.org] On Behalf Of Gary Oberbrunner
Sent: Sunday, May 25, 2014 12:15 PM
To: SCons Dev List
Subject: [Scons-dev] Toolchain requirements

I'd like to kick off a round of discussion about toolchains, so we can make some progress toward a design.  I have some preliminary thoughts.  Please comment.  Apologies for the HTML mail; let me know if this isn't readable for you.

·        Allow installing external tools (pip install or ...)
o   scons --version (or similar) should list installed tools and toolchains
o   missing external tools should give sensible errors
·        Tool setup must happen before reading SConstruct somehow
o   DefaultEnvironment and all new Environments should know about all tools
o   alternative: lazy-construct DefaultEnvironment
o   user-specified tools and toolchains need to be specifiable at beginning of build
·        User should be able to set default tools and toolchains
o   unused tools shouldn't take any startup time
·        Lazy init of tools and chains
o   This is faster because unused tools don't matter
o   It allows missing unused tools to not give errors, but missing used tools can (and should)
o   But it makes configuring environments much harder for users, because they can't override or append to tool-provided variables until those exist.  This would break a lot of existing SConstructs.
o   We need to find some kind of compromise here:
§  Explicitly list tools required by build (where?): this should work well because only the needed tools will be initialized
§  if nothing explicitly specified, fall back to current method
·        Within a tool:
o   specify dependencies on other tools
o   detect existence on system reliably, and without modifying env
§  need better error messaging: ability to probe silently, but also give sensible errors when needed
o   constructor needs to allow args: version, path, ABI, etc. (this is important)
o   allow for common setup (all C compilers, etc.) as now
o   tools should be versioned so user can check if up to date, etc.
·        Tool chains:
o   either-or
o   and
o   collections
·        Platform
o   How much do we need to know about the platform, for tools to initialize themselves?
o   Cross-compilation comes into this, but may be too much to include as a general part of this project.
o   It may be useful to define toolchains and enable/disable them by platform
o   Of course the default toolchains need to be different by platform
o   It may be possible for a default toolchain to just search for all tools in a particular order and pick the first, as long as the tool-dependency system is robust enough.
·        Usability
o   $CC etc. must never be left blank (without a prior tool-missing error message at least) - this is a common problem
o   Must be backward compatible, at least for all common cases.
o   Must not require any new user files (e.g. something in site_scons) for normal operation
o   Need a clear guide on requirements for new tools
§  how to make a tool
§  how to include tests
·        Considerations
o   "batteries included?"
§  Each tool should do its best to set itself up, find executables, etc.
§  What about SCons policy of not relying on $PATH?  Maybe we should relax that or have an option?
o   minimum magic, maximum flexibility
o   what about single tools?  Should every tool be required to be part of a toolchain (even if it's just one tool)?  Maybe this doesn't matter much.
·        Non-goals:
o   New command-line args like autotools (platform, install paths, etc.).  We should build something that would enable that, but it's too much to bite off now.
o   Persistence -- remember configuration on disk between runs.  This is a performance enhancement which we should address only once we know it's needed.  Better if we can design a system that's fast without needing this.
·        References:
o   http://www.scons.org/wiki/PlatformToolConfig (Greg's original proposal)
o   http://www.scons.org/wiki/RevampToolsSubsystem
o   http://www.scons.org/wiki/PlatformToolConfigAlt (my proposal from 2008)
o   http://www.scons.org/wiki/EvanEnhancedConfigurationPackageProposal

--
Gary

