[Scons-dev] Java Development

Mark A. Flacy mflacy at verizon.net
Thu Jul 31 23:04:39 EDT 2014


On Sunday, July 27, 2014 03:10:34 PM William Deegan wrote:
> William,
> On July 25, 2014 at 8:27:02 PM, William Blevins (wblevins001 at gmail.com)
> wrote:
> 
> Team,
> 
> I want to get another thread going for SCons Java development.
> 
> The SCons Java tool has a ton of error reports on Tigris including 7
> priority 1 issues.  At the moment, this tool doesn't stand a chance against
> other Java competitors, and not because they are great tools.  I frankly
> hate ANT.  I have used Java support from SCons and it's seriously painful;
> nothing like the C++ support.  Some other developers have made statements
> like "No one outside SCons builds Java programs more complicated than hello
> world."  The SCons tool framework is great, and I would really like to see
> the Java toolkit see some love.  It has potential to be a hidden gem, and I
> want to help out with this, but I don't have the experience to do this on my
> own, so firstly I'd like to list some of the biggest hurdles for users of
> SCons Java.  I'm not gonna try and propose any solutions at the moment.  I just
> want to see if I can get the group thinking about the problems.
> 
> 1. Adding resource files to a jar causes SCons
> segfault: http://scons.tigris.org/issues/show_bug.cgi?id=2550
> 
> I have firsthand experience with this bug.  The only way I could figure out
> to work around it was to make a separate jar just for resources.
> 
> 2. Java emitter almost never gets the java output correct.
> 
> One of the many things I hate about ANT is that ANT is stupid.  It always
> executes a build even if code is up-to-date and I usually have to
> explicitly clean.  SCons COULD resolve both problems if the emitters
> worked.  The only way to get a remotely consistent working build is to call
> Jar( 'buildDir' ) when everyone wants to do Jar( [ 'class1', ... 'classN' ]
> ).
> 
> 3. Dependencies:
> 
> SCons does not automatically add classpath items as dependencies.  Why do I
> need to do this manually?  This is what SCons does!  It's the heart and
> soul!
> 
> I believe this is because of item #6.
> 
> 4. Consistency:
> 
> Classpath tokens (among other items) do not behave the same as other
> builders. Example: I cannot use "#jar/item.jar" in the classpath without
> expanding via something like File(...).get_path().
> 
> 5. Interfaces:
> 
> Java(...) parameters and internal handling aren't intuitive and only handle
> sources = 'directory' correctly.  It doesn't do lists of java files or lists
> of directories in a sane manner.
> http://scons.tigris.org/issues/show_bug.cgi?id=1772
> 
> Personally, I don't think that Java and Jar should be separate functions.
>  How do you get to the classes then?  What about Javah!  I have an idea,
> but that's outside the scope of this rant.
> 
> 6. Performance:
> 
> The dependency structure for Java exposes class files in a way that creates
> tons of false positives.
> 
> SCons current:
> 
> classes1 = Java(...)
> 
> classes2 = Java(...)
> 
> Depends( classes2, classes1 ) # O( N^2 ) dependency graph with tons of false
> positives
> 
> SCons if I have anything useful to say about it
> 
> jar1 = Jar( classes1 )
> 
> jar2 = Jar( classes2 )
> 
> Depends( jar2, jar1 ) # O( 1 ) which obviously fails in parallel builds
> currently.
> 
> I am currently data mining a production Java codebase to prove my point.
> Dirk and I have already discussed this issue somewhat; thanks Dirk :)
> 
> This actually causes the Task Master thread to get blocked on large jars
> reducing parallel efficiency in builds to None.
> 
> 
> The main issue here (if I understand SCons’ internals enough) is that
> SCons tracks all dependencies on a per-file basis. For many types of
> builds that works fine. For Java (building jars, and other issues) and some
> other types of builds, that’s very inefficient.
> 
> There’s no “blob” of files where you have N inputs and M outputs, and thus
> you get the N*M arcs in the graph.
> 
> Currently the only similar thing, though not really similar enough, is the
> Dir() Node type.  But that has its own problems, which could be solved by an
> N*M type node.
> 
> 
> 
> Other Java issues could likely be resolved building on top of such a new
> Node type.
> 
> That said, anonymous and inner classes are still an issue: one java file
> can produce more than one class file, and the names of those files have to
> be worked out.  The scanner and emitter try to solve this by parsing the
> java files and figuring out the proper naming.  This of course is not (as I
> understand it) formally defined as part of the java language and thus is a
> per-compiler implementation detail.

In my opinion, you are wasting your time.

The way Java produces artifacts is so different from the model that SCons
"expects" that, to get this to work, you are attempting to cram an elephant
down a shrew's throat (or attempting to feed a Great White shark with ants,
if you prefer).

SCons works *great* when there's one output per input, especially when you can 
deduce the output name from the input name and little else.  It also works 
great when it is fairly cheap to determine what files a given file depends upon.

Sadly, neither of those conditions holds for Java source.
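
To make the contrast concrete, here is a minimal Python sketch. It is not
SCons code, and the function names are made up for illustration. For a C file
the output name follows from the input name alone; for a Java file you have
to read the contents before you know even which types are declared. The
regex is a deliberate simplification (it would also match inside comments and
string literals), and it says nothing about the final .class file names:
javac typically writes nested types as Outer$Inner.class and anonymous
classes as Outer$1.class, which this sketch does not attempt to reproduce.

```python
import os
import re


def c_object_name(source):
    """One input, one output: the name is derived from the path alone."""
    root, _ext = os.path.splitext(source)
    return root + ".o"


# Hypothetical, simplified pattern for type declarations in Java source.
TYPE_DECL = re.compile(r"\b(?:class|interface|enum)\s+([A-Za-z_]\w*)")


def declared_type_names(java_source):
    """Return the declared type names, which requires reading the file body."""
    return TYPE_DECL.findall(java_source)


if __name__ == "__main__":
    print(c_object_name("src/foo.c"))
    print(declared_type_names(
        "public class Outer { static class Inner {} }\nclass Helper {}\n"
    ))
```

A single Outer.java here declares three types, so a build tool cannot list
the outputs from the file name the way it can for foo.c.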


Between 2003 and 2010, I developed and maintained a Python-based build tool
for a Java project that contained ~10K Java source files.  It would analyze
the Java source files to figure out the package-level compile dependencies and
then send lists of files to be compiled as a unit to a persistent compile
server (in fact, you could run multiple compile servers at once to get a large
degree of parallelism).  It would then analyze the generated class files to
see if the total public/protected interface for the package had changed,
preventing recompiles of dependent packages if all you did was change an
implementation detail versus a visible method or attribute.
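
The interface check can be sketched roughly as follows. This is not the code
from that tool, only a minimal illustration under stated assumptions: it
assumes you already have a text listing of a class's members, one declaration
per line, similar to what a disassembler such as javap prints, and the
function names are hypothetical. Only lines that start with "public" or
"protected" feed the fingerprint, so private changes leave it untouched.

```python
import hashlib

# Visibility prefixes that count as part of the externally visible interface.
VISIBLE_PREFIXES = ("public ", "protected ")


def interface_fingerprint(member_listing):
    """Hash only the public/protected declarations of a member listing."""
    visible = sorted(
        line.strip()
        for line in member_listing.splitlines()
        if line.strip().startswith(VISIBLE_PREFIXES)
    )
    return hashlib.sha256("\n".join(visible).encode("utf-8")).hexdigest()


def dependents_need_recompile(old_listing, new_listing):
    """True only when the visible interface differs between two listings."""
    return interface_fingerprint(old_listing) != interface_fingerprint(new_listing)
```

Sorting the declarations before hashing means that reordering members does
not count as an interface change in this sketch.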

You could also register various hooks against packages (such as run rmic) if
the package compiled correctly.  We did some funky stuff with autogenerating
WSDL from Java source, then generating client classes from the WSDL and
compiling those classes also.  Almost all of that was data driven.  It did
other neat stuff that I don't remember off the top of my head, but I've got
all the source sitting around in various tla/hg repositories.

Nobody uses it any more.  Not even me, and I thought about providing some of
it in an open source build tool.

For our Windows users, it was faster to nuke the output directories and
recompile everything in one invocation of javac. (Well, 2 invocations, since
some of the funky stuff was to use the .class files from the first compile to
generate the WSDL and use *that* WSDL to generate Java source as part of the
second bulk compile.) Most of the time difference had to do with the way the
python tool talked to the compile processes by sending the list of file names 
through a pipe to the other process and then flushing the pipe.  The rest had 
to do with the time spent parsing the files (and I maintained a cache of parsed 
file information that contained the last mtime of the source files and would not 
process a file whose mtime was the same as what was recorded).
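
For reference, the mtime cache amounts to something like the sketch below.
This is a reconstruction of the idea, not the original code: the class name,
the in-memory dict, and the count_lines stand-in for the real Java parser are
all assumptions made for illustration.

```python
import os


class ParseCache:
    """Re-parse a source file only when its mtime differs from the recorded one."""

    def __init__(self, parse):
        self._parse = parse      # callable taking a path, returning parsed info
        self._entries = {}       # path -> (mtime_ns, parsed info)
        self.parse_calls = 0     # counts how often the real parser ran

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]      # same mtime as recorded: skip the parse
        self.parse_calls += 1
        info = self._parse(path)
        self._entries[path] = (mtime, info)
        return info


def count_lines(path):
    """Stand-in for the real parser; it just counts lines."""
    with open(path) as handle:
        return len(handle.readlines())
```

Note that even a cache hit costs one stat call per file, which fits the
observation above that a single bulk javac run could still come out ahead.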

For our Linux users (me, for one), it was not quite a wash to change the 
behavior but the delta wasn't enough to worry about.

I'm more than willing to discuss this and also to be proven wrong.

-- 
Mark A. Flacy

