I have been doing research on program analysis techniques targeting runtime bloat, and I recently developed a static analysis that finds objects and calls that can be pulled out of loops. One interesting warning the tool reported for DaCapo/xalan was in method "run" of class "dacapo.xalan.XalanHarness" (in the 2006 version). At line 88 of the class, a new transformer is created by calling "_template.newTransformer()". My tool suggested pulling this call out of the while loop starting at line 83. I inspected the source code and the suggestion appears to be correct, because we don't need to create a transformer per input file. After pulling the call out, I saw a 10% reduction in running time, and the number of objects created dropped by almost a million (on IBM J9 1.5.0 build 2.4). I also checked the latest version of DaCapo and found that the call is still inside the loop (line 86 of class org.dacapo.xalan.XSLTBench).
I am wondering whether this is indeed a performance problem or a false warning reported by our tool. Thank you!
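For readers without the harness source at hand, the change described in the report can be sketched roughly as follows. This is not the DaCapo code: the class name, the inline stylesheet, and the helper methods are all hypothetical, and only the standard javax.xml.transform API is used. It relies on the documented behavior that a Transformer may be used multiple times, though not by several threads concurrently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for the benchmark harness; not the DaCapo source.
class HoistTransformerSketch {
    // Hypothetical stylesheet: emits the text content of the <doc> root element.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" "
      + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\"><xsl:value-of select=\"/doc\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once; Templates objects are reusable.
    static Templates compile() throws TransformerException {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSL)));
    }

    // Shape of the reported code: one new Transformer per input.
    static List<String> perIteration(Templates templates, List<String> inputs)
            throws TransformerException {
        List<String> out = new ArrayList<>();
        for (String xml : inputs) {
            Transformer transformer = templates.newTransformer();
            out.add(apply(transformer, xml));
        }
        return out;
    }

    // Shape of the suggested change: create the Transformer once, reuse it in the loop.
    static List<String> hoisted(Templates templates, List<String> inputs)
            throws TransformerException {
        Transformer transformer = templates.newTransformer();
        List<String> out = new ArrayList<>();
        for (String xml : inputs) {
            out.add(apply(transformer, xml));
        }
        return out;
    }

    // Run one transformation and return the result as a string.
    static String apply(Transformer transformer, String xml)
            throws TransformerException {
        StringWriter writer = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates templates = compile();
        List<String> inputs = List.of("<doc>a</doc>", "<doc>b</doc>");
        System.out.println(perIteration(templates, inputs));
        System.out.println(hoisted(templates, inputs));
    }
}
```

Both variants should produce the same output for a stylesheet like this one. Note that parameters and output properties set on a Transformer persist across transformations, so in code that sets them per input the hoisted form is only equivalent if that state is reset or overwritten each time.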
It would seem more reasonable to hoist the creation of the new transformer out of the loop (from a correctness point of view, only one transformer per thread is needed).
Hoisting the creation of a new transformer outside the loop will change the workload, thus producing a different workload and consequently a different benchmark. While this is probably an appropriate change, it should be made in the next revision of the DaCapo benchmark, not in a maintenance release of the current version.