Started here
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1271.227 56.568 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12588.107 342.374 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 102.523 10.285 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3195.343 131.266 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 119.087 6.978 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3176.597 295.307 ops/s
Made it to here
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1432.687 43.977 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12036.937 65.131 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 587.150 71.170 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3333.123 682.546 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 927.043 65.186 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3279.203 282.563 ops/s
We are 5x to 10x faster than when we started. There might be some more optimization. But it is looking a lot better. :)
BenchMarking
Notes on benchmarking.
This is more notebook than docs.
I have been benchmarking different aspect of Boon and improving them.
I did this for JSON (now faster than Jackson and GSON). I did this String parsing (now faster than JDK string parsing). And now I am looking at path expressions.
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void employeeExtractionBenchBoon(BlackHole bh) throws Exception {
bh.consume(employeeExtractionBoon());
}
private Object employeeExtractionBoon() {
return atIndex(departments, "this.employees");
}
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void employeeExtractionBenchJava(BlackHole bh) throws Exception {
bh.consume(employeeExtractionJava());
}
private Object employeeExtractionJava() {
List<Employee> employees = new ArrayList();
for (Department department : departments) {
employees.addAll(department.employees);
}
return employees;
}
I am using JMH and as you can see from above. I am just using two ways to iterate through a list and extract employees from departments. There are only four departments so this might not be the best way to test.
Here are the initial results:
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 3192132.090 164404.276 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 54021283.153 21598841.289 ops/s
3,192,132 operations per second versus 54,021,283 operations per second.
The time difference between plain java and the Boon expression is 15x. It is 15 times faster to use plain Java for grabbing list of lists. 16.9 to be exact. Next I will start up visualvm and see where Boon is spending its time.
According to jvisualvm profiler Boon is spending most of its time in Conversions.unifyList. Then BeanUtils.getFieldValues, and thenBeanUtils.getPropertyValues. With Conversions.unifyList is in the far lead. It is a nasty bit of code to unify lists of lists into a single list.
I actually did not think it would call unifyList is this use case, but now that I think about it of course it does.
jvisualvm tells a different story. It says unifyList, splitByCharsNoneEmpty are about equal in time. The Str.splitByCharsNoneEmpty made the list, but not by much.
Let me add some more paths to the mix.
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench(BlackHole bh) throws Exception {
bh.consume(getAllOfTheEmployeesFirstNameFromEmployeeListBoon());
}
private Object getAllOfTheEmployeesFirstNameFromEmployeeListBoon() {
return atIndex(allEmployees, "firstName");
}
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench(BlackHole bh) throws Exception {
bh.consume(getAllOfTheEmployeesFirstNameFromEmployeeListJava());
}
private Object getAllOfTheEmployeesFirstNameFromEmployeeListJava() {
List<String> employeesFirstNames = new ArrayList();
for (Employee employee : allEmployees) {
employeesFirstNames.add(employee.getFirstName());
}
return employeesFirstNames;
}
The above is a path that should not hit unifyList.
Here is another path that will hit unifyList
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void getAllOfTheEmployeesFirstNameFromDepartmentBoonBench(BlackHole bh) throws Exception {
bh.consume(getAllOfTheEmployeesFirstNameFromDepartmentBoon());
}
private Object getAllOfTheEmployeesFirstNameFromDepartmentBoon() {
return atIndex(departments, "employees.firstName");
}
@GenerateMicroBenchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void getAllOfTheEmployeesFirstNameFromDepartmentJavaBench(BlackHole bh) throws Exception {
bh.consume(getAllOfTheEmployeesFirstNameFromDepartmentJava());
}
private Object getAllOfTheEmployeesFirstNameFromDepartmentJava() {
List<String> employeesFirstNames = new ArrayList();
for (Department department : departments) {
for (Employee employee : department.getEmployees()) {
employeesFirstNames.add(employee.getFirstName());
}
}
return employeesFirstNames;
}
For this benchmark. I am using 100,000 employees (generated with mostly random names).
A test run in JMH looks like this:
$ java -jar target/microbenchmarks.jar ".*PathExpression.*" -wi 5 -i 5 -f 1 -t 8
Here are the results.
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 2723259.153 1145620.830 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 48914088.320 5119843.984 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 903079.780 30861.050 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 45053486.367 4804329.054 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 1324116.787 81300.600 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 51982899.700 3694392.319 ops/s
I was expecting different results and it looks like I messed up and did not add the 100K auto-generated employees. Damn it.
Fixed. Let me try again.
Ok. Now here is actually processing 100K employees.
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1271.227 56.568 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12588.107 342.374 ops/s
Boon 10x slower than doing it by Java by hand.
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 102.523 10.285 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3195.343 131.266 ops/s
Boon 30x slower than Java by hand (for path expressions)
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 119.087 6.978 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3176.597 295.307 ops/s
Boon 30x slower than doing it by hand.
I was expecting getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench to do much better so I will start there with the profiling.
I run the test again but this time with just the method I want to profile.
java -jar target/microbenchmarks.jar ".PathExpression.*getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench." -wi 5 -i 5000 -f 1 -t 8
And a lot of iteration so I can look at jvisualvm and think.
When I do this Reflection.combineFieldMaps is taking the most time (I don't remember exactly what this does)... SplitByCharsNoneEmpty is taking the second most time. I know how to fix that one. Then we have BeanUtils.getPropertyValue. This is with the jvisualvmSampler. Now let's look at the jvisualvm profiler. It says splitbychars is where I want to look the most and Reflection.combineFieldMaps a far second.
Ok.. So I cache the results of a string split, and I fixed the combineFieldMaps problem (or at least I think I did).
java -jar target/microbenchmarks.jar ".PathExpression." -wi 5 -i 5 -f 1 -t 8
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1249.670 95.282 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12028.693 124.035 ops/s
10x slower still.
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 217.363 11.890 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3227.960 233.379 ops/s
Now instead of 30x slower, we are at 11x so almost a 3x improvement in speed.
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 292.207 20.021 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3241.347 167.004 ops/s
From 30x slower to 12 x slower so like a 2.75 x improvements in speed.
2x to 3x improvement in speed is not too bad for a couple hours of performance tuning.
Lets run jvisualvm again and see what the long poles are....
$ java -jar target/microbenchmarks.jar ".*PathExpression.*getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench.*" -wi 5 -i 5000 -f 0 -t 8
# Fork: N/A, test runs in the existing VM
# Warmup: 5 iterations, 1 s each
# Measurement: 5000 iterations, 1 s each
# Threads: 8 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Running: org.boon.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench
# Warmup Iteration 1: 308.500 ops/s
# Warmup Iteration 2: 314.417 ops/s
# Warmup Iteration 3: 311.867 ops/s
# Warmup Iteration 4: 321.533 ops/s
# Warmup Iteration 5: 312.867 ops/s
Iteration 1: 314.667 ops/s
Iteration 2: 318.733 ops/s
Iteration 3: 319.383 ops/s
Iteration 4: 319.700 ops/s
Iteration 5: 264.350 ops/s
Iteration 6: 266.783 ops/s
According to jvisualvm sampler, I need to look at BeanUtils.atIndex (which is the method we are using... so...). Next up is BeanUtils.getPropertyValue, then we have unifyList (which I don't know why this is getting called per se). Also FastStringUtils.toCharArray (and I don't know why this is getting called at all). Those are the tops. Now let's look at the profiler. The profiler says look at BeanUtils.getFieldValues, getPropertyValue, Conversion.unifyList, StringScanner.isDigits (which is the one that calls FastStringUtils.toCharArray).
I optimized unifyList, and I avoid the call to FastStringUtils.toCharArray altogether as well as StringScanner.isDigits.
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1479.100 90.829 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 11888.187 546.265 ops/s
Now it is only 8x slower instead of 10.
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 265.120 17.130 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3170.723 284.468 ops/s
Looks like about a 20% improvement here.
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 323.763 20.463 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3229.353 143.103 ops/s
Looks to be about a 20% improvement here to.
So it is not time to crack open the vodka yet, but it is getting better.
It looks like getFieldValues is the long poll.
I did some tweaks to getFieldValues and this is what I got now...
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1563.507 142.736 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12033.940 270.036 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 279.177 12.791 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3287.407 301.868 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 378.337 35.866 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3263.107 232.833 ops/s
There are some improvements but nothing to write home about yet.
Now everything points to getPropertyValue. :) Slay one dragon and Pareto has you focused on another.
getFieldValues is still around but no where near the top.
Ok so now another round of optimization.
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1360.407 202.870 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 11996.353 54.584 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 570.703 29.017 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3291.660 250.998 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 946.940 13.226 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3298.597 185.509 ops/s
Looks like that last round helped out a lot. We increased our throughput by 2x to 3x.
Summary
Started here
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1271.227 56.568 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12588.107 342.374 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 102.523 10.285 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3195.343 131.266 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 119.087 6.978 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3176.597 295.307 ops/s
Made it to here
Benchmark Mode Thr Count Sec Mean Mean error Units
o.b.PathExpression.employeeExtractionBenchBoon thrpt 8 5 1 1432.687 43.977 ops/s
o.b.PathExpression.employeeExtractionBenchJava thrpt 8 5 1 12036.937 65.131 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentBoonBench thrpt 8 5 1 587.150 71.170 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromDepartmentJavaBench thrpt 8 5 1 3333.123 682.546 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListBoonBench thrpt 8 5 1 927.043 65.186 ops/s
o.b.PathExpression.getAllOfTheEmployeesFirstNameFromEmployeeListJavaBench thrpt 8 5 1 3279.203 282.563 ops/s
We are 5x to 10x faster than when we started. There might be some more optimization. But it is looking a lot better. :)
No comments:
Post a Comment