Tuesday, May 20, 2014

Gluten intolerance, unless you have celiac disease, is probably bullshit

There are people who want your money, and they will sell you whatever bullshit you are willing to buy.

Gluten Intolerance May Not Exist (from the same guy who "discovered it").
If you are gluten intolerant, it is more likely that you are just open to suggestion and have a weak mind. "Peter Gibson, a professor,...published (the original) study that found gluten, a protein found in grains like wheat, rye, and barley, to cause gastrointestinal distress... By extension, the study also lent credibility to the meteoric rise of the gluten-free diet."

"Gibson wasn’t satisfied with his first study...." He redid a study with placebo and non placebo. "Analyzing the data, Gibson found that each treatment diet, whether it included gluten or not, prompted subjects to report a worsening of gastrointestinal symptoms to similar degrees. "


"But those who elect to put themselves on a gluten-free diet without consulting a physician may be creating problems for themselves in two ways." (LA TIMES)

"And there are broader concerns. Some dietitians worry about the long-term effects of a strict gluten-free diet on those who don't need to be on it, because in avoiding foods with gluten, people may give themselves nutritional deficiencies. Those who elect to go on the diet need to watch that they get adequate amounts of B vitamins, particularly folic acid, Badaracco says." (LA TIMES)

 "If you just Google the silly thing, there's all sorts of dietitians and medical professionals against it," Badaracco says. "They're just not organized yet to [band together and] say, 'You know what? This is ridiculous.' "


Too bad the study on gluten intolerance came out after the ASAPScience guys did their video on gluten. Had it come out first, the video would say that gluten intolerance is mostly a myth and that you really only have to worry about gluten if you have celiac disease.



Most people on gluten-free diets do not know what gluten is.

"Now that there’s a multibillion-dollar market for gluten-free foods, people are ready to ask the essential question: What is gluten anyway?"

"Last week, Jimmy Kimmel sent a reporter out to ask people who don't eat gluten what it was they were avoiding. None of them knew. " LATIMES

1 in 133 people have celiac disease. If you do not have it, then gluten is not a problem for you.


Friday, May 16, 2014

Installing wrk on Linux

There is no package for wrk yet, so you have to build it. You may need to update and upgrade first.
wrk needs the openssl dev package and the gcc/dev stack.
What follows are brief instructions on how to install wrk on Linux.

Ubuntu/Debian (clean box)

sudo apt-get install build-essential
sudo apt-get install libssl-dev
sudo apt-get install git
git clone https://github.com/wg/wrk.git
cd wrk
make
Installs the build tools, the openssl dev libs (including headers), and git. Then uses git to download wrk and make to build it.

CentOS / RedHat / Fedora

sudo yum groupinstall 'Development Tools'
sudo yum install openssl-devel
sudo yum install git
git clone https://github.com/wg/wrk.git
cd wrk
make
Installs the build tools, the openssl dev libs (including headers), and git. Then uses git to download wrk and make to build it.

Installing wrk on OSX

  • Go to app store and install OSX developer tools, i.e., Xcode.
  • Ensure that you have the xcode command line tools installed.
xcode-select --install 
  • Install brew; see the instructions at: http://brew.sh/ (Very simple process)
  • Install the latest openssl lib; this will put the lib files and *.h files on your OSX box
brew install openssl
  • If you do not have git then install it
brew install git
  • Ensure that the brew bin directory (and so the brew openssl) comes ahead of the system bin directories in your search path
  • You can do the last step by putting this in your ~/.profile
export PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
  • Note that brew installs packages to /usr/local/bin
  • Download the source code
 git clone https://github.com/wg/wrk.git
 cd wrk
  • Now you need to edit the Makefile and add a few things
Change the first two lines from:
CFLAGS  := -std=c99 -Wall -O2 -D_REENTRANT
LIBS    := -lpthread -lm -lcrypto -lssl
to:
CFLAGS  := -std=c99 -Wall -O2 -D_REENTRANT -I/usr/local/opt/openssl/include
LIBS    := -lpthread -lm -lcrypto -lssl -L/usr/local/opt/openssl/lib
  • Note that the last step just points the build at the headers and libs of the openssl that brew installed.
  • Now run make
  • Expected output
$ make
CC src/wrk.c
CC src/net.c
CC src/ssl.c
CC src/aprintf.c
CC src/stats.c
CC src/script.c
CC src/units.c
CC src/ae.c
CC src/zmalloc.c
CC src/http_parser.c
CC src/tinymt64.c
LUAJIT src/wrk.lua
LINK wrk

notes on techempower client / server setting

For example, in the recent past I used autobench/httperf and remember setting tcp_tw_recycle to true, but I was pounding from many servers and trying to get 100K to 200K clients per second.
I was wondering if there is a setup guide that I am missing.
Are there any special settings?
# Connection timeout                                                                                                                
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_tw_recycle = 1
#net.ipv4.ip_local_port_range=    # default was 30k ephemeral ports
There was no package manager. I noticed that you used Ubuntu after I had installed CentOS. CentOS is my knee-jerk choice when I am not sure of the OS.

I did not find a package for wrk so I built it. Do you publish the version of wrk that you use?

Building wrk on centos

$ sudo yum groupinstall 'Development Tools'
$ sudo yum install openssl-devel
$ git clone https://github.com/wg/wrk.git
$ cd wrk
$ make
Is there a location of the benchmark run script? A wiki page... Bread crumbs?

Running wrk

./wrk -c 100 -d 30s
What is the right way?
Is it all in git?

Are there any special server settings?

sudo sysctl -w net.core.somaxconn=10000
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=10000
Vertx recommends the above. Do you have a guide on how you set up the boxes?

rick@firefly:~/wrk$ ./wrk -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive'  -c 200 -d 30s -t 2 -s pipeline-json.lua
Running 30s test @
  2 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.17ms    6.54ms  96.83ms   88.69%
    Req/Sec   591.42k   162.19k    1.13M    63.74%
  34215750 requests in 30.00s, 4.84GB read
Requests/sec: 1140534.77
Transfer/sec:    165.33MB


rick@firefly:~/wrk$ ./wrk -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive'  -c 200 -d 30s -t 2 -s pipeline.lua
Running 30s test @
  2 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.53ms    5.80ms 106.19ms   89.31%
    Req/Sec   677.06k   173.23k    1.22M    62.24%
  39016230 requests in 30.00s, 4.80GB read
Requests/sec: 1300551.32
Transfer/sec:    163.72MB


Results for vertx.

Sunday, May 11, 2014

More details on GSON 100x faster than Jackson and Boon (for Hash Collision attack) with code examples (very unlikely but now never with Boon 0.18)

For Boon the hash collision would be a very rare occurrence which I explain below. And what is rare in Boon 0.17 should be impossible for Boon 0.18 for the standard JSON parser that comes with Boon. 

Update: Boon is now patched. It should not be possible to do a hash key collision attack using the standard Boon JSON parser. Boon checks whether it is running on Java 1.7 or above and, if 1.7 is detected, checks whether jdk.map.althashing.threshold is set. If Java is below 1.7, Boon uses TreeMap. If Java is 1.7 and jdk.map.althashing.threshold is not set, Boon uses TreeMap. If Java is 1.8, Boon uses LinkedHashMap. So the one in a million use case that I have never seen is now a never possible use case with the standard Boon JSON parser. Boon was never impacted with Pojos, even prior to this patch. The index overlay is the default parser, so if you do nothing special, Boon will just work. There are other places where Boon uses HashMap and LinkedHashMap, but not in the default parser, so for public facing APIs use the standard JSON parser (if you do nothing special, that is what you get).
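The selection rules described in this update can be sketched in plain Java. This is only an illustration and not Boon's actual code: the class name SafeMapChooser and its methods are hypothetical, and the source does not say which map is used on Java 1.7 when jdk.map.althashing.threshold is set, so the LinkedHashMap choice for that branch is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the map-selection rules described above (not Boon's code).
class SafeMapChooser {

    /** Extracts the feature version: "1.7.0_55" gives 7, "1.8.0_05" gives 8, "11.0.2" gives 11. */
    static int featureVersion(String javaVersion) {
        String[] parts = javaVersion.split("[._-]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** Picks a map implementation from the Java version and the alt-hashing property value. */
    static Map<String, Object> chooseMap(String javaVersion, String altHashingThreshold) {
        int version = featureVersion(javaVersion);
        if (version >= 8) {
            return new LinkedHashMap<>();   // Java 1.8: LinkedHashMap
        }
        if (version == 7 && altHashingThreshold != null) {
            return new LinkedHashMap<>();   // ASSUMPTION: not stated in the post
        }
        return new TreeMap<>();             // below 1.7, or 1.7 without the property: TreeMap
    }

    public static void main(String[] args) {
        Map<String, Object> map = chooseMap(
                System.getProperty("java.version"),
                System.getProperty("jdk.map.althashing.threshold"));
        System.out.println("Chosen map: " + map.getClass().getSimpleName());
    }
}
```

TreeMap orders keys by comparison instead of hashing them, which is why it is the fallback when alternative hashing is not available.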

Older post.

Boon's Index overlay by design does not hash the keys.
Boon only works on JDK 1.7 and above. JDK 1.7 and above can block this attack.
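For readers who have not seen the attack, here is a minimal sketch of why colliding keys are easy to produce in Java. It relies only on the well known fact that "Aa" and "BB" have the same String.hashCode(); the class name HashCollisionDemo is hypothetical and this is not the evil JSON file itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: builds many distinct strings that share one String.hashCode().
class HashCollisionDemo {

    /** Returns 2^n distinct keys of length 2n made of "Aa" and "BB" blocks. */
    static List<String> collidingKeys(int n) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < n; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : keys) {
                next.add(prefix + "Aa");   // "Aa".hashCode() == 2112
                next.add(prefix + "BB");   // "BB".hashCode() == 2112
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = collidingKeys(10);
        int hash = keys.get(0).hashCode();
        boolean allCollide = true;
        for (String key : keys) {
            allCollide &= key.hashCode() == hash;
        }
        System.out.println(keys.size() + " keys, all colliding: " + allCollide);
    }
}
```

Put enough keys like these into one plain hash map without alternative hashing and they all land in the same bucket, so inserts degrade from constant time toward linear time per key.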

This is in response to this:

Where to begin....

POJOs the dominant case for REST and Websocket are not impacted on any version of the JDK.

Here are two examples of Boon parsing the evil JSON file from hell that Jesse Wilson provided without any issue. The most common case for REST/Websocket uses Pojos, and Boon handles that with no issues.

The second less common case uses Boon index overlay API to white list the keys. I did think of this when I designed Boon. It has been mentioned before. 

In short if you use Pojos, repeat after me: NOT AN ISSUE and never was.

If you have an internal API (a non public API), repeat after me NOT AN ISSUE and never was.

If you use JDK 1.7 and set jdk.map.althashing.threshold, NOT AN ISSUE and never was.

If you use JDK 1.8, NOT AN ISSUE and never was.

If you control both sides of the wire, NOT AN ISSUE and never was.

If you are using JSON to write to a file, post a message to an internal event bus, read a config file, whatever, NOT AN ISSUE and never was.

If you have a public API, don't know about the "jdk.map.althashing.threshold of JDK 1.7" or have somehow gotten Boon to work in JDK 1.6 (Boon only supports 1.7 and above), and insist on using a map instead of Pojos, then you are a purple unicorn, and Boon has just the API for you. It is called ValueMap, example provided. It is easy to white list your property names and avoid this issue. (Update: this is now mostly moot due to the patch, but it is an example of using the index overlay API.)

(NOTE You no longer have to use white listing after Boon 0.18. Boon has been patched see https://github.com/RichardHightower/boon/issues/182).

There are actually about 20 other ways to do this with Boon as well. There is a whole bucket load of classes that do things while using a white listed set of key and/or properties.
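The white list idea itself can be sketched without any Boon classes. The example below is plain Java and does not use Boon's ValueMap API; WhiteListDemo and toMap are hypothetical names, and the list of entries stands in for parser output that has not been hashed yet.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: validate keys against a white list before any map is built.
class WhiteListDemo {

    /** Rejects unknown keys first, then copies the entries into a HashMap. */
    static Map<String, Object> toMap(List<Map.Entry<String, Object>> entries, Set<String> allowed) {
        for (Map.Entry<String, Object> entry : entries) {
            if (!allowed.contains(entry.getKey())) {
                throw new IllegalArgumentException("Unexpected JSON key: " + entry.getKey());
            }
        }
        Map<String, Object> map = new HashMap<>();
        for (Map.Entry<String, Object> entry : entries) {
            map.put(entry.getKey(), entry.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("name", "age"));
        List<Map.Entry<String, Object>> entries = new ArrayList<>();
        entries.add(new AbstractMap.SimpleEntry<String, Object>("name", "Rick"));
        entries.add(new AbstractMap.SimpleEntry<String, Object>("age", 42));
        System.out.println(toMap(entries, allowed).keySet());
    }
}
```

Because the allowed set is small and chosen by you, an attacker cannot fill the resulting map with thousands of colliding keys; the payload is rejected before the map exists.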

Best to keep the description short.

Also your ops team can block large JSON files with an F5, etc. etc. etc.

Thanks Jesse for bringing this up.

Rather than document it at length, I figured it was easier just to fix it.

BTW that json file was DAMN evil. It blew up IntelliJ like three times and blew up the github Atom editor twice. That JSON is awesome in its evilness, and I feel more powerful having grasped that the Sith can live so completely in a single JSON file.

Boon loves purple unicorns so come on over. Save a tree. Use Boon.

___ previous post before picture provided ____

I looked at what GSON does for safety, and it just does not make sense in the real world to slow down every use case for what is, in my mind, a minority use case when there are better ways to do it. Both Boon and Jackson have ways to avoid this, and there are plenty of other ways.

Firstly, Boon parses to an index overlay, so you can actually convert this into any sort of Map. You can even use the same Map that GSON uses IF YOU NEEDED TO. Notice that last part.


The whole assertion that this is a common case is just wrong. There are internal SOA/REST/JSON calls. There are JSON/REST calls where you control both sides of the wire. There is JSON serialized to disk. To tie all JSON parsing and serializing to index hash collision is not needed (note I added it). It is like saying SSL/TLS are safer so everyone should always use them.

There is always uncle fester who disables JavaScript cookies and wears an aluminum foil hat, but.... In a way he is right (thanks NSA), but there is a tradeoff, and I think Boon and Jackson make the right ones, and I think GSON does not make the right tradeoff. (Uncle fester wins, and I will explain my change of heart later, but...)

It should take me less than ten minutes to show an example that uses Boon to prevent index hash collision. (Update two above and now a patch.) Boon's Index overlay by design does not hash the keys. You should be able to white list keys, and this should never come up. (I have some examples, but will double check that they do in fact never create the map until after the white list is validated). I will double check, and make sure that is true. I believe the POJO case using the index overlay would also not have this problem. I am fairly certain of it, but I would double check and add a unit test to make sure.

It is more about separation of concerns. Safety and speed are a continuum, and when you tie yourself to a particular design choice you are limiting your solution. Also, if you are expecting REST calls between, let's say, 2K and 4K, you can block large posts that are 1.2 MB at the F5 or NginX, which would put an end to the DDoS as well. You can even filter payloads that don't have certain strings in their body. (Which might have been true, but I will explain later.)
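The post does this filtering at the F5 or NginX, but the same size cap can be sketched in application code. This is an assumption made for illustration only: BoundedBodyReader and readBounded are hypothetical names, and the 4096 byte limit simply mirrors the 2K to 4K range mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: refuse request bodies larger than the expected size before parsing.
class BoundedBodyReader {

    /** Reads the whole stream, failing as soon as more than maxBytes have arrived. */
    static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
            if (total > maxBytes) {
                throw new IOException("Payload exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] small = readBounded(new ByteArrayInputStream(new byte[3000]), 4096);
        System.out.println("Accepted " + small.length + " bytes");
        try {
            // Roughly the 1.2 MB evil payload size mentioned in the post.
            readBounded(new ByteArrayInputStream(new byte[1200000]), 4096);
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The oversized payload never reaches the JSON parser, so no map is ever built from it.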

I know Jackson has similar capabilities to white list keys. There is usually more than one way to skin a cat. Boon was designed with this case in mind, and I did look at what GSON did. I decided it did not make sense for Boon to depend on a collection lib for a use case that would impact less than 5% of all REST/Websocket, and 0% of the other 900 reasons you might want to use JSON parsing/serialization (see SlumberDB). Since the map that GSON uses is on google code, you can even use it with Boon. Boon is pretty flexible. The index overlay has an object that looks like a Map and it can be converted into a real map... even the one that GSON uses.

I did not make this map the default because 1) you can white list the keys before they get to the map 2) not all APIs are public facing free for alls that need that level of protection 3) not every case is even used for SOA/REST/Websocket at all so the additional overhead and dependency on a lib did not make sense when you have plenty of other ways using Boon to accomplish this same thing.

With all that said, this has given me an opportunity to ponder this again, and I can at least have a WIKI page that describes what one should not do. 

Please understand that 100% of the use cases I have seen for public JSON REST calls involve POJOs, and this should not be possible with JSON to Java mapping using the index overlay, which is the default with POJOs. So of the 5% where this matters, 100% of that 5% also does not matter because it inherently uses white listing. That said, I will verify that this is the case (even though I know it is the case because avoiding hashing was how I improved the JSON to Java mapping). But I will do some more due diligence.


Jesse Wilson
12:03 PM
I think we mostly agree actually. Safety/speed are a continuum. Gson is safer. Boon is faster. This is the Boon code that spent 15 seconds parsing 1.2 MiB of nasty JSON.

    Reader reader = ...
    Map<?, ?> map = (Map) JsonFactory.fromJson(reader);

Was I using it wrong?

Rick Hightower
1:38 PM
If you were concerned with safety you would use the Boon parser that builds an index overlay map and then white list the keys. If you are going to spend the time making a benchmark, you could look at the one you are trying to disprove which has been validated by three other benchmarks and uses the index overlay throughout.

Different use cases can use an index overlay. The API you are testing against is the convenience API for when you just want to do a quick and dirty parse of a config file.

The benchmark I wrote uses the index overlay which does not use a HashMap. Put your benchmark on GitHub and let me take a look. I kept quiet for months and had my benchmarks validated by several people before I went public. This is not the first time I heard this. I've been waiting for you to go public with this since December. Every one knows everyone else.

Also I think the POJO mapping would not have this problem. It was designed that way.

Jackson has ways around this too. But Tatu is nicer than me and better at handling this than I am, so he can speak for Jackson.

It is not a matter of safety and non safety. It is a matter of how do you prevent a DDoS and when. If you control both sides of the wire, might be overkill especially if payloads are already encrypted via TLS.

You can also white list the keys. You can also use the same map that GSON uses which is on google code.

Publish your code. I am pretty sure Boon does handle this case but not by default.

Most of the services I work with are internal SOA or mobile to backend where we own both sides of the wire and encrypt the JSON. In these cases your point is moot.

In cases where your point is not moot, I would use the index overlay and then white list the allowed keys. Or use POJOs, which does the same thing, or convert the index overlay map into the map you use.

Publish your benchmark. Publish your code. I have not really tested this case because in the apps I work on it does not come up but if you publish your code then I can more easily validate that boon does not have this issue.

Btw I have found some use cases where Jackson is a lot faster up to 4x. It is not that I cherry picked, it is that these use cases have not come up yet in the projects that I am working on.

Also I plan to match Jackson speed in these use cases within the next week or so.

Sorry if I was rough but you came out swinging and I grew up in the murder capital of the US.


Rick Hightower
2:16 PM
Btw blocking comments on your blog is very cowardly. Open it up for comments. Fight like a man! :)

Friday, May 9, 2014

Boon 2x faster than Jackson at InputStream and not using index overlay

Boon non-index overlay mode using InputStream, by Richard Hightower
This is with InputStream. According to Tatu's comments, InputStream is a use case that Boon could not compete in because Boon is really just optimized for String. Tatu also said that Boon likely only wins because it uses index overlay.

So here is a test that uses Boon without index overlay and uses InputStream not String.

Benchmark                             Mode   Thr  Count  Sec        Mean  Mean error  Units
MainBoonBenchmark.webxml              thrpt   16      6    1  455347.925   46637.751  ops/s
BoonClassicEagerNoLazyParse.webxml    thrpt   16      6    1  401126.575   28331.138  ops/s
JacksonASTBenchmark.webxml            thrpt   16      6    1  233730.506   17868.136  ops/s
MainJacksonObjectBenchmark.webxml     thrpt   16      6    1  227287.992   21363.353  ops/s
BoonReaderSource.webxml               thrpt   16      6    1  216429.247   22538.238  ops/s
BoonAsciiBenchMark.webxml             thrpt   16      6    1  210416.450   10610.062  ops/s
BoonUTF8BenchMark.webxml              thrpt   16      6    1  199869.811    8742.968  ops/s
GSONBenchmark.webxml                  thrpt   16      6    1  168144.639    5311.387  ops/s

MainBoonBenchmark is the index overlay parser it comes in first.
BoonClassicEagerNoLazyParse (the original boon parser) comes in second.

BoonClassicEagerNoLazyParse does not use index overlay, so all those comments are just off. You can see that for this benchmark index overlay is only about a 13% improvement (455,347 vs. 401,126 ops/s).
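The gaps can be recomputed directly from the mean throughput numbers in the table above. The sketch below is just arithmetic on those published values; BenchmarkGaps and its method names are hypothetical. Measured against the classic parser the index overlay gain comes out at about 13.5%, and measured against the index overlay number the gap is about 11.9%.

```java
import java.util.Locale;

// Hypothetical helper: recomputes relative gaps from the webxml table's mean ops/s values.
class BenchmarkGaps {

    /** Percentage by which the faster result exceeds the slower one. */
    static double percentGain(double faster, double slower) {
        return (faster - slower) / slower * 100.0;
    }

    /** How many times faster the first result is than the second. */
    static double speedup(double faster, double slower) {
        return faster / slower;
    }

    public static void main(String[] args) {
        double indexOverlay = 455347.925;  // MainBoonBenchmark.webxml
        double classicBoon = 401126.575;   // BoonClassicEagerNoLazyParse.webxml
        double jacksonAst = 233730.506;    // JacksonASTBenchmark.webxml

        System.out.printf(Locale.ROOT, "index overlay vs classic Boon: +%.1f%%%n",
                percentGain(indexOverlay, classicBoon));
        System.out.printf(Locale.ROOT, "classic Boon vs Jackson AST: %.2fx%n",
                speedup(classicBoon, jacksonAst));
    }
}
```

The second figure is where the "twice as slow" shorthand comes from: without index overlay, Boon is roughly 1.7 times faster than Jackson on this file.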

If I include full chop and chop, Jackson will come in fourth.
But then I would have to explain what chop and full chop mean, and then I would have to refer you to the article on InfoQ on index overlay and explain when it makes sense and what use cases you can't use it, and then introduce full chop and chop. :)


Jackson used to come after BoonReaderSource, but Tatu has been busy, so now Jackson is only twice as slow as BoonClassicEagerNoLazyParse. So there goes that myth about Boon not doing anything besides index overlay and Boon only being able to handle Strings and not InputStreams.

LET ME REPEAT THAT: Boon using InputStream, and not using index overlay is faster than Jackson.

GSON comes in last place. I remember when I started GSON would often beat Jackson.

Jackson has improved but even the improved Jackson is twice as slow as Boon for parsing with no index overlay.

RE: Input source. Most commonly cited tests start with Java Strings. Strings are rarely used as input source, because they are JVM constructs -- all external input comes as byte streams.

The linked benchmark, which has not changed since this was published and for quite some time before, covers InputStream, String, byte[], etc. It is in the first paragraph. Next time I will use a blink tag. It did not cover skipping index overlay because only in rare cases does index overlay not make sense as a viable alternative to a full parse, and index overlay is actually better for POJO serialization and REST. SO OF COURSE I included it in the benchmark. To think otherwise is just wrong. But there are cases where index overlay does not make sense (thus chop, full chop, and the full parse version), but that is a very nuanced discussion, and an argument that is hard to make when there is some clear FUD. So instead of muddying the water and defending index overlay, let's just keep it at this: Boon does not need index overlay to beat Jackson at JSON parsing. PERIOD!

So when user uses, say, JAX-RS style REST handling, where all JSON data gets bound to a POJO, from an InputStream; and reverse direction goes from another POJO into OutputStream, performance experienced is very different from what a benchmark would suggest.

Actually Boon was designed for exactly the REST / Websocket use case. And it was designed for POJO serialization. You can see that Boon does better than Jackson at POJO serialization in most cases in the benchmarks cited, in the article by Andrey (Groovy in that case, which has some Boon DNA), and in the Gatling benchmark. Its POJO performance was also confirmed by Julien Ponge (the author of Golo, julien.ponge.org/blog/revisiting-a-json-benchmark/). Boon used to be consistently twice as fast as Jackson at POJO object serialization, but Boon added some features which made it slower and Jackson got faster, so now Boon usually wins by 30%, but not always. And once you are doing POJO mapping, the index overlay vs non-overlay point is completely moot.

Let's go bigger. webxml, which is from the JSON.org examples, is small, so let's use a big file: the catalog, which is 170K.

Benchmark                                                    Mode   Thr  Count  Sec      Mean  Mean error  Units
i.g.j.inputStream.MainBoonBenchmark.citmCatalog              thrpt   16      6    1  1061.592     106.462  ops/s
i.g.j.inputStream.BoonClassicEagerNoLazyParse.citmCatalog    thrpt   16      6    1   979.794      51.968  ops/s
i.g.j.inputStream.BoonReaderSource.citmCatalog               thrpt   16      6    1   681.072      37.804  ops/s
i.g.j.inputStream.JacksonASTBenchmark.citmCatalog            thrpt   16      6    1   554.181      26.974  ops/s
i.g.j.inputStream.GSONBenchmark.citmCatalog                  thrpt   16      6    1   538.389      43.384  ops/s
i.g.j.inputStream.MainJacksonObjectBenchmark.citmCatalog     thrpt   16      6    1   486.275      70.753  ops/s

So again, the claim that Boon can't handle things fast unless it uses Strings and unless it uses index overlay is just plain wrong.

This is from an InputStream. Jackson is 4th. Boon index overlay beats it. Boon full parser, the non lazy version, beats it. It beats it by a wide margin. When Jackson does beat Boon, which happens in some use cases, the margin is usually very small. When Boon wins, which happens in most of the test cases, it wins by a wide margin with and without index overlay.

Since no one asked me why Boon is faster or what it does, I won't say, other than that I have been talking about it on my blog for the last six months, and I'd like to work on an article for InfoQ about it with Jakob Jenkov and/or Stephane Landelle at some point.

Jackson can be faster than Boon. It probably will be. But today. Today. It is not.

Jackson and Boon have different philosophies.

Jackson is more mature. Jackson is more stable. Jackson integrates with more frameworks. Jackson has different and possibly more features. There are use cases where Jackson will be faster. There are many use cases where JSON parser speed will not matter. Jackson probably has fewer bugs since it has had more eyeballs for more years.

That said... Boon is faster. Today.
JMeter vs. Gatling: Fact Checking: SHILL! ASTROTURFING SHILL!