Friday, January 28, 2011

Thoughts on jBPM, Spring Integration, Activiti and Spring Batch

The slides are a continuation of my last post. Essentially, just sharing my current thoughts. I would love some feedback from people in the know.

--Rick Hightower

Thursday, January 27, 2011

Draft: Thoughts on Workflow, BPMN, BPM, Batch jobs, Integration Patterns, Spring Batch and Spring Integration

 Raw thoughts on BPMN, Java Workflow, etc.

  1. Use a Business Process Management Engine (Activiti), abbreviated BPME, to get visibility into the push process
  2. Use an Enterprise Integration Patterns approach (Spring Integration), abbreviated EIPM, for messaging/eventing between processes
    1. Use it to message to the BPME
    2. Use it to kick off other processes
    3. Use it to integrate via adapters XMPP, JMS, AMQP, FTP, etc.
    4. Use it to publish events that other processes want to listen to
  3. Use recoverable batch processing management to do our large batch processes (Spring Batch)
    1. Use EIPM to communicate state from batch Jobs/Steps to BPME 
    2. Use retry/recover features
    3. Use transaction batching
    4. Use re-entrant features
    5. Use auditing features 
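
The division of labor in the outline above can be sketched in plain Java. This is only a sketch under stated assumptions: `EventChannel`, `StepEvent`, `ProcessMonitor` and `runStep` are hypothetical names standing in for the roles that Spring Integration (EIPM), Spring Batch and Activiti (BPME) would play. None of them is an API from those libraries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins for the three roles in the outline; these are not
// Spring Integration, Spring Batch, or Activiti types.
final class PushSketch {

    // State change reported by one step of a batch job.
    record StepEvent(String job, String step, String status) {}

    // EIPM role: a minimal in-memory publish/subscribe channel.
    static final class EventChannel {
        private final List<Consumer<StepEvent>> subscribers = new ArrayList<>();

        void subscribe(Consumer<StepEvent> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(StepEvent event) {
            for (Consumer<StepEvent> subscriber : subscribers) {
                subscriber.accept(event);
            }
        }
    }

    // BPME role: keeps the latest known status of every step, so that
    // someone can see where the push currently stands.
    static final class ProcessMonitor implements Consumer<StepEvent> {
        private final Map<String, String> statusByStep = new LinkedHashMap<>();

        @Override
        public void accept(StepEvent event) {
            statusByStep.put(event.job() + "/" + event.step(), event.status());
        }

        Map<String, String> snapshot() {
            return new LinkedHashMap<>(statusByStep);
        }
    }

    // Batch role: runs the work and reports its state through the channel.
    static void runStep(EventChannel channel, String job, String step, Runnable work) {
        channel.publish(new StepEvent(job, step, "STARTED"));
        try {
            work.run();
            channel.publish(new StepEvent(job, step, "COMPLETED"));
        } catch (RuntimeException e) {
            channel.publish(new StepEvent(job, step, "FAILED"));
        }
    }

    public static void main(String[] args) {
        EventChannel channel = new EventChannel();
        ProcessMonitor monitor = new ProcessMonitor();
        channel.subscribe(monitor);

        runStep(channel, "push", "transform", () -> {});
        runStep(channel, "push", "populateCaches", () -> {
            throw new IllegalStateException("cache unreachable");
        });

        System.out.println(monitor.snapshot());
    }
}
```

The point of the split is that the batch code never talks to the process engine directly. It only publishes events, so other listeners can be added without touching the jobs.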

Problem domain:

This is deliberately vague for two reasons: 1) I want to use it to solicit feedback from an external audience, and 2) I don't possess a clear understanding of all of the details of the current process.
Let's say you have the following problem. You have a bunch of producers (photographers, ERP specialists, artists, lawyers, marketing folks, project managers, content providers, etc.) who produce raw data into a large raw data database.
The data in this raw data database is managed by a series of admin tools and content tools. There are 30-some tools.
This raw data database consists of a highly normalized version of the actual data. It also consists of tabular business rules and metadata.
Later this raw data is pushed to the live working system.
The push process goes something like this.
The raw data is converted into domain objects by a decision support system.
The rich domain objects may or may not end up in one of 20 or so data marts (fully relational operational data).
Later this richer domain data is further filtered by a variety of things for a variety of reasons (vague enough).
Some of this richer data is then copied into Service level caches (think memcache).
Then there are content caches (think file systems).
Then there are edge content caches for images, PDFs, and videos (think Akamai).
To simplify, the process is something like this:
  1. Transform raw data into rich domain objects using DSS
  2. Filter rich domain objects into many different operational data marts
  3. Populate Service Caches 
  4. Populate Content Caches
  5. Populate Edge Content Caches
To keep it really simple it would be:
  1. Transform raw data into domain objects
  2. Push domain objects to various data marts
  3. Populate caches
Again realize that most of the above is somewhat contrived and in reality way oversimplified.
The current system consists of 
  1. Korn shell scripts that call into a homegrown but not fully baked workflow engine (written in Java, using custom JSON files)
  2. Perl/Python/Groovy scripts that do batch processing and then send messages to the workflow engine
  3. Expensive proprietary, poorly documented, cluster messaging solution from an evil vendor (that gets used by the workflow engine)
  4. Java services that receive tasks from the aforementioned cluster messaging solution
    1. SOAP Service? No
    2. REST Service? No
    3. Homegrown service framework with custom marshaling and management (also poorly documented)? Yes :(
  5. Unicorn tears
  6. Blood of innocent babies
I bet you can guess what is wrong already. This is a complex system with many moving parts. Each one of those steps above consists of one to many processes.
Chasing down a problem is, well, in a word: HARD!
Only a few elite software engineers can find out what the problems are and fix them. They tail log files, hit admin tools, run custom groovy scripts and collect unicorn tears.
Also, when something goes wrong, the whole process needs to run again from the top after the problem is fixed.
There is no retry. It all works or it all fails.
One bump in the road means no push.
Thus what they really want is a system that is:
  1. Maintainable
  2. Auditable
  3. Traceable
  4. Supportable
  5. Recoverable
They really need the ability to fix a problem and then continue where the process left off, not have to restart the thing from the beginning.
Also they want to track how long each step is taking and need to know where in the process they are.
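
The "fix it and continue where it left off" requirement, together with per-step timing, can be sketched as follows. This is a minimal illustration under assumptions: `ResumablePush` and its methods are hypothetical names, and the checkpoint is kept in memory, whereas a real system (Spring Batch, for instance) would keep it in durable storage.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a push that remembers which steps finished, so a
// rerun after a fix resumes at the failed step instead of at the top.
final class ResumablePush {

    private final Map<String, Runnable> steps = new LinkedHashMap<>();
    // In-memory checkpoint; an assumption made to keep the sketch small.
    private final Set<String> completed = new LinkedHashSet<>();
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    ResumablePush step(String name, Runnable work) {
        steps.put(name, work);
        return this;
    }

    // Runs every step that has not completed yet, in order. Returns the
    // name of the step that failed, or null when the push finished.
    String run() {
        for (Map.Entry<String, Runnable> entry : steps.entrySet()) {
            String name = entry.getKey();
            if (completed.contains(name)) {
                continue; // finished in an earlier run; do not repeat it
            }
            long start = System.nanoTime();
            try {
                entry.getValue().run();
            } catch (RuntimeException e) {
                return name; // stop here; earlier checkpoints are kept
            } finally {
                // Accumulate time spent on this step across all attempts.
                elapsedNanos.merge(name, System.nanoTime() - start, Long::sum);
            }
            completed.add(name);
        }
        return null;
    }

    Set<String> completedSteps() {
        return new LinkedHashSet<>(completed);
    }

    Map<String, Long> elapsedNanos() {
        return new LinkedHashMap<>(elapsedNanos);
    }

    public static void main(String[] args) {
        boolean[] martsAvailable = {false};
        int[] transformRuns = {0};

        ResumablePush push = new ResumablePush()
            .step("transform", () -> transformRuns[0]++)
            .step("pushToDataMarts", () -> {
                if (!martsAvailable[0]) {
                    throw new IllegalStateException("data mart down");
                }
            })
            .step("populateCaches", () -> {});

        System.out.println("first run failed at: " + push.run());
        martsAvailable[0] = true; // someone fixes the problem
        System.out.println("second run failed at: " + push.run());
        System.out.println("transform ran " + transformRuns[0] + " time(s)");
    }
}
```

In the demo the second run skips `transform`, because it already completed, and picks up at `pushToDataMarts`.
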
There is also a future need. Remember all those tools that were used to populate the raw data? If you put the raw data in wrong, it can cause a push to fail.
The admin tools rely on undocumented, ad hoc business processes. It would be nice to document, enforce and control these business processes.
Also some of these business processes have a human interaction to worry about (think approval process and content management).

Tuesday, January 11, 2011

Crank gets a mention

Crank/Krank gets a mention.

I wrote most of the DAO support for Crank and quite a bit of the rest. The project was always called Crank. It is Crank as in Crank something out, i.e., get it done quickly. The idea behind Crank was to idiomatically create GUIs in JSF, Spring MVC, and GWT. I only ever got around to writing the JSF front end. My personal life got a little busy, and then I got a job where I ended up writing embedded C and Python daemons instead of Java. There are about 40 or so projects that I know of that use Crank. I imagine there are probably not a lot of new projects getting started that use Crank. 

I was looking at Hades. It has more momentum than Crank ever had. It also seems to be well done and documented. 

You could extend the Crank GenericDAO support and add methods, but it was not easy.

I used AOP introductions to add the additional finder methods. You could also weave in delete methods and update methods. It also had a very useful Criteria API (before one was added to JPA 2). I got the idea from an IBM developerWorks article (the article used Hibernate). I ported the idea to JPA 1.0. 
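
To give a feel for the finder-method idea, here is a rough sketch. It is not Crank's code: it swaps in a JDK dynamic proxy for Spring AOP introductions and an in-memory list for JPA, and the names `GenericDao`, `EmployeeDao`, `findByDepartment` and `DaoHandler` are hypothetical. The point is only that the DAO interface declares a finder and nobody writes an implementation for it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class FinderSketch {

    record Employee(String name, String department) {}

    // Generic contract shared by all DAOs (hypothetical).
    interface GenericDao<T> {
        void store(T entity);
        List<T> findAll();
    }

    // Entity-specific interface: declares the finder, supplies no code.
    interface EmployeeDao extends GenericDao<Employee> {
        List<Employee> findByDepartment(String department);
    }

    // Handles every call on the proxy. A method named findBy<Property> is
    // answered by filtering on the accessor with the matching name.
    static final class DaoHandler implements InvocationHandler {
        private final List<Object> rows = new ArrayList<>();

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Exception {
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(this, args); // toString, hashCode, equals
            }
            String name = method.getName();
            if (name.equals("store")) {
                rows.add(args[0]);
                return null;
            }
            if (name.equals("findAll")) {
                return new ArrayList<>(rows);
            }
            if (name.startsWith("findBy") && name.length() > 6) {
                // "findByDepartment" -> accessor name "department"
                String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
                List<Object> matches = new ArrayList<>();
                for (Object row : rows) {
                    Method accessor = row.getClass().getMethod(property);
                    if (Objects.equals(accessor.invoke(row), args[0])) {
                        matches.add(row);
                    }
                }
                return matches;
            }
            throw new UnsupportedOperationException(name);
        }
    }

    static EmployeeDao newEmployeeDao() {
        return (EmployeeDao) Proxy.newProxyInstance(
            EmployeeDao.class.getClassLoader(),
            new Class<?>[] {EmployeeDao.class},
            new DaoHandler());
    }

    public static void main(String[] args) {
        EmployeeDao dao = newEmployeeDao();
        dao.store(new Employee("Ann", "Engineering"));
        dao.store(new Employee("Bob", "Marketing"));
        System.out.println(dao.findByDepartment("Engineering"));
    }
}
```

A JPA-backed version would translate the method name into a query instead of looping over a list, but the shape of the interface stays the same.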

The reason it is Krank in the Google source code has nothing to do with what Crank means in German. It was because there was already an open source project called Crank (that did something else), and the Google Code svn thingy would not let me call the project Crank. So the Google project was called Krank but the project was always Crank. Clear? Ok, probably not. Sort of a moot point now. 

Hades is much more focused and has much better documentation. The Crank DAO support is very similar. It has a lot more features. Crank was my first real open source project and I did not really know how to foster a community. I still don't. I think there are a lot of good ideas in Crank. Crank had a validation framework (that worked with JSF and Spring MVC), a GUI component framework, and much more. 

In early 2009, I got hired by a company that used Crank to port Crank features to a Grails plugin. The project was cancelled before it really took off due to budget cuts. 

I think it makes sense to have a technology-agnostic Criteria API. This API could get reused with many persistence mechanisms and state machines and whatever else needed a Criteria. I created one of these as part of Crank. It worked with JPA and straight up SQL. Crank uses JSF 1.2, Spring 2.5 and JPA 1.0. It is old school. 
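
As a rough illustration of what "agnostic to technology" can mean here, the sketch below keeps the criteria as plain data and puts each backend in its own function. Everything in it is hypothetical (`Criterion`, `Eq`, `Gt`, `And`, `toSql`, `toPredicate`); it is not Crank's or JPA's API. One backend renders a parameterized SQL fragment and the other filters in-memory rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical persistence-neutral criteria model.
final class CriteriaSketch {

    // The model is plain data and knows nothing about any backend.
    interface Criterion {}
    record Eq(String property, Object value) implements Criterion {}
    record Gt(String property, double value) implements Criterion {}
    record And(Criterion left, Criterion right) implements Criterion {}

    // Backend 1: render a parameterized SQL WHERE fragment and collect the
    // bind values in order. Property names are assumed to come from trusted
    // code, never from user input.
    static String toSql(Criterion c, List<Object> params) {
        if (c instanceof Eq eq) {
            params.add(eq.value());
            return eq.property() + " = ?";
        }
        if (c instanceof Gt gt) {
            params.add(gt.value());
            return gt.property() + " > ?";
        }
        if (c instanceof And and) {
            return "(" + toSql(and.left(), params) + " AND " + toSql(and.right(), params) + ")";
        }
        throw new IllegalArgumentException("unknown criterion: " + c);
    }

    // Backend 2: evaluate the same criteria against in-memory rows.
    static Predicate<Map<String, Object>> toPredicate(Criterion c) {
        if (c instanceof Eq eq) {
            return row -> Objects.equals(eq.value(), row.get(eq.property()));
        }
        if (c instanceof Gt gt) {
            return row -> row.get(gt.property()) instanceof Number n
                && n.doubleValue() > gt.value();
        }
        if (c instanceof And and) {
            return toPredicate(and.left()).and(toPredicate(and.right()));
        }
        throw new IllegalArgumentException("unknown criterion: " + c);
    }

    public static void main(String[] args) {
        Criterion criteria = new And(new Eq("department", "Engineering"), new Gt("salary", 50000));

        List<Object> params = new ArrayList<>();
        System.out.println(toSql(criteria, params) + " " + params);

        Map<String, Object> row = Map.of("department", "Engineering", "salary", 60000);
        System.out.println(toPredicate(criteria).test(row));
    }
}
```

A state machine or another persistence mechanism would be one more function over the same `Criterion` values.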


The project is mainly historical at this point. I wanted to upgrade it to JSF 2.0, Spring 3 and JPA 2.0, but never got around to it. I think there are a lot of good ideas there.

It is good to see that some of the ideas that were in Crank are becoming mainstream. We were doing what Hades did back in 2006. I am sure someone else was doing it before us. 

Our docs (my fault) were a scattered mess. http://code.google.com/p/krank/w/list

I learned a lot about Spring AOP, JPA, Maven, etc. by writing Crank and we got a lot of work helping folks ramp up with Spring, JPA and JSF.

It was a learning experience.

Monday, January 10, 2011

What am I going to be when I grow up Part 2

I ran the technical side of a company for a few years that I co-owned. I created a lot of great courses. I did a lot of consulting and I created two great frameworks. I worked on a lot of great projects with a lot of cool people.  I got paid to do what I love: program.

It was cool perfecting my knowledge of Hibernate, Spring and JSF. It was fun working with JEE. I don't work for that company any more. It is gone. For a lot of reasons, most of which are not technical.

Now what are the next steps? I used to blog a lot about all of the cool stuff I was working with, but lately that has become less and less. Partly, I was not extremely happy with the blog software I was using. I have been going back and forth between blogger and jroller for, well, hmm, the last nine years. Anyway, I have come back to blogger and I plan on blogging more again.

The other reason is I am busy in the evenings. I have a big family and a new fiancee. I have a lot of family responsibilities. In the last five years, I have had three children enter the school system. They need help with their homework and more. I also have a young teenager who needs help and oversight. It is a full time job.

Another reason is that I have been working with a lot of homegrown frameworks at work. A lot of that stuff I am just not allowed to blog about.

Since the end of ArcMind, I have been working with many different technologies and languages:

  • Python
  • Stomp
  • ActiveMQ
  • Django
  • Groovy
  • Grails
  • C
  • C TCP/IP programming
  • embedded C
  • GTK
  • Perl
  • Roo
  • AspectJ
  • Jackson
  • JSON
  • JavaScript
  • SproutCore
  • and of course Java

One project I worked on required programming in C, Python and Java, bouncing back and forth between many modules communicating over ActiveMQ.

I did something I never thought I would, I programmed in JavaScript full time for a while. I actually learned to like JavaScript at some level.

Now I am working on things I can't blog about. I do work with Hibernate, Spring and some outdated frameworks that no one has touched in 10 years, and tons, tons, tons of home grown Java frameworks that are complex and undocumented (not ones I created either).

I want to start blogging again. I want to blog about things I am researching. But what? In the past, work was always an inspiration.

For a while I was researching Objective C / iPhone / Cocoa development. I even wrote an iPhone math game for my son. We all played it. I planned on extending it and posting it on the store. Then life came along. Now, I am not sure. Do I want to continue and learn more about Objective C? Do I want to focus on things I am already good at? There is so much good stuff out there.

I recently did a lot of research (for work) on Java garbage collection and had a lot of success tuning the misbehaving GC of some of our services. That would have been a good blog post to write up, along with the tools you have to employ to do proper tuning and debugging.

It seems like every other day SpringSource comes up with something new that could be applicable to work.

I am committed to do some blogging about something. Right now I don't know what.

What am I going to be when I grow up

I have landed in the Bay Area. I now live in Pleasanton, CA. I work remotely a few days of the week and go into the office a few days a week. I work with a lot of home grown frameworks.

Gone are my consulting and traveling days. I have too many responsibilities at home to roam the globe (ok it was more like U.S. and Canada). This is a blessing.

I used to target a technology based on what I wanted to work with. Then I would find work. Then I would become an expert. Then I would write courseware. Then I would find more work.

Now there are more important issues. Are the work hours flexible? Can I work from home a few days of the week to be near my children and not stuck in a commute all of the time? Is there good public transit? I can't stand driving in traffic.

More thoughts later.

Just installed syntax highlighting, and I wonder if it works

I used these instructions:

Crazy Fella guide to adding syntaxhighlighter support

And this...

// Comment
public class Testing {
    public Testing() {
    }

    public void Method() {
        /* Another Comment
         on multiple lines */
        int x = 9;
    }
}

Essentially, I had to do this:

<link href='http://alexgorbatchev.com/pub/sh/current/styles/shCore.css' rel='stylesheet' type='text/css'/> 
<link href='http://alexgorbatchev.com/pub/sh/current/styles/shThemeDefault.css' rel='stylesheet' type='text/css'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCpp.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCss.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJava.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPhp.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPython.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushRuby.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushSql.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushVb.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPerl.js' type='text/javascript'/> 
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushGroovy.js' type='text/javascript'/> 

<script language='javascript'> 
SyntaxHighlighter.config.bloggerMode = true;
SyntaxHighlighter.config.clipboardSwf = &#39;http://alexgorbatchev.com/pub/sh/current/scripts/clipboard.swf&#39;;
SyntaxHighlighter.all();
</script>
