What if it were easy to query a complex set of Java objects at runtime? What if there were an API that kept your object indexes (really just TreeMaps and HashMaps) in sync? Well, then you would have Boon's data repo. This article shows how to use Boon's data repo utilities to query Java objects. This is part one. There can be many, many parts. :)
Boon's data repo makes doing index based queries on collections a lot easier.
Why Boon's data repo
Boon's data repo allows you to treat Java collections more like a database, at least when it comes to querying them. Boon's data repo is not an in-memory database, and it cannot substitute for arranging your objects into data structures optimized for your application.
If you want to spend your time providing customer value and building your objects and classes and using the Collections API for your data structures, then DataRepo is meant for you. This does not preclude breaking out the Knuth books and coming up with an optimized data structure. It just helps keep the mundane things easy so you can spend your time making the hard things possible.
Born out of need
This project came out of a need. I was working on a project that planned to store a large collection of domain objects in-memory for speed, and somebody asked an all too important question that I had overlooked: how are we going to query this data? My answer was that we would use the Collections API and the Streaming API. Then I tried to do this... Hmmm...
Boon's data repo augments the streaming API.
Boon's data repo does not endeavor to replace the JDK 8 stream API, and in fact it works well with it. Boon's data repo allows you to create indexed collections. The indexes can be anything (it is pluggable).
At this moment in time, Boon's data repo indexes are based on ConcurrentHashMap and ConcurrentSkipListMap.
By design, Boon's data repo works with standard collection libraries. There is no plan to create a set of custom collections. One should be able to plug in Guava, Concurrent Trees or Trove if one desires to do so.
It provides a simplified API for doing so. It supports linear search for completeness, but I recommend using it primarily for indexed lookups and then using the streaming API for the rest (for type safety and speed).
Sneak peek before the step by step
Let's say you have a method that creates 200,000 employee objects like this:
List<Employee> employees = TestHelper.createMetricTonOfEmployees(200_000);
So now we have 200,000 employees. Let's search them...
First wrap Employees in a searchable query:
employees = query(employees);
Now search:
List<Employee> results = query(employees, eq("firstName", firstName));
So what is the main difference between the above and the stream API?
employees.stream().filter(emp -> emp.getFirstName().equals(firstName)).collect(Collectors.toList());
Using Boon's DataRepo is roughly 20,000% faster! Ah, the power of HashMaps and TreeMaps. :)
Update: a question from a reader
On Saturday, November 2, 2013, Chris B wrote:
"Very interesting, Rick - I only had time to do a quick read of the article, so forgive me if this is answered within your write up. But, I was curious as to the overhead of building the indexes when you wrap your collection in a query object... Say for the 200_000 employees in the example below. How long would it take to build the indexed structure ?"
Thanks Chris. Good question! It would be quite expensive, so you would only use this construct if you plan on holding on to the collection for a while. There is also a repo object if you want to gradually update a collection's indexes.
For a use case: imagine you are pulling a list of employees from memcached every 20 minutes. You want to query against those employees, so you need the indexes, and you want something faster than storing every possible query combination in memcached (yes, I have seen people do this, to the tune of 12 GB... 20x more than the data in the DB, and I have seen it more than once). So you pull the list down and query against the searchable collection. This avoids both the call to memcached and the explosion of every possible query being cached. Anyway... the data repo was written to avoid many of the anti-patterns that I have seen with caching. It gives you a way to query against Java objects.
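To make the "gradually update a collection's indexes" idea concrete, here is a minimal conceptual sketch in plain Java. It is not Boon's actual API (the class and method names here are invented for illustration); it only shows the underlying idea of a repo that updates an index incrementally as objects are added, instead of rebuilding it from scratch on every refresh.

```java
import java.util.*;

// Conceptual sketch only -- NOT Boon's API. A "repo" that keeps a
// search index in sync as objects are added.
class EmployeeRepo {
    static class Employee {
        final String firstName;
        Employee(String firstName) { this.firstName = firstName; }
    }

    // firstName -> employees with that first name (the "search index")
    private final Map<String, List<Employee>> firstNameIndex = new HashMap<>();
    private final List<Employee> all = new ArrayList<>();

    // Adding an object updates the index incrementally.
    void add(Employee e) {
        all.add(e);
        firstNameIndex.computeIfAbsent(e.firstName, k -> new ArrayList<>()).add(e);
    }

    // An indexed query is a map lookup, not a scan over `all`.
    List<Employee> byFirstName(String name) {
        return firstNameIndex.getOrDefault(name, Collections.emptyList());
    }
}
```

The payoff is the same as in the memcached scenario above: you pay the indexing cost once per object as it arrives, and every subsequent query is a cheap map lookup.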
There is an API that looks just like your built-in collections. There is also an API that looks more like a DAO object or a Repo Object.
A simple query with the Repo/DAO object looks like this (the repo allows gradual update of the indexes):
List<Employee> employees = repo.query(eq("firstName", "Diana"));
A more involved query would look like this:
List<Employee> employees = repo.query(
and(eq("firstName", "Diana"), eq("lastName", "Smith"), eq("ssn", "21785999")));
Or this:
List<Employee> employees = repo.query(
and(startsWith("firstName", "Bob"), eq("lastName", "Smith"), lte("salary", 200_000),
gte("salary", 190_000)));
Or even this:
List<Employee> employees = repo.query(
and(startsWith("firstName", "Bob"), eq("lastName", "Smith"), between("salary", 190_000, 200_000)));
Or, if you want to use the JDK 8 stream API, the data repo works with it, not against it:
int sum = repo.query(eq("lastName", "Smith")).stream().filter(emp -> emp.getSalary()>50_000)
.mapToInt(b -> b.getSalary())
.sum();
The above would be much faster if the number of employees were large. It first narrows down to the employees whose last name is Smith, then filters those by salary above 50,000. Say you have 100,000 employees and only 50 named Smith: the index pulls those 50 out of the 100,000, and the stream filter then runs over just 50 objects instead of the whole 100,000.
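The narrowing step can be sketched in plain Java (not Boon's API; `Employee`, `sumSalariesOver`, and the index map here are illustrative assumptions): an index maps lastName to its matching employees, so the stream pipeline only touches the small indexed subset.

```java
import java.util.*;

// Sketch of index narrowing: look up the small subset via the index,
// then run the stream pipeline over that subset only.
class NarrowingDemo {
    record Employee(String lastName, int salary) {}

    static int sumSalariesOver(Map<String, List<Employee>> byLastName,
                               String lastName, int floor) {
        // 1) index lookup narrows, say, 100,000 employees down to the few Smiths
        List<Employee> matches = byLastName.getOrDefault(lastName, List.of());
        // 2) the filter/map/sum pipeline runs over that small subset only
        return matches.stream()
                      .filter(e -> e.salary() > floor)
                      .mapToInt(Employee::salary)
                      .sum();
    }
}
```

The stream code is identical to a full scan; the only difference is that step 1 replaces "iterate everything" with a map lookup.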
Here is a benchmark run from the data repo of a linear search versus an indexed search, in nanoseconds:
Name index Time 218 Boon data repo!
Name linear Time 3542320 Not boon. :(
Name index Time 218
Name linear Time 3511667
Name index Time 218
Name linear Time 3709120
Name index Time 213
Name linear Time 3606171
Name index Time 219
Name linear Time 3528839
Someone recently said to me: "But with the streaming API, you can run the filter in parallel."
Let's see how the math holds up:
3,528,839 / 16 threads vs. 219
220,552 vs. 219.
Indexes win, but it was a photo finish. :)
It was only about 100,000% faster instead of 1,600,000% faster. So close...
By default all search indexes and lookup indexes allow duplicates (except for primary key index).
repoBuilder.primaryKey("ssn")
.searchIndex("firstName").searchIndex("lastName")
.searchIndex("salary").searchIndex("empNum", true)
.usePropertyForAccess(true);
You can override that by providing a true flag as the second argument to searchIndex.
Notice empNum is a searchable unique index.
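The unique/non-unique distinction can be sketched in plain Java (again, a conceptual sketch under assumed names, not Boon's internals): a non-unique search index maps a key to a list of matches, while a unique index such as empNum or the primary key maps a key to exactly one object.

```java
import java.util.*;

// Sketch: non-unique index -> key maps to a list of matches;
// unique index -> key maps to exactly one object, duplicates rejected.
class IndexKinds {
    static class Employee {
        final int empNum; final String lastName;
        Employee(int empNum, String lastName) {
            this.empNum = empNum; this.lastName = lastName;
        }
    }

    // non-unique: duplicates allowed, so values are lists
    final Map<String, List<Employee>> lastNameIndex = new HashMap<>();
    // unique: one object per key; a duplicate empNum is an error
    final Map<Integer, Employee> empNumIndex = new HashMap<>();

    void add(Employee e) {
        lastNameIndex.computeIfAbsent(e.lastName, k -> new ArrayList<>()).add(e);
        if (empNumIndex.putIfAbsent(e.empNum, e) != null) {
            throw new IllegalStateException("duplicate empNum: " + e.empNum);
        }
    }
}
```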
Using Boon's data repo by example
Brief announcement from our sponsor: "Boon = simple, opinionated Java for the novice to expert level Java programmer. Low ceremony. High productivity. A real boon to Java developers!"