Tuesday, October 22, 2013

Java slice notation to split up strings, lists, sets, arrays, and to search sorted sets and maps

If you are new to Boon, you might want to start here. Boon is "Simple opinionated Java for the novice to expert level Java Programmer". Boon is a "Low Ceremony. High Productivity" framework. It is meant to be a "real boon to Java developers!"
Many languages have slice notation (Ruby, Groovy and Python). Boon adds this to Java.
Boon has three slc operators: slc, slc (start only), and slcEnd.
With Boon you can slice strings, arrays (primitive and generic), lists, sets, tree sets, tree maps and more. This article explains slice notation and how it is implemented in Boon. It shows how to use slice notation with arrays, sets, tree sets, etc. You can use Boon slice notation to search TreeMaps and TreeSets easily. With Boon, slicing primitive arrays does not use auto-boxing, so it is very efficient.
Slice notations - a gentle introduction 
The Boon slice operators work like Python/Ruby slice notation:
Ruby slice notation
  arr = [1, 2, 3, 4, 5, 6]
  arr[2]    #=> 3
  arr[-3]   #=> 4
  arr[2, 3] #=> [3, 4, 5]
  arr[1..4] #=> [2, 3, 4, 5]
Python slice notation
  string = "foo bar" 
  string [0:3]  #'foo'
  string [-3:7] #'bar'
What follows is derived from an excellent write up on Python's slice notation:
The basics of slice notations are as follows:
Python Slice Notation
         a[ index ]       # index of item
         a[ start : end ] # items start through end-1
         a[ start : ]     # items start through the rest of the array
         a[ : end ]       # items from the beginning through end-1
         a[ : ]           # a copy of the whole array
Java Slice Notation using Boon:
          idx( a, index )      // index of item
          slc( a, start, end ) // items start through end-1
          slc( a, start )      // items start through the rest of the array
          slcEnd( a, end )     // items from the beginning through end-1
          copy( a )            // a copy of the whole array
  • slc stands for slice
  • idx stands for index
  • slcEnd stands for end slice.
  • copy stands for well, err, um copy of course
The key point to remember is that the end value represents the first value that is not in the selected slice. So, the difference between end and start is the number of elements selected.
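Here is a minimal sketch of that half-open rule in plain JDK Java. It does not use Boon; it leans on java.util.Arrays.copyOfRange, which also excludes its end index. The class name HalfOpenSketch and the method name slice are made up for illustration.

```java
import java.util.Arrays;

class HalfOpenSketch {
    // Hypothetical helper: copies items start through end-1, like a[start:end] in Python.
    static int[] slice(int[] a, int start, int end) {
        return Arrays.copyOfRange(a, start, end);
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 50};
        int[] s = slice(a, 1, 4);
        System.out.println(Arrays.toString(s));  // [20, 30, 40]
        System.out.println(s.length == 4 - 1);   // true: end - start items were selected
    }
}
```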
The other feature is that start or end may be a negative number, which means it counts from the end of the array instead of the beginning. Thus:
Python slice notation with negative index
             a[ -1 ]    # last item in the array
             a[ -2: ]   # last two items in the array
             a[ :-2 ]   # everything except the last two items
Java negative index
             idx   ( a, -1)     // last item in the array
             slc   ( a, -2 )    // last two items in the array
             slcEnd( a, -2 )    // everything except the last two items
Python and Boon are kind to the programmer if there are fewer items than you ask for. Python does not let a slice go out of bounds; at worst it returns an empty list. Boon follows this tradition, but also provides an option to get an exception when an index is out of bounds (described later). In Python and Boon, if you go too far you get the length, and if you try to go under 0 you get 0 (under 0 after the negative index is calculated). By contrast, Ruby gives you nil. Boon copies the Python style because one of the goals of Boon is to avoid ever returning null (you get an exception or an Option instead). (Boon has a second slice operator called zlc which throws an index out of bounds exception, but most people should use slc.)
For example, if you ask for slcEnd(a, -2) (a[:-2]) and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, and with Boon you have that option.
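To make the forgiving behavior concrete, here is a rough sketch of how negative indexes and clamping could be handled for an int array. This is not Boon's source code. The class ForgivingSliceSketch and the helper clamp are hypothetical names, and the rules are only my reading of the description above.

```java
import java.util.Arrays;

class ForgivingSliceSketch {
    // Hypothetical helper: turns a possibly negative index into a position in 0..length.
    static int clamp(int index, int length) {
        if (index < 0) index += length;   // negative counts from the end
        if (index < 0) return 0;          // still under 0 after calculation: use 0
        return Math.min(index, length);   // too far: use the length
    }

    static int[] slc(int[] a, int start, int end) {
        int from = clamp(start, a.length);
        int to = clamp(end, a.length);
        if (from >= to) return new int[0]; // at worst an empty array, never an exception
        return Arrays.copyOfRange(a, from, to);
    }

    static int[] slc(int[] a, int start) { return slc(a, start, a.length); }

    static int[] slcEnd(int[] a, int end) { return slc(a, 0, end); }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(slc(a, -2)));                 // [4, 5]
        System.out.println(Arrays.toString(slcEnd(a, -2)));              // [1, 2, 3]
        System.out.println(Arrays.toString(slc(a, 2, 100)));             // [3, 4, 5]
        System.out.println(Arrays.toString(slcEnd(new int[]{7}, -2)));   // []
    }
}
```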

More slicing

Here are some basic Java types: a list, an array, a set of veggies, a primitive char array, and a primitive byte array.

Declare variables to work with in Boon
  //Boon works with lists, arrays, sets, maps, sorted maps, etc.
  List<String> fruitList;
  String [] fruitArray;
  Set<String> veggiesSet;
  char [] letters;
  byte [] bytes;
  NavigableMap <Integer, String> favoritesMap;
  Map<String, Integer> map;

  //In Java a TreeMap is a SortedMap and a NavigableMap by the way.

Boon comes with helper methods that allow you to easily create lists, sets, maps, concurrent maps, sorted maps, sorted sets, etc. The helper methods are safeList, list, set, sortedSet, safeSet, safeSortedSet, etc. The idea is to make Java feel more like lists and maps are built-in types.

Initialize set, list, array of strings, array of chars, and array of bytes
  veggiesSet  =  set( "salad", "broccoli", "spinach");
  fruitList   =  list( "apple", "oranges", "pineapple");
  fruitArray  =  array( "apple", "oranges", "pineapple");
  letters     =  array( 'a', 'b', 'c');
  bytes       =  array( new byte[]{0x1, 0x2, 0x3, 0x4});
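If you are curious how helpers like these can be written, here is a small sketch using varargs. It is not Boon's implementation. The class name LiteralHelpersSketch is hypothetical, and the choice of ArrayList and LinkedHashSet as backing collections is my assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LiteralHelpersSketch {
    // Builds a mutable list from the arguments, so it reads like a list literal.
    @SafeVarargs
    static <T> List<T> list(T... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    // Builds a mutable set from the arguments; duplicates collapse as usual.
    @SafeVarargs
    static <T> Set<T> set(T... items) {
        return new LinkedHashSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        List<String> fruitList = list("apple", "oranges", "pineapple");
        Set<String> veggiesSet = set("salad", "broccoli", "spinach");
        System.out.println(fruitList.get(1));   // oranges
        System.out.println(veggiesSet.size());  // 3
    }
}
```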

There are even methods to create maps and sorted maps called map, sortedMap, safeMap (concurrent) and sortedSafeMap (concurrent). These were mainly created because Java does not have literals for lists, maps, etc.

Java: Use map operator to generate a SortedMap <Integer, String> and a Map<String, Integer>
   favoritesMap = sortedMap(
          2, "pineapple",
          1, "oranges",
          3, "apple"
   );

   map =    map (
      "pineapple",  2,
      "oranges",    1,
      "apple",      3
   );
You can index maps, lists, arrays, etc. using the idx operator.
Java: Using the Boon Java idx operator to get the values at an index
   //Using idx to access a value.

   assert idx( veggiesSet, "b").equals("broccoli");

   assert idx( fruitList, 1 ).equals("oranges");

   assert idx( fruitArray, 1 ).equals("oranges");

   assert idx( letters, 1 ) == 'b';

   assert idx( bytes, 1 )      == 0x2;

   assert idx( favoritesMap, 2 ).equals("pineapple");

   assert idx( map, "pineapple" )  == 2;
The idx operator works with negative indexes as well.

Java: Using idx operator with negative values
             //Negative indexes

              assert idx( fruitList, -2 ).equals("oranges");

              assert idx( fruitArray, -2 ).equals("oranges");

              assert idx( letters, -2 ) == 'b';

              assert idx( bytes, -3 )   == 0x2;

Ruby, Groovy and Python have this feature. Now you can use this in Java as well! The Java version (Boon) works with primitive arrays so you get no auto-boxing.
Something that Ruby and Python don't have is slice notation for SortedSets and SortedMaps.

You can use slice notation to search sorted maps and sets in Java

Slice notation works with sorted maps and sorted sets.
Here is an example that puts a few concepts together.
              set = sortedSet("apple", "kiwi", "oranges", "pears", "pineapple");

              slcEnd( set, "o" );      //returns ("oranges", "pears", "pineapple")
              slc( set, "ap", "o" );   //returns ("apple", "kiwi")
              slc( set, "o" );         //returns ("apple", "kiwi")
What you are really doing when you slice sorted maps and sorted sets is a between query of sorts.
What item comes after "pi"?
              after(set, "pi") //pineapple
And what comes before "pi"?
              before(set, "pi")
OK, let's go through it step by step....
      NavigableSet<String> set =
              sortedSet("apple", "kiwi", "oranges", "pears", "pineapple");

      assertEquals(
              "oranges", idx(set, "ora")
      );

TreeSet implements NavigableSet and SortedSet.

Brief reminder that TreeSet is a NavigableSet

We can look up the first fruit in the set that starts with 'o' using:
idx(set, "o")
Here it is with the set of fruit we created earlier (set is a TreeSet with "apple", "kiwi", "oranges", "pears", "pineapple" in it).

              assertEquals(
                      "oranges", idx(set, "o")
              );

We found oranges!
Here it is again but this time we are searching for fruits that start with "p", i.e., idx(set, "p").

              idx(set, "p")

Yeah! We found pears!
How about fruits that start with a "pi" like "pineapple" - idx(set, "pi")

              idx(set, "pi")

You could also ask for the item that is after another item. What is after "pi"?
after(set, "pi")

              after(set, "pi")

The "pineapple" is after the item "pi". after and idx are the same by the way. So why did I add an after? So I can have a before!!! :)
What if you want to know what is before "pi"?
before(set, "pi")

              before(set, "pi")

How about all fruits that are between "ap" and "o"? As I promised there is slice notation!
slc(set, "ap", "o")

              assertEquals(
                      sortedSet("apple", "kiwi"),
                      slc(set, "ap", "o")
              );

How about all fruits before "o"? With a sorted set, slc with a single argument returns everything that sorts before that argument.
slc(set, "o")

              assertEquals(
                      sortedSet("apple", "kiwi"),
                      slc(set, "o")
              );

So the fruits before "o" are "apple" and "kiwi".
How about all fruits from "o" onward? (slcEnd: read it as "I am slicing off the end.")
slcEnd(set, "o")

              assertEquals(
                      sortedSet("oranges", "pears", "pineapple"),
                      slcEnd(set, "o")
              );
So the fruits from "o" onward are "oranges", "pears" and "pineapple".
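For comparison, here is the same set of queries written against the plain JDK. The results line up with the NavigableSet methods higher, lower, subSet, headSet and tailSet. That Boon delegates to exactly these methods is my assumption, not something stated above, and the class name SortedSliceSketch is made up.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class SortedSliceSketch {
    // The same fruit as in the walkthrough, in a plain TreeSet.
    static NavigableSet<String> fruit() {
        return new TreeSet<>(Arrays.asList("apple", "kiwi", "oranges", "pears", "pineapple"));
    }

    public static void main(String[] args) {
        NavigableSet<String> set = fruit();
        System.out.println(set.higher("o"));        // oranges: first item after "o"
        System.out.println(set.higher("pi"));       // pineapple
        System.out.println(set.lower("pi"));        // pears: last item before "pi"
        System.out.println(set.subSet("ap", "o"));  // [apple, kiwi]
        System.out.println(set.headSet("o"));       // [apple, kiwi]
        System.out.println(set.tailSet("o"));       // [oranges, pears, pineapple]
    }
}
```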

Safe slicing for list like things

These operators throw an exception if the index is out of bounds:
Java Slice Notation as follows using Boon:
          ix( a, index )       // index of item
          zlc( a, start, end ) // items start through end-1
          zlc( a, start )      // items start through the rest of the array
          zlcEnd( a, end )     // items from the beginning through end-1
  • zlc stands for zero tolerance slice
  • ix stands for zero tolerance index
  • zlcEnd stands for zero tolerance end slice.
  • copy stands for well, err, um copy of course
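As a rough illustration of the zero tolerance idea, here is a plain Java slice that throws instead of clamping. It is only a sketch: StrictSliceSketch, resolve and strictSlice are hypothetical names, and letting negative indexes still count from the end is my assumption about the strict operators.

```java
import java.util.Arrays;

class StrictSliceSketch {
    // Resolves a negative index from the end, then refuses anything out of range.
    static int resolve(int index, int length) {
        int resolved = index < 0 ? index + length : index;
        if (resolved < 0 || resolved > length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
        return resolved;
    }

    static int[] strictSlice(int[] a, int start, int end) {
        int from = resolve(start, a.length);
        int to = resolve(end, a.length);
        if (from > to) {
            throw new IndexOutOfBoundsException("start " + start + " is after end " + end);
        }
        return Arrays.copyOfRange(a, from, to);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        System.out.println(Arrays.toString(strictSlice(a, 0, 2))); // [1, 2]
        try {
            strictSlice(a, 0, 10);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of bounds"); // printed, because 10 > length 3
        }
    }
}
```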

Works with Primitives too so no auto-boxing

Indexing primitives
  byte[] letters =
          array((byte)'a', (byte)'b', (byte)'c', (byte)'d');

          idx(letters, 0)                    // 'a'

          idx(letters, -1)                   // 'd'

          idx(letters, letters.length - 1)   // 'd'

  idx(letters, 1, (byte)'z');                // set index 1 to 'z'

          idx(letters, 1)                    // 'z'
The methods len and idx are universal operators and they work on lists, arrays, sets, maps, etc.
  • len gives me the length of an array-like, list-like, map-like thing.
  • idx gives me the item at the location of an "index" in the array-like, list-like, map-like thing.
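The reason there is no auto-boxing is that you can write one overload per primitive array type. Here is a sketch of what a byte[] overload might look like. The class name PrimitiveIndexSketch is hypothetical, and unlike Boon's forgiving idx this simplified version lets Java throw if the index is out of range.

```java
class PrimitiveIndexSketch {
    // Length of a byte array; other types would get their own overloads.
    static int len(byte[] a) {
        return a.length;
    }

    // Reads an item; a negative index counts from the end. No Byte objects are created.
    static byte idx(byte[] a, int index) {
        return a[index < 0 ? index + a.length : index];
    }

    // Writes an item in place, with the same negative index rule.
    static void idx(byte[] a, int index, byte value) {
        a[index < 0 ? index + a.length : index] = value;
    }

    public static void main(String[] args) {
        byte[] letters = {(byte) 'a', (byte) 'b', (byte) 'c', (byte) 'd'};
        System.out.println((char) idx(letters, 0));   // a
        System.out.println((char) idx(letters, -1));  // d
        idx(letters, 1, (byte) 'z');
        System.out.println((char) idx(letters, 1));   // z
        System.out.println(len(letters));             // 4
    }
}
```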
HOME MC String Slice!
Here are some examples of Boon Java String Slicing
      String letters = "abcd";

      boolean worked = true;

      worked &=

              idx(letters, 0)  == 'a'
                      || die("0 index is equal to a");

      worked &=

              idx(letters, -1)  == 'd'
                      || die("-1 index is equal to d");
Another way to express idx(letters, -1) == 'd' is idx(letters, letters.length() - 1) == 'd'!
I prefer the shorter way!
      worked &=

              idx(letters, letters.length() - 1) == 'd'
                       || die("another way to express what the -1 means");

      //We can modify too
      letters = idx(letters, 1, 'z');

      worked &=

              idx(letters, 1) == 'z'
                      || die("Set the 1 index of letters to 'z'");

      worked &= (
              in('a', letters) &&
              in('z', letters)
      ) || die("'z' is in letters and 'a' is in letters");
Slice Slice Baby!
      letters = "abcd";

      worked &=
              slc(letters, 0, 2).equals("ab")
                  || die("index 0 through index 2 is equal to 'ab'");

      worked &=
              slc(letters, 1, -1).equals("bc")
                      || die("index 1 through index (length -1) is equal to 'bc'");

      worked &=
              slcEnd(letters, -2).equals("ab")
                      || die("Slice off the end of the string!");

      worked &=
              slcEnd(letters, 2).equals("ab")
                      || die("Vanilla Slice Slice baby!");
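If you want to see what is going on under the covers, here is a sketch of string slicing built on String.substring. It is not Boon's code. StringSliceSketch, clamp and withCharAt are hypothetical names. Since Java strings are immutable, "modifying" an index has to return a new string, which is why the example above assigns the result of idx back to letters.

```java
class StringSliceSketch {
    // Hypothetical helper: resolves negative indexes and clamps to 0..length.
    static int clamp(int index, int length) {
        if (index < 0) index += length;
        return Math.max(0, Math.min(index, length));
    }

    static String slc(String s, int start, int end) {
        int from = clamp(start, s.length());
        int to = clamp(end, s.length());
        return from >= to ? "" : s.substring(from, to);
    }

    static String slcEnd(String s, int end) {
        return slc(s, 0, end);
    }

    // Returns a copy of s with the character at index replaced.
    static String withCharAt(String s, int index, char c) {
        char[] chars = s.toCharArray();
        chars[index < 0 ? index + chars.length : index] = c;
        return new String(chars);
    }

    public static void main(String[] args) {
        String letters = "abcd";
        System.out.println(slc(letters, 0, 2));          // ab
        System.out.println(slc(letters, 1, -1));         // bc
        System.out.println(slcEnd(letters, -2));         // ab
        System.out.println(withCharAt(letters, 1, 'z')); // azcd
        System.out.println(letters);                     // abcd, the original is unchanged
    }
}
```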


Slice notation has always been a real boon to me when I used it with Python and Groovy. I plan to use slice notation with Java for years to come. I hope you enjoy Boon Slice Notation.
More to come....

Further Reading:

If you are new to boon start here:

Why Boon

Easily read in files into lines or a giant string with one method call. Boon has Slice notation for dealing with Strings, Lists, primitive arrays, etc. If you are from Groovy land, Ruby land, Python land, or whatever land, and you have to use Java then Boon might give you some relief. If you are like me, and you like to use Java, then Boon is for you too.

Core Boon Philosophy

Core Boon will never have any dependencies. It will always be able to run as a single jar.

Contact Info
blog|twitter|java lobby|Other | richard high tower AT g mail dot c-o-m (Rick Hightower)|work|cloud|nosql