Build an Embedded Array Language in Java

t’s really not surprising that the first “killer app” in the world of personal computing was the spreadsheet. After all, a table filled with numbers is one of the simplest and most useful metaphors for displaying and analyzing data.

Arrays of numbers are a cornerstone of numerical programming. They are so central to it that a number of languages (notably APL and its successors, J and K) have been created that specialize in multidimensional arrays. Proponents of these languages claim that array languages allow for much cleaner, comprehensible code.

If you’ve ever done any kind of financial programming, you’ve dealt with multidimensional arrays of numbers. Dealing with high-dimension arrays can be tricky, because it’s hard to picture the data structures, and hard to draw diagrams. And code itself can be hard to read. Just think back to the last triply-nested for-loop you wrote?you remember?the one that took up three screens of code.

The advantages are great, however. Even though your numerical data structures might have an unwieldy number of dimensions, they have the advantage of homogeneous axes?that is, you can rotate the array through higher-dimensional space, extract lower-dimensional slices, and manipulate and display them. Each axis is similar to every other axis, and so the various axes can all be handled by the same set of routines.

This article discusses the design and implementation of a library for multidimensional arrays, and how it can make programming simpler.

A Note on Generics
For those of you interested in learning about the Java Generics feature of JDK 1.5, note that the library described in this article does use them. The implementation of the Cube relies heavily on the new Generics feature of JDK 1.5. The Cube class is parameterized by T, the type of the individual cells of the cube. One nice aspect of this is that you don’t have to decide beforehand how many dimensions the Cube has?the constructor will take an array of any dimension and recursively construct the cube from it. I have to admit that I was surprised that this worked; it seemed a little magical. The code for that constructor should be enlightening.

As this was the first time I had used this feature in Java, I was surprised at how rarely I needed to specify the type parameter “T” in various parts of the code; the compiler seemed able to infer the type of T pretty often.

The Cube Class
Our code centers around a class called Cube. A Cube is a multidimensional array. The name is a bit misleading, since a cube is three-dimensional, but rest assured that the Cube class handles every dimension from 1 on up.

A Cube is a multidimensional array. In Java, that also means it’s a nested array?an array of arrays, an array of arrays of arrays, and so on. I chose the name “Cube” to avoid any implication of a tree concept, where each nested part can be a different shape; rather, a cube is a regular grid of any dimension.

First, you’ll see how to use a Cube, and after that, you’ll see how it’s implemented.

Creating a Cube
First, let’s start with some data. Here’s a typical array of data:

   Integer data[] = { 10, 20, 30 };

It’s easy to turn this into a one-dimensional cube:

   Cube cube = new Cube( data );
 
Figure 1. A One-dimensional cube: Here’s how you might visualize the data in a one-dimensional cube.

Figure 1 shows how the data would look inside the cube:

It’s just a coincidence that I put 10 in slot 0, 20 in slot 1, and 30 in slot 2. I’m going to do this to make the data nice and uniform so it’s easy to see what’s happening to it; in real life, it would be more like real data.

Modifying Data
One thing you can do with a cube is reverse it along the first axis, for example:

   Cube r = cube.reverse();
 
Figure 2: Reversed Cube: If you reverse the data in Figure 1, you get this cube.

This returns a new Cube that looks like Figure 2:

Note that I said “returns a new Cube.” This class was written in a functional style. In this context, that means that an object’s method returns a new object, rather than modifying itself. This may seem a little strange, but in the context of array programming, it’s very useful. It’s an idea borrowed from other programming languages (and array languages like APL in particular), but it works just fine in Java.

The Map Operation
A map is an operation in which you apply a function to each element of a list. The result of a map is the list of the outputs of each function call. For example, you can map a function called ‘doubler’ across the Cube object, resulting in a cube that looks like Figure 3.

 
Figure 3. A Doubled Cube: The figure shows the results after doubling the cube values in Figure 1.
   Cube d = cube.map( doubler );

But what is doubler? It isn’t a function, exactly, since you can’t pass those around in Java. Rather, it’s a simple object that contains a single function:

   Fun doubler = new Fun() {     public Integer apply( Integer n ) {       return n 2;     }   };   

The doubler class implements the Fun interface, which has a single method called apply(). Apply takes an object of some time, modifies it, and returns the modified version. A Fun is a value transformer; in this case, the transformation takes a number and doubles it. When you hand this Fun to the map() method, it applies the transformation to the whole cube, doubling each cell.

The Fold Operation
A fold is another useful operation you can apply to a multidimensional array. Whereas a map takes an N-dimensional array and returns another N-dimensional array, a fold takes an N-dimensional array and returns an (N-1)-dimensional array. That is to say, it collapses the cube along one dimension.

Here’s an example. Going back to the original [10 20 30] array, you can fold it like this:

   Integer data[] = { 10, 20, 30 }   Cube cube = new Cube( data );   int sum = cube.fold( plus, 0 );

The result?10+20+30 = 60?gets stored in the integer variable sum. So, that adds up all the elements of the array. But how did it turn a fold (whatever that is) into a sum?

The method call was fold( plus, 0 ). One way to look at this is:

   int sum = 0;            // 0   sum = plus( sum, 30 );  // 30 =  0+30   sum = plus( sum, 20 );  // 50 = 20+30   sum = plus( sum, 10 );  // 60 = 10+50

In other words, the code contracts the array by repeatedly calling plus. You can also look at this in a more functional style:

   int sum = plus( 10, plus( 20, plus( 30, 0 ) ) );

Or, using the ‘+’ operator:

   int sum = 10 + (20 + (30 + 0));

The cool thing about fold is that you can use different functions and get different, useful results. For example, if you wanted the product instead of the sum, you could just use ‘times’ and ‘1’ instead of ‘plus’ and ‘0’:

   Integer data[] = { 10, 20, 30 }   Cube cube = new Cube( data );   Integer product = cube.fold( times, 1 );

The preceding code returns 10 (20 (30 1)), or 6000. Of course, you need to define plus and times. You’d do that in much the same way as doubler, except that you use a function that takes two arguments, rather than one:

   Fun2 plus = new Fun2() {     public Integer apply( Integer a, Integer b ) {       return a + b;     }   };      Fun2 times = new Fun2() {     public Integer apply( Integer a, Integer b ) {       return a b;     }   };   

Admittedly, these definitions are a bit wordy, but the great thing is that they are reusable. It’s easy to create a library of mappers and folders that you can refer to by name throughout your code.

Folding Left and Right
The folds shown in the previous section aren’t the only way to fold; in fact, there are two different kinds of fold: left folds and right folds.

The folds in the last section were right folds (also known as foldr), which are the default type. In other words, fold() really means foldr(). When you do a right fold, you combine the elements from right to left, for example:

   Integer data[] = { 10, 20, 30 }   Cube cube = new Cube( data );   int sum = cube.foldr( plus, 0 );

Thus, a right fold as shown above is the same as:

   int sum = 10 + (20 + (30 + 0)); // 60

But you could also do a left fold (also known as foldl), which adds the numbers from left to right:

   Integer data[] = { 10, 20, 30 }   Cube cube = new Cube( data );   int sum = cube.foldl( plus, 0 );

And that’s the same as:

   int sum = ((0 + 10) + 20) + 30; // 60

Obviously, if you’re just adding, you’ll get the same answer from the left fold as from the right fold, because addition is associative?that is, (a + b) + c is the same as a + (b + c). But division isn’t associative; what do you get in that case?

   Float data[] = { 10F, 20F, 30F, 40F };   Cube cube = new Cube( data );   float sum_right = (Float)cube.foldr( dividedby, 1F );   float sum_left = (Float)cube.foldl( dividedby, 1F );

The preceding code results in 0.375 for sum_right, and 2.6666667 for sum_left.

Adding Dimensions
So far, you’ve dealt with only one-dimensional arrays, which isn’t really very challenging. So let’s add more data. Creating a two-dimensional cube is just like creating a one-dimensional cube, only you need two dimensions of data:

 
Figure 4. A Two-dimensional Cube: The figure shows a visualization of the data in a two-dimensional cube.
   Integer data[][] = { { 1010, 1020, 1030 },                        { 2010, 2020, 2030 },                         { 3010, 3020, 3030 } };   Cube cube = new Cube( data );   

This two-dimensional cube looks like Figure 4.

Map works on a two-dimensional array just like it did on a one-dimensional array. Applying ‘doubler’ to this gives you what you would expect, as shown in Figure 5:

A fold works just fine on a two-dimensional array as well. Remember that a fold reduces the number of dimensions by one. Just as you used fold to turn a one-dimensional array (10, 20, 30) into a single number value (the sum 60), you can use a fold to turn a two-dimensional array into a one-dimensional array:

 
Figure 5. A Doubled Two-dimensional Cube: Doubling works on a two-dimensional cube just as it does on a one-dimensional cube.
   Cube f = cube.fold( 'plus', 0 );

Folding the cube that way results in a single column of values, as shown in Figure 6.

 
Figure 6. A Folded Two-dimensional Cube: Folding works on multi-dimensional cubes as well.

Playing with Axes
In higher dimensions, you may often it useful to move the axes around. For example, you might want to swap the axes of the two-dimensional array from the last section. That is, you might want to turn the array shown in Figure 4 into the form shown in Figure 7.

No problem. You simple permute the axes:

   int permutation[] = { 1, 0 };   Cube pcube = cube.permuteAxes( permutation );
 
Figure 7: Swapping Axes: The figure shows a two-dimensional cube after swapping axes.

The array { 1, 0 } controls how to build the output array. In this case, the code will change the array so that axis #1 of the input becomes axis #0 of the output, and axis #0 of the input becomes axis #1 of the output. In other words, the code swaps them.

Map and fold are powerful operations. Their power lies in the fact that they don’t actually describe the computation to be performed?they only describe the way to traverse the data. The computation itself is passed in as a Fun or Fun2.

This separation of traversal from computation is a staple of functional programming; it also touches on the “separation of concerns” idea from Aspect-oriented programming. Thus, the maps and folds shown here are only the beginning; it’s up to you to create new functions to map and fold across your data structures.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: