Tuesday, May 24

What is J?


Superficial introduction:

J was, originally, a notation introduced to describe computer architecture (Kenneth E. Iverson, 1957, Harvard School of Business). Iverson then went to work for IBM and with Adin Falkoff, Fred Brooks and others published a few reports and parts of books. A part of this was some documentation on the System/360, and the notation used to document the architecture evolved into an executable notation (a programming language). This programming language gained some early users but eventually more interactive systems (spreadsheets) began eating into its niche. J is a rewrite of this language, designed to eliminate some of the obstacles which could prevent its use in schools (for example: the original language could not be programmed in ASCII).

So, what's it like to work in J?

Imagine, if you will, that you are working in a computer language which has some new-fangled operators, like "multiplication" and you are talking with people used to working in traditional languages which do not have any such operation.  How would you describe your language?

"Multiplication is like repeated addition."

"Oh, I can do that!  I just use a loop!"

"Ok, but.. polynomials..."

"Hey, how do you make multiplication conditional?"


"I have this loop here where when I=X+Y we subtract H instead of adding K.  How do you do that with this multiplication thing?"

"Um... what is it that you are trying to do?"

(long conversation follows)

"This multiplication stuff is too esoteric.  I'm a practical person and I do not need to do this obscure multiplication stuff.  Besides, my language enforces double-entry bookkeeping, and I really do not want to learn another language if it does not enforce double-entry bookkeeping.  Have I shown you my differential equation simulator that translates mass and acceleration into energy?"


Anyways, that's kind of how it feels like, working in J.

Tuesday, May 3

TDD Bowling and J, revised, part .... 5 of 3?

This is a continuation of my notes on my recent tdd bowling scorer effort.

The problem here, is that code that seems simple, to me, is not simple to someone that does not understand it.  But bringing some that does not understand something up to where they do understand it can be tricky.  They need to have the interest to do so, and the ability to find things that they do understand.  And I can only try to help them.  If I offer things that do not answer their questions, I am not doing anything worthwhile.

But, of course, this is one of those blogs on the edge of nowhere and I am piloting blind: I do not actually have any questions to answer.  But what I can do is take a sentence of code, and break it down into parts, explain each of the parts, and try to show how the parts work together.

For today, the sentence I am going to work on will be: limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0

This is from:

 nextFrameIndex=: ] + 1 + 10 > {.@{~
 limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0
 bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

 isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]
 assert 300 isScore bowlingScores 12#10
 assert 190 isScore bowlingScores 21$9 1
 assert 0 isScore bowlingScores 20$0
 assert 90 isScore bowlingScores 20$8 1

And, here are some examples of how this definition might get used:

   limitSpares 10 3$10
   limitSpares 10 3$1

In other words, the argument will be a table with 10 rows and three columns.  The rows correspond to bowling frames.  The columns are the balls that count towards that frame.  The problem is that while we get three balls in a frame when we have a strike or a spare, we only get to count the first two when we did not score 10 points with those two balls.

In other words, sometimes we need to multiply that third ball score by 0, so we can ignore it when we are totalling our score.  (Yes, we could also do something fancy so that we do not have a third ball score for those frames.  But simplicity is a virtue.)

Speaking of simplicity, let's cut down the size of our data.  Instead of 10 rows, let's just have two rows:

   1 1 10,:10 10 10
 1  1 10
10 10 10
   limitSpares 1 1 10,:10 10 10
 1  1  0
10 10 10

limitSpares will only work on three column tables, but it does not care how many rows we have.

So, first off, let's take this step by step, operating on the data the same way the parser would:

   (+/ .*&1 1 0) 1 1 10,:10 10 10
2 20

The first two balls in the first frame scored 2.  The first two balls in the second frame scored 20.  So the first frame was "open" while the second frame was a "strike".  (If the score for those two balls was exactly 10 it might have been a spare or a strike, but that distinction does not matter here.  If their score was greater than 10, it had to be a strike.)

And, yes, I am "sort of cheating" here.  That's a rather complicated expression in those parenthesis.  But bear with me and I will get back to this.  First though, I want to show how this intermediate result gets used:

   (10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
0 1

Here, we ask: was 10 less than or equal to the score for the first two balls of a frame?  And the answer for the first frame is no (0), while the answer for the second frame is yes (1).   <:

By the way, some people have tried to tell me that using 0 for false and 1 for true is bad.  But 0 and 1 are very special numbers in arithmetic and I think using 0 for false and 1 for true is good.  Anyways, that's the way J works.

   (1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
1 0
1 1

Here, I am adding a column that is all "1"s onto my 0 and 1.  In essence ',.' is all about adding columns onto a table.  In this case neither argument was a table, so they both get re-interpreted to be columns of a table. Here's another example of using ,.

   (9,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
9 0
9 1

The plain number gets repeated as many times as is necessary to match the rows(s) defined in the other argument.

And I can do this twice:

   (1,.1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
1 1 0
1 1 1

Now I have a three column table which exactly matches the size of my original table.  And, I can multiply them together:

   (] * 1,.1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
 1  1  0
10 10 10

Tada... that's what I wanted.

So, what about those multiplications?

Everyone knows what multiplication is, right?  1 * 1 is 1, 0 * 1 is 0.  You can't get much simpler than that.  But when more than one number gets involved, things start getting messy.  You have inner products, outer products, cross products, scalar products, tensor products and... lots of stuff that we are not going to get into here.

Of those products, the two we use here are an unnamed one ("plain multiplication") and "inner product".  "Inner product" is a very popular one, and traditionally in math involving matrices (tables) or vectors (lists), if nothing else is said inner product is what people use.  But, for computers, it's simple to explain inner product in terms of pairwise multiplication.  But going the other way?  It's nightmarish.

Anyways, multiplication in J works just like addition in J -- numbers are paired up and instead of being added they are multiplied.

   1 2 3 * 4 5 6
4 10 18

So, what about inner product?  Well, in J, the decimal point taken by itself (with a space to its left) is an operator used to create inner product verbs.  For the standard mathematical inner product use +/ on its left and * on its right.

   1 2 3 +/ .*4 5 6

For this program it's sufficient to know that the last dimension of the left argument must match the first dimension of the right argument.  These inner dimensions are paired up and the corresponding numbers are multiplied and then these results are added up.

Anyways, (+/ .*&1 1 0) 1 1 10,:10 10 10 gives the same result as (1 1 10,:10 10 10) +/ .* 1 1 0

As it happens the right argument for the derived verb +/ .*&1 1 0 winds up being the left argument for the verb +/ .*

Basically, what it's doing is: adding up the first two numbers of our argument table.

(Warning: you might want to stop reading here.)

But why is that so complicated?   It's so much simpler to go:

   int[][] table;
   for (int j= 0; j<table.length; j++) {
      int[] row= table[j];
      int sum= 0;
      for (int k= 0; k< row.l; k++) {
no wait  maybe

   int[][] table;
   for (int j= 0; j<table.length; j++) {
      int sum= table[j][0]+table[j][1];

no wait, this table stuff is messy, let's just compute the total score that's a lot simpler.   Then we can use one loop to traverse the raw bowling score numbers and we just need some state variables to keep track of what we are doing...


"Simple" really depends on what you understand.  The underlying concept behind +/ .*&1 1 0 is not that hard to hold onto... but on the one hand (in traditional mathematics) inner product has been made so fundamental that talking about how to get there becomes difficult and on the other hand (when programming in a traditional programming language) the iteration process has been made into such an important part of the grammar that it's difficult to abstract the algorithm.

That said, note that I have glossed over the grammar of the dot operator here (and the rules for the slash operator).  Those rules are relatively simple, but they are different from the rules for + and *.