r-nd-m: 2011

Friday, June 24

Money: apples vs. oranges

According to http://en.wikipedia.org/wiki/Money_supply, almost 2 trillion dollars of currency existed on 4 Nov 2009. compared to less than 1 trillion existing in 2007. And the corresponded to 8.36 trillion in savings (Nov 2009) compared to 7.41 trillion in savings in 2007. http://www.investopedia.com/articles/08/fight-recession.asp has some perspective on this issue.

According to http://en.wikipedia.org/wiki/Reserve_requirement, "As of 2006 the required reserve ratio in the United States was 10% on transaction deposits and zero on time deposits and all other deposits."

There is apparently a more comprehensive measure of how much money exists (M3), but we no longer use it. That 0 required reserve ratio on various classes of deposits might be the federal reserve considers M3 to be uninformative. (But I am not certain that having that kind of money being treated as "meaningless" is a good thing for our economy.)

According to http://en.wikipedia.org/wiki/United_States_public_debt, the U.S. federal debt was 14.32 trillion on May 6, 2011. Projecting backwards, this would have been about 12.62 in 2010 (because the debt increased by 1.7 trillion in 2010) and 10.72 in 2009 (because the debt increased by 1.9 trillion in 2009).

But the real fun begins at http://en.wikipedia.org/wiki/Quantitative_easing#Credit_easing

I am still searching to find out how we quantify issues such as death and bankruptcy (where savings and loans lose traction).

Tuesday, June 21

Economics?

The mark of a worthwhile treatment of economics is that someone can get rich by applying the ideas. I wonder if there is anything worthwhile about this blog entry.

I am going to try a scratchpad post, with lots of "first approximations" -- ideas that currently seem obvious to me, but which do not reach deep, and which might have something fundamentally wrong with them. I think I have some good ideas here (but my thinking that is not enough, of course).

So, ok, first off, money. Dollars.

Popular wisdom has it that Dollars are pieces of paper backed by nothing. There's this guy in the federal reserve and when he thinks we need more dollars, he writes down a note that says create x dollars, and all of a sudden we have x additional dollars in our economy.

And, yes, there's a bit of truth to that.

But that does not mean that dollars are backed by nothing. For one thing, we have to pay our taxes in dollars. This means that dollars are backed by taxpayers. Taxpayers have to collect dollars to pay their taxes, so the efforts of taxpayers are available to anyone with dollars.

Another issue is that people like things to mostly stay the same. That is pretty much the whole reason we have government. And governments tend to topple when things change too much or too fast. Which is not exactly a backing for money (though there might be something there that I am not bright enough to grasp), but it is a reason to believe that the current system of dollars and taxes will be with us for a long time.

End of thought.

Except note that the people who are closer to the issuance of dollars will tend to have some advantages in collecting them.

And, note that this sort of suggests that to really understand economics we would have to understand not just the dollars but the people behind them. Dollars are easier to count, but I think we can make a few observations based on this hypothesis:

First: some people are "worth more than others". This is going to be situational -- the people that are worth more to you are going to be the people you wants stuff from, and if you are trading in dollars they will be the people you are giving your dollars to.

Second: economic worth has a rough correlation to population size. But this is complicated by the whole thing where some people do things you want and others do not.

But this also suggests that high unemployment corresponds in some way to untapped wealth. If you can find unemployed people that you like dealing with and if you can help them and if they can help you, you will both be better off. And you might make friends besides?

But I want to tie this back to The War On Drugs, and The Housing Bubble and The Educational System and maybe a few other issues, and 9/11 and a few other issues.

So... 9/11

Specifically Cantor Fitzgerald. According to wikipedia:

The firm is one of twenty primary dealers who trade U.S. government securities directly with the Federal Reserve Bank of New York,^[1] and is also involved in investment banking, asset management, market data, and brokerage services.

I was looking at a lot of data on 9/11 victims and I noticed that a lot of them were Cantor Fitzgerald employees. And, almost universally, they were really nice folk who would go out of their way to help other people. Which sort of makes sense, given that they were in the office when most people would be elsewhere.

But, given that I am supposing that our real economic wealth is our people and that the dollars are just a way of gaining access to that wealth, I am also imagining that having this kind of person between the issuance of dollars and their use was important for for making our wealth "real". This kind of person would be naturally inclined, I imagine, to direct money in ways that it would do the most good -- which also happens to meant that they would be inclined to direct money in ways that "generate the most wealth".

Their loss, I imagine probably made our markets a fraction of a percent less efficient than they had been. And, as any market economist can tell you, a tiny fraction of a percent loss in efficiency can snowball.

Of course I might be wrong here, or I might even be irrelevant -- I suspect I am right, but it's a very impressionistic sort of "right" if I am right. I have no clue how to measure the kind of efficiency I am hypothesizing here. But, if I am right, it's a sort of right that would probably be mostly useful in the HR department of big financial outfits.

P.S. I am not envisioning here, that these guys acted like loan officers dealing with small loans. I am, however, envisioning that they had something of a model for the consequences of their decisions -- that sometimes they would have insights beyond the bare facts of risk and profit.

Taxes

Remember that my idea was that dollars have value because people have to pay taxes with them. Unless they also have value for other reasons, and unless those other reasons have some overriding importance, this means that raising taxes devalues dollars (causes inflation). This is because raising taxes increases the rate at which dollars have to flow without increasing the wealth behind the dollars.

If I am right about the reason that dollars have value, to increase actual wealth you have to increase the size of our population, increase their general education level and/or get them involved in worthwhile projects.

One way of increasing the size of our population involves having babies, letting them grow up and so on -- this has something like a 20-30 year lag time. And we also have a lot of people worrying about population growth. And they have some good reasons behind their worries. I could go on about population growth, crowding, cities, farms, population distribution and so on... and maybe I should, but not right now.

Another way of increasing the size of our population involves immigrants. Historically speaking, much of the U.S. population is an immigrant population. (There were also hundreds of nations here before the U.S. and many of them have been wiped out, and a few of them are doing well and most of the remainder are doing poorly, but that's another series of blog entries that is just going to have to wait.) Anyways, this approach has a fair bit of angst associated with it. And, again, some of the reasoning behind that angst might be good reasoning.

But we have the resources to support a much larger population than what we currently have. And, in fact, one of the big problems we are currently facing, economically, is a lack of growth in the housing market. We want to support a increase in family homes but we do not have the people interested in buying them (in part because people are being economical and doubling up on housing).

Which brings me to the Housing Bubble.

So, ok, the housing bubble is bad. Mkay? The housing bubble means that dollars got way disconnected from actual wealth.

Too obvious, right?

So, how about the War on Drugs?

If you look into the history of the War on Drugs, another country went into a severe loss of face and power because of addiction problems. They did it to themselves but they apparently had some outside help also. So, to keep that from happening here, we have the War on Drugs -- we do not want that happening here, right?

But, as near as I can tell, the underlying issue (which is that people can get diverted from productive activities to non-productive activities) is not being addressed in the War on Drugs. In other words, I think that it's an attack on a symptom which does not halt the disease. Ferinstance:

http://usat.ly/jFu7yL The economic winners of the last decade are states that focus on raw materials, government and senior citizens. The big losers are places that make things

In other words: from one point of view we have shifted our economy from doing something useful (making things) to keeping things from changing (government and senior citizens) and raw materials.

The raw materials thing is interesting, and maybe a hopeful thing? But other than that we apparently are not being very productive or worthwhile. Which, in turn suggests a loss of wealth behind our dollars.

Which, I think leads us to things like:

http://www.thenewfederalist.eu/Europe-vs-USA-Whose-Economy-Wins US growth, unlike that in the EU, is funded by a dangerously high mountain of foreign debt

I really should find more quotes for this blog. I found this quote because I was talking to a friend about my thinking here and he pointed out to me that our economy was being dragged down by failures in the European economies. So I went out to find out about that issue... maybe I need to do more searching?

Anyways, these are half formed thoughts right now.

Currently... ok, currently, if you are trying to survive, perhaps the safest way to do it is to try to get a government job. We have a lot of rules and regulations about what you are allowed to do and how you are allowed to do it, but jumping into them head first seems (temporarily at least) like the best way to get access to the food supply and to housing. (But that's certainly not the only possibility. Here's a representation of Moody's Analytics projections: http://www.usatoday.com/money/economy/story/Jobs-Forecast-2011/34083932/1 -- requires flash.)

I think I read that something like 3% of the U.S. population is farmers. I do not know how much of the population is housing. But it's not most of us.

In the long run, though? In the long run, we need people doing things which other people value. And, for better or worse, no one seems to want to pay for government unless they are forced to do so. This suggests that things need to change, if the government is going to live up to its purpose (which is to keep things from changing too much or too fast).

How's that for a contradiction?

facebook graph

I have started playing with facebook's graph api.

One question which seems to get asked a lot (since you have to know the answer to this in various contexts) is: what is my facebook id?

And I have seen a lot of bad answers to that question.

Here is a good answer:

Step 1: go to your facebook profile page.
Step 2: change the 'www' in the url to 'graph'.
Step 3: your user id will be in the result on the line that says: id:

Tuesday, May 24

What is J?

http://jsoftware.com

Superficial introduction:

J was, originally, a notation introduced to describe computer architecture (Kenneth E. Iverson, 1957, Harvard School of Business). Iverson then went to work for IBM and with Adin Falkoff, Fred Brooks and others published a few reports and parts of books. A part of this was some documentation on the System/360, and the notation used to document the architecture evolved into an executable notation (a programming language). This programming language gained some early users but eventually more interactive systems (spreadsheets) began eating into its niche. J is a rewrite of this language, designed to eliminate some of the obstacles which could prevent its use in schools (for example: the original language could not be programmed in ASCII).

So, what's it like to work in J?

Imagine, if you will, that you are working in a computer language which has some new-fangled operators, like "multiplication" and you are talking with people used to working in traditional languages which do not have any such operation. How would you describe your language?

"Multiplication is like repeated addition."

"Oh, I can do that! I just use a loop!"

"Ok, but.. polynomials..."

"Hey, how do you make multiplication conditional?"

"Huh?"

"I have this loop here where when I=X+Y we subtract H instead of adding K. How do you do that with this multiplication thing?"

"Um... what is it that you are trying to do?"

(long conversation follows)

"This multiplication stuff is too esoteric. I'm a practical person and I do not need to do this obscure multiplication stuff. Besides, my language enforces double-entry bookkeeping, and I really do not want to learn another language if it does not enforce double-entry bookkeeping. Have I shown you my differential equation simulator that translates mass and acceleration into energy?"

...

Anyways, that's kind of how it feels like, working in J.

Tuesday, May 3

TDD Bowling and J, revised, part .... 5 of 3?

This is a continuation of my notes on my recent tdd bowling scorer effort.

The problem here, is that code that seems simple, to me, is not simple to someone that does not understand it. But bringing some that does not understand something up to where they do understand it can be tricky. They need to have the interest to do so, and the ability to find things that they do understand. And I can only try to help them. If I offer things that do not answer their questions, I am not doing anything worthwhile.

But, of course, this is one of those blogs on the edge of nowhere and I am piloting blind: I do not actually have any questions to answer. But what I can do is take a sentence of code, and break it down into parts, explain each of the parts, and try to show how the parts work together.

For today, the sentence I am going to work on will be: limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0

This is from:

nextFrameIndex=: ] + 1 + 10 > {.@{~
limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0
bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]

assert 300 isScore bowlingScores 12#10
assert 190 isScore bowlingScores 21$9 1
assert 0 isScore bowlingScores 20$0
assert 90 isScore bowlingScores 20$8 1

And, here are some examples of how this definition might get used:

limitSpares 10 3$10
limitSpares 10 3$1

In other words, the argument will be a table with 10 rows and three columns. The rows correspond to bowling frames. The columns are the balls that count towards that frame. The problem is that while we get three balls in a frame when we have a strike or a spare, we only get to count the first two when we did not score 10 points with those two balls.

In other words, sometimes we need to multiply that third ball score by 0, so we can ignore it when we are totalling our score. (Yes, we could also do something fancy so that we do not have a third ball score for those frames. But simplicity is a virtue.)

Speaking of simplicity, let's cut down the size of our data. Instead of 10 rows, let's just have two rows:

1 1 10,:10 10 10
1 1 10
10 10 10
limitSpares 1 1 10,:10 10 10
1 1 0
10 10 10

limitSpares will only work on three column tables, but it does not care how many rows we have.

So, first off, let's take this step by step, operating on the data the same way the parser would:

(+/ .*&1 1 0) 1 1 10,:10 10 10
2 20

The first two balls in the first frame scored 2. The first two balls in the second frame scored 20. So the first frame was "open" while the second frame was a "strike". (If the score for those two balls was exactly 10 it might have been a spare or a strike, but that distinction does not matter here. If their score was greater than 10, it had to be a strike.)

And, yes, I am "sort of cheating" here. That's a rather complicated expression in those parenthesis. But bear with me and I will get back to this. First though, I want to show how this intermediate result gets used:

(10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
0 1

Here, we ask: was 10 less than or equal to the score for the first two balls of a frame? And the answer for the first frame is no (0), while the answer for the second frame is yes (1). <:

By the way, some people have tried to tell me that using 0 for false and 1 for true is bad. But 0 and 1 are very special numbers in arithmetic and I think using 0 for false and 1 for true is good. Anyways, that's the way J works.

(1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
1 0
1 1

Here, I am adding a column that is all "1"s onto my 0 and 1. In essence ',.' is all about adding columns onto a table. In this case neither argument was a table, so they both get re-interpreted to be columns of a table. Here's another example of using ,.

(9,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10

9 0

9 1

The plain number gets repeated as many times as is necessary to match the rows(s) defined in the other argument.

And I can do this twice:

(1,.1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
1 1 0
1 1 1

Now I have a three column table which exactly matches the size of my original table. And, I can multiply them together:

(] * 1,.1,.10 <: +/ .*&1 1 0) 1 1 10,:10 10 10
1 1 0
10 10 10

Tada... that's what I wanted.

So, what about those multiplications?

Everyone knows what multiplication is, right? 1 * 1 is 1, 0 * 1 is 0. You can't get much simpler than that. But when more than one number gets involved, things start getting messy. You have inner products, outer products, cross products, scalar products, tensor products and... lots of stuff that we are not going to get into here.

Of those products, the two we use here are an unnamed one ("plain multiplication") and "inner product". "Inner product" is a very popular one, and traditionally in math involving matrices (tables) or vectors (lists), if nothing else is said inner product is what people use. But, for computers, it's simple to explain inner product in terms of pairwise multiplication. But going the other way? It's nightmarish.

Anyways, multiplication in J works just like addition in J -- numbers are paired up and instead of being added they are multiplied.

1 2 3 * 4 5 6
4 10 18

So, what about inner product? Well, in J, the decimal point taken by itself (with a space to its left) is an operator used to create inner product verbs. For the standard mathematical inner product use +/ on its left and * on its right.

1 2 3 +/ .*4 5 6
32

For this program it's sufficient to know that the last dimension of the left argument must match the first dimension of the right argument. These inner dimensions are paired up and the corresponding numbers are multiplied and then these results are added up.

Anyways, (+/ .*&1 1 0) 1 1 10,:10 10 10 gives the same result as (1 1 10,:10 10 10) +/ .* 1 1 0

As it happens the right argument for the derived verb +/ .*&1 1 0 winds up being the left argument for the verb +/ .*

Basically, what it's doing is: adding up the first two numbers of our argument table.

(Warning: you might want to stop reading here.)

But why is that so complicated? It's so much simpler to go:

int[][] table;
for (int j= 0; j<table.length; j++) {
int[] row= table[j];
int sum= 0;
for (int k= 0; k< row.l; k++) {
...
no wait maybe

int[][] table;

for (int j= 0; j<table.length; j++) {

int sum= table[j][0]+table[j][1];

...

no wait, this table stuff is messy, let's just compute the total score that's a lot simpler. Then we can use one loop to traverse the raw bowling score numbers and we just need some state variables to keep track of what we are doing...

Ahem...

"Simple" really depends on what you understand. The underlying concept behind +/ .*&1 1 0 is not that hard to hold onto... but on the one hand (in traditional mathematics) inner product has been made so fundamental that talking about how to get there becomes difficult and on the other hand (when programming in a traditional programming language) the iteration process has been made into such an important part of the grammar that it's difficult to abstract the algorithm.

That said, note that I have glossed over the grammar of the dot operator here (and the rules for the slash operator). Those rules are relatively simple, but they are different from the rules for + and *.

Monday, April 18

(I started this entry with little time, because of a short lunch meeting, and am coming back to it after a weekend, I hope it's not too disjointed. Not that I necessarily need any help, in being disjointed...)

In TDD bowling, and J, revised, part3..., I had said that I wanted to come back and write up something about how J interprets some of that code. But I have had some other thoughts since then, and let's see how this comes out.

First off, here is the code:
nextFrameIndex=: ] + 1 + 10 > {.@{~
limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0
bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]

assert 300 isScore bowlingScores 12#10
assert 190 isScore bowlingScores 21$9 1
assert 0 isScore bowlingScores 20$0
assert 90 isScore bowlingScores 20$8 1

So, here is the vocabulary:

# $ & ( ) * + , ,. -: . / 0 1 1 1 0 10 10 3 12 190 2 20 21 3 300 8 1 9 1 90 <: =: > >: @ \ ] ^: assert bowlingScores i. isScore limitSpares nextFrameIndex { {. ~

That's a bit mixed up though, because a sequence of numbers in J counts as one word. And, yes, this means that you can have a word in J which contains a space. But that in and of itself is not very novel -- most languages will treat a quoted string as a single "token". But in J, you need other words to separate numeric lists. 1 2 + 3 4 would be three words. And, my presentation, above, does not really show word boundaries for numbers (but I have played with bold, to help mark off the different numbers that I used).

Anyways, numbers just represent themselves.

Next, here is an informal rundown on the words as used here. Feel free to skip over it -- it's for referring back to, when I get into examples, in case you feel lost.

This initial summary will be a bit ambiguous, to keep from getting bogged down in details. I will give some examples (later) and references to hopefully help deal with the ambiguities. I will also include links into J's dictionary that might also help.

Monadic verbs (traditional "functions" with their arguments on the right):

   # Gives the number of items in a list
   +/ derived verb meaning "sum the items in a list"
   , Removes any extra structure from an array, turning it into a flat list
   ] Identity function: its result is its argument
   i. List of indices: 0 1 2 3 ... with size of result specified by argument
   {. The first item of a list

   bowlingScores given a list of ball scores from a bowling game, arranges them in rows and columns, with one row for each frame, such that the sum of the entire result will be the score of the game.
   limitSpares given an array representing the frames from a bowling game, makes sure that only the first two balls count when the frame is neither a spare nor a strike (when you have a spare or a strike, you need a sequence of three balls to get the score)

Dyadic verbs (traditional "operations", like +, with arguments on left and right). I will use x as a placeholder for the left argument and y for the right argument.:

   # Make x copies of y (if they both are sequences pair corresponding elements)
   $ make an x dimensional array from y (keep repeating y as many times as you need)
   * x*y is x times y (x1,x2,x3)*(y1,y2,y3) is ((x1*y1),(x2*y2),(x3*y3))
   , make a list with the elements from x followed by the elements from y
   ,. like , but x and y represent columns
   -: 1 when x and y as a whole are identical, 0 otherwise
   <: 1 when an element of x is less than or equal to an element of y, 0 otherwise
   > 1 when an element of x is greater than an element of y, 0 otherwise
   >: 1 when an element of x is greater than or equal to an element of y, 0 otherwise
   ] right identity: its result is its right argument
   { elements from y indexed by x
   {. the prefix of y selected by x (if x is greater than #y, pad the result)

   nextFrameIndex if x is a list of potential frames (or just a list of ball scores), and y is the index of one that we know is a valid frame, the result is the index of the frame after y.

Adverbs (these modify a verb). I will use u as a placeholder for the verb (which always appears to the left of the adverb):

Adverbs used monadically:
   / Insert verb between the items of y: +/1 2 3 gives the result of evaluating 1+2+3

Adverbs used dyadically:
   \ apply u to each of the sublists of y which contain x elements
   ~ swap the arguments: x u~ y is the same as y u x

Conjunctions (these are combining words that work on verbs, they bind very tightly). I will use u for the left verb and v for the right verb, here. But conjunctions can also consume a noun instead of a verb, so I will use m for a left noun and n for a right noun. Some conjunctions behave very differently with nouns.

   & When given two verbs, the verb on the right (v) becomes the preprocessor for the verb on the left (u). In the dyadic case, v is used on each argument and those results are combined using u.
   & When given a verb and a noun, the verb (u or v) is curried with that noun (n or m). The resulting monadic verb looks for its remaining argument on its right.

   . This is a special conjunction for defining inner products. The usual matrix multiplication inner product is +/ .* (and you have to have a space to the left of the decimal).

   @ When given two verbs, the verb on the left (u) becomes the postprocessor for the verb on the right (v). In the dyadic case, v is used to combine the left and right arguments.

   ^: The verb on the left (u) gets used repeatedly, the number of times it gets used is specified by the noun on the right (n). If n is 0, y is the result. If n is 1, the verb is used once, just like normal. If n is 2, the verb is used twice with the result of the first evaluation being the right argument for the second evaluation. (In a dyadic context, x will be used, unchanged, for each evaluation). f n is a list of numbers the result has an item corresponding to each element of n.

(Note, by the way, that adverbs and conjunctions in English also tend to be rather contextual in their meanings.)

But enough of this reference stuff, let's get to some examples (if you were skipping over the vocabulary material, here is a good place to stop skipping).

   nextFrameIndex=: ] + 1 + 10 > {.@{~

What does this look like, in action?

   (12$10) (] + 1 + 10 > {.@{~) 7
8

Um... ok, what did that tell us?

Here, the left argument represents ball scores and the right argument represents an index into them that marks the begining of a frame. The result will be an index either 1 or 2 greater than the right argument. It's 1 greater when the index selects a ball that knocked down 10 pins, otherwise it's 2 greater.

Let's break this down into simpler expressions:

   (12$10) ({~) 7
10

Here, we are selecting a ball which had a score of 10.

(12$10) ({.@{~) 7

Here, we are selecting a ball which had a score of 10.

Hey, wait a minute! What was the point of that extra stuff?

A problem here is that I gave the left argument in a form that the code did not actually use. In "real life" it would have looked more like this:

(10 3$10) (] + 1 + 10 > {.@{~) 7

In other words, I would have given it 10 rows and each row would represent one frame. So, let's start over:

(10 3$10) ({~) 7

10 10 10

The eighth frame had three balls and all of them were perfect scores. (As is usual for indices in most modern programming languages the first value is selected by the index 0. Here, "first" would be a cardinal number and "0" an ordinal number.)

(10 3$10) ({.@{~) 7

The first ball of the eighth frame was a perfect 10.

(10 3$10) (10 > {.@{~) 7

10 is not greater than the score of the first ball in the eighth frame

(10 3$10) (1 + 10 > {.@{~) 7

The offset to the beginning of the next frame is 1.

(10 3$10) (7 + 1 + 10 > {.@{~) 7

The index of the next frame is 1 (that's the ninth frame). Note that here I plugged in the value 7 for the current index. In the real code I instead used the function ] which always has as its value the value of its right argument. In other words:

(10 3$10) (] + 1 + 10 > {.@{~) 7

But wait, how did ] get 7 for its right argument?

The answer is: when we have an odd number of verbs arranged in a isolated train (and the parenthesis here are sufficient to isolate them), the odd ones get the same right argument as the train as a whole. And if the train has a whole has a left argument, all of the odd verbs get that for their left argument also where if the train as a whole has no left argument none of the odd verbs get a left argument.

Meanwhile, the even verbs are combining verbs. They combine the results from the odd verbs (working from right to left, just like J normally does -- and note also that arabic numbers are also designed to work from right to left with the least part on the right).

But wait, so how does that work when you have numbers like 1 and 10 in it? Those are not verbs! Those are nouns!!

The answer is: if we use a noun in the odd positions in what would otherwise be a verb train, we treat them as verbs which always produce themselves as their result. The rightmost word has to be a verb, but the other odd positions can be nouns.

And yes, I can imagine some grumbling about how this odd and even business seems contrived. But it corresponds rather closely to some notation that winds up being used rather often in mathematics. Anyways, that's the way J works.

And, I am out of time again...

Thursday, April 14

r-nd-m: r-nd-m: TDD bowling, and J, revised, part3

This is a continuation of TDD bowling, and J, revised, part 1 and TDD bowling, and J, revised, part 2

I left off with:

isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]

F=: ] + 1 + 10 > {.@{~

bowlingScores=: ({~ F^:(i.10)&0)@(3 ]\ ])

assert 300 isScore bowlingScores 12#10

assert 190 isScore bowlingScores 21$9 1

...and the first thing we need to do is find a better name for F, one that will be useful without having to go back and re-read yesterday's blog. I'm going to go with nextFrameIndex.

nextFrameIndex=: ] + 1 + 10 > {.@{~

bowlingScores=: ({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])

and with that, I can get on to my next test:

assert 0 isScore bowlingScores 20#0

This fails:

   assert 0 isScore bowlingScores 20#0
|index error: bowlingScores
| assert 0 isScore bowlingScores 20#0

My problem here is that I am indexing off the end of my array. 3 ]\ 20$0 has 18 elements but the ninth (and last) invocation of nextFrameIndex will give me 18 as its result, and the largest valid index is 17.

The simple solution here, I think, is to pad out my result so that 18 will always be a valid index. Since I only have 10 frames (corresponding to 0 through 9 invocations of nextFrameIndex) that should be enough. And, since I am ultimately adding up the results, padding with zeros will not change the ultimate score. (And, in fact, I have been planning on using zeros, all along, to deal with an issue that I do not yet have a test for.)

bowlingScores=: ({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

That's good enough for now, though my definition is getting a bit long-winded. I also have multiple "magic numbers" here. I am not sure that that is bad, but here they are:

10: The number of frames in a complete game
21: The number of balls in the longest possible game
3: The number of balls in the longest possible frame

I think that that information is interesting enough that I would leave it in a comment in the completed program.

But now we need another test:

   assert 90 isScore bowlingScores 20$8 1

And the current implementation of bowlingScores gives me a score of 162 -- it needs to deal with frame that are neither spares nor strikes. In other words, if the sum of the first two elements of a frame is less than 10 the third element of my representation of that frame must be a zero.

One approach could be:
   (0&{, 1&{, 2&{ * 10 > 0&{ + 1&{)

For example:

   3 ]\ 7$8 1
8 1 8
1 8 1
8 1 8
1 8 1
8 1 8
   (0&{, 1&{, 2&{ * 10 <: 0&{ + 1&{)"1 (3 ]\ 7$8 1)
8 1 0
1 8 0
8 1 0
1 8 0
8 1 0

But that is long-winded for what seems like it ought to be a simple concept. Can we do better?

Well.. one possbility involves multiplication. From this point of view, we want to multiply each row by either 1 1 1 or 1 1 0, with that last bit depending on the sum of the previous two values. That previous sum can be expressed as the inner product of 1 1 0 and the three numbers from my potential frame.

In other words, something like this:

   (3 ]\ 7$8 1) +/ .* 1 1 0
9 9 9 9 9
   10 <: (3 ]\ 7$8 1) +/ .* 1 1 0
0 0 0 0 0
   1,.1,.10 <: (3 ]\ 7$8 1) +/ .* 1 1 0
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0

   (3 ]\ 7$8 1) * 1,.1,.10 <: (3 ]\ 7$8 1) +/ .* 1 1 0
8 1 0
1 8 0
8 1 0
1 8 0
8 1 0

   (] * 1,.1,.10 <: ] +/ .* 1 1 0"_) 3 ]\ 7$8 1
8 1 0
1 8 0
8 1 0
1 8 0
8 1 0

Except, I have some bad experiences with how people react to expressions of the form: noun"_ Some people do not look at that and see the resemblance to diaeresis and realize that this is a variation on the old "each" operator from APL. Instead, they see the double quotes and wonder why I have an unterminated string constant.

I could avoid that kind of reaction be rephrasing:

   (] * 1,.1,.10 <: +/ .*&1 1 0) 3 ]\ 7$8 1
8 1 0
1 8 0
8 1 0
1 8 0
8 1 0

Here, I have curried the inner product (+/ .*) with the vector 1 1 0 (on its right). That derives a verb which only needs one argument, which will appear on the right. And, that is a slight be more concise than my previous expression so I think I will use this variation.

(People also react badly to my "unbalanced brackets". They do not see them as left and right identity functions but as some unknown construct that would satisfy rules for some other programming language. But I find myself often using them, with no better alternatives.)

   limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0
   bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

And that satisfies my current set of tests

Is that getting too long?

   #'bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21'
73

I think I can live with a 73 character long line. But if it were much longer, I would be feel I should refactor that definition. As it is, I would not reject a refactoring. But I do not feel like doing that right now.

So, here is my current set of definitions:

NB. 10: The number of frames in a complete game
NB. 10: The highest possible score with one ball
NB. 21: The number of balls in the longest possible game
NB. 3: The number of balls in the longest possible frame
NB. 2: The number of dimensions in the desired result (frames and balls)

nextFrameIndex=: ] + 1 + 10 > {.@{~
limitSpares=: ] * 1,.1,.10 <: +/ .*&1 1 0
bowlingScores=: limitSpares@({~ nextFrameIndex^:(i.10)&0)@(3 ]\ ])@{.~&21

isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]

assert 300 isScore bowlingScores 12#10
assert 190 isScore bowlingScores 21$9 1
assert 0 isScore bowlingScores 20$0
assert 90 isScore bowlingScores 20$8 1

And, I think that that I have an adequate set of definitions. I have a slight temptation to try and distinguish between the two meanings for the number 10, but that is inherent in the context of bowling -- you do not know if a bowler saying "10" means "10 pins" or "10 frames" until you get a bit more information.

Next up: I should perhaps go back and take some of these sentences and show how J evaluates them. I think I did an ok job of illustrating "limitSpares", but some of the other lines I just kind of thew out there. And, if in so doing I feel inspired to distinguish between the two meanings for "10", or if I think up better names for some of my words, ... well... it could happen. And I hope my next post will not be too tedious.

Update: I could have defined:

limitSpares=: #"1~ 1,.1,.10 <: +/ .*&1 1 0

This would give me a two column result for games that had no spares nor strikes. But I do not see a use for that complication.

Wednesday, April 13

r-nd-m: TDD bowling, and J, revised, part 2

This is a continuation of r-nd-m: TDD bowling, and J, revised, part 1

   assert 300 isScore bowlingScores 12#10

This needs an implementation:

   bowlingScores=: 3 ]\ ]

Easy enough, and that works.

   assert 190 isScore bowlingScores 21$9 1

This fails with my current definition for bowlingScores. The problem is that my result has too many rows in it. If I had a way of eliminating them, I would have the right result:

   +/@,(#~ 9={."1) bowlingScores 21$9 1
190

But I don't want to use this technique because it is too closely tied to the example score. What I really need is a way of ignoring potential frames immediately after a frame that did not begin with a 10.

validFrames=:3 :0
   valid=. 0 NB. first potential frame is always a valid frame
   while. (#y) > {: valid do.
   valid=. valid, ({:valid) + 1 + 10 ~: {. ({:valid) { y
   end.
   (i.#y) e. valid
)

That implementation seems overly complicated, but it works:

   +/, (#~ validFrames) 3 ]\ 21$9 1
190

Some issues here:

I am repeatedly referring to the last element of valid.
Each element of valid depends on all of the previous elements (and its first element is always 0).
The last element of valid is not really valid (because it's an index off the end of the array), but that does not actually matter for well-formed games, since no valid index will ever match it.
I have written this code so that it can work on a flat list of ball scores or an array of them.
If a game goes too long, this code will not ignore irrelevant trailing ball scores.

Anyways, I would like to get rid of that while loop. One possibility might involve the use of prefix (\) but that turns out to not be useful. Yes, it gives me shorter instances of the game to work on, but that does not help me with issue 2. I need an inductive approach here.

For an inductive approach, the initial value is always 0. So I need a function that, given one value will give me the next value.

In other words, I need a function that adds 1 to an index if the value at that index is 10, and which otherwise adds 2 to that index.

But, I also need my function to not have problems from indexing off the end of my scores.

For example, if my potential frames are my left argument:
   F=: ] + 1 + 10 > {.@{~
   (19 3$9 1) F^:(i.10) 0
0 2 4 6 8 10 12 14 16 18

This leaves me with my new definition:

   bowlingScores=: ({~ F^:(i.10)&0)@(3 ]\ ])

That satisfies both of my assertions so far, but lunch time is over so I need to put the rest of this off until later.

For now, here are my collected definitions and assertions:

isScore=: -:&(+/@,) , (10 3 >: $@]) , 2 -: #@$@]
F=: ] + 1 + 10 > {.@{~
bowlingScores=: ({~ F^:(i.10)&0)@(3 ]\ ])

assert 300 isScore bowlingScores 12#10
assert 190 isScore bowlingScores 21$9 1

Monday, April 4

Bowling, revisited

I was going through and tagging blog entries and it looks like this one was a latent, unpublished draft. It's probably best to ignore this post for now.... I am intending to leave this thread of thought alone, for now at least. See also: http://r-nd-m.blogspot.com/2011/04/tdd-bowling-and-j-revised-part-1.html which I hope will be a better approach.

I am currently writing for an audience of maybe four people, and quite possibly I have disappointed some of them. Where is the elegance in the code I have been writing? (Also there was a bug in the blog I posted last, I had 2 >. instead of 2 <. which was horrid.)

Anyways, let's re-invent the bowling scoring code, let's create an array which when all of its numbers have been summed gives us the total score. In other words, let's have a ten row, three column list of numbers:

   assert 10 3 -: $ framesScore 20#0

   framesScore=: 10 {. 3 ]\ ]

"Now, wait", I can imagine someone saying how do we tell whether a frame is a complete frame or an incomplete frame?

And, the answer is: this function will not tell you that. This function just gives you numbers you can total up. And overloading its meaning to attempt to represent incomplete frames seems like bad modularity. We can use a different function to determine which frames are complete frames -- that does not have to be represented in the scores themselves.

All we are doing here is finding some numbers to add up, let's not make that operation overly complicated.

   assert 0 -: +/, framesScore 20#0

Fine, so far.

   assert 30 -: +/, framesScore 19{.10 0 10

Now we need to identify where each frame starts. Borrowing from <a href="http://xprogramming.com/articles/j-working-with-henry-richs-example/">Henry</a>:

   framesStart=: 10 {. 0 {~^:a:~ _1 _1 ,~ #\ + 10&~:
   framesScore=: framesStart { 3 ]\ ,&0 0

Whoa... what happened here?

So, first off, I do not really know how test driven development is supposed to work for "enterprise" situations or really any situation where you have pre-existing code. Are you allowed to just use the code, or are their preconditions? But, also, I was pretending, in my earlier posts, that I did not understand how Henry's code worked (or even that it existed). I was wondering if it would just jump out of the process (and it didn't).

But pretending ignorance was getting difficult for me...

So let's go over this code for a bit, since I expect at least one person that might read this blog might not have a computer handy.

   19 {. 10 0 10
10 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

That was just a concise way of expressing a game -- the first ball was a strike, the second was a gutter ball, the third ball turned that frame into a spare and then we gave up so the rest of the balls implicitly had a score of zero.

   10 ~: 10 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

This is marking balls where not all of the pins were knocked down all at once.

   (#\ + 10&~:) 19 {. 10 0 10
1 3 3 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

#\ is 1 2 3 4 5... and if we think in terms of indices it's the index of the next element in the array. 0 is the index of the first element, 1 is the value there and is the index of the second element. This result 1 3 3 5 6 7.. is "the index of where the next frame starts for each ball if that ball were the start of a frame". It has some junk values in it, but we do not have to use them (we can select just the useful values).

   (_1 _1,~#\ + 10&~:) 19 {. 10 0 10
1 3 3 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 _1 _1

We will need those negative values in just a moment. But if we are thinking of indices, _1 is the index of the last value in an array.

Interlude: tennis scoring

Tracy Harms posted http://www.jsoftware.com/jwiki/TracyHarms/TennisScore, and Ron Jeffries asked about how the J community perceived it. I am not the J community, but I am a part of it, and maybe my perspectives can help express how I read J code.

So, first, here is the code as it is at the time of this writing:

firstPlayerName=: 'Alice'"_
secondPlayerName=: 'Bob'"_
ScoreMarks=: 'AB'

tellGameScoreSeries=: 3 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'
pair   =: [: tally&.> situate
tally  =: 1 1 -~ [:+/"1 [:= ]
situate=: [: }. <\

tellSituation =: tellIncomplete ` tellFinal @. isFinal
tellIncomplete=:        tellLow ` tellHigh  @. isHigh
tellLow       =:       tellBoth ` tellTie   @. isTied
tellHigh      =:  tellAdvantage ` tellDeuce @. isTied

isFinal=: isHigh  *.  1 <  [: | -
isHigh=:  3 < >.
isTied=:  =

tellFinal=: 'game ', leadingPlayerName
tellAdvantage=: 'advantage ', leadingPlayerName
tellDeuce=: 'deuce'"_
tellBoth=:  firstPlayerName, ' ', tellNumber@[ , ', ',  secondPlayerName, ' ', tellNumber@] 
tellTie=: tellNumber,' all'"_ 
tellNumber=: [: > ] { (;:' love fifteen thirty forty ')"_
leadingPlayerName=: firstPlayerName ` secondPlayerName @. <

validLength=: ( {.@: ((1+I.),#) )@: (isFinal/&>)

And, here is an example use:

tellGameScoreSeries 'AABBBAABAABBBB'
love all                 
Alice fifteen, Bob love  
Alice thirty, Bob love   
Alice thirty, Bob fifteen
thirty all               
Alice thirty, Bob forty  
forty all                
advantage Alice          
deuce                    
advantage Alice          
game Alice

So... impressions...

First off, this is a fairly big and ornate chunk of code -- displaying it takes a lot of screen real-estate, and it uses long names, and it seems involved. The complications are probably inherent in the underlying scoring algorithm, the long names are probably a good way of telling me what the code is about, but I do not have an immediate sense of the structure of the sentences.

So, looking closer:

   tellGameScoreSeries 'AABBBAABAABBBB'

We are scoring based on a list of letters using tellGameScoreSeries, which has the definition:

   3 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'

Looking at that through the lens of J's parser:

   13 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'
[: tellSituation/&> [: ({.~ validLength) [: pair [: ScoreMarks ,

We prepend ScoreMarks ('AB') to the list, then use pair (whatever that does) and then do some other stuff.

(Edit: note that I am working with a different example here, than above. I do not remember where this example came from.)

   pair 'ABABAABBBAABA'
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│0 0│1 0│1 1│2 1│3 1│3 2│3 3│3 4│4 4│5 4│5 5│6 5│
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘

Ok... those are pairs, but pairs of what?

   pair
[: tally&.> situate

   situate 'ABABAABBBAABA'
┌──┬───┬────┬─────┬──────┬───────┬────────┬─────────┬──────────┬───────────┬────────────┬─────────────┐
│AB│ABA│ABAB│ABABA│ABABAA│ABABAAB│ABABAABB│ABABAABBB│ABABAABBBA│ABABAABBBAA│ABABAABBBAAB│ABABAABBBAABA│
└──┴───┴────┴─────┴──────┴───────┴────────┴─────────┴──────────┴───────────┴────────────┴─────────────┘

Ok, I think I know what is going on here: the example showed game progress, so we build up a list of intermediate game states. This means that pair is probably showing us the scores at each step of the game -- tally probably gives us the score for one of those steps. But if we are just summing things, why not express that directly?

+/'AB'=/'ABAABBBAABA'

1 1 1 1 1 1 1 1 1 1 1

NB. huh?

'AB'=/'ABAABBBAABA'

1 0 1 1 0 0 0 1 1 0 1

0 1 0 0 1 1 1 0 0 1 0

NB. Oh! oops

+/'AB'=/~'ABAABBBAABA'

6 5

NB. but I wanted the running sum

+/\'AB'=/~'ABAABBBAABA'

1 0

1 1

2 1

3 1

3 2

3 3

3 4

4 4

5 4

5 5

6 5

There, that looks like the same information that pairs gives, though I leave off that initial 0 0. I am not sure if I need to announce that before every game the players are tied. But, if I did, it would be easy enough to include.

It's also not presented in quite the same fashion, but I prefer to avoid putting things in boxes when I do not need the boxes. I think I would have named a routine that does this "runningTally' or 'runningScore' rather than 'pairs'.

runningTally=: [: +/\ =/~

Or, maybe:

runningScores=: 0 0, [: +/\ =/~

I had two alternative names and I am going to abuse that so that I can trivially distinguish between these two possible ways of presenting the progress of the game.

So.. I am already rewriting the code and I have barely gotten started. I do not know if that is a comment on my personality or a comment on the code.

Next up, we have validLength:

validLength

{.@:((1 + I.) , #)@:(isFinal/&>)

isFinal

isHigh *. 1 < [: | -

isHigh

3 < >.

So... isFinal goes between a score pair, and then we pick either the length of the argument or the length which includes the first final score... And a score is final if it's greater than 3 and also exceeds the other score by more than 1. Again, I am tempted to rewrite it... if nothing else, the results from my runningTally are not in quite the same format that validLength expects. Also, this code is fundamentally scalar code -- it's array of structures rather than structure of arrays, and J works much better for me when I think in terms of structures of arrays.

The high score test could probably be any of (untested code):

3 < >./"1 tallies

3 < >./|: tallies

>./"1 3:< tallies NB. ugly and confusing, let's ignore the "1 3: thing

>./|:3 < tallies

*./|:3 < tallies

The score separation test could be any of (again, untested code):

1 < | -/"1 tallies

1 < | -/|: tallies

or... are there other good variations I should consider?

Anyways, I am inclined to use transpose (rather than modify the rank of two reductions). That leaves me with:

(something *. otherthing) |: tallies

Perhaps:

((3 < >./) *. 1 <&| -/)@|: tallies

Let's try that:

((3 < >./) *. 1 <&| -/)@|: 'AB' runningTally 'ABAABBBAABA'

0 0 0 0 0 0 0 0 0 0 0

Nope.. see? that's what I get for trying to do all of this in my head, without using any experiments.

No, wait... maybe this was an acceptable result? After all, the original use had a specific allowance for the thing being all zeros. Well... I do not know the rules of tennis so I am not in any position to judge. But I think this thing is defective, because this logic would not complain if I added another 'B' to the end of the game. But maybe my result is not what the original code would do? (Edit: I was getting confused here. For some reason, I was thinking that the example I was working with, here, was the original example of a complete game -- but it was not, and my cascade of confusion led to me to drift off (and stop this blog entry).)

isFinal/"1 'AB' runningScores 'ABAABBBAABA'

0 0 0 0 0 0 0 0 0 0 0 0

isFinal/"1 'AB' runningScores 'ABAABBBAABAB'

0 0 0 0 0 0 0 0 0 0 0 0 0

Hmm... so ok, let's come back to this later. I will note, however, that the desired result, here would be i.1 of that bit vector. There is no need to find all of the final scores, and i.1 will automatically give us the length if there are no 1s in the list.

At this point, I really need to understand the requirements, I think, and come at the problem afresh. But a quick look at the code:

firstPlayerName=: 'Alice'"_
secondPlayerName=: 'Bob'"_
ScoreMarks=: 'AB'

why are we using constant verbs and not nouns here? Looking at the code, I see one place where the syntax requires a verb:

leadingPlayerName=: firstPlayerName ` secondPlayerName @. <

but that could be rephrased to use { instead of @.

So... impressions...

My impression is that this code was developed using blinders -- the code was designed with a structure in mind (build up verbs that do some part of the phrasing) and that mold was given fairly high priority. That is not how I work, though, and I am not particularly comfortable with the result.

Which, I suppose, explains my strong urges to rewrite the thing.

Also, the validation part needs a bit of thought, but I do not have enough time, right now, to be researching how tennis games work.

Or does it? When I look at http://en.wikipedia.org/wiki/Tennis_score, the impression I get was that the example game ended early (with a score of 6-5, the game did not have to be over), and there is nothing wrong with the validator.

Friday, June 24

Tuesday, June 21

Tuesday, May 24

Tuesday, May 3

Monday, April 18

Thursday, April 14

Wednesday, April 13

Monday, April 4

Followers