Monday, April 4

Interlude: tennis scoring

Tracy Harms posted, and Ron Jeffries asked about how the J community perceived it.  I am not the J community, but I am a part of it, and maybe my perspectives can help express how I read J code.

So, first, here is the code as it is at the time of this writing:

firstPlayerName=: 'Alice'"_
secondPlayerName=: 'Bob'"_
ScoreMarks=: 'AB'

tellGameScoreSeries=: 3 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'
pair   =: [: tally&.> situate
tally  =: 1 1 -~ [:+/"1 [:= ]
situate=: [: }. <\

tellSituation =: tellIncomplete ` tellFinal @. isFinal
tellIncomplete=:        tellLow ` tellHigh  @. isHigh
tellLow       =:       tellBoth ` tellTie   @. isTied
tellHigh      =:  tellAdvantage ` tellDeuce @. isTied

isFinal=: isHigh  *.  1 <  [: | -
isHigh=:  3 < >.
isTied=:  =

tellFinal=: 'game ', leadingPlayerName
tellAdvantage=: 'advantage ', leadingPlayerName
tellDeuce=: 'deuce'"_
tellBoth=:  firstPlayerName, ' ', tellNumber@[ , ', ',  secondPlayerName, ' ', tellNumber@] 
tellTie=: tellNumber,' all'"_ 
tellNumber=: [: > ] { (;:' love fifteen thirty forty ')"_
leadingPlayerName=: firstPlayerName ` secondPlayerName @. <

validLength=: ( {.@: ((1+I.),#) )@: (isFinal/&>)

And, here is an example use:

tellGameScoreSeries 'AABBBAABAABBBB'
love all                 
Alice fifteen, Bob love  
Alice thirty, Bob love   
Alice thirty, Bob fifteen
thirty all               
Alice thirty, Bob forty  
forty all                
advantage Alice          
advantage Alice          
game Alice 

So... impressions...

First off, this is a fairly big and ornate chunk of code -- displaying it takes a lot of screen real-estate, and it uses long names, and it seems involved.  The complications are probably inherent in the underlying scoring algorithm, the long names are probably a good way of telling me what the code is about, but I do not have an immediate sense of the structure of the sentences.

So, looking closer:

   tellGameScoreSeries 'AABBBAABAABBBB'

We are scoring based on a list of letters using tellGameScoreSeries, which has the definition:

   3 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'

Looking at that through the lens of J's parser:

   13 :'tellSituation/&> ({.~ validLength) pair ScoreMarks, y'
[: tellSituation/&> [: ({.~ validLength) [: pair [: ScoreMarks ,

We prepend ScoreMarks ('AB') to the list, then use pair (whatever that does) and then do some other stuff.

(Edit: note that I am working with a different example here, than above.  I do not remember where this example came from.)

│0 0│1 0│1 1│2 1│3 1│3 2│3 3│3 4│4 4│5 4│5 5│6 5│

Ok... those are pairs, but pairs of what?

[: tally&.> situate

   situate 'ABABAABBBAABA'

Ok, I think I know what is going on here: the example showed game progress, so we build up a list of intermediate game states.  This means that pair is probably showing us the scores at each step of the game -- tally probably gives us the score for one of those steps.  But if we are just summing things, why not express that directly?

1 1 1 1 1 1 1 1 1 1 1
   NB. huh?
1 0 1 1 0 0 0 1 1 0 1
0 1 0 0 1 1 1 0 0 1 0
   NB. Oh! oops
6 5
  NB. but I wanted the running sum
1 0
1 1
2 1
3 1
3 2
3 3
3 4
4 4
5 4
5 5
6 5

There, that looks like the same information that pairs gives, though I leave off that initial 0 0.  I am not sure if I need to announce that before every game the players are tied.  But, if I did, it would be easy enough to include.

It's also not presented in quite the same fashion, but I prefer to avoid putting things in boxes when I do not need the boxes.  I think I would have named a routine that does this "runningTally' or 'runningScore' rather than 'pairs'.

   runningTally=: [: +/\ =/~

Or, maybe:

   runningScores=: 0 0, [: +/\ =/~

I had two alternative names and I am going to abuse that so that I can trivially distinguish between these two possible ways of presenting the progress of the game.

So.. I am already rewriting the code and I have barely gotten started.  I do not know if that is a comment on my personality or a comment on the code.  

Next up, we have validLength:

{.@:((1 + I.) , #)@:(isFinal/&>)
isHigh *. 1 < [: | -
3 < >.

So... isFinal goes between a score pair, and then we pick either the length of the argument or the length which includes the first final score...  And a score is final if it's greater than 3 and also exceeds the other score by more than 1.  Again, I am tempted to rewrite it... if nothing else, the results from my runningTally are not in quite the same format that validLength expects.  Also, this code is fundamentally scalar code -- it's array of structures rather than structure of arrays, and J works much better for me when I think in terms of structures of arrays.

The high score test could probably be any of (untested code):
   3 < >./"1 tallies
   3 < >./|: tallies
   >./"1 3:< tallies  NB. ugly and confusing, let's ignore the "1 3: thing
   >./|:3 < tallies
   *./|:3 < tallies

The score separation test could be any of (again, untested code):
   1 < | -/"1 tallies
   1 < | -/|: tallies
or... are there other good variations I should consider?  

Anyways, I am inclined to use transpose (rather than modify the rank of two reductions).  That leaves me with:

   (something *. otherthing) |: tallies

   ((3 < >./)  *. 1 <&| -/)@|: tallies

Let's try that:
   ((3 < >./)  *. 1 <&| -/)@|: 'AB' runningTally 'ABAABBBAABA'
0 0 0 0 0 0 0 0 0 0 0

Nope.. see? that's what I get for trying to do all of this in my head, without using any experiments.

No, wait... maybe this was an acceptable result?  After all, the original use had a specific allowance for the thing being all zeros.  Well... I do not know the rules of tennis so I am not in any position to judge.  But I think this thing is defective, because this logic would not complain if I added another 'B' to the end of the game.  But maybe my result is not what the original code would do?  (Edit: I was getting confused here.  For some reason, I was thinking that the example I was working with, here, was the original example of a complete game -- but it was not, and my cascade of confusion led to me to drift off (and stop this blog entry).)

   isFinal/"1 'AB' runningScores 'ABAABBBAABA'
0 0 0 0 0 0 0 0 0 0 0 0

   isFinal/"1 'AB' runningScores 'ABAABBBAABAB'
0 0 0 0 0 0 0 0 0 0 0 0 0

Hmm... so ok, let's come back to this later.  I will note, however, that the desired result, here would be i.1 of that bit vector.  There is no need to find all of the final scores, and i.1 will automatically give us the length if there are no 1s in the list.

At this point, I really need to understand the requirements, I think, and come at the problem afresh.  But a quick look at the code:

firstPlayerName=: 'Alice'"_
secondPlayerName=: 'Bob'"_
ScoreMarks=: 'AB'

why are we using constant verbs and not nouns here?  Looking at the code, I see one place where the syntax requires a verb:

leadingPlayerName=: firstPlayerName ` secondPlayerName @. <

but that could be rephrased to use { instead of @.

So... impressions...

My impression is that this code was developed using blinders -- the code was designed with a structure in mind (build up verbs that do some part of the phrasing) and that mold was given fairly high priority.  That is not how I work, though, and I am not particularly comfortable with the result.

Which, I suppose, explains my strong urges to rewrite the thing.

Also, the validation part needs a bit of thought, but I do not have enough time, right now, to be researching how tennis games work.

Or does it?  When I look at, the impression I get was that the example game ended early (with a score of 6-5, the game did not have to be over), and there is nothing wrong with the validator.


  1. Thanks for the reactions, Raul.

    I found it amusing to see you make the only explicit verb tacit, because the only reason I made it explicit was to make its long chain of composition easier on the reader. Clearly you confounded my expectation of reader preference!

    Shortly after starting a solution that relied on boxes, I rewrote the paired part of it as a length-by-two integer matrix. But when I came back to finish, I carried on with the boxed version instead, and I didn't bother to reconsider it as I came to a completed calculation. I attribute this to laziness on my part.

    I do like [: +/\ =/~ much better than my (boxing) code that creates pairs.

    You seem to have read the name of my verb as "pairs" but it isn't. True, it creates pairs. But the name I chose is the verb, "pair".

    The verb validLength seems to have distracted you from the main gist of the program. That seems like another strike against preening input within a program. I wish I'd ignored that expectation in the "requirements."

    The claim that my version is "fundamentally scalar code" is not pleasant to my years, but I think I can see the difference you're talking about, and I agree that there are advantages to leaning toward structures of arrays.

    I set the first three values as simple definitions to free myself of the task of writing something that would prepare such data as parameters to a verb. That aspect didn't happen to interest me at this time. I made the names into constant verbs as an easy way to allow them to differ in length.

    I'm glad to say it is getting harder for me to tell when the differences between our programs reflects differences of skill, and when they're differences in taste or style. It seems to me that you envision a wider range of expressions more readily than do I, and can select among them more casually. The way I relied on Self Classify to produce running totals is a great indicator of this. As always, I benefit from seeing how you accomplish things.

  2. Thank you for dropping by... (woot! my first comment! ahem...)

    I think I would modify one thing in your comments -- I stopped after dealing with the validator because I was confused, and because I was lazy, but also because it seems to be validating a different set of rules than what you have implemented.

    As near as I can tell, there are two sets of rules for tennis. In both sets, the game cannot end until one person scores four points. In one set of rules, one person has to have a two person lead over the other, but in the other form they do not have to have that lead.

    And, in the example, I can see that you are using the short form -- the game ended with a "score" of 4-3. (Except that's not how the rules say we label the game.)

    But the validator, as near as I can tell, does not impose a game end until someone has a two person lead.

    So, when I saw that conflict (and this was before I had even looked up tennis rules), I did not know what was going on, and I decided I needed to go off and look up the rules before I proceeded.

    Anyways, I think that is not so much a comment on the validator itself, but on how I approached this problem.

    Though I am wondering if I should go back and "finish up" the critic (probably on a new blog entry) or if I should just leave it like this?

  3. I don't follow why you think my example game ends with a score of 4 3. It ends with a score of 6 4. The raw input of the example is 'AABBBAABAABBBB' (The last four characters are spurious or invalid. I used validLength to filter them out.)

  4. That's a good question... the short form is that I was confused (and wrong).

    Somehow I had started using the sequence 'ABAABBBAABA' and for some reason I was thinking that that was your example sequence. Needless to say, ... I was wrong.

  5. Thank you for pointing out that i.&1 can replace ((1+I.),#)

  6. I think that concept needs work also -- it's really # <. 1+i.&1 that you need, and I am not sure that that's simpler.