Monday, August 6

E.T. Jaynes and Boolean algebra

So... https://twitter.com/psnively/status/232555921115602945  ... @psnively had recommended I read E.T. Jaynes' Probability Theory: The Logic of Science.

And, ok, so far I have not had much time to spend on that particular book.  So I'm still on the first chapter.

Anyways I particularly liked this quote that he had used:

"Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies"  (attributed to S. Banach quoted by S. Ulam 1957 ('Marian Smoluchowski and the theory of probabilities in physics', Am. J. Phys. 25, 475-481)

This matches my intuition about what makes great mathematicians great -- they see potential relationships long before they have explored them and expressed them rigorously.

On the other hand, I was almost disturbed by one of his sentences in this statement:

1.5 Boolean algebra
To state these ideas more formally, we introduce some notation of the usual symbolic logic, or Boolean algebra, so called because George Boole (1854) introduced a notation similar to the following.  Of course, the principles of deductive logic itself were well understood centuries before Boole, and, as we shall see, all the results that follow from Boolean algebra were contained already as special cases in the rules of plausible inference given by Laplace (1812).  The symbol
AB,
called the logical product or the conjunction, denotes the proposition 'both A and B are true'.  Obviously the order in which we state them does not matter, AB and BA say the same thing.  The expression
A + B
called the logical sum or disjunction, stands for 'at least one of the propositions, A, B is true' and has the same meaning as B + A.  These symbols are only a shorthand way of writing propositions, and do not stand for numerical values.
The sentence which bothers me is that last sentence that I quoted, and it bothers me because of the first sentence that I quoted.

Here's the thing:  in George Boole's work, these statements represent numerical results.  George Boole started out by expressing his concepts in terms of axioms, but those axioms have a well understood correspondence to numerical operations.  Boolean multiplication is the operation we know as "Least Common Multiple" and Boolean addition is the operation we know as "Greatest Common Divisor".

So here's a times table for boolean multiplication:


┌─┬───────────────────────────┐
│ │0 1  2  3  4  5  6  7  8  9│
├─┼───────────────────────────┤
│0│0 0  0  0  0  0  0  0  0  0│
│1│0 1  2  3  4  5  6  7  8  9│
│2│0 2  2  6  4 10  6 14  8 18│
│3│0 3  6  3 12 15  6 21 24  9│
│4│0 4  4 12  4 20 12 28  8 36│
│5│0 5 10 15 20  5 30 35 40 45│
│6│0 6  6  6 12 30  6 42 24 18│
│7│0 7 14 21 28 35 42  7 56 63│
│8│0 8  8 24  8 40 24 56  8 72│
│9│0 9 18  9 36 45 18 63 72  9│
└─┴───────────────────────────┘

Notice how the upper left hand corner, which has 0 and 1, would be a truth table for the "and" operation, if 1 is true and 0 is false.  Of course, that's not very exciting, because you get the same upper left hand corner from a regular multiplication table.  The problem, though, is that with regular addition, 1 plus 1 is 2.  But boolean addition (GCD) looks like this:

┌─┬───────────────────┐
│ │0 1 2 3 4 5 6 7 8 9│
├─┼───────────────────┤
│0│0 1 2 3 4 5 6 7 8 9│
│1│1 1 1 1 1 1 1 1 1 1│
│2│2 1 2 1 2 1 2 1 2 1│
│3│3 1 1 3 1 1 3 1 1 3│
│4│4 1 2 1 4 1 2 1 4 1│
│5│5 1 1 1 1 5 1 1 1 1│
│6│6 1 2 3 2 1 6 1 2 3│
│7│7 1 1 1 1 1 1 7 1 1│
│8│8 1 2 1 4 1 2 1 8 1│
│9│9 1 1 3 1 1 3 1 1 9│
└─┴───────────────────┘

Here, the upper left hand corner looks like an OR truth table.

So, yes, George Boole's axioms give us a system where we can use the algebraic notation to represent logical operations.  But it's not that the values involved are not numeric -- it's that the operations involved are not the usual multiply and add.
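
Incidentally, J (the language I use elsewhere in this blog) takes exactly this view: its dyadic *. is LCM and its dyadic +. is GCD, and on the values 0 and 1 these double as AND and OR.  So both of the tables above are one-liners:

   *./~ i.10   NB. the "boolean multiplication" (LCM) table above
   +./~ i.10   NB. the "boolean addition" (GCD) table above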

Of course... I can imagine where Jaynes is going with this.  Instead of using boolean multiplication we could use regular multiplication to get our truth table.  Probabilities, after all, are expressed as numbers in the range 0..1, and we can multiply probabilities to express the probability of two independent events in terms of the probability of each of them.

And, we can express "OR" similarly -- first we find the chance that each of our events does not happen (1 minus its probability), then we multiply those to find the chance that neither event happens, then we subtract that result from one to find the probability that at least one event happened.  So, possibly he was saying that boolean algebra does not represent numbers to avoid confusion between the notation he is going to introduce and the notation he is using here.

Personally, though, I would rather avoid such non-factual statements, even for emphasis.  (Also, it's likely he is saying this because someone else told it to him.)

Anyways, just for the fun of it, here's what boolean addition (using the Euclidean algorithm to find the greatest common divisor) gives me for some fractions in the range of 0..1:

┌───┬───────────────────────────────────────────┐
│   │0   0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1  │
├───┼───────────────────────────────────────────┤
│0  │0   0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1  │
│0.1│0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1│
│0.2│0.2 0.1 0.2 0.1 0.2 0.1 0.2 0.1 0.2 0.1 0.2│
│0.3│0.3 0.1 0.1 0.3 0.1 0.1 0.3 0.1 0.1 0.3 0.1│
│0.4│0.4 0.1 0.2 0.1 0.4 0.1 0.2 0.1 0.4 0.1 0.2│
│0.5│0.5 0.1 0.1 0.1 0.1 0.5 0.1 0.1 0.1 0.1 0.5│
│0.6│0.6 0.1 0.2 0.3 0.2 0.1 0.6 0.1 0.2 0.3 0.2│
│0.7│0.7 0.1 0.1 0.1 0.1 0.1 0.1 0.7 0.1 0.1 0.1│
│0.8│0.8 0.1 0.2 0.1 0.4 0.1 0.2 0.1 0.8 0.1 0.2│
│0.9│0.9 0.1 0.1 0.3 0.1 0.1 0.3 0.1 0.1 0.9 0.1│
│1  │1   0.1 0.2 0.1 0.2 0.5 0.2 0.1 0.2 0.1 1  │
└───┴───────────────────────────────────────────┘

And, here's the corresponding probabilities:

┌───┬──────────────────────────────────────────────────┐
│   │0   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1│
├───┼──────────────────────────────────────────────────┤
│0  │0   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1│
│0.1│0.1 0.19 0.28 0.37 0.46 0.55 0.64 0.73 0.82 0.91 1│
│0.2│0.2 0.28 0.36 0.44 0.52 0.6  0.68 0.76 0.84 0.92 1│
│0.3│0.3 0.37 0.44 0.51 0.58 0.65 0.72 0.79 0.86 0.93 1│
│0.4│0.4 0.46 0.52 0.58 0.64 0.7  0.76 0.82 0.88 0.94 1│
│0.5│0.5 0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95 1│
│0.6│0.6 0.64 0.68 0.72 0.76 0.8  0.84 0.88 0.92 0.96 1│
│0.7│0.7 0.73 0.76 0.79 0.82 0.85 0.88 0.91 0.94 0.97 1│
│0.8│0.8 0.82 0.84 0.86 0.88 0.9  0.92 0.94 0.96 0.98 1│
│0.9│0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1│
│1  │1   1    1    1    1    1    1    1    1    1    1│
└───┴──────────────────────────────────────────────────┘


Both of those tables are tables of numbers, I'm just using different operations here.
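
For what it's worth, both of those tables are one-liners in J, since dyadic +. is GCD (which extends tolerantly to these fractions):

   ps =: 0.1 * i.11   NB. 0 0.1 0.2 ... 1
   +./~ ps            NB. the "boolean addition" (GCD) table
   1 - */~ 1 - ps     NB. the probabilistic "or" table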

Of course...

Another related issue is that "Boolean Algebra" has been redefined to mean the boolean operations limited to the values 0 and 1 (because it includes a "logical negation" operation which is defined to only operate on the values 0 and 1).  This redefinition (which apparently happened a few decades ago) pretty much ignores most of Boole's work and instead focuses on some derivations which, I think, should be attributed to Claude Shannon. (Edit: no, probably Henry Sheffer -- see http://r-nd-m.blogspot.com/2014/04/the-boolean-disease.html where I went over these same issues again.)

(No references here, tonight -- I'm working from memory -- but this matches everything I know about the subject.  However, the statements I have seen claiming that boolean values are not numeric have never been accompanied by any reference to Boole's work.  And, given the obvious numeric interpretation (the GCD/LCM thing), I am pretty sure that that's just an "urban myth".  I just hate to see it repeated so widely, in the works of mathematicians who really should have looked it up for themselves.)

Update: Paul Snively wrote a followup: http://psnively.github.com/blog/2012/08/08/Jaynes_and_Boolean_Algebra/

And, while mostly I agree with the thought he's expressed there I do have a quibble with his followup:
In addition, though, it makes no sense to ascribe numerical values to propositions!  Consider:
A = Christopher Columbus discovered America in 1492.
B = It will rain today.
AB
My problem here is that the proposition itself makes no sense.

I have also been wondering if I could come up with a better phrasing for Jaynes's sentence "These symbols are only a shorthand way of writing propositions, and do not stand for numerical values." -- personally, I think I would just drop the last seven words from that sentence.

Wednesday, May 16

So... yesterday, I was dwelling in depth on types that "cannot be implemented".

Except, that's not a completely fair statement -- you can implement a finite part of an infinite type.  Some operations might fail, but maybe you do not need those operations.

There are several approaches here:

You can fall into an infinite loop when you encounter these situations.

You can introduce runtime checks -- runtime errors.  These are no more logically valid than an infinite loop, but have practical advantages.

You can limit yourself to a formally defined finite subset.  This can be satisfying, mathematically, but there will be operations which behave differently in this finite system than they would behave if they were members of the infinite system.

This all matters because traditionally numbers are infinite -- if I pick a number, you can add one to it.

So the above three approaches look like this:

1. Sometimes when you add two positive numbers, the system just hangs.

2. Sometimes when you add two positive numbers, you get some kind of error.

3. Sometimes when you add two positive numbers, the result is less than the peano sum of those two numbers.

Which of these is "best" can vary, depending on your own purpose and ideals -- on your own concept of "best".
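
To make that concrete, here's a minimal J sketch of approach 2, with a hypothetical bound of 1000:

limit=: 1000   NB. hypothetical bound, for this sketch
next=: 3 :0
  assert. y < limit   NB. approach 2: a runtime error at the bound
  1 + y
)

(Approach 1 would loop instead of asserting, and approach 3 might compute something like  limit | 1 + y,  wrapping around at the bound.)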

* * * * *

Note also that the above options are valid in both "statically typed languages" and "dynamically typed languages".  So what is the difference between these two?

The distinction between these two kinds of languages has to do with how they treat syntax, and with how people think about problems.

The syntactic distinction is that statically typed languages place an emphasis on "user defined syntax": when users define "types" (when they describe how they will represent data), they are also introducing new syntax that is valid for dealing with that data.

In a dynamically typed language, the user defined syntax issue still exists in some vestigial form (undefined names might be syntactically invalid), but greater emphasis is placed on performing checks when the program runs.

Wait, what?

Let's try again... in a "statically typed language" we break up evaluation of a program into multiple stages.  The last stage is the familiar "run the program" stage.  An earlier stage is typically a "compilation" stage (or often a sequence of compilation stages).  Runtime errors in the earlier stages are the static type checks.

Nowadays, with computing relatively cheap compared to the days when people came up with terms like "compiler", this can get blurry.  You can have multiple execution stages (compilation and running) happen nearly instantaneously.  You can have editing environments (emacs, and other IDEs) that perform some execution for you.

And, you can use a "proof assistant" in the early stages of execution.

And, do not get me wrong, proof assistants can be fabulously helpful.

But they can never be a substitute for understanding what you are talking about.

* * * * *

So... let's consider one setting where people work on understanding what they are talking about: the classroom.

The classroom is not a perfect learning situation.  Just as it's easy to be glib in the context of computer programming, it's easy to be glib in the classroom.  And, the professor can be glib sometimes.  Hopefully not, but no one is perfect.

All professors are going to have limits on what they understand.  And, how they deal with that issue is a reflection on their quality as a professor.

One *good* practice on the part of a professor, is to refer the student to some appropriate resource when the student is venturing into territory where the professor is not prepared to present the information right then.  "You might want to read ____."

But this is a declaration, also, that the professor is not prepared to talk about that subject.  Maybe the issue is lack of time.  Maybe it's a lack of study.  Maybe the professor is rusty on the subject.

... anyways, referring someone to a book in place of explaining one's point does not mean that the referrer has an adequate understanding of the subject.

* * * * *

(Addendum: too small for a new post, but added after the above text)

Anyways... in some ways the kind of concept I have been labeling as a "Glib Type" is related to an "Abstract Class".  The difference, though, is that implementations of an Abstract Class are full members of the abstract class.

With a glib type, you are going to have problem cases.  For the peano numbers, for example, there is going to be a number which does not have a successor.  (And, if you build out operations like addition, multiplication, and exponentiation, it's going to be easy to achieve problem cases, because those operations involve generating long sequences of successors.)

Tuesday, May 15

Glib Types

I am currently mildly upset because of how a task is being treated at rosettacode.org.

For people not familiar with the site, rosettacode was developed on the idea of "let's take a look at implementing the same task in various programming languages".  This has a necessary corollary, which is that a task is not allowed to forbid programming languages.

A recent task though, labeled "Proof" has gathered a collection of people that seem bound and determined to circumvent this concept.

The task starts out innocently enough -- it asks that a type be defined representing natural numbers, and that a derivative type be defined which represents the even subset of those natural numbers, and then that we prove that when we add two even numbers we get an even number.  Or its original form was something like that.  Since this is a fairly trivial problem, I submitted a fairly trivial proof which illustrated some of the same points which were illustrated by the other implementations.

And then, after a long time, someone objected to my proof.  And this resulted in a long discussion, where we danced around the real issues.  Eventually, the task was changed, explicitly requiring a type based on the peano axioms and asking that associativity be proved.

So here's a problem:  the peano axioms require an infinity of distinct values, and computers can only represent a finite number of distinct values.  And, they explicitly required that we implement this as a concrete type.  Finiteness means that there will be some value which the computer can represent whose next larger value cannot be represented.

So I added to the task a statement that the existence of this limit was expected.

And my change was almost immediately reverted, without comment.

So I reverted that reversion and asked that the change be justified.

And it was reverted, again, without justification.  And, it's not like anyone feels that computers or computer implementations being finite is in any way controversial.

Instead, the person doing this seems to be confused about the distinction between "a symbol which represents infinity" and "a concrete type which must represent an infinite set of distinct values".  But that's just a guess, since the behavior was not justified.

So... I have decided that I am not going to deal with the site until this unjustified behavior ceases.

Still, it's an interesting problem.  So I am going to post my work here, where I do not have to worry about people reverting my changes.

I plan on doing this in a few stages:  First, a casual discussion of the peano axioms, and a proof about the associativity of addition where addition is built on these axioms.  Then I want to go back and think about re-implementing the proofs in an automated fashion, touching on some of the issues that this raises.

* * * * *

First, here are the constraints which our peano axioms need to fit, for this task:


  1. Define a countably infinite set of natural numbers {0, 1, 2, 3, ...}.
  2. Define addition on those numbers.
  3. Define a countably infinite set of even natural numbers {0, 2, 4, 6, ...} within the previously defined set of natural numbers.
  4. Prove that the addition of any two even numbers is even.
  5. Prove that the addition is always associative.


(The peano axioms could also be built starting from the number 1 -- this changes some details in the proofs.)

So here's a set of peano axioms that would work here, borrowed from the current instance at the math wiki:


  1. There exists a number zero (0).
  2. For each n ∈ ℕ, there exists a natural number that is the successor of n, denoted by n'.
  3. Zero is not the successor of any natural number. (For each n ∈ ℕ, n' ≠ 0.)
  4. For all natural numbers n, m ∈ ℕ, if n' = m', then n = m.
  5. Given any predicate P on the natural numbers, if P(0) is true and P(k) implies P(k') for any k ∈ ℕ, then P(n) is true for all n ∈ ℕ. (This is also known as the principle of mathematical induction.)


Note that there's a slight difference here -- the math wiki defines natural numbers as not including 0 where the rosettacode task defines natural numbers as including 0.  So, for the purpose of the rosettacode task we define natural numbers as the set which includes the math wiki natural numbers and 0.

So... proof.

First off, this set of axioms is a classic mathematical problem.  I think most mathematicians get exposed to it sooner or later.  So showing that a system is equivalent to the peano axioms is enough to show that there is an operation like addition which is associative.  And, multiplication is associative.  And... mathematics is generally built on a concept rather like puns, where a set of principles developed for one line of thought can be applied to another line of thought if the underlying axioms can be shown to apply.

But there are limits to mathematical puns.  For example, octonions contain the set of natural numbers but in the general case multiplication of octonions is not associative.  Multiplication involving only octonions which are natural numbers is still associative, but this kind of issue means we need to proceed with caution.

So, back to the peano axioms -- how can we prove that addition is associative if all we have is the peano axioms?

First, we are going to have to define addition.  Here's one implementation, using J.  (Originally I had used the name 'successor' where I am using the name 'next', but that resulted in some "line too long" problems, so I changed it.)

zero=: 0

next=:3 :0  NB. successor
   1 + y
)

equal=: -:  NB. match

predecessor=:3 :0
  NB. monadic case: count up from zero
  zero predecessor y
:
  NB. dyadic case: x counts up until its successor matches y
  if. y equal  next x do.
    x
  else.
    (next x) predecessor y
  end.
)

addition=:3 :0
:
  NB. peel x down toward zero while building y up
  if. zero equal x do.
    y
  else.
    (predecessor x) addition  next  y
  end.
)

Example use:



   predecessor   next  next  next zero
2
   (next  next  next zero) addition (next  next  next   next  zero)
7

This is a rather tedious approach -- this implementation of addition is going to be extremely slow -- but note also that we can replace the initial three definitions with something entirely different and addition will still work.

For example:

zero=: ''

next=: , ? bind 9

equal=: 1:@,. :: 0:


   predecessor   next  next  next zero
2 1
   (next  next  next zero) addition (next  next  next  next  zero)
2 4 3 2 7 3 3

Here, we are constructing sequences of arbitrary digits and our "peano numbers" are the lengths of these sequences.  Equality is thus an equality of length rather than a literal equality.  (This is generally the case with numbers and equality -- if we are counting sheep we do not mean that the sheep are identical, instead we mean that we do not care about the differences between the things we are counting, except that they somehow represent distinct individuals.)

So... proof... how do we prove that addition is associative?  In other words, if A, B and C are valid peano numbers, how do we prove that these two sentences are peano equal?

   A addition (B addition C)
   (A addition B) addition C

First, we can consider the case where A is zero.  If A is zero, then these are equal:

   A addition (B addition C)
   zero addition (B addition C)    NB. substituting for A
   (B addition C)                         NB. if. part of the definition of addition
   B addition C                           NB. definition of parenthesis

Similarly, if A is zero, then these are equal:

   (A addition B) addition C
   (zero addition B) addition C    NB. substituting for A
   (B) addition C                         NB. if. part of the definition of addition
   B addition C                            NB. definition of parenthesis

And we already know that B addition C is peano equal to B addition C

For our next step, we rely on the last clause of the peano axioms.  We start by saying that A is a value where our two sentences are equal (A addition (B addition C) and (A addition B) addition C).  If this allows us to prove that they are also equal when we replace A by  next A then our proof is complete.

So we start this stage with the hypothesis that these are equal:
   A addition (B addition C)
   (A addition B) addition C

So:
   ((next A) addition B) addition C
   (A addition (next B)) addition C  NB. else. part of the definition of addition
   (next (A addition B)) addition C  NB. by induction **
   (A addition B) addition (next C)  NB. else. part of the definition of addition
   (next ((A addition B) addition C)) NB. by induction **
   (next (A addition (B addition C))) NB. hypothesis
   (A addition (next (B addition C))) NB. by induction **
   ((next A) addition (B addition C)) NB. else. part of the definition of addition


These are all equal so the first statement in this chain must be equal to the last statement in this chain.

The transitions marked ** all need induction; each is an instance of the lemma that X addition (next Y) equals (next (X addition Y)).  Here is the lemma in the form it was used above:


   (A addition B) addition (next C)
   (next ((A addition B) addition C)) NB. by induction **


Here's a quick exposition of that induction:

If (A addition B) is zero then it breaks down like this:

   (A addition B) addition (next C)
   zero addition (next C)  NB. substituting for (A addition B)
   (next C)  NB. if. part of the definition of addition

And, because (A addition B) is zero, we can replace C by (A addition B) addition C and it means the same thing, which leaves us with:

   (next ((A addition B) addition C))

If we then assume that we have a case where (A addition B) addition (next C) equals (next ((A addition B)  addition C)) we just need to show that it also holds for the  next of (A addition B):


   (next (A addition B)) addition (next C)
   (A addition B) addition (next (next C)) NB. else part of definition of addition

by our induction, we get:

  (next ((A addition B) addition (next C))) NB. induction hypothesis, with (next C) in place of C

And, since (next (A addition B)) addition C is the same as (A addition B) addition (next C) (the else. part of the definition again), this last line is exactly (next ((next (A addition B)) addition C)) -- which is what we needed to show.


Easy, right?

Though some people might prefer I deal with the parentheses differently, or that I replace phrases with named values.  That's a quibble, though, and you can do that kind of manipulation for yourself if you are interested in it.

One note here is that this is a rather tedious sequence of manipulations.  But that's not the most important point.

I think the most important point here is the rule of induction.  When we can find a case which is equivalent to the inductive rule in the peano axioms, we have found a case which we can characterize by counting.  In other words: we can manipulate numbers to deal with those kinds of problems.

Many people, I think, take the wrong lesson here -- instead of seeing how useful numbers can be, they instead focus on the concept of recursion.  But in most practical cases, numbers are good enough.  So why make life complicated and tedious?  (There are cases -- like Ackermann's function -- where you need something more, but I do not think anyone has ever found a practical use for Ackermann's function.)

Anyways, from a practical programmer's point of view: numbers are handy things.  (And sequences of numbers are handy things -- but that's getting too far afield for this discussion.)

So... how do we convert this proof to a mechanical proof about types?

First, we need a definition for "type" in J.  A type is the set of results which can be yielded by a parenthesized J expression.

We also need a way of expressing the above definitions (especially the definition of addition) in a fashion that lets us distinguish the if part and the else part in a mechanical fashion.  It may also be convenient to represent a part of an addition.  In other words, instead of 1 + 2, we might want to express this in parts like (1 + ]) 2

And, we need a way of tying these things together.


So... back to J... here's some thoughts on how I might construct a type checked program in J:

First, typing itself:  If I can make the complete type declaration a part of the name of the implementation of a routine then I can limit access to the routine by type in the same way that I limit access by name.
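
Here's a toy sketch of that idea (all names here are hypothetical; 3!:0 reports a noun's type code, and @. selects a verb from a gerund):

plusBB=: 4 : 'x + y'   NB. boolean/boolean implementation
plusII=: 4 : 'x + y'   NB. integer/integer implementation
plusDD=: 4 : 'x + y'   NB. double/double implementation
sig=: 4 : '1 4 8 i. (3!:0 x) >. 3!:0 y'   NB. promote to the wider argument type
typedplus=: plusBB`plusII`plusDD @. sig

This does not literally bury the signature in a looked-up name, but it has the same effect: which implementation you reach is decided by the argument types.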


We also need a mechanical way of expressing things like (A (equal (predecessor (next A)))).  I do not need a fully general implementation of these kinds of mechanisms -- I only need enough for this task.  (Or rather, that would be enough, if people would stop changing the task.)
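
For example, with the definitions given earlier:

   A=: next next zero
   A equal predecessor next A   NB. the kind of statement we want to mechanize
1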

Also, how do we distinguish between different peano systems?

One issue, here, is that for us to be satisfied by a proof, we need to understand it.  That's pretty much the point of a proof -- that we are comfortable with our understanding at each step of the way.

And a related issue is that the theories which we can derive from a set of axioms are always going to be restatements of those axioms.  If there's something new being introduced, it's going to be a new axiom.  An axiom is something that we assume to be true, and the point of all this theory and proof stuff is to minimize our set of axioms -- of assumptions.

And, the way proofs work is: we are introducing a new sequence of things which we might treat as axioms -- constructive statements -- but which we can later ignore.  We see that they are equal to axioms we already have accepted as true.

But that's not really a good description for how computer programs run.  Computer programs are a mechanical set of logical processes -- very like a proof -- but instead of understanding their steps, the execution of a program happens outside of our minds.   Thus, programs which deal with proofs are typically called "proof assistants".   They can perform proof-like steps but it's still up to us to determine the relevance of these steps.

So... we can express the peano proof in a mechanical fashion.  But whether that is convincing to the reader is going to depend on the reader and on their understanding of the underlying system.

A related issue is that we can trust the system "too much".  The issue I began with here -- the issue having to do with treatment of infinity -- I think illustrates that hazard rather nicely.

But back to representing this problem in J...

Personally, I do not find mechanical manipulations to be a very adequate proof.  So I would not be writing an implementation that satisfies myself.  Instead, I suppose I would start by emulating an implementation in some other language.  Let's take the Agda implementation, for example:


module Arith where

-- 1. Natural numbers.
-- 
--   ℕ-formation:     ℕ is set.
-- 
--   ℕ-introduction:  o ∈ ℕ,
--                    a ∈ ℕ | (1 + a) ∈ ℕ.
-- 
data ℕ : Set where
  o : ℕ
  1+ : ℕ → ℕ

-- 2. Addition.
-- 
--   via ℕ-elimination.
-- 
infixl 6 _+_
_+_ : ℕ → ℕ → ℕ
o + n = n
1+ m + n = 1+ (m + n)

-- 3. Even natural numbers.
-- 
data 2×ℕ : ℕ → Set where
  o : 2×ℕ o
  2+ : {n : ℕ} → 2×ℕ n → 2×ℕ (1+ (1+ n))

-- 4. Sum of any two even numbers is even.
-- 
--   This function takes any two even numbers and returns their sum as an even number,
--   this is a type, i.e. logical proposition, algorithm itself is a proof by unification
--   which builds a required term of a given (inhabited) type, and the typechecker
--   performs that proof (so that this is compile-time verification).
-- 
even+even≡even : {m n : ℕ} → 2×ℕ m → 2×ℕ n → 2×ℕ (m + n)
even+even≡even o n = n
even+even≡even (2+ m) n = 2+ (even+even≡even m n)

-- The identity type for natural numbers.
-- 
infix 4 _≡_
data _≡_ (n : ℕ) : ℕ → Set where
  refl : n ≡ n

cong : {m n : ℕ} → m ≡ n → 1+ m ≡ 1+ n
cong refl = refl

-- 5.1. Direct proof of the associativity of addition.
-- 
+-associative : (m n p : ℕ) → (m + n) + p ≡ m + (n + p)
+-associative o _ _ = refl
+-associative (1+ m) n p = cong (+-associative m n p)

-- Proof _of_ mathematical induction on the natural numbers.
-- 
--   P 0, ∀ x. P x → P (1 + x) | ∀ x. P x.
-- 
ind : (P : ℕ → Set) → P o → (∀ n → P n → P (1+ n)) → ∀ n → P n
ind _ P₀ _ o = P₀
ind P P₀ next (1+ n) = next n (ind P P₀ next n)

-- 5.2. Associativity of addition by induction.
-- 
+-associative′ : (m n p : ℕ) → (m + n) + p ≡ m + (n + p)
+-associative′ m n p = ind P P₀ is m
  where
    P : ℕ → Set
    P i = i + n + p ≡ i + (n + p)
    P₀ : P o
    P₀ = refl
    is : ∀ i → P i → P (1+ i)
    is i Pi = cong Pi
 


Here's the web site for Agda: http://wiki.portal.chalmers.se/agda/pmwiki.php

So... reading through this... o is similar to my zero and 1+ is similar to my next.  Except:

1.   Agda begins by declaring their "type".
2.   It's not clear to me if this Agda code has any concrete implementation for the type, nor whether it's intended to be able to have any concrete implementation.   We can have different sets which are isomorphic to the natural numbers...

Also, this comment bothers me:

"Proof _of_ mathematical induction on the natural numbers"

This was originally an axiom -- something that we take as a given, not something that we should be proving.  So either that comment is wrong or the code is doing something different from what I think I would want to do.

Also... if I understand how Agda works (and I might not), it looks to me as if the treatment of associativity is implemented as a constraint on the original definition.  In other words, it looks to me as if it's saying that the earlier definition contains an associative subset, which is something slightly different from saying that the peano axioms guarantee associativity.

But, wait, what about the mathematical "for all" statements in that induction section?  Don't those show that the guarantees in part 5 must be valid for all natural numbers?  ..... unfortunately, that is the problem.  By asserting validity for an infinite series, we are asserting that a finite implementation will eventually fail.

Anyways, it *should* be possible to discuss these kinds of issues rationally, using the English language.  And I am upset that I cannot achieve any basic discussion on the subject of "what does it mean to implement a type in a computer program?"

My sense is that there are multiple concepts of type here -- that an Agda type is not anything like a J type.  (A J type being the set of values which can result from an expression, an Agda type may never correspond to any results?)  If so, this kind of description should be a crucial part of the Rosetta code task.

If it's really the case that the finite character of computer systems has no relevance to the definition of type being used here, this means that the type is not limited to the things which can result from the computer program.  Instead, it's a purely symbolic thing -- a glib type -- a type which does not have examples, which cannot be implemented.

That's... disappointing.

Wednesday, March 28

2 + 3 NB. five

So...

I got a few comments in tweets on my last post.  One expressed interest in sparse array handling, another expressed interest in jtva2, and another expressed general interest in the J interpreter.

I am not sure how much time and interest I am going to have, but I decided to look closer at how my +/1 2 3 expression gets handled.

But, before I do that (next post, probably), I thought I would do a pass through jtva2() in va2.c:

First: jtva2() does not get used, at all, in the expression +/1 2 3.  This is because +/ gets handled by special code.  However, if I execute:

  1+2+3

then jtva2 gets used (twice, in fact).  The first time it runs, a is a noun for the number 2, and w is a noun for the number 3.  The second time it runs, a is a noun  for the number 1 and w is a noun for the number 5.  Here's a breakdown of the names and their meanings, in the first time through:

an: 1 (the number of items in a)
ar: 0 (the rank of a)
as: pointer to shape of a
at: 4 (the jtype of a -- int)

wn, wr, ws and wt are the same (since the only difference between a and w here is the value)

id is '+'
zt: 4 (the jtype of z -- int)

var gets called, and in var:
t: 4 (intermediate result specifying both argument types)
j: 4 (index into list of dyadic handlers)
q: pointer to struct VA:
typedef struct {VA2 p2[13];VA2 pins[7];VA2 ppfx[7];VA2 psfx[7];} VA;

VA2 is
typedef struct {VF f;I cv;} VA2;

Here, f is a handler -- it's going to do the actual addition, multiplication, or whatever (once we have decided what type of operation we are going to perform), and cv represents the kinds of results we might be producing.

Here are the values for cv:
                                    /*   cv - control vector               */

#define VBB             (I)1        /* convert arguments to B              */
#define VII             (I)2        /* convert arguments to I              */
#define VDD             (I)4        /* convert arguments to D              */
#define VZZ             (I)8        /* convert arguments to Z              */
#define VXX             (I)16       /* convert arguments to XNUM           */
#define VQQ             (I)32       /* convert arguments to RAT            */
#define VB              (I)256      /* result type B                       */
#define VI              (I)512      /* result type I                       */
#define VD              (I)1024     /* result type D                       */
#define VZ              (I)2048     /* result type Z                       */
#define VX              (I)4096     /* result type XNUM                    */
#define VQ              (I)8192     /* result type RAT                     */
#define VSB             (I)16384    /* result type SBT                     */
#define VRD             (I)65536    /* convert result to D if possible     */
#define VRI             (I)131072   /* convert result to I if possible     */
#define VXEQ            (I)262144   /* convert to XNUM for = ~:            */
#define VXCF            (I)524288   /* convert to XNUM ceiling/floor       */
#define VXFC            (I)1048576  /* convert to XNUM floor/ceiling       */


Anyways, in var we are concerned with (q->p2)[j]

j is the result of a decision tree something like this:

If either argument is complex, j is 9

if either argument is rational or extended precision (or extended floating or extended complex -- types which I think are unimplemented -- I shall try to ignore unimplemented possibilities), then j is 8 if the other argument was floating point (C's double); otherwise j is 11 if either argument was rational, and 10 (extended precision) in the remaining cases.

Otherwise, j depends on the type of both the left and right argument, it's the sum of:

left arg boolean: 0
left arg int: 3
left arg something else (probably double): 6

right arg boolean: 0
right arg int: 1
right arg something else (probably double): 2

In our case, both args are int, so j is 4 and (q->p2)[j] gives a VA2 struct with f: _plusII and cv: 512

that f gets passed back to jtva2 in its ado pointer, and that cv gets passed back to jtva2 in its cv pointer (pointers to both of these were the last two arguments to var()).

Note also that cv gets 4 added to it if we are combining floating point with extended precision (I think in all such cases for primitives, the result is floating point -- this is for both practical and performance reasons: if you are using floating point, you have already lost the extreme accuracy which is supported by extended precision and rational numbers, and of course floating point calculations are machine efficient.)
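
You can see that rule from the J side (10x is an extended precision literal, and datatype is a standard library verb):

   datatype 1.5 + 10x   NB. floating point + extended: result falls back to floating point
floating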

But that doesn't matter here, and that's all that var does.  So, back to jtva2:

The first thing we do when we get back into jtva2 is inspect cv and figure out what kind of result we are going to try to produce.  This corresponds to the description at http://jsoftware.com/help/dictionary/dictg.htm

Basically: if we can produce a bool, we do that.  If we can produce an int (a C long), we do that.  If we can produce a float (a C double), we do that, and so on...

This is the value for zt= rtype(cv);

Next, we look at whether either of our arguments need to be converted to the relevant type in t=atype(cv), and in this case t is 0 -- we do not have to convert either of our arguments for our desired result type.

So, anyways, we skip over the type coercion code (and if we were dealing with sparse arrays we would also skip over that, perhaps because the sparse array code already dealt with that?).

Then we skip over some code that only runs if our jt context has a rank specification, and in the else clause:

We check that our arguments have the same shape prefix -- length error if not.

We find the shape of the result (the longer of the left shape or the right shape -- they're the same in this case), and we define m and n.  m is how long our shorter argument is, and n is how many times we need to repeat the shorter argument to make it the same length as the longer argument.  In this case, both are 1 -- both arguments are the same length and each only contains one value.

Finally, we allocate space for our result.  We need space for one integer with a rank of zero.  Here's the macro for "Get A":

#define GA(v,t,n,r,s)   RZ(v=ga(t,(I)(n),(I)(r),(I*)(s)))

RZ means return null if the result is false -- that's how errors are implemented in the J interpreter.

ga is a cover for jtga:


A jtga(J jt,I t,I n,I r,I*s){A z;I m,w;
 if(t&BIT){const I c=8*SZI;              /* bit type: pad last axis to fullword */
  ASSERTSYS(1>=r||s,"ga bit array shape");
  if(1>=r)w=(n+c-1)/c; else RE(w=mult(prod(r-1,s),(s[r-1]+c-1)/c));
  w+=WP(INT,0L,r); m=SZI*w; 
  ASSERT(     n>=0&&m>w&&w>0,EVLIMIT);   /* beware integer overflow */
 }else{
  w=WP(t,n,r);     m=SZI*w; 
  ASSERT(m>n&&n>=0&&m>w&&w>0,EVLIMIT);   /* beware integer overflow */
 }
 RZ(z=ma(m));
 if(!(t&DIRECT))memset(z,C0,m);
 if(t&LAST0){I*v=(I*)z+w-2; *v++=0; *v=0;}
 AC(z)=1; AN(z)=n; AR(z)=r; AFLAG(z)=0; AK(z)=AKX(z); AM(z)=msize[((MS*)z-1)->j]-(AK(z)+sizeof(MS)); 
 AT(z)=0; tpush(z); AT(z)=t;
 if(1==r&&!(t&SPARSE))*AS(z)=n; else if(r&&s)ICPY(AS(z),s,r);  /* 1==n always if t&SPARSE */
 R z;
}


basically, we have arguments specifying:
jt: interpreter context
t: data type
n: number of [data] elements to allocate
r: rank (number of [shape] elements to allocate)
*s: the shape

And then we find how big that is, call malloc (jtma() being our cover that will pull from our freelist of unused memory or try C's malloc()).

So, anyways, we get a blank slate to write our result into, bailing out if that fails, and then we execute this line:

  if(1==nf) DO(mf,        ado(jt,b,m,n,zv,av,wv); zv+=zk; av+=ak; wv+=wk;)

nf and mf are always 1 unless our execution context has a rank specification.  And DO is a macro that builds a for loop for us:

#define DO(n,stm)       {I i=0,_n=(n); for(;i<_n;i++){stm}}

In other words: two arguments, the first is how many times the second is run (the second argument is a C statement).  So basically, in this case, we call our ado routine (which was defined as _plusII) one time.

When we are done, we check our execution context for an error.  If the error smells like an overflow that we can handle, we recurse, calling jtva2 again with updated arguments (that we got from type coercion).
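
That overflow path is easy to trigger from the J side (this assumes a 64 bit build; datatype is a standard library verb):

   datatype 9223372036854775806 + 1   NB. still fits in a C integer
integer
   datatype 9223372036854775807 + 1   NB. plusII overflows; the retry produces a double
floating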

So... that's jtva2 -- what does plusII look like?

Here's line 269 of ve.h, where it's declared:

extern ADECL2( plusII,I,I,I);

And here's line 8 of ve.h, which defines ADECL2:

#define ADECL2  AHDR2

And here's line 124 of va.h, which defines AHDR2:

#define AHDR2(f,Tz,Tx,Ty)       void f(J jt,B b,I m,    I n,Tz*z,Tx*x,Ty*y)

And note that Tz, Tx and Ty are C type names, for the result, left argument and right argument.

So... an ADECL2 function gets these arguments:
jt: interpreter context
b: (only matters if n is 0): 1: x is a list, 0: y is a list
m: how many sequential values to generate
n: 1 if both arguments are lists of the same length
z: result
x: left argument
y: right argument

In other words:
  1+2 3 4
b: 0
m: 3
n: 0
  4 5 6*8
b: 1
m: 3
n: 0
  9 10 11-12 13 14
m: 3
n: 1
  1+1
m: 1
n: 1

And the actual definition of plusII is on line 18 of ve.c and looks like this:

AOVF( plusII, I,I,I,  PLUSVV, PLUS1V, PLUSV1)

AOVF is defined in lines 147-152 of va.h, and looks like:


#define AOVF(f,Tz,Tx,Ty,fvv,f1v,fv1)  \
 AHDR2(f,I,I,I){C er=0;I u,v,*x1,*y1,*z1;                                       \
  if(1==n)  {fvv(m,z,x,y); RER;}                                                \
  else if(b){z1=z; y1=y; DO(m, u=*x++; f1v(n,z,u,y); RER; z=z1+=n; y=y1+=n;);}  \
  else      {z1=z; x1=x; DO(m, v=*y++; fv1(n,z,x,v); RER; z=z1+=n; x=x1+=n;);}  \
 }

And the type I gets defined on either line 15 (64 bit) or line 22 (32 bit) of jtype.h.  The 32 bit version looks like this:

typedef long               I;

The 64 bit version looks like this:

typedef long long          I;

So that's the C type that J uses to represent simple integer values.

Meanwhile, PLUSVV is defined on line 489 of vasm.h:

#define  PLUSVV(m,z,x,y)   {B p;  DO(m, p=0>*x; *z=*x+*y;     BOV(p==0>*y&&p!=0>*z); z++; x++; y++;);}

(The overflow test reads: remember whether *x is negative; if *y has the same sign but the sum does not, the addition overflowed.)

and PLUS1V is on line 493:

#define  PLUS1V(n,z,u,y)   {B p=0>u;  DO(n, z[i]=u+y[i];         BOV(p==0>y[i]&&p!=0>z[i]););}

and PLUSV1 is on line 497:

#define  PLUSV1(n,z,x,v)   PLUS1V(n,z,v,x)

(obviously, these need to be macros, because they are being defined for a variety of data types.)

B is just a declaration for a bit value (from line 27 of jtype.h):

typedef char               B;

BOV checks an expression for overflow:

#define BOV(exp)        if(exp){er=EWOV; break;}

(and this error state is not displayed to the user -- it means try again, or report a domain error if that's not valid.)

And... I think that's about all I can say about 2+3 -- but this is getting long, so I'll save +/ 1 2 3 for another post...  (And I should maybe do a couple more passes through jtva2() some time.  Interesting issues, that probably deserve their own passes, include: non-trivial rank, type coercion, overflow, and sparse arrays.)

Saturday, March 24

+/ 1 2 3 NB. six

Scott Vokes () wrote:

 I haven't looked at it lately, but I remember running into a brick wall trying to figure out how adverbs work in it. 

So I was thinking that sounds like a worthwhile project -- documenting how an adverb works in the J interpreter.

So, first off, my understanding is that the J interpreter, while written in C, is structured much like a J program.  It's doing "J things".  So, to understand what it's doing, we should start with J.

In this case, let's consider the expression:

   +/ 1 2 3

Here, we have three objects:

+ a verb
/ an adverb
1 2 3 a noun

And the first thing that the J interpreter would handle, when executing this expression, is the noun 1 2 3.  Since it's a noun, internally in the interpreter it's represented by two lists.  The first list is the shape of the array (one dimension, of length 3).  The second list is the values of the array (which must have a length equal to the product of the values of the first list).  There's also something to represent the C type used to represent the values, and the C length of the stored data (for memory management).
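
You can poke at both lists from inside J ($ gives the shape list, and datatype, from the standard library, names the stored type):

   $ 1 2 3          NB. the shape: one axis, of length 3
3
   datatype 1 2 3   NB. the C type used for the values
integer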

Next, J would pick up the adverb.  But it can't do anything with it yet.

Next, J would pick up the verb.  A J verb has a dyadic definition, and a monadic definition and a rank.  In this case the dyadic definition is the interesting one, but the interpreter does not know that yet (unless special code is involved, which might actually be the case here).  So, somewhere in the interpreter we are going to be dealing with both of these definitions (and the rank: 0 0 0).

Next, the interpreter would give that verb to the adverb.  And, the adverb would be building a new verb from it.  Again, the new verb would have a dyadic definition, a monadic definition and a rank.  In this case we are going to be using the monadic definition, but unless special code is involved (and it probably is, in this case), the interpreter will not know that yet.

Finally, this new verb is going to process that noun.  But the interpreter isn't quite done yet -- it's going to be dealing with rank issues before letting the underlying verb mechanisms get at the underlying data.  But, eventually, something is going to be adding up those three numbers and coming back with the value 6.

Except, of course, 6 is going to be a noun. So it's going to have a list of dimensions (an empty list), and a list of values (one value here).  Obvious, maybe, but important.
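
And you can check that -- the result's shape list is empty:

   # $ +/ 1 2 3   NB. zero entries in the shape: a scalar
0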

--------------

So.. next, what are the C definitions that are relevant here?

First, line 227 of j.h:

#define F2(f)           A f(J jt,A a,A w)

This (F2) is how a C function is turned into a J dyadic definition.

A is an array type (it's a pointer to the structure which represents a noun)

J is the type of an J interpreter instance (again, it's a pointer to a structure).  I think this corresponds to a context where a symbol can be resolved (a stack frame within a locale), but I am not sure, maybe I'll come back to this.

Also: in the interpreter, the name 'a' typically refers to a verb's left argument, 'w' typically refers to its right argument and 'z' typically refers to its result.  (There are historical reasons for this.)

So this is saying that when we execute a function representing a dyadic verb, that verb will get three arguments: the context it is executing in, the verb's left argument, and the verb's right argument.

So... next, line 705 of ja.h:

#define plus(x,y)                   jtplus(jt,(x),(y))

Here, we see something that looks like a definition for plus.  And notice how it's following the 'F2' pattern -- jtplus could be a valid argument for F2 -- so this looks promising for a definition for addition.

And, indeed, on line 489 of je.h:

extern F2(jtplus);

So this is the C extern, that various C modules are going to need to have access to, if they are going to be able to refer to jtplus.  Also, we have a cpp macro named 'plus' that does the J dyadic addition thing.  This is not necessarily the angle we will be coming in at, when we execute +/ but hopefully some of the patterns are becoming clear.  For example, we now know that jt is expected to be the name that locally refers to the context where a verb will be executed.

So, how does the definition of jtplus look?  It seems to be line 400 of va2.c:

F2(jtplus   ){R va2(a,w,CPLUS   );}

R is the C return statement.  We already know that F2(jtplus) is the C type signature for jtplus.  So, we just need va2 and CPLUS to understand this statement.

CPLUS is simple, it's on line 52 of jc.h

#define CPLUS      '+'          /*  43 053 2b                              */

The comment gives the numeric value of that character, in decimal, octal and hexadecimal.

va2 is defined on line 1074 of ja.h and looks like this:

#define va2(x,y,z)                  jtva2(jt,(x),(y),(z))

We know that z will be '+' and x and y will be left and right verb arguments

jtva2 has a big definition:


static A jtva2(J jt,A a,A w,C id){A z;B b,c,sp=0;C*av,*wv,*zv;I acr,af,ak,an,ar,*as,at,cv,f,m,
     mf,n,nf,*oq=jt->rank,r,*s,*sf,t,wcr,wf,wk,wn,wr,*ws,wt,zk,zn,zt;VF ado;
 RZ(a&&w);
 an=AN(a); ar=AR(a); as=AS(a); at=an?AT(a):B01; 
 wn=AN(w); wr=AR(w); ws=AS(w); wt=wn?AT(w):B01;
 if(id==CEXP&&1==wn&&FL&wt&&0.5==*DAV(w))R sqroot(a);
 if(SPARSE&at){sp=1; at=DTYPE(at);}
 if(SPARSE&wt){sp=1; wt=DTYPE(wt);}
 RZ(var(id,a,w,at,wt,&ado,&cv)); zt=rtype(cv); t=atype(cv);
 if(t&&!sp){
  b=1&&t&XNUM;
  if(t!=at)RZ(a=b?xcvt(cv&VXEQ?XMEXMT:cv&VXFC?XMFLR:cv&VXCF?XMCEIL:XMEXACT,a):cvt(t,a));
  if(t!=wt)RZ(w=b?xcvt(cv&VXEQ?XMEXMT:cv&VXCF?XMFLR:cv&VXFC?XMCEIL:XMEXACT,w):cvt(t,w));
 }
 if(jt->rank){I acn,q,wcn,zcn;
  r=jt->rank[0]; acr=MIN(ar,r); af=ar-acr; acn=prod(acr,as+af);
  r=jt->rank[1]; wcr=MIN(wr,r); wf=wr-wcr; wcn=prod(wcr,ws+wf); jt->rank=0;
  ASSERT(!ICMP(as,ws,MIN(af,wf))&&!ICMP(as+af,ws+wf,MIN(acr,wcr)),EVLENGTH);
  c=af<=wf; f=c?wf:af; q=c?af:wf; sf=c?ws:as;
  b=acr<=wcr; zcn=b?wcn:acn; m=b?acn:wcn; n=m?zcn/m:0; r=b?wcr:acr; s=b?ws+wf:as+af;
  if(zcn){RE(mf=prod(q,sf)); RE(nf=prod(f-q,q+sf));}else mf=nf=0;
  if(!sp){RE(zn=mult(mf,mult(nf,zcn))); zk=zcn*bp(zt); ak=acn*bp(AT(a)); wk=wcn*bp(AT(w));}
 }else{
  ASSERT(!ICMP(as,ws,MIN(ar,wr)),EVLENGTH);
  ak=wk=zk=af=wf=f=c=0; acr=ar; wcr=wr; sf=0; mf=nf=1;
  b=ar<=wr; zn=b?wn:an; m=b?an:wn; n=m?zn/m:0; r=b?wr:ar; s=b?ws:as;
 }
 if(sp){z=vasp(a,w,id,ado,cv,t,zt,af,acr,wf,wcr,f,r); if(!jt->jerr)R z;}
 else{
  GA(z,zt,zn,f+r,sf); ICPY(f+AS(z),s,r); 
  if(!zn)R z;
  av=CAV(a); wv=CAV(w); zv=CAV(z);
  if(1==nf) DO(mf,        ado(jt,b,m,n,zv,av,wv); zv+=zk; av+=ak; wv+=wk;)
  else if(c)DO(mf, DO(nf, ado(jt,b,m,n,zv,av,wv); zv+=zk;         wv+=wk;); av+=ak;)
  else      DO(mf, DO(nf, ado(jt,b,m,n,zv,av,wv); zv+=zk; av+=ak;        ); wv+=wk;);
  if(!jt->jerr)R cv&VRI+VRD?cvz(cv,z):z;
 }
 R NEVM<jt->jerr?(jt->rank=oq,va2(a,w,id)):0;
}    /* scalar fn primitive and f"r main control */

I have emphasized the places in this code which care about the fact that we are doing addition, instead of something else (bold, by itself, was almost invisible so I also used italics and I increased font size).  This looks like a workhorse to handle all scalar primitives.  And, it looks recursive, at the very bottom, which I think is how it handles overflow -- if a result overflows, the routine can derive new argument types and call itself again.  Interesting (not completely structural) names here are the ones that care that we are dealing with CPLUS as opposed to something else:

var
vasp
va2 (a reference back to this routine).

var gets defined on line 1078 of ja.h:

#define var(x0,x1,x2,x3,x4,x5,x6)   jtvar(jt,(x0),(x1),(x2),(x3),(x4),(x5),(x6))

so it needs a reference to the J interpreter context.  Also.. three extra arguments (beyond jt and the three arguments provided to va2()) -- that's probably rank handling (and peeking ahead at the definition of jtvar: yes, it is).

vasp gets defined a couple lines down, on line 1080 of ja.h:

#define vasp(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12)     jtvasp(jt,(x0),(x1),(x2),(x3),(x4),(x5),(x6),(x7),(x8),(x9),(x10),(x11),(x12))

This one turns out to be code for dealing with sparse data.  Interesting, but too far afield for what we are doing right now.

So let's look at jtvar:

B jtvar(J jt,C id,A a,A w,I at,I wt,VF*ado,I*cv){B b;I j,t,x;VA*q;VA2 p;
 if(jt->jerr){
  switch(VARCASE(jt->jerr,id)){
   default:      R 0;
   case VARCASE(EWIMAG,CCIRCLE ): *ado=(VF)cirZZ;   *cv=VZ+VZZ+VRD; break;
   case VARCASE(EWIMAG,CEXP    ): *ado=(VF)powZZ;   *cv=VZ+VZZ+VRD; break;
   case VARCASE(EWIRR ,CBANG   ): *ado=(VF)binDD;   *cv=VD+VDD;     break;
   case VARCASE(EWIRR ,CEXP    ): *ado=(VF)powDD;   *cv=VD+VDD;     break;
   case VARCASE(EWRAT ,CDIV    ): *ado=(VF)divQQ;   *cv=VQ+VQQ;     break;
   case VARCASE(EWRAT ,CEXP    ): *ado=(VF)powQQ;   *cv=VQ+VQQ;     break;
   case VARCASE(EWDIV0,CDIV    ): *ado=(VF)divDD;   *cv=VD+VDD;     break;
   case VARCASE(EWOV  ,CPLUS   ): *ado=(VF)plusIO;  *cv=VD+VII;     break;
   case VARCASE(EWOV  ,CMINUS  ): *ado=(VF)minusIO; *cv=VD+VII;     break;
   case VARCASE(EWOV  ,CSTAR   ): *ado=(VF)tymesIO; *cv=VD+VII;     break;
   case VARCASE(EWOV  ,CPLUSDOT): *ado=(VF)gcdIO;   *cv=VD+VII;     break;
   case VARCASE(EWOV  ,CSTARDOT): *ado=(VF)lcmIO;   *cv=VD+VII;     break;
   case VARCASE(EWOV  ,CSTILE  ): *ado=(VF)remDD;   *cv=VD+VDD;     break;
  }
  RESETERR;
 }else if(at&NUMERIC&&wt&NUMERIC){
  t=at|wt; b=1&&t&RAT+XNUM+XD+XZ;
  j=t&CMPX ? 9 : b ? (t&XZ?13:t&XD?12:t&FL?8:t&RAT?11:10) : 
      (at&B01?0:at&INT?3:6)+(wt&B01?0:wt&INT?1:2);
  q=va+vaptr[(UC)id];
  p=(q->p2)[j];
  *ado=p.f; *cv=x=p.cv; if(b&&t&FL&&!(x&VZZ))*cv+=VDD;
 }else{
  b=!HOMO(at,wt); *cv=VB;
  jt->rela=ARELATIVE(a)*(I)a;
  jt->relw=ARELATIVE(w)*(I)w;
  switch(id){
   case CEQ: *ado=b?(VF)zeroF:at&SBT?(VF)eqII:at&BOX?(VF)eqAA:
                  at&LIT?(wt&LIT?(VF)eqCC:(VF)eqCS):wt&LIT?(VF)eqSC:(VF)eqSS; break;
   case CNE: *ado=b?(VF) oneF:at&SBT?(VF)neII:at&BOX?(VF)neAA:
                  at&LIT?(wt&LIT?(VF)neCC:(VF)neCS):wt&LIT?(VF)neSC:(VF)neSS; break;
   default:
    ASSERT(at&SBT&&wt&SBT,EVDOMAIN);
    q=va+vaptr[(UC)id]; p=(q->p2)[12];
    ASSERT(p.f,EVDOMAIN);
    *ado=p.f; *cv=x=p.cv;
 }}
 R 1;
}    /* function and control for rank */

First it has a bunch of error reporting, then it checks if both arguments are numeric and ... 

  q=va+vaptr[(UC)id]; 

Here, vaptr is a table with this comment:  /* index in va[] for each ID */

And the va table itself contains this:

/* 2b +  */ {
 {{plusBB,VI    }, {plusII,VI+VII}, {plusBD,VD}, 
  {plusII,VI+VII}, {plusII,VI    }, {plusID,VD}, 
  {plusDB,VD    }, {plusDI,VD    }, {plusDD,VD}, 
  {plusZZ,VZ+VZZ}, {plusXX,VX+VXX}, {plusQQ,VQ+VQQ}, {0,0}},
 {{plusinsB,VI}, {plusinsI,VI}, {plusinsD,VD}, {plusinsZ,VZ}, {0,0},         {0,0},         {0,0}},
 {{pluspfxB,VI}, {pluspfxI,VI}, {pluspfxD,VD}, {pluspfxZ,VZ}, {pluspfxX,VX}, {pluspfxQ,VQ}, {0,0}},
 {{plussfxB,VI}, {plussfxI,VI}, {plussfxD,VD}, {plussfxZ,VZ}, {plussfxX,VX}, {plussfxQ,VQ}, {0,0}} },

... so it looks like we are doing a table lookup based on operation and type -- C gets a bit obscure for multi-dimensional arrays -- but ultimately, we are going to be running one of these "typed functions".  J doesn't want to have to decide which one until it's inspected the data, so that's why we just refer to the operation by name until after we are down in the weeds.  (And we can imagine that each of these names is a C macro and really a reference to jtplusBB or jtplusXX or whatever else.)

And, another thing is that this table lookup lets us share common code with all "scalar atomic functions" -- most of it is "boilerplate work", and at the last moment we dispatch to the appropriate handler.  We could easily be using an analogous pattern when we get into adverb evaluation.

So.. that's our first verb.

-----------------

For our adverb, let's start on line 785 of ja.h:

#define reduce(x,y)                 jtreduce(jt,(x),(y))

This jt stuff should be a familiar pattern by now.  Here, x is probably the verb doing the reducing and y is probably the noun being reduced.  (x and y being newer names that adopted the same role as the previous a and w.)

Looking for references to jtreduce, I see three of them in ar.c:

line 12:
static DF1(jtreduce);

line 442 (the definition of jtreduce)

line 563 (the definition of jtslash)

Taking a step back:  in J, we have two steps in using +/ 1 2 3.  Our first step was to derive a verb (monadic definition, dyadic definition, rank).  That's probably what jtslash is doing.   Our second step was to derive a noun.  That's probably what jtreduce is doing.

Taking a step back further, all references to reduce are in ar.c (aside from that definition in ja.h).  So I suppose we can get away with a static rather than an extern in a header file, for jtreduce.  

So... looking at jtslash:

F1(jtslash){A h;AF f1=jtreduce;C c;V*v;
 RZ(w);
 if(NOUN&AT(w))R evger(w,sc(GINSERT));
 v=VAV(w); 
 switch(v->id){
  case CCOMMA:  f1=jtredcat;    break;
  case CCOMDOT: f1=jtredstitch; break;
  case CSEMICO: f1=jtredsemi;   break;
  case CUNDER:  if(COPE==ID(v->g)){c=ID(v->f); if(c==CCOMMA)f1=jtredcateach; else if(c==CCOMDOT)f1=jtredstiteach;}
 }
 RZ(h=qq(w,v2(lr(w),RMAX)));
 R fdef(CSLASH,VERB, f1,jtoprod, w,0L,h, 0L, RMAX,RMAX,RMAX);
}

it has some special code for a few operations, otherwise it defines a verb (using fdef).  The monadic definition is f1, which will be jtreduce if special code was not used.  Here, w is probably our plus verb.

But let's look at fdef for a moment, or rather jtfdef:

A jtfdef(J jt,C id,I t,AF f1,AF f2,A fs,A gs,A hs,I flag,I m,I l,I r){A z;V*v;
 RE(0);
 GA(z,t,1,0,0); v=VAV(z);
 v->f1    =f1?f1:jtdomainerr1;  /* monad C function */
 v->f2    =f2?f2:jtdomainerr2;  /* dyad  C function */
 v->f     =fs;                  /* monad            */
 v->g     =gs;                  /* dyad             */      
 v->h     =hs;                  /* fork right tine or other auxiliary stuff */
 v->flag  =flag;
 v->mr    =m;                   /* monadic rank     */
 v->lr    =l;                   /* left    rank     */
 v->rr    =r;                   /* right   rank     */
 v->fdep  =0;                   /* function depth   */
 v->id    =id;                  /* spelling         */
 R z;
}

Yep -- looks like struct stuffing.

So, anyways, when we evaluate this thing, we expect that it will be under control of jtreduce, which looks like this:

static DF1(jtreduce){A z;C id;I c,cv,f,m,n,r,rr[2],t,wn,wr,*ws,wt,zt;VF ado;
 RZ(w);
 wr=AR(w); r=jt->rank?jt->rank[1]:wr; f=wr-r;
 wn=AN(w); ws=AS(w); n=r?ws[f]:1;
 if(SPARSE&AT(w))R reducesp(w,self);
 wt=AT(w); wt=wn?wt:B01;
 id=vaid(VAV(self)->f);
 switch(n){
  case 0: R red0(w,self);
  case 1: R tail(w);
  case 2: RZ(reduce2(w,id,f,r,&z)); if(z)R z;
 }
 vains(id,wt,&ado,&cv);
 if(!ado)R redg(w,self);
 zt=rtype(cv); jt->rank=0;
 GA(z,zt,wn/n,MAX(0,wr-1),ws); if(1<r)ICPY(f+AS(z),f+1+ws,r-1);
 if((t=atype(cv))&&t!=wt)RZ(w=cvt(t,w));
 m=prod(f,ws); c=m?wn/m:prod(r,f+ws);
 ado(jt,m,c,n,AV(z),AV(w));
 if(jt->jerr)R jt->jerr==EWOV?(rr[1]=r,jt->rank=rr,reduce(w,self)):0; else R cv&VRI+VRD?cvz(cv,z):z;
}    /* f/"r w main control */

here we see the usual mishmash of C type checking and sanity checking.  That switch statement looks like it's based off of the length of the data: length zero data gets one handling, length 1 data gets another handling, and length 2 gets its own special case (jtreduce2).  Anything longer falls through to vains, which looks up the "insert" handler ado that does the heavy lifting -- presumably one of the plusins entries in the va table we saw earlier.  Probably...

So, question: is this kind of analysis worth continuing?  Should I delve deeper into any of the details here?