*E.T. Jaynes*Probability Theory: The Logic of Science.

And, ok, so far I have not had much time to spend on that particular book. So I'm still on the first chapter.

Anyways I particularly liked this quote that he had used:

"Good mathematicians see analogies between theorems; great mathematicians see analogies between analogies" (attributed to S. Banach quoted by S. Ulam 1957 ('Marian Smoluchowski and the theory of probabilities in physics', Am. J. Phys.

**25**, 475-481)

This matches my intuition about what makes great mathematicians great -- they see potential relationships long before they have explored them and expressed them rigorously.

On the other hand, I was almost disturbed by one of his sentences in this statement:

1.5 Boolean algebra

To state these ideas more formally, we introduce some notation of the usual symbolic logic, or Boolean algebra, so called because George Bool (1854) introduced anotationsimilar to the following. Of course, the principles of deductive logic itself were well understood centuries before Boole, and, as we shall see, all the results that follow from Boolean algebra were contained already as special cases in the rules of plausible inference given by (1812). The symbol

AB,

called thelogical productor theconjunction, denotes the proposition 'both A and B are true'. Obviously the order in which we state them does not matter, AB and BA say the same thing. The expression

A + B

called theThe sentence which bothers me is that last sentence that I quoted, and it bothers me because of the first sentence that I quoted.logical sumordisjunction, stands for 'at least one of the propositions, A, B is true' and has the same meaning as B + A. These symbols are only a shorthand way of writing propositions, and do not stand for numerical values.

Here's the thing: in George Boole's work, these statements represent numerical results. George Boole started out by expressing his concepts in terms of axioms, but those axioms have a well understood correspondence to numerical operations. Boolean multiplication is the operation we know as "Least Common Multiple" and Boolean addition is the operation we know as "Greatest Common Denominator"

So here's a times table for boolean multiplication:

┌─┬───────────────────────────┐

│ │0 1 2 3 4 5 6 7 8 9│

├─┼───────────────────────────┤

│0│0 0 0 0 0 0 0 0 0 0│

│1│0 1 2 3 4 5 6 7 8 9│

│2│0 2 2 6 4 10 6 14 8 18│

│3│0 3 6 3 12 15 6 21 24 9│

│4│0 4 4 12 4 20 12 28 8 36│

│5│0 5 10 15 20 5 30 35 40 45│

│6│0 6 6 6 12 30 6 42 24 18│

│7│0 7 14 21 28 35 42 7 56 63│

│8│0 8 8 24 8 40 24 56 8 72│

│9│0 9 18 9 36 45 18 63 72 9│

└─┴───────────────────────────┘

Notice how the upper left hand corner, which has 0 and 1, would be a truth table for the "and" operation, if 1 is true and 0 is false. Of course, that's not very exciting, because you get the same upper left hand corner from a regular multiplication table. The problem though is that when you add 1 and 1 you get 2. But boolean addition (LCM) looks like this:

┌─┬───────────────────┐

│ │0 1 2 3 4 5 6 7 8 9│

├─┼───────────────────┤

│0│0 1 2 3 4 5 6 7 8 9│

│1│1 1 1 1 1 1 1 1 1 1│

│2│2 1 2 1 2 1 2 1 2 1│

│3│3 1 1 3 1 1 3 1 1 3│

│4│4 1 2 1 4 1 2 1 4 1│

│5│5 1 1 1 1 5 1 1 1 1│

│6│6 1 2 3 2 1 6 1 2 3│

│7│7 1 1 1 1 1 1 7 1 1│

│8│8 1 2 1 4 1 2 1 8 1│

│9│9 1 1 3 1 1 3 1 1 9│

└─┴───────────────────┘

Here, the upper lefthand corner looks like an OR truth table.

So, yes, George Boole's axioms give us a system where we can use the algebraic notation to represent logical operations. But it's not that the values involved are not numeric -- it's that the operations involved are not the usual multiply and divide.

Of course... I can imagine where Jaynes is going with this. Instead of using boolean multiplication we could use regular multiplication to get our truth table. Probabilities, after all, are expressed as numbers in the range 0..1, and we can multiply probabilities to express the probability of two independent events in terms of the probability of each of them.

And, we can express "OR" similarly -- first we find the chance that each of our events do not happen (1-probability) then we multiply those to find the chance that neither event happens, then we subtract that result from one to find the probability that at least one event happened. So, possibly he was saying that boolean algebra does not represent numbers to avoid confusion between the notation he is going to introduce and the notation he is using here.

Personally, though, I would rather avoid such non-factual statements, even for emphasis. (Also, it's likely he is saying this because someone else told it too him.)

Anyways, just for the fun of it, here's what boolean addition (using the euclidean algorithm to find least common multiple) gives me for some fractions in the range of 0..1:

┌───┬───────────────────────────────────────────┐

│ │0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 │

├───┼───────────────────────────────────────────┤

│0 │0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 │

│0.1│0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1│

│0.2│0.2 0.1 0.2 0.1 0.2 0.1 0.2 0.1 0.2 0.1 0.2│

│0.3│0.3 0.1 0.1 0.3 0.1 0.1 0.3 0.1 0.1 0.3 0.1│

│0.4│0.4 0.1 0.2 0.1 0.4 0.1 0.2 0.1 0.4 0.1 0.2│

│0.5│0.5 0.1 0.1 0.1 0.1 0.5 0.1 0.1 0.1 0.1 0.5│

│0.6│0.6 0.1 0.2 0.3 0.2 0.1 0.6 0.1 0.2 0.3 0.2│

│0.7│0.7 0.1 0.1 0.1 0.1 0.1 0.1 0.7 0.1 0.1 0.1│

│0.8│0.8 0.1 0.2 0.1 0.4 0.1 0.2 0.1 0.8 0.1 0.2│

│0.9│0.9 0.1 0.1 0.3 0.1 0.1 0.3 0.1 0.1 0.9 0.1│

│1 │1 0.1 0.2 0.1 0.2 0.5 0.2 0.1 0.2 0.1 1 │

└───┴───────────────────────────────────────────┘

And, here's the corresponding probabilities:

┌───┬──────────────────────────────────────────────────┐

│ │0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1│

├───┼──────────────────────────────────────────────────┤

│0 │0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1│

│0.1│0.1 0.19 0.28 0.37 0.46 0.55 0.64 0.73 0.82 0.91 1│

│0.2│0.2 0.28 0.36 0.44 0.52 0.6 0.68 0.76 0.84 0.92 1│

│0.3│0.3 0.37 0.44 0.51 0.58 0.65 0.72 0.79 0.86 0.93 1│

│0.4│0.4 0.46 0.52 0.58 0.64 0.7 0.76 0.82 0.88 0.94 1│

│0.5│0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1│

│0.6│0.6 0.64 0.68 0.72 0.76 0.8 0.84 0.88 0.92 0.96 1│

│0.7│0.7 0.73 0.76 0.79 0.82 0.85 0.88 0.91 0.94 0.97 1│

│0.8│0.8 0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96 0.98 1│

│0.9│0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1│

│1 │1 1 1 1 1 1 1 1 1 1 1│

└───┴──────────────────────────────────────────────────┘

Both of those tables are tables of numbers, I'm just using different operations here.

Of course...

Another related issue is that "Boolean Algebra" has been redefined to mean the boolean operations limited to the values 0 and 1 (because it includes a "logical negation" operation which is defined to only operate on the values 0 and 1). This redefinition (which apparently happened a few decades ago) pretty much ignores most of Boole's work and instead focuses on I think some derivations which should be attributed to Claude Shannon. (Edit: no probably Henry Sheffer - see http://r-nd-m.blogspot.com/2014/04/the-boolean-disease.html where I went over these same issues again.)

(No references here, tonight -- I'm working from memory -- but this matches everything I know about the subject. However, the only statements I have seen that boolean values are not numeric -- have never been accompanied by any reference to Boole's work. And, given the obvious numeric interpetation (the GCD/LCM thing), I am pretty sure that that's just an "urban myth". I just hate to see it repeated so widely, in the works of mathematicians who really should have looked it up for themselves.)

Update: Paul Snively wrote a followup: http://psnively.github.com/blog/2012/08/08/Jaynes_and_Boolean_Algebra/

And, while mostly I agree with the thought he's expressed there I do have a quibble with his followup:

In addition, though, it makes no sense to ascribe numerical values to propositions! Consider:

A = Christopher Columbus discovered America in 1492.

B = It will rain today.

ABMy problem here is that the proposition itself makes no sense.

I have also been wondering if I could come up with a better phrasing for Jayne's sentence "These symbols are only a shorthand way of writing propositions, and do not stand for numerical values." -- personally, I think I would just drop the last seven words from that sentence.

Which proposition makes no sense? "Christopher Columbus discovered America in 1492," "It will rain today," or "Christopher Columbus discovered America in 1492 and it will rain today?" All three of them seem perfectly fine, albeit contrived: they're declarative statements that can be ascribed truth values, true or false. So I'm not quite sure what you mean.

ReplyDeleteI meant the conjunction.

ReplyDeleteSimilarly, if we allow ourselves to a truth value with 1 or 0, then using these numbers in any of these three cases also seems "perfectly fine, albeit contrived".

And, of course, the advantage of using numbers to represent truth values is that we can then use numerical notation to treat statements which deal with logical propositions.

Ah, thanks, I think I understand better.

ReplyDeleteAB is indeed odd, from a human point of view, but nevertheless perfectly sensible logically, which is all we're after.

Re: using numbers to represent truth values, sure, that's exactly where Jaynes is going, of course. The difference is that

propositionsdon't have numerical values, but theprobabilitiesof the propositions certainly do. So we can't ascribe a number 0.0..1.0 to A, but we certainly can, and do, ascribe such a value to P(A). Where Jaynes further parts company from Boole is in what equations he uses then to calculate using probabilities, but I'll get to that in a future post.Thanks for the clarification!

I'm not feeling inclined to agree with this distinction.

ReplyDeleteI would classify the distinction you are making as: it's meaningful to use real valued numbers in the range [0..1] to represent something about the truthfulness of a proposition (because they can represent probabilities), but it's not meaningful to use integer valued numbers in that range to represent truthfulness.

You might also be saying that proofs in an integer domain are invalid if they are structurally different from proofs about related concepts in a real domain?

Anyways, if you are working with contrived propositions -- propositions which are not part of a larger systematic framework -- then the usefulness of both the propositions themselves and of numbers representing the truth value of the propositions are both dubious.

But we can have propositions which are part of some larger system.

For example, if we have a proposition P(x) which is true if x is prime then we can express Euler's totient function as the sum of P(x) for x <= n.