Tuesday, April 8

The "boolean" disease

I was just reading the linux kernel coding style guide, and I am once again frustrated by how poorly people understand Boolean algebra.

Here's the offending quote:

Functions can return values of many different kinds, and one of the most common is a value indicating whether the function succeeded or failed.  Such a value can be represented as an error-code integer (-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure, non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs.  If the C language included a strong distinction between integers and booleans then the compiler would find these mistakes for us... but it doesn't.
 So what's wrong with that?

It's essentially a failure of the entire educational system that a statement like that goes unchallenged.

Here's an example of Boolean multiplication on integers:
┌──┬─────────────────────────────┐
│*.│ _4  _3 _2 _1 0  1  2   3   4│
├──┼─────────────────────────────┤
│_4│  4  12  4  4 0 _4 _4 _12  _4│
│_3│ 12   3  6  3 0 _3 _6  _3 _12│
│_2│  4   6  2  2 0 _2 _2  _6  _4│
│_1│  4   3  2  1 0 _1 _2  _3  _4│
0│  0   0  0  0 0  0  0   0   0│
1│ _4  _3 _2 _1 0  1  2   3   4│
│ 2│ _4  _6 _2 _2 0  2  2   6   4│
│ 3│_12  _3 _6 _3 0  3  6   3  12│
│ 4│ _4 _12 _4 _4 0  4  4  12   4│
└──┴─────────────────────────────┘

And here's an example of Boolean addition on integers:
┌──┬─────────────────────┐
│+.│_4 _3 _2 _1 0 1 2 3 4│
├──┼─────────────────────┤
│_4│ 4  1  2  1 4 1 2 1 4│
│_3│ 1  3  1  1 3 1 1 3 1│
│_2│ 2  1  2  1 2 1 2 1 2│
│_1│ 1  1  1  1 1 1 1 1 1│
0│ 4  3  2  1 0 1 2 3 4│
1│ 1  1  1  1 1 1 1 1 1│
│ 2│ 2  1  2  1 2 1 2 1 2│
│ 3│ 1  3  1  1 3 1 1 3 1│
│ 4│ 4  1  2  1 4 1 2 1 4│
└──┴─────────────────────┘

Notice how, if we restrict ourselves to the 0 and 1 we get "binary and" and "binary or".

Anyways, somewhere along the line, someone (probably Henry Sheffer) decided that "logical not" should be a part of boolean algebra. Essentially, it's "proof by vigorous assertion". Algebras existed before Henry Sheffer, and Boole's work existed before Henry. Henry's work proved an equivalence between logic and Boole's system - for the case where we restrict ourselves to 0 and 1.

But what do we gain from this?

Answer: not much.

Meanwhile, there's another way of formulating logic using 0 and 1 which gracefully incorporates the idea of logical negation:

Instead of using Boole's multiplication (which is defined on integers) we can instead use regular multiplication. And, for logical not, we can use 1-x. And that's all we need, we can use Henry Sheffer's approach to define logical or: x OR y is 1-((1-x) * (1-y)).

And what do we gain from this?

Answer: probability and logic can be seen as two facets of the same system.

Something like this:

┌───┬──────────────────────────────────────────────────┐
│*  │0 0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1  │
├───┼──────────────────────────────────────────────────┤
│0  │0 0    0    0    0    0    0    0    0    0    0  │
│0.1│0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1│
│0.2│0 0.02 0.04 0.06 0.08 0.1  0.12 0.14 0.16 0.18 0.2│
│0.3│0 0.03 0.06 0.09 0.12 0.15 0.18 0.21 0.24 0.27 0.3│
│0.4│0 0.04 0.08 0.12 0.16 0.2  0.24 0.28 0.32 0.36 0.4│
│0.5│0 0.05 0.1  0.15 0.2  0.25 0.3  0.35 0.4  0.45 0.5│
│0.6│0 0.06 0.12 0.18 0.24 0.3  0.36 0.42 0.48 0.54 0.6│
│0.7│0 0.07 0.14 0.21 0.28 0.35 0.42 0.49 0.56 0.63 0.7│
│0.8│0 0.08 0.16 0.24 0.32 0.4  0.48 0.56 0.64 0.72 0.8│
│0.9│0 0.09 0.18 0.27 0.36 0.45 0.54 0.63 0.72 0.81 0.9│
│1  │0 0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1  │
└───┴──────────────────────────────────────────────────┘

┌───┬──────────────────────────────────────────────────┐
│or │0   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1│
├───┼──────────────────────────────────────────────────┤
│0  │0   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1│
│0.1│0.1 0.19 0.28 0.37 0.46 0.55 0.64 0.73 0.82 0.91 1│
│0.2│0.2 0.28 0.36 0.44 0.52 0.6  0.68 0.76 0.84 0.92 1│
│0.3│0.3 0.37 0.44 0.51 0.58 0.65 0.72 0.79 0.86 0.93 1│
│0.4│0.4 0.46 0.52 0.58 0.64 0.7  0.76 0.82 0.88 0.94 1│
│0.5│0.5 0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95 1│
│0.6│0.6 0.64 0.68 0.72 0.76 0.8  0.84 0.88 0.92 0.96 1│
│0.7│0.7 0.73 0.76 0.79 0.82 0.85 0.88 0.91 0.94 0.97 1│
│0.8│0.8 0.82 0.84 0.86 0.88 0.9  0.92 0.94 0.96 0.98 1│
│0.9│0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1│
│1  │1   1    1    1    1    1    1    1    1    1    1│
└───┴──────────────────────────────────────────────────┘

So why the mixup?

In part: people are lazy. I know I'm pretty lazy about this issue, and sometimes careless of how I research it. But I'm certainly not alone. People mostly just don't care. And *that* says a lot about the value of education.

Education has a lot to do with satisfying the teacher, and not so much about historical accuracy. People like to refer to names that most people don't bother looking up (who has the time?). And of course, this suggests that educational literature is rife with mistakes - most of it is crud (but not all).

Still, there's the issue of naming. If "boolean" is the wrong name for the concept people are getting at when they talk about distinguishing between "booleans and integers" what are some good names?

One obvious name might be "logical". That's closer to accurate than "boolean". And that leads to "true" and "false" being satisfying names for variables of this type. In the above kernel style guide excerpt, I think it would be good to replace "boolean" with "logical value".

Another obvious name might be "bit". And a variable of type "bit" really ought to be able to represent the values 0 and 1.

But that won't stop fans of "proof by vigorous assertion", will it? For them, historical accuracy and utility take back seat to vehemence. It's actually kind of amusing to watch.




Resume again

I don't really need more work. But it's addictive. So I've been thinking about that alot.

Here's the position writeup for Gannett - they're looking for someone to replace me. It's nice to feel missed, in a sort of sad way. (I miss my friends there, and more than occasionally regret deciding that it was time for me to move on.)

Anyways, here's their writeup for my old position:



Principal Developer, Development & Integrations

The principal developer oversees the development and implementation of code for highly complex software applications.  They have high-impact responsibilities within the team and are typically assigned to the most critical and strategic / complex / high-risk undertakings. A principal is responsible for architecting the systems that make up the framework the rest of the organization works with and builds upon.

A principal must ensure the integrity and durability of the code base by both protecting it from substandard changes and by designing defensively to mitigate user error and abuse. As a system designer, a principal must also educate and mentor the team that makes use of these systems to ensure adoption and also to identify gaps in the system. The principal is well versed on the latest advancements in his field and regularly evaluates new trends and technologies to improve the framework.

Responsibilities:
Define and enforce overarching architecture strategies
Identify high-level architectural issues and document/develop solutions
Review and merge high-risk changes
Develop solutions to mitigate user error (eg. automation, validation, etc.)
Mentor and develop best practices for developers within the team
Regularly ideate and develop new high-impact innovation projects
Provide team training on architecture, strategy, and technology
Define best practices and guidelines for lower level developers
Work with other disciplines to help shape new products from a technical perspective

 Required Technical Skills
Bachelor’s degree in Computer Science, Computer Engineering or related technical discipline. Masters degree a plus.
9+ total years of software development and architecture experience with at least 5+ years designing and architecting enterprise web applications.
Experience working directly with senior Development and IT groups in an advisory role.
Strong object-oriented design and development skills.
Expertise in multiple technologies (JavaScript, jQuery, Backbone.js, Ajax, HTML5, Django, Python, etc.) across software engineering, security, data management, etc.
Experience in integrating with 3rd party REST API’s.
Experience with Agile/Lean methodologies, continuous integration, GitHub, Git, Test Driven Development.


Sunday, April 6

Upstart

I had an occasion to deal with "upstart" last week.

Upstart is a project which replaces "init". The design intent is pretty obvious: allow independent packages to have components which run at system boot. The expressed intent is somewhat different: add new "scanning" functionality to init (https://en.wikipedia.org/wiki/Upstart#Rationale).

Wait, what?

In other words, upstart has reduced functionality from sysv init. It no longer supports /etc/inittab and that's probably a good thing. Simplicity is a virtue. But documentation is hard to find and with all of its modules and added features (it also has added features) it can take some study before you can do anything useful with it.

A typical use for /etc/inittab was: you add a line specifying some program to run, and init runs it for you. If the program exits, init will restart it for you. Nice and simple.

For example you might add a line like this:

work:23:respawn:/bin/su username -c /home/username/bin/dowork

This would run /home/username/bin/dowork as the user 'username'. There's more details here, but that's the gist of it. And if the program exits, init would start it up again (if you read the man page for inittab you can see all sorts of details about this kind of thing).

Well, with upstart, there's more you have to do.

First, you have to create a configuration file such as /etc/init/work.conf

For this example that file might contain:
respawn
exec /bin/su username -c /home/username/bin/dowork

And then you also need to somehow get this started at machine boot. One possibility is to add a line to /etc/rc.local like:

service work start

Another possibility involves creating /etc/init.d/work (which might be the same) and then symbolically linking it from /etc/rc2.d/S99work and so on... You can get fancy here and create start/stop/restart commands. You probably do not want to get into this kind of complexity unless you are building packages for other people to use.

And, of course, there's lots more possibilities.

But upstart currently is not documented very well, in my opinion. Specifically, there is nothing in the manual page for upstart which even hints at the need to run the service program at boot time to achieve what init does automatically. Instead, the documentation reads like it would start the service automatically when a conf file was placed in /etc/init/. (And that might have been true for some versions of upstart - but it's not true for the version I'm using in ubuntu version 13.)

And this is something that both the free software communities and the professional software communities need to get better at: documenting things properly. And maybe this is something colleges need to do a better job of teaching.

It's really hard, as a developer, to think from the viewpoint of the user. And it's really easy for the user to not connect with the developer (note that I've not connected with any upstart developers about this issue - at least, not yet). So this is a difficult issue to think about, sometimes.