I got a few comments in tweets, on my last post. One expressed interest in sparse array handling, another expressed interest in jtva2, and another expressed general interest in the J interpreter.
I am not sure how much time and interest I am going to have, but I decided to look closer at how my +/1 2 3 expression gets handled.
But, before I do that (next post, probably), I thought I would do a pass through jtva2() in va2.c:
First: jtva2() does not get used, at all, in the expression +/1 2 3. This is because +/ gets handled by special code. However, if I execute:
1+2+3
then jtva2 gets used (twice, in fact). The first time it runs, a is a noun for the number 2, and w is a noun for the number 3. The second time it runs, a is a noun for the number 1 and w is a noun for the number 5. Here's a breakdown of the names and their meanings, in the first time through:
an: 1 (the number of items in a)
ar: 0 (the rank of a)
as: pointer to shape of a
at: 4 (the jtype of a -- int)
wn, wr, ws and wt are the same (since the only difference between a and w here is the value)
id is '+'
zt: 4 (the jtype of z -- int)
var gets called, and in var:
t: 4 (intermediate result specifying both argument types)
j: 4 (index into list of dyadic handlers)
q: pointer to struct VA:
typedef struct {VA2 p2[13];VA2 pins[7];VA2 ppfx[7];VA2 psfx[7];} VA;
VA2 is
typedef struct {VF f;I cv;} VA2;
Here, f is a handler -- it's going to do the actual addition, multiplication, or whatever (once we have decided what type of operation we are going to perform), and cv represents the kinds of results we might be producing.
Here are the values for cv:
/* cv - control vector */
#define VBB (I)1 /* convert arguments to B */
#define VII (I)2 /* convert arguments to I */
#define VDD (I)4 /* convert arguments to D */
#define VZZ (I)8 /* convert arguments to Z */
#define VXX (I)16 /* convert arguments to XNUM */
#define VQQ (I)32 /* convert arguments to RAT */
#define VB (I)256 /* result type B */
#define VI (I)512 /* result type I */
#define VD (I)1024 /* result type D */
#define VZ (I)2048 /* result type Z */
#define VX (I)4096 /* result type XNUM */
#define VQ (I)8192 /* result type RAT */
#define VSB (I)16384 /* result type SBT */
#define VRD (I)65536 /* convert result to D if possible */
#define VRI (I)131072 /* convert result to I if possible */
#define VXEQ (I)262144 /* convert to XNUM for = ~: */
#define VXCF (I)524288 /* convert to XNUM ceiling/floor */
#define VXFC (I)1048576 /* convert to XNUM floor/ceiling */
Anyways, in var we are concerned with (q->p2)[j]
j is the result of a decision tree something like this:
If either argument is complex, j is 9
if either argument is rational or extended precision (or extended floating or extended complex -- types which I think are unimplemented -- I shall try to ignore unimplemented possibilities), then j is 8 if the other argument was floating point (C's double) otherwise j is 11 if either argument was rational and 10 (extended precision) for the remaining cases.
Otherwise, j depends on the type of both the left and right argument, it's the sum of:
left arg boolean: 0
left arg int: 3
left arg something else (probably double): 6
right arg boolean: 0
right arg int: 1
right arg something else (probably double): 2
In our case, both args are int, so j is 4 and (q->p2)[j] gives a VA2 struct with f: _plusII and cv: 512
that f gets passed back to jtva2 in its ado pointer, and that cv gets passed back to jtva2 in its cv pointer (pointers to both of these were the last two arguments to var()).
Note also that cv gets 4 added to it if we are combining floating point with extended precision (I think in all such cases for primitives, the result is floating point -- this is for both practical and performance reasons: if you are using floating point, you have already lost the extreme accuracy which is supported by extended precision and rational numbers, and of course floating point calculations are machine efficient.)
But that doesn't matter here, and that's all that var does. So, back to jtva2:
The first thing we do when we get back into jtva2 is inspect cv and figure out what kind of result we are going to try to produce. This corresponds to the description at http://jsoftware.com/help/dictionary/dictg.htm
Basically: if we can produce a bool, we do that. If we can produce an int (a C long), we do that. If we can produce a float (a C double), we do that, and so on...
This is the value for zt= rtype(cv);
Next, we look at whether either of our arguments need to be converted to the relevant type in t=atype(cv), and in this case t is 0 -- we do not have to convert either of our arguments for our desired result type.
So, anyways, we skip over the type coercion code (and if we were dealing with sparse arrays we would also skip over that, perhaps because the sparse array code already dealt with that?).
Then we skip over some code that only runs if our jt context has a rank specification, and in the else clause:
We check that our arguments have the same shape prefix -- length error if not.
We find the shape of the result (the longer of the left shape or the right shape -- they're the same in this case), and we define m and n. m is how long our shorter argument is, and n is how many types we need to repeat the shorter argument to make it the same length as the longer argument. In this case, both are 1 -- both arguments are the same length and each only contains one value.
Finally, we allocate space for our result. We need space for one integer with a rank of zero. Here's the macro for "Get A":
#define GA(v,t,n,r,s) RZ(v=ga(t,(I)(n),(I)(r),(I*)(s)))
RZ means return null if the result is false -- that's how errors are implemented in the J interpreter.
ga is a cover for jtga:
A jtga(J jt,I t,I n,I r,I*s){A z;I m,w;
if(t&BIT){const I c=8*SZI; /* bit type: pad last axis to fullword */
ASSERTSYS(1>=r||s,"ga bit array shape");
if(1>=r)w=(n+c-1)/c; else RE(w=mult(prod(r-1,s),(s[r-1]+c-1)/c));
w+=WP(INT,0L,r); m=SZI*w;
ASSERT( n>=0&&m>w&&w>0,EVLIMIT); /* beware integer overflow */
}else{
w=WP(t,n,r); m=SZI*w;
ASSERT(m>n&&n>=0&&m>w&&w>0,EVLIMIT); /* beware integer overflow */
}
RZ(z=ma(m));
if(!(t&DIRECT))memset(z,C0,m);
if(t&LAST0){I*v=(I*)z+w-2; *v++=0; *v=0;}
AC(z)=1; AN(z)=n; AR(z)=r; AFLAG(z)=0; AK(z)=AKX(z); AM(z)=msize[((MS*)z-1)->j]-(AK(z)+sizeof(MS));
AT(z)=0; tpush(z); AT(z)=t;
if(1==r&&!(t&SPARSE))*AS(z)=n; else if(r&&s)ICPY(AS(z),s,r); /* 1==n always if t&SPARSE */
R z;
}
basically, we have arguments specifying:
jt: interpreter context
t: data type
n: number of [data] elements to allocate
r: rank (number of [shape] elements to allocate)
*s: the shape
And then we find how big that is, call malloc (jtma() being our cover that will pull from our freelist of unused memory or try C's malloc()).
So, anyways, we get a blank slate to write our result into, bailing out if that fails, and then we execute this line:
if(1==nf) DO(mf, ado(jt,b,m,n,zv,av,wv); zv+=zk; av+=ak; wv+=wk;)
nf and mf are always 1 unless our execution context has a rank specification. And DO is a macro that builds a for loop for us:
#define DO(n,stm) {I i=0,_n=(n); for(;i<_n;i++){stm}}
In other words: two arguments, the first is how many times the second is run (the second argument is a C statement). So basically, in this case, we call our ado routine (which was defined as _plusII) one time.
When we are done, we check our execution context for an error. If the error smells like an overflow that we can handle, we recurse, calling jtva2 again with updated arguments (that we got from type coercion).
So.. that's jtva2 -- what does plusII look like?
Here's line 269 of ve.h, where it's declared:
extern ADECL2( plusII,I,I,I);
And here's line 8 of ve.h which defined ADECL2:
#define ADECL2 AHDR2
And here's line 124 of va.h, which defines AHDR2:
#define AHDR2(f,Tz,Tx,Ty) void f(J jt,B b,I m, I n,Tz*z,Tx*x,Ty*y)
And note that Tz, Tx and Ty are C type names, for the result, left argument and right argument.
So... an ADECL2 function gets these arguments:
jt: interpreter context
b: (only matters if n is 0): 1: x is a list, 0: y is a list
m: how many sequential values to generate
n: 1 if both arguments are lists of the same length
z: result
x: left argument
y: right argument
In other words:
1+2 3 4
b: 0
m: 3
n: 0
4 5 6*8
b: 1
m: 3
n: 0
9 10 11-12 13 14
m: 3
n: 1
1+1
m: 1
n: 1
And the actual definition of plusII is on line 18 of ve.c and looks like this:
AOVF( plusII, I,I,I, PLUSVV, PLUS1V, PLUSV1)
AOVF is defined in lines 147-152 of va.h, and looks like:
#define AOVF(f,Tz,Tx,Ty,fvv,f1v,fv1) \
AHDR2(f,I,I,I){C er=0;I u,v,*x1,*y1,*z1; \
if(1==n) {fvv(m,z,x,y); RER;} \
else if(b){z1=z; y1=y; DO(m, u=*x++; f1v(n,z,u,y); RER; z=z1+=n; y=y1+=n;);} \
else {z1=z; x1=x; DO(m, v=*y++; fv1(n,z,x,v); RER; z=z1+=n; x=x1+=n;);} \
}
And, I gets defined on either line 15 (64 bit) or line 22 (32 bit) of jtype.h. The 32 bit version looks like this:
typedef long I;
The 64 bit version looks like this:
typedef long long I;
So that's the C type that J uses to represent simple integer values.
Meanwhile, PLUSVV is defined on line 489 of vasm.h:
#define PLUSVV(m,z,x,y) {B p; DO(m, p=0>*x; *z=*x+*y; BOV(p==0>*y&&p!=0>*z); z++; x++; y++;);}
and PLUS1V is on line 493:
#define PLUS1V(n,z,u,y) {B p=0>u; DO(n, z[i]=u+y[i]; BOV(p==0>y[i]&&p!=0>z[i]););}
and PLUSV1 is on line 497:
#define PLUSV1(n,z,x,v) PLUS1V(n,z,v,x)
(obviously, these need to be macros, because they are being defined for a variety of data types.)
B is just a declaration for a bit value (from line 27 of jtype.h:
typedef char B;
BOV checks an expression for overflow:
#define BOV(exp) if(exp){er=EWOV; break;}
(and this error state is not displayed to the user -- it means try again, or report a domain error if that's not valid.)
And... I think that's about all I can say about 2+3 -- but this is getting long, so I'll save +/ 1 2 3 for another post... (And I should maybe do a couple more passes through jtva2() some time. Interesting issues, that probably deserve their own passes, include: non-trivial rank, type coercion, overflow, and sparse arrays.)
No comments:
Post a Comment