20040311 : A coworker of mine brought in a personality test and had everyone take it. The test identified four personalities labelled with the colors green, gold, orange, and blue. The test gave you a score for each color, and your dominant color identified your personality. Everyone shared scores and we enjoyed comparing people against the category they had fallen into. Dan Tull and I, both being analytical greens, were interested in trying to see the personality relationships across our whole group. How different are people? What does it mean when one person gets a dominant green score and someone else gets a higher green score and an even higher score in orange? Which people are the most different? (BTW, I'm not claiming that these mathematical comparisons are necessarily accurate in a psychological sense!)
What made things complicated is that there are four scores. You can't just plot them, even in a 3D graph. Dan suggested calculating the (Cartesian) distance between points, and I came up with a table showing all the pairwise distances, plus the group average and each person's distance from the group average. This was interesting, especially when I sorted the list of pairs by distance. You could see which people were the most similar, and which people were the big outliers. Still, we were curious why certain people were so far apart.
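The distance table described above can be sketched in a few lines of Python. This is only an illustration, not the original spreadsheet: the names `P1` to `P3` and their scores are made up (chosen so each adds up to 130), and the (green, gold, orange, blue) ordering is an assumption.

```python
import math
from itertools import combinations

# Hypothetical scores in (green, gold, orange, blue) order; not real test data.
scores = {
    "P1": (40, 30, 35, 25),
    "P2": (25, 45, 20, 40),
    "P3": (38, 32, 33, 27),
}

def distance(p, q):
    """Straight-line distance between two equal-length score tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Group average: the mean of each color across everyone.
n = len(scores)
average = tuple(sum(s[i] for s in scores.values()) / n for i in range(4))

# All pairwise distances, sorted so the most similar pair comes first.
pairs = sorted(
    (distance(scores[a], scores[b]), a, b) for a, b in combinations(scores, 2)
)
for d, a, b in pairs:
    print(f"{a}-{b}: {d:.2f}")

# Distance of each person from the group average.
for name, s in scores.items():
    print(f"{name} to average: {distance(s, average):.2f}")
```

Sorting the pair list puts the most similar people at the top and the biggest outliers at the bottom, which is the view that turned out to be interesting.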
Again, what we really wanted to do was plot them on a graph, but projections from 4D onto 3D are really hard to visualize. Understanding something as complex as a rotating hypercube is hard enough, let alone a set of random points. Even if we did it, we probably wouldn't glean anything from it.
I thought about it for a while, and it occurred to me that the points (the test scores) had to be on some sort of surface. You couldn't have just any arbitrary score. For each question, you had to give a distinct rank to each of the four answers. In fact, your final test score always added up to 130.
My first guess was that this meant that the points had to be coplanar. I figured I could use the dot product to determine the angle between the scores, and then plot the scores on a polar graph. This produced an interesting chart but it wasn't right. Some people who should have been far apart were close together. I figured I wasn't getting the signs right. Plus, I was doing the dot product against the unit vector (1,0,0,0) and it seemed very suspicious to me that only one coordinate was being used. I tried to use a cross product to at least fix the sign, and I spent quite a while looking that up on the net. I finally came to the conclusion that, while I'm not sure what all you can cross, you can't take the cross product of two 4D vectors (it's meaningless). So much for that.
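Here is a small Python sketch of why the angle against (1,0,0,0) is suspicious. The two score vectors are made up for illustration, and `angle_to_first_axis` is a hypothetical helper, not something from the original spreadsheet. The dot product with (1,0,0,0) just picks out the first coordinate, so two clearly different scores can land on the same angle.

```python
import math

def angle_to_first_axis(v):
    """Angle in degrees between v and the unit vector (1, 0, 0, 0)."""
    e1 = (1, 0, 0, 0)
    dot = sum(a * b for a, b in zip(v, e1))   # this is just v[0]
    norm = math.sqrt(sum(a * a for a in v))
    return math.degrees(math.acos(dot / norm))

# Two made-up scores: same first coordinate, other three shuffled.
p = (40, 30, 35, 25)
q = (40, 25, 30, 35)

print(angle_to_first_axis(p), angle_to_first_axis(q))  # same angle twice
print(math.dist(p, q))                                 # but the points differ
```

The angle only sees the first score and the overall length of the vector, so it throws away most of the information about how the other three scores are arranged.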
Thursday (20040311), I spent my time at the airport thinking about the problem. I started
with the lower dimensional problems and worked my way up to the 4D solution. I realized I was
looking at problems in this form: x+y=a (2D), x+y+z=a (3D), x+y+z+w=a (4D), etc. In each case,
the set of points which are valid solutions
is constrained. In the 2D case, the valid solution points are collinear. In the 3D case,
coplanar. In the 4D case, "cospacial". I have no idea if that's the right word, but I'll use it.
I got far less hyperdimensional geometry in my schooling than I would have liked (5 dimensional
wave physics diagrams aside). :) This is great news. Visualizing points in 3D is relatively easy.
It should be possible to produce an undistorted 3D plot that shows the relationships between
people's test scores. Now to figure out how.
Taking the 3D case for a moment, let's say you had a set of points that were solutions to
x+y+z=1, like A:(1,0,0), B:(0,1,0), and C:(0,0,1).
(Visually, I'm talking about the points that are one unit away from
the origin along each of the three axes. These three points define a plane which is kind of
lying sideways, propped up against the three axes.) You should be able to create some
sort of 2D graph that shows the relationships between the points. The points are each sqrt(2)
from each other. If you project the points onto the XY plane (by dropping the Z coordinate), you
get an unambiguous projection, but the relationships (distances) are distorted. The points A and B
are still sqrt(2) from each other, but point C is too close to the others. It looks like A is
closer to C than to B, but they are actually equidistant. This projection is not really what we
want. What we really want is a projection that is equivalent to that plane those three points are
lying in. Now, it's not clear where the origin is (translation) or which direction is up
(rotation), but no matter how you look at it, in that plane the three points form an equilateral
triangle, and that's what we're trying to capture and visualize. We need to extract that plane.
Similarly, in 2D you can get a good view by extracting the line. In 4D, we should be able to
extract that 3D subspace.
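A quick numeric check of the distortion described above, as a Python sketch. The helper names `distance` and `drop_z` are just for illustration. It compares the true 3D distances between A, B, and C with the distances after dropping the Z coordinate.

```python
import math

A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def distance(p, q):
    """Straight-line distance between two points of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def drop_z(p):
    """Project onto the XY plane by discarding the Z coordinate."""
    return p[:2]

named = {"A": A, "B": B, "C": C}
for n1, n2 in [("A", "B"), ("A", "C"), ("B", "C")]:
    p, q = named[n1], named[n2]
    true_d = distance(p, q)
    flat_d = distance(drop_z(p), drop_z(q))
    print(f"{n1}{n2}: true {true_d:.3f}, projected {flat_d:.3f}")
```

All three true distances are sqrt(2), but after the projection only AB keeps that length. AC and BC shrink to 1, which is why C looks too close to the others.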
How do we extract the plane? Well, we could find a basis (a set of axes) such that the X' and Y' axes lie in the target plane (and the Z' axis is normal to it). If we view our points in that coordinate system, they would all lie in the X'Y' plane. We could discard the Z' information because it would be the same for every point. How do we figure out that basis? Well, if we had some vectors that lie in the X'Y' plane, that would be a good start. How about the vector AB? That lies in the plane. We also have AC and BC. AC and AB are not perpendicular, so they can't be our basis as is. Since AB and AC are not parallel, they do define a plane, so there must be a way to extract a basis for that plane. Here's how. Take vector AB. Make it unit length, and define it to be X'. Now look at AC. If we remove from it all that is in the X' direction, what remains must be perpendicular to X'. Let that be the Y' direction. Ta da, our basis!
I'll say that again, in slightly more detail:
X'=AB/|AB|

Take AB. Unitize it (divide it by its length). That's X'. 
Take AC. Find the component of it in the X' direction.  
ac_{X}=AC dot X'

To do that, project it onto X': take the dot product. That gives you a length ac_{X}, which is a scalar. 
AC_{X}=ac_{X}*X'

Multiply by X' to convert it into a vector AC_{X} which is the component of AC in the X' direction. 
AC=AC_{X}+AC_{Y}
=> AC_{Y}=AC-AC_{X}

We can assume AC can be broken into two components AC_{X} and AC_{Y}. We can find AC_{Y} by subtracting AC_{X} from AC. 
Y'=AC_{Y}/|AC_{Y}|

AC_{Y} is by construction perpendicular to AC_{X} and hence X'. Let's make it our Y'. We just need to unitize it. 
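The steps above can be sketched in Python for the A, B, C example. This is a minimal illustration rather than the spreadsheet version: the helper names (`sub`, `dot`, `scale`, `unit`, `to_plane`) are hypothetical, and using A as the origin of the new coordinate system is an assumption, since any point in the plane would do.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def scale(k, p):
    return tuple(k * a for a in p)

def unit(p):
    """Divide a vector by its length."""
    return scale(1 / math.sqrt(dot(p, p)), p)

A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
AB, AC = sub(B, A), sub(C, A)

x_axis = unit(AB)            # X' = AB/|AB|
ac_x = dot(AC, x_axis)       # ac_{X} = AC dot X' (a scalar)
AC_X = scale(ac_x, x_axis)   # AC_{X} = ac_{X}*X'
AC_Y = sub(AC, AC_X)         # AC_{Y} = AC - AC_{X}
y_axis = unit(AC_Y)          # Y' = AC_{Y}/|AC_{Y}|

def to_plane(p, origin=A):
    """2D coordinates of p in the X'Y' basis, measured from origin (assumed: A)."""
    v = sub(p, origin)
    return (dot(v, x_axis), dot(v, y_axis))

for name, p in [("A", A), ("B", B), ("C", C)]:
    x, y = to_plane(p)
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

Plotting the three resulting 2D points gives a triangle whose sides are all sqrt(2), the same as the distances in 3D, so nothing is distorted. The same subtract-the-projection step should extend to a third basis vector in the 4D case, though that goes beyond what is worked out here.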
To Be Continued...
Screenshot of the final spreadsheet.
Louis K. Thomas <loui sth@hotm ail.co m>  Auth  20040313 (4792 days ago) 