Rational

2011-11-16 : Some collected thoughts on creating a Rational numeric type.

In finance, we are always dealing with quantities of money, which usually have some number of digits after the decimal point, usually two or four. It also used to be common to see prices given in eighths. Unfortunately, rather than using a representation that can represent the numbers with complete accuracy, standard floating point numbers are usually used. This often leads to rounding issues.

Now, I'm not one of those people who is paranoid about floating point numbers and believes that every comparison also needs to include an accuracy epsilon. However, I do understand that standard floating point are numbers stored as fractions with a denomitor that is 2n. Therefore, even a simple number like 0.1 (1/10) must be stored as an approximation (838861/8388608, which has been rounded up by 0.2/8388608).

It would be nice if we could represent simple useful numbers lime 1/10, 1/8, and even 1/3 without rounding. These are all rational numbers.

Standard IEEE 754 encoding of floating point numbers basically uses bit fields within a standard 32-bit int / 64-bit long / etc. They encode the numerator (mantissa or more properly significand) and denominator (derived from the exponent) for a base 2 fraction in a single integer. It would be interesting to come up with an encoding that allowed arbitrary fractions. Obviously, there would be some trade-offs compared to floating point, but how good could you make it?

(Hand written notes: 2008-04-22 2008-04-24)

Here are some features that would be nice to have:

• It would be nice if casting from int to long worked correctly.
• Put control bits in the LSB.
• Put numerator in the MSB, signed / two's complement.
• (int)0 == (rational)0
• control == 0 → denom == 1 → x2 / << 1 converts integer to rational
• The denominator need never be negative, and in fact, need never be 0 (unless that is used to represent NaN).

Some bit patterns are redundant, e.g., 2/4 == 1/2, 3/6 == 1/3.

• We could represent more numbers if we didn't allow redundant representations.
• Mathematical operations would be faster if we allowed redundant representations for intermediate values.
• It's also useful for representing significant figures.
• If we just disallow 2/4, then we're wasting bit patterns. However, assigning a unique rational to every bit pattern is a very difficult mapping (discussed later).
Encoding the Denominator

I've looked how to encode the denominator using a variable number of bits. (See 2008-04-22/pg2.) One problem with this approach is that supporting a large number of "number of bits" itself consumes valuable bits. In the worst case, you are reduced to base 1 encoding. To have enough bits to actually encode a large denominator, you have to pick a small number of denominator bit-widths. That leads to the question of which widths are most useful. Bit-widths that can encode "even" powers of ten are definitely useful, since many calculations are performed on decimals.

 n 2n floor(log10(2n)) 4 16 1 7 128 2 10 1024 3 14 16,384 4 17 131,072 5 20 1,048,576 6 24 16,777,216 7 27 134,217,728 8 30 1,073,741,784 9

Some interesting combinations for 32-bit ints are as follows. Here, the first group is the number of bits in the denominators. The second group shows in more detail the number of bits in the numerator, the denominator, and the control bits.

• 0/14 - (31n-0d-1c)/(17n-14d-1c)
• 0/7/14 - (31n-0d-1c)/(23n-7d-2c)/(16n-14d-2c)
• 0/10/20 - (31n-0d-1c)/(20n-10d-2c)/(10n-20d-2c)
• 7/14 - (24n-7d-1c)/(17n-14d-1c)
• 10/20 - (21n-10d-1c)/(11n-20d-1c)
• 16 - (16n-16d)
Having a 0-deniminator-bit mode basically allows you to reserve half your bit patterns for representing large integers. Initially, I thought this was important, but I currently think this is not necessary: you are unlikely to use a rational unless there is a high probability that the denominator will be greater than one.

For 64-bit longs, consider these patterns:

• 0/10/20/30 - (63n-0d-1c)/(52n-10d-2c)/(41n-20d-3c)/(31n-30d-3c)
• 0/10/20/30 - (62n-0d-2c)/(52n-10d-2c)/(42n-20d-2c)/(32n-30d-2c)
• 7/14/30 - (56n-7d-1c)/(48n-14d-2c)/(32n-30d-2c)
• 10/20/30 - (53n-10d-1c)/(42n-20d-2c)/(32n-30d-2c)
• 10/20/30/40 - (52n-10d-2c)/(42n-20d-2c)/(32n-30d-2c)/(22n-40d-2c)
• 32 - (32n-32d)
The 7/14/30 is interesting because the numerators are all multiples of 8 bits.

The 10/20 for ints and the 10/20/30/40 for longs are interesting because, for the largest denominator, the numerator is smaller than the denominator. I'm not sure if this is useful, since while it allows you to represent smaller magnitude fractions (large denominator), the number of bits of precision goes down (small numerator).

It would be useful to have a denominator encoding that's the same for ints and longs, since you can then just cast from int to long and get the same value with greater precision (just like with regular integers).

The two most favorable candidates are

• 7/14/30 - int: (24n-7d-1c)/(16n-14d-2c)/(0n-30d-2c) - long: (56n-7d-1c)/(48n-14d-2c)/(32n-30d-2c)
• 10/20/30 - int: (21n-10d-1c)/(10n-20d-2c)/(0n-30d-2c) - long: (53n-10d-1c)/(42n-20d-2c)/(32n-30d-2c)
Since it seems to makes sense to favor numerator bits over denominator bits, then 7/14/30 is the winner.
The Result

Source: `https://latenighthacking.com:8443/svn/code_root/2011/Rational`

C o m m e n t s :
(nothing yet)
Edit