Computing Correct Results #115

@waldemarhorwat

In this issue I'll explore the diverse sources of the rounding problems in unit conversions as well as how to fix them permanently. It's a bit of a long post, so I'll focus on the conversions that don't have an offset. There exist only two units that have offsets in the CLDR table (Celsius and Fahrenheit), and handling them won't be too difficult once we settle on the correct approach for units without an offset.

I hope that at some future time the language will include some kind of Decimal type. When it does, Amount should be compatible with it; it would be bad if adding Decimal later caused breaking changes to Amount. To ensure that, I'll sketch an Amount design that includes Decimal inputs (in addition to Number and BigInt), and we can then decide which parts to omit because we don't want to implement them today.

For clarity:

  • I will suffix Numbers by 𝔽 and Decimals by 𝔻. Real numbers will have no suffix.
  • I'm skipping the conversion to exponential notation for Strings.

State Today

Let's look at a few conversions in today's implementation using Number inputs and outputs. Let's start with integral inputs that are representable exactly as Numbers:

| Input | Unit In | Unit Out | Conv Factor | Output |
|---|---|---|---|---|
| 3𝔽 | m | mm | 1000 | 3000𝔽 |
| 5𝔽 | in | cm | 127/50 | 12.7𝔽 |
| 84𝔽 | in | ft | 1/12 | 7𝔽 |
| 5𝔽 | g | tonnes | 1/1000000 | 0.0000049999999999999996𝔽 |
| 825𝔽 | g | kg | 1/1000 | 0.8250000000000001𝔽 |

The incorrect outputs in the last two rows are due to multiple rounding. As I'll show later, doing the conversion more precisely will fix these incorrect results.
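Here is a minimal JavaScript sketch of how such multiple rounding can arise. It assumes, purely for illustration, that the factor 1/1000 is first rounded to a Number and then multiplied; this is a guess at the mechanism, not a description of the actual implementation.

```javascript
// Assumption for illustration: the conversion factor is stored as a Number.
const factor = 1 / 1000;            // first rounding: 𝔽(1/1000) is not exactly 0.001
const twoRoundings = 825 * factor;  // second rounding, applied to an already inexact factor
const oneRounding = 825 / 1000;     // a single correctly rounded division

console.log(twoRoundings); // 0.8250000000000001
console.log(oneRounding);  // 0.825
```

The single division rounds only once, which is why it lands on the Number closest to 0.825.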

Next let's take a look at some non-integral Number inputs:

| Input | MV(Input) | Unit In | Unit Out | Conv Factor | Output |
|---|---|---|---|---|---|
| 0.003𝔽 | 0.0030000000 0000000006 2450045135 1650553988 2928133010 8642578125 | m | mm | 1000 | 3𝔽 |
| 0.3𝔽 | 0.2999999999 9999998889 7769753748 4345957636 8331909179 6875 | yard | ft | 3 | 0.8999999999999999𝔽 |
| 0.352𝔽 | 0.3519999999 9999997957 1896346897 1196562051 7730712890 625 | m | cm | 100 | 35.199999999999996𝔽 |

Here the outputs are actually correct for Number arithmetic:

  • 𝔽(0.299999999999999988897769753748434595763683319091796875 × 3) = 0.8999999999999999𝔽
  • 𝔽(0.35199999999999997957189634689711965620517730712890625 × 100) = 35.199999999999996𝔽

Since in these examples the results are mathematical values correctly rounded to the nearest Number, that's the best we can do.
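A short JavaScript check of the two bullets above, as a sketch: the inexactness is already present in the input Number, and the multiplication then rounds the exact product correctly.

```javascript
// The decimal literal 0.3 is already rounded when it becomes a Number.
const mvPrefix = (0.3).toFixed(20);
console.log(mvPrefix); // 0.29999999999999998890

const feet = 0.3 * 3;             // yard -> ft
const centimeters = 0.352 * 100;  // m -> cm
console.log(feet);        // 0.8999999999999999
console.log(centimeters); // 35.199999999999996
```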

Desired State

Now let's imagine that we also have a Decimal type that's supported by Amount. What should the results look like for the above examples?

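| Input | MV(Input) | Unit In | Unit Out | Conv Factor | Output |
|---|---|---|---|---|---|
| 3𝔻 | 3 | m | mm | 1000 | 3000𝔻 |
| 5𝔻 | 5 | in | cm | 127/50 | 12.7𝔻 |
| 84𝔻 | 84 | in | ft | 1/12 | 7𝔻 |
| 5𝔻 | 5 | g | tonnes | 1/1000000 | 0.000005𝔻 |
| 825𝔻 | 825 | g | kg | 1/1000 | 0.825𝔻 |
| 0.003𝔻 | 0.003 | m | mm | 1000 | 3𝔻 |
| 0.3𝔻 | 0.3 | yard | ft | 3 | 0.9𝔻 |
| 0.352𝔻 | 0.352 | m | cm | 100 | 35.2𝔻 |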

What about Strings? The MV of a String is well-defined, so the results should be similar to the Decimal results:

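| Input | MV(Input) | Unit In | Unit Out | Conv Factor | Output |
|---|---|---|---|---|---|
| "3" | 3 | m | mm | 1000 | "3000" |
| "5" | 5 | in | cm | 127/50 | "12.7" |
| "84" | 84 | in | ft | 1/12 | "7" |
| "5" | 5 | g | tonnes | 1/1000000 | "0.000005" |
| "825" | 825 | g | kg | 1/1000 | "0.825" |
| "0.003" | 0.003 | m | mm | 1000 | "3" |
| "0.3" | 0.3 | yard | ft | 3 | "0.9" |
| "0.352" | 0.352 | m | cm | 100 | "35.2" |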

Amount also allows rounding before conversion. Let's say that one does:

let a = new Amount(0.3515𝔻, { unit: "meter", fractionDigits: 3, roundingMode: "halfEven" });
let b = a.convertTo({ unit: "cm" });

Here the correct answer would be for b's value to be "35.2" because 0.3515𝔻 rounded to 3 fraction digits in halfEven mode is 0.352. It would be strange indeed (and incorrect) to do Number arithmetic here and produce "35.199999999999996".
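To make the halfEven step concrete, here is a rough JavaScript sketch that rounds a plain decimal string exactly, using BigInt. The name `roundHalfEven` and the accepted input syntax are assumptions for illustration only; they are not part of Amount.

```javascript
// Hypothetical helper: round a plain decimal string to `fractionDigits`
// places using round-half-to-even, without going through Number.
function roundHalfEven(str, fractionDigits) {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(str);
  if (!match) throw new RangeError(`unsupported decimal string: ${str}`);
  const [, sign, intPart, frac = ""] = match;
  if (frac.length <= fractionDigits) return str; // nothing to round away

  const scale = 10n ** BigInt(frac.length - fractionDigits);
  const m = BigInt(intPart + frac);   // all digits as one integer
  let q = m / scale;                  // digits that are kept
  const twiceRem = 2n * (m % scale);  // compare the dropped part against one half
  if (twiceRem > scale || (twiceRem === scale && q % 2n === 1n)) q += 1n;

  const text = q.toString().padStart(fractionDigits + 1, "0");
  const cut = text.length - fractionDigits;
  return sign + text.slice(0, cut) + (fractionDigits > 0 ? "." + text.slice(cut) : "");
}

console.log(roundHalfEven("0.3515", 3)); // 0.352 (tie, 351 is odd, so round up)
console.log(roundHalfEven("0.3525", 3)); // 0.352 (tie, 352 is even, so keep)
```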

Implementation

Implementing this so that it works well on everyday use cases is quite easy. Here's how I'd do it:

  • Every unit in the CLDR table has a base unit conversion factor $f$ that is some rational number (the CLDR even expresses π as a rational number). Represent $f$ as a lowest-terms ratio of two positive integers $f = p/q$ where $p$ and $q$ have no common factors. These are CLDR constants so these would be precomputed and stored in a table, not calculated at run time.
  • To convert a value $a$ whose current unit has conversion factor $f = p/q$ to a new value $b$ whose unit has conversion factor $g = r/s$, we need to compute $b = a \frac{f}{g} = a \frac{ps}{qr}$.
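The combined ratio $\frac{ps}{qr}$ from the two bullets above can be sketched in JavaScript with BigInt. The function names are hypothetical, and the inch, foot and centimeter factors (relative to the meter) are typed in by hand for illustration rather than read from the CLDR table.

```javascript
// Greatest common divisor of two positive BigInts (Euclid's algorithm).
function gcd(x, y) {
  while (y !== 0n) [x, y] = [y, x % y];
  return x;
}

// Hypothetical helper: given f = p/q and g = r/s, return f/g = ps/(qr)
// in lowest terms as [numerator, denominator].
function conversionRatio([p, q], [r, s]) {
  const num = p * s;
  const den = q * r;
  const g = gcd(num, den);
  return [num / g, den / g];
}

// Illustrative factors relative to the meter (assumed, not loaded from CLDR).
const INCH = [127n, 5000n];  // 0.0254 m
const FOOT = [381n, 1250n];  // 0.3048 m
const CM = [1n, 100n];       // 0.01 m

console.log(conversionRatio(INCH, FOOT)); // [ 1n, 12n ]
console.log(conversionRatio(INCH, CM));   // [ 127n, 50n ]
```

These match the in→ft and in→cm factors used in the tables.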

Number

If the input $a$ is a Number:

  • Let $num = 𝔽(ps)$ and $den = 𝔽(qr)$. These are integers, so no rounding will take place unless the conversion factors are very large, well beyond typical everyday usage. For example, a zettameter ($10^{21}$ m, which is over 100,000 light-years) can be represented exactly as a Number, even though it's much larger than $2^{53}$; a yottameter ($10^{24}$ m) would get rounded.
  • Let $b = 𝔽(a × num / den)$. To calculate this, do, for example (using C/C++ syntax):
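double hi = a * num;                    // rounded product
double err = fma(a, num, -hi);          // rounding error of that product
double res = hi / den;                  // first approximation of the quotient
double rem = fma(res, -den, hi) + err;  // remainder hi - res*den, plus the product's error
double b = res + (rem / den);           // apply the correction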

If an intermediate overflow to ±∞ or NaN happens, revert to the simpler:

double ratio = num/den;
double b = a * ratio;
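JavaScript has no built-in fma, so a script-level experiment with the same idea needs a substitute. The sketch below swaps in Dekker's error-free product (with Veltkamp splitting) wherever the C code uses fma; this is a different technique from the one above, and the function names are made up for illustration.

```javascript
const SPLITTER = 134217729; // 2^27 + 1, used by Veltkamp splitting

// Split x into high and low parts with x === hi + lo exactly.
function split(x) {
  const c = SPLITTER * x;
  const hi = c - (c - x);
  return [hi, x - hi];
}

// Dekker's product: returns [p, err] where p = 𝔽(x*y) and p + err is the exact
// product (barring overflow and underflow).
function twoProduct(x, y) {
  const p = x * y;
  const [xh, xl] = split(x);
  const [yh, yl] = split(y);
  const err = ((xh * yh - p) + xh * yl + xl * yh) + xl * yl;
  return [p, err];
}

// Hypothetical analogue of the C snippet: compute a * num / den with correction terms.
function convertNumber(a, num, den) {
  const [hi, err] = twoProduct(a, num);
  const res = hi / den;
  const [q, qErr] = twoProduct(res, den);
  const rem = ((hi - q) - qErr) + err;   // hi - res*den, plus the product's error
  const b = res + rem / den;
  // Fall back to the simpler formula if anything overflowed or became NaN.
  return Number.isFinite(b) ? b : a * (num / den);
}

console.log(convertNumber(825, 1, 1000));  // 0.825
console.log(convertNumber(5, 1, 1000000)); // 0.000005
```

Splitting can overflow for very large magnitudes; in that case the result is not finite and the sketch takes the fallback path.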

String

If the input $a$ is a String or BigInt, getting correct results is also quite easy, using integers or existing BigInt operations. All variables except $a$ in this section are integers or BigInts.

  • Express $a = m × 10^e$, where $m$ is an integer or BigInt.
  • Let $n = m × p × s$
  • Let $den = q × r$
  • Let $res$ be the integer arithmetic quotient $n / den$ truncated to an integer and let $rem$ be the remainder.
  • If $rem = 0$, then $res$ is exact; otherwise:
    • Generate as many decimal fraction places of the result as we want by repeatedly multiplying $rem$ by a power of 10 and doing the division again. For example, to generate the next 20 decimal places at once:
      • Let $n_2 = rem × 10^{20}$.
      • Let $res_2$ be the integer arithmetic quotient $n_2 / den$ truncated to an integer and let $rem_2$ be the remainder.
      • The result now is $res + res_2 × 10^{-20}$. If more fraction digits are desired and $rem_2 ≠ 0$, keep doing this.
  • Multiply the result by $10^e$; this is just an exponent shift.
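The steps above can be sketched with BigInt as follows. This is a rough illustration under stated assumptions: the function names are hypothetical, the unit factors are typed in by hand (gram and tonne relative to the kilogram, the others relative to the meter), digits are generated one at a time rather than 20 at once, and a non-terminating result is truncated rather than rounded.

```javascript
// Parse a plain decimal string into { negative, m, e } with |value| = m × 10^e.
function parseDecimal(str) {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(str);
  if (!match) throw new RangeError(`unsupported decimal string: ${str}`);
  const [, sign, intPart, frac = ""] = match;
  return { negative: sign === "-", m: BigInt(intPart + frac), e: -frac.length };
}

// Format digits × 10^exp as a plain decimal string without trailing zeros.
function formatScaled(negative, digits, exp) {
  if (digits === 0n) return "0";
  let text = digits.toString();
  if (exp >= 0) {
    text += "0".repeat(exp);
  } else {
    text = text.padStart(-exp + 1, "0");
    const cut = text.length + exp;
    const frac = text.slice(cut).replace(/0+$/, "");
    text = text.slice(0, cut) + (frac ? "." + frac : "");
  }
  return (negative ? "-" : "") + text;
}

// Convert decimal string `a` from a unit with factor p/q to one with factor r/s.
function convertDecimalString(a, [p, q], [r, s], maxFractionDigits = 20) {
  const { negative, m, e } = parseDecimal(a);
  const n = m * p * s;
  const den = q * r;
  let res = n / den;   // truncated integer quotient
  let rem = n % den;
  let generated = 0;   // number of extra decimal places produced so far
  while (rem !== 0n && generated < maxFractionDigits) {
    rem *= 10n;
    res = res * 10n + rem / den;
    rem %= den;
    generated++;
  }
  return formatScaled(negative, res, e - generated); // exponent shift by 10^e
}

// Illustrative factors (assumed, not loaded from CLDR).
const GRAM = [1n, 1000n], KILOGRAM = [1n, 1n], TONNE = [1000n, 1n];
const METER = [1n, 1n], CM = [1n, 100n], INCH = [127n, 5000n];

console.log(convertDecimalString("825", GRAM, KILOGRAM)); // 0.825
console.log(convertDecimalString("5", GRAM, TONNE));      // 0.000005
console.log(convertDecimalString("0.352", METER, CM));    // 35.2
console.log(convertDecimalString("5", INCH, CM));         // 12.7
```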

At this point we have a choice to make. The above will produce the correct results for the String cases using nothing more than existing BigInt functionality already built into the language. However, there is also a desire to not implement such arithmetic. So we have a tradeoff:

  • Do the simple calculations above. There is already precedent for such computations in the Intl number formatting code.
  • Do the calculations on Numbers only. If the user tries to do them on a String, that's an error; if the user gives us a String but really wants to use the faulty Number conversion arithmetic, they should first convert the String amount to a Number amount.
  • Do the calculations incorrectly on Strings. Producing incorrect results is hostile to users and to future compatibility with Decimal, requiring either flags or breaking changes, so I would not be in favor of this alternative.
