Bug in ClojureScript 'rem' with JavaScript numbers

(The ClojureScript version used in this post was "1.11.60")

TL;DR:

> (rem num 1000) ;; should be between -999 and 999 for all integers
6.1897001964269014E26

I wanted to create a simple Reagent toy example where a user can input a number and the number is displayed in words. A straightforward implementation can use the already existing cl-format from cljs.pprint:

> (cljs.pprint/cl-format false "~R" 123456)
"one hundred twenty-three thousand, four hundred fifty-six"

JavaScript represents numbers as double-precision floats, so it was clear from the beginning that this would not work properly for large numbers. For example:

> (cljs.pprint/cl-format false "~R" 44444444444444444444444444444444444444)
"forty-four undecillion, four hundred forty-four decillion, four hundred forty-four nonillion, four hundred forty-four octillion, four hundred forty-four septillion, four hundred thirty-two sextillion"

The input consists only of 4s, yet the output ends with "thirty-two sextillion": a clear loss of precision.
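To make the precision limit concrete, here is a minimal REPL sketch. It relies only on the standard JavaScript constant Number.MAX_SAFE_INTEGER (2^53 - 1) and on Number.isSafeInteger; the name max-safe is introduced here purely for illustration.

```clojure
;; Largest integer n for which n and n+1 are both exactly representable.
(def max-safe js/Number.MAX_SAFE_INTEGER)

;; Beyond that point neighbouring integers collapse into the same double.
(= (+ max-safe 1) (+ max-safe 2)) ;; => true

;; The small input from above is exact, the 38-digit one is not.
(js/Number.isSafeInteger 123456) ;; => true
(js/Number.isSafeInteger 44444444444444444444444444444444444444) ;; => false
```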

What surprised me was that, while I was playing around with the Reagent example, the whole thing suddenly broke. The textual representation was no longer getting updated, and there was an error message in the console:

Uncaught (in promise) Error: No item 6.189700196426902e+24 in vector of length 20
...

So what's going on? Trying to replicate the issue, I found an example input value:

;; we'll use this definition in some of the later REPL examples as well:
(def num 5555555555555555555555555555555555555555555)

(cljs.pprint/cl-format false "~R" num)
Execution error (Error) at (<cljs repl>:1).
No item 6.189700196426902e+24 in vector of length 20
:repl/exception!

The stack trace contains a line

at cljs$pprint$format_simple_cardinal (pprint.cljs:1187:33)

which seemed like a good place to start digging around the source code (https://github.com/clojure/clojurescript/blob/39709c9614d37b9a3dd398be8fed83cd3dda534b/src/main/cljs/cljs/pprint.cljs#L1187):

      (if (pos? hundreds) (str (nth english-cardinal-units hundreds) " hundred"))

Since the error was "No item 6.189700196426902e+24 in vector of length 20", the (nth english-cardinal-units hundreds) is the likely culprit - hundreds is defined as

(defn- format-simple-cardinal
  "Convert a number less than 1000 to a cardinal english string"
  [num]
  (let [hundreds (quot num 100)
        ,,,]

Since the function formats a number less than 1000, 'hundreds' should always be between 0 and 9, but judging by the error message it is about 6.19e24 here. Let's take a look at where this function is called. In format-cardinal-english we find
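The invariant and its violation can be sketched at the REPL. The vector units below is a hypothetical 20-element stand-in (the real english-cardinal-units is private to cljs.pprint), and lookup is a helper made up for this example.

```clojure
;; Hypothetical stand-in for a 20-element lookup table.
(def units (vec (range 20)))

;; For any input below 1000 the hundreds digit is a valid index.
(quot 555 100) ;; => 5
(quot 999 100) ;; => 9

;; Helper for this example: return the item, or the error message if nth throws.
(defn lookup [i]
  (try
    (nth units i)
    (catch :default e
      (.-message e))))

(lookup (quot 555 100))                   ;; => 5
(lookup (quot 6.1897001964269014E26 100)) ;; => an error message string
```

The last call feeds in the value that rem produced in this post, which is how an index far outside the vector ends up in nth.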

      (let [,,,
            parts (remainders 1000 abs-arg)]
        (if (<= (count parts) (count english-scale-numbers))
          (let [parts-strs (map format-simple-cardinal parts)
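To see how the parts feed into format-simple-cardinal, here is a rough sketch of splitting a number into base-1000 groups. This is not the actual source of remainders; it assumes the groups are peeled off with rem and quot, and the name groups-of-1000 is made up for this example.

```clojure
(defn groups-of-1000
  "Split a non-negative integer into base-1000 groups, most significant first."
  [n]
  (loop [n   n
         acc ()]
    (if (pos? n)
      ;; conj on a list prepends, so the most significant group ends up first
      (recur (quot n 1000) (conj acc (rem n 1000)))
      acc)))

(groups-of-1000 123456)  ;; => (123 456)
(groups-of-1000 1000005) ;; => (1 0 5)
```

Every group is the result of a rem by 1000, so each one is expected to be below 1000 before it is formatted.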

Let's see if there's anything weird going on with 'remainders' (note: the #'ns/fn syntax can be used to access private functions from other namespaces):

> (#'cljs.pprint/remainders 1000 num)
(5 555 555 555 555 554 0 0 0 0 0 0 0 0 6.1897001964269014E26)

The final value in the list looks familiar. Since 'remainders' is using 'rem' let's just quickly try:

> (rem num 1000)
6.1897001964269014E26

There we go. rem, i.e. remainder, "Returns the remainder of dividing numerator n by denominator d." Ignoring the precision issues, the remainder should be 555. At the very least, its absolute value should never be greater than or equal to 1000, the denominator. Here we're getting a massive number instead.
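The expectation can be written down as a small check. The predicate bounded? is a name introduced for this sketch; the last result is what the ClojureScript version used in this post produces.

```clojure
;; For small inputs rem behaves as expected; the result takes the sign of the numerator.
(rem 5555 1000)  ;; => 555
(rem -5555 1000) ;; => -555

;; The magnitude of a remainder should always be smaller than that of the denominator.
(defn bounded? [n d]
  (< (js/Math.abs (rem n d)) (js/Math.abs d)))

(bounded? 5555 1000)                                        ;; => true
(bounded? 5555555555555555555555555555555555555555555 1000) ;; => false
```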

The source code of rem is

(defn rem
  [n d]
  (let [q (quot n d)]
    (- n (* d q))))

We can see that the numerator is divided by the denominator (keeping only the integer quotient), and the quotient is then multiplied by the denominator again. The product is subtracted from the numerator, which mathematically should give the remainder. With floating-point numbers, however, both the division and the multiplication are rounded, so the product can differ from the numerator by far more than the denominator. To further pinpoint the issue:

> (let [n 5555555555555555555555555555555555555555555
        d 1000
        q (quot n d)
        multiplied (* d q)]
    [n multiplied (- n multiplied)])
[5.5555555555555556E42
 5.555555555555555E42
 6.1897001964269014E26]

An alternative example:

> 5555555555555555555555555555555555555555555
5.5555555555555556E42
> 5555555555555555555555555555555555555555
5.555555555555555E39
> (* *1 1000)
5.555555555555555E42
> (- 5555555555555555555555555555555555555555555 *1)
6.1897001964269014E26

Here we can see that the original number and the multiplied one differ slightly in their representation (5.[snip]56E42 vs 5.[snip]5E42), which leads to an error of about 6e26 in the subtraction.
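One way to read the 6e26: both values are doubles between 2^141 and 2^142, where adjacent doubles are 2^89 (roughly 6.19e26) apart, so the two operands are direct neighbours. The following sketch checks this; n, q, and ulp are names chosen for the example.

```clojure
(def n 5555555555555555555555555555555555555555555)
(def q (quot n 1000))

;; 2^89, built by repeated doubling so that every intermediate value is exact.
(def ulp (reduce * (repeat 89 2)))

;; n lies between 2^141 and 2^142, where neighbouring doubles are 2^89 apart.
(< (reduce * (repeat 141 2)) n (reduce * (repeat 142 2))) ;; => true

;; The "remainder" is exactly one such step.
(= ulp (- n (* 1000 q))) ;; => true
```

In other words, rounding in the division and multiplication moved the product by one representable step, and at this magnitude a single step is already far larger than 1000.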

I did some searching and found a report from 2015 titled "quot and rem are inefficient". The proposed solution (rem defined as js-mod) seems to fix this problem:

(with-redefs [rem js-mod]
  (cljs.pprint/cl-format false "~R" 5555555555555555555555555555555555555555555))
"five tredecillion, five hundred fifty-five duodecillion, five hundred fifty-five undecillion, five hundred fifty-five decillion, five hundred fifty-five nonillion, five hundred fifty-four octillion, three hundred four septillion, four hundred eighty-eight sextillion, nine hundred seventy-six quintillion, seven hundred eighty-four quadrillion, four hundred sixteen trillion, eight hundred seventy-two billion, nine hundred seventy-six million, three hundred twenty thousand, four hundred eighty"

It still has the aforementioned precision issues, but at least it doesn't outright fail.
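A quick sanity check that js-mod is a plausible stand-in for rem: it maps to JavaScript's % operator, whose result takes the sign of the dividend, just like rem. The comparison below is only a spot check on a few values, not a proof of equivalence.

```clojure
;; Same results as rem for small inputs, including negative numerators.
(js-mod 5555 1000)  ;; => 555
(js-mod -5555 1000) ;; => -555
(= (rem -5555 1000) (js-mod -5555 1000)) ;; => true

;; For the large input the result stays within the expected bounds.
(< -1000 (js-mod 5555555555555555555555555555555555555555555 1000) 1000) ;; => true
```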

Another solution would be to simply replace rem with js-mod permanently:

(set! rem js-mod)