#BecauseJavaScript Part 2 – String Comparisons

Welcome back!

I delved into Twitter again and found this little tweet by Nico Castro. This is a quick one, but it does need a fair bit of explanation to clarify things.

So what’s going on here? Why is lowercase ‘a’ greater than capital ‘Z’, while capital ‘A’ is not greater than capital ‘Z’?

Nico is on the right track: ASCII has a lot to do with it, but it’s a teeny tiny bit more complex than that… Let’s take another look at those two statements:

"a" > "Z" // true

"A" > "Z" // false

So, as I said, the character codes behind the strings come into play. For these letters, the ASCII values are the same as the UTF-16 code units that JavaScript actually compares.

Let’s replace the strings with their ASCII numeric equivalents. Once we do, we see something that instantly clicks:

97 > 90 // "a" > "Z" == true!

65 > 90 // "A" > "Z" == false!
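If you’d rather not take those numbers on faith, here is a quick sketch that looks them up with charCodeAt, which returns the UTF-16 code unit at a given index. The variable name codes is just for illustration.

```javascript
// Look up the numeric code behind each character.
const codes = ["a", "A", "Z"].map((ch) => ch.charCodeAt(0));
console.log(codes); // [97, 65, 90]

// The string comparisons agree with the numeric ones.
console.log("a" > "Z", 97 > 90); // true true
console.log("A" > "Z", 65 > 90); // false false
```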

Ta-dah! When both operands of the greater-than or less-than operators are strings, JavaScript compares them lexicographically. In effect, it treats each string as a sequence of single characters, so in the first example JavaScript is really comparing [“a”] against [“Z”]. It walks through both sequences index by index, comparing the numeric code of each character.

Here are three more examples…

"ab" > "ac" // false
"ab" > "a" // true
"ate" > "are" // true

The first example gets converted to its respective arrays, [“a”, “b”] and [“a”, “c”], and then to the numeric equivalents, [97, 98] and [97, 99]. From there, JavaScript compares the two arrays index by index until it finds the first pair of elements that differ. That pair alone decides the result: if the left element is the greater one, > returns true, otherwise it returns false, and nothing after that index is checked. A later pair cannot rescue the comparison: “ab” > “ba” is false because 97 is less than 98 at index 0, even though 98 is greater than 97 at index 1. So in this first example, the zeroth element of both arrays is 97. That’s a tie, so it moves on to index 1 and tests 98 against 99. 98 is not greater than 99, and thus it returns false.
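To make the walk-through concrete, here is a minimal sketch of the comparison as a function. isGreaterThan is a made-up helper, not anything built into JavaScript, and it assumes both arguments are already strings. The result is decided by the first position where the two strings differ.

```javascript
// Hypothetical helper: mimics `left > right` for two strings.
function isGreaterThan(left, right) {
  const shared = Math.min(left.length, right.length);
  for (let i = 0; i < shared; i++) {
    const l = left.charCodeAt(i);
    const r = right.charCodeAt(i);
    if (l !== r) {
      // The first differing position decides the whole comparison.
      return l > r;
    }
  }
  // No difference in the shared part: one string is a prefix of
  // the other (or they are equal), so the longer one is greater.
  return left.length > right.length;
}

console.log(isGreaterThan("ab", "ac"));   // false
console.log(isGreaterThan("ab", "a"));    // true
console.log(isGreaterThan("ate", "are")); // true
console.log(isGreaterThan("ab", "ba"));   // false
```

Each result matches what the built-in > operator gives for the same pair of strings.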

The second example is slightly trickier. We have a two-character string compared against a single character. What happens there?

If you were to type “a”[0] into a JavaScript console, you’d get back “a”, the zeroth character. So far, so good. However, were you then to type “a”[1] into the console, you’d get back undefined, which makes sense because a single-character string only has something at index 0, so any index above zero returns undefined.

It is tempting to conclude that “a” gets padded out to [“a”, undefined] and that the undefined is then coerced to 0. That is not what happens. undefined converts to NaN, not 0, and 98 > undefined is false, so padding would actually give the wrong answer.

In short, when comparing strings with the greater-than or less-than operators, JavaScript only walks as far as the shorter string. If every character up to that point matches, the shorter string is a prefix of the longer one, and the shorter string counts as the lesser of the two.

With all that said and done, we end up with [97, 98] (“ab”) and [97] (“a”). The zeroth elements tie, “a” then runs out of characters, and so “ab” is the greater string and true is returned.
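A few one-liners you can paste into a console to check how undefined and the prefix case behave. Nothing here is specific to any library; it is plain JavaScript.

```javascript
// Indexing past the end of a string gives undefined...
console.log("a"[1]); // undefined

// ...and undefined turns into NaN, not 0, in numeric comparisons.
console.log(Number(undefined)); // NaN
console.log(98 > undefined);    // false

// A string that is a prefix of another is the lesser one.
console.log("ab" > "a"); // true
console.log("a" > "ab"); // false
console.log("a" < "ab"); // true
```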

Finally, we compare “ate” with “are”. When both strings are converted to numeric arrays for comparison we have [97, 116, 101] and [97, 114, 101]. The rest is a foregone conclusion:

Index 0; is 97 > 97? No, they’re equal, so move on to index 1.

Index 1; is 116 > 114? Yes! Ding Ding Ding! Stop processing and return true.

So there you have it, string comparisons with less-than and greater-than operators. And yes, it’s a pain in the ass that JavaScript computes comparisons like this, but once you stop and look at the process of how it does it, there’s a greater appreciation of how the language works.

#BecauseJavaScript

I hate people who whine about the eccentricities of JavaScript. “Oh, JavaScript doesn’t do what I expect it to when I do XYZ, it does ABC instead… #BecauseJavaScript.” – treating the language like it’s something they have to lump with because they have no choice…

A word to the wise: if you don’t like JavaScript, choose another language to code in and move on. I’m sick to death of these people that bemoan the fact that JavaScript doesn’t behave the way they expect it to.

If you take a step back to consider what is actually happening, then it ends up making sense and you can actually embrace the language for what it really is – truly insane. I’ve gone past that point and become somewhat insane myself, mainly because of my fondness for the language.

So where do we go from here? Let’s take a look at what some developers have tweeted with #BecauseJavaScript:

Jose Luis Cortes tweets:

Okay, so let’s re-write this so that we can see the code statement by statement and we’ll break down the conundrum right here, right now. I’ll place comments here and there within the code to explain what’s happening…

var a = [], b = [];

Okay, so we’ve defined two separate arrays, a and b… Note that this notation doesn’t tie a and b together: each variable gets its own empty array, so whatever happens to one has no effect on the other. Be careful not to write this as var a = b = []; instead. That chained form creates a single array that both names point to (and, outside strict mode, quietly makes b a global), so a.length further down would come out as 2 rather than 0. So, arrays A and B are each set to their own []. So far, so good.
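Here is a small sketch of the difference between declaring two arrays and chaining one assignment across two names. The variable names are made up for the example, and let/const are used so it also runs in strict mode.

```javascript
// Two independent arrays: changing one leaves the other alone.
const first = [];
const second = [];
second[0] = 0;
console.log(first.length, second.length); // 0 1
console.log(first === second);            // false

// Chained assignment: both names point at the very same array.
let p, q;
p = q = [];
q[0] = 0;
console.log(p.length); // 1
console.log(p === q);  // true
```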

a['a'] = 0; a['b'] = 1;
b[0] = 0; b[1] = 1;

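So what’s happening here? Two similar, but different, ways of putting values into an array. Array A is being used like an associative array (also known in some circles as a hash), which lets you “associate” text-based keys with values. Strictly speaking, JavaScript has no associative array type: an array is an object, so a['a'] = 0 simply adds an ordinary property named ‘a’ to the array object, and that property is not counted as an array element. Plain objects are the usual tool for this job, and JSON’s object notation is modelled on them. As a result, you have a word association of sorts with each key-value pair… It’s actually pretty cool.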

So, the first of the two lines above sets two key-value pairs: one assigning the value 0 to the key ‘a’, and the other the value 1 to the key ‘b’. Now, the thing about associative arrays is that, unlike standard indexed arrays, you don’t have to remember at which position a value sits, whether it was the very first index or the twenty-seventh. Each item is just “indexed” by its text key, so that it’s easier to remember. The values can themselves be other objects or indexed arrays… So you can have layer upon layer upon layer… it’s very “Inception”-esque.

The second line uses the more traditional indexed array, zero-based and everything. Here, Array B has two values set: 0 and 1 are stored at indices 0 and 1 respectively.

a.length // 0
Object.keys(a) // ["a","b"]
b.length // 2
Object.keys(b) // ["0","1"]

a.length… What is the length of an array that only has text keys? It really is 0, so Jose is right: length only reflects integer indices, and ‘a’ and ‘b’ are not indices. On a plain object it would be undefined, because a plain object has no length property at all. The closest thing you can get to a “length” there is the number of top-level keys being used. For a “flat” object with no nesting, that is simply the number of key-value pairs, for example:
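As a quick aside, this sketch puts an array with text keys next to a plain object so the two length results can be compared side by side. The names list and plain are invented for the example.

```javascript
// An array with only text keys: still an array, still length 0.
const list = [];
list["a"] = 0;
list["b"] = 1;
console.log(Array.isArray(list)); // true
console.log(list.length);         // 0
console.log(Object.keys(list));   // ["a", "b"]

// A plain object has no length property at all.
const plain = {};
plain["a"] = 0;
console.log(plain.length);              // undefined
console.log(Object.keys(plain).length); // 1
```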

var myArray = new Object();
myArray["firstname"] = "Gareth";
myArray["lastname"] = "Simpson";
myArray["age"] = 21;

In this case, we know just by looking at the code that there are 3 key-value pairs. But getting that count in code works better using:

Object.keys(myArray).length // returns 3

Anything deeper than a one-dimensional associative array and it becomes a bit of a rabbit-hole, since Object.keys only counts the top level. Array B is different: it has a clearly defined number of elements, so b.length returns 2.

The calls used to list the keys in both instances also give testament to what each of these arrays is. The code is identical except for the array being referenced. Both responses are correct: Array A has keys of ‘a’ and ‘b’, whereas Array B has “keys” of “0” and “1”, its indices reported as strings… Now there’s no trickery, this is correct, it’s all correct.

Indices do serve as primitive keys for whoever is working with the array: key 0… key 1… It makes sense. So it should!
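To see indices acting as keys, here is a short sketch. The array name indexed is made up for the example. It shows that Object.keys reports indices as strings, and that a string holding a whole number works as an index too.

```javascript
const indexed = [];
indexed[0] = 0;
indexed[1] = 1;

// Indices come back from Object.keys as strings.
console.log(Object.keys(indexed));           // ["0", "1"]
console.log(typeof Object.keys(indexed)[0]); // "string"

// The string "1" and the number 1 reach the same element.
console.log(indexed["1"] === indexed[1]); // true

// Assigning through a numeric string still grows the array.
indexed["2"] = 2;
console.log(indexed.length); // 3
```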

So there’s nothing occult, or magical, or mysterious about the code… it plays out how it should and does not scare you if you know how it works. I’ll address a few more #BecauseJavaScript tweets in upcoming posts on this blog.