Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Sorting in JavaScript: Should every compare function have a "return 0" statement?

I recently read many answers about sorting in JavaScript and I often stumble upon a compare function that looks like this:

array.sort(function(a,b){ return a > b ? 1 : -1; });

So it is a compare function that returns 1 if a is greater than b and -1 if a is less than or equal to b. As described on MDN, a compare function can also return zero, to ensure that the relative position of two items remains unchanged:

If compareFunction(a, b) returns 0, leave a and b unchanged with respect to each other, but sorted with respect to all different elements.

So the official examples look more like this:

function compare(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

And indeed, with a return 0 statement added, the sorting algorithm often needs fewer iterations and runs faster overall (JSPerf).

So I was wondering whether there is any advantage to omitting the return 0 statement.

I realized that on MDN, it also says:

Note: the ECMAscript standard does not guarantee this behaviour, and thus not all browsers (e.g. Mozilla versions dating back to at least 2003) respect this.

referring to the behavior that a and b should remain unchanged if 0 is returned. So maybe, by returning 0, we get a slightly different sorted array in different browsers? Could that be a reason? And are there any other good reasons for not returning zero at all?



1 Reply


So I was wondering whether there is any advantage to omitting the return 0 statement.

Fewer characters to type. And it might be a tiny bit faster because one comparison is omitted. All other effects are disadvantages.
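If brevity is the goal, a compare function can stay short and still return 0 for equal items. The following is a minimal sketch; the names byNumber and byOrder are made up for illustration, and byNumber assumes finite numbers (subtraction misbehaves for NaN or for two infinities of the same sign).

```javascript
// Short comparators that still return 0 for equal items.
// byNumber: assumes finite numbers only.
const byNumber = (a, b) => a - b;

// byOrder: works for any values that support < and >.
// The booleans are coerced to 0 or 1, so the result is -1, 0 or 1.
const byOrder = (a, b) => (a > b) - (a < b);

const numbers = [3, 1, 2, 1].sort(byNumber);
const letters = ["b", "a", "b"].sort(byOrder);

console.log(numbers); // [1, 1, 2, 3]
console.log(letters); // ["a", "b", "b"]
console.log(byNumber(2, 2), byOrder("x", "x")); // 0 0
```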

I realized that on MDN, it also says:

Note: the ECMAscript standard does not guarantee this behaviour, and thus not all browsers (e.g. Mozilla versions dating back to at least 2003) respect this.

referring to the behavior that a and b should remain unchanged if 0 is returned.

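That the position of a and b remains unchanged is only a requirement for a stable sort. At the time of the MDN note this was not specified behaviour, and some browsers implemented a non-stable sort algorithm. (Since ECMAScript 2019 the specification does require Array.prototype.sort to be stable, but older engines may not be.)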
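To make "stable" concrete, here is a small sketch with made-up sample data. It assumes an engine that implements ECMAScript 2019 or later; on an older engine with an unstable sort, items of the same age could come out in either order.

```javascript
// Sample records (hypothetical data): two pairs share the same age.
const people = [
  { name: "Ann", age: 30 },
  { name: "Bob", age: 25 },
  { name: "Cid", age: 30 },
  { name: "Dee", age: 25 },
];

// Returns 0 for equal ages, so the comparator expresses no preference
// between Ann and Cid, or between Bob and Dee.
const byAge = (a, b) => a.age - b.age;

// slice() copies the array so the original order is kept for comparison.
const sortedNames = people.slice().sort(byAge).map((p) => p.name);

// With a stable sort, equal-age items keep their original relative order.
console.log(sortedNames); // ["Bob", "Dee", "Ann", "Cid"]
```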

However, the actual purpose of returning zero is to say that neither a is sorted before b (as with a result less than 0) nor b before a (as with a result greater than 0), which is basically the case when a equals b. This is a must-have for a comparison, and all sorting algorithms rely on it.

To produce a valid, satisfiable ordering (mathematically: divide the items into totally ordered equivalence classes), a comparison must have certain properties. They are listed in the spec for sort as requirements for a "consistent comparison function".

The most prominent one is reflexivity, demanding that an item a is equal to a (itself). Another way to say this is:

compare(a, a) must always return 0
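As a rough sketch of how one might spot-check a comparator against this rule, the hypothetical helper below (findInconsistency is not a built-in) tests reflexivity and sign symmetry on a few sample values. It does not check transitivity or the other requirements in the spec, so passing it is not a proof of consistency.

```javascript
// Returns a description of the first violation found, or null if none.
// Checks only: compare(a, a) === 0, and compare(a, b) / compare(b, a)
// having opposite signs (or both being 0).
function findInconsistency(compare, samples) {
  for (const a of samples) {
    if (compare(a, a) !== 0) {
      return `compare(${a}, ${a}) returned ${compare(a, a)}, expected 0`;
    }
    for (const b of samples) {
      if (Math.sign(compare(a, b)) + Math.sign(compare(b, a)) !== 0) {
        return `compare(${a}, ${b}) and compare(${b}, ${a}) disagree`;
      }
    }
  }
  return null;
}

// The comparator from the official examples.
const official = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
// The shortcut from the question, which never returns 0.
const shortcut = (a, b) => (a > b ? 1 : -1);

console.log(findInconsistency(official, [1, 2, 3])); // null
console.log(findInconsistency(shortcut, [1, 2, 3]));
// "compare(1, 1) returned -1, expected 0"
```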

What happens when a comparison function does not satisfy this (as is obviously the case for the one you stumbled upon)?

The spec says

If comparefn […] is not a consistent comparison function for the elements of this array, the behaviour of sort is implementation-defined.

which basically means: If you provide an invalid comparison function, the array will probably not be sorted correctly. It might get randomly permuted, or the sort call might not even terminate.

So maybe, by returning 0, we get a slightly different sorted array in different browsers? Could that be a reason?

No, by returning 0 you get a correctly sorted array in every browser (the results may still differ in the order of equivalent items where the sort is unstable). It is by not returning 0 that you get differently permuted arrays, if the sort terminates at all; it may even produce the expected result, but usually in a more complicated manner.

So what could happen if you don't return 0 for equivalent items? Some implementations have no problem with this, as they never compare an item to itself (even if it appears at multiple positions in the array); an implementation can optimise this and omit the costly call to the compare function when it already knows that the result must be 0.

The other extreme would be a never-terminating loop. Assuming you had two equivalent items after each other, you would compare the latter with the former and realise that you had to swap them. Testing again, the latter would still be smaller than the former and you'd have to swap them again. And so on…
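The scenario just described can be reproduced with a deliberately naive sort that keeps swapping neighbours until a full pass makes no swap. This is an illustrative sketch, not how any real engine implements sort; naiveSort and its maxPasses cap are invented for the demo so that it always returns.

```javascript
// Naive "swap neighbours until nothing changes" sort.
// maxPasses caps the loop so the demo cannot hang.
function naiveSort(items, compare, maxPasses = 1000) {
  const arr = items.slice();
  for (let pass = 1; pass <= maxPasses; pass++) {
    let swapped = false;
    for (let i = 1; i < arr.length; i++) {
      // If the latter item sorts before the former, swap them.
      if (compare(arr[i], arr[i - 1]) < 0) {
        [arr[i - 1], arr[i]] = [arr[i], arr[i - 1]];
        swapped = true;
      }
    }
    if (!swapped) return { sorted: arr, passes: pass, terminated: true };
  }
  return { sorted: arr, passes: maxPasses, terminated: false };
}

const consistentCompare = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
const inconsistentCompare = (a, b) => (a > b ? 1 : -1);

const good = naiveSort([2, 1, 1], consistentCompare);
const bad = naiveSort([2, 1, 1], inconsistentCompare);

console.log(good); // { sorted: [1, 1, 2], passes: 2, terminated: true }
// The two equal 1s are swapped on every pass, so the cap is reached.
console.log(bad.terminated); // false
```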

However, an efficient algorithm mostly does not retest items it has already compared, so the implementation will usually terminate. Still, it might perform swaps that were actually unnecessary, and will therefore take longer than with a consistent comparison function.
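One way to examine this on a particular engine is to count how often the built-in sort calls the compare function. The sketch below uses a hypothetical countCalls wrapper and made-up data with many duplicates; the counts depend on the engine's algorithm, so no specific numbers are claimed here, and the result for the inconsistent comparator is implementation-defined.

```javascript
// Wraps a comparator to count how many times sort invokes it.
function countCalls(compare, input) {
  let calls = 0;
  const sorted = input.slice().sort((a, b) => {
    calls++;
    return compare(a, b);
  });
  return { calls, sorted };
}

const withZero = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
const withoutZero = (a, b) => (a > b ? 1 : -1);

// Sample data with many duplicates, where the difference matters most.
const data = [5, 3, 5, 1, 3, 3, 2, 5, 1, 2];

const consistentRun = countCalls(withZero, data);
const inconsistentRun = countCalls(withoutZero, data);

console.log("with return 0:   ", consistentRun.calls, "calls");
console.log("without return 0:", inconsistentRun.calls, "calls");
```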

And are there any other good reasons for not returning zero at all?

Being lazy and hoping that the array does not contain duplicates.

