I am sorting 10+ million uint64_t
s with RGB data from .RAW files and 79% of my C program time is spent in qsort
. I am looking for a faster sort for this specific data type.
Being RAW graphical data, the numbers are very random and ~80% unique. No partial sorting or runs of sorted data can be expected. The 4 uint16_t
s inside the uint64_t
are R, G, B and zero (possibly a small count <= ~20).
I have the simplest comparison function I can think of using unsigned long long
s (you CANNOT just subtract them):
qsort(hpidx, num_pix, sizeof(uint64_t), comp_uint64);
...
int comp_uint64(const void *a, const void *b) {
if(*((uint64_t *)a) > *((uint64_t *)b)) return(+1);
if(*((uint64_t *)a) < *((uint64_t *)b)) return(-1);
return(0);
} // End Comp_uint64().
There was a very interesting "Programming Puzzles & Code Golf" on StackExchange, but they used float
s. Then there are QSort, RecQuick, heap, stooge, tree, radix...
The swenson/sort looked interesting but had no (obvious) support for my datatype, uint64_t
. And the "quick sort" time was the best. Some sources say the system qsort
can be anything, not necessarily "Quick Sort".
A C++ sort bypasses the generic casting of void pointers and realizes great improvements in performance over C. There has to be an optimized method to slam U8s through a 64bit processor at warp speed.
System/compiler info:
I am currently using the GCC with Strawberry Perl
gcc version 4.9.2 (x86_64-posix-sjlj, built by strawberryperl.com
Intel 2700K Sandy Bridge CPU, 32GB DDR3
windows 7/64 pro
gcc -D__USE_MINGW_ANSI_STDIO -O4 -ffast-math -m64 -Ofast -march=corei7-avx -mtune=corei7 -Ic:/bin/xxHash-master -Lc:/bin/xxHash-master c:/bin/stddev.c -o c:/bin/stddev.g6.exe
First attempt at a better qsort
, QSORT()
!
Tried to use Michael Tokarev's inline qsort
.
"READY-TO-USE"? From qsort.h
documentation
-----------------------------
* Several ready-to-use examples:
*
* Sorting array of integers:
* void int_qsort(int *arr, unsigned n) {
* #define int_lt(a,b) ((*a)<(*b))
* QSORT(int, arr, n, int_lt);
--------------------------------
Change from type "int" to "uint64_t"
compile error on TYPE???
c:/bin/bpbfct.c:586:8: error: expected expression before 'uint64_t'
QSORT(uint64_t, hpidx, num_pix, islt);
I can't find a real, compiling, working example program, just comments with the "general concept"
#define QSORT_TYPE uint64_t
#define islt(a,b) ((*a)<(*b))
uint64_t *QSORT_BASE;
int QSORT_NELT;
hpidx=(uint64_t *) calloc(num_pix+2, sizeof(uint64_t)); // Hash . PIDX
QSORT_BASE = hpidx;
QSORT_NELT = num_pix; // QSORT_LT is function QSORT_LT()
QSORT(uint64_t, hpidx, num_pix, islt);
//QSORT(uint64_t *, hpidx, num_pix, QSORT_LT); // QSORT_LT mal-defined?
//qsort(hpidx, num_pix, sizeof(uint64_t), comp_uint64); // << WORKS
The "ready-to-use" examples use types of int
, char *
and struct elt
. Isn't uint64_t
a type?? Try long long
QSORT(long long, hpidx, num_pix, islt);
c:/bin/bpbfct.c:586:8: error: expected expression before 'long'
QSORT(long long, hpidx, num_pix, islt);
Next attempt: RADIXSORT
:
Results: RADIX_SORT is RADICAL!
I:r3pf.249465>grep "Event" bb12.log | grep -i Sort
<< 1.40 sec average
4) Time=1.411 sec = 49.61%, Event RADIX_SORT , hits=1
4) Time=1.396 sec = 49.13%, Event RADIX_SORT , hits=1
4) Time=1.392 sec = 49.15%, Event RADIX_SORT , hits=1
16) Time=1.414 sec = 49.12%, Event RADIX_SORT , hits=1
I:r3pf.249465>grep "Event" bb11.log | grep -i Sort
<< 5.525 sec average = 3.95 time slower
4) Time=5.538 sec = 86.34%, Event QSort , hits=1
4) Time=5.519 sec = 79.41%, Event QSort , hits=1
4) Time=5.519 sec = 79.02%, Event QSort , hits=1
4) Time=5.563 sec = 79.49%, Event QSort , hits=1
4) Time=5.684 sec = 79.83%, Event QSort , hits=1
4) Time=5.509 sec = 79.30%, Event QSort , hits=1
3.94 times faster than whatever sort qsort
out of the box uses!
And, even more importantly, there was actual, working code, not just 80% of what you need given by some Guru who assumes you know everything they know and can fill in the other 20%.
Fantastic solution! Thanks Louis Ricci!
See Question&Answers more detail:
os