1. Your 1000 elements for the sort is too low

The measured times are too small to represent a valid measurement, as the majority of the time might be spent not on the sort itself but on initialization (creating the window, opening files, etc.). You need times of at least 100 ms or more (1 second is ideal).
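For example, one way to get into that range is to repeat the sort on fresh copies of the data until at least ~100 ms have accumulated. Here is a minimal sketch in plain C++ (my own helper names, and std::sort is just a placeholder for the sort under test):

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <vector>

// time one sort of a copy of src, in [ms]
double time_sort_ms(const std::vector<int> &src)
{
    std::vector<int> a=src;                         // sort a copy so src stays intact
    auto t0=std::chrono::high_resolution_clock::now();
    std::sort(a.begin(),a.end());                   // <-- replace with the sort under test
    auto t1=std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double,std::milli>(t1-t0).count();
}

// repeat until at least ~100 ms have accumulated, return the average time per run
double measure_ms(const std::vector<int> &src)
{
    double sum=0.0; int runs=0;
    while (sum<100.0){ sum+=time_sort_ms(src); runs++; }
    return sum/runs;
}

int main()
{
    std::vector<int> data(100000);
    for (size_t i=0;i<data.size();i++) data[i]=int(data.size()-i); // descending input
    std::printf("avg sort time: %.3f [ms]\n",measure_ms(data));
    return 0;
}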
2. If you have access to the data that is being sorted

You can introduce data sets that will be challenging for each type of sort (and from the times infer the algorithm used). For example, bubble sort is slowest on an array sorted in reverse order, so pass data sorted ascending, sorted descending, and random, and compare the times. Let's call them tasc, tdesc, trnd.

Assuming an ascending sort, if bubble sort is involved it should hold that:

tasc O(n) < trnd < tdesc O(n^2)

so:

tasc*n == tdesc + margin_of_error

So just test whether tdesc/tasc is close to n (with some margin for error). In the same manner you just need to create sample data that is hard for one specific type of sort and not for the others, and from the times detect whether that is the case, until you find the algorithm used. A small sketch of this follows below.
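For example, a minimal sketch of that test in plain C++ (sort_under_test is my placeholder for the analyzed sort, here just std::sort):

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

// placeholder for the (unknown) sort being analyzed
void sort_under_test(std::vector<int> &a){ std::sort(a.begin(),a.end()); }

// time one run on a copy of the data, in [ms]
double time_ms(std::vector<int> a)
{
    auto t0=std::chrono::high_resolution_clock::now();
    sort_under_test(a);
    auto t1=std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double,std::milli>(t1-t0).count();
}

int main()
{
    const int n=4000;
    std::vector<int> asc(n),des(n),rnd(n);
    std::mt19937 gen(42);
    for (int i=0;i<n;i++){ asc[i]=i; des[i]=n-i; rnd[i]=int(gen()); }

    double tasc=time_ms(asc),tdesc=time_ms(des),trnd=time_ms(rnd);
    std::printf("tasc=%.5f tdesc=%.5f trnd=%.5f tdesc/tasc=%.1f (n=%d)\n",
                tasc,tdesc,trnd,tdesc/tasc,n);
    // if tdesc/tasc is close to n then the timing pattern matches bubble sort (O(n) best / O(n^2) worst case)
    return 0;
}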
Here is some data (all times are in [ms]) I measured with my own bubble sort, sorting into ascending order:

   n     tasc      tdesc      tasc*n
1000   0.00321    2.96147    3.205750
2000   0.00609   11.76799   12.181855
4000   0.01186   45.58834   47.445111
To be more clear: if we have a runtime for complexity O(n)

t(O(n)) = c*n

then to convert it into a runtime with complexity O(n^2) (assuming the same time constant c):

t(O(n^2)) = c*n*n = t(O(n)) * n

This way you can compare times with different complexities; you just need to convert all measured times into a single common complexity...
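For example, take the n=1000 row of the table above: tasc = 0.00321 ms was measured against O(n), so on the O(n^2) scale it becomes 0.00321 * 1000 = 3.21 ms, which is directly comparable to (and of the same order as) the measured tdesc = 2.96147 ms. That is exactly what the tasc*n column in the table shows.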
3. If you can choose the sorted data size

Then, as was mentioned in the comments, you can infer the growth rate of the times with increasing n (doubling it); from that you can estimate the complexity, and from that you can tell which algorithm was used.

So let's assume the measured times from #2. For O(n) the time constant c should stay the same, so for tasc (O(n)):
   n     tasc     c=tasc/n
1000   0.00321   0.000003210
2000   0.00609   0.000003045
4000   0.01186   0.000002965
and for tdesc (O(n^2)):
   n      tdesc     c=tdesc/n^2
1000    2.96147   0.00000296147000
2000   11.76799   0.00000294199750
4000   45.58834   0.00000284927125
As you can see, the constant c is more or less the same for both times tasc, tdesc, which means they comply with their estimated complexities O(n), O(n^2).
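This check is easy to automate. Here is a minimal portable sketch (plain C++ without VCL, with my own helper names), fed with the numbers from the tables above; for this data it prints a spread of about 0.076 for tasc vs O(n) and about 0.038 for tdesc vs O(n^2), matching the error values the [edit1] code below reports:

#include <cmath>
#include <cstdio>

// measured (n, t[ms]) values from the tables above
const double ns[3]   ={ 1000.0 ,  2000.0 ,  4000.0  };
const double tasc[3] ={ 0.00321,  0.00609,  0.01186 };
const double tdesc[3]={ 2.96147, 11.76799, 45.58834 };

// relative spread of the constant c = t/f(n); a small spread means f(n) matches the growth of t
double spread(const double *n,const double *t,int cnt,double (*f)(double))
{
    double c0=1e300,c1=-1e300;
    for (int i=0;i<cnt;i++)
    {
        double c=t[i]/f(n[i]);
        if (c0>c) c0=c;
        if (c1<c) c1=c;
    }
    return std::fabs(1.0-c0/c1);
}

double f_n (double n){ return n;   }
double f_n2(double n){ return n*n; }

int main()
{
    std::printf("tasc  vs O(n)  : %.6f\n",spread(ns,tasc ,3,f_n ));
    std::printf("tasc  vs O(n^2): %.6f\n",spread(ns,tasc ,3,f_n2));
    std::printf("tdesc vs O(n)  : %.6f\n",spread(ns,tdesc,3,f_n ));
    std::printf("tdesc vs O(n^2): %.6f\n",spread(ns,tdesc,3,f_n2));
    return 0;
}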
However, without knowing what the tested app actually does it is hard to be sure, as the sorting might be preceded by other processing... For example, the data might first be scanned to detect its form (sorted, random, almost sorted, ...), which is doable in O(n), and based on that result together with the data size the app might choose which sorting algorithm to use... So your measurements might be measuring different routines, invalidating the results...
[edit1] I had an insane idea of detecting the complexity automatically

Simply test whether the time constant c stays more or less the same across all measured times versus their corresponding n... Here is simple C++/VCL code:
//$$---- Form CPP ----
//---------------------------------------------------------------------------
#include <vcl.h>
#include <math.h>
#pragma hdrstop
#include "Unit1.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
double factorial[]= // n[-],t[ms]
{
11,0.008,
12,0.012,
13,0.013,
14,0.014,
15,0.016,
16,0.014,
17,0.015,
18,0.017,
19,0.019,
20,0.016,
21,0.017,
22,0.019,
23,0.021,
24,0.023,
25,0.025,
26,0.027,
27,0.029,
28,0.032,
29,0.034,
30,0.037,
31,0.039,
32,0.034,
33,0.037,
34,0.039,
35,0.041,
36,0.039,
37,0.041,
38,0.044,
39,0.046,
40,0.041,
41,0.044,
42,0.046,
43,0.049,
44,0.048,
45,0.050,
46,0.054,
47,0.056,
48,0.056,
49,0.060,
50,0.063,
51,0.066,
52,0.065,
53,0.069,
54,0.072,
55,0.076,
56,0.077,
57,0.162,
58,0.095,
59,0.093,
60,0.089,
61,0.093,
62,0.098,
63,0.096,
64,0.090,
65,0.100,
66,0.104,
67,0.111,
68,0.100,
69,0.121,
70,0.109,
71,0.119,
72,0.104,
73,0.124,
74,0.113,
75,0.118,
76,0.118,
77,0.123,
78,0.129,
79,0.133,
80,0.121,
81,0.119,
82,0.131,
83,0.150,
84,0.141,
85,0.148,
86,0.154,
87,0.163,
88,0.211,
89,0.151,
90,0.157,
91,0.166,
92,0.161,
93,0.169,
94,0.173,
95,0.188,
96,0.181,
97,0.187,
98,0.194,
99,0.201,
100,0.185,
101,0.191,
102,0.202,
103,0.207,
104,0.242,
105,0.210,
106,0.215,
107,0.221,
108,0.217,
109,0.226,
110,0.232,
111,0.240,
112,0.213,
113,0.231,
114,0.240,
115,0.252,
116,0.248,
117,0.598,
118,0.259,
119,0.261,
120,0.254,
121,0.263,
122,0.270,
123,0.281,
124,0.290,
125,0.322,
126,0.303,
127,0.313,
128,0.307,
0,0.000
};
//---------------------------------------------------------------------------
double sort_asc[]=
{
1000,0.00321,
2000,0.00609,
4000,0.01186,
0,0.000
};
//---------------------------------------------------------------------------
double sort_desc[]=
{
1000, 2.96147,
2000,11.76799,
4000,45.58834,
0,0.000
};
//---------------------------------------------------------------------------
double sort_rand[]=
{
1000, 3.205750,
2000,12.181855,
4000,47.445111,
0,0.000
};
//---------------------------------------------------------------------------
double div(double a,double b){ return (fabs(b)>1e-10)?a/b:0.0; } // safe division (returns 0 when |b| is ~0)
//---------------------------------------------------------------------------
AnsiString get_complexity(double *dat) // expect dat[] = { n0,t(n0), n1,t(n1), ... , 0,0 }
{
AnsiString O="O(?)";
int i,e;
double t,n,c,c0,c1,a,dc=1e+10;
// testbeg: loop over all (n,t) pairs in dat[] until the 0,0 terminator
#define testbeg for (e=1,i=0;dat[i]>0.5;){ n=dat[i]; i++; t=dat[i]; i++;
// testend(s): skip invalid points, track min/max of c in c0/c1, then a=|1-c0/c1| is the relative spread; keep complexity s if its spread is the lowest so far
#define testend(s) if ((c<=0.0)||(n<2.0)) continue; if (e){ e=0; c0=c; c1=c; } if (c0>c) c0=c; if (c1<c) c1=c; } a=fabs(1.0-div(c0,c1)); if (dc>=a){ dc=a; O=s; }
testbeg; c=div(t,n); testend("O(n)");
testbeg; c=div(t,n*n); testend("O(n^2)");
testbeg; c=div(t,n*n*n); testend("O(n^3)");
testbeg; c=div(t,n*n*n*n); testend("O(n^4)");
testbeg; a=log(n); c=div(t,a); testend("O(log(n))");
testbeg; a=log(n); c=div(t,a*a); testend("O(log^2(n))");
testbeg; a=log(n); c=div(t,a*a*a); testend("O(log^3(n))");
testbeg; a=log(n); c=div(t,a*a*a*a); testend("O(log^4(n))");
testbeg; a=log(n); c=div(t,n*a); testend("O(n.log(n))");
testbeg; a=log(n); c=div(t,n*n*a); testend("O(n^2.log(n))");
testbeg; a=log(n); c=div(t,n*n*n*a); testend("O(n^3.log(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a); testend("O(n^4.log(n))");
testbeg; a=log(n); c=div(t,n*a*a); testend("O(n.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*a*a); testend("O(n^2.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a); testend("O(n^3.log^2(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a); testend("O(n^4.log^2(n))");
testbeg; a=log(n); c=div(t,n*a*a*a); testend("O(n.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*a*a*a); testend("O(n^2.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a*a); testend("O(n^3.log^3(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a*a); testend("O(n^4.log^3(n))");
testbeg; a=log(n); c=div(t,n*a*a*a*a); testend("O(n.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*a*a*a*a); testend("O(n^2.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*n*a*a*a*a); testend("O(n^3.log^4(n))");
testbeg; a=log(n); c=div(t,n*n*n*n*a*a*a*a); testend("O(n^4.log^4(n))");
#undef testend
#undef testbeg
return O+AnsiString().sprintf(" error = %.6lf",dc);
}
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner):TForm(Owner)
{
mm_log->Lines->Clear();
mm_log->Lines->Add("factorial "+get_complexity(factorial));
mm_log->Lines->Add("sort asc "+get_complexity(sort_asc));
mm_log->Lines->Add("sort desc "+get_complexity(sort_desc));
mm_log->Lines->Add("sort rand "+get_complexity(sort_rand));
}
//-------------------------------------------------------------------------
with the relevant time measurements of my fast exact bigint factorial (where I simply used only the bigger times, from 0.008 ms up), plus the sorting measurements from above. It outputs this:
factorial O(n.log^2(n)) error = 0.665782
sort asc O(n) error = 0.076324
sort desc O(n^2) error = 0.037886
sort rand O(n^2) error = 0.075000
The code just tests the few supported complexities and outputs the one that has the lowest error (the variation of the time constant c between the different n)...

Just ignore the VCL stuff and convert the AnsiString into any string or output class you want...
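For example, a minimal portable adaptation might look like this (my sketch, not the original code: AnsiString replaced by std::string, the VCL form replaced by a plain main, and only a reduced set of the tested complexities kept for brevity):

#include <cmath>
#include <cstdio>
#include <string>

double sort_asc[] ={ 1000,0.00321, 2000, 0.00609, 4000, 0.01186, 0,0.000 };
double sort_desc[]={ 1000,2.96147, 2000,11.76799, 4000,45.58834, 0,0.000 };

double div0(double a,double b){ return (std::fabs(b)>1e-10)?a/b:0.0; } // safe division

std::string get_complexity(double *dat) // expect dat[] = { n0,t(n0), n1,t(n1), ... , 0,0 }
{
    std::string O="O(?)";
    int i,e;
    double t,n,c,c0=0.0,c1=0.0,a,dc=1e+10;
    #define testbeg for (e=1,i=0;dat[i]>0.5;){ n=dat[i]; i++; t=dat[i]; i++;
    #define testend(s) if ((c<=0.0)||(n<2.0)) continue; if (e){ e=0; c0=c; c1=c; } if (c0>c) c0=c; if (c1<c) c1=c; } a=std::fabs(1.0-div0(c0,c1)); if (dc>=a){ dc=a; O=s; }
    testbeg; c=div0(t,n);                   testend("O(n)");
    testbeg; c=div0(t,n*n);                 testend("O(n^2)");
    testbeg; a=std::log(n); c=div0(t,n*a);  testend("O(n.log(n))");
    #undef testend
    #undef testbeg
    char err[64]; std::snprintf(err,sizeof(err)," error = %.6f",dc);
    return O+err;
}

int main()
{
    std::printf("sort asc  %s\n",get_complexity(sort_asc ).c_str());
    std::printf("sort desc %s\n",get_complexity(sort_desc).c_str());
    return 0;
}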