NOTE: These results were measured with MSVC. Other compilers and standard library implementations may give different numbers.
printf can be (much) faster than cout. Although printf parses its format string at run time, it makes far fewer function calls and executes fewer instructions than cout to do the same job. Here is a summary of my experiments:
The number of static instructions
In general, cout generates a lot more code than printf. Say we have the following cout code that prints several formatted values:
os << setw(width) << dec << "0x" << hex << addr << ": " << rtnname <<
": " << srccode << "(" << dec << lineno << ")" << endl;
With the VC++ compiler and optimizations enabled, this generates around 188 bytes of code. When you replace it with printf-based code, only 42 bytes are required.
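For illustration, a printf-style equivalent of that statement could look like the sketch below. This is not the exact code that was measured: the SourceLocation struct, the variable types, and both function names are assumptions made up for the example, and it formats into a string (snprintf, ostringstream) so that the two results can be compared directly.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical inputs; the types are assumed, not taken from the measured code.
struct SourceLocation {
    unsigned long long addr;
    const char* rtnname;
    const char* srccode;
    int lineno;
};

// Stream version: same chain of operator<< calls as the statement above,
// with '\n' instead of endl because a string stream needs no flush.
std::string format_with_stream(int width, const SourceLocation& loc) {
    std::ostringstream os;
    os << std::setw(width) << std::dec << "0x" << std::hex << loc.addr << ": "
       << loc.rtnname << ": " << loc.srccode << "(" << std::dec << loc.lineno
       << ")" << '\n';
    return os.str();
}

// printf-family version: a single call with a single format string.
// setw applies only to the next item ("0x"), hence "%*s" for that item.
std::string format_with_printf(int width, const SourceLocation& loc) {
    // A negative width would mean "left-justify" to printf but not to setw.
    assert(width >= 0);
    char buf[256];
    int n = std::snprintf(buf, sizeof buf, "%*s%llx: %s: %s(%d)\n",
                          width, "0x", loc.addr, loc.rtnname, loc.srccode,
                          loc.lineno);
    if (n < 0) return std::string();  // encoding error
    return std::string(buf);          // truncated if longer than buf
}
```

The stream version needs one operator<< call per item plus the manipulators, while the printf version is one variadic call, which is where the difference in code size comes from.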
The number of dynamically executed instructions
The static instruction count only tells you the difference in binary code size. What matters more is the number of instructions actually executed at run time. I also ran a simple experiment:
Test code:
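#include <windows.h>  // GetTickCount, DWORD
#include <cstdio>
#include <iomanip>
#include <iostream>
using namespace std;

int a = 1999;
char b = 'a';
unsigned int c = 4200000000;
long long int d = 987654321098765;
long long unsigned int e = 1234567890123456789;
float f = 3123.4578f;
double g = 3.141592654;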
void Test1()
{
    cout
        << "a:" << a << "\n"
        << "a:" << setfill('0') << setw(8) << a << "\n"
        << "b:" << b << "\n"
        << "c:" << c << "\n"
        << "d:" << d << "\n"
        << "e:" << e << "\n"
        << "f:" << setprecision(6) << f << "\n"
        << "g:" << setprecision(10) << g << endl;
}
void Test2()
{
    fprintf(stdout,
        "a:%d\n"
        "a:%08d\n"
        "b:%c\n"
        "c:%u\n"
        "d:%I64d\n"
        "e:%I64u\n"
        "f:%.2f\n"
        "g:%.9lf\n",
        a, a, b, c, d, e, f, g);
    fflush(stdout);
}
int main()
{
    DWORD A, B;
    DWORD start = GetTickCount();
    for (int i = 0; i < 10000; ++i)
        Test1();
    A = GetTickCount() - start;
    start = GetTickCount();
    for (int i = 0; i < 10000; ++i)
        Test2();
    B = GetTickCount() - start;
    cerr << A << endl;
    cerr << B << endl;
    return 0;
}
Here is the result of Test1 (cout):
- # of executed instructions: 423,234,439
- # of memory loads/stores: approx. 320,000 and 980,000
- Elapsed time: 52 seconds
Then, what about printf? This is the result of Test2:
- # of executed instructions: 164,800,800
- # of memory loads/stores: approx. 70,000 and 180,000
- Elapsed time: 13 seconds
On this machine and compiler, printf was much faster than cout. cout executed about 2.6 times as many instructions and roughly 4.5 to 5.5 times as many loads and stores (a rough indicator of cache misses), and it took 4 times as long.
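Since GetTickCount is Windows-only, repeating the timing on another compiler needs a different clock. The sketch below shows one way to do that with std::chrono. The helper name time_ms is made up for this example, and it only measures elapsed time, not instruction or load/store counts.

```cpp
#include <chrono>

// Hypothetical helper: call fn() `iterations` times and return the
// elapsed wall-clock time in whole milliseconds.
template <typename Fn>
long long time_ms(Fn fn, int iterations) {
    using clock = std::chrono::steady_clock;  // monotonic, unlike system_clock
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i)
        fn();
    const auto stop = clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
        .count();
}
```

With the test functions above, the two loops in main would become time_ms(Test1, 10000) and time_ms(Test2, 10000).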
I know this is an extreme case. I should also note that cout is much easier to use when you're handling 32/64-bit data and need the code to be independent of 32/64-bit platforms. There is always a trade-off. I use cout when getting the types right in a format string would be tricky.
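To make the type issue concrete: the %I64d specifier used in Test2 is MSVC-specific, so a portable printf call for a 64-bit integer has to use something like the PRId64 macro from <cinttypes>, while the stream version picks the right overload by itself. A small sketch (the function names are made up for this example):

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// printf family: the format specifier must match the argument type exactly.
// PRId64 expands to the right length modifier for int64_t on this platform.
std::string int64_with_printf(std::int64_t v) {
    char buf[32];  // enough for any 64-bit decimal value plus sign
    std::snprintf(buf, sizeof buf, "%" PRId64, v);
    return std::string(buf);
}

// Streams: overload resolution picks the formatting routine from the type.
std::string int64_with_stream(std::int64_t v) {
    std::ostringstream os;
    os << v;
    return os.str();
}
```

Both functions return the same text; the difference is that a wrong specifier in the printf version is undefined behavior, whereas the stream version cannot get the type wrong.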
Okay, cout in MSVS just sucks :)