I have moved my blog to Wordpress at theunixgeek.wordpress.com. I will still be checking back periodically on this one as well, though. 19 April 2009

Sunday, September 7, 2008

C vs Python: Speed

Introduction
Python is a very popular interpreted scripting language. C is a very popular compiled language. Due to its compiled nature, C is generally faster than Python, but is lower-level, making Python programming quicker and easier than C programming.

The questions here are whether the extra time taken to run a Python program (one that takes no input) makes it less cost-effective than its C equivalent, and whether run time is more important than programming time.

Note: due to technical difficulties, I have placed parentheses around some symbols or removed some tabs from the Python examples

The Systems Program
I decided to make a simple program that solves the following system of equations:

{ x + y = 14
{ x^2 + y^2 = 100

I quickly wrote the program in Python and found the answers. Then I translated it into C. I knew the C version would be somewhat longer than the Python one, but length is not what I was measuring. Before we get to the timings, here are the two programs:

Python:
x = 1
while x <= 14:
    y = 14 - x
    print str(x) + "|" + str(y)
    if x**2 + y**2 == 100:
        print "match"
    x = x + 1



C:
#include <stdio.h>

int main()
{
    int x, y;

    for (x = 1; x <= 14; x++) {
        y = 14 - x;
        printf("%d|%d\n", x, y);
        if ((x*x) + (y*y) == 100)
            printf("match\n");
    }
    return 0;
}
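
For anyone following along on Python 3, where print is a function, here is a minimal sketch of the same brute-force search. The function name find_matches and its two parameters are made up for illustration and are not part of the original programs; it returns the matching pairs instead of printing every candidate.

```python
def find_matches(total=14, squares=100):
    """Return every (x, y) with x + y == total and x**2 + y**2 == squares."""
    matches = []
    for x in range(1, total + 1):
        y = total - x
        if x * x + y * y == squares:
            matches.append((x, y))
    return matches


if __name__ == "__main__":
    # Print only the pairs that satisfy both equations.
    for x, y in find_matches():
        print(f"{x}|{y} match")
```

With the default arguments this finds the two solutions of the system, (6, 8) and (8, 6).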

Now, I've always heard that C is one of the fastest languages out there. Running both programs from the terminal, I didn't notice any difference between the Python program and the C program, so I fired up the terminal in Ubuntu and typed:

time ./a.out
(The time command runs whatever command follows it and reports how long it took; here it is timing the C program.) I got 0.001 seconds real time, 0 for the user time, and 0 for the system time. Now, time to test the Python version!

time python system.py
The figures got a bit scary here: 0.017 seconds for the real time, 0.012 seconds for the user time, and 0.004 seconds for the system time.

Sure, the difference for the real time is only sixteen thousandths of a second, but it can be a significant difference for larger systems that need to perform multiple calculations for long periods of time.

The Million Program
I took this idea further and wrote another program that prints every integer from 0 up to (but not including) 1,000,000. That is of course not on the same scale as the large systems mentioned above, but it gives the computer a bit more to print out.

Python:
i = 0
while i < 1000000:
    print i
    i = i + 1


C:
#include <stdio.h>

int main()
{
    int i;
    for (i = 0; i < 1000000; i++)
        printf("%d\n", i);
    return 0;
}

Now, time to test out the programs!

C:
real 0m24.625s
user 0m0.652s
sys 0m2.240s

Python:
real 0m29.805s
user 0m1.984s
sys 0m1.812s

Conclusion
I have to admit each language has its strengths and weaknesses, but from these results, I only want to use Python for quick things like the systems program shown or for prototyping C programs, and C for programs where the time taken to process information matters more.

Either way, goals may be different for different people or different projects - what's your opinion?

Afterword
After testing and retesting several times, I found that the programs can run faster or slower depending on what else the computer is doing (on my machine, oddly enough, the more processes were being handled, the faster the programs ran).
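
Because single runs vary this much, it can help to time a command several times and look at the spread. The following is only a sketch under my own assumptions: the helper name time_command is hypothetical, the command being timed is a placeholder that starts the Python interpreter and exits, and the output is discarded so the terminal is not part of the measurement.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the command's output so terminal speed is not measured.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command: start the interpreter and exit immediately.
    runs = time_command([sys.executable, "-c", "pass"])
    print("min %.4fs  max %.4fs" % (min(runs), max(runs)))
```

Replacing the placeholder list with something like ["./a.out"] would time the compiled C program the same way.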

16 comments:

Prime said...

Very cool article. I've always really preferred C over any language, but would be interested in seeing how Perl would stand up to these same tests? Any chance on me seeing that soon?

Anonymous said...

Um, that's about a 17x slowdown in the first test...

For the second, I'm afraid what you have there is a benchmark of how fast your terminal can output lines...

Try something like:

time ./a.out >/dev/null

real 0m1.221s
user 0m1.220s
sys 0m0.000s

time python loop.py >/dev/null

real 0m6.596s
user 0m6.540s
sys 0m0.060s

time perl loop.pl >/dev/null

real 0m2.115s
user 0m2.117s
sys 0m0.003s


So Python is SLOW, just use Perl :)

Anonymous said...

BTW, I should mention I increased the number from 1,000,000 to 10,000,000 for those tests...

Max said...

Don't you think that's some kind of shit, not a comparison? You didn't even mention the optimization flags for the C program.

Aekold said...

Both programs use printing functions. Printing is a very slow operation; you can check this by writing the same program with file output. So most of this program's time goes not to the arithmetic but to simple printing.
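
One way to try the comparison suggested here is to send the same numbers to two different destinations and time each. This is a rough sketch, not the method used in the post: the helper names write_numbers and timed are invented for illustration, and the count of 100,000 is an arbitrary choice.

```python
import os
import sys
import time


def write_numbers(n, stream):
    """Write the integers 0..n-1 to stream, one per line; return n."""
    for i in range(n):
        stream.write("%d\n" % i)
    return n


def timed(func, *args):
    """Return (result, elapsed wall-clock seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 100000
    # Same loop, two destinations: the null device and standard output.
    with open(os.devnull, "w") as sink:
        _, to_null = timed(write_numbers, n, sink)
    _, to_stdout = timed(write_numbers, n, sys.stdout)
    # Report on stderr so the summary is not buried in the numbers.
    sys.stderr.write("devnull: %.3fs  stdout: %.3fs\n" % (to_null, to_stdout))
```

When standard output is a terminal, the second figure mostly reflects how fast the terminal can scroll rather than how fast the loop runs.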

olex said...

hardware:

#cpu model and capabilities

$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 67
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
stepping : 3
cpu MHz : 2400.224
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 4806.29
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 67
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
stepping : 3
cpu MHz : 2400.224
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 4802.42
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

#memory stats

$ free
total used free shared buffers cached
Mem: 1933220 1899184 34036 0 72888 756840
-/+ buffers/cache: 1069456 863764
Swap: 2000084 100 1999984


Now let's see the versions of the software used for testing:

#base system version

$cat /etc/gentoo-release
Gentoo Base System release 1.12.11.1

#kernel version

$ uname -a
Linux localhost 2.6.23-hardened-r7-20080818 #3 SMP Tue Aug 19 10:20:41 EEST 2008 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ AuthenticAMD GNU/Linux

#gcc version

$ gcc -v
Reading specs from /usr/lib/gcc/x86_64-pc-linux-gnu/3.4.6/specs
Configured with: /var/tmp/portage/sys-devel/gcc-3.4.6-r2/work/gcc-3.4.6/configure --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/3.4.6 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/3.4.6/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/3.4.6 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/3.4.6/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/3.4.6/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/3.4.6/include/g++-v3 --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec --disable-nls --with-system-zlib --disable-checking --disable-werror --enable-secureplt --disable-libunwind-exceptions --disable-multilib --enable-languages=c,c++,java,objc,treelang --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
gcc version 3.4.6 (Gentoo Hardened 3.4.6-r2 p1.5, ssp-3.4.6-1.0, pie-8.7.10)

#glibc version
$ ls /lib/ld-*
/lib/ld-2.6.1.so /lib/ld-linux-x86-64.so.2

#binutils version

$ ld -v
GNU ld (GNU Binutils) 2.18

#python version

$ python -V
Python 2.5.2

#perl version
$ perl -v
perl -v

This is perl, v5.8.8 built for x86_64-linux

Copyright 1987-2006, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

OK, that's enough - let's go testing:

1. first program

1.1 python

t.py file:

x = 1

while x <= 14:
    y = 14 - x
    print str(x) + "|" + str(y)
    if x**2 + y**2 == 100:
        print "match"
    x = x + 1

$ time python ./t.py > /dev/null

real 0m0.013s
user 0m0.007s
sys 0m0.007s

1.2 perl

t.pl file:

for ($x = 1; $x <= 14; $x++) {
    $y = 14 - $x;
    print $x, "|", $y, "\n";
    if ((($x*$x) + ($y*$y)) == 100) {
        print "match\n";
    }
}

$ time perl ./t.pl > /dev/null

real 0m0.003s
user 0m0.003s
sys 0m0.000s

1.3 C

t.c file:

#include <stdio.h>

int main()
{
    int x, y;

    for (x = 1; x <= 14; x++) {
        y = 14 - x;
        printf("%d|%d\n", x, y);
        if ((x*x) + (y*y) == 100)
            printf("match\n");
    }
    return 0;
}

$ gcc -o t t.c

$ time ./t > /dev/null

real 0m0.001s
user 0m0.000s
sys 0m0.000s

2. second test
2.1 python

l.py file:

i = 0
while i < 1000000:
    print i
    i = i + 1

$ time python ./l.py > /dev/null

real 0m1.420s
user 0m1.410s
sys 0m0.010s

2.2 perl

l.pl file:

for ($i = 0; $i <= 1000000; $i++) {
    print $i, "\n";
}


$ time perl ./l.pl > /dev/null

real 0m0.907s
user 0m0.900s
sys 0m0.007s

2.3 C

l.c file:

#include <stdio.h>

int main()
{
    int i;
    for (i = 0; i <= 1000000; i++)
        printf("%d\n", i);
    return 0;
}


$ gcc -o l l.c

$ time ./l > /dev/null

real 0m0.156s
user 0m0.157s
sys 0m0.000s

3. conclusion

1. Python is slower than Perl and C
2. Perl is slower than C
3. Perl is faster than Python
4. C is faster than Perl and Python

That's it...

Anonymous said...

Yeah ... the new generation of CS people always compares the soft with the light ;-(

If you like that sort of thing, a "language performance benchmark", go to the best-known one: http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all

Prime said...

Thanks, Olex, for that bit on Perl vs Python/C. I figured it was faster than Python, but was unsure about C. Thanks for clearing that up though.

Pablo said...

The Python version is slower because you are also measuring the VM startup time. That's what the timeit module is for. As another reader pointed out, the language shootout is the best place for these metrics.
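
A minimal sketch of what measuring with the timeit module could look like, assuming Python 3. The function count_to is a stand-in written for this example: it keeps the loop shape of the million program but leaves out the printing, so only the loop itself is timed.

```python
import timeit


def count_to(n):
    """Same loop shape as the million program, minus the printing."""
    i = 0
    while i < n:
        i = i + 1
    return i


if __name__ == "__main__":
    # number=10 runs the loop ten times inside one already-running
    # interpreter, so interpreter startup is not part of the measurement.
    seconds = timeit.timeit(lambda: count_to(1000000), number=10)
    print("%.3f s per run" % (seconds / 10))
```

Whether startup time should be excluded depends on what is being compared; for short scripts run from the shell it is part of what the user waits for.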

olex said...

I'm not trying to do absolutely right or wrong testing of any of the three languages. I just saw strange (IMHO) results in the main post and wanted to check whether they held up. It was interesting for me to measure the "real" performance of those languages (I mean what a user can expect) rather than abstracting away VM startup time or any other startup cost.

Anyway, this is not the right way to compare interpreted languages (Perl, Python) with C.

Comparing Python with Perl is much more appropriate.

PS: AFAIK, before running any script Perl byte-compiles it and then runs the byte-code. Byte-compiling is like VM startup, IMHO.

Isaac Rodriguez said...

I'm sorry, but all your testing is useless. Not only do you not mention any compiler switches for optimization in C, but the code is not written in the most optimal manner for either C or Python.

This is the problem you find in most examples that compare the execution time between one language and another (it doesn't matter which languages). You cannot write the program in one language and translate it line by line to another. You have to use the available language features in each of the languages, and most people don't do that.

Your examples are not optimal in either language, which makes your entire test invalid.
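
As an illustration of using each language's own features, here is one possible more idiomatic Python 3 version of the million program. It is a sketch, not a claim about the fastest approach: the function name numbers_text is invented for this example, and building the whole output in memory trades a few megabytes of RAM for a single write call.

```python
import sys


def numbers_text(n):
    """Build the output for 0..n-1 as one string, one integer per line."""
    return "\n".join(map(str, range(n))) + "\n"


if __name__ == "__main__":
    # One write call instead of a million separate print calls.
    sys.stdout.write(numbers_text(1000000))
```

The line-by-line translation from C uses an explicit counter and one print per number; range, map and join let the interpreter do that work in its built-in code instead.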

Mystilleef said...

Use psyco, it's a JIT for Python. It might give you code as fast as C, maybe even faster.

rodrigo said...

Programmer time is much more expensive than processor time.

Kristian said...

You cannot make a test suite where the input data is known at compile time. The C compiler will most likely optimize away the whole program, especially for such a small amount of input data.

Funkyjunkie said...

The real problem here is that when you do something like this:

for (int i = 0; i < 1000; i++)
    a = 10;

in C/C++, the compiler most probably ignores the loop and sets a to 10 ONLY ONCE; there is no point doing the same thing over and over again. Besides, that kind of "benchmark" only tests a very, very tiny part of the actual compiler/interpreter (the results vary between compilers and interpreters), and hence it has absolutely NO value for anybody seeking facts about the subject.

Pavel said...

The Java Enterprise platform leverages the robustness of the Java programming language that allows developers to write the code only once and execute the application on any platform. Presently more than two-thirds of web development managers use the Java Enterprise platform to develop and deploy their applications. The Java Enterprise platform provides a framework for developing and deploying web services on the Java platform. The Java API for XML enables Java developers to develop interoperable and portable web services.