Nov 24

Life Is Short, Use Python

Posted by abiao at 12:00
I started playing with Python two weeks ago because of R's limitations in handling large data. A friend of mine suggested I try Python since I have to do data massaging frequently: "Python is the best choice, trust me," he said. Although I was reluctant to learn yet another piece of software, I couldn't bear the low efficiency of R (or of my own work) on large data. My learning curve so far has been: an excellent free CSV splitter --> MySQL + the RMySQL package --> several R packages, including bigmemory and ff. But to be honest, none of them satisfied me, either because of the limitations of the method (slow and prone to malfunction) or of my own computer (short of memory).

After nearly two weeks, I am shocked by Python's sheer power and easy-to-use design; dealing with a 10GB CSV has never been so easy. More importantly, you can access R from Python almost seamlessly with the RPy package. To get started, I would recommend the following readings to all Python newbies like me:
1, a command dictionary: Matlab vs R vs Python;
2, the free ebook Dive Into Python;
3, the textbook Machine Learning: An Algorithmic Perspective by Prof. Stephen Marsland.

The third book is especially useful for data analysis, as it contains lots of Python code examples. The code and datasets are available for download from the author's website, http://www-ist.massey.ac.nz/smarsland/MLBook.html; take a look before deciding whether to add it to your shelf.
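To give a flavour of why a large CSV stops being scary, here is a minimal sketch of the streaming idea, not the exact code I used. It reads one row at a time with the standard csv module, so memory use stays flat no matter how big the file is. The function name column_mean, the column name "price", and the file name big.csv are all made up for illustration.

```python
import csv
import io


def column_mean(lines, column):
    """Stream rows one at a time and return the mean of a numeric column."""
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    for row in reader:  # only the current row is held in memory
        total += float(row[column])
        count += 1
    return total / count if count else float("nan")


# Tiny in-memory stand-in for a large file. With a real file you would pass
# open("big.csv", newline="") instead (hypothetical file name).
sample = io.StringIO("id,price\n1,10.0\n2,20.0\n3,30.0\n")
mean_price = column_mean(sample, "price")
print(mean_price)
```

The same loop works unchanged on a multi-gigabyte file, because the reader pulls lines lazily from the file object instead of loading everything first.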

I agree that Python looks great. But I am just starting to feel comfortable with Perl; is it worth it to switch to Python, and re-learn all the little tricks that go with feeling comfortable using a language?
I have never used Perl; however, I was told that Perl is also a highly efficient language. We learn another language simply because it is better. If Perl can do the job as well as Python, I don't think it is necessary to switch to Python.
Go to Revolution R and register with your school email. They provide 64-bit R to students for free. Assuming you have >4GB of RAM, the 64-bit edition will be more than enough to handle 10GB of data.
Sounds great, I will check it out soon, thanks.
Perl? Hm. I don't recommend Perl. It is a great language, but for scientific work I recommend Python:

* Easier to learn than most other languages (especially Perl)
* The syntax is pleasant to the eyes (unlike Perl)
* Matlab-rivalling functionality in numpy/scipy
* Computer Algebra in Sage
* Easy integration with R, C++, GPUs ....
* High productivity
And the switch from any language to Python is much less painful than to any other language.
abiao replied on 2011/08/06 09:05
Can't agree more, thanks.
