Migrating from R to Python

Many years ago, I shifted from Microsoft Excel and LibreOffice Calc to R data frames as my primary spreadsheet tool. This was one of the earliest steps in my ongoing move from bloat to minimalism (see my three blog posts on this process). Shifting to R yielded many benefits:

  • Greater readability and maintainability
  • Version control
  • Reusable code
  • Dynamic generation of reports and presentations from computed data using LaTeX and knitr
  • Production-quality graphics and charts using plain R graphics and, more importantly, ggplot
  • Access to a comprehensive library of statistical and quantitative finance tools written in R

Over the last few months, I have been shifting from R to Python for most of my work. The primary reason for making this change is that Python is a full-fledged programming language, whereas R is primarily a statistical language that has been extended to do a lot of other things. A few years ago (when I first shifted to R), Python was totally unsuitable for use as a spreadsheet because the language was primarily designed to work with scalars rather than vectors and matrices. But in recent years, the Python tool sets (NumPy, SciPy, pandas, matplotlib, statsmodels, scikit-learn) have developed rapidly and now go beyond the capabilities of R in many respects. Jake VanderPlas’s keynote talk at the SciPy 2015 Conference is an excellent introduction to this entire set of tools. Overall, I am very happy with the pandas implementation of data frames based on NumPy arrays; the best features of R have been preserved.
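
To make the scalar-versus-vector point concrete, here is a minimal sketch of spreadsheet-style work in pandas. The prices, quantities, and column names are made up purely for illustration; only the NumPy and pandas calls themselves are real.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this illustration.
prices = [100.0, 102.0, 101.0, 105.0]
quantities = [10, 20, 15, 5]

# Plain Python works on scalars: element-wise arithmetic needs an explicit loop.
returns_loop = [prices[i] / prices[i - 1] - 1 for i in range(1, len(prices))]

# NumPy operates on whole arrays at once, much like vectors in R.
p = np.array(prices)
returns_vec = p[1:] / p[:-1] - 1

# A pandas data frame is built on NumPy arrays, so a "spreadsheet formula"
# applies to an entire column in a single expression.
df = pd.DataFrame({"price": prices, "quantity": quantities})
df["value"] = df["price"] * df["quantity"]   # column-wise product
df["return"] = df["price"].pct_change()      # first row is NaN (no prior price)

total_value = df["value"].sum()
print(df)
print("Total value:", total_value)
```

The loop and the vectorized version compute the same returns; the difference is that the array and data frame forms read like a formula applied to a whole column, which is what made R data frames attractive in the first place.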