This is part of the Python Knowledge and Resources List

====

Please read the following reddit comment before reading the article

This article is very subjective. All it demonstrates is that the functions used perform better on a numpy array than on a Python list when the elements are integers.

However, it completely avoids mentioning that:

  • Python lists are made for heterogeneous types, while numpy arrays hold homogeneous types.
  • Python lists support adding and removing elements in place, whereas numpy arrays do not.
  • The functions used are not described anywhere, so we cannot assume that the algorithms are the same.
  • The sum of the Python list takes more time than its mean, even though the mean should take longer (it is the sum plus a division). And the numpy mean takes double the time of the numpy sum.
  • The protocol runs each function only once, when it should run it at least 100 times and report an average.
====
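
The first two points of the comment above can be illustrated with a short sketch. The variable names are invented for this example and are not taken from the article; the snippet only assumes that numpy is installed.

```python
import numpy as np

# A Python list can hold values of different types side by side.
mixed_list = [1, "two", 3.0]

# A numpy array stores a single dtype; mixed numeric input is upcast.
int_array = np.array([1, 2, 3])        # integer dtype
upcast_array = np.array([1, 2, 3.5])   # every element becomes a float

# A list grows in place.
numbers = [1, 2, 3]
numbers.append(4)

# np.append does not modify int_array; it returns a new, copied array.
grown = np.append(int_array, 4)

print([type(item).__name__ for item in mixed_list])
print(int_array.dtype.kind, upcast_array.dtype.kind)
print(numbers, grown, int_array)
```

Because np.append copies the whole array on every call, growing an array one element at a time in a loop is usually much slower than appending to a list.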



The function below measures the execution time of a single expression. The list that follows uses it to compare various operations on numpy arrays and Python lists.

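from numpy import arange
from datetime import datetime

def calculate_time(expression):
   # Build the inputs that the timed expression refers to by name.
   nitems = 100000
   narray = arange(nitems)
   larray = list(range(nitems))  # a real list on both Python 2 and Python 3

   # Time a single evaluation of the expression.
   start = datetime.now()
   val = eval(expression)
   end = datetime.now()

   # total_seconds() covers the whole interval; .microseconds alone drops whole seconds.
   elapsed = (end - start).total_seconds() * 1e6
   return "%d micro seconds  %s" % (elapsed, expression)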

  1. numpy array sum vs list sum

    numpy_op1 = "narray.sum()"
    list_op1 = "sum(larray)"

    print(calculate_time(numpy_op1))
    print(calculate_time(list_op1))

    output:

    222 micro seconds  narray.sum()
    931 micro seconds  sum(larray)

  2. numpy array min vs list min

    numpy_op2 = "narray.min()"
    list_op2 = "sorted(larray)[0]"

    print(calculate_time(numpy_op2))
    print(calculate_time(list_op2))

    output:

    306 micro seconds  narray.min()
    3691 micro seconds  sorted(larray)[0]

  3. numpy array mean vs list average

    numpy_op3 = "narray.mean()"
    list_op3 = "sum(larray)/len(larray)"

    print(calculate_time(numpy_op3))
    print(calculate_time(list_op3))

    output:

    446 micro seconds  narray.mean()
    916 micro seconds  sum(larray)/len(larray)

  4. numpy array max vs list max

    numpy_op4 = "narray.max()"
    list_op4 = "sorted(larray,reverse=True)[0]"

    print(calculate_time(numpy_op4))
    print(calculate_time(list_op4))

    output:

    280 micro seconds  narray.max()
    2777 micro seconds  sorted(larray,reverse=True)[0]

  5. Please don't ever sort a list to find the maximum.  That just makes my head hurt.  It's O(n log n) rather than O(n).  Use the min() and max() builtins; that's what they're there for!
  6. Some much needed critique.
    http://www.reddit.com/r/Python/comments/32flst/why_use_numpy_array_instead_of_python_lists/
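
Two of the critiques above (timing a single run, and sorting a list to find its minimum) can be addressed together. The following is a minimal sketch rather than the article's method: it uses the standard library's timeit module to average over repeated runs, and compares the min() builtin against both the sorted() approach and narray.min(). The helper name average_microseconds, the candidates list, and the default of 100 runs are assumptions made for this example.

```python
import timeit

import numpy as np

NITEMS = 100000
narray = np.arange(NITEMS)
larray = list(range(NITEMS))


def average_microseconds(func, runs=100):
    """Return the mean duration of one call to func, in microseconds."""
    # timeit.timeit calls func `runs` times and returns the total in seconds.
    total_seconds = timeit.timeit(func, number=runs)
    return total_seconds / runs * 1e6


# Hypothetical set of expressions to compare; all three compute the minimum.
candidates = [
    ("narray.min()", lambda: narray.min()),
    ("min(larray)", lambda: min(larray)),
    ("sorted(larray)[0]", lambda: sorted(larray)[0]),
]

results = {label: average_microseconds(func) for label, func in candidates}

for label, _ in candidates:
    print("%10.1f micro seconds  %s" % (results[label], label))
```

The absolute numbers depend on the machine and on the Python and numpy versions, so no expected output is shown here. Averaging over many runs mainly reduces the noise that a single measurement is exposed to.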