NumPy for fast array computations

NumPy

  • is the main foundation of the scientific Python ecosystem
  • offers a data structure for high-performance numerical computing
  • implements the multidimensional array structure in C and provides a convenient Python interface: together they bring high performance and ease of use.
  • is used by many Python libraries, some of which are built directly on top of NumPy.
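
To make the points above concrete, here is a minimal sketch of the multidimensional array structure. The array a and its values are made up for illustration; ndim, shape and dtype are standard attributes of a NumPy array.

```python
import numpy as np

# A small 2-dimensional array (2 rows, 3 columns); the values are arbitrary
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(a.ndim)   # number of dimensions: 2
print(a.shape)  # size along each dimension: (2, 3)
print(a.dtype)  # all elements share one type: float64
```

Unlike a Python list, every element of the array has the same type, which is part of what lets NumPy run its loops in compiled C code.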

We will now illustrate with an example how fast computations with NumPy can be.

First, we import Python's built-in random module and NumPy:

In [1]:
import random
In [2]:
import numpy as np

We generate two Python lists, x and y, each containing 1 million random numbers between 0 and 1:

In [3]:
n = 1000000
In [4]:
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]
In [5]:
x[:3], y[:3]
Out[5]:
([0.7345096443143712, 0.3600792643126497, 0.6075298300342545],
 [0.5509167343998912, 0.2703683929229056, 0.08152056247823536])

We now compute the element-wise average of these numbers: the average of the first element of x and the first element of y, and so on.

In [6]:
z = [(x[i] + y[i]) / 2 for i in range(n)]

How long does this computation take? On my computer, it took about 165 ms.
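
One possible way to measure this yourself is sketched below with the standard timeit module. This is an assumption about how the timing could be done, not necessarily how the figure above was obtained; the helper average_lists and the choice of 5 repetitions are made up for illustration, and the result will vary from machine to machine.

```python
import random
import timeit

n = 1000000
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]

def average_lists():
    # Element-wise average computed with a pure Python list comprehension
    return [(x[i] + y[i]) / 2 for i in range(n)]

# Run the computation 5 times and keep the fastest run,
# which is the one least disturbed by other activity on the machine
runs = timeit.repeat(average_lists, number=1, repeat=5)
best_ms = min(runs) * 1000
print(f"pure Python: {best_ms:.0f} ms")
```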

Let's now do the same with the NumPy library and compare the execution time.

We first convert x and y into NumPy arrays with np.array:

In [7]:
xa = np.array(x)
ya = np.array(y)
In [8]:
xa[:3], ya[:3]
Out[8]:
(array([0.73450964, 0.36007926, 0.60752983]),
 array([0.55091673, 0.27036839, 0.08152056]))
In [9]:
za = (xa + ya)/2

This time, the same computation took only 24 ms, roughly seven times faster than the pure Python version.
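
The comparison can be reproduced with a sketch like the following, which times both versions and checks that they give the same result. The variable names t_list, t_numpy and same are made up for illustration, and the timings you get will depend on your machine and NumPy version.

```python
import random
import timeit

import numpy as np

n = 1000000
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]
xa = np.array(x)
ya = np.array(y)

# Best of 5 runs for each version, in seconds
t_list = min(timeit.repeat(lambda: [(x[i] + y[i]) / 2 for i in range(n)],
                           number=1, repeat=5))
t_numpy = min(timeit.repeat(lambda: (xa + ya) / 2, number=1, repeat=5))

# Both approaches should produce the same numbers
z = [(x[i] + y[i]) / 2 for i in range(n)]
za = (xa + ya) / 2
same = np.allclose(z, za)

print(f"pure Python: {t_list * 1000:.0f} ms")
print(f"NumPy:       {t_numpy * 1000:.0f} ms")
print("same result:", same)
```

The NumPy expression (xa + ya) / 2 contains no explicit Python loop: the addition and the division are each applied to the whole array at once in compiled code.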