Introduction to NumPy

Objective: Gain practical skills in manipulating and analyzing data using NumPy, one of Python’s most widely used libraries for data science.

This tutorial includes step-by-step examples to familiarize you with the essential functionality of NumPy. Each example demonstrates key concepts and techniques for data analysis. To execute the code, select the cell and press SHIFT-ENTER.

What is NumPy?

NumPy, short for ‘Numerical Python’, is a fundamental package for scientific computing in Python. It provides an efficient multi-dimensional array type, which also serves for matrices, and a wide range of mathematical functions that operate on these arrays.

Creating ndarray

The primary data structure in NumPy is the ndarray, a multi-dimensional array whose elements all share one data type, which is what makes array operations fast. This section demonstrates how to create ndarrays from lists and tuples, showing how easily Python’s standard data structures convert to NumPy arrays. You’ll create both one-dimensional and multi-dimensional arrays, which serve as the foundation for more complex numerical computations.

import numpy as np

# Creating a 1-dimensional array
oneDArray = np.array([5.5, 6, 7, 8, 9])
print(oneDArray)
print(f"Dimensions = {oneDArray.ndim}")
print(f"Shape = {oneDArray.shape}")
print(f"Size = {oneDArray.size}")
print(f"Array type = {oneDArray.dtype}\n")

# Creating a 2-dimensional array
twoDArray = np.array([[10, 11], [12, 13], [14, 15], [16, 17]])
print(twoDArray)
print(f"Dimensions = {twoDArray.ndim}")
print(f"Shape = {twoDArray.shape}")
print(f"Size = {twoDArray.size}")
print(f"Array type = {twoDArray.dtype}\n")

# Creating an ndarray from a list of tuples (the mixed types are all converted to strings)
tupleArray = np.array([(10, 'x', 4.0), (20, 'y', 5.5)])
print(tupleArray)
print(f"Dimensions = {tupleArray.ndim}")
print(f"Shape = {tupleArray.shape}")
print(f"Size = {tupleArray.size}")
[5.5 6.  7.  8.  9. ]
Dimensions = 1
Shape = (5,)
Size = 5
Array type = float64

[[10 11]
 [12 13]
 [14 15]
 [16 17]]
Dimensions = 2
Shape = (4, 2)
Size = 8
Array type = int64

[['10' 'x' '4.0']
 ['20' 'y' '5.5']]
Dimensions = 2
Shape = (2, 3)
Size = 6
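
The last array above prints every value in quotes because an ndarray holds a single data type, so the mixed integers, strings, and floats were all converted to strings. The sketch below shows two possible ways to keep the original types. It is an illustration, not the only approach, and the field names `id`, `label`, and `value` are hypothetical names chosen for this example.

```python
import numpy as np

# Mixed Python types are coerced to one common dtype; here everything becomes a string.
mixed = np.array([(10, 'x', 4.0), (20, 'y', 5.5)])
print(mixed.dtype.kind)  # 'U' denotes a Unicode string dtype

# Option 1: dtype=object stores references to the original Python values.
as_objects = np.array([(10, 'x', 4.0), (20, 'y', 5.5)], dtype=object)
print(type(as_objects[0, 0]), type(as_objects[0, 2]))

# Option 2: a structured dtype gives each field its own type.
# The field names below are hypothetical, chosen only for this example.
record_type = np.dtype([('id', 'i8'), ('label', 'U1'), ('value', 'f8')])
records = np.array([(10, 'x', 4.0), (20, 'y', 5.5)], dtype=record_type)
print(records['value'])  # the 'value' field as a float64 array
```

Object arrays give up most of NumPy's speed advantage, so a structured dtype (or a separate array per column) is usually the better fit for numeric work.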

The following examples use NumPy functions to create arrays with random values, integers in a range, reshaped matrices, and equally spaced values in specified intervals.

print("Uniformly distributed random numbers:")
print(np.random.rand(4))  # Generates 4 random numbers uniformly distributed over [0, 1)

print("\nNormally distributed random numbers:")
print(np.random.randn(4))  # Generates 4 random numbers from the standard normal distribution (mean 0, std 1)

print("\nSequential integers with a custom step:")
print(np.arange(-5, 5, 1))  # Creates an array of integers from -5 to 4

print("\nMatrix formed by reshaping a sequence:")
print(np.arange(16).reshape(4, 4))  # Reshapes an array of 0-15 into a 4x4 matrix

print("\nEqually spaced values in a given range:")
print(np.linspace(0, 2, 5))  # Returns 5 equally spaced points over [0, 2], endpoints included

print("\nLogarithmically spaced values:")
print(np.logspace(-2, 2, 5))  # Returns 5 values from 10^-2 to 10^2, equally spaced on a log scale
Uniformly distributed random numbers:
[0.39591115 0.36709992 0.55659577 0.56889963]

Normally distributed random numbers:
[-1.12448484 -1.75597869 -0.70779256 -0.22205479]

Sequential integers with a custom step:
[-5 -4 -3 -2 -1  0  1  2  3  4]

Matrix formed by reshaping a sequence:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]

Equally spaced values in a given range:
[0.  0.5 1.  1.5 2. ]

Logarithmically spaced values:
[1.e-02 1.e-01 1.e+00 1.e+01 1.e+02]
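
The random values above change on every run. When repeatable results are needed, one option is NumPy's `Generator` interface with a fixed seed. The sketch below assumes an arbitrary seed of 42 and also contrasts how `linspace` and `arange` treat the end of the interval.

```python
import numpy as np

# Assumption: the seed value 42 is arbitrary; any fixed integer gives repeatable draws.
rng = np.random.default_rng(42)
uniform_draws = rng.random(4)          # uniform over [0, 1)
normal_draws = rng.standard_normal(4)  # mean 0, standard deviation 1

# Re-creating the generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(42)
print(np.array_equal(uniform_draws, rng_again.random(4)))

# linspace includes the stop value by default; arange never does.
print(np.linspace(0, 2, 5))
print(np.arange(0, 2, 0.5))
```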

The following examples show how to create matrices filled with zeros or ones and an identity matrix using NumPy functions.

print("Matrix filled with zeros:")
print(np.zeros((3, 4)))  # Creates a 3x4 matrix of zeros

print("\nMatrix filled with ones:")
print(np.ones((4, 3)))  # Creates a 4x3 matrix of ones

print("\nIdentity matrix:")
print(np.eye(4))  # Creates a 4x4 identity matrix
Matrix filled with zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Matrix filled with ones:
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

Identity matrix:
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
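
The outputs above contain floats because these functions default to float64. As a small, non-exhaustive sketch, the `dtype` argument changes that, `np.full` fills with any constant, and the `*_like` functions reuse the shape and dtype of an existing array. The fill value 7 and the array name `template` are arbitrary choices for illustration.

```python
import numpy as np

# zeros/ones return float64 by default; the dtype argument overrides that.
int_zeros = np.zeros((2, 3), dtype=int)

# np.full fills a new array with a constant (7 is an arbitrary example value).
sevens = np.full((2, 3), 7)

# The *_like functions copy the shape and dtype of an existing array.
template = np.arange(6).reshape(2, 3)
ones_like_template = np.ones_like(template)

print(int_zeros.dtype)
print(sevens)
print(ones_like_template)
```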

Element-wise Operations in NumPy

NumPy supports element-wise operations on ndarrays using standard arithmetic operators. These operations are concise and fast because NumPy applies them to the whole array in compiled code rather than in a Python loop. This section demonstrates common arithmetic operations such as addition, subtraction, multiplication, and division applied element by element.

a = np.array([10, 20, 30, 40, 50])

print(f"a = {a}")
print(f"a + 5 = {a + 5}")  # Adds 5 to each element
print(f"a - 5 = {a - 5}")  # Subtracts 5 from each element
print(f"a * 2 = {a * 2}")  # Multiplies each element by 2
print(f"a // 3 = {a // 3}")  # Floor division of each element by 3
print(f"a ** 2 = {a**2}")  # Squares each element
print(f"a % 4 = {a % 4}")  # Modulo operation on each element
print(f"1 / a = {1 / a}")  # Divides 1 by each element

b = np.array([5, 15, 25, 35, 45])
c = np.array([2, 4, 6, 8, 10])

print(f"b = {b}")
print(f"c = {c}")
print(f"b + c = {b + c}")  # Element-wise addition
print(f"b - c = {b - c}")  # Element-wise subtraction
print(f"b * c = {b * c}")  # Element-wise multiplication
print(f"b / c = {b / c}")  # Element-wise division
print(f"b // c = {b // c}")  # Element-wise floor division
print(f"b ** c = {b**c}")  # Element-wise exponentiation
a = [10 20 30 40 50]
a + 5 = [15 25 35 45 55]
a - 5 = [ 5 15 25 35 45]
a * 2 = [ 20  40  60  80 100]
a // 3 = [ 3  6 10 13 16]
a ** 2 = [ 100  400  900 1600 2500]
a % 4 = [2 0 2 0 2]
1 / a = [0.1        0.05       0.03333333 0.025      0.02      ]
b = [ 5 15 25 35 45]
c = [ 2  4  6  8 10]
b + c = [ 7 19 31 43 55]
b - c = [ 3 11 19 27 35]
b * c = [ 10  60 150 280 450]
b / c = [2.5        3.75       4.16666667 4.375      4.5       ]
b // c = [2 3 4 4 4]
b ** c = [               25             50625         244140625     2251875390625
 34050628916015625]
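
Operations such as `a + 5` work because NumPy stretches the scalar to match the array, a mechanism called broadcasting. The same rule applies between arrays of different but compatible shapes. The sketch below uses made-up arrays to show a row and a column being combined with a 2x3 matrix.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
column = np.array([[100], [200]])  # shape (2, 1)

# The row is added to every row of the matrix.
row_sum = matrix + row
print(row_sum)

# The column is added to every column of the matrix.
column_sum = matrix + column
print(column_sum)
```

Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1.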

Indexing and Slicing in NumPy Arrays

A NumPy array offers a variety of ways to select specific elements or subsets. Note that assigning an array to a new variable creates another reference to the same array, and a slice is a view that shares data with the original, not a separate copy. To get an independent copy, call the .copy() method explicitly; otherwise changes made through the new variable also change the original data.

a = np.arange(10, 20)
print(f"Original a = {a}")

# b is a slice, pointing to a subarray in a
b = a[2:5]
print(f"Initial b (slice of a) = {b}")
b[:] = 999  # Modifying b will change a
print(f"Modified b = {b}")
print(f"a (after modifying b) = {a}\n")

# c is a separate copy of the subarray
c = a[2:5].copy()
print(f"Original a = {a}")
print(f"Initial c (copy of a slice of a) = {c}")
c[:] = 333  # Modifying c does not affect a
print(f"Modified c = {c}")
print(f"a remains unchanged = {a}")
Original a = [10 11 12 13 14 15 16 17 18 19]
Initial b (slice of a) = [12 13 14]
Modified b = [999 999 999]
a (after modifying b) = [ 10  11 999 999 999  15  16  17  18  19]

Original a = [ 10  11 999 999 999  15  16  17  18  19]
Initial c (copy of a slice of a) = [999 999 999]
Modified c = [333 333 333]
a remains unchanged = [ 10  11 999 999 999  15  16  17  18  19]
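
To check whether two arrays share data without modifying them, `np.shares_memory` and the `.base` attribute can be used. This is a minimal sketch, and the variable names `alias`, `view`, and `copied` are chosen here for illustration.

```python
import numpy as np

a = np.arange(10, 20)
alias = a               # plain assignment: a second name for the same array object
view = a[2:5]           # basic slice: a new array object that shares a's data
copied = a[2:5].copy()  # independent copy with its own data

print(alias is a)                   # the same object
print(np.shares_memory(a, view))    # the view shares a's buffer
print(np.shares_memory(a, copied))  # the copy does not
print(view.base is a)               # .base points to the array that owns the data
```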

The following example contrasts how element retrieval varies between a standard Python list and a NumPy ndarray, highlighting the indexing syntax that is specific to NumPy.

# Creating a new 2D list
myNew2dList = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
print(f"New 2D List: {myNew2dList}")
print(f"Second row of 2D List: {myNew2dList[1]}")
# Note: a list of lists can be indexed one level at a time, but it cannot be sliced by column directly

# Converting the 2D list to a numpy array
myNew2dArray = np.array(myNew2dList)
print("\nConverted to numpy 2D array:")
print(f"{myNew2dArray}\n")

# Demonstrating numpy's multi-dimensional indexing and slicing
print(f"Second row in numpy array: myNew2dArray[1][:]= {myNew2dArray[1][:]}")
print(f"Second row in numpy array: myNew2dArray[1, :]= {myNew2dArray[1, :]}")
print(f"Third column in numpy array: myNew2dArray[:, 2]= {myNew2dArray[:, 2]}")
print("Top right 2x2 subarray:")
print(f"myNew2dArray[:2, 1:] =\n {myNew2dArray[:2, 1:]}")
New 2D List: [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
Second row of 2D List: [40, 50, 60]

Converted to numpy 2D array:
[[10 20 30]
 [40 50 60]
 [70 80 90]]

Second row in numpy array: myNew2dArray[1][:]= [40 50 60]
Second row in numpy array: myNew2dArray[1, :]= [40 50 60]
Third column in numpy array: myNew2dArray[:, 2]= [30 60 90]
Top right 2x2 subarray:
myNew2dArray[:2, 1:] =
 [[20 30]
 [50 60]]
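
To make the contrast concrete, the sketch below extracts the third column from both structures. The variable names are chosen for this illustration and do not come from the example above.

```python
import numpy as np

grid_list = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# A list of lists has no column slice; each row has to be visited explicitly.
third_column_list = [row[2] for row in grid_list]

# An ndarray accepts one index or slice per dimension.
grid_array = np.array(grid_list)
third_column_array = grid_array[:, 2]

print(third_column_list)
print(third_column_array)

# grid_list[:, 2] would raise a TypeError, because list indices
# must be integers or slices, not tuples.
```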

The following examples illustrate how to use boolean indexing to select elements from a NumPy array.

# Creating a different numpy 2D array
newArray = np.arange(10, 22).reshape(3, 4)
print(f"New 2D Array:\n{newArray}")

# Boolean indexing to filter elements greater than 15
greaterThan15 = newArray[newArray > 15]
print(f"Elements greater than 15:\n{greaterThan15}")

# Boolean indexing on the first row for elements less than 13
lessThan13FirstRow = newArray[0, newArray[0, :] < 13]
print(f"Elements in the first row less than 13:\n{lessThan13FirstRow}")
New 2D Array:
[[10 11 12 13]
 [14 15 16 17]
 [18 19 20 21]]
Elements greater than 15:
[16 17 18 19 20 21]
Elements in the first row less than 13:
[10 11 12]
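
Boolean masks can also be combined and used for assignment. The sketch below reuses the same 3x4 layout of values; the thresholds and the replacement value 0 are arbitrary choices for illustration.

```python
import numpy as np

values = np.arange(10, 22).reshape(3, 4)

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required.
between = values[(values > 12) & (values < 18)]
print(between)

# A boolean mask can also select the elements to overwrite.
clipped = values.copy()
clipped[clipped > 18] = 0
print(clipped)

# np.where returns the row and column positions at which the condition holds.
rows, cols = np.where(values % 5 == 0)
print(rows, cols)
```

Python's `and` and `or` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.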

More indexing examples.

# Creating a numpy 2D array with a different shape
my2darr = np.arange(10, 22).reshape(4, 3)
print(f"2D Array:\n{my2darr}")

# Shuffling rows using specified indices
indices = [1, 3, 2, 0]
print(f"Row indices for shuffling: {indices}\n")
shuffledRows = my2darr[indices, :]
print(f"Shuffled Rows:\n{shuffledRows}")

# Selecting specific elements using row and column indices
rowIndex = [3, 2, 1, 0, 0]
columnIndex = [1, 2, 0, 1, 2]
print(f"\nRow Indices: {rowIndex}")
print(f"Column Indices: {columnIndex}\n")
selectedElements = my2darr[rowIndex, columnIndex]
print(f"Selected Elements: {selectedElements}")
2D Array:
[[10 11 12]
 [13 14 15]
 [16 17 18]
 [19 20 21]]
Row indices for shuffling: [1, 3, 2, 0]

Shuffled Rows:
[[13 14 15]
 [19 20 21]
 [16 17 18]
 [10 11 12]]

Row Indices: [3, 2, 1, 0, 0]
Column Indices: [1, 2, 0, 1, 2]

Selected Elements: [20 18 13 11 12]
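
Two details of indexing with lists are easy to miss: the result is a copy rather than a view, and paired index lists select individual elements rather than a block. The sketch below illustrates both on the same 4x3 layout, with `np.ix_` shown as one way to select a block instead.

```python
import numpy as np

grid = np.arange(10, 22).reshape(4, 3)

# Indexing with a list of indices returns a copy, unlike a basic slice.
picked = grid[[1, 3], :]
picked[:] = 0
print(grid[1])  # the original row is unchanged

# Paired index lists pick individual elements: positions (0, 0) and (3, 2).
pairs = grid[[0, 3], [0, 2]]
print(pairs)

# np.ix_ builds the cross product instead: rows 0 and 3, columns 0 and 2.
block = grid[np.ix_([0, 3], [0, 2])]
print(block)
```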

NumPy Arithmetic and Statistical Functions

The following examples illustrate the use of numpy functions for basic arithmetic operations on arrays and performing common statistical calculations.

z = np.array([2.5, -1.7, 3.3, -4.4, 5.1])
print(f"Array z: {z}\n")

print(f"Absolute values: np.abs(z) = {np.abs(z)}")
print(f"Square roots: np.sqrt(np.abs(z)) = {np.sqrt(np.abs(z))}")
print(f"Signs of z: np.sign(z) = {np.sign(z)}")
print(f"Exponential function: np.exp(z) = {np.exp(z)}")
print(f"Sorted array: np.sort(z) = {np.sort(z)}\n")

a = np.linspace(-2, 2, 5)
b = np.random.rand(5)
print(f"Array a: {a}")
print(f"Array b: {b}\n")

print(f"Element-wise addition: np.add(a, b) = {np.add(a, b)}")
print(f"Element-wise subtraction: np.subtract(a, b) = {np.subtract(a, b)}")
print(f"Element-wise multiplication: np.multiply(a, b) = {np.multiply(a, b)}")
print(f"Element-wise division: np.divide(a, b) = {np.divide(a, b)}")
print(f"Element-wise maximum: np.maximum(a, b) = {np.maximum(a, b)}\n")

z = np.array([4.5, -2.3, 1.1, 3.8, -4.6])
print(f"Array z: {z}\n")

print(f"Minimum value: np.min(z) = {np.min(z)}")
print(f"Maximum value: np.max(z) = {np.max(z)}")
print(f"Average value: np.mean(z) = {np.mean(z)}")
print(f"Standard deviation: np.std(z) = {np.std(z)}")
print(f"Sum of elements: np.sum(z) = {np.sum(z)}")
Array z: [ 2.5 -1.7  3.3 -4.4  5.1]

Absolute values: np.abs(z) = [2.5 1.7 3.3 4.4 5.1]
Square roots: np.sqrt(np.abs(z)) = [1.58113883 1.30384048 1.81659021 2.0976177  2.25831796]
Signs of z: np.sign(z) = [ 1. -1.  1. -1.  1.]
Exponential function: np.exp(z) = [1.21824940e+01 1.82683524e-01 2.71126389e+01 1.22773399e-02
 1.64021907e+02]
Sorted array: np.sort(z) = [-4.4 -1.7  2.5  3.3  5.1]

Array a: [-2. -1.  0.  1.  2.]
Array b: [0.51224953 0.46271901 0.67527709 0.6415085  0.0118122 ]

Element-wise addition: np.add(a, b) = [-1.48775047 -0.53728099  0.67527709  1.6415085   2.0118122 ]
Element-wise subtraction: np.subtract(a, b) = [-2.51224953 -1.46271901 -0.67527709  0.3584915   1.9881878 ]
Element-wise multiplication: np.multiply(a, b) = [-1.02449906 -0.46271901  0.          0.6415085   0.02362439]
Element-wise division: np.divide(a, b) = [ -3.90434718  -2.16113878   0.           1.55882581 169.31652642]
Element-wise maximum: np.maximum(a, b) = [0.51224953 0.46271901 0.67527709 1.         2.        ]

Array z: [ 4.5 -2.3  1.1  3.8 -4.6]

Minimum value: np.min(z) = -4.6
Maximum value: np.max(z) = 4.5
Average value: np.mean(z) = 0.5
Standard deviation: np.std(z) = 3.495711658589707
Sum of elements: np.sum(z) = 2.5
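
The statistical functions above reduce the whole array to one number. On multi-dimensional arrays the `axis` argument restricts the reduction to rows or columns, and `np.std` computes the population standard deviation unless `ddof=1` is passed. The `scores` array in this sketch is made-up data for illustration.

```python
import numpy as np

scores = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

total = np.sum(scores)               # all elements
column_sums = np.sum(scores, axis=0) # one sum per column
row_means = np.mean(scores, axis=1)  # one mean per row
print(total, column_sums, row_means)

z = np.array([4.5, -2.3, 1.1, 3.8, -4.6])
population_std = np.std(z)       # divides by n
sample_std = np.std(z, ddof=1)   # divides by n - 1
print(population_std, sample_std)
```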

NumPy Linear Algebra

The following examples illustrate various linear algebra operations in NumPy.

# Creating a random 3x2 matrix
A = np.random.randn(3, 2)
print(f"Random matrix A:\n{A}\n")
print(f"Transpose of A, A.T:\n{A.T}\n")  # Transpose of A

# Creating a random vector of length 2
z = np.random.randn(2)
print(f"Random vector z: {z}\n")

# Matrix-vector multiplication
print("Matrix-vector multiplication A.dot(z):")
print(f"{A.dot(z)}\n")

# Matrix-matrix multiplication
print("Matrix-matrix product A.dot(A.T):")
print(f"{A.dot(A.T)}")
print(f"\nA.T.dot(A):\n{A.T.dot(A)}")

# Creating a square matrix
B = np.random.randn(4, 4)
print(f"\nSquare matrix B:\n{B}")

# Inverse of the square matrix
invB = np.linalg.inv(B)
print(f"Inverse of B:\n{invB}\n")

# Determinant of the square matrix
detB = np.linalg.det(B)
print(f"Determinant of B: {detB}")

# Eigenvalues and eigenvectors of the square matrix
eigenvalues, eigenvectors = np.linalg.eig(B)
print(f"Eigenvalues of B:\n{eigenvalues}")
print(f"Eigenvectors of B:\n{eigenvectors}")
Random matrix A:
[[ 0.28981895  0.07727174]
 [ 0.16631152  1.44620734]
 [-0.34689874  0.35076048]]

Transpose of A, A.T:
[[ 0.28981895  0.16631152 -0.34689874]
 [ 0.07727174  1.44620734  0.35076048]]

Random vector z: [-0.47046493  0.04874463]

Matrix-vector multiplication A.dot(z):
[-0.13258307 -0.00774889  0.18030138]

Matrix-matrix product A.dot(A.T):
[[ 0.08996595  0.15995119 -0.07343396]
 [ 0.15995119  2.11917519  0.44957913]
 [-0.07343396  0.44957913  0.24337165]]

A.T.dot(A):
[[0.23199328 0.14123739]
 [0.14123739 2.22051951]]

Square matrix B:
[[-1.72292888  0.98959157  0.70769872 -0.84603348]
 [ 0.52683155 -0.56854281  0.02950835  0.28093564]
 [-0.22432955  1.76899462  0.12286362 -0.30401368]
 [ 0.55374889 -0.00507599  0.82779746 -1.05828807]]
Inverse of B:
[[-0.44706619  0.2326002   0.32578315  0.32555986]
 [-0.02155429  0.3170635   0.67898383 -0.09365152]
 [ 1.15929747  4.10803497  0.67168913 -0.02921107]
 [ 0.67298359  3.33350975  0.69260735 -0.79697305]]

Determinant of B: -0.889580527673561
Eigenvalues of B:
[ 0.44772238+0.j         -1.49391189+0.81316319j -1.49391189-0.81316319j
 -0.68679474+0.j        ]
Eigenvectors of B:
[[ 0.18142329+0.j         -0.23646964+0.59953388j -0.23646964-0.59953388j
  -0.37052008+0.j        ]
 [ 0.25747335+0.j          0.14071072-0.22524417j  0.14071072+0.22524417j
  -0.18381953+0.j        ]
 [ 0.80240484+0.j         -0.18365148+0.23726873j -0.18365148-0.23726873j
   0.56656601+0.j        ]
 [ 0.50689204+0.j          0.65121616+0.j          0.65121616-0.j
   0.71269082+0.j        ]]
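
Results like these can be checked numerically. The sketch below verifies an inverse and an eigendecomposition with `np.allclose`, and uses the `@` operator, which gives the same result as `.dot()` for 2-D arrays. It draws its own random matrix with an arbitrary fixed seed of 0, so the values differ from the output above.

```python
import numpy as np

# Assumption: the fixed seed 0 is used only to make this example repeatable.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))

# B times its inverse should be the identity, up to rounding error.
inv_B = np.linalg.inv(B)
print(np.allclose(B @ inv_B, np.eye(4)))

# Column j of `eigenvectors` pairs with eigenvalues[j], so B v = lambda v
# holds column by column; broadcasting scales each column by its eigenvalue.
eigenvalues, eigenvectors = np.linalg.eig(B)
print(np.allclose(B @ eigenvectors, eigenvectors * eigenvalues))

# To solve B x = y, np.linalg.solve avoids forming the inverse explicitly.
y = rng.standard_normal(4)
x = np.linalg.solve(B, y)
print(np.allclose(B @ x, y))
```

Complex eigenvalues, as in the output above, are expected for a non-symmetric real matrix and always appear in conjugate pairs.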