# NumPy Tutorial For Beginners

## What is ⚡ NumPy

**⚡ NumPy, which stands for Numerical Python, is an open-source library that lets users store large amounts of data using less memory and perform a wide range of operations (mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, etc.) easily using homogeneous one-dimensional and multidimensional arrays.**

The basic data structure of NumPy is an ndarray, which is similar to a list.

💡 An array in NumPy is a data structure organized like a grid of rows and columns, containing values of the same data type that can be indexed and manipulated efficiently as per the requirement of the problem.

## Difference between NumPy and Python standard List

The three most important differences between NumPy arrays and standard Python sequences are:

| | NumPy Array | Python Sequences (list, tuple, range) |
|---|---|---|
| Creation Size | Fixed size | Python list can grow dynamically |
| Datatype | Elements are of the same datatype | Elements can be of multiple datatypes |
| Speed | Fast, as it is partially written in C | Slower compared to NumPy |
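
The first two rows of the comparison can be checked with a short sketch. The variable names here are illustrative only and do not come from the tutorial.

```python
import numpy as np

# A Python list can hold mixed types and grow in place.
py_list = [1, "two", 3.0]
py_list.append(4)
print(len(py_list))            # 4

# A NumPy array has a single dtype; mixed numeric input is upcast to a common type.
arr = np.array([1, 2, 3.5])
print(arr.dtype)               # float64

# An array's size is fixed: np.append returns a new array and leaves the original unchanged.
bigger = np.append(arr, 4.0)
print(arr.size, bigger.size)   # 3 4
```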

## Why use NumPy: Computation time

A Python list can perform all the operations that NumPy arrays perform; NumPy arrays are simply faster ⚡ and more convenient for large, complex computations.

Let's add two sequences of about 9 million elements each and compare the computation time.

```
import time
import numpy as np

# Python standard list
list_A = [i for i in range(1, 9000000)]
list_B = [j**2 for j in range(1, 9000000)]
t0 = time.perf_counter()
sum_list = list(map(lambda x, y: x + y, list_A, list_B))
t1 = time.perf_counter()
list_time = t1 - t0
print("Time taken by Python standard list is", list_time)

# NumPy arrays holding the same values as the lists (int64 avoids overflow when squaring)
array_A = np.arange(1, 9000000, dtype=np.int64)
array_B = np.arange(1, 9000000, dtype=np.int64) ** 2
t0 = time.perf_counter()
sum_numpy = array_A + array_B
t1 = time.perf_counter()
numpy_time = t1 - t0
print("Time taken by NumPy array is", numpy_time)
print("The ratio of time taken is {}".format(list_time // numpy_time))
```

```
Time taken by Python standard list is 0.6801159381866455
Time taken by NumPy array is 0.04106783866882324
The ratio of time taken is 16.0
```

You can see that NumPy is much faster than the list. The table below compares the computation speed of the Python standard list and NumPy on different operations.

| Size of each matrix | Type of operation | Time taken by list | Time taken by NumPy | Ratio (list time / NumPy time) |
|---|---|---|---|---|
| 9 million | Addition (+) | 0.56s | 0.017s | 32.0 |
| 9 million | Subtraction (-) | 0.61s | 0.016s | 36.0 |
| 9 million | Multiplication (*) | 0.69s | 0.016s | 42.0 |
| 9 million | Division (/) | 0.51s | 0.022s | 23.0 |

From the above table, we can conclude that NumPy is much faster than the Python standard list for these element-wise operations. In real-world workloads, where the data can run into billions of elements and the operations are more complex, the difference can be even larger.
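
A comparison like the one in the table could be reproduced with a sketch along these lines. It assumes a smaller size (about 1 million elements) so it runs quickly, and it uses the standard-library `timeit` and `operator` modules, which the original benchmark does not use. Absolute timings depend on the machine, so none are shown here.

```python
import operator
import timeit

import numpy as np

n = 1_000_000  # assumption: smaller than the 9 million used above, to keep the run short
list_a = list(range(1, n))
list_b = list(range(1, n))
array_a = np.arange(1, n, dtype=np.int64)
array_b = np.arange(1, n, dtype=np.int64)

operations = {
    "Addition (+)": operator.add,
    "Subtraction (-)": operator.sub,
    "Multiplication (*)": operator.mul,
    "Division (/)": operator.truediv,
}

results = {}
for name, op in operations.items():
    # Time three runs of the element-wise operation on lists and on arrays.
    list_time = timeit.timeit(lambda: list(map(op, list_a, list_b)), number=3)
    numpy_time = timeit.timeit(lambda: op(array_a, array_b), number=3)
    results[name] = (list_time, numpy_time)
    print(f"{name}: list {list_time:.4f}s, NumPy {numpy_time:.4f}s")
```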

## Installing NumPy

To start working with NumPy, you need to install it; the most reliable way is to follow the instructions on the official NumPy website.

[Optional]: Follow this guide to install Python if you don't already have it installed. It's not required, but it's best to install Python packages inside a virtual environment to avoid version-related conflicts in the future.
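
Once the installation has finished, a quick check like the following sketch confirms that NumPy can be imported in the active environment. The version printed will vary with your installation.

```python
import numpy as np

# If this import succeeds, NumPy is installed in the active environment.
print("NumPy version:", np.__version__)
```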

## Basics of NumPy

As a prerequisite, you will need to know beginner-level Python. See this Python tutorial to refresh your concepts.

An array is an object of the ndarray class of the NumPy library.

Whenever you work with a dataset, the first step is to get an idea of the array that holds it. Four important attributes of a NumPy array that give information about the dataset are:

.ndim: returns the number (int) of dimensions (axes) of the array.

.shape: returns a tuple of **n** rows and **m** columns (n, m).

.size: returns the total number (int) of elements in the array.

.dtype: returns an object of **numpy.dtype** that describes the type of elements in the array.

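The code snippet below prints the attributes described above, along with .itemsize (the size of one element in bytes) and .data (the buffer that holds the array's elements).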

```
array = np.array([[1,2,3],[4,5,6]]) # Creating NumPy array from list
print("Dimension: ",array.ndim, type(array.ndim))
print("Shape: ",array.shape, type(array.shape))
print("Size: ",array.size, type(array.size))
print("Datatype: ",array.dtype, type(array.dtype))
print("Itemsize: ",array.itemsize, type(array.itemsize))
print("Data: ",array.data, type(array.data))
```

```
Dimension: 2 <class 'int'>
Shape: (2, 3) <class 'tuple'>
Size: 6 <class 'int'>
Datatype: int64 <class 'numpy.dtype[int64]'>
Itemsize: 8 <class 'int'>
Data: <memory at 0x7f2d807312b0> <class 'memoryview'>
```
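
These attributes are related to one another, which a small sketch can make concrete. The 3-D array `cube` is a hypothetical example, not part of the tutorial's data; `.nbytes` is an additional ndarray attribute that gives the total bytes used by the elements.

```python
import numpy as np

# Hypothetical 3-D array: 2 blocks, each with 3 rows and 4 columns.
cube = np.arange(24, dtype=np.int64).reshape(2, 3, 4)

print(cube.ndim)      # 3, which equals len(cube.shape)
print(cube.shape)     # (2, 3, 4)
print(cube.size)      # 24, the product of the shape entries
print(cube.itemsize)  # 8 bytes per int64 element
print(cube.nbytes)    # 192, which equals size * itemsize
```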

### Array Creation

A NumPy array is created by passing an array-like data structure, such as a Python list or tuple, to `np.array()`.

Let's create a **0-D**, **1-D**, **2-D**, and a **3-D** array from a scalar and from (nested) lists.

0-D array:

`np.array(11)`

1-D array:

`np.array([1, 2, 3, 4, 5])`

2-D array:

`np.array([[1, 2, 3], [4, 5, 6]])`

3-D array:

`np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])`

```
array_0D = np.array(11)
array_1D = np.array([1, 2, 3, 4, 5])
array_2D = np.array([[1, 2, 3], [4, 5, 6]])
array_3D = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(array_0D)
print(array_1D)
print(array_2D)
print(array_3D)
```

```
11
[1 2 3 4 5]
[[1 2 3]
 [4 5 6]]
[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]

Like a Python standard list, a NumPy array can be created in several ways. Here are 7 of them.

.array([1,2,3]): Returns an array from a list.

.array((1.1,2.2,3.3)): Returns an array from a tuple.

.zeros((2,3)): Returns an array filled with zeros (2 rows, 3 columns).

.ones((2,3)): Returns an array filled with ones (2 rows, 3 columns).

.empty((2,4)): Returns an array of the given shape and type without initializing its entries, so it contains arbitrary data.

.arange(2,10,2): Returns evenly spaced values within a given range. Similar to Python's range().

.linspace(2,4,9): Returns 9 evenly spaced numbers between 2 and 4, both included.

```
array_list = np.array([1,2,3], dtype=int) # From List
array_tuple = np.array((1.1,2.2,3.3)) # From Tuple
array_zeroes = np.zeros((2,3)) # Array of zeroes: 2 rows and 3 columns
array_ones = np.ones((2,3)) # Array of ones: 2 rows and 3 columns
array_empty = np.empty((2,4)) # Uninitialized array (arbitrary values): 2 rows and 4 columns
array_arange = np.arange(2,10,2) # Similar to python range()
array_linspace = np.linspace(2,4,9) # Array of 9 numbers between 2 and 4
```

Just like the **dtype=int** parameter, you can make use of other `np.array` parameters such as **copy**, **order**, **subok**, **ndmin**, and **like**. You can explore the other NumPy array parameters as well.
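
As a rough illustration of three `np.array` parameters (`dtype`, `ndmin`, and `copy`), consider the sketch below. The variable names are made up for this example.

```python
import numpy as np

source = np.array([1, 2, 3])

# dtype: convert the elements to another type while creating the array.
as_float = np.array(source, dtype=float)
print(as_float)        # [1. 2. 3.]

# ndmin: guarantee a minimum number of dimensions.
as_2d = np.array(source, ndmin=2)
print(as_2d.shape)     # (1, 3)

# copy=True (the default): the new array does not share data with the source.
independent = np.array(source, copy=True)
independent[0] = 99
print(source[0])       # 1
```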

Let's practice some methods to create arrays.

💡 Tip: Use `help` to see the syntax when required

```
help(np.zeros)
```

```
...
>>> np.zeros((2, 1))
array([[ 0.],
       [ 0.]])
>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
       [ 0.,  0.]])
>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
array([(0, 0), (0, 0)],
      dtype=[('x', '<i4'), ('y', '<i4')])
```

Create a **1D** array of ones.

```
arr = np.ones(9)
print(arr)
print(arr.dtype)
```

```
[1. 1. 1. 1. 1. 1. 1. 1. 1.]
float64
```

Notice that, by default, NumPy creates the array with the data type **float64**. Let's provide the dtype explicitly.

```
arr = np.ones(9, dtype=int)
print(arr)
print(arr.dtype)
```

```
[1 1 1 1 1 1 1 1 1]
int64
```

Create a **4x3** array of **zeroes**.

```
arr = np.zeros((4,3), dtype=int)
print(arr)
```

```
[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]
```

Create an array of **integers between 3 and 7** (both excluded).

```
arr = np.arange(4,7)
print(arr)
```

```
[4 5 6]
```

Create an array of integers from **5 to 20 with a step of 2**

```
arr = np.arange(5,21,2)
print(arr)
```

```
[ 5 7 9 11 13 15 17 19]
```

Create an array of **10 random integers**, each from 0 up to (but not including) 5.

```
arr = np.random.randint(5,size=10)
print(arr)
```

```
[3 2 2 0 4 0 1 3 2 0]
```

Create an array of **10 random integers between 6 and 9** (both excluded, so each value is 7 or 8).

```
arr = np.random.randint(7,9,size=10)
print(arr)
```

```
[8 8 7 7 8 8 8 7 7 7]
```

Create a **2x3** 2D array of random numbers.

```
arr = np.random.random([2,3])
print(arr)
```

```
[[0.9664729 0.33623868 0.52633769]
[0.80454667 0.68146984 0.08063325]]
```

Create an array of **10 evenly spaced numbers between 1.5 and 2** (both included).

```
arr = np.linspace(1.5,2,10)
print(arr)
```

```
[1.5        1.55555556 1.61111111 1.66666667 1.72222222 1.77777778
 1.83333333 1.88888889 1.94444444 2.        ]
```

That's all for the basic ways of creating arrays. You can also explore these 4 other functions for creating arrays:

.full(): Creates a constant array filled with any number 'n'.

.tile(): Creates a new array by repeating an existing array a given number of times.

.eye(): Creates an identity matrix of any size.

.random.randint(): Creates an array of random integers within a particular range.
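
A minimal sketch of the first three functions follows (`np.random.randint` was already used above). The shapes and fill values are arbitrary choices for illustration.

```python
import numpy as np

# np.full: a 2x3 array in which every element is 7.
constant = np.full((2, 3), 7)
print(constant)

# np.tile: repeat the array [1, 2] three times.
repeated = np.tile(np.array([1, 2]), 3)
print(repeated)        # [1 2 1 2 1 2]

# np.eye: a 3x3 identity matrix (float64 by default).
identity = np.eye(3)
print(identity)
```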

### Basic Operations

NumPy can perform a variety of operations; the most basic include addition, subtraction, and multiplication. Below are a few basic operations that can be done in NumPy without using loops.

**Create** a NumPy array to store the marks of 5 students.

```
marks = [1, 2, 3, 4, 5]
marks_np = np.array(marks)
print(marks_np)
```

```
[1 2 3 4 5]
```

**Add** marks of 5 subjects of two different students.

```
marks_A = [10,20,10,20,14]
marks_B = [23,12,43,12,43]
marks_np_A = np.array(marks_A)
marks_np_B = np.array(marks_B)
total = marks_np_A + marks_np_B # Add using + operator
print(total)
```

```
[33 32 53 32 57]
```

**Convert** the weight of 5 students from kilograms to grams.

```
weight = [45, 55, 53, 63, 60] # In KG
weight_np = np.array(weight)
weight_in_gram = weight_np * 1000 # 1kg = 1000gm
print(weight_in_gram)
```

```
[45000 55000 53000 63000 60000]
```

**Calculate** the BMI of 5 students. To calculate BMI we need to:

Create two arrays of height and weight

Apply the formula

`weight_in_kg / (height_in_m ** 2)`

```
heights_in_inch = [71,72,73,74,75]
weights_in_lbs = [195, 180, 250, 230, 200]
```

First, let's convert the heights from inches to meters and the weights from lbs to kg.

```
height_in_m = np.array(heights_in_inch) * 0.0254
weight_in_kg = np.array(weights_in_lbs) * 0.453592
```

Now that we have converted the arrays into the right units, let's calculate BMI.

```
BMI = weight_in_kg / (height_in_m ** 2)
print("BMI",BMI)
```

```
BMI [27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]
```

Here is a list of 5 common basic methods of a NumPy ndarray:

.sum: returns the sum of elements over a given axis.

.min: returns the minimum along a given axis.

.max: returns the maximum along a given axis.

.cumsum: returns the cumulative sum of elements along a given axis.

.mean: returns the average of elements along a given axis.
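
The effect of the axis argument is easiest to see on a small 2-D array. The sketch below uses hypothetical marks for 2 students in 3 subjects: `axis=0` works down the columns and `axis=1` works across each row.

```python
import numpy as np

# Hypothetical marks: 2 students (rows) x 3 subjects (columns).
marks = np.array([[10, 20, 30],
                  [40, 50, 60]])

print(marks.sum())           # 210, sum of all elements
print(marks.sum(axis=0))     # [50 70 90], total per subject
print(marks.min(axis=1))     # [10 40], lowest mark per student
print(marks.max(axis=1))     # [30 60], highest mark per student
print(marks.mean(axis=0))    # [25. 35. 45.], average per subject

# Running total of each student's marks, row by row.
running_total = marks.cumsum(axis=1)
print(running_total)
```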

NumPy also provides universal functions like **sin**, **cos**, and **exp**; these are also called **ufuncs** and operate on an array element by element.
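
As a small sketch of these universal functions, each call below is applied to every element of its input array without an explicit loop. The input values are arbitrary.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])

sines = np.sin(angles)       # approximately [0, 1, 0]
cosines = np.cos(angles)     # approximately [1, 0, -1]
growth = np.exp(np.array([0.0, 1.0]))   # approximately [1, 2.71828]

print(np.round(sines, 3))
print(np.round(cosines, 3))
print(np.round(growth, 3))
print(type(np.sin))          # <class 'numpy.ufunc'>
```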

### Indexing, Slicing, and Iterating

```
bmi_first_element = BMI[0] # first element
bmi_last_element = BMI[-1] # last element
bmi_first_five_elements = BMI[0:5] # elements 1-5
bmi_last_five_elements = BMI[-5:] # last five elements
```

Filter the BMI array to keep only the values where BMI > 23.

```
# Conditional Filter
BMI_filtered = BMI[BMI > 23]
print(BMI_filtered)
```

```
[27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]
```
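
The same ideas extend to arrays with more than one dimension. The sketch below uses a hypothetical 2x3 array named `grid` to show indexing with a row and column, slicing, conditional filtering, and iterating.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid[1, 2])       # 6, element at row 1, column 2
print(grid[:, 0])       # [1 4], first column
print(grid[0, 1:])      # [2 3], first row from column 1 onward
print(grid[grid > 3])   # [4 5 6], conditional filter returns a 1-D array

# Iterating over a 2-D array yields one row at a time.
for row in grid:
    print(row)

# .flat iterates over every element regardless of the array's shape.
flat_values = [int(value) for value in grid.flat]
print(flat_values)      # [1, 2, 3, 4, 5, 6]
```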

Now you know the basics of working with a NumPy array, and you should be able to create arrays and perform operations on them.