CS602 – Data-Driven Development with – Fall’19 Handout 9
NumPy package
The NumPy package (https://docs.scipy.org/doc/numpy/) provides efficient operations on arrays of numerical data. Many other packages, including those for scientific computation, are built on NumPy.
import numpy
import numpy as np
The package defines the following data types:
integer int8, int16, int32, int64, uint8, . . .
float float16, float32, float64, . . .
complex complex64, complex128, . . .
boolean bool_ (also available as bool8 in older NumPy releases)
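As a small sketch of how these type names are used (the variable names here are illustrative and not part of the handout), a dtype can be requested when an array is created, and astype converts an existing array to another type:

```python
import numpy as np

# Request a specific element type at creation time.
small = np.array([1, 2, 3], dtype=np.int8)

# Without an explicit dtype, NumPy infers one from the values.
inferred = np.array([1.5, 2, 3])

# astype returns a new array converted to the requested type.
as_float = small.astype(np.float32)

print(small.dtype, small.itemsize)   # int8 uses 1 byte per element
print(inferred.dtype)                # float64
print(as_float.dtype)                # float32
```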
NDARRAY
An ndarray is a multidimensional, homogeneous array of items (a.k.a. a NumPy array).
>>> import numpy as np
>>> a = np.array([10, 20, 30, 40, 50])
>>> type(a)
numpy.ndarray
>>> a
array([10, 20, 30, 40, 50])
>>> a*2
array([ 20,  40,  60,  80, 100])
>>> a
array([10, 20, 30, 40, 50])
>>> a = a*3
>>> a
array([ 30,  60,  90, 120, 150])
>>> b = np.array(range(1, 6))
>>> b
array([1, 2, 3, 4, 5])
>>> c = a + b
>>> c
array([ 31,  62,  93, 124, 155])
>>> e = np.array([(1.5,2,3), (4,5,6)])
>>> e
array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])
>>> e[1,1]
5.0
>>> e[1][1]
5.0
>>> type(e)
numpy.ndarray
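The session above shows arithmetic acting on every element of an ndarray. As a rough comparison with plain Python lists (the names values, arr and mixed are made up for this sketch), the same operators behave quite differently, and the "homogeneous" property means mixed inputs are converted to one common type:

```python
import numpy as np

values = [10, 20, 30]
arr = np.array(values)

# For a list, * means repetition and + means concatenation.
print(values * 2)            # [10, 20, 30, 10, 20, 30]
print(values + [1, 2, 3])    # [10, 20, 30, 1, 2, 3]

# For an ndarray, the same operators act element by element.
doubled = arr * 2
summed = arr + np.array([1, 2, 3])
print(doubled)               # [20 40 60]
print(summed)                # [11 22 33]

# Homogeneous: mixing ints and a float converts every item to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)           # float64
```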
NDARRAY ATTRIBUTES
The more important attributes of an ndarray object are:
ndarray.ndim - the number of axes (dimensions) of the array.
ndarray.shape - the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
ndarray.size - the total number of elements of the array. This is equal to the product of the elements of shape.
ndarray.dtype - an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. Additionally, NumPy provides types of its own.
>>> e
array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])
>>> e.ndim
2 # two dimensions
>>> e.shape
(2, 3) # 2 rows, 3 columns
>>> e.dtype
dtype('float64') # type of each element
>>> e.size
6 # number of elements = 2*3
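The relationships between these attributes can be checked directly. The sketch below uses a hypothetical 3-D array t (not part of the handout) to show that ndim is the length of shape and size is the product of its entries:

```python
import numpy as np

# A hypothetical 3-D array: 2 blocks, each with 3 rows and 4 columns.
t = np.zeros((2, 3, 4))

print(t.ndim)    # 3
print(t.shape)   # (2, 3, 4)
print(t.size)    # 24

# ndim is the length of shape; size is the product of the entries of shape.
print(t.ndim == len(t.shape))       # True
print(t.size == np.prod(t.shape))   # True
```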
CREATE NDARRAY
There are a number of ways to create and initialize new NumPy arrays, for example:
– from a Python list or tuple
– using functions that are dedicated to generating NumPy arrays, such as arange, linspace, random, randint, etc.
– by reading data from files
** Note - the descriptions below are not complete.
All of the functions below return an ndarray.
Function | Description
numpy.loadtxt(filepath) | Returns an ndarray of elements read from a file. Each line of the file designates a row.
numpy.linspace(start, stop, num=50) | Returns an ndarray of num evenly spaced samples, calculated over the interval [start, stop].
numpy.arange([start,] stop, [step]) | Returns an ndarray of evenly spaced values. For floating-point arguments, the length of the result is ceil((stop - start)/step). Because of floating-point round-off, this rule may result in the last element being greater than stop.
numpy.full(shape, value) | Returns an ndarray of the given shape, filled with value.
numpy.zeros(shape, dtype=float) | Returns an ndarray of zeros with the given shape and dtype.
numpy.ones(shape, dtype=float) | Returns an ndarray of ones with the given shape and dtype.
numpy.zeros_like(a, dtype=None), numpy.ones_like(a, dtype=None) | Returns an ndarray of zeros/ones with the same shape and type as the given array a.
numpy.random.randint(low, high=None, size=None) | Returns a size-shaped ndarray of random integers from low (inclusive) to high (exclusive), or a single such random int if size is not provided. size is an optional int or tuple of ints defining the shape of the ndarray.
numpy.random.choice(a, size=None, p=None) | Returns a size-shaped ndarray of randomly picked elements from a. size is an optional int or tuple of ints defining the shape of the ndarray. p provides the probability associated with each element of a; the probabilities must add up to 1.
numpy.reshape(a, shape) | Returns an ndarray with the same data as a, but with dimensions according to the specified shape.
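A brief sketch of the deterministic functions listed above may help; the argument values and variable names are chosen only for illustration:

```python
import numpy as np

# 5 evenly spaced samples over [0, 1]; both endpoints are included.
samples = np.linspace(0, 1, num=5)
print(samples)            # [0.   0.25 0.5  0.75 1.  ]

# Values from 2 up to (but not including) 11, in steps of 3.
stepped = np.arange(2, 11, 3)
print(stepped)            # [2 5 8]

# A 2x3 array in which every element is 7.
sevens = np.full((2, 3), 7)
print(sevens)

# A 2x2 array of integer zeros (the default dtype would be float).
int_zeros = np.zeros((2, 2), dtype=int)
print(int_zeros)

# An array of ones with the same shape and dtype as an existing array.
template = np.array([[1.5, 2.0], [3.0, 4.0]])
like_ones = np.ones_like(template)
print(like_ones)

# The six values 0..5 rearranged into 2 rows and 3 columns.
grid = np.reshape(np.arange(6), (2, 3))
print(grid)
```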
>>> np.random.seed() # to start off the randomizer
>>> h = np.random.randint(100, 120, 5)
>>> h
array([106, 113, 116, 113, 106])
>>> k = np.random.randint(100, 120, size = (3, 6))
>>> k
array([[101, 117, 112, 112, 112, 101],
       [119, 103, 107, 101, 100, 109],
       [106, 107, 112, 114, 115, 102]])
>>> m = np.random.choice(['red', 'blue', 'green'], size = [2,3])
>>> m
array([['red', 'green', 'red'],
       ['red', 'blue', 'blue']], dtype='<U5')
>>> n = np.random.choice(['red', 'blue', 'green'], size = [4,7], p = (.1, .6, .3))
>>> n
array([['green', 'green', 'green', 'blue', 'blue', 'green', 'blue'],
       ['red', 'blue', 'blue', 'green', 'blue', 'blue', 'blue'],
       ['blue', 'blue', 'blue', 'blue', 'red', 'blue', 'green'],
       ['blue', 'blue', 'blue', 'green', 'blue', 'blue', 'green']], dtype='<U5')
>>> m = np.loadtxt('midtermData1.txt')
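The contents of midtermData1.txt are not shown in this handout, so the sketch below substitutes a small in-memory text stream (an assumption made purely for illustration) to show how loadtxt turns lines of whitespace-separated numbers into rows:

```python
import io
import numpy as np

# Stand-in for a text file: two lines, three whitespace-separated numbers each.
fake_file = io.StringIO("1 2 3\n4 5 6\n")

# loadtxt accepts a file name or a file-like object; each line becomes a row.
data = np.loadtxt(fake_file)

print(data.shape)   # (2, 3)
print(data.dtype)   # float64 (the default element type for loadtxt)
```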
Practice problems:
Generate an ndarray to store:
· a 30x5 matrix of 0s
· a vector of 17 random integers in the 18-95 range
· a 10x10 matrix filled with about 50% zeros and 50% ones
· a 20x3 matrix of even values, starting from 66 and going up
SLICING
Indexing and slicing use the usual slicing syntax. Differences from lists:
- slices for the various dimensions separated by comma, e.g. x[0:2, 0:3]
- the result is not a copy of the portion of the original data, but an ndarray view on the portion of the data
- changing the content of the view changes the original ndarray
>>> lst = [[row*10+col for col in range(4)] for row in range(5)]
>>> x = np.array(lst)
>>> x
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])
>>> x[0:2, 0:3] # rows 0, 1 and cols 0, 1, 2
array([[ 0,  1,  2],
       [10, 11, 12]])
>>> x[:, 3] # col 3
array([ 3, 13, 23, 33, 43])
>>> x[::2, 0] # every second row (0, 2, 4), col 0
array([ 0, 20, 40])
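The point that a slice is a view rather than a copy can be sketched as follows. The names view, independent and part are made up for this example; copy() is the ndarray method that produces an independent array:

```python
import numpy as np

x = np.array([[row * 10 + col for col in range(4)] for row in range(5)])

# A slice of an ndarray is a view: it shares its data with x.
view = x[0:2, 0:3]
view[0, 0] = 99
print(x[0, 0])                      # 99, the original changed too
print(np.shares_memory(view, x))    # True

# An explicit copy is independent of the original.
independent = x[0:2, 0:3].copy()
independent[0, 1] = -1
print(x[0, 1])                      # 1, the original is untouched

# By contrast, slicing a Python list always makes a copy.
lst = [0, 1, 2, 3]
part = lst[0:2]
part[0] = 99
print(lst[0])                       # 0
```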
Practice problems:
1. what will be printed by the following code?
lst = [[row*10+col for col in range(4)] \
       for row in range(5)]
x = np.array(lst)
y = np.arange(21).reshape(3,7)
print('x = \n', x, '\n y = \n',y)
z = np.full((2,2), 8)
x[:2,:2] = z
y[1:3, -2:] = z
print('z= \n', z, '\n x = \n', x, '\n y = \n',y)
2. Given array x defined in the example above, what is the result of evaluating each of the following expressions:
1. x[3,3]
2. x[1,:]
3. x[:, 3]
4. x[1:3, :]
5. x[::3, 2:]
3. Generate the matrix shown below and write down the expression to generate the highlighted parts:
VECTORIZATION AND UFUNC
Vectorized operations are operations applied to every element of an array at once, with no explicit loops required. NumPy's universal functions (ufuncs), such as np.add and np.sqrt, work this way.
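To make the "no loops" point concrete, here is a minimal sketch that computes the same result twice, once with an explicit Python loop and once with a vectorized expression (the names looped and vectorized are illustrative):

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

# Explicit loop: visit each element in Python and build a new array.
looped = np.array([2 * item + 3 for item in a])

# Vectorized: the same arithmetic applied to the whole array at once.
vectorized = 2 * a + 3

print(vectorized)                          # [ 23  43  63  83 103]
print(np.array_equal(looped, vectorized))  # True
```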
>>> a = np.array([10, 20, 30, 40, 50])
>>> b = np.array(range(1, 6))
>>> a
array([10, 20, 30, 40, 50])
>>> b
array([1, 2, 3, 4, 5])
>>> 2*a + 3
array([ 23,  43,  63,  83, 103])
>>> c = a - 2*b
>>> c
array([ 8, 16, 24, 32, 40])
>>> c % 3
array([2, 1, 0, 2, 1], dtype=int32)
>>> matrix = np.random.rand(3,5)*100 + 50
>>> matrix
array([[102.67458931, 130.42761583, 140.99279117,  82.19500047, 137.34975161],
       [145.03804322,  53.90727239, 146.04476238, 114.44195163,  70.12476056],
       [122.61003385, 112.72862743,  87.3816173 , 106.72014036,  70.37626365]])
>>> matrix = np.round(matrix, 2)
>>> matrix
array([[102.67, 130.43, 140.99,  82.2 , 137.35],
       [145.04,  53.91, 146.04, 114.44,  70.12],
       [122.61, 112.73,  87.38, 106.72,  70.38]])
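The section title mentions ufuncs (universal functions), which are NumPy functions that operate on arrays element by element. The short sketch below, with values chosen only for illustration, shows that the arithmetic operators used above correspond to named ufuncs, alongside a couple of others:

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])
b = np.arange(1, 6)

# np.add is the ufunc behind the + operator for arrays.
total = np.add(a, b)
print(total)                        # [11 22 33 44 55]
print(np.array_equal(total, a + b)) # True

# np.sqrt takes the square root of every element.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))
print(roots)                        # [1. 2. 3.]

# np.maximum compares element by element; the scalar 25 is applied to each item.
floor_25 = np.maximum(a, 25)
print(floor_25)                     # [25 25 30 40 50]

# These functions are instances of the numpy.ufunc type.
print(isinstance(np.add, np.ufunc)) # True
```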
2022-11-24