Python’s NumPy library makes it easy, and fun, to work with multi-dimensional data. For simplicity, consider a 2D array (also known as a matrix).
I wrote some code to demonstrate creating, displaying, slicing, and aggregating data within a matrix, including totals and slice subtotals.
Source Code:
It is available on GitHub: NumPy 2D Array Slice Aggregation.
Here we go…
Step-by-Step Demonstration of Python Code and Results
# Import NumPy, then create a 5x5 matrix holding the values 6 through 30
import numpy as np
matrix = np.arange(6,31).reshape(5,5)
matrix
# Results: Note that two levels of [ [ … ] ] brackets mean a two-dimensional array (matrix)
#array([[ 6,  7,  8,  9, 10],
#       [11, 12, 13, 14, 15],
#       [16, 17, 18, 19, 20],
#       [21, 22, 23, 24, 25],
#       [26, 27, 28, 29, 30]])
#
#
# Find mean (average) value for each column.
matrix.mean(axis=0)
# Results…
#array([16., 17., 18., 19., 20.])
#
#
# Now, mean values per row, and reshape for vertical output
matrix.mean(axis=1).reshape(5,1)
# Results:
#array([[ 8.],
#       [13.],
#       [18.],
#       [23.],
#       [28.]])
#
#
# Slice the matrix down to the first three rows (indices 0-2)
# and the second and third columns (indices 1-2)
matrix[:3,1:3]
# Results…
#array([[ 7,  8],
#       [12, 13],
#       [17, 18]])
#
#
# Sum the slice on each column
matrix[:3,1:3].sum(axis=0)
# Results…
#array([36, 39])
#
#
# Sum the entire slice
matrix[:3,1:3].sum()
# Results…
#75
#
#
# Sum the slice on each row, and reshape for vertical output
matrix[:3,1:3].sum(axis=1).reshape(3,1)
# Results…
#array([[15],
#       [25],
#       [35]])
#
#
# As above, sum the whole slice as a visual check
# that the row and column subtotals add up.
matrix[:3,1:3].sum()
# Results: Same as above
#75
#
#
# # # All done! # # #
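The visual check at the end of the walkthrough could also be done in code. Here is a minimal sketch, not part of the original source, that rebuilds the same matrix and slice and asserts that the column and row subtotals both add up to the slice total. The variable names (block, col_subtotals, row_subtotals, total) are just for illustration. It also uses keepdims=True as one possible alternative to the reshape call for getting vertical output.

```python
import numpy as np

# Same 5x5 matrix and slice as in the walkthrough
matrix = np.arange(6, 31).reshape(5, 5)
block = matrix[:3, 1:3]

# Subtotals per column and per row. keepdims=True keeps the summed axis
# with length 1, so the row subtotals come back as a (3, 1) column
# without a separate reshape step.
col_subtotals = block.sum(axis=0)
row_subtotals = block.sum(axis=1, keepdims=True)
total = block.sum()

# Both sets of subtotals should add up to the overall slice total
assert col_subtotals.sum() == total
assert row_subtotals.sum() == total

print(col_subtotals)
print(row_subtotals)
print(total)
```

If either assertion failed, it would suggest the slice or the axis argument was not what was intended, which is easier to catch this way than by eye once the matrix gets bigger.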
As we’ve seen here, a grasp of NumPy fundamentals makes creating, slicing, and aggregating multi-dimensional data straightforward.
With that, my journey continues into Python Pandas.