Data Visualization With Matplotlib

Data Visualization with Matplotlib in Python

Data visualization is a key part of any data science or machine learning project. It helps in exploring data, understanding patterns, deriving insights, and making informed decisions. Python offers a wide range of visualization libraries, such as Matplotlib, seaborn, and Bokeh. Among these, Matplotlib is one of the most commonly used plotting libraries.



This article aims to provide a comprehensive overview of visualizing data with Matplotlib, covering both basic and advanced aspects with clear, practical examples. Whether you’re a beginner dabbling in data or an experienced Python enthusiast, this article will equip you with the necessary knowledge and skills to produce insightful, engaging visualizations.

Table of Contents

  • Introduction to Matplotlib
  • Matplotlib Architecture
  • Basic Plotting with Matplotlib
  • Creating Multiplots
  • Histograms, Bar Plots, and Pie Charts
  • Advanced Plotting: 3D Charts
  • Customizing Plots
  • Saving and Displaying Plots

Introduction to Matplotlib

Matplotlib was created by John D. Hunter as an attempt to bring MATLAB’s plotting capabilities to Python. It is an open-source library that provides a flexible platform for creating many kinds of visualizations. It can generate line plots, histograms, power spectra, bar charts, error charts, scatter plots, and much more, often with only a few lines of code.

import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4])
plt.ylabel('Numbers')
plt.show()

Running the code above produces a simple line plot of the numbers one to four. By default, plot() joins the points with a line rather than drawing separate markers. When it receives a single list, it treats the values as y-coordinates and uses their indices (0, 1, 2, 3) as the x-coordinates.
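
The x values that plot() fills in for a single list can be checked by inspecting the Line2D object it returns. The sketch below uses the object-oriented interface; plt.subplots() and the variable names are choices made here for illustration, not part of the example above.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Only y values are passed; Matplotlib supplies the x values itself.
(line,) = ax.plot([1, 2, 3, 4])
ax.set_ylabel('Numbers')

# Read back the coordinates that were actually plotted.
implicit_x = [float(v) for v in line.get_xdata()]
y_values = [float(v) for v in line.get_ydata()]
print(implicit_x, y_values)
```

Calling plt.show() afterwards displays the same figure as the pyplot version.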

Matplotlib Architecture

Matplotlib is architecturally composed of three main layers:

  • Scripting Layer: The scripting layer contains pyplot, which we use for most of our plotting. It is the front end and the usual entry point to Matplotlib.
  • Artist Layer: Everything visible in a figure (the Figure itself, Axes, lines, text) is an Artist object. This middle layer is where much of the heavy lifting happens, because each Artist knows how to draw itself using a renderer.
  • Backend Layer: This is the bottom layer of the architecture. It does the actual drawing and rendering to a screen or a file, and it encapsulates all the backends, including the canvas, the renderer, and event handling.
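
To see the layers without pyplot in the way, here is a minimal sketch that builds a figure from Artist objects and attaches a backend canvas by hand. It assumes the Agg (raster) backend that ships with Matplotlib; the figure size and data are arbitrary illustrative values.

```python
from matplotlib.artist import Artist
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# Artist layer: Figure, Axes and Line2D are all Artist objects.
fig = Figure(figsize=(4, 3), dpi=100)
ax = fig.add_subplot(1, 1, 1)
(line,) = ax.plot([1, 2, 3, 4])

# Backend layer: the Agg canvas renders the figure into a pixel buffer.
canvas = FigureCanvasAgg(fig)
canvas.draw()
width, height = canvas.get_width_height()

all_artists = all(isinstance(obj, Artist) for obj in (fig, ax, line))
print(all_artists, width, height)
```

In everyday use the scripting layer does this wiring for you: plt.subplots() creates the Figure, the Axes, and a canvas for whichever backend is active.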


Basic Plotting with Matplotlib

Before visualizing, it’s essential to understand the basic structure of a plot in Matplotlib:

  • The Figure is the whole image or page on which everything is drawn.
  • Each Figure can contain several Axes. These are what we refer to as subplots.
  • Each Axes usually has an x-axis and a y-axis, labels, and a title, and it is where we plot our data.
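
The relationship between a Figure and its Axes can be sketched as follows. The titles, labels, and data are placeholders chosen for illustration.

```python
import matplotlib.pyplot as plt

# One Figure holding two Axes (subplots) side by side.
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

# Each Axes has its own data, title, and axis labels.
left.plot([1, 2, 3], [1, 4, 9])
left.set_title('Left Axes')
left.set_xlabel('x')
left.set_ylabel('x squared')

right.plot([1, 2, 3], [1, 8, 27])
right.set_title('Right Axes')
right.set_xlabel('x')
right.set_ylabel('x cubed')

# The Figure-level title sits above both Axes.
fig.suptitle('One Figure, two Axes')

number_of_axes = len(fig.axes)
```

Add plt.show() at the end to display the figure.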

With that structure in mind, let’s start with the most basic plot type.

Line Plot

A line plot connects a sequence of data points with straight line segments. Line plots, also known as line graphs, are commonly used to display trends over time. Here is a simple example:

import matplotlib.pyplot as plt
plt.plot([2,3,4,5])
plt.xlabel('Actual birth weight')
plt.ylabel('Estimated birth weight')
plt.show()

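This creates a line plot of four values. Because only one list is passed, Matplotlib treats it as the estimated weights on the y-axis and uses the indices 0 to 3 on the x-axis. To plot the estimates against real measurements, pass the actual weights as the first argument and the estimates as the second.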
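
A sketch of the two-argument form is shown below. The weights are hypothetical numbers made up for illustration, and the dashed reference line is an addition that is not part of the example above.

```python
import matplotlib.pyplot as plt

# Hypothetical birth weights in kilograms, made up for illustration.
actual = [2.0, 3.0, 4.0, 5.0]
estimated = [2.2, 2.9, 4.3, 4.8]

fig, ax = plt.subplots()

# First argument is x, second is y; 'o-' adds a marker at each point.
ax.plot(actual, estimated, 'o-', label='Estimate')

# Reference line where the estimate would equal the actual weight.
ax.plot(actual, actual, '--', label='Perfect estimate')

ax.set_xlabel('Actual birth weight')
ax.set_ylabel('Estimated birth weight')
ax.legend()

line_count = len(ax.get_lines())
```

Points above the dashed line are overestimates and points below it are underestimates.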

Creating Multiplots

Multiplots place several plots on a single figure. The plt.subplots() function creates the figure and returns one Axes object per subplot.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 400)
y = np.sin(x ** 2)

# A figure with just one sub-plot
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('A single plot')

# A figure with two sub-plots 
fig, (ax1, ax2) = plt.subplots(2)
fig.suptitle('Vertically oriented sub-plots')
ax1.plot(x, y)
ax2.plot(x, -y)

# A figure with two subplots with shared y-axis 
fig, (ax1, ax2) = plt.subplots(2, sharey=True)
fig.suptitle('Sub-plots with shared y-axis')
ax1.plot(x, y)
ax2.plot(x + 1, -y)

plt.show()
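
plt.subplots() can also build a two-dimensional grid, in which case it returns a NumPy array of Axes. The following sketch assumes a 2 x 2 layout; the panel titles and the transformations of y are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 400)
y = np.sin(x ** 2)

# A 2 x 2 grid; axs is a 2D array indexed as axs[row, column].
fig, axs = plt.subplots(2, 2, sharex=True, figsize=(8, 6))

axs[0, 0].plot(x, y)
axs[0, 1].plot(x, -y)
axs[1, 0].plot(x, y ** 2)
axs[1, 1].plot(x, np.abs(y))

# axs.flat iterates over the four Axes row by row.
titles = ['y', '-y', 'y squared', 'absolute value of y']
for ax, title in zip(axs.flat, titles):
    ax.set_title(title)

# Adjust the spacing so titles and tick labels do not overlap.
fig.tight_layout()

grid_shape = axs.shape
```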

Histograms, Bar Plots, and Pie Charts

Histograms

A histogram summarizes the distribution of a large number of data points by grouping them into ranges (bins) and showing how many points fall into each one.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(19680801)
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)
plt.hist(x, bins=50, density=True, alpha=0.75)
plt.xlabel('Smarts')
plt.ylabel('Probability density')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)
plt.show()
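
hist() also returns the numbers behind the picture, which is handy for checking what the bars mean. The sketch below is an illustration under one assumption that differs from the example above: it draws the sample with np.random.default_rng instead of np.random.seed, so the individual values are not the same.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same distribution as above (mean 100, standard deviation 15),
# drawn with NumPy's Generator API.
rng = np.random.default_rng(19680801)
x = 100 + 15 * rng.standard_normal(10000)

fig, ax = plt.subplots()

# hist() returns the bar heights, the bin edges, and the bar artists.
counts, bin_edges, patches = ax.hist(x, bins=50, density=True, alpha=0.75)

# With density=True the bar areas (height times width) sum to 1.
area = float(np.sum(counts * np.diff(bin_edges)))
print(len(counts), len(bin_edges), area)
```

Fifty bins are delimited by fifty-one edges, which is why bin_edges is one element longer than counts.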

Bar Plots

Bar charts are useful for comparing quantities corresponding to different groups or categories.

import matplotlib.pyplot as plt

# One bar per category; the second list gives the bar heights.
bars = plt.bar(['group1', 'group2', 'group3', 'group4', 'group5'], [3,10,7,5,2])

plt.show()
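
The container returned by bar() holds one rectangle per category, which makes it easy to annotate the chart. Here is a minimal sketch that writes each value above its bar; the text placement is one possible choice, not the only way to label bars.

```python
import matplotlib.pyplot as plt

groups = ['group1', 'group2', 'group3', 'group4', 'group5']
values = [3, 10, 7, 5, 2]

fig, ax = plt.subplots()
bars = ax.bar(groups, values)

# Place each value just above the horizontal centre of its bar.
for bar, value in zip(bars, values):
    x_centre = bar.get_x() + bar.get_width() / 2
    ax.text(x_centre, bar.get_height(), str(value),
            ha='center', va='bottom')

bar_heights = [float(bar.get_height()) for bar in bars]
```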

Pie Charts

Pie charts can be easily plotted using the pie() function and the size of the slice will be proportional to the data.

import matplotlib.pyplot as plt

# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0)  # Offset only the second slice ('Hogs')

fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%',
        shadow=True, startangle=90)
ax1.axis('equal')  # Equal aspect ensures pie chart to be drawn as a circle.

plt.show()

Advanced Plotting: 3D Charts

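In Matplotlib, we can create 3D plots with the mplot3d toolkit, which is included with the main Matplotlib installation. Since Matplotlib 3.2 the '3d' projection is registered automatically, so the explicit import below is only needed on older versions.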

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

x = np.outer(np.linspace(-2, 2, 30), np.ones(30))
y = x.copy().T 
z = np.cos(x ** 2 + y ** 2)

fig = plt.figure()
ax = plt.axes(projection='3d')

ax.plot_surface(x, y, z, cmap='viridis')
ax.set_title('3D surface plot')
plt.show()
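
3D axes also support line and scatter plots through ax.plot3D and ax.scatter3D. The sketch below draws a helix; the curve and the choice to mark every 20th point are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2

# A helix: x and y trace a circle while z rises steadily.
t = np.linspace(0, 4 * np.pi, 200)
x, y, z = np.cos(t), np.sin(t), t

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Continuous 3D line through all points.
ax.plot3D(x, y, z)

# Scatter markers on every 20th point of the same curve.
ax.scatter3D(x[::20], y[::20], z[::20], color='red')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.set_title('3D line and scatter plot')
```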

Customizing Plots

Matplotlib is highly customizable and provides functions to change and manipulate the properties of plots. Here is an example:

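import matplotlib.pyplot as plt

plt.plot([-1, 0, 1], [-1, 0, 1])  # Example data to customize
plt.title("My Title") # Title of the plot
plt.xlim(-1, 1)       # Range of x-axis
plt.ylim(-1, 1)       # Range of y-axis
plt.xlabel("X-Axis")  # Label of x-axis
plt.ylabel("Y-Axis")  # Label of y-axis
plt.show()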
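
The same settings are available as methods on an Axes object, which is convenient when a figure has more than one subplot. The sketch below mirrors the pyplot calls above and adds a few line-style options; the data, colour, and legend label are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 100)

fig, ax = plt.subplots()

# Line appearance is controlled by keyword arguments to plot().
ax.plot(x, x ** 2, color='tab:red', linestyle='--', linewidth=2,
        label='x squared')

# Axes methods corresponding to plt.title, plt.xlim, and so on.
ax.set_title('My Title')
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
ax.set_xlabel('X-Axis')
ax.set_ylabel('Y-Axis')

ax.grid(True)   # Light grid behind the data
ax.legend()     # Uses the label passed to plot()

x_limits = tuple(float(v) for v in ax.get_xlim())
```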

Saving and Displaying Plots

Matplotlib provides the savefig() function to save figures in various file formats, such as PNG, PDF, and SVG. The format is inferred from the file extension.

plt.savefig('my_figure.png') 

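Call savefig() before show(): once a window opened by show() has been closed, the figure may be discarded, and a later savefig() call can write a blank image. After saving, you can use the show() function to display the figure.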

plt.show()
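
savefig() accepts options that control the output quality. The following sketch writes a PNG into an in-memory buffer so that it runs without touching the disk; the dpi value and the use of io.BytesIO are choices made for illustration.

```python
import io
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4])
ax.set_ylabel('Numbers')

# dpi sets the resolution; bbox_inches='tight' trims surrounding whitespace.
buffer = io.BytesIO()
fig.savefig(buffer, format='png', dpi=150, bbox_inches='tight')
png_bytes = buffer.getvalue()

# Every PNG file starts with the same 8-byte signature.
is_png = png_bytes[:8] == b'\x89PNG\r\n\x1a\n'
print(is_png, len(png_bytes) > 0)
```

To write a file instead, pass a file name such as 'my_figure.png' in place of the buffer and drop the format argument.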

In conclusion, Matplotlib is an excellent library for developing insights from data. It is versatile enough to handle most common plot types, and its visuals provide qualitative context to complement your quantitative findings. I hope this article gave you an idea of how to start visualizing your data using Matplotlib, and I’d definitely recommend diving in and giving it a go yourself!
