Exploring Data Visualization With Bokeh


Introduction

Data visualization is an essential aspect of data analysis, allowing us to explore and communicate insights effectively. Python provides several powerful libraries for creating visualizations, and one such library is Bokeh. Bokeh is an interactive data visualization library that enables the creation of web-ready, interactive plots, dashboards, and applications. In this article, we will explore the various features and capabilities of Bokeh and learn how to create stunning visualizations using Python.

Table of Contents

  1. Understanding Data Visualization
     1.1 Importance of Data Visualization
     1.2 Types of Data Visualization

  2. Introducing Bokeh
     2.1 Installing Bokeh
     2.2 Importing Bokeh Modules

  3. Creating Basic Plots with Bokeh
     3.1 Line Plots
     3.2 Scatter Plots
     3.3 Bar Plots

  4. Customizing Visualizations
     4.1 Color Palettes
     4.2 Font Styling
     4.3 Adding Annotations
     4.4 Customizing Tick Labels

  5. Interactive Visualizations with Bokeh
     5.1 Adding Interactivity
     5.2 Handling Events
     5.3 Creating Dashboards

  6. Advanced Features
     6.1 Bokeh Server
     6.2 Streaming Data
     6.3 3D Visualization

  7. Real-World Examples
     7.1 Visualizing Stock Market Data
     7.2 Analyzing Climate Data
     7.3 Creating Geospatial Visualizations

  8. Tips and Tricks
     8.1 Improving Performance
     8.2 Publishing and Sharing Visualizations
     8.3 Leveraging Bokeh Extensions

  9. Conclusion

1. Understanding Data Visualization

Data visualization is the graphical representation of data and information. It provides a visual context for understanding trends, patterns, and relationships within large datasets. By representing data visually, we can quickly grasp complex information, identify outliers, and communicate insights effectively.

1.1 Importance of Data Visualization

Data visualization plays a vital role in data analysis and communication. Here are some key reasons why data visualization is essential:

  • Simplify Complex Data: Visualizations simplify complex data by representing it in a visual format, making patterns and relationships more apparent.

  • Identify Trends and Patterns: Visualizations help identify trends and patterns that may not be apparent in raw data. By spotting these patterns, we can gain valuable insights.

  • Communicate Insights: Visualizations create a shared understanding of data among stakeholders. By presenting information visually, it becomes easier to communicate findings and make data-driven decisions.

1.2 Types of Data Visualization

Data visualization encompasses a wide range of techniques and plots. Some common types of data visualizations include:

  • Line Charts: Line charts display data points connected by lines and are useful for showcasing trends over time.

  • Bar Charts: Bar charts use bars of varying lengths to represent data and compare different categories.

  • Scatter Plots: Scatter plots show the relationship between two variables, with data points plotted on a graph.

  • Histograms: Histograms show the distribution of a numeric variable by grouping its values into bins and plotting the frequency or density of each bin.

  • Pie Charts: Pie charts divide a circle into sections to represent proportions of data.

  • Heatmaps: Heatmaps use colors to represent data values and show patterns or correlations.

2. Introducing Bokeh

Bokeh is an open-source Python library that provides elegant and powerful tools for creating interactive visualizations. It is developed by the Bokeh Development Team, and its primary goal is to help users build web-based dashboards, presentation slides, and applications with ease.

2.1 Installing Bokeh

Before we dive into Bokeh, let’s first ensure that it is installed in our Python environment. Open the terminal or command prompt and run the following command:

pip install bokeh

2.2 Importing Bokeh Modules

Once Bokeh is installed, we can import the necessary modules to get started:

from bokeh.plotting import figure, show
from bokeh.io import output_notebook

The figure() function creates a new plot that we can then customize. The show() function displays the plot in the web browser. Lastly, calling output_notebook() makes plots render directly inside a Jupyter Notebook instead.

3. Creating Basic Plots with Bokeh

Now that we have Bokeh set up, let’s explore some basic plot types and learn how to create them using Bokeh.

3.1 Line Plots

Line plots are commonly used to show trends and patterns over time or continuous data. To create a line plot with Bokeh, we need to define the x and y coordinates of the data points.

Here is an example of creating a basic line plot:

# Importing the necessary modules
from bokeh.plotting import figure, show

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Line Plot", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot
p.line(x, y)

# Showing the plot
show(p)

In this example, we define two lists, x and y, representing the x and y coordinates of the data points. We then create a new plot using the figure() function and pass optional arguments such as the title and axis labels. Next, we add the data to the plot using the line() function and finally display the plot using the show() function.
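The same steps extend to several series on one figure. The following is a minimal sketch, assuming two made-up series (linear and squared) chosen only for illustration; each call to line() adds one renderer, and legend_label gives it a legend entry.

```python
from bokeh.plotting import figure

# Illustrative data: two series sharing the same x values
x = [1, 2, 3, 4, 5]
linear = [2, 4, 6, 8, 10]
squared = [1, 4, 9, 16, 25]

p = figure(title="Two Series", x_axis_label="X-axis", y_axis_label="Y-axis")

# Each line() call adds one series; legend_label creates its legend entry
p.line(x, linear, legend_label="y = 2x", line_width=2)
p.line(x, squared, legend_label="y = x^2", line_width=2,
       line_dash="dashed", color="firebrick")

# Moving the legend away from the steepest part of the curves
p.legend.location = "top_left"
```

Passing p to show() renders the figure as in the example above.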

3.2 Scatter Plots

Scatter plots are useful for visualizing the relationship between two variables. Each data point is represented by a dot on a two-dimensional graph.

Here is an example of creating a basic scatter plot:

# Importing the necessary modules
from bokeh.plotting import figure, show

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Scatter Plot", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot
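p.scatter(x, y, size=8)

# Showing the plot
show(p)

In this example, we use the same figure() function to create a new plot. We then add the data using the scatter() method, which draws a marker (a circle by default) at each data point. Older tutorials often use circle() here, but in recent Bokeh releases circle() is meant for circles sized by a radius in data units, so scatter() is the safer choice for fixed-size markers. Finally, we display the plot using the show() function.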

3.3 Bar Plots

Bar plots are used to compare values across different categories. Each category is represented by a bar whose length corresponds to the value it represents.

Here is an example of creating a basic bar plot:

# Importing the necessary modules
from bokeh.plotting import figure, show

# Data
categories = ['A', 'B', 'C', 'D']
values = [10, 15, 8, 12]

# Creating a new plot with figure() function
p = figure(title="Bar Plot", x_range=categories, y_axis_label="Values")

# Adding the data to the plot
p.vbar(x=categories, top=values, width=0.9)

# Showing the plot
show(p)

In this example, we define two lists, categories and values, representing the categories and their corresponding values. We create a new plot using the figure() function and pass the x_range argument to specify the categories on the x-axis. We then add the data using the vbar() function, specifying the categories, values, and width of the bars. Finally, we display the plot using the show() function.
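Bar charts are often easier to read when the bars are sorted and drawn horizontally. The sketch below reuses the same categories and values, sorts them with plain Python, and draws them with hbar(); the variable names are our own and only one of several reasonable ways to do this.

```python
from bokeh.plotting import figure

categories = ['A', 'B', 'C', 'D']
values = [10, 15, 8, 12]

# Sort (category, value) pairs by value; a categorical y-range is laid out
# bottom to top, so ascending order puts the largest bar at the top
pairs = sorted(zip(categories, values), key=lambda pair: pair[1])
sorted_categories = [category for category, _ in pairs]
sorted_values = [value for _, value in pairs]

p = figure(title="Sorted Horizontal Bars", y_range=sorted_categories,
           x_axis_label="Values")

# hbar() is the horizontal counterpart of vbar(): y is the category,
# right is the bar length, and height is the bar thickness
p.hbar(y=sorted_categories, right=sorted_values, height=0.9)
```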

4. Customizing Visualizations

Bokeh provides a wide range of options for customizing visualizations. Let’s explore some of the customization options available.

4.1 Color Palettes

Color palettes play a crucial role in creating visually appealing visualizations. Bokeh provides several built-in color palettes that we can use to customize our plots.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.palettes import Category10

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Line Plot", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot with a custom color palette
p.line(x, y, color=Category10[3][0])

# Showing the plot
show(p)

In this example, we import the Category10 color palette from the bokeh.palettes module. Category10 is a dictionary keyed by the number of colors, so Category10[3] is a three-color palette and Category10[3][0] is its first color. We pass that color to the line() function as the line color.
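Palettes become more useful when each element gets its own color. Here is a small sketch, assuming the bar chart data from section 3.3, that stores one palette color per bar in a ColumnDataSource column (the column names category, value, and color are our own choice).

```python
from bokeh.models import ColumnDataSource
from bokeh.palettes import Category10
from bokeh.plotting import figure

categories = ['A', 'B', 'C', 'D']
values = [10, 15, 8, 12]

# Category10 is keyed by palette size, so this picks a 4-color palette
palette = Category10[len(categories)]

# One row per bar, including the color that bar should use
source = ColumnDataSource(data=dict(category=categories, value=values,
                                    color=list(palette)))

p = figure(title="Colored Bars", x_range=categories, y_axis_label="Values")

# Strings that match column names are read from the data source
p.vbar(x='category', top='value', width=0.9, color='color', source=source)
```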

4.2 Font Styling

Bokeh allows us to customize various font styles such as the font family, size, and style of elements like titles, axis labels, and tick labels.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import Label

# Creating a new plot with figure() function
p = figure(title="Custom Fonts", x_axis_label="X-axis", y_axis_label="Y-axis")

# Customizing title font
p.title.text_font = "Arial"
p.title.text_font_size = "18pt"
p.title.text_font_style = "bold"

# Customizing axis label font
p.xaxis.axis_label_text_font = "Times New Roman"
p.xaxis.axis_label_text_font_size = "14pt"
p.yaxis.axis_label_text_font = "Times New Roman"
p.yaxis.axis_label_text_font_size = "14pt"

# Customizing tick label font
p.xaxis.major_label_text_font = "Courier"
p.xaxis.major_label_text_font_size = "12pt"
p.yaxis.major_label_text_font = "Courier"
p.yaxis.major_label_text_font_size = "12pt"

# Adding a custom label
label = Label(x=2, y=6, text="Custom Font", text_font="Verdana", text_font_size="14pt")
p.add_layout(label)

# Showing the plot
show(p)

In this example, we create a new plot and customize various font styles using the dot notation. We can set the text_font, text_font_size, and text_font_style attributes for the title, axis labels, and tick labels. Additionally, we can add a custom label using the Label model and specify the font properties.

4.3 Adding Annotations

Annotations provide additional information on specific data points or regions within a plot. Bokeh offers multiple annotation types, including labels, arrows, and bands.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import Label, Arrow, NormalHead, Band, ColumnDataSource

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Annotations", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot
p.line(x, y)

# Adding a label
label = Label(x=2, y=6, text="Important", text_font_size="12pt", border_line_color='black',
              background_fill_color='white')
p.add_layout(label)

# Adding an arrow
arrow = Arrow(end=NormalHead(fill_color='red'), line_color='black', x_start=3, y_start=8,
              x_end=4, y_end=8)
p.add_layout(arrow)

# Adding a band (Band reads its coordinates from the columns of a data source)
band_source = ColumnDataSource(data=dict(x=x, lower=[1, 3, 5, 7, 9], upper=[3, 5, 7, 9, 11]))
band = Band(base='x', lower='lower', upper='upper', source=band_source,
            fill_alpha=0.3, fill_color='blue')
p.add_layout(band)

# Showing the plot
show(p)

In this example, we use the Label, Arrow, and Band models to add annotations to the plot. We specify the position and other properties for each annotation, such as the text, fill color, and line color.
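Two other annotation models are handy for marking thresholds and ranges: Span draws a line across the whole plot, and BoxAnnotation shades a rectangular region. The sketch below is illustrative only; the threshold value 6 and the shaded range 7 to 9 are arbitrary choices.

```python
from bokeh.models import BoxAnnotation, Span
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

p = figure(title="Span and BoxAnnotation", x_axis_label="X-axis",
           y_axis_label="Y-axis")
p.line(x, y)

# A horizontal dashed line at y = 6 (dimension='width' spans the plot width)
threshold = Span(location=6, dimension='width', line_color='red',
                 line_dash='dashed')
p.add_layout(threshold)

# A shaded horizontal region between y = 7 and y = 9
highlight = BoxAnnotation(bottom=7, top=9, fill_alpha=0.2, fill_color='green')
p.add_layout(highlight)
```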

4.4 Customizing Tick Labels

Bokeh allows us to control the appearance of tick labels on the axes, including their orientation and formatting.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import AdaptiveTicker

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Custom Tick Labels", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot
p.line(x, y)

# Rotating x-axis tick labels
p.xaxis.major_label_orientation = "vertical"

# Choosing y-axis tick locations automatically
p.yaxis.ticker = AdaptiveTicker()

# Showing the plot
show(p)

In this example, we create a new plot and customize the tick labels. We set the major_label_orientation attribute of the x-axis to “vertical” to rotate the x-axis tick labels. We also assign an AdaptiveTicker model to the y-axis. A ticker decides where the ticks are placed, not how their labels are formatted, and AdaptiveTicker picks evenly spaced, readable locations based on the visible data range.
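The text of the tick labels is controlled separately, through a formatter or through label overrides. The following sketch shows both under assumed values: the weekday names are made up for illustration, and "0.0" is a NumeralTickFormatter format string that shows one decimal place.

```python
import math

from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

p = figure(title="Formatted Tick Labels", x_axis_label="X-axis",
           y_axis_label="Y-axis")
p.line(x, y)

# Fixed tick locations on the x-axis (a plain list becomes a FixedTicker)
p.xaxis.ticker = [1, 2, 3, 4, 5]

# Replacing the numeric labels with custom text (illustrative names)
p.xaxis.major_label_overrides = {1: "Mon", 2: "Tue", 3: "Wed", 4: "Thu", 5: "Fri"}

# Rotating the labels by 45 degrees (the angle is given in radians)
p.xaxis.major_label_orientation = math.pi / 4

# Showing y-axis values with one decimal place
p.yaxis.formatter = NumeralTickFormatter(format="0.0")
```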

5. Interactive Visualizations with Bokeh

One of the key strengths of Bokeh is its ability to create interactive visualizations. Let’s explore how we can add interactivity to our plots using Bokeh.

5.1 Adding Interactivity

We can add interactivity to our plots by enabling features such as zooming, panning, and tooltips.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import HoverTool

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Interactive Plot", x_axis_label="X-axis", y_axis_label="Y-axis",
           tools="pan,wheel_zoom,reset")

# Adding the data to the plot
p.line(x, y)

# Adding tooltips
hover = HoverTool(tooltips=[("Value", "@y")])
p.add_tools(hover)

# Showing the plot
show(p)

In this example, we create a new plot and choose its toolbar through the tools argument, enabling pan, wheel zoom, and reset. We then create a HoverTool with a list of (label, value) tooltips and attach it with add_tools(); the "@y" placeholder is replaced by the y value under the cursor. The hover tool is added only once, because also listing "hover" in the tools argument would create a second hover tool with its own default tooltip.
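Tooltips can show any column of the data source, not only the plotted coordinates. Below is a minimal sketch, assuming a made-up label column, that attaches the hover tool to a specific renderer and displays several fields at once.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# The "label" column is illustrative; any extra column can feed a tooltip
source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[2, 4, 6, 8, 10],
    label=["a", "b", "c", "d", "e"],
))

p = figure(title="Tooltips from Columns", tools="pan,wheel_zoom,reset")
points = p.scatter('x', 'y', size=10, source=source)

# "@name" refers to a column of the data source
hover = HoverTool(renderers=[points],
                  tooltips=[("Label", "@label"), ("(x, y)", "(@x, @y)")])
p.add_tools(hover)
```

Restricting the tool with renderers=[points] keeps the tooltip from appearing over other glyphs that may be added to the same figure later.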

5.2 Handling Events

Bokeh provides event handling capabilities, allowing us to respond to different types of events, such as mouse clicks or mouse movement.

# main.py
from bokeh.plotting import figure
from bokeh.events import Tap
from bokeh.io import curdoc

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating a new plot with figure() function
p = figure(title="Click Event", x_axis_label="X-axis", y_axis_label="Y-axis")

# Adding the data to the plot
p.line(x, y)

# Handling click events (the callback runs in Python, so a Bokeh server is required)
def handle_event(event):
    print("Clicked at x={}, y={}".format(event.x, event.y))

p.on_event(Tap, handle_event)

# Adding the plot to the current document
curdoc().add_root(p)

In this example, we create a new plot and handle a click event using the on_event() method. We specify the Tap event and define a callback function handle_event() that is triggered when the plot is clicked. In this case, we simply print the coordinates of the clicked point. Because handle_event() is Python code, it only runs when the plot is served by a Bokeh server (bokeh serve main.py, covered in section 6.1); a standalone page produced by show() has no Python process to call back into.
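For a standalone page with no Bokeh server behind it, events can be handled in the browser instead, using a JavaScript callback. The sketch below uses CustomJS and js_on_event(); the message written to the browser console is our own example.

```python
from bokeh.events import Tap
from bokeh.models import CustomJS
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

p = figure(title="Click Event (standalone)", x_axis_label="X-axis",
           y_axis_label="Y-axis")
p.line(x, y)

# The code string is JavaScript that runs in the browser;
# cb_obj is the event object and carries the clicked data coordinates
callback = CustomJS(code="""
    console.log('Clicked at x=' + cb_obj.x + ', y=' + cb_obj.y)
""")
p.js_on_event(Tap, callback)
```

After passing p to show(), clicking the plot prints the coordinates to the browser's developer console rather than to the Python terminal.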

5.3 Creating Dashboards

Bokeh allows us to create interactive dashboards by combining multiple plots and widgets into a single layout.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.layouts import layout
from bokeh.models import TextInput

# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Creating plots
plot1 = figure(title="Plot 1", x_axis_label="X-axis", y_axis_label="Y-axis")
plot1.line(x, y)

plot2 = figure(title="Plot 2", x_axis_label="X-axis", y_axis_label="Y-axis")
plot2.scatter(x, y)

# Creating widgets
text_input = TextInput(value="Type here")

# Creating a layout
dashboard_layout = layout([[text_input], [plot1, plot2]])

# Showing the dashboard
show(dashboard_layout)

In this example, we create two plots, plot1 and plot2, and a TextInput widget. We then create a layout using the layout() function and specify the arrangement of plots and widgets. Finally, we display the dashboard layout using the show() function.
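In the dashboard above, the widget and the plots sit side by side but do not yet affect each other. One lightweight way to connect them, sketched here with a Slider instead of the TextInput, is js_link(), which copies a widget property to a property of another model in the browser.

```python
from bokeh.layouts import column
from bokeh.models import Slider
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plot = figure(title="Linked Slider", x_axis_label="X-axis", y_axis_label="Y-axis")
renderer = plot.line(x, y, line_width=2)

slider = Slider(start=1, end=10, value=2, step=1, title="Line width")

# Whenever slider.value changes, copy it to the line glyph's line_width
slider.js_link('value', renderer.glyph, 'line_width')

# Stacking the widget above the plot
dashboard = column(slider, plot)
```

Because the link runs in the browser, this works in a standalone page opened with show(dashboard) and needs no Bokeh server.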

6. Advanced Features

Bokeh provides several advanced features that extend its capabilities even further. Let’s explore some of these advanced features.

6.1 Bokeh Server

The Bokeh server allows us to create highly interactive web-based applications and dashboards. With the Bokeh server, we can update plots and widgets in real time in response to user input or changing data.

# main.py
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.io import curdoc
from bokeh.layouts import layout
from bokeh.models.widgets import Slider

# Creating a plot with a slider widget
plot = figure()
source = ColumnDataSource(data=dict(x=[0, 1, 2, 3], y=[0, 1, 4, 9]))
plot.line('x', 'y', source=source)

slider = Slider(start=0, end=10, value=0, step=1, title="Value")

def update_plot(attr, old, new):
    source.data = dict(x=[0, 1, 2, 3], y=[0, 1, new**2, 9])    

slider.on_change('value', update_plot)

# Creating a layout
dashboard_layout = layout([[slider], [plot]])

curdoc().add_root(dashboard_layout)

In this example, we create a plot with a slider widget using the Bokeh server framework. We define a callback function update_plot() that updates the plot’s data source whenever the slider’s value changes. Finally, we create a layout and add it to the curdoc() (current document) object.

To run the Bokeh server and display the application, open the terminal and execute the following command:

bokeh serve main.py

6.2 Streaming Data

Bokeh supports streaming data, allowing us to visualize and update real-time data dynamically.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook, push_notebook
from random import randint
from bokeh.driving import count

# Rendering output inline in the Jupyter Notebook
output_notebook()

# Creating a plot
plot = figure(title="Streaming Data", x_axis_label="X-axis", y_axis_label="Y-axis")
source = ColumnDataSource(data=dict(x=[], y=[]))
plot.line('x', 'y', source=source)

# Generating random data; @count() passes an incrementing integer i on each call
@count()
def update_data(i):
    new_data = dict(x=[i], y=[randint(0, 10)])
    source.stream(new_data)

# Showing the plot and keeping a handle for later updates
handle = show(plot, notebook_handle=True)

# Updating the data and pushing each change to the displayed plot
for _ in range(10):
    update_data()
    push_notebook(handle=handle)

In this example, we create a plot and a ColumnDataSource object to store the data. The update_data() function uses the stream() method to append new points to the data source. The @count() decorator supplies an incrementing value i each time update_data() is called, so we call the function without arguments. We display the plot with show(), passing notebook_handle=True to get a handle to the rendered output, and call push_notebook() after each update so that the changes appear in the Jupyter Notebook.
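When data keeps arriving, the data source would otherwise grow without bound. The stream() method accepts a rollover argument that keeps only the most recent points. Here is a minimal sketch, assuming a window of five points (the window parameter is our own name).

```python
from random import randint

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data=dict(x=[], y=[]))

plot = figure(title="Streaming with Rollover", x_axis_label="X-axis",
              y_axis_label="Y-axis")
plot.line('x', 'y', source=source)

def update_data(i, window=5):
    """Append one random point and keep only the newest `window` points."""
    source.stream(dict(x=[i], y=[randint(0, 10)]), rollover=window)

# Simulating ten incoming points; older ones are discarded automatically
for i in range(10):
    update_data(i)
```

In a Bokeh server application, a function like this could instead be registered with curdoc().add_periodic_callback() so that it runs at a fixed interval.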

6.3 3D Visualization

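Bokeh does not include built-in 3D glyphs: figure() has no scatter3d() method, and true 3D plots require a custom extension that wraps a JavaScript 3D library (see section 8.3). A common alternative is to keep the plot two-dimensional and encode the third variable as color or marker size.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, ColorBar, LinearColorMapper
from bokeh.transform import transform

# Data with a third variable, z
source = ColumnDataSource(data=dict(x=[1, 2, 3, 4], y=[5, 6, 7, 8], z=[9, 10, 11, 12]))

# Mapping z values to colors
color_mapper = LinearColorMapper(palette="Viridis256", low=9, high=12)

# Creating a 2D scatter plot colored by z
plot = figure(title="Third Dimension as Color", x_axis_label="X-axis", y_axis_label="Y-axis")
plot.scatter('x', 'y', source=source, size=10, color=transform('z', color_mapper))

# Adding a color bar that explains the z scale
plot.add_layout(ColorBar(color_mapper=color_mapper, title="z"), 'right')

# Showing the plot
show(plot)

In this example, we create a scatter plot using the scatter() function and specify the x and y coordinates of the data points. The z values are passed through a LinearColorMapper, so each marker's color reflects its z value, and a ColorBar next to the plot shows how colors correspond to values.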

7. Real-World Examples

Let’s explore a few real-world examples to see how Bokeh can be used in practical scenarios.

7.1 Visualizing Stock Market Data

Bokeh can be used to create interactive visualizations of stock market data, allowing us to analyze trends and patterns effectively.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
import pandas as pd
import yfinance as yf

# Fetching stock data
symbol = 'AAPL'
data = yf.download(symbol, start='2022-01-01', end='2022-12-31')

# Preparing data for visualization
# (reset_index() turns the date index into a regular "Date" column)
data = data.reset_index()
source = ColumnDataSource(data)

# Creating a plot
p = figure(title="Stock Price", x_axis_label="Date", y_axis_label="Price", x_axis_type="datetime")

# Adding lines for Open, High, Low, Close
p.line('Date', 'Open', source=source, legend_label="Open", color="blue")
p.line('Date', 'High', source=source, legend_label="High", color="green")
p.line('Date', 'Low', source=source, legend_label="Low", color="red")
p.line('Date', 'Close', source=source, legend_label="Close", color="orange")

# Adding a legend
p.legend.location = "top_left"
p.legend.title = "Stock Price"

# Showing the plot
show(p)

In this example, we use the yfinance library to fetch stock market data for the given symbol and time range. We convert the index to a Date column to make it compatible with Bokeh’s DatetimeAxis. We then create a plot and add lines for the opening, high, low, and closing prices. Finally, we display the plot using the show() function.
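Open, high, low, and close values are often easier to read as a candlestick chart than as four separate lines. The sketch below builds one from segment() and vbar(). The prices are made-up numbers for illustration only, not real market data, and the plain Python lists stand in for the downloaded columns.

```python
from datetime import datetime

from bokeh.plotting import figure

# Made-up prices for illustration only (not real market data)
dates = [datetime(2022, 1, day) for day in (3, 4, 5, 6, 7)]
opens = [100, 102, 101, 105, 104]
highs = [103, 103, 106, 106, 108]
lows = [99, 100, 100, 103, 103]
closes = [102, 101, 105, 104, 107]

# Splitting the days into rising and falling sessions
rising = [i for i in range(len(dates)) if closes[i] >= opens[i]]
falling = [i for i in range(len(dates)) if closes[i] < opens[i]]

# On a datetime axis, widths are expressed in milliseconds
half_day_ms = 12 * 60 * 60 * 1000

p = figure(title="Candlestick (illustrative data)", x_axis_type="datetime",
           x_axis_label="Date", y_axis_label="Price")

# Wicks: one vertical segment from high to low per day
p.segment(dates, highs, dates, lows, color="black")

# Bodies: bars between open and close, colored by direction
p.vbar(x=[dates[i] for i in rising], width=half_day_ms,
       top=[closes[i] for i in rising], bottom=[opens[i] for i in rising],
       fill_color="green", line_color="black")
p.vbar(x=[dates[i] for i in falling], width=half_day_ms,
       top=[opens[i] for i in falling], bottom=[closes[i] for i in falling],
       fill_color="red", line_color="black")
```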

7.2 Analyzing Climate Data

Bokeh can be used to visualize climate data, allowing us to analyze temperature, humidity, and other environmental factors easily.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
import pandas as pd

# Loading climate data
data = pd.read_csv("climate_data.csv")

# Preparing data for visualization
data['Year'] = pd.to_datetime(data['Year'], format="%Y")
source = ColumnDataSource(data)

# Creating a plot
p = figure(title="Temperature Change", x_axis_label="Year", y_axis_label="Temperature (°C)", x_axis_type="datetime")

# Adding a line for average temperature
p.line('Year', 'AverageTemperature', source=source, legend_label="Average Temperature", color="blue")

# Adding a legend
p.legend.location = "bottom_right"
p.legend.title = "Temperature"

# Showing the plot
show(p)

In this example, we load climate data from a CSV file and convert the “Year” column to a Datetime type for compatibility with Bokeh’s DatetimeAxis. We create a plot and add a line representing the average temperature over the years. Finally, we display the plot using the show() function.

7.3 Creating Geospatial Visualizations

Bokeh has geospatial capabilities, enabling us to create interactive visualizations on maps.

# Importing the necessary modules
from bokeh.plotting import figure, show
from bokeh.models import GeoJSONDataSource

# Load geospatial data as a JSON string
# (coordinates must be in Web Mercator, EPSG:3857, to line up with the map tiles)
with open('countries.geojson') as f:
    geojson_data = f.read()

# Prepare data for visualization
source = GeoJSONDataSource(geojson=geojson_data)

# Create a plot
p = figure(title='World Map', tools='wheel_zoom,pan,reset', active_drag='pan',
           x_axis_location=None, y_axis_location=None, width=800, height=500)

# Add map tile (Bokeh 3.x accepts a provider name here;
# older releases used bokeh.tile_providers.get_provider instead)
p.add_tile("CartoDB Positron")

# Add geospatial data
p.patches('xs', 'ys', fill_alpha=0.7, fill_color='blue', line_color='white', line_width=0.5, source=source)

# Show the plot
show(p)

In this example, we load geospatial data from a GeoJSON file and pass it, as a JSON string, to GeoJSONDataSource, which converts it into columns that Bokeh can plot. We create a plot and add a map tile layer with add_tile(). Finally, we add the geospatial data to the plot with patches(), specifying the coordinates and other attributes. Note that tile layers use Web Mercator coordinates, so the GeoJSON coordinates must be in the same projection for the shapes to line up with the map.

8. Tips and Tricks

Here are some tips and tricks to help you make the most out of Bokeh:

8.1 Improving Performance

Bokeh provides various options for improving the performance of your visualizations:

  • Data Aggregation: When dealing with large datasets, aggregating the data can significantly improve performance. Consider summarizing the data before visualizing it, especially when creating interactive plots.

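  • Downsampling: Downsampling reduces the number of data points to improve performance while preserving the overall pattern. Bokeh does not include a built-in downsample() function, so reduce the data yourself (for example with pandas or NumPy) before passing it to a plot, or use a companion library such as Datashader for very large datasets.

  • Throttling Updates: If you have live data streaming in a Bokeh server application, you can control the update rate by registering the update function with curdoc().add_periodic_callback(), which takes the callback and an interval in milliseconds.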
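One simple way to downsample before plotting is stride sampling, which keeps every n-th point. The helper below, downsample_stride, is a hypothetical name written for this article and not a Bokeh function; it is a sketch that works on plain Python lists.

```python
import math

def downsample_stride(xs, ys, max_points):
    """Return evenly spaced samples of xs and ys, at most max_points long."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # Keep every step-th point so that the result fits within max_points
    step = max(1, math.ceil(len(xs) / max_points))
    return xs[::step], ys[::step]

# Example: reducing 1,000 points to 100 before handing them to a plot
xs = list(range(1000))
ys = [value * value for value in xs]
small_x, small_y = downsample_stride(xs, ys, 100)
```

The reduced lists can then be passed to line() or scatter() as usual. Stride sampling is cheap but can skip over short spikes, so aggregation (for example, per-bucket minimum and maximum) may suit spiky data better.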

8.2 Publishing and Sharing Visualizations

Bokeh provides several options for publishing and sharing your visualizations:

  • Embedding: Bokeh visualizations can be embedded in HTML pages, Jupyter Notebooks, or standalone web applications.

  • Saving: You can save your Bokeh plots as interactive HTML files with save(), or as static images with export_png() and export_svg() from bokeh.io, which additionally require Selenium and a supported browser driver.

  • Exporting: Because a saved plot is a self-contained HTML file, it can be hosted on any static web hosting platform, such as GitHub Pages.

  • Server Deployment: The Bokeh server allows you to deploy your applications and dashboards as web applications in production environments.
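The embedding and saving options can be sketched with two functions from bokeh.embed. In the example below, file_html() returns a complete standalone HTML page as a string, and components() returns a script and a div to paste into an existing page; the plot and its title are placeholders.

```python
from bokeh.embed import components, file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

p = figure(title="Line Plot", x_axis_label="X-axis", y_axis_label="Y-axis")
p.line(x, y)

# A full standalone page that loads BokehJS from the CDN
html = file_html(p, resources=CDN, title="Line Plot")

# Fragments for embedding into a page of our own:
# script holds the plot data, div marks where the plot is placed
script, div = components(p)
```

Writing html to a file gives the same kind of page that save() produces, and that file can be uploaded to any static host.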

8.3 Leveraging Bokeh Extensions

Bokeh extensions enable the customization of Bokeh plots and widgets using other JavaScript libraries or CSS. You can create powerful and unique visualizations by leveraging these extensions.

9. Conclusion

Bokeh is a versatile library for creating interactive and visually appealing data visualizations with Python. In this article, we covered the basics of Bokeh and explored its various features, including basic plots, customization options, interactivity, and advanced capabilities. We also looked at real-world examples to understand how Bokeh can be used in practical scenarios. By harnessing the power of Bokeh, you can create stunning visualizations that make data analysis more engaging and insightful. So why wait? Start exploring Bokeh and unlock the potential of interactive data visualization in Python.
