- Rostyslav Sipakov
Visualizing data is an art in which people are either talented or not. The good news is that Python has a library called Seaborn, which provides high-level tools such as heatmaps that make it easier to visualize your data and spot correlations in it. This blog post shows how to use the seaborn.heatmap function to do just that!
Also, check the post's footer for an easy way to run your Jupyter Notebook in the Google Colaboratory. "Google Colab" is available for free to anyone with a Google account.
The first step is to import the libraries we need. We'll use the Pandas library to read the data set.
# Importing libraries
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Additional libraries needed to load a .csv file from GitHub in Python
import requests
import io
Loading a .csv file from GitHub in Python
The second step is to download the data from a .csv file hosted in a public repository on GitHub.
# Downloading the csv file from GitHub (make sure the url is the raw version of the file on GitHub)
url = "https://raw.githubusercontent.com/rsipakov/PythonProjectsShared/master/seaborn/SO2_HCHO_correlation/Dataset_SO2_HCHO_post_20_Kyiv_2017.csv"
download = requests.get(url).content

# Reading the downloaded content and turning it into a pandas dataframe
SO2_df = pd.read_csv(io.StringIO(download.decode('utf-8')))
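The decode-and-parse step above can be tried without any network access. The following is a minimal sketch, not part of the original notebook: the helper name `csv_bytes_to_df` and the sample bytes are made up for illustration, and the sample stands in for what `requests.get(url).content` would return.

```python
import io

import pandas as pd


def csv_bytes_to_df(raw: bytes, encoding: str = "utf-8") -> pd.DataFrame:
    """Decode raw CSV bytes and parse them into a pandas DataFrame."""
    return pd.read_csv(io.StringIO(raw.decode(encoding)))


# Made-up sample bytes standing in for `requests.get(url).content`
sample = b"SO2,HCHO\n0.1,0.2\n0.3,0.4\n"
sample_df = csv_bytes_to_df(sample)
print(sample_df.shape)
```

With a real download, it is worth calling `raise_for_status()` on the response before using its `.content`, so that a wrong URL fails loudly instead of producing a DataFrame built from an error page. For public files, `pd.read_csv` can also be given the raw URL directly.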
If you need to download data from a private repository, you need to use a personal access token.
# Username of your GitHub account
github_username = 'YOUR GITHUB USERNAME'

# Personal Access Token (PAT) of your GitHub account
personal_token = 'YOUR GITHUB PAT'

# Creates a reusable session object that includes your GitHub credentials.
github_session = requests.Session()
github_session.auth = (github_username, personal_token)

# Downloading the csv file from your private repository on GitHub (make sure the url is the raw version of the file on GitHub)
url = "https://raw.githubusercontent.com/rsipakov/PythonProjectsShared/master/seaborn/SO2_HCHO_correlation/Dataset_SO2_HCHO_post_20_Kyiv_2017.csv"
# Use the authenticated session, not plain requests.get, so the credentials are sent
download = github_session.get(url).content

# Reading the downloaded content and turning it into a pandas dataframe
SO2_df = pd.read_csv(io.StringIO(download.decode('utf-8')))
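Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. One possible alternative, sketched here under assumptions, is to read the token from an environment variable. The helper name `make_github_session` and the variable name `GITHUB_TOKEN` are hypothetical choices for this example, not something GitHub or the original notebook requires.

```python
import os

import requests


def make_github_session(username: str, token_env: str = "GITHUB_TOKEN") -> requests.Session:
    """Build a requests session whose basic-auth credentials come from the environment."""
    token = os.environ.get(token_env)
    if not token:
        # Fail early with a clear message instead of sending an unauthenticated request
        raise RuntimeError(f"Environment variable {token_env} is not set")
    session = requests.Session()
    session.auth = (username, token)
    return session
```

The returned session can then be used exactly like `github_session` above, for example `make_github_session('YOUR GITHUB USERNAME').get(url)`.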
This is how the DataFrame looks:
# View the first five rows of the data
SO2_df.head()
# Generate the correlation matrix
SO2_df.corr()
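To see what the correlation matrix contains, it can help to run `.corr()` on data where the answer is known in advance. The sketch below uses synthetic columns (`a`, `b`, `c`, `label`) invented for this example, and assumes pandas 1.5 or newer, where `corr` accepts `numeric_only`. Passing `numeric_only=True` skips text columns such as dates or station names, which recent pandas versions may otherwise refuse to correlate.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.normal(size=200)

# Synthetic data with known relationships (not the SO2/HCHO data set)
demo_df = pd.DataFrame({
    "a": x,
    "b": 2 * x + 1,          # exact increasing linear function of a
    "c": -x,                 # exact decreasing linear function of a
    "label": ["row"] * 200,  # non-numeric column, excluded below
})

# Pearson correlation between every pair of numeric columns
demo_corr = demo_df.corr(numeric_only=True)
print(demo_corr.round(2))
```

The diagonal is always 1, since every column correlates perfectly with itself, and the matrix is symmetric. That is why a heatmap of it mirrors across the diagonal.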
# Output data correlation into .xlsx file
cr1 = SO2_df.corr()
cr1.to_excel("output_SO2_HCHO.xlsx")
# Generate a heatmap using .corr() function
sns.heatmap(SO2_df.corr())
# Save heatmap in the .png format
sns.heatmap(SO2_df.corr())
plt.savefig('heatmap_SO2_HCHO.png', transparent=True)
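The default heatmap shows colors only. As an optional refinement, the sketch below prints the coefficient in each cell and pins the color scale to the full correlation range of -1 to 1, so that colors are comparable between plots. It runs on synthetic data; the `NO2` column and the specific styling choices (`coolwarm`, two decimals) are assumptions made for this example.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Synthetic stand-in for the real data set; the NO2 column is made up
demo = pd.DataFrame(rng.normal(size=(50, 3)), columns=["SO2", "HCHO", "NO2"])
corr = demo.corr()

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr,
    annot=True,       # write the coefficient inside each cell
    fmt=".2f",        # two decimal places
    vmin=-1, vmax=1,  # fix the color scale to the full correlation range
    cmap="coolwarm",  # diverging palette centered near zero
    square=True,
    ax=ax,
)
ax.set_title("Correlation matrix (synthetic data)")
fig.tight_layout()
```

To save the result, `fig.savefig(...)` can be called in the same way as `plt.savefig` above.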
One more: a basic seaborn.scatterplot()
# Generate a scatterplot
sns.scatterplot(x='SO2', y='HCHO', data=SO2_df)
# Save scatterplot in the .pdf format
sns.scatterplot(x='SO2', y='HCHO', data=SO2_df)
plt.savefig('scatterplot_SO2_HCHO.pdf')
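A scatterplot and a heatmap cell describe the same pair of columns, so it can be useful to show the coefficient on the scatterplot itself. The sketch below does this for synthetic data in which `HCHO` is deliberately built from `SO2` plus noise. The numbers are invented for illustration and say nothing about the real measurements.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
so2 = rng.normal(size=100)
# Synthetic data: HCHO is constructed to depend linearly on SO2, plus some noise
pair_df = pd.DataFrame({"SO2": so2, "HCHO": 0.5 * so2 + 0.1 * rng.normal(size=100)})

# Pearson correlation for this one pair, the same value a heatmap cell would show
r = pair_df["SO2"].corr(pair_df["HCHO"])

fig, ax = plt.subplots()
sns.scatterplot(x="SO2", y="HCHO", data=pair_df, ax=ax)
ax.set_title(f"SO2 vs HCHO (synthetic), r = {r:.2f}")
```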
- If you would like to load the data set from a local file (for example, .xls), use the following:
SO2_df = pd.read_excel('/PATH TO/Dataset_SO2_HCHO_post_20_Kyiv_2017.xls', engine='xlrd')
In this blog post, I've gone over the basics of how to create and use heatmaps in Python. Now you can quickly get started with your own Jupyter Notebook project in Google Colaboratory.
You may get started immediately by importing a Jupyter Notebook for this tutorial from my public GitHub repository.
I hope you found this blog post helpful. If so, please share with your friends! Thank you for reading.