How to create a Gantt chart by accessing the Google spreadsheet API

Danilo Galindo
Jun 17, 2020

Contents Outline

How to create a Gantt chart by accessing the Google spreadsheet API

Jun 17, 2020 4 minutes read

As we know there are several ways to store our data. Normally, we can read and extract information easily by means of txt,csv files among many others. However, we can also extract information from the Google cloud. In this post we will focus on extracting data from the Google spreadsheet API and plotting a Gantt chart with this data.

To do this, we'll follow the general steps below.

  • Create a Google spreadsheet with task data and their respective times.
  • Go to https://console.developers.google.com/ and create a new project.
  • Enable the Google drive API.
  • Create our credentials.
  • Enable the Google sheets API.
  • Share access to our Google sheet through the API with the previously created credentials.
  • Create our jupyter notebook and let the action begin!

The first step as mentioned above is to create our spreadsheet called times with the following fields. 

  • Task_id
  • start_time
  • end_time
  • date


Then, we access https://console.developers.google.com/ and create our project.



We give our project a name and leave the field location like no other. This will not affect our project, then we click on continue.



In this step, we will search the Google library by clicking on APIs and services.



We search by Google drive API and give you enable.



We click on create credentials.



Then, we set up our credentials as follows at the bottom.



We click on the option "What credentials do I need" and configure as follows.



Select continue and a .json file will be generated. This must be secured, as we will use it later.

Again, we search the Google library for the Google sheets API and give it a check.



We opened our .json file that was downloaded in previous steps. From this, we copy our "client_email" to be shared with our spreadsheet.



To do this we access our spreadsheet and click on share to paste our email.



It's time to create our Jupyter Notebook to start playing with our spreadsheet.

As a requirement we must install gspread and oauth2client with the pip command as follows
 
!pip install gspread oauth2client
 
Then, we import the previously downloaded libraries. 



We created our scope to be used in the credentials.



We created our credentials with ServiceAccountCredentials and passed our .json file with the scope as an argument.



We created our client with gspread.



We open our sheet where the information of our spreadsheet called times is contained.



We can get our data with the command get_all_records().



We print out our data.



If we want a friendly way to print our data, just import pprint.

from pprint import pprint



We can even extract the rows as follows.




Let's make our Gantt chart.

We import pandas as usual.



We created our DataFrame with the data from the spreadsheet.





If we check the type of data contained in the DataFrame, we will notice that it is of the string type. So we need to pass the fields date,end_time and start_time to type time.

The date field will be of type datetime and the fields start_time and end_time will be of type Timedelta as follows. 





We printed out our DataFrame.



On the other hand, it is not enough to have start_time and end_time on time. Somehow we must get the duration and express it as a whole number either in hours, minutes or seconds. To do this, we'll take the duration of the difference between start_time and end_time and express it in minutes just as start_time will be in minutes as well.





To plot our gantt chart we will only need the fields Task_id,start_time[m] and duration[m].



The gantt chart will be originated using the matplotlib broken_barh function that will take as arguments 3 parts that will be described below.

  • A tuple made up of the start_time and duration variables.
  • A tuple formed by the minimum value of the axis and/or height or increment.  
  • The color of each task.

This function should look like this at the bottom.

broken_barh((start_time,duration),(y_minimum,height),color)

We already know in a general way how the gantt function is composed. Now it's our turn to graph it by following the steps below.

We import matplotlib.



We created our dictionary of colors.



We created a figure to plot our gantt chart. Additionally, we iterate through a for to get the values of the start_time[m] and duration[m] fields for each task.





Finally, we get our gantt diagram for each assigned task.

This is a clear example that we have the opportunity to automate certain tasks by making use of the google sheet API. Just feed the spreadsheet, connect to the google sheet API and make use of our imagination. 
Join our private community in Discord

Keep up to date by participating in our global community of data scientists and AI enthusiasts. We discuss the latest developments in data science competitions, new techniques for solving complex challenges, AI and machine learning models, and much more!