Build A Web Data Dashboard In Just Minutes With Python

Admond Lee
May 15, 2020


Exponentially increase power & accessibility by converting your data visualizations into a web-based dashboard with Plotly Dash.


Build a web data dashboard — in just a few lines of Python code

I don’t know about you, but I occasionally find it a little bit intimidating to have to code something. This is doubly so when I’m building something akin to web development rather than doing some local data analysis and visualisation. I’m a competent Python coder, but I wouldn’t call myself a web developer at all, even after having more than dabbled with Django and Flask.

Still, converting your data outputs to a web app leads to a few non-trivial improvements for your project.

It is much easier to build true, powerful interactivity into a web app. It also means that you can control exactly how the data is presented, as the web app can become the de facto report as well as the access point to your data. Lastly, and most importantly, you can greatly widen access to your outputs, making them available anywhere, any time. There is always a web browser at a user’s fingertips.

So, I bit the bullet and started to do just this with some of my data projects recently, and it went surprisingly quickly. I converted one of my outputs from this article to a web app here in just a couple of hours.


My NBA analytics web app (link)

I thought this was rather cool, and wanted to share how this came together in just a few lines of code.

As always, I include everything you need to replicate my steps (data & code), and the article is not really about basketball. So do not worry if you are unfamiliar with it, and let’s get going.

Before we get started


Data

I include the code and data in my GitLab repo here (dash_simple_nbadirectory). So please feel free to play with it / improve upon it.

Packages

I assume you’re familiar with Python. Even if you’re relatively new to it, this tutorial shouldn’t be too tricky.

You’ll need pandas, plotly and dash. Install each (in your virtual environment) with a simple pip install [PACKAGE_NAME].
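If you want to confirm that the three packages are installed before going further, a small check along these lines can help. This is only a sketch: the helper name `missing_packages` is made up for illustration, and it only tests whether each package can be found, not which version you have.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["pandas", "plotly", "dash"])
    if missing:
        print("Install these first: pip install " + " ".join(missing))
    else:
        print("All set.")
```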

Previously, on Python…


For this tutorial, I am simply going to skip *most* of the steps taken to create the local version of our visualisation. If you’re interested in what is going on, take a look at this article:
“Create effective data visualizations of proportions” (towardsdatascience.com): Best ways to see individual contributions to a whole and changes over time, at various dataset sizes (incl…

We will have a recap session, though, so you can see what is happening between plotting the chart locally with Plotly, and how to port that to a web app with Plotly Dash.

Load data

I have pre-processed the data, and saved it as a CSV file. It is a collection of player data for the current NBA season (as of 26/Feb/2020), which shows:
  • What share of their team’s shots they are taking, and
  • How efficient / effective they are at doing it.

For this portion, follow along by opening local_plot.py in my repo.

Load the data with:
import pandas as pd
all_teams_df = pd.read_csv('srcdata/shot_dist_compiled_data_2019_20.csv')

Inspect the data with all_teams_df.head(), and you should see:

Each player’s data has been compiled for each minute of the game (excluding overtime). The only exceptions are the stats pl_acc and pl_pps, which have been compiled per quarter of the game (for each 12-minute period).

The dataframe contains all NBA players, so let’s break it down to a manageable size, by filtering for a team. For instance, the New Orleans Pelicans’ players can be chosen with:
all_teams_df[all_teams_df.group == 'NOP']
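To see what that filter does without the full CSV, here is a minimal sketch on a tiny, made-up frame. The column names follow the ones used in this article, but every value below is invented for illustration and is not real NBA data.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
toy_df = pd.DataFrame({
    "group": ["NOP", "NOP", "TOR", "SAS"],
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "min_mid": [0.5, 1.5, 0.5, 0.5],
    "shots_freq": [0.10, 0.05, 0.20, 0.15],
    "pl_pps": [105.0, 98.0, 112.0, 101.0],
})

# A boolean mask keeps only the rows whose 'group' column equals 'NOP'.
nop_df = toy_df[toy_df.group == "NOP"]
print(nop_df.player.tolist())  # ['Player A', 'Player B']
```

The original frame is left untouched, so the same dataframe can be filtered again for another team.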

Then, our data can be visualised in Plotly, as below:
import plotly.express as px
fig = px.scatter(all_teams_df[all_teams_df.group == 'NOP'], x='min_mid', y='player', size='shots_freq', color='pl_pps')
fig.show()


Visualised player data for New Orleans Pelicans

At the risk of doing this:


How to Draw a Horse — Van Oktop (Tweet)

I do add a few small details to my chart, to produce this version of the same graph.


Same chart, with a few ‘small details’ added (& different team).

This is the code that I used to do it.

Now, while it’s a lot of formatting code, I thought it useful to show you how I did it, because we are going to be re-using these functions in our Dash version of the code.

def clean_chart_format(fig):
    import plotly.graph_objects as go
    fig.update_layout(
        paper_bgcolor="white",
        plot_bgcolor="white",
        annotations=[
            go.layout.Annotation(
                x=0.9,
                y=1.02,
                showarrow=False,
                text="Twitter: @_jphwang",
                xref="paper",
                yref="paper",
                textangle=0
            ),
        ],
        font=dict(
            family="Arial, Tahoma, Helvetica",
            size=10,
            color="#404040"
        ),
        margin=dict(
            t=20
        )
    )
    fig.update_traces(marker=dict(line=dict(width=1, color='Navy')),
                      selector=dict(mode='markers'))
    fig.update_coloraxes(
        colorbar=dict(
            thicknessmode="pixels", thickness=15,
            outlinewidth=1,
            outlinecolor='#909090',
            lenmode="pixels", len=300,
            yanchor="top",
            y=1,
        ))
    fig.update_yaxes(showgrid=True, gridwidth=1, tickson='boundaries', gridcolor='LightGray', fixedrange=True)
    fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='LightGray', fixedrange=True)
    return True


def make_shot_dist_chart(input_df, color_continuous_scale=None, size_col='shots_count', col_col='pl_acc', range_color=None):

    max_bubble_size = 15
    if color_continuous_scale is None:
        color_continuous_scale = px.colors.diverging.RdYlBu_r
    if range_color is None:
        range_color = [min(input_df[col_col]), max(input_df[col_col])]

    fig = px.scatter(
        input_df, x='min_mid', y='player', size=size_col,
        color=col_col,
        color_continuous_scale=color_continuous_scale,
        range_color=range_color,
        range_x=[0, 49],
        range_y=[-1, len(input_df.player.unique())],
        hover_name='player', hover_data=['min_start', 'min_end', 'shots_count', 'shots_made', 'shots_freq', 'shots_acc', ],
        render_mode='svg'
    )
    fig.update_coloraxes(colorbar=dict(title='Points per<BR>100 shots'))
    fig.update_traces(marker=dict(sizeref=2. * 30 / (max_bubble_size ** 2)))
    fig.update_yaxes(title="Player")
    fig.update_xaxes(title='Minute', tickvals=list(range(0, 54, 6)))

    return fig


fig = make_shot_dist_chart(
    all_teams_df[all_teams_df.group == 'SAS'], col_col='pl_pps', range_color=[90, 120], size_col='shots_freq')
clean_chart_format(fig)
fig.update_layout(height=500, width=1250)
fig.show()
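One line worth unpacking is the `sizeref` setting. It appears to follow the scaling rule from Plotly's bubble chart documentation, `sizeref = 2 * max(size values) / (desired max marker size ** 2)`, here with an assumed largest size value of 30 and a target of 15 pixels. The sketch below only reproduces that arithmetic. The helper `approx_diameter` is hypothetical and assumes Plotly's default area-based sizing, so treat the pixel values as approximate.

```python
import math

max_bubble_size = 15     # target diameter in pixels, as in make_shot_dist_chart
assumed_max_value = 30   # the constant used in the sizeref line above

sizeref = 2.0 * assumed_max_value / (max_bubble_size ** 2)


def approx_diameter(value, sizeref):
    """Approximate marker diameter for a size value under area-based scaling."""
    return 2 * math.sqrt(value / (2 * sizeref))


print(round(sizeref, 4))                        # 0.2667
print(round(approx_diameter(30, sizeref), 2))   # 15.0
print(round(approx_diameter(7.5, sizeref), 2))  # 7.5
```

Under these assumptions, a value a quarter as large is drawn at half the diameter, which is what keeps bubble area proportional to the value.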


Now, let’s get to the main event — how to create a web app out of these plots.

Into the World Wide Web


You can read more about Plotly Dash here, but for now all you need to know is that it is an open-source software package developed to abstract away the difficulties in putting your visualisations on the web.

It works with Flask under the hood, and you can happily reuse most of the code that you used to develop plots in plotly.py.

This is the simple version that I put together:


import pandas as pd
import dash
from dash import dcc, html  # Dash 2.0+; older releases used the dash_core_components and dash_html_components packages
from dash.dependencies import Input, Output

all_teams_df = pd.read_csv('srcdata/shot_dist_compiled_data_2019_20.csv')

app = dash.Dash(__name__)
server = app.server
team_names = all_teams_df.group.unique()
team_names.sort()
app.layout = html.Div([
    html.Div([dcc.Dropdown(id='group-select', options=[{'label': i, 'value': i} for i in team_names],
                           value='TOR', style={'width': '140px'})]),
    dcc.Graph('shot-dist-graph', config={'displayModeBar': False})])

@app.callback(
    Output('shot-dist-graph', 'figure'),
    [Input('group-select', 'value')]
)
def update_graph(grpname):
    import plotly.express as px
    return px.scatter(all_teams_df[all_teams_df.group == grpname], x='min_mid', y='player', size='shots_freq', color='pl_pps')

if __name__ == '__main__':
    app.run_server(debug=False)  # newer Dash releases prefer app.run(debug=False)


Try it out! Run the script, then open the address it prints (http://127.0.0.1:8050 by default) in your browser. You should see this plot.


Our first Dash app!

What’s the big deal? Well, for one — it is a live web app, in under 25 lines of code. And notice the drop-down menu on the top left? Try changing the values on it, and watch the graph change *magically*.

Go on, I’ll wait.

Okay? Done.

Let’s briefly go through the code.

At a high level, what I’m doing here is to:
  • Initialise a Dash app;
  • Get a list of available team names, and provide it to a dropdown menu (with DOM id group-select) with a default value of ‘TOR’;
  • Instantiate a Graph object with the identifier shot-dist-graph within Dash; and
  • Create a callback so that whenever the dropdown value changes, Dash calls the update_graph function with the new value and passes the returned figure to the Output.
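The dropdown step in the list above boils down to turning a sorted list of team codes into label/value dictionaries. A minimal sketch, using a hard-coded, hypothetical subset of team codes instead of the real `all_teams_df.group.unique()` output:

```python
# Hypothetical subset of team codes; the app derives these from the dataframe.
team_names = sorted(["TOR", "NOP", "SAS"])

# Each dropdown entry is a dict with the text to show and the value to pass on.
options = [{"label": name, "value": name} for name in team_names]

print(options[0])  # {'label': 'NOP', 'value': 'NOP'}
```

The 'value' of whichever entry is selected is what the callback receives as `grpname`.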

If you take a look at the code, so much of what is probably trivial for web devs, but annoying for me, is abstracted away.

dcc.Graph wraps the figure object from plotly.py into my web app, and HTML components like divs can be created and set up conveniently with html.Div objects.

Most gratifying for me personally is that Input objects and callbacks from those inputs are declaratively set up, and I can avoid having to deal with things like HTML forms or JavaScript.
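To get a feel for what "declaratively set up" means, here is a toy, standard-library-only sketch of the pattern. It is not how Dash is implemented, and every name in it (`callback`, `set_value`, `state`) is invented for illustration: you declare which input feeds which output, and a small dispatcher calls your function when the input changes.

```python
state = {}        # current value of every named component property
_callbacks = {}   # maps an input name to (output name, function) pairs


def callback(output, input_):
    """Register the decorated function to run whenever `input_` changes."""
    def decorator(func):
        _callbacks.setdefault(input_, []).append((output, func))
        return func
    return decorator


def set_value(name, value):
    """Store a new value and re-run every callback that depends on it."""
    state[name] = value
    for output, func in _callbacks.get(name, []):
        state[output] = func(value)


@callback(output="shot-dist-graph.figure", input_="group-select.value")
def update_graph(grpname):
    # Stand-in for building a Plotly figure.
    return f"figure for {grpname}"


set_value("group-select.value", "TOR")
print(state["shot-dist-graph.figure"])  # figure for TOR
```

The function itself never touches the dropdown or the graph directly. It only maps an input value to an output value, which is the part Dash lets you write in plain Python.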

And the resulting app still works beautifully. The graph is updated the moment that the pulldown menu is used to select another value.

And we did all that in fewer than 25 lines of code.

Why Dash?

At this point, you might be asking — why Dash? We can do all this with a JS framework front end, and Flask, or any one of myriad other combinations.

For someone like me, who prefers the comfort of Python to dealing natively with HTML and CSS, Dash abstracts away a lot of work that doesn’t add much value to the end product.

Take, for instance, a version of this app that includes further formatting and notes for the audience:

(It is simple_dash_w_format.py in the git repo)


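# Imports needed by the two helper functions (the rest of the setup is as in the simple version above)
import plotly.express as px
import plotly.graph_objects as go


def clean_chart_format(fig):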
    fig.update_layout(
        paper_bgcolor="white",
        plot_bgcolor="white",
        annotations=[
            go.layout.Annotation(
                x=0.9,
                y=1.02,
                showarrow=False,
                text="Twitter: @_jphwang",
                xref="paper",
                yref="paper",
                textangle=0
            ),
        ],
        font=dict(
            family="Arial, Tahoma, Helvetica",
            size=10,
            color="#404040"
        ),
        margin=dict(
            t=20
        )
    )
    fig.update_traces(marker=dict(line=dict(width=1, color='Navy')),
                      selector=dict(mode='markers'))
    fig.update_coloraxes(
        colorbar=dict(
            thicknessmode="pixels", thickness=15,
            outlinewidth=1,
            outlinecolor='#909090',
            lenmode="pixels", len=300,
            yanchor="top",
            y=1,
        ))
    fig.update_yaxes(showgrid=True, gridwidth=1, tickson='boundaries', gridcolor='LightGray', fixedrange=True)
    fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='LightGray', fixedrange=True)
    return True


def make_shot_dist_chart(input_df, color_continuous_scale=None, size_col='shots_count', col_col='pl_acc', range_color=None):
    max_bubble_size = 15
    if color_continuous_scale is None:
        color_continuous_scale = px.colors.diverging.RdYlBu_r
    if range_color is None:
        range_color = [min(input_df[col_col]), max(input_df[col_col])]

    fig = px.scatter(
        input_df, x='min_mid', y='player', size=size_col,
        color=col_col,
        color_continuous_scale=color_continuous_scale,
        range_color=range_color,
        range_x=[0, 49],
        range_y=[-1, len(input_df.player.unique())],
        hover_name='player', hover_data=['min_start', 'min_end', 'shots_count', 'shots_made', 'shots_freq', 'shots_acc', ],
        render_mode='svg'
    )
    fig.update_coloraxes(colorbar=dict(title='Points per<BR>100 shots'))
    fig.update_traces(marker=dict(sizeref=2. * 30 / (max_bubble_size ** 2)))
    fig.update_yaxes(title="Player")
    fig.update_xaxes(title='Minute', tickvals=list(range(0, 54, 6)))

    return fig


app.title = 'Dash Demo - NBA'
team_names = all_teams_df.group.unique()
team_names.sort()
app.layout = html.Div([
    html.Div([
        dcc.Markdown(
            """
            #### Shot Frequencies & Efficiencies (2019-20 NBA Season)

            This page compares players based on shot *frequency* and *efficiency*, 
            divided up into minutes of regulation time for each team.

            Use the pulldown to select a team, or select 'Leaders' to see leaders from each team.


            *Notes*:

            * **Frequency**: A team's shots a player is taking, indicated by **size**.

            * **Efficiency**: Points scored per 100 shots, indicated by **colour** (red == better, blue == worse).

            * Players with <1% of team shots are shown under 'Others'
            """
        ),
        html.P([html.Small("See more data / NBA analytics content, find me on "), html.A(html.Small("twitter"), href="https://twitter.com/_jphwang", title="twitter"), html.Small("!")]),
    ]),
    html.Div([
        dcc.Dropdown(
            id='group-select',
            options=[{'label': i, 'value': i} for i in team_names],
            value='TOR',
            style={'width': '140px'}
        )
    ]),
    dcc.Graph(
        'shot-dist-graph',
        config={'displayModeBar': False}
    )
])


@app.callback(
    Output('shot-dist-graph', 'figure'),
    [Input('group-select', 'value')]
)
def update_graph(grpname):
    fig = make_shot_dist_chart(
        all_teams_df[all_teams_df.group == grpname], col_col='pl_pps', range_color=[90, 120], size_col='shots_freq')
    clean_chart_format(fig)
    if len(grpname) > 3:
        fig.update_layout(height=850, width=1250)
    else:
        fig.update_layout(height=500, width=1250)

    return fig


Most of the changes are cosmetic, but I will note that here, I just write the body text in Markdown, and simply carry over my formatting functions from Plotly to format the graphs in Dash.
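One detail in `update_graph` that is not purely cosmetic is the height switch: team codes are three letters long, so a longer value (presumably the 'Leaders' option mentioned in the page notes) gets a taller chart. A minimal sketch of that rule, with the hypothetical helper `chart_height` pulled out so that it can be checked on its own:

```python
def chart_height(grpname):
    """Return the chart height in pixels for a dropdown value."""
    # Three-letter values are team codes; anything longer gets more room.
    return 850 if len(grpname) > 3 else 500


print(chart_height("TOR"), chart_height("Leaders"))  # 500 850
```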

This saves me a tremendous amount of time in getting from data analysis and visualisation to something deployed in front of clients.

All in all, from starting with my initial graph, I think it probably took less than an hour to deploy it to Heroku. Which is pretty amazing.

I will get into more advanced features of Dash, and actually doing some cool things with it functionality-wise, but I was very happy with this outcome in terms of ease and speed.

Try it out yourself — I think that you’d be very impressed. Next time, I plan to write about some really cool things you can do with Dash, and building truly interactive dashboards.

If you liked this, say 👋 / follow on twitter, or follow for updates. This is the article that the data viz is based on: