Gapminder Animation with Plotly Express

by Dhafer Malouche

I recently built a data set on Research and Development from the World Development Indicators (WDI) data (https://data.worldbank.org/). It is now available on my website (https://malouche.github.io/data_in_class/RD_data.html) and can be downloaded in several formats (csv, xls, and so on).

The aim of this tutorial is to show how to draw a Gapminder animation using this data. I will show different examples of scatter plots.

Let's first import the needed Python libraries.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
sns.set_style("white")
import pandas as pd
my_dpi=96

I will now load the data, from which all rows with missing values have been omitted.

In [49]:
data = pd.read_csv("data_RD.csv")
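
The read_csv call above loads a file that is assumed to be already cleaned. For a raw export that still has gaps, a minimal sketch of omitting missing values with pandas could look like this. The toy frame and its values are made up for illustration and are not part of the R&D data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for a raw export with gaps
raw = pd.DataFrame({
    "country": ["A", "B", "C"],
    "year": [2003, 2003, 2003],
    "per_g": [1.8, np.nan, 0.5],
    "nart": [100.0, 200.0, np.nan],
})

# Keep only the rows in which every column has a value,
# then renumber the index from 0
clean = raw.dropna().reset_index(drop=True)
print(clean)
```

Dropping whole rows is the simplest choice; whether it suits a given analysis depends on how many observations are lost.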

Preview of the data

In [50]:
data.head()
Out[50]:
iso2c country region income year gdp pop per_g nr nart TotExp TotRD ExpOneRD Nart100 ExpOneArt
0 AU Australia East Asia & Pacific High 2003 4.718385e+11 19895400 1.781659 3866.243655 24849.7 8.406554e+09 76920.464004 109288.907976 32.305707 338295.975889
1 AT Austria Europe & Central Asia High 2003 2.622286e+11 8121423 2.174560 3071.221983 7749.6 5.702319e+09 24942.692849 228616.815998 31.069620 735821.077426
2 BE Belgium Europe & Central Asia High 2003 3.190028e+11 10376133 1.831840 2980.250339 10811.8 5.843621e+09 30923.473888 188970.401891 34.963084 540485.514761
3 BG Bulgaria Europe & Central Asia Low 2003 2.098269e+10 7775327 0.473411 1220.331502 1700.3 9.933439e+07 9488.476477 10468.950866 17.919631 58421.686781
4 BR Brazil Latin America & Caribbean Low 2003 5.583199e+11 182482149 0.999390 493.297570 16752.4 5.579793e+09 90018.000670 61985.307553 18.610056 333074.273351

I will now use plotly_express, which makes it possible to draw a Gapminder animation in two lines.

I first import the plotly_express library.

In [25]:
import plotly_express as px

I will start with a scatter plot of the R&D expenditure per researcher against the cost of one article. Both indicators are expressed in current USD. Let's first check some statistics on these variables.

In [60]:
data['ExpOneRD'].min()
Out[60]:
2451.21305037836
In [61]:
data['ExpOneRD'].max()
Out[61]:
397620.42333109
In [62]:
data['ExpOneArt'].min()
Out[62]:
41386.9212389199
In [63]:
data['ExpOneArt'].max()
Out[63]:
5206796.60623054
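
The four calls above can also be collapsed into one, and the result can be used to pick axis limits slightly wider than the data. The sketch below uses only the five preview rows shown earlier, so its numbers differ from the full-data minima and maxima. The helper padded_range and its factor are assumptions made for illustration, not necessarily how the ranges in this tutorial were chosen.

```python
import pandas as pd

# The five preview rows shown earlier (not the full data set)
preview = pd.DataFrame({
    "ExpOneRD": [109288.907976, 228616.815998, 188970.401891,
                 10468.950866, 61985.307553],
    "ExpOneArt": [338295.975889, 735821.077426, 540485.514761,
                  58421.686781, 333074.273351],
})

# Minimum and maximum of both columns in a single call
summary = preview[["ExpOneRD", "ExpOneArt"]].agg(["min", "max"])
print(summary)


def padded_range(lo, hi, factor=1.2):
    """Hypothetical helper: widen [lo, hi] multiplicatively, which suits log axes."""
    return [lo / factor, hi * factor]


range_x = padded_range(summary.loc["min", "ExpOneRD"], summary.loc["max", "ExpOneRD"])
range_y = padded_range(summary.loc["min", "ExpOneArt"], summary.loc["max", "ExpOneArt"])
print(range_x, range_y)
```

A multiplicative margin is used because both axes are logarithmic, so the same factor leaves the same visual gap at each end.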
In [92]:
fig = px.scatter(
    data, x="ExpOneRD", y="ExpOneArt",
    animation_frame="year", animation_group="country",
    height=600, width=1000,
    size="per_g", color="income", hover_name="country", size_max=50,
    log_x=True, log_y=True, text="iso2c",
    range_x=[2000, 450000], range_y=[35000, 5500000],
    labels=dict(ExpOneRD="Expenditure by one researcher (Current USD)",
                ExpOneArt="Cost of one article (Current USD)",
                per_g="Research and development expenditure (% of GDP)"))

We will now write the graph to a separate HTML file that can be embedded in your website.

In [83]:
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
In [90]:
plot(fig)
Out[90]:
'temp-plot.html'