Hands-on Exercise 3b: Programming Animated Statistical Graphics with R

Published

January 24, 2024

Modified

January 24, 2024

1 Getting Started

In this exercise, we will use the following R packages.

  • plotly: for making interactive and animated statistical graphs.

  • gganimate: a ggplot2 extension for creating animated statistical graphs.

  • gifski: converts video frames to GIF animations using pngquant’s fancy features for efficient cross-frame palettes and temporal dithering. It produces animated GIFs that use thousands of colors per frame.

  • tidyverse: a family of modern R packages specially designed to support data science, analysis and communication tasks, including creating static statistical graphs.

  • gapminder: an excerpt of the data available at Gapminder.org. We just want to use its country_colors scheme.

The code chunk below uses p_load() of the pacman package to check whether these packages are installed, install any that are missing, and load them into your R environment.

pacman::p_load(readxl, gifski, gapminder,
               plotly, gganimate, tidyverse)
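
For readers who prefer not to depend on pacman, the check-install-load behaviour described above can be approximated in base R. The sketch below is only an illustration under stated assumptions, not pacman's actual implementation: missing_pkgs() and load_or_install() are hypothetical helper names, and install.packages() assumes that a CRAN mirror is already configured.

```r
# Hypothetical helper: names of the packages that are not installed yet.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing, then attach everything.
load_or_install <- function(pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)  # assumes a CRAN mirror is configured
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Usage (commented out so that nothing is installed by accident):
# load_or_install(c("readxl", "gifski", "gapminder",
#                   "plotly", "gganimate", "tidyverse"))
```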

The code chunk below imports the Data worksheet from the GlobalPopulation Excel workbook by using read_xls() of the readxl package, a member of the tidyverse family.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_each_(funs(factor(.)), col) %>%
  mutate(Year = as.integer(Year))
Important
  • read_xls() of the readxl package is used to import the Excel worksheet.

  • mutate_each_() of the dplyr package is used to convert the columns named in col (Country and Continent) into factors.

  • mutate() of the dplyr package is used to convert the values of the Year field into integers.

Unfortunately, mutate_each_() was deprecated in dplyr 0.7.0 and funs() was deprecated in dplyr 0.8.0. In view of this, we will rewrite the code by using mutate_at(), as shown in the code chunk below.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_at(col, as.factor) %>%
  mutate(Year = as.integer(Year))

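Instead of using mutate_at(), across() can be used to get the same output. Because col is a character vector stored outside the data frame, it is wrapped in all_of() so that the column selection is unambiguous.

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate(across(all_of(col), as.factor)) %>%
  mutate(Year = as.integer(Year))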
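
To see what the conversion does without needing the Excel workbook, the same across() pattern can be tried on a small made-up table. This is a minimal sketch: the toy data frame and its values are assumptions for illustration only, and all_of() is the tidyselect helper for selecting columns named in an external character vector.

```r
library(dplyr)

# Toy data standing in for the imported worksheet (values are made up).
toy <- tibble(
  Country   = c("A", "B", "A", "B"),
  Continent = c("X", "Y", "X", "Y"),
  Year      = c(2000, 2000, 2001, 2001)
)

col <- c("Country", "Continent")

toy_fct <- toy %>%
  mutate(across(all_of(col), as.factor),  # character -> factor
         Year = as.integer(Year))         # double -> integer

# Inspect the resulting column classes.
sapply(toy_fct, class)
```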

The code chunk below uses summary() to summarize the data.

summary(globalPop)
        Country          Year          Young             Old       
 Afghanistan:  28   Min.   :1996   Min.   : 15.50   Min.   : 1.00  
 Albania    :  28   1st Qu.:2010   1st Qu.: 25.70   1st Qu.: 6.90  
 Algeria    :  28   Median :2024   Median : 34.30   Median :12.80  
 Andorra    :  28   Mean   :2023   Mean   : 41.66   Mean   :17.93  
 Angola     :  28   3rd Qu.:2038   3rd Qu.: 53.60   3rd Qu.:25.90  
 Anguilla   :  28   Max.   :2050   Max.   :109.20   Max.   :77.10  
 (Other)    :6036                                                  
   Population                Continent   
 Min.   :      3.3   Africa       :1568  
 1st Qu.:    605.9   Asia         :1454  
 Median :   5771.6   Europe       :1344  
 Mean   :  34860.9   North America: 976  
 3rd Qu.:  22711.0   Oceania      : 526  
 Max.   :1807878.6   South America: 336  
                                         

2 Animated Data Visualisation: gganimate methods

gganimate extends the grammar of graphics as implemented by ggplot2 to include the description of animation. It does this by providing a range of new grammar classes that can be added to the plot object in order to customise how it should change with time.

  • transition_*() defines how the data should be spread out and how it relates to itself across time.

  • view_*() defines how the positional scales should change along the animation.

  • shadow_*() defines how data from other points in time should be presented in the given point in time.

  • enter_*()/exit_*() defines how new data should appear and how old data should disappear during the course of the animation.

  • ease_aes() defines how different aesthetics should be eased during transition.
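
The grammar classes listed above can be combined in a single plot specification. The sketch below uses a small made-up data frame (toy, with hypothetical columns id, time, x and y) and picks one function from each family purely for illustration; other members of each family could be substituted.

```r
library(ggplot2)
library(gganimate)

# Made-up data: three points moving over five time steps.
toy <- data.frame(
  id   = rep(c("a", "b", "c"), each = 5),
  time = rep(1:5, times = 3),
  x    = c(1:5, 2:6, 3:7),
  y    = c(5:1, 1:5, rep(3, 5))
)

anim <- ggplot(toy, aes(x = x, y = y, colour = id)) +
  geom_point(size = 4) +
  transition_time(time) +           # transition_*(): animate over `time`
  view_follow() +                   # view_*(): let the axes track the data
  shadow_wake(wake_length = 0.2) +  # shadow_*(): trail of earlier frames
  enter_fade() +                    # enter_*(): new points fade in
  exit_fade() +                     # exit_*(): old points fade out
  ease_aes("cubic-in-out")          # ease_aes(): non-linear interpolation

# Printing `anim` (or calling animate(anim)) renders the animation.
```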

2.1 Building a static population bubble plot

The basic ggplot2 functions are used to create a static bubble plot.

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young')

2.2 Building the animated bubble plot

In the code chunk below,

  • transition_time() of gganimate is used to create transitions through distinct states in time (i.e. Year). The {frame_time} placeholder in the plot title is replaced by the time value of each frame.
  • ease_aes() is used to control the easing of aesthetics. The default is linear. Other easing functions are: quadratic, cubic, quartic, quintic, sine, circular, exponential, elastic, back, and bounce.
ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') +
  transition_time(Year) +       
  ease_aes('linear')          
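
Printing a gganimate object renders it with default settings. When more control is needed, animate() and anim_save() of gganimate can be called explicitly. The sketch below is a minimal illustration on made-up data: the toy data frame, the frame count, the frame rate, the image size and the output file name are all assumptions chosen to keep rendering quick.

```r
library(ggplot2)
library(gganimate)

# Made-up data: two groups observed over five years.
toy <- data.frame(
  Year  = rep(2000:2004, times = 2),
  group = rep(c("a", "b"), each = 5),
  Old   = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14),
  Young = c(40, 38, 36, 34, 32, 30, 29, 28, 27, 26)
)

p <- ggplot(toy, aes(x = Old, y = Young, colour = group)) +
  geom_point(size = 4) +
  labs(title = "Year: {frame_time}") +
  transition_time(Year) +
  ease_aes("linear")

# Render explicitly: 20 frames at 5 frames per second, as a GIF via gifski.
gif <- animate(p, nframes = 20, fps = 5,
               width = 400, height = 300,
               renderer = gifski_renderer())

# Save the rendered animation to a temporary file.
out_file <- file.path(tempdir(), "toy_animation.gif")
anim_save(out_file, animation = gif)
```

Raising nframes gives smoother motion at the cost of a longer rendering time and a larger file.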

3 Animated Data Visualisation: plotly

Both ggplotly() and plot_ly() in the plotly R package support key frame animations through the frame argument/aesthetic. They also support an ids argument/aesthetic to ensure smooth transitions between objects with the same id.
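
As a minimal sketch of the frame and ids arguments, the code chunk below animates two markers over three steps. The toy data frame and its column names (id, step, x, y) are made up for illustration.

```r
library(plotly)

# Made-up data: two markers ("a" and "b") observed at three steps.
toy <- data.frame(
  id   = rep(c("a", "b"), times = 3),
  step = rep(1:3, each = 2),
  x    = c(1, 3, 2, 2, 3, 1),
  y    = c(1, 1, 2, 3, 3, 5)
)

fig <- plot_ly(toy,
               x = ~x, y = ~y,
               frame = ~step,  # one key frame per value of `step`
               ids = ~id,      # ties each marker to the same object across frames
               type = "scatter",
               mode = "markers")

# Print `fig` in an interactive session to view the animation.
```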

3.1 Building an animated bubble plot: ggplotly() method

The animated bubble plot produced by the code chunk below includes a play/pause button and a slider component for controlling the animation.

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young')

ggplotly(gg)
Important
  • Appropriate ggplot2 functions are used to create a static bubble plot. The output is then saved as an R object called gg.

  • ggplotly() is then used to convert the R graphic object into an animated SVG object.

Notice that although the show.legend = FALSE argument was used, the legend still appears on the plot. To overcome this problem, theme(legend.position='none') should be used, as shown in the code chunk below.

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young') + 
  theme(legend.position='none')

ggplotly(gg)

3.2 Building an animated bubble plot: plot_ly() method

bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent,
          sizes = c(2, 100),
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          ) %>%
  layout(showlegend = FALSE)
bp
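
The play button, slider and frame timing of a plotly animation can be adjusted with animation_opts(), animation_button() and animation_slider() of plotly. The sketch below applies them to a small made-up data frame rather than globalPop. The timing values, the button position and the "Year: " prefix are illustrative assumptions.

```r
library(plotly)

# Made-up data: two countries observed over three years.
toy <- data.frame(
  Country = rep(c("A", "B"), times = 3),
  Year    = rep(2000:2002, each = 2),
  Old     = c(5, 10, 6, 11, 7, 12),
  Young   = c(40, 30, 38, 29, 36, 28)
)

fig <- plot_ly(toy,
               x = ~Old, y = ~Young,
               frame = ~Year,
               ids = ~Country,
               type = "scatter",
               mode = "markers") %>%
  # 1000 ms per frame, with a 500 ms linear transition between frames.
  animation_opts(frame = 1000, transition = 500,
                 easing = "linear", redraw = FALSE) %>%
  # Move the play button to the bottom-right corner.
  animation_button(x = 1, xanchor = "right",
                   y = 0, yanchor = "bottom") %>%
  # Prefix the slider's current value with "Year: ".
  animation_slider(currentvalue = list(prefix = "Year: "))

# Print `fig` in an interactive session to view the animation.
```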