Hands-on Exercise 3a : Programming Interactive Data Visualisation with R

Published

January 24, 2024

Modified

February 4, 2024

1 Getting Started

In this exercise, we will use the following our R packages.

  • ggiraph: for making ‘ggplot’ graphics interactive.

  • plotly, R library for plotting interactive statistical graphs.

  • DT provides an R interface to the JavaScript library DataTables that create interactive table on html page.

  • tidyverse, a family of modern R packages specially designed to support data science, analysis and communication task including creating static statistical graphs.

  • patchwork for combining multiple ggplot2 graphs into one figure.

The code chunk below uses p_load() of pacman package to check if these packages are installed in the computer and load them onto your working R environment.

pacman::p_load(ggiraph, plotly, 
               patchwork, DT, tidyverse) 

The code chunk below imports exam_data.csv into R environment by using read_csv() function of readr package.

exam_data <- read.csv("data/Exam_data.csv")

The code chunk below uses summary()to summarize the data.

summary(exam_data)
      ID               CLASS              GENDER              RACE          
 Length:322         Length:322         Length:322         Length:322        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
    ENGLISH          MATHS          SCIENCE     
 Min.   :21.00   Min.   : 9.00   Min.   :15.00  
 1st Qu.:59.00   1st Qu.:58.00   1st Qu.:49.25  
 Median :70.00   Median :74.00   Median :65.00  
 Mean   :67.18   Mean   :69.33   Mean   :61.16  
 3rd Qu.:78.00   3rd Qu.:85.00   3rd Qu.:74.75  
 Max.   :96.00   Max.   :99.00   Max.   :96.00  

2 ggiraph methods

ggiraph is an htmlwidget and a ggplot2 extension. It allows ggplot graphics to be interactive.

Interactive is made with ggplot geometries that can understand three arguments:

  • Tooltip: a column of data-sets that contain tooltips to be displayed when the mouse is over elements.

  • Onclick: a column of data-sets that contain a JavaScript function to be executed when elements are clicked.

  • Data_id: a column of data-sets that contain an id to be associated with elements.

When using within a shiny application, elements associated with an id (data_id) can be selected and manipulated on client and server sides.

2.1 Tooltip effect with tooltip aesthetic

The below code chunk uses ggiraph to plot an interactive statistical graph. It consists of two parts. First, an ggplot object will be created. Next, girafe() of ggiraph will be used to create an interactive svg object.

Code
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot",
    fill = "#AB1858") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)
Note

First, an interactive version of ggplot2 geom (i.e. geom_dotplot_interactive()) will be used to create the basic graph. Then, girafe() will be used to generate an svg object to be displayed on an html page.

3 Interactivity

3.1 Displaying multiple information on tooltip

The student’s ID will be displayed when you hover the mouse pointer on data point of interest. The content of the tooltip can be customized by including a list object as shown in the code chunk below.

Code
exam_data$tooltip <- c(paste0(     
  "Name = ", exam_data$ID,         
  "\n Class = ", exam_data$CLASS)) 

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = exam_data$tooltip), 
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot",
    fill = "#AB1858") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 8,
  height_svg = 8*0.618
)
Note

First 3 lines of codes create a new field called tootip. At the same time, it populates text in ID and CLASS fields into the newly created field. Next, this newly created field is used as tooltip field as shown in the code of line 7.

3.2 Customizing Tooltip style

The below chunk code uses opts_tooltip() of ggiraph to customize tooltip rendering by add css declarations.

Code
tooltip_css <- "background-color:pink; #<<
font-style:bold; color:black;" #<<

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = exam_data$tooltip),                   
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(    #<<
    opts_tooltip(    #<<
      css = tooltip_css)) #<<
)                                        

3.3 Displaying statistics on tooltip

The below chunk code shows an advanced way to customise tooltip. In this example, a function is used to compute 90% confident interval of the mean. The derived statistics are then displayed in the tooltip.

Code
tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE),
) +
  stat_summary(aes(y = MATHS, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = "mean_se", 
    geom = GeomInteractiveCol,  
    fill = "pink"
  ) +
  stat_summary(aes(y = MATHS),
    fun.data = mean_se,
    geom = "errorbar", width = 0.2, size = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)

3.4 Hover effect with data_id aesthetic

Code chunk below shows the second interactive feature of ggiraph, namely data_id.

Code
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS,
        tooltip = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)                                        

Interactivity: Elements associated with a data_id (i.e CLASS) will be highlighted upon mouse over.

3.5 Styling hover effect

In the code chunk below, css codes are used to change the highlighting effect.

Code
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: pink;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                        

Interactivity: Elements associated with a data_id (i.e CLASS) will be highlighted upon mouse over.

Note

Note: Different from previous example, in this example the ccs customization request are encoded directly.

3.6 Combining tooltip and hover effect

We can combine tooltip and hover effect on the interactive statistical graph using the below code chunk.

Code
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #pink;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                        

3.7 Click effect with onclick

onclick argument of ggiraph provides hotlink interactivity on the web.

The code chunk below shown an example of onclick.

Code
exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618)                                        

Interactivity: Web document link with a data object will be displayed on the web browser upon mouse click.

Note

Note that click actions must be a string column in the dataset containing valid javascript instructions.

3.8 Coordinated Multiple Views with ggiraph

We can implement coordinated multiple views methods in the data visualization below.

When a data point of one of the dotplot is selected, the corresponding data point ID on the second data visualization will be highlighted too.

In order to do so, the following programming strategy will be used:

  1. Appropriate interactive functions of ggiraph will be used to create the multiple views.
  2. patchwork function of patchwork package will be used inside girafe function to create the interactive coordinated multiple views.
p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #pink;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       ) 
Note

The data_id aesthetic is critical to link observations between plots and the tooltip aesthetic is optional but nice to have when mouse over a point.

4 plotly methods

There are two ways to create interactive graph by using plotly, they are:

  • by using plot_ly(), and

  • by using ggplotly()

4.1 Creating an interactive scatter plot: plot_ly() method

The tabset below shows an example a basic interactive plot created by using plot_ly().

plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)

4.2 Working with visual variable: plot_ly() method

The color argument is mapped to a qualitative visual variable (i.e. RACE).

plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)

4.3 Creating an interactive scatter plot: ggplotly() method

p <- ggplot(data=exam_data, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
ggplotly(p)

There is only extra line that you need to include in the code chunk which is ggplotly().

4.4 Coordinated Multiple Views with plotly

There are three steps involved.

  • highlight_key() of plotly package is used as shared data.

  • two scatterplots will be created by using ggplot2 functions.

  • subplot() of plotly package is used to place them next to each other side-by-side.

d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))

5 crosstalk methods

Crosstalk is an add-on to the htmlwidgets package. It extends htmlwidgets with a set of classes, functions, and conventions for implementing cross-widget interactions.

5.1 Interactive Data Table: DT package

A wrapper of the JavaScript Library DataTables. Data objects in R can be rendered as HTML tables using the JavaScript library ‘DataTables’.

DT::datatable(exam_data, class= "compact")

5.2 Linked brushing: crosstalk method

d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)        

Things to learn from the code chunk:

  • highlight() is a function of plotly package. It sets a variety of options for brushing multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

  • bscols() is a helper function of crosstalk package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document.