Hi. I’m Sharon Machlis at IDG Communications, here with Do More With R: Add sparklines to your tables.
Often, all you need in a table is data as text and numbers. But sometimes, you’d like to visualize results in each row, too. That’s especially true when each row of data is a trend over time.
You can do that right inside a table, in a new column, with mini in-line graphs called sparklines. You might be familiar with them in Excel. We can do them in interactive R tables, too.
Let’s get started.
Last episode, I demo’d how easy it is to create basic HTML tables with the DT package. Here’s the code for that.
I loaded a few package and read in a spreadsheet of data about home prices in 5 cities. Then I created a table with DT’s datatable() function. Finally, I added a little formatting.
There are four steps if I want to add a sparkline column to this table:
1. Add a column in the data frame that has sparkline data and formatting.
3. This one is very easy: Add escape = FALSE as a datatable() argument so HTML displays as HTML and not as the actual code.
4. This is also very easy: Pipe the results to a function that adds necessary dependencies so the table will display sparklines.
Let’s take a look, step by step.
First I need to add a column to my dataframe with the sparkline information. Here’s the format for doing that:
You use the s.p.k underscore c.h.r function. That takes 2 required arguments: a vector of numeric values to visualize, and the type of graph you want. Dataviz choices include line for a line chart, bar for a bar chart, box for a box plot, and a few more. Unfortunately, this isn’t actually documented in the sparkline package help files. But, you can see available types if you look up documentation for the JQuery sparkline library. https://omnipotent.net/jquery.sparkline/#s-docs
I like to use two optional arguments in my sparklines: setting the Y axis minimums and maximums.
So how do we get the vector of values for each row to use in the sparklines? I could write a for loop, but this is actually easier to do if the data is “tidy”. That is, if I have only one observation per row, instead of the way it is now: multiple observations per row. I can create a tidy version of this data using the tidyr package and its gather() function.
I’ll create a tidy version of the prices data frame by first removing the percent Change column with select minus Change. I don’t want that percent change number to be in the trends I’m graphing! Next, I name the new category column Quarter, the new value column Price, and I “gather” every column between Q1 1996 and Q1 2018. Here’s what that looks like now:
One observation for each row: The MetroArea, the Quarter, and the Price.
The next block of code creates a data frame with sparkline info. It will have a column for MetroArea and a new column called TrendSparkline. If you look at the code, I first group by MetroArea, and then create the sparkline column.
It’s worth making sure you get what’s happening here to create the sparkline column. After grouping by MetroArea, I’m creating a new TrendSparkline column with the s.p.k underscore c.h.r function. The first argument here is my vector of values for each group – and that’s created automatically from the tidy data’s Price column, because I grouped by MetroArea already. I set the graph type to be a line chart. In this case, I want the Y axis minimum value to be 100, because that’s where the price index started in 1995. And I set the Y-axis maximum to be whatever the Price data’s maximum value is.
Here’s what this new dataframe looks like
You can see the TrendSparkline column has a bunch of HTML in it. Here’s what one of those values looks like
I can add data from this data frame back to my original prices data frame by using a dplyr left_join.
Now see what the data frame looks like.
The hard part is done. Now I just need to follow steps 2, 3, and 4.
That code starting from f.n.DrawCallback to the second single quote mark and closing parenthesis is what you need to add to the options list argument of the datatable code. I’ll have that code snippet available in the article accompanying this video at https://go.infoworld.com/Rsparklines.
If you take a look at the table now
You can see that the HTML code is appearing as the code itself, and not the code executing.
We can fix that with Step 3, adding escape = FALSE to the datatable() code. The default is escape = TRUE, which means the HTML code is escaped instead of executing.
If I run the table with escaped code and take a look in RStudio , there’s nothing there at all in the sparklines column. That’s because we need step 4, which adds the necessary dependencies to display the sparklines.
Do you see line 11 here, s.p.k. underscore add deps? That’s it. Now if we run the code
You’ll see our sparklines. You can even mouse over the graph to see the actual data points!
And we have an easy way to see trends over time for each city, just by scanning down this column.
That’s it for this episode, thanks for watching. If you want to learn more about tidyr and the gather() function, check out Episode 12, Reshape data with tidyr, at https://go.infoworld.com/morewithR. That’s https go dot infoworld dot com slash more with R, all lowercase except for the R. You can also find the Do More With R series on YouTube. Hope to see you next episode!