Hi. I’m Sharon Machlis at IDG Communications, here with Do More With R: Quick lookup tables using named vectors.
What’s the state abbreviation for Arkansas? Is it AR? AK? AS?
Maybe you’ve got a data frame with the information. Or any info where there’s one column with categories, and another column with values. Chances are, at some point you’d like to look up the value by category, sometimes known as the key. A lot of programming languages have ways to work with key-value pairs. This is easy to do in R, too, with named vectors. Here’s how.
I’ve got a spreadsheet with state names and abbreviations here.
Let’s import that into R.
And see what it looks like.
Let me show you how easy it is to create a “lookup table” by making a named vector.
The lookup table slash named vector has values as the vector, and keys as the names. So let’s first make a vector of the values, which are in the PostalCode column.
And next we’ll add names from the State column.
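Here's a minimal sketch of those two steps. The small data frame is a stand-in for the imported spreadsheet: only the column names State and PostalCode and the vector name getpostalcode come from the walkthrough, while the data frame name postal_df and the four sample rows are assumptions for illustration.

```r
# Stand-in for the imported spreadsheet (hypothetical name, sample rows only).
postal_df <- data.frame(
  State = c("Alabama", "Alaska", "Arizona", "Arkansas"),
  PostalCode = c("AL", "AK", "AZ", "AR"),
  stringsAsFactors = FALSE
)

# Step 1: the values become the vector itself.
getpostalcode <- postal_df$PostalCode

# Step 2: the keys become the names of that vector.
names(getpostalcode) <- postal_df$State

# Print the named vector: names on top, values underneath.
getpostalcode
```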
To use this named vector as a lookup table, the format is: the name of my lookup vector, open bracket, the key in quotation marks, close bracket.
So here’s how I’d get the postal code for Arkansas. getpostalcode, bracket, Arkansas in quotation marks (single or double, it doesn’t matter), close bracket.
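As a short sketch, the lookup looks like this. The named vector is built inline here with a few sample states so the snippet runs on its own; in the walkthrough it comes from the spreadsheet.

```r
# Sample named vector: names are the keys, elements are the values.
getpostalcode <- c(Alabama = "AL", Alaska = "AK", Arizona = "AZ", Arkansas = "AR")

# Look up by key; double and single quotes behave the same.
ar_double <- getpostalcode["Arkansas"]
ar_single <- getpostalcode['Arkansas']

# The result is still a named vector: the key travels with the value.
ar_double
```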
If you want just the value, without the key, add the unname function to that value you get back:
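A minimal sketch of dropping the key, again with a small sample vector standing in for the real one (the variable name ar_code is just for illustration):

```r
# Sample named vector standing in for the full lookup table.
getpostalcode <- c(Alabama = "AL", Alaska = "AK", Arizona = "AZ", Arkansas = "AR")

# unname() strips the name, leaving only the value.
ar_code <- unname(getpostalcode["Arkansas"])
ar_code
```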
That’s all there is to it. I know this is a somewhat trivial example, but this has some real-world use. For instance, I’ve got a named vector of FIPS codes that I need when working with U.S. Census data.
Here I’m reading in the data from my CSV file, creating a vector called getfips with the fips code column, and adding the states as names.
Now if I want the FIPS code for Massachusetts, I can use getfips brackets Massachusetts. And, add unname to get just the value without the name.
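Roughly, the FIPS version could look like the sketch below. The CSV file and its column names aren't given here, so an inline data frame with assumed column names State and FIPS stands in for it; only the vector name getfips comes from the walkthrough. The codes are kept as character strings so leading zeros survive.

```r
# Stand-in for the CSV data (hypothetical data frame and column names).
fips_df <- data.frame(
  State = c("Alaska", "Arkansas", "Massachusetts"),
  FIPS = c("02", "05", "25"),
  stringsAsFactors = FALSE
)

# Values first, then the state names as keys.
getfips <- fips_df$FIPS
names(getfips) <- fips_df$State

# Lookup with the name attached.
getfips["Massachusetts"]

# Lookup with just the value.
ma_fips <- unname(getfips["Massachusetts"])
ma_fips
```

If you read codes like these from a real CSV with read.csv(), one option is to pass colClasses = "character" so that a code such as 05 isn't converted to the number 5.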
If having to keep using unname() gets too annoying, you can even make a little function from your lookup table.
Here I’ve got 2 arguments to my function. One is my “key”, in this case the state name; the other is the lookupvector, which defaults to my getfips vector. In the first line of the function I run unname() on lookupvector, brackets, state, and then I return the value.
And you can see how I use the function, just function name with one argument, the state name.
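A sketch of what such a function might look like. The argument names state and lookupvector and the default getfips follow the description; the function name get_state_fips and the internal variable name are assumptions, and the three-state vector is sample data.

```r
# Sample lookup vector standing in for the full FIPS table.
getfips <- c(Alaska = "02", Arkansas = "05", Massachusetts = "25")

# Hypothetical function name; lookupvector defaults to getfips.
get_state_fips <- function(state, lookupvector = getfips) {
  # Look up the key and strip the name in one step.
  fipscode <- unname(lookupvector[state])
  return(fipscode)
}

# Only the key is needed, because the lookup vector has a default.
get_state_fips("Massachusetts")
```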
I can make that look a bit more generic, with a more generic name for the function, get_value; a more generic first argument name, mykey; and a second argument, mylookupvector, that doesn’t default to anything.
It’s the same thing I’ve been doing: getting the value from the lookup vector, brackets, my key; then running the unname() function. But it’s all inside a function. So calling it is a bit more elegant.
I can use that function with any named vector I’ve created. Here, I’m using it with Arkansas and my getpostalcode vector.
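Here's a minimal sketch of the generic version. The names get_value, mykey and mylookupvector come from the description above; the body is one plausible way to write it, and the lookup vector is sample data.

```r
# Generic lookup: works with any named vector, no default supplied.
get_value <- function(mykey, mylookupvector) {
  # Get the value from the lookup vector by key...
  myvalue <- mylookupvector[mykey]
  # ...then drop the name so only the value comes back.
  myvalue <- unname(myvalue)
  return(myvalue)
}

# Sample named vector standing in for the full postal code table.
getpostalcode <- c(Alabama = "AL", Alaska = "AK", Arizona = "AZ", Arkansas = "AR")

get_value("Arkansas", getpostalcode)
```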
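Easy lookups in R! Just remember that names have to be unique for this to work reliably. You can repeat values, but not keys. R won’t stop you from repeating a name, but a lookup by name only returns the first match.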
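A quick sketch of why unique names matter: R doesn't raise an error when a name is repeated, but a lookup by name returns only the first match. The vector dup_lookup is made up for illustration, and checking with anyDuplicated() is one possible safeguard, not something shown in the walkthrough.

```r
# Made-up example: the key "Springfield" appears twice.
dup_lookup <- c(Springfield = "IL", Springfield = "MA", Boston = "MA")

# Only the first matching element comes back; the second is silently ignored.
first_match <- unname(dup_lookup["Springfield"])
first_match

# anyDuplicated() returns 0 when all names are unique,
# otherwise the position of the first repeated name.
dup_position <- anyDuplicated(names(dup_lookup))
dup_position
```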
I first saw this idea years ago in Hadley Wickham’s Advanced R book. I still use it a lot.
That’s it for this episode, thanks for watching! For more R tips, head to the Do More With R page at https go dot infoworld dot com slash more with R, all lowercase except for the R. You can also find the Do More With R playlist on YouTube. Hope to see you next episode!