Hello World!

Good morning! I’m starting this blog to keep a record of what I learn about the R language. While I’ve been using R for about fifteen years now, I’m far from being an expert. For one thing, the sheer number of user-written packages means there will always be something new to learn. But I do know a thing or two about the language, and I enjoy using it.

Posts here will focus more on R as a programming language than on its use as an interactive environment for doing statistics. I’ll use R to answer questions about topics that interest me. One of the most important of these is climate change, which I believe poses an existential threat not only to human society, but to all of life on planet Earth. We’ll see what historical weather data can tell us about how the climate has changed over the past hundred years.

As a lifelong fan of the Chicago Cubs — and the game of baseball in general — I’ll also use R to answer questions about the National Pastime.

Recreational mathematics is another interest of mine, and I’ll be using R to solve some interesting problems in that field.

To start things off, consider a question that appeared as the Riddler Express problem in fivethirtyeight.com’s popular Riddler column for December 26, 2016:

A geology museum in California has six different rocks sitting in a row on a shelf, with labels on the shelf telling what type of rock each is. An earthquake hits and the rocks all fall off the shelf. A janitor comes in and, wanting to clean the floor, puts the rocks back on the shelf in random order. The probability that the janitor put all six rocks behind their correct labels is 1/6!, or 1/720. But what are the chances that exactly five rocks are in the correct place, exactly four rocks are in the correct place, exactly three rocks are in the correct place, exactly two rocks are in the correct place, exactly one rock is in the correct place, and none of the rocks are in the correct place?

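This is known to mathematicians as “le problème des rencontres”, and Wikipedia gives the theoretical solution. Here is some R code that solves the problem by brute force: it enumerates all 720 permutations and counts how many have each number of rocks in the right place.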


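library(combinat)   # permn(): all permutations of a vector
library(tidyverse)  # purrr's keep() and map_int(), plus the %>% pipe

# For each z in 0..n, count the permutations of 1:n
# that leave exactly z elements in their original position.
get_count <- function(n) {
     perms <- permn(1:n)  # generate the n! permutations once
     correct <- function(z) {
          perms %>%
          keep(function(x) sum(x == 1:n) == z) %>%
          length()
     }
     counts <- map_int(0:n, correct)
     names(counts) <- 0:n
     counts
}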

print(get_count(6))

Here’s what the output looks like:

  0   1   2   3   4   5   6 
265 264 135  40  15   0   1 

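So, the probability that no rocks are in the right place is 265/720; the probability that exactly 1 rock is in the right place is 264/720, and so on, down to the probability that all 6 are in the right place, which is 1/720. The probability that exactly 5 are in the right place is 0: if five rocks sit behind their correct labels, the sixth has nowhere else to go.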
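As a cross-check on the brute-force counts, here is a minimal sketch of the closed-form approach, using only base R. It relies on the standard identity that the number of permutations of n items with exactly k fixed points is choose(n, k) times the number of derangements of the remaining n - k items. The function names subfactorial and rencontres are my own, chosen for illustration; they do not come from any package.

```r
# Number of derangements of m items (permutations with no fixed point),
# built up from the recurrence !m = (m - 1) * (!(m - 1) + !(m - 2)).
subfactorial <- function(m) {
     if (m == 0) return(1)
     if (m == 1) return(0)
     prev2 <- 1  # !0
     prev1 <- 0  # !1
     for (i in 2:m) {
          current <- (i - 1) * (prev1 + prev2)
          prev2 <- prev1
          prev1 <- current
     }
     prev1
}

# Permutations of n items with exactly k fixed points (hypothetical helper).
rencontres <- function(n, k) choose(n, k) * subfactorial(n - k)

n <- 6
counts <- sapply(0:n, function(k) rencontres(n, k))
names(counts) <- 0:n
print(counts)

# The same counts expressed as probabilities.
print(round(counts / factorial(n), 4))
```

The counts should match the enumeration and sum to 6! = 720. Unlike the brute-force version, this one never builds the list of permutations, so it stays fast for much larger n.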

Note the use of the keep function from Hadley Wickham’s purrr package. It returns only those elements of a list for which the supplied predicate function is TRUE.
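To see keep on its own, here is a small sketch on a three-rock version of the problem. The permutations are written out by hand rather than generated, purely for illustration, and the variable names are my own.

```r
library(purrr)

# The six permutations of 1:3, listed by hand for illustration.
perms <- list(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
              c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))

# Keep only the permutations with exactly one element in its original position.
one_fixed <- keep(perms, function(x) sum(x == 1:3) == 1)

print(length(one_fixed))  # 3
```

Three of the six permutations leave exactly one rock in place, so the probability for the three-rock case is 3/6.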
