# Sharing a bicycle (part 5)

Over the past four posts, I developed a simulation of a bicycle sharing problem that appeared in Parade Magazine’s Ask Marilyn column a few years ago. For what I hope will be the final post in this series, I’m going to derive some analytical results.

First of all, the graphs of completion time vs. distance ahead showed that in both the simple case (rA = rB and wA = wB) and the general case with all velocities different, there was a distinct minimum completion time. In order to determine that time, begin by defining the following quantities:

$s_{wA}$ = total distance A walks
$s_{rA}$ = total distance A rides the bicycle
$s_{wB}$ = total distance B walks
$s_{rB}$ = total distance B rides the bicycle

and as before,

$wA$ = A’s walking speed
$wB$ = B’s walking speed
$rA$ = A’s speed on the bicycle
$rB$ = B’s speed on the bicycle
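
With these quantities, each friend's travel time follows directly, because each of them is always either walking or riding. As a sketch (the symbols $T_A$, $T_B$, and $D$, for the two travel times and the distance between the houses, are notation assumed here and not defined above):

$$T_A = \frac{s_{wA}}{wA} + \frac{s_{rA}}{rA}, \qquad T_B = \frac{s_{wB}}{wB} + \frac{s_{rB}}{rB}, \qquad s_{wA} + s_{rA} = s_{wB} + s_{rB} = D$$

Since the trip ends only when both friends have arrived, the completion time is the larger of $T_A$ and $T_B$.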

# Sharing a bicycle (part 2)

The last post described a problem in which two friends share a bicycle to go from one friend’s house to the other’s. In this post, I’ll describe my analysis of the problem. Then in the next post I’ll turn this analysis into R code and present some graphics to describe the results.

Recall that there are four basic steps to this problem:

1. A rides the bike and B walks, until A is a distance h ahead of B, at which time A dismounts and leaves the bike on the sidewalk.
2. A and B walk until B comes to the bike, at which time B picks it up.
3. B rides the bike and A walks, until B is a distance h ahead of A, at which time B dismounts and leaves the bike on the sidewalk.
4. A and B walk until A gets to the bike, at which time A picks it up.
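
As a sketch of how long each step lasts, suppose B is a distance $d$ ahead of A when A mounts the bike ($d = 0$ at the very start), and that each rider is faster than the other friend's walking pace. The symbols $d$ and $t_1, \dots, t_4$ are notation assumed here for illustration. Then

$$t_1 = \frac{h + d}{rA - wB}, \qquad t_2 = \frac{h}{wB}, \qquad t_3 = \frac{h + wA\,t_2}{rB - wA}, \qquad t_4 = \frac{h}{wA}$$

During step 2, A walks a further $wA\,t_2$ beyond the bike, which is the extra gap B has to make up in step 3. Likewise, B is $d = wB\,t_4$ ahead of A when A picks up the bike at the end of step 4. These expressions ignore what happens when someone reaches the destination partway through a step.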

# Sharing a bicycle (part 1)

The following question appeared in Parade Magazine’s Ask Marilyn column for Sunday, May 2, 2010:

A friend and I once went from his house to mine with one bicycle. I started walking and he rode the bike. When he got a couple of blocks ahead, he left the bike on the sidewalk and started walking. When I got to the bike, I started riding, passing him, and then left the bike a couple of blocks ahead. When he got to the bike, he started riding. We did this the whole way.

At least one of us was always walking. At times, one was riding; at other times, we were both walking. I’m sure this was faster than if we had no bike. But some people insist that it was no faster because somebody was always walking. Who’s right?

Marilyn’s answer was: “The reader is right. It’s true that someone was always walking. But neither friend walked the whole distance. Both biked part of the way. This increased their average speed, so they saved time.”

As with the Monty Hall problem, several readers wrote in to disagree with her response, insisting that since someone was always walking, the journey could be no faster than if both had walked the entire distance. What do you think?

The problem statement is somewhat ambiguous, but it seems that what happens is the following (assuming A rides the bike first):

1. A rides the bike and B walks, until A is a distance h ahead of B, at which time A dismounts and leaves the bike on the sidewalk.
2. A and B walk until B comes to the bike, at which time B picks it up.
3. B rides the bike and A walks, until B is a distance h ahead of A, at which time B dismounts and leaves the bike on the sidewalk.
4. A and B walk until A gets to the bike, at which time A picks it up.

These four steps repeat in sequence until both friends get to the other house.

In his book Number-Crunching: Taming Unruly Computational Problems from Mathematical Physics to Science Fiction, Paul Nahin presents a discrete-time-step solution for the case where both friends share the same walking speed and the same — but higher — bike-riding speed. Over the next couple of posts I’m going to develop an event-driven simulation to determine the travel time in the more general case where all four speeds are different. By event-driven, I mean that instead of discretizing time, I’ll model each of the above four steps as discrete events; the length of the time step will vary.
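
To make the event-driven idea concrete, here is a minimal sketch in R. It is not the program developed in this series: the function name `simulate_trip` and its arguments (`D` for the distance between the houses, `w` and `r` for named vectors of walking and riding speeds) are assumptions made for illustration. Each pass through the loop handles one riding event and one walking event, and the last event is the current rider reaching the house.

```r
# Event-driven sketch of the four-step cycle (hypothetical helper, not the series' program).
# D = distance between houses, h = lead at which the rider drops the bike,
# w, r = named vectors c(A = ..., B = ...) of walking and riding speeds.
simulate_trip <- function(D, h, w, r) {
  # Each rider must out-pace the other friend's walk, or a lead of h is never reached.
  stopifnot(D > 0, h > 0, r[["A"]] > w[["B"]], r[["B"]] > w[["A"]])
  x <- c(A = 0, B = 0)   # current positions
  t <- 0                 # current time
  p <- "A"               # current rider; A rides first
  repeat {
    q <- if (p == "A") "B" else "A"                        # current walker
    t_drop <- (x[[q]] + h - x[[p]]) / (r[[p]] - w[[q]])    # time until rider leads by h
    t_home <- (D - x[[p]]) / r[[p]]                        # time until rider reaches the house
    if (t_drop >= t_home) {
      # Final event: the rider rides straight to the house. The walker has been
      # on foot at constant speed, so the same formula gives the walker's arrival
      # time whether the house is still ahead or was already passed (positions
      # are deliberately not capped at D).
      arrive <- c(t + t_home, t + (D - x[[q]]) / w[[q]])
      names(arrive) <- c(p, q)
      return(arrive[c("A", "B")])
    }
    # Riding event: rider gets h ahead and drops the bike.
    t <- t + t_drop
    x[[p]] <- x[[p]] + r[[p]] * t_drop
    x[[q]] <- x[[q]] + w[[q]] * t_drop
    bike <- x[[p]]
    # Walking event: both walk until the trailing friend reaches the bike.
    t_pick <- (bike - x[[q]]) / w[[q]]
    t <- t + t_pick
    x[[q]] <- bike
    x[[p]] <- x[[p]] + w[[p]] * t_pick
    p <- q               # the friend at the bike becomes the rider
  }
}

res <- simulate_trip(D = 10, h = 1, w = c(A = 1, B = 1), r = c(A = 3, B = 3))
res        # arrival times of A and B
max(res)   # completion time; walking the whole way would take D / w = 10
```

In this example both friends arrive sooner than they would on foot, which agrees with Marilyn's answer.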

Stay tuned for the program.

# Words With Friends Scores

Over a period of several months, a friend and I played 189 games of Words With Friends, a Scrabble-like game popular on Facebook. I kept track of our scores, and the resulting dataset, which I make available here, provides a couple of insights into the game.

The structure of the file itself is very simple: one line per game, with each line containing my score followed by my opponent’s score. There is no need to record the game number, since it is the same as the row number, which is added automatically. Using the readr package:

library(tidyverse)  # attaches readr along with the other core tidyverse packages
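
The file name is not shown here, so as a stand-in this sketch reads the first ten games (the rows printed in this post) from an inline string. It uses base R's `read.table` rather than readr. Whitespace-separated columns are an assumption, and the column names `Me` and `Opp1` are taken from the printout.

```r
# First ten games, copied from the printed tibble; the real file has 189 rows.
txt <- "360 313
365 388
458 349
378 419
440 348
388 353
358 376
332 379
362 325
353 326"

# One line per game: my score, then my opponent's score (separator assumed to be whitespace).
first_ten <- read.table(text = txt, col.names = c("Me", "Opp1"))

nrow(first_ten)                       # number of games read
sum(first_ten$Me > first_ten$Opp1)    # games in which "Me" outscored the opponent
```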


Here’s what the data looks like:

> scores
# A tibble: 189 × 2
      Me  Opp1
 1   360   313
 2   365   388
 3   458   349
 4   378   419
 5   440   348
 6   388   353
 7   358   376
 8   332   379
 9   362   325
10   353   326
# ... with 179 more rows


(Note that if you print scores in R, the print routine for the tibble will also show the type of each column. But since each type [int for these columns] is enclosed in angle brackets, WordPress apparently treats it as an HTML tag and so does not display it.)

# Gambling Problem (part 2)

The last post concerned a game based on a slot machine that generates random numbers between 0 and 999. The player keeps pulling the handle to generate a sequence of these numbers, and the machine keeps track of them. The game ends when any number shows up a second time. That is, as long as all the numbers generated are unique, the player keeps pulling the handle. Simulation of the game showed that the expected number of pulls until the game ends is about 40.3. In this post I’ll develop an analytical solution.

Of course it’s not possible to win on the first pull, but what is the probability of winning on the second pull? The machine has already generated a number between 0 and 999. The probability of generating that same number again is $\frac{1}{1000}$, so $p_2 = \frac{1}{1000}$.

To win on the third pull, the numbers generated on the first two pulls must be different. After the first number is generated, the probability of generating a different number on the second pull is $\frac{999}{1000}$. The probability that the third number will be one of the two already generated is $\frac{2}{1000}$, so $p_3 = \frac{2 \times 999}{1000^2}$.

To win on the fourth pull, the numbers generated on the first three pulls must all be different. The probability of that happening is $\frac{999 \times 998}{1000^2}$. The probability that the fourth number is the same as one of the first three is $\frac{3}{1000}$. So $p_4 = \frac{3 \times 999 \times 998}{1000^3}$.

Now we can write the general expression for the probability of winning on the kth pull:
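
As a sketch that simply extends the pattern of $p_2$, $p_3$, and $p_4$ (a reconstruction from those three cases): the first $k-1$ numbers must all be different and the $k$th must match one of them, so $p_k = \frac{k-1}{1000} \prod_{i=1}^{k-2} \frac{1000-i}{1000}$ for $2 \le k \le 1001$. The product is empty, and equal to 1, when $k = 2$.

The short R check below evaluates this expression for every $k$. The variable names are chosen for illustration, and the products are accumulated on the log scale to avoid underflow in the intermediate terms.

```r
n <- 1000                      # possible outcomes per pull (000 through 999)
k <- 2:(n + 1)                 # the game must end by pull n + 1

# log of P(first j numbers are all distinct), for j = 1, ..., n
log_distinct <- cumsum(log((n - 0:(n - 1)) / n))

# p_k = P(first k - 1 numbers distinct) * P(k-th number repeats one of them)
p <- exp(log_distinct[k - 1]) * (k - 1) / n

total_prob <- sum(p)           # should be 1, up to rounding
expected_pulls <- sum(k * p)   # compare with the simulated value of about 40.3
```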

# Gambling Problem (part 1)

I’m not a gambler myself, but I do enjoy probability and statistics problems associated with games of chance. Here’s one I came up with recently:

A manufacturer has invented a new game of chance based on a three-reel slot machine. Instead of fruit, each reel contains the digits from 0 through 9, so that pulling the handle generates a number between 000 and 999.

The game is played as follows: a player inserts a coin and pulls the handle repeatedly to generate a series of random numbers. The machine keeps track of the numbers generated on each pull. As long as all the numbers are different, the player continues to pull the handle. The game ends when any number comes up a second time. The machine then pays out a dollar for each time the handle was pulled.

For example, here are the numbers generated in a representative game:

{94, 845, 913, 994, 96, 269, 377, 913}

Since the number generated on the 8th pull (913) is a repeat of the number generated on the third pull, the game ends there and the machine pays out 8 dollars.

On average, how many times can a player expect to pull the handle before the game ends?

First I’ll write a program to simulate this game, and in the next post I’ll develop an analytical solution. The solution begins after the fold.
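
A minimal sketch of such a simulation in R, under the rules stated above. The function name `play_game`, the seed, and the number of replications are choices made for this illustration. The reel outcomes are represented as the integers 1 to 1000 rather than 000 to 999, which does not affect the count of pulls.

```r
# Play one game: pull until a number repeats, then return the number of pulls.
play_game <- function(n = 1000) {
  seen <- logical(n)             # seen[x] is TRUE once number x has come up
  pulls <- 0
  repeat {
    pulls <- pulls + 1
    x <- sample.int(n, 1)        # one pull of the handle
    if (seen[x]) return(pulls)   # repeat found: game over
    seen[x] <- TRUE
  }
}

set.seed(1)                      # arbitrary seed, for reproducibility only
sims <- replicate(20000, play_game())
mean(sims)                       # estimate of the expected number of pulls
```

A game can never end on the first pull, and it must end by pull 1001, so every simulated value lies between 2 and 1001.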

# Hello World!

Good morning! I’m starting this blog to keep a record of what I learn about the R language. While I’ve been using R for about fifteen years now, I’m far from being an expert. For one thing, the sheer number of user-written packages means there will always be something new to learn. But I do know a thing or two about the language, and I enjoy using it.

Posts here will focus more on R as a programming language than on its use as an interactive environment for doing statistics. I’ll use R to answer questions about topics that interest me. One of the most important of these is climate change, which I believe poses an existential threat not only to human society, but to all of life on planet Earth. We’ll see what historical weather data can tell us about how the climate has changed over the past hundred years.

As a lifelong fan of the Chicago Cubs — and the game of baseball in general — I’ll also use R to answer questions about the National Pastime.

Recreational mathematics is another interest of mine, and I’ll be using R to solve some interesting problems in that field.

To start things off, consider a question that appeared as the Riddler Express problem in fivethirtyeight.com’s popular Riddler column for December 26, 2016:

A geology museum in California has six different rocks sitting in a row on a shelf, with labels on the shelf telling what type of rock each is. An earthquake hits and the rocks all fall off the shelf. A janitor comes in and, wanting to clean the floor, puts the rocks back on the shelf in random order. The probability that the janitor put all six rocks behind their correct labels is 1/6!, or 1/720. But what are the chances that exactly five rocks are in the correct place, exactly four rocks are in the correct place, exactly three rocks are in the correct place, exactly two rocks are in the correct place, exactly one rock is in the correct place, and none of the rocks are in the correct place?
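
One way to check an answer by brute force in base R is to enumerate all 720 orderings and count how many rocks land behind their own label in each. This is a sketch, not necessarily the approach taken in the rest of the post; the helper `perms()` is a hypothetical name written for this example, not a function from a package.

```r
# All permutations of a vector, one per row (simple recursive sketch).
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_along(v)) {
    # Put v[i] first, followed by every permutation of the remaining elements.
    out <- rbind(out, cbind(v[i], perms(v[-i])))
  }
  out
}

p <- perms(1:6)                        # 720 rows, one per ordering of the rocks
fixed <- rowSums(p == col(p))          # rocks in their correct position, per ordering
counts <- table(factor(fixed, levels = 0:6))
counts                                 # number of orderings with 0, 1, ..., 6 correct
counts / nrow(p)                       # the corresponding probabilities
```

The count for exactly five correct rocks is zero: if five rocks are in the right place, the sixth has nowhere else to go.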