## Taxi driver problem

There’s a guy who travels by taxi every day from his work to his home (20km). One day he finds out that a colleague of his also takes taxi every day from work, the same direction, but he lives only half a way of the first guy (10km). So, they decide to share a taxi and also to share the costs. How should they share the costs?

First answer could go like this: First guy travels twice longer so he should pay twice as much as the second guy. So, the first guy should pay 2/3 of the total cost and the second guy only 1/3.

Second answer could go like this: Both of them travel the first half together, so they should split this part in half. The first guy travels alone the second half of the way, so he should pay the full amount of the second half. In total, first guy should pay 3/4 and the second one only 1/4.

Which one is fairer?

Let’s assume that first guy pays double for his trip as compared to the second one, when they travel separately – in two taxis. This is usually not correct, because of the initial charges, but let’s assume that the charge is proportional to the distance. Let’s assume that it’s fair if both of them get the same “discount” (percentage wise) when joining together. The first one would have to pay 2p(1-k) and the second one p(1-k), where k is the discount, p is what the second guy usually pays and 2p is what the first guy usually pays. So the first one should still pay twice as much as the second one, which leads to the first answer.

Let’s assume now that it’s fair if they all get the same discount (absolute cost reduction). So, the first one would pay 2p-q and the second one p-q, where q is the discount amount. Then (2p-q)+(p-q) = 2p => p = 2q => q = p/2, which leads to the second answer.

Let’s say that one day the first guy finds out that his neighbour also takes taxi every day from work home and accidentally, he works very close to where the second guy lives, just one block away, so he comes to a cunning idea. He will share the second part of his trip with his neighbour and he will reduce his costs even more. Obviously, he would prefer for the first formula to be applied, because in that case his colleague will pay 1/3, his neighbour also 1/3, which lives him with 1/3 as well. The problem now is that his colleague must not know about his neighbour and vice versa, because in that case they would ask for splitting the costs proportional to the distance, which would be 1/4 for the colleague, 1/4 for the neighbour and 1/2 for the first guy. Obviously, the first formula is not transparent and it allows for speculation. Second formula would give the totally transparent solution in which case the first guy wouldn’t care if his colleague and his neighbour know about each other.

Moral question: In case of applying the first formula in case of 3 guys sharing the taxi, when everybody pays 1/3 of the total price, do you think the first guy would be a cheater or a good businessman? How do you think the other two guys would react when they find out about it? Would they be angry? Would they be like: “Wow, you really tricked us professionally. Well done!”. Or would they just be glad they found it out, so from now they would pay less.

From where I come, he would definitely be a cheater, because the base of his “success” is in hiding the truth from the other two guys. On the other hand, we’re all aware that much of today’s businesses are based on hiding the truth from the others. Sometimes it’s some intellectual property information, but very often it’s just a cheap fact. If a famous chef hides the details of his recipe, that’s kind of fine. But if a job agent hides the details about how much of your contract he gets from your employer, that’s kind
of completely different story, isn’t it? This brings up more serious question: Is hiding the truth in today’s businesses unethical? Is hiding the truth in politics and diplomacy unethical?

I could write a lot about this and I already went quite far away from initial “logic” problem to an ethical one, that’s probably the reason why I post so rarely, so I’ll stop here

## Recursive expectation (part 2)

This post follows the previous recursive expectation post.

### Problem:

There is an equilateral triangle ABC and the bug is in the vertex A. The bug can move only along edges of the triangle. Once it starts going along an edge, it doesn’t change its moving direction until it reaches the next vertex. It starts walking from the vertex A. It takes bug 1 minute to walk between any two vertices. Every time it comes to a new vertex, it makes a random choice about which of other two vertices to go to next. What is the expected time for the bug to return to the starting vertex A?

### Solution 1:

First, completely intuitive solution, without (much) mathematics. It takes one minute for the bug to come to the edge BC. From that edge, each minute there is 1/2 chance that the bug will return to A. Basically we start the experiment that has 1/2 chance of success each time. So, according to the previous post, expected time to get back to A from BC will be 2 minutes, so all together with the first minute, we get the final result – 3 minutes.

### Solution 2:

And now, let’s try to do it pure mathematical way, without much thinking.

Possible solutions for number of minutes (length of trip) to return back to A are 2, 3, 4, … until infinity. If we take one of the possible solutions, let’s say n minutes, how many different paths do we have for this to happen? Basically, we can go from A to B and then go n-2 times between B and C and then the last time, we go back to A, either from B or from C. Another path would be from A to C and then n-2 times between B and C and then the last time, we go back to A. So, for each n, number of possible paths is always 2. Probability for each path of length n is:

$\left ( \frac{1}{2} \right )^n$

So, probability for n-minutes solution to happen is:

$P\left ( n \right ) = \left ( \frac{1}{2} \right )^n \cdot 2 = \left ( \frac{1}{2} \right )^{n-1}$

Now, the solution for the expectation becomes:

$\begin{array}{rcl} E\left ( X \right ) & = & \sum_{n=2}^{\infty }n\cdot\left ( \frac{1}{2}\right )^{n-1} \\ & = & \sum_{n=0}^{\infty }\left [n\cdot\left ( \frac{1}{2}\right )^{n-1} \right ] - 1 \\ & = & \frac{1}{\left ( 1 - \frac{1}{2}\right )^2} - 1 \\ & = & 4 - 1 \\ & = & 3 \end{array}$

which requires some mathematical knowledge, but doesn’t require much of the intuition.

### Solution 3:

This one will actually use recursive expectation (please see the previous post). Expected time from A back to A is equal to expected time from A to edge BC + expected time from edge BC back to A. Expected time from A to BC is 1. Expected time from BC to A (following the logic from the previous post) is:

$\begin{array}{rcl} E\left ( BC\rightarrow A \right ) & = & \frac{1}{2}\cdot1+\frac{1}{2}\left ( 1 + E\left ( BC\rightarrow A \right )\right ) \\ \frac{1}{2} \cdot E\left ( BC\rightarrow A \right ) & = & 1 \\ E\left ( BC\rightarrow A \right ) & = & 2 \end{array}$

So, the overall expectation is:

$\begin{array}{rcl} E\left ( A\rightarrow A \right ) & = & E\left ( A \rightarrow BC \right ) + E\left ( BC\rightarrow A \right ) \\ & = & 1 + 2 \\ & = & 3 \end{array}$

Recursive expectation once again, provides much easier solution than the mathematical one. Additionally, it’s much more intuitive.

## Rolling dice expectation

The day after my previous expectation related post, I kind of realised what lies behind intuitive answer to the question:

“What is the expected number of times we need to roll a dice to get ‘6’ for the first time?”

The question above is equivalent to the following. Let’s roll the dice large number of times. We will get long sequence, like:

3, 4, 6, 5, 6, 1, 2, 4, 3, 5, 1, 4, 5, 6, 6, …

Next, let’s cut the tail off after the last ‘6’. Let’s mark the length of the remaining sequence with N. Now we need to count “distances” between every two consecutive ‘6’s. In our example above, we will get 3, 2, 9, 1, … Distance is defined as the number of rolls between two consecutive ‘6’s. Since ‘6’s are at positions: 3, 5, 14, 15, … in the sequence above, distances are 3, 5-3=2, 14-5=9, 15-14=1, … The answer to the question, about the expected number of times, is the same as the average distance between every two consecutive ‘6’s. Let’s mark distances with the sequence: $d_1, d_2, d_3, \dots d_m$. $m$ is the total number of distances. Basically, $m$ will correspond to the number of experiments performed. The cut off tail is unfinished last experiment, which we decided to ignore. If X is the number of tries to get ‘6’ for the first time, we can write:

$E\left ( X \right ) = \frac{1}{m}\sum_{i=1}^{m}d_i$

Furthermore, sum of all distances must be equal to the total number of tries – N.

$\sum_{i=1}^{m}d_i=N$

Since $m$ is the number of distances, it is also the number of ‘6’s in the whole sequence. Since there is no reason for any number on the dice to appear (by expectation) different number of times than any other number (on the dice), we expect for $m$ to be:

$m = \frac{N}{6}$

If we replace last two equations in the formula for E(X), we get:

$\begin{array}{rcl} E\left ( X \right ) & = & \frac{1}{\frac{N}{6}} \cdot N \\ & = & \frac{1}{\frac{1}{6}} \\ & = & 6 \end{array}$

and no mathematics knowledge was needed for this proof. So, maybe this is what lies behind the instinctive answer.

If we write down the results of each experiment (experiment is said to be finished once we get ‘6’ for the first time):

$\begin{array}{rl} E_1: & 3, 4, 6 \\ E_2: & 5, 6 \\ E_3: & 1, 2, 4, 3, 5, 1, 4, 5, 6\\ E_4: & 6 \\ \cdots \\ E_m: & 1, 5, 3, 4, 6 \end{array}$

Since there are $m$ experiments, we have exactly $m$ 6’s, but we also expect to have each other dice number also $m$ times, so the sum of all lengths of all experiments is expected to be $6 \cdot m$, which makes average length to be $6$.

## Recursive expectation

I’ll start this with an example:

What is the expected number of tries to get ‘6’ when rolling dice?

Or even simpler, experiment is performed where the dice is thrown as many times as necessary to get ‘6’ for the first time. Save the number of throws for this experiment. Repeat the same experiment many times. What is the average value of throws in all experiments?

Some people would immediately say: “6”. Why 6? Because the probability to get ‘6’ on each throw is 1/6, so the expected number of throws will be:

$\frac{1}{\frac{1}{6}}=6$

Although intuitively everything looks fine, formal proof is far from trivial:

$\begin{array}{rcl} E\left ( X \right ) & = & \sum_{n=1}^{\infty}n\cdot P\left ( X=n \right ) \\ & = & \sum_{n=1}^{\infty}n\cdot\frac{1}{6}\cdot\left ( \frac{5}{6}\right )^{n-1} \\ & = & \frac{1}{5}\sum_{n=1}^{\infty}n\cdot\left ( \frac{5}{6}\right )^n \\ & = & \frac{1}{5}\frac{\frac{5}{6}}{\left ( 1-\frac{5}{6}\right )^2} \\ & = & 6 \end{array}$

but one must be more than intermediate mathematician to be able to go through this proof. I will not go through the details since it’s not the purpose of this post.

There is also a very intuitive solution as well, for which you don’t need to be some mathematician:

$\begin{array}{rcl} E\left ( X \right ) & = & \frac{1}{6}\cdot 1+ \frac{5}{6}\cdot \left ( 1+E\left ( X \right ) \right )\end{array}$

and it’s fairly easy to come to solution from this simple equation. What does this equation say? It says that there is 1/6 chance that our experiment will be finished after only one throw (we get 6 from the first try), or there is 5/6 chance that it will be longer. In this case it will be 1 throw longer than what we expected before the first throw. This is probably the hardest part to imagine. If we don’t get 6 from the first throw, we are basically at the same place where we were before we even started the experiment. The only difference is that counter is already on 1 and not on zero as it was before we started the experiment.

To put it even simpler. Let’s say that we do this experiment 600,000 times. 1/6 of them will be finished after first throw, because 1 in 6 times, we’ll get 6 in the first attempt. So, there will be 100,000 experiments out of 600,000 times for which the result will be 1 throw. The rest of 500,000 experiments will have, by average, one throw more than the average in all 600,000 experiments. What is then the average for all 600,000 experiments.

$E=\frac{100,000 \cdot 1+500,000 \cdot \left ( E+1 \right )}{600,000}$

which is basically the same as formula above. It’s very easy to get the result $E=\6$.

## Generalization

If we keep on trying succeeding the event that has probability p, how many times is expected to try it until the first success?

$\begin{array}{rcl} E\left ( X \right ) & = & p \cdot 1 + \left ( 1 - p \right ) \cdot \left ( E\left ( X \right ) + 1 \right ) \\ E\left ( X \right ) \cdot \left ( 1 - \left ( 1 - p \right ) \right ) & = & p + 1 - p \\ E\left ( X \right ) & = & \frac{1}{p} \end{array}$

## Programming solution

Following c# code will prove the dice problem using Monte Carlo simulation.

using System;

namespace RecursiveExpectation
{
class Program
{
static void Main()
{
var r = new Random((int)DateTime.Now.Ticks);
var total = 0;

// repeat experiment 10000 times
for (var game = 1; game <= 10000; game++)
{
// number of throws in the current game
var throws = 0;

// current number on the dice
var dice = 0;

// roll the dice until we get '6'
while (dice != 6)
{
// choose the random number between 1 and 6
dice = r.Next(1, 7);

// increase the number of throws
throws++;
}

// total sum of throws in all games
total += throws;

// calculate average number of throws in
// all experiments (games)
var avg = (double)total / game;

Console.WriteLine(avg);
}
Console.WriteLine("Finished");