Comparing Strong Priors to the Jeffreys Prior

This came out of some discussion with @ericmjl in the hallway track at SciPy 2018.

I had asked during the tutorial about choosing a highly informative prior that assumes a false belief. The widget below visualizes what happens in such a case. We model the highly informative prior with a Gaussian with a small standard deviation. It is modeled after the similar visualization in [notebook 2](https://github.com/ericmjl/bayesian-stats-modelling-tutorial/blob/master/notebooks/02-Instructor-Parameter_estimation_hypothesis_testing.ipynb).

```
def gaussian_pdf(x, mu, sd):
    """
    The Gaussian probability distribution function. We could import this, 
    but the visualation is smoother if we define the function here.
    """
    return np.exp(-np.power(x - mu, 2) / (2 * np.power(sd, 2)))

def plot_posteriors(p=0.6, N=0, mu=0.5, sd=1):
    np.random.seed(42)
    n_successes = np.random.binomial(N, p)
    x = np.linspace(0.01, 0.99, 100)
    prior1 = gaussian_pdf(x, mu, sd)
    likelihood = x**n_successes*(1-x)**(N-n_successes)
    posterior1 = likelihood * prior1
    posterior1 /= np.max(posterior1)  # so that peak always at 1
    plt.plot(x, posterior1, label='Gaussian prior')
    jp = np.sqrt(x*(1-x))**(-1)  # Jeffreys prior
    posterior2 = likelihood*jp  # w/ Jeffreys prior
    posterior2 /= np.max(posterior2)  # so that peak always at 1 (not quite correct to do; see below)
    plt.plot(x, posterior2, label='Jeffreys prior')
    plt.legend()
    plt.show()
```

The visualization can be created with 

```
interact(plot_posteriors, p=(0, 1, 0.01), N=(0, 100), mu=(0, 1, 0.1), sd=(0.01, 1, 0.01));
```
![image](https://user-images.githubusercontent.com/12173331/42636598-30d63a9c-85b7-11e8-8edd-d4141a665a3c.png)

I also wrote up a bit of explanation:

Here we are parameterizing the Gaussian pdf by (no surprise) the mean (`mu`) and standard deviation (`sd`). We can think of the standard deviation here as a measure of certainty in our prior belief that the probability of getting heads is the mean of the Gaussian.

**Interesting things to try**:  
1. Set the mean (`mu`) to `0.2` and the standard deviation (`sd`) to 0.01. This is specifying a prior that we have a strong belief that the probability of flipping a heads is `0.2` (which is wrong assuming that `p` is still set to `0.6`). Now change the number of trials (`N`) to 100. What happens?
2. Now slowly move the standard deviation (`sd`) to `1`. What happens now? Why does the posterior corresponding to the Gassian prior converge to the posterior from the Jeffreys prior?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Comparing Strong Priors to the Jeffreys Prior #29

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Comparing Strong Priors to the Jeffreys Prior #29

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions