--- title: "Computational Cognitive Science 2021-2022" output: pdf_document: default word_document: default html_document: default --- --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) require(DirichletReg) # We'll be using Dirichlet distributions. gtools includes rdirichlet and ddirichlet but not with log densities. ``` # Tutorial 5: Active learning This tutorial is designed to provide some hands-on exposure to concepts from the active learning lectures. Imagine you come across a curiosity shop that has four devices for sale at £100 each. Each device can automatically create a delicious cake when one pulls its lever, but the devices vary in their reliability. The shopkeeper says that * one device never works, * one device works exactly 20 percent of the time, * one works 30 percent of the time, and * one works 75 percent of the time. Critically, we don't know which device is which, but the shopkeeper will let us use the devices a few times to choose one. Let's take an information-theoretic approach to this active learning problem. Here are some helper functions. ```{r} plogp <- function(p) { # bits, not nats if(p==0) 0 else p*log2(p) # Consider why we can't just use p*log2(p). } cat_entropy <- function(catv) { -sum(sapply(catv,plogp)) } ``` Let's define our hypothesis space as all of the different cake probabilities the machines before us might have. **Question 1**: How does the code below define that hypothesis space? ```{r} library(combinat) # From googling "all list permutations in R" probs <- c(0,.2,.3,.75) hyps <- permn(probs) ``` **Solution**: *Our hypotheses capture which machine is assigned which cake probability by exhaustively permuting a single probability list, yielding 4!=24 hypotheses.* **Question 2**: What is our entropy? **Solution**: ```{r} cat_entropy(rep(1/24,24)) ``` **Question 3**: Suppose we turn the lever on the first machine and get nothing. What probability should we now assign our different hypotheses? Edit the code below. ```{r} update_probs <- function(prior,choice,outcome) { # x, below, is a list of probabilities under a specific hypothesis likes <- sapply(hyps,function(x) {1.0}) # Replace the 1.0 with an actual likelihood unnp <- likes*prior z <- sum(unnp) unnp/z } post <- update_probs(runif(24),1,0) # Replace the "runif(24)" with the actual prior print(post) ``` **Solution**: ```{r} update_probs <- function(prior,choice,outcome) { # x, below, is a list of probabilities under a specific hypothesis likes <- sapply(hyps,function(x) {dbinom(outcome,1,x[choice])}) unnp <- likes*prior z <- sum(unnp) unnp/z } post <- update_probs(rep(1/24,24),1,0) print(post) ``` **Question 4**: What is our new entropy? Did we learning much? Why or why not? **Solution**: ```{r} cat_entropy(post) ``` *We didn't learn very much! It makes sense, because 3/4 of our machines are unlikely to produce a cake, so most hypotheses are quite consistent with what we've seen.* **Question 5**: What is the expected probability getting a cake if we choose machine 2 next? **Solution**: ```{r} # First we want the probability of each outcome in each hyp # Then we take the weighted sum over hypothesis probabilities expected_outcome <- function(i,ph) { sum(sapply(hyps,function(x) {x[i]})*ph) } print(expected_outcome(2,post)) ``` **Question 6**: What is the expected entropy if we choose machine 2 next? **Solution**: *We take the expectation over possible outcomes*. 
```{r}
expected_entropy <- function(choice, prior) {
  pwin <- expected_outcome(choice, prior)
  if_win <- update_probs(prior, choice, 1)
  if_lose <- update_probs(prior, choice, 0)
  cat_entropy(if_win) * pwin + cat_entropy(if_lose) * (1 - pwin)
}
print(expected_entropy(2, post))
```

**Question 7**: Suppose we do choose machine 2 and it yields a cake. What is our current entropy, and what are the expected entropies after each of our next possible choices?

**Solution**:

```{r}
post2 <- update_probs(post, 2, 1)
print(cat_entropy(post2))
sapply(c(1, 2, 3, 4), function(x) {expected_entropy(x, post2)})
```

**Question 8**: What does this imply we should do next?

**Solution**: *It implies we should try a new arm -- either 3 or 4.*

**Question 9**: Is this at all counter-intuitive? What might we do differently if we are only interested in knowing which is the most reliable machine?

**Solution**: *It seems odd to learn about a machine that is probably bad if we want to learn which machine is good -- our learning objective isn't matched to our goals. We can instead focus on the entropy of the random variable that matters: the identity of the machine with the highest probability.*

```{r}
is_75_ent <- function(ph) {
  # oh_75[j,k] is 1 if hypothesis k assigns the 75% cake probability to machine j.
  oh_75 <- sapply(hyps, function(x) {sapply(x, function(y) {if (y == .75) 1.0 else 0})})
  pr <- matrix(rep(ph, 4), ncol = 4)
  # p75[i] is the probability that machine i is the 75% machine.
  p75 <- colSums(t(oh_75) * pr)
  cat_entropy(p75)
}

ee_75 <- function(choice, prior) {
  pwin <- expected_outcome(choice, prior)
  if_win <- update_probs(prior, choice, 1)
  if_lose <- update_probs(prior, choice, 0)
  is_75_ent(if_win) * pwin + is_75_ent(if_lose) * (1 - pwin)
}

ents_75 <- sapply(c(1, 2, 3, 4), function(x) {ee_75(x, post2)})
ents_75
```
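One minimal way to read off the recommended next pull under this goal-directed objective is to take the machine whose expected entropy over the identity of the 75% machine is lowest:

```{r}
# Index of the machine that minimizes the expected entropy over
# "which machine is the 75% one", given the expected entropies computed above.
which.min(ents_75)
```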