# Chapter 25 Compelling Decisions and Actions Under Uncertainty

Figure 25.1: The posterior distribution from this generative DAG gives us a way of modelling uncertainty in the parameters we care about.

We now continue last chapter’s investigation of Crossfit Gym’s use of yoga. Our goal is to bring the mathematical estimates of our posterior distribution’s parameters back into the real-world with compelling and visually-based recommendations. To do this, we 1) define our outcomes of interest, 2) compute a posterior distribution for those outcomes, and 3) communicate our beliefs about those outcomes by visualizing the outcomes of interest.

## 25.1 Outcomes of Interest

Our main outcome of interest is signup probability. We will investigate signup probability by investigating that particular node in relation to our decision. Representing this is the deceptively simple generative decision DAG shown in Figure 25.2.

dag_create() %>%
dag_node("Yoga Stretching?","yoga",
dec = TRUE) %>%
dag_node("Signup Probability","prob") %>%
dag_edge("yoga","prob") %>%
dag_plate("Stretch Type","",
nodeLabels = c("yoga","prob")) %>%
dag_render(shortLabel = TRUE, wrapWidth = 12)

Figure 25.2: Generative decision dag showing our interest in how yoga strecthing effects signup probability. The plate around the two nodes indicates we are investigating signup probability for each possible value of yoga stretching.

The top-down narrative of Figure 25.2 is as follows. Crossfit Gyms decides on whether to offer yoga stretching and the sign-up probability across their gyms changes as a result. Our job is to quantify this change and form an opinion on whether to offer yoga stretching or not across the gyms; we will form our opinion using the posterior distribution.

## 25.2 Compute A Posterior Distribution

The last model in the previous chapter gets us the generative DAG of Figure 25.1. Running it through the dag_greta() function yields a posterior distribution, but our job as analysts is not over. We now have to make sense of the posterior distribution and communicate its implications to stakeholders.

Rerun the analysis at the end of the previous chapter concluding with:

drawsDF = graph2 %>% dag_greta()

where drawsDF is a representative sample of the posterior distribution associated with Figure 25.1. This data frame is a sample of 4,000 draws of 28 variables. Let’s list the 28 variables using names(drawsDF).

 alpha_1 alpha_8 beta_3 beta_10 alpha_2 alpha_9 beta_4 beta_11 alpha_3 alpha_10 beta_5 beta_12 alpha_4 alpha_11 beta_6 mu_alpha alpha_5 alpha_12 beta_7 mu_beta alpha_6 beta_1 beta_8 sd_alpha alpha_7 beta_2 beta_9 sd_beta

Figure 25.3 is a subset of Figure 25.1 and our objective node, theta (a.k.a. Signup Probability), is the last descendant or bottom-child node of the graph.** ** We safely omit the children of this variable to save space since they do not affect our decision. Perusing our posterior’s 28 random variables, we might notice that theta is not one of them. Bummer! We are going to need to do some coding to get a representative sample for theta.

theta is omitted because it is a calculated node; its realization is a deterministic function of its parent y which is also a calculated node. Note that an oval’s double-perimeter is the visual clue for a calculated node (see Figure 25.3). Its parents include both random and observed nodes. So to actually determine theta, we need follow the generative recipe from grandparents (alpha,beta,j, and x) to grandchild (theta) via linear predictor y.

Figure 25.3: The posterior distribution from this generative DAG gives us a way of modelling uncertainty in the parameters we care about.

Notice, we do not care about mu_alpha and the other parents of alpha and beta. Once we have a representative sample of alpha and beta, plus the observed nodes j and x , then we can calculate theta.

For example, let’s estimate the additional sign-up probability when using “Yoga Stretch” at gym number 12?

1. Get a single draw of the required nodes from the representative sample.

draw = drawsDF %>%
slice_sample(n=1) %>% # get a random draw
select(alpha_12,beta_12)
2. Compute a value for the linear predictor with and without yoga (i.e. x=1 and x=0, respectively) using the formula shown for this node in Figure 25.3. Plugging in for x, we get the two different values of linear preditor y that interest us: x=1 $$\rightarrow y_{yoga} = \alpha_{12}+\beta_{12}*1$$) and without yoga (x=0 $$\rightarrow y_{trad} = \alpha_{12}+\beta_{12}*0 = \alpha_{12}$$):

draw = draw %>%
mutate(y_yoga = alpha_12 + beta_12 * 1) %>%
mutate(y_trad = alpha_12)
3. Compute the values for theta with and without yoga at gym12 using the inverse-logit function (i.e. the link function formula shown for this node in Figure 25.3):

draw = draw %>%
mutate(theta_yoga = 1 / (1+exp(-y_yoga))) %>%
mutate(theta_trad = 1 / (1+exp(-y_trad)))
4. Compute the increased probability of signup when using yoga at gym12:

draw = draw %>%
mutate(probIncDueToYoga =
theta_yoga - theta_trad)

And now, viewing these computed values:

draw
## # A tibble: 1 x 7
##      <dbl>   <dbl>  <dbl>  <dbl>      <dbl>      <dbl>            <dbl>
## 1    -1.78   0.340  -1.44  -1.78      0.191      0.144           0.0471

we see that for this draw, around 19% of yoga trial customers end up signing up for a membership versus 14% of customers doing traditional stretching. According to this draw then, approximately 5% is the signup probability increase due to yoga stretching.

Let’s get a little mathematical and declare this difference in probabilities to be a new random variable Z_{gymID} like:

$Z_{12} \equiv \textrm{ Probability increase due to yoga stretching at gym 12},$

The following code scales the above four steps for creating one draw to creating a column of representative samples for $$Z_{12}$$:

postDF = drawsDF %>%
select(alpha_12,beta_12) %>%
mutate(y_yoga = alpha_12 + beta_12 * 1) %>%
mutate(theta_yoga = 1 / (1+exp(-y_yoga))) %>%
mutate(z_12 = theta_yoga - theta_trad)

The column we just made, z_12, is our posterior distribution for the change in probability due to yoga stretching. We can visualize and summarize this posterior density:

postDF %>%
ggplot(aes(x = z_12)) +
geom_density(fill = "cadetblue", alpha = 0.5)

summary(postDF$z_12) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## -0.17407 0.01366 0.04864 0.06535 0.09382 0.65175 and form only a mild opinion that yoga is probably helpful at gym 12 (i.e. most of the plausible values are above zero). ## 25.3 Visualizing The Outcomes Of Interest To convert posterior probabilities into decisions, we want a visual that communicates a recommendation. In business, a random variable outcome of interest is usually made more compelling by converting it to some measure of money. Let’s assume that that the value of each new customer is estimated to be$500 in net present value terms. We can then create a mathematical formula for value created by yoga stretching per trial customer:

$ValueOfYogaStretchingForGym12 = 500 \times Z_{12}$

and also, represent it computationally

moneyDF = postDF %>%
mutate(ValueCreated = 500 * z_12)

We now have a random variable of the per customer profit estimate if gym12 adopts yoga stretching for the next year versus not adopting yoga stretching. We can summarize this random variable graphically,

moneyDF %>%
ggplot(aes(x = ValueCreated)) +
geom_density(fill = "cadetblue", alpha = 0.5) +
scale_x_continuous(labels = scales::dollar)

which shows both the plausibility of losing money as well as making up to say $100 per customer as a result of the decision. We can find some additional metrics: summary(moneyDF$ValueCreated)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  -87.04    6.83   24.32   32.68   46.91  325.87

telling us the median value for gym12 is about $24 of extra value per customer. This means that we assign a 50/50 chance to being above or below this number in terms of value created per customer. Hence, if it costs extra money (e.g. licensing fees, additional labor expense, additional equipment expense, etc.), say$25 per customer to offer the class, then this investment at gym12 might not be recovered.

Lastly, since continuous probability estimates are sometimes difficult for decision makers (and ourselves) to understand, we can create a discrete probability distribution by creating bins (see http://wilkelab.org/classes/SDS348/2016_spring/projects/project1/project1_hints.html).

breaks = c(-1000,-20,0,20,40,60,80,100,1000)
labels = c("<-$20","-$20 - $0","$0 - $20", "$20 - 40$","$40 - $60","$60 - $80", "$80-$100","$100+")
bins = cut(moneyDF$ValueCreated, breaks, include.lowest = T, right=FALSE, labels=labels) moneyDF$bins = bins  ## add new column

And then, we use the newly created bins to give us a very nice and interpretable plot as shown in Figure 25.4.

## add label for percentage in each bin
plotDF = moneyDF %>%
group_by(bins) %>%
summarize(countInBin = n()) %>%
mutate(pctInBin = countInBin / sum(countInBin)) %>%
mutate(label = paste0(round(100*pctInBin,0),"%")) %>%
mutate(makeMoney = ifelse(bins %in% levels(bins)[1:2],
"Not Profitable",
"Profitable"))

## Create more interpretable plot
plotDF %>%
ggplot(aes(x = bins, y = pctInBin,
fill = makeMoney)) +
geom_col(color = "black") +
geom_text(aes(label=label), nudge_y = 0.015) +
xlab("Value Added Per Trial Customer") +
ylab("Probability of Outcome") +
scale_fill_manual(values = c("lightcoral",
ggtitle("Making Yoga Stretching Mandatory for Gym 12")
From the above, we can easily talk to any decision maker about the possiblities for various outcomes. For example, summing the bottom two percentages tells us there is an approximately 16% chance of yoga stretching creating negative value. Additionally, if Crossfit anticipates a cost of $20 per customer, then there is a 44% chance of not breaking even by using this policy. Exercise 25.1 In the above walkthrough, we analyzed the monetary decision for gym12. Let’s now look at the implications of making yoga stretching mandatory for all new gyms. To do this, you would look at posterior estimates for $$\beta$$ and $$\alpha$$ - i.e. the parent nodes of $$\beta_j$$ and $$\alpha_j$$. Our generative DAG with parameters above the gym’s is a really a cool type of modelling as we not only have estimates at the gym level, but also estimates at this higher level of the entire Crossfit Gyms enterprise. Think of the higher level estimates as spawning gym’s with the $$\beta_j$$ and $$\alpha_j$$ drawn as random values from the $$\beta$$ and $$\alpha$$ distributions. Assuming it costs$10 extra for each trial customer to offer yoga stretching and that the value of a customer is \$500, what is the probability that this mandatory yoga stretching policy is value creating for the Crossfit Gyms corporation?