
Chapter 15 Install TensorFlow, greta, and causact

The previous two chapters covered Bayes rule and generative DAGs. This chapter has you install necessary software:

  1. tensorflow to computationally automate Bayes rule,
  2. causact to visually depict pretty generative DAGs, and
  3. greta to apply Bayes rule to generative DAGs informed by data stored in data frames.

Figure 15.1: The greta logo. See https://greta-stats.org/ for more information on greta.


The instructions below will help you set up your computer environment to get greta, along with the causact package, running smoothly. While Python is required, these instructions do not assume any previous installation of Python. They only assume that you have a recent version of R/RStudio (R version 4.1.X or higher) and a machine capable of running TensorFlow. If your machine does not meet the system requirements in the margin, proceed to the last section of this chapter to install greta using RStudio Cloud.

greta & TensorFlow System Requirements:

  * Ubuntu 16.04 or later (64-bit)
  * macOS 10.12.6 (Sierra) or later (64-bit) (no GPU support)
  * macOS: Intel chips only; M1 ARM-based chips not supported yet
  * Windows 7 or later (64-bit) (Python 3 only)
  * Raspbian 9.0 or later

See the RStudio Cloud install instructions below if these requirements are not met.
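Before installing anything, it can help to confirm a few of these requirements from inside R. The following is a minimal sketch using only base R; `check_r_setup()` is a hypothetical helper name (it is not part of greta or causact), and it only inspects the running R session. It does not check free disk space or whether TensorFlow supports your processor.

```r
# Minimal pre-install check using only base R.
# check_r_setup() is a hypothetical helper, not part of greta or causact.
check_r_setup <- function(min_r = "4.1.0") {
  list(
    r_version = as.character(getRversion()),
    r_ok      = getRversion() >= min_r,         # R version 4.1.X or higher
    is_64bit  = .Machine$sizeof.pointer == 8,   # 8-byte pointers imply a 64-bit build of R
    os        = Sys.info()[["sysname"]],        # e.g. "Windows", "Darwin", "Linux"
    arch      = R.version$arch                  # processor architecture R was built for
  )
}

setup <- check_r_setup()
str(setup)
```

If `r_ok` or `is_64bit` comes back `FALSE`, the cloud-based installation at the end of this chapter is likely the easier route.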

15.1 Software Stack for Using causact

Before we get to the detailed installation instructions, it is instructive to understand why a software stack is needed in the first place. In our case, we use causact to visually translate business narratives into statistical models. To perform inference on any model, causact relies on greta, which makes scalable Bayesian inference available to R users without requiring them to learn a separate coding language (e.g. TensorFlow or Stan). Behind the scenes, however, greta relies on the Python-based TensorFlow library for its numerical computation engine.

To get the TensorFlow software stack working properly, we need to make sure all component parts are compatible. (A software stack is a collection of programs, applications, components, and tools that work together to get a result.)

Figure 15.2: Two software stack possibilities. The stack on the left is a stable one. The stack on the right is what you would get installing more recent versions of each component; i.e. a falling stack that does not work.


Figure 15.2 shows two potential software stacks. Both include four key components:

  1. Python: TensorFlow is a Python library and hence needs a Python installation on its host system in order to run. We will use miniconda, as needed, to get this.

  2. TensorFlow: This is the heart of being able to do fast numerical computing. Once miniconda is installed, we will configure a conda environment to get the version of TensorFlow and other Python libraries required by greta.

  3. TensorFlow Probability: An additional Python library built on top of TensorFlow specializing in probabilistic inference at scale.

  4. greta: A very slick interface between R and TensorFlow. This interface gives us the power of TensorFlow without the complexity and cognitive overhead of learning another language.

** If you do not meet the minimum system requirements or decide to abandon the installation process detailed below, you can access greta capabilities using RStudio in the cloud. Simply navigate to https://rstudio.cloud/, create an account, and use the cloud install instructions for greta and causact at the end of this chapter.

As I write this, only the stack on the left is serviceable. The stack on the right represents more recent software versions, which you might get by following the easiest default install instructions, but those versions do not work together. So before leveraging greta, we will have to do some system configuration. The next section is an R script that does this configuration for you. If issues arise installing greta, you can also seek help at https://forum.greta-stats.org/. Please note that you need a 64-bit computer with about 10GB of free hard drive space.**
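To make the idea of a compatible stack concrete, here is a small sketch that compares a set of version numbers against a set of pins. The pinned versions are copied from the cloud install script later in this chapter and are only an assumption about what a working stack looks like. `find_mismatches()` is a hypothetical helper, and the "installed" versions are made up for illustration.

```r
# Version pins copied from the cloud install script in this chapter.
# Treat them as an assumption, not an official compatibility table.
pinned <- c(tensorflow = "1.14.0", tensorflow_probability = "0.7.0")

# find_mismatches() is a hypothetical helper, not part of greta or causact.
# It returns the names of components whose version differs from the pin.
find_mismatches <- function(installed, pinned) {
  shared <- intersect(names(installed), names(pinned))
  differs <- vapply(
    shared,
    function(nm) package_version(installed[[nm]]) != package_version(pinned[[nm]]),
    logical(1)
  )
  shared[differs]
}

# Made-up "installed" versions, for illustration only.
installed <- c(tensorflow = "2.9.1", tensorflow_probability = "0.7.0")
find_mismatches(installed, pinned)
```

Comparing with `package_version()` rather than plain strings means that, for example, "1.14.0" and "1.14" are compared as version numbers rather than as text.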

15.2 An Easy Install Script

This script installs greta and causact on your local machine (see the “RStudio Cloud Install” section for cloud installations). The script allows you to complete the installation process without ever leaving RStudio. Try it by running each line one at a time and awaiting the system’s response before continuing.

## INSTALLATION SCRIPT TO GET GRETA, CAUSACT,
## and TENSORFLOW ALL WORKING TOGETHER HAPPILY

## NOTE:  Run each line one at a time using CTRL+ENTER.
##        Await completion of one line
##        before running the next.
##        If prompted to "Restart R", say YES.

#### STEP 0:  Restart R in a Clean Session
#### use RStudio menu:  SESSION -> RESTART R

#### STEP 1: INSTALL R PACKAGES
install.packages("greta")
install.packages("causact")

#### STEP 2: INSTALL PYTHON DEPENDENCIES IN FINDABLE SPOT
greta::install_greta_deps()
## if asked to install miniconda, please type "Y" 
## and hit <ENTER> in the Console
## this can take up to 10 minutes

#### STEP 3:  TEST THE INSTALLATION - must restart r first
##  **** USE MENU:   SESSION -> RESTART R
library(greta)  ## should work without error if you restarted R
library(causact)
graph = dag_create() %>%
  dag_node("Normal RV",
           rhs =normal(0,10))
graph %>% dag_render()  ## see oval
drawsDF = graph %>% dag_greta() ## see "running..."
drawsDF %>% dagp_plot(densityPlot = TRUE)  ## see plot
#### CONGRATS IF IT WORKS.  

If the above script produced a plot in the last line - CONGRATS!!
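If the plot does not appear, a reasonable first diagnostic is greta's situation report, which prints what greta can find of Python, TensorFlow, and TensorFlow Probability. The sketch below assumes greta version 0.4.0 or later, where `greta_sitrep()` is available; the `has_greta` variable is just an illustrative name.

```r
# Assumes greta >= 0.4.0, where greta_sitrep() is available.
# has_greta is an illustrative variable name.
has_greta <- requireNamespace("greta", quietly = TRUE)

if (has_greta) {
  # Prints a report of the Python dependencies that greta can detect.
  greta::greta_sitrep()
} else {
  message("greta is not installed; run install.packages(\"greta\") first.")
}
```

The output of this report is also useful to include when asking for help on the greta forum.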

15.3 RStudio Cloud Install (skip if you had success installing locally)

RStudio Cloud (https://rstudio.cloud/) allows anyone with access to an internet browser to use R and RStudio. After setting up your account, you can get a working environment with the greta, causact, and tidyverse packages installed by running the code below. If asked to install Miniconda, respond yes by typing Y in the console and pressing <ENTER>.

RStudio Cloud is useful for those with Chromebooks or computers that seem underpowered for modern analytics. If you have a laptop that can handle it, then I recommend sticking with your locally installed RStudio.

### SETUP AN RSTUDIO CLOUD ACCOUNT
### AT https://rstudio.cloud/, THEN USE
### THIS INSTALL SCRIPT FOR INSTALLING
### CAUSACT,GRETA,TENSORFLOW ON RSTUDIO CLOUD

## Get R packages
install.packages("remotes")
install.packages("reticulate")
## Install older version as v2.7 had breaking changes
remotes::install_version("tensorflow", version = "2.6.0", 
                         repos = "http://cran.us.r-project.org")

## INSTALL PYTHON TENSORFLOW ENVIRONMENT
## If prompted to install Miniconda
## enter Y in console and then hit <ENTER>
tensorflow::install_tensorflow(
  version = "1.14.0",
  extra_packages = 
    c("tensorflow-probability==0.7.0",
      "numpy==1.16",
      "pyyaml", "requests",
      "Pillow", "pip"))   

## Get R packages
install.packages("greta")
install.packages("tidyverse")
install.packages("causact")
### IF NO ERRORS, THEN TRY BELOW TEST SCRIPT

## TEST SCRIPT
library(greta)  ## should work now
library(causact)
library(tidyverse)
graph = dag_create() %>%
  dag_node("Normal RV","x",
           rhs =normal(0,10))
graph %>% dag_render()  ## see oval - ignore warning on RStudio Cloud
drawsDF = graph %>% dag_greta()  ## observe "running X chains ..."
drawsDF %>% ggplot() + geom_density(aes(x=x), fill = "darkgreen")  ##see plot
## if NO ERRORS (warnings are okay), then installation is a success

The install instructions for this section should only be run once in your cloud account environment.

YouTube playlist link for videos that accompany each chapter: https://youtube.com/playlist?list=PLassxuIVwGLPy-mtohX-NXrjD8fc9FBOc
Buy a beautifully printed full-color version of "A Business Analyst's Guide to Business Analytics" on Amazon: http://www.amazon.com/dp/B08DBYPRD2