Get you familiar with a basic workflow for a Bayesian analysis, making use of brms
and other great packages...
Not: a technical introduction to Bayesian analysis
Side-effect: get you in touch with Bayesian inference
The foundation: Bayes' theorem

$$P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta)\,P(\theta)}{P(\text{data})}$$

with $P(\theta \mid \text{data})$ the posterior, $P(\text{data} \mid \theta)$ the likelihood, $P(\theta)$ the prior and $P(\text{data})$ the marginal likelihood
→ we will get back to these during the presentation
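To make the theorem concrete, here is a minimal sketch (not from the original slides) of Bayes' theorem at work: a grid approximation for the mean of a normal distribution, with hypothetical data and a hypothetical residual SD of 70:

theta <- seq(400, 700, length.out = 1001)   # grid of candidate values for theta
prior <- dnorm(theta, mean = 500, sd = 50)  # P(theta): our prior about the mean
y <- c(510, 540, 525)                       # hypothetical observations
# P(data | theta): likelihood of the data for each candidate value of theta
likelihood <- sapply(theta, function(m) prod(dnorm(y, mean = m, sd = 70)))
posterior <- likelihood * prior             # numerator of Bayes' theorem
posterior <- posterior / sum(posterior)     # dividing by P(data) normalizes
theta[which.max(posterior)]                 # candidate with highest posterior mass

brms does essentially this, but with MCMC sampling instead of a grid.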
Advantages:
Disadvantage:
[*] Slide from Paul Bürkner's presentation available on YouTube [https://www.youtube.com/watch?v=FRs1iribZME]
Different dedicated software packages are available: JAGS / BUGS / Stan
The most powerful is Stan! Specifically, its Hamiltonian Monte Carlo algorithm makes it the best choice at the moment
Stan is actually a probabilistic programming language that compiles to C++
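You rarely have to write Stan code yourself: brms generates it for you. The code below was generated with brms 2.14.4 for the intercept-only model fitted later on; you can reproduce it with make_stancode() (shown here with the data and prior that are introduced on the next slides):

library(brms)
# Inspect the Stan program brms writes for an intercept-only model
make_stancode(math ~ 1,
              data = Student_FL19,
              family = "gaussian",
              prior = set_prior("normal(500,50)", class = "Intercept"))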
## // generated with brms 2.14.4
## functions {
## }
## data {
##   int<lower=1> N;  // total number of observations
##   vector[N] Y;  // response variable
##   int prior_only;  // should the likelihood be ignored?
## }
## transformed data {
## }
## parameters {
##   real Intercept;  // temporary intercept for centered predictors
##   real<lower=0> sigma;  // residual SD
## }
## transformed parameters {
## }
## model {
##   // likelihood including all constants
##   if (!prior_only) {
##     // initialize linear predictor term
##     vector[N] mu = Intercept + rep_vector(0.0, N);
##     target += normal_lpdf(Y | mu, sigma);
##   }
##   // priors including all constants
##   target += normal_lpdf(Intercept | 500, 50);
##   target += student_t_lpdf(sigma | 3, 0, 68.8)
##     - 1 * student_t_lccdf(0 | 3, 0, 68.8);
## }
## generated quantities {
##   // actual population-level intercept
##   real b_Intercept = Intercept;
## }
To illustrate brms and other packages, we will do some analyses on TIMSS data...
library(haven)
library(tidyverse)  # for the pipe, select(), and the plotting later on

# Student achievement data (we select some general variables
# and plausible values)
Student_FL19 <- read_sav("asgbflm7.sav") %>% 
  select(IDSCHOOL, IDCLASS, IDSTUD, ITSEX, ASDAGE, TOTWGT, HOUWGT, SENWGT,
         ASMMAT01, ASSSCI01, ASMNUM01, ASMGEO01, ASMDAT01, ASSLIF01,
         ASSPHY01, ASSEAR01, ASMKNO01, ASMAPP01, ASMREA01, ASSKNO01,
         ASSAPP01, ASSREA01) %>% 
  zap_labels()

# School context data (we select schools' socio-economic composition)
School_contextFL19 <- read_sav("acgbflm7.sav") %>% 
  select(IDSCHOOL, ACDGSBC)
A quick exploration of the data...
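(The exploration plots themselves are not reproduced here.) The models below use a variable called math, which is not created by the import code above; we assume it is the first plausible value for mathematics (ASMMAT01, labelled "1st plausible value mathematics"), renamed. A hypothetical preparation step plus a quick histogram:

# Assumption: `math` is the first plausible value for mathematics, renamed
Student_FL19 <- Student_FL19 %>% 
  rename(math = ASMMAT01)

# A first look at the distribution of the math scores
ggplot(Student_FL19, aes(x = math)) +
  geom_histogram(bins = 40) +
  labs(x = "Math score (1st plausible value)", y = "Count")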
How to fit this model in brms
library(brms)

# If necessary, define (some of the) priors yourself
priorMath <- c(set_prior("normal(500,50)", class = "Intercept"))

Model_math_naive <- brm(
  math ~ 1,              # Typical R formula style
  data = Student_FL19,
  family = "gaussian",   # Set the type of dependent variable
  prior = priorMath      # Use our customized prior definition
)
summary(Model_math_naive)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: math ~ 1 
##    Data: Student_FL19 (Number of observations: 4655) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   532.14      0.99   530.23   534.10 1.00     3556     2471
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    67.34      0.68    66.01    68.72 1.00     3571     2185
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
4 chains × 1000 post-warmup iterations = 4000 samples of the parameters
head( posterior_samples(Model_math_naive), 10 )
##    b_Intercept    sigma      lp__
## 1     531.7451 67.96813 -26208.14
## 2     530.8633 67.74419 -26208.66
## 3     533.0500 66.81828 -26208.38
## 4     531.3245 68.11200 -26208.59
## 5     531.4971 67.79558 -26208.08
## 6     531.0264 67.99846 -26208.72
## 7     531.2763 67.44605 -26208.06
## 8     532.9161 67.38107 -26207.97
## 9     532.0318 67.47086 -26207.69
## 10    532.5863 67.43820 -26207.78
Let's set a theme:
theme_set(theme_linedraw() +
            theme(text = element_text(family = "Times"),
                  panel.grid = element_blank()))
Package tidybayes is a wrapper that includes functions from ggdist, opening up a lot of plotting options
posterior_samples(Model_math_naive) %>%   # Get draws of the posterior
  select(b_Intercept) %>%                 # Only select our Intercept
  pivot_longer(everything()) %>%          # Rearrange the result to plot
  ggplot(aes(x = value, y = name)) +      # Let's plot...
  stat_pointinterval(.width = .89) +      # 89% quantile interval (the default)
  xlab("marginal posterior")              # Set our x-axis title
Let's plot the residual standard deviation (sigma) making use of intervals...
posterior_samples(Model_math_naive) %>%      # Get draws of the posterior
  select(sigma) %>%                          # Select our residual SD
  median_qi(.width = c(.95, .89, .5)) %>%    # Choose the intervals we want to plot
  ggplot(aes(x = sigma)) +                   # Let's plot...
  geom_interval(aes(xmin = .lower, xmax = .upper)) +
  xlab("marginal posterior") +               # Set our x-axis title
  scale_color_brewer()                       # Choose a color scale
Or let's combine some things...
names <- c("math average", "sd pupillevel")

posterior_samples(Model_math_naive) %>% 
  select(b_Intercept, sigma) %>% 
  set_names(names) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("math average", "sd pupillevel"))) %>% 
  ggplot(aes(x = value, y = 0)) +
  stat_halfeye(.width = .89, normalize = "panels") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name, scales = "free")
Let's assume that part of the variance can be ascribed to schools: a typical null model in multilevel analysis
$$y_{ij} \sim N(\mu_j, \sigma_e)$$
$$\mu_j = \beta_0 + u_j$$
$$u_j \sim N(0, \sigma_u)$$
Let's fit it in brms
# If necessary, define (some of the) priors yourself
priorMath <- c(set_prior("normal(500,50)", class = "Intercept"))

Model_math <- brm(
  math ~ 1 + (1 | IDSCHOOL),  # Recognise the lme4-style notation
  data = Student_FL19,
  family = "gaussian",
  prior = priorMath,          # Set our prior
  cores = 4,                  # Use 4 cores of the pc for estimation
  seed = 2021,                # Make our analyses reproducible
  sample_prior = TRUE         # Also save samples of the prior
)
summary(Model_math)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: math ~ 1 + (1 | IDSCHOOL) 
##    Data: Student_FL19 (Number of observations: 4655) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~IDSCHOOL (Number of levels: 147) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)    27.75      1.95    24.15    31.78 1.00     1299     2376
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   531.71      2.50   526.82   536.50 1.00      900     1691
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    62.13      0.65    60.86    63.43 1.00     9331     3066
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
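As an aside, the model formula mirrors the lme4 notation exactly; the frequentist counterpart (not run in these slides) would be:

library(lme4)
# Frequentist null model with a random intercept per school, for comparison
lmer(math ~ 1 + (1 | IDSCHOOL), data = Student_FL19)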
Let's look at the samples and calculate the ICC for each draw:
head(
  posterior_samples(Model_math) %>% 
    mutate(ICC = sd_IDSCHOOL__Intercept / (sd_IDSCHOOL__Intercept + sigma)) %>% 
    select(b_Intercept, sigma, sd_IDSCHOOL__Intercept, ICC),
  10
)
##    b_Intercept    sigma sd_IDSCHOOL__Intercept       ICC
## 1     530.2516 63.61148               27.42303 0.3012378
## 2     533.3364 60.70196               28.55842 0.3199451
## 3     528.1693 63.56052               28.95115 0.3129459
## 4     528.9997 60.86778               29.10308 0.3234723
## 5     528.6217 61.42548               31.29036 0.3374866
## 6     532.3404 62.69834               27.04696 0.3013747
## 7     530.8171 60.81000               29.01396 0.3230091
## 8     529.1805 62.00777               26.48913 0.2993227
## 9     531.2647 62.03283               27.28977 0.3055192
## 10    528.6474 62.19574               31.86373 0.3387615
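One caveat: posterior_samples() returns standard deviations, so the ratio above divides SDs. The ICC is conventionally defined on the variances, σu² / (σu² + σe²), which is what tab_model() reports below (≈ 0.17). A sketch of the variance-based version, computed for each draw:

# Variance-based ICC per draw: square the SDs that brms returns
posterior_samples(Model_math) %>% 
  mutate(ICC = sd_IDSCHOOL__Intercept^2 /
           (sd_IDSCHOOL__Intercept^2 + sigma^2)) %>% 
  pull(ICC) %>% 
  median()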
Let's plot our parameter estimates, including the ICC, making use of stat_dotsinterval()
names <- c("math average", "sd pupillevel", "sd schoollevel", "ICC")

posterior_samples(Model_math) %>% 
  mutate(ICC = sd_IDSCHOOL__Intercept / (sd_IDSCHOOL__Intercept + sigma)) %>% 
  select(b_Intercept, sigma, sd_IDSCHOOL__Intercept, ICC) %>% 
  set_names(names) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("math average", "sd schoollevel",
                                        "sd pupillevel", "ICC"))) %>% 
  ggplot(aes(x = value, y = 0)) +
  stat_dotsinterval(.width = .89, normalize = "panels", quantiles = 100) +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name, scales = "free")
posterior_samples(Model_math) %>% 
  pivot_longer(starts_with("r_IDSCHOOL")) %>% 
  mutate(sigma_i = value) %>% 
  ggplot(aes(x = sigma_i, y = reorder(name, sigma_i))) +
  stat_pointinterval(point_interval = mean_qi, .width = .89, size = 1/6) +
  scale_y_discrete(expression(italic(j)), breaks = NULL) +
  labs(subtitle = "Schoollevel residuals and 89% CI",
       x = expression(italic(u)[italic(j)]))
library(sjPlot)
tab_model(Model_math)
1st Plausible Value Mathematics

| Predictors | Estimates | CI (95%) |
|---|---|---|
| Intercept | 531.72 | 526.82 – 536.50 |
| Random Effects | | |
| σ² | 3860.03 | |
| τ00 IDSCHOOL | 770.05 | |
| ICC | 0.17 | |
| N IDSCHOOL | 147 | |
| Observations | 4655 | |
| Marginal R² / Conditional R² | 0.000 / 0.149 | |
Let's introduce 2 explanatory variables:
$$y_{ij} \sim N(\mu_{ij}, \sigma_e)$$
$$\mu_{ij} = \beta_0 + \beta_1\,\text{Girl}_{ij} + \beta_2\,\text{SSC}_j + u_j$$
$$u_j \sim N(0, \sigma_u)$$
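This model uses a Girl dummy and the schools' socio-economic composition (SSC), neither of which was constructed in the import code shown earlier. A hypothetical preparation step (assuming ITSEX codes girls as 1, and treating ACDGSBC as a three-level factor with "more affluent" as reference, matching the SSC2/SSC3 coefficients below):

# Hypothetical data preparation; check the TIMSS codebook for exact codings
Student_FL19 <- Student_FL19 %>% 
  mutate(Girl = if_else(ITSEX == 1, 1, 0)) %>%        # assumes 1 = girl
  left_join(School_contextFL19, by = "IDSCHOOL") %>%  # add school context
  mutate(SSC = factor(ACDGSBC))                       # levels 1 (affluent) to 3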
Let's fit it in brms
priorMathModel2 <- c(set_prior("normal(500,50)", class = "Intercept"),
                     set_prior("normal(-7,5)",   class = "b", coef = "Girl"),
                     set_prior("normal(-10,10)", class = "b", coef = "SSC2"),
                     set_prior("normal(-20,10)", class = "b", coef = "SSC3"))

Model2_math <- brm(
  math ~ 1 + Girl + SSC + (1 | IDSCHOOL),
  data = Student_FL19,
  family = "gaussian",
  prior = priorMathModel2,
  cores = 4,
  seed = 2021
)
tab_model(Model_math, Model2_math,
          dv.labels = c("Model 1 (null model)", "Model 2"))
| Predictors | Model 1 (null model): Estimates | Model 1: CI (95%) | Model 2: Estimates | Model 2: CI (95%) |
|---|---|---|---|---|
| Intercept | 531.72 | 526.82 – 536.50 | 548.82 | 543.79 – 554.06 |
| Girl | | | -12.71 | -16.31 – -9.04 |
| SSC: SSC2 | | | -16.53 | -25.70 – -7.25 |
| SSC: SSC3 | | | -40.74 | -49.73 – -30.86 |
| Random Effects | | | | |
| σ² | 3860.03 | | 3865.48 | |
| τ00 IDSCHOOL | 770.05 | | 426.84 | |
| ICC | 0.17 | | 0.10 | |
| N IDSCHOOL | 147 | | 133 | |
| Observations | 4655 | | 4257 | |
| Marginal R² / Conditional R² | 0.000 / 0.149 | | 0.062 / 0.154 | |
names <- c("More Affluent",
           "Neither more Affluent nor more Disadvantaged",
           "More Disadvantaged")

posterior_samples(Model2_math) %>% 
  mutate(Affluent      = b_Intercept,
         Neutral       = b_Intercept + b_SSC2,
         Disadvantaged = b_Intercept + b_SSC3) %>% 
  select(Affluent, Neutral, Disadvantaged) %>% 
  set_names(names) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("More Disadvantaged",
                                        "Neither more Affluent nor more Disadvantaged",
                                        "More Affluent"))) %>% 
  ggplot(aes(x = value, y = 0)) +
  stat_dotsinterval(.width = .89, normalize = "panels", quantiles = 100) +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name, scales = "free") +
  ggtitle(label = "Average math scores according to SSC",
          subtitle = "Posterior distribution and 89% CI")
Visual posterior predictive check
library(patchwork)

p1 <- pp_check(Model_math_naive, type = "dens_overlay", nsamples = 20) +
  ggtitle("Model_math_naive")
p2 <- pp_check(Model_math, type = "dens_overlay", nsamples = 20) +
  ggtitle("Model_math")
p3 <- pp_check(Model2_math, type = "dens_overlay", nsamples = 20) +
  ggtitle("Model2_math")

p1 + p2 + p3 +
  plot_layout(nrow = 3, byrow = FALSE) +
  plot_annotation(title = 'Posterior Predictive Checks')
Leave-One-Out Cross-Validation (loo)
loo_Model_math_naive <- loo(Model_math_naive)
loo_Model_math       <- loo(Model_math)
loo_Model2_math      <- loo(Model2_math)

loo_Model_math_naive
## 
## Computed from 4000 by 4655 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo -26203.7 46.5
## p_loo         1.9  0.1
## looic     52407.4 93.0
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
loo_Model_math
## 
## Computed from 4000 by 4655 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo -25891.6 46.3
## p_loo       125.7  2.8
## looic     51783.1 92.7
## ------
## Monte Carlo SE of elpd_loo is 0.1.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
loo_Model2_math
## 
## Computed from 4000 by 4257 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo -23674.0 44.5
## p_loo       103.6  2.4
## looic     47348.0 89.1
## ------
## Monte Carlo SE of elpd_loo is 0.1.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
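To rank the models, the loo objects can be passed to loo_compare(). Note that Model2_math was fitted on fewer observations (4257 instead of 4655) because of missing values on the predictors, so it should not be compared directly with the other two; the two models that share the same observations can be:

# Models are ordered from best to worst; elpd_diff is relative to the best
loo_compare(loo_Model_math_naive, loo_Model_math)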
In this workflow we used brms, tidybayes and bayesplot
brms is more versatile than lme4, also allowing you to estimate ...
We should all consider doing Bayesian modelling in the (near) future
I'll try to blog on Bayesian analyses in the coming months [https://svendemaeyer.netlify.app/]
Questions?