class: center, middle, inverse, title-slide # Multilevel analyses with brms ## Tutorial using TIMSS2019 data (FL) ### Sven De Maeyer ### 2021-02-08 --- class: left, top # Aim <br> <br> Get you familiar with a basic workflow for a Bayesian analysis making use of .pink[ `brms`] and other great packages... <br> > Not: a technical introduction in Bayesian analysis <br> > Side-effect: get you in touch with Bayesian Inferences --- class: inverse, center, middle # What's Bayesian about ... --- class: left, middle The fundament: Bayesian Theorem $$ P(\theta|data) = \frac{P(data|\theta)P(\theta)}{P(data)} $$ <br> with - `\(P(data|\theta)\)` : the .pink[likelihood] of the data given our model `\(\theta\)` - `\(P(\theta)\)` : our .pink[prior] belief about model parameters <br> `\(\rightarrow\)` *we will get back to these during the presentation* --- class: left ## Advantages & Disadvantages of Bayesian analyses Advantages: - Natural approach to express uncertainty - Ability to incorporate prior knowledge - Increased model flexibility - Full posterior distribution of the parameters - Natural propagation of uncertainty Disadvantage: - Slow speed of model estimation .footnote[[*] Slide from Paul Bürkner's presentation availble on YouTube [https://www.youtube.com/watch?v=FRs1iribZME] ] --- class: inverse, center, middle # Software --- class: left, middle ## Software - different dedicated software/packages are available: JAGS / BUGS / Stan - most powerful is Stan! Specifically the *Hamiltonian Monte Carlo* algorithm makes it the best choice at the moment - actually Stan is a probabilistic programming language that uses C++ --- class: middle ## Example of Stan code .scriptsize[ ``` ## // generated with brms 2.14.4 ## functions { ## } ## data { ## int<lower=1> N; // total number of observations ## vector[N] Y; // response variable ## int prior_only; // should the likelihood be ignored? ## } ## transformed data { ## } ## parameters { ## real Intercept; // temporary intercept for centered predictors ## real<lower=0> sigma; // residual SD ## } ## transformed parameters { ## } ## model { ## // likelihood including all constants ## if (!prior_only) { ## // initialize linear predictor term ## vector[N] mu = Intercept + rep_vector(0.0, N); ## target += normal_lpdf(Y | mu, sigma); ## } ## // priors including all constants ## target += normal_lpdf(Intercept | 500,50); ## target += student_t_lpdf(sigma | 3, 0, 68.8) ## - 1 * student_t_lccdf(0 | 3, 0, 68.8); ## } ## generated quantities { ## // actual population-level intercept ## real b_Intercept = Intercept; ## } ``` ] --- class: left, middle  --- class: inverse, center, middle # TIMSS 2019 - Belgium (Fl) --- class: left, middle To illustrate .pink[`brms`] and other packages we will do some analyses on TIMSS data... ```r library(haven) ## Student achievement data (we select some general variables ## and plausible values) Student_FL19 <- read_sav("asgbflm7.sav") %>% select(IDSCHOOL, IDCLASS, IDSTUD, ITSEX, ASDAGE, TOTWGT, HOUWGT, SENWGT, ASMMAT01, ASSSCI01, ASMNUM01, ASMGEO01, ASMDAT01, ASSLIF01, ASSPHY01, ASSEAR01, ASMKNO01, ASMAPP01, ASMREA01, ASSKNO01, ASSAPP01, ASSREA01) %>% zap_labels() ## School context data (we select schools' socio-economic ## composition) School_contextFL19 <- read_sav("acgbflm7.sav") %>% select(IDSCHOOL, ACDGSBC) ``` --- class: center, middle .bold[A quick exploration of the data...] <br> <!-- --> --- class: inverse, center, middle # Model 1: pretty naive... --- class: center,top background-image: url("Krusche_style_naive_model.jpg") background-position: 100% 100% background-size: 100% 100% ## To get started we will introduce an overly simple model --- class: left, middle How to fit in .pink[`brms`] ```r # If necessary define (some of the) priors yourself priorMath <- c(set_prior("normal(500,50)", class = "Intercept")) Model_math_naive <- brm( math ~ 1 , # Typical R formula style data = Student_FL19, family = "gaussian", # Set the type of dependent variable prior = priorMath # Define some customized prior definition ) ``` --- class: left, middle .small[ ```r summary(Model_math_naive) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: math ~ 1 ## Data: Student_FL19 (Number of observations: 4655) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 532.14 0.99 530.23 534.10 1.00 3556 2471 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 67.34 0.68 66.01 68.72 1.00 3571 2185 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- class: left .pink[4000 samples of parameters] ```r head( posterior_samples(Model_math_naive), 10 ) ``` ``` ## b_Intercept sigma lp__ ## 1 531.7451 67.96813 -26208.14 ## 2 530.8633 67.74419 -26208.66 ## 3 533.0500 66.81828 -26208.38 ## 4 531.3245 68.11200 -26208.59 ## 5 531.4971 67.79558 -26208.08 ## 6 531.0264 67.99846 -26208.72 ## 7 531.2763 67.44605 -26208.06 ## 8 532.9161 67.38107 -26207.97 ## 9 532.0318 67.47086 -26207.69 ## 10 532.5863 67.43820 -26207.78 ``` --- ##Time for plots... Let's set a theme: ```r theme_set(theme_linedraw() + theme(text = element_text(family = "Times"), panel.grid = element_blank())) ``` --- .pull-left[ Package .pink[`tidybayes`] is a wrapper including functions from .pink[`ggdist`] that open a lot of plotting options .footnotesize[ ```r posterior_samples(Model_math_naive) %>% # Get draws of the posterior select(b_Intercept) %>% # Only select our Intercept pivot_longer(everything()) %>% # Rearrange the result to plot ggplot(aes(x = value, y = name)) + # Let's plot... * stat_pointinterval(.width = .89) + # 89% HDI xlab("marginal posterior") # Set our title ``` ] ] .pull-right[ <!-- --> ] --- .pull-left[ Let's plot the variance estimate making use of intervals... .footnotesize[ ```r posterior_samples(Model_math_naive) %>% # Get draws of the posterior select(sigma) %>% # Select our 'residual' variance median_qi( .width = c(.95, .89, .5)) %>% # Choose the intervals we want to plot ggplot(aes(x = sigma)) + # Let's plot... * geom_interval(aes(xmin = .lower, xmax = .upper)) + xlab("marginal posterior") + # Set our title scale_color_brewer() # Choose a color scale ``` ] ] .pull-right[ <!-- --> ] --- .pull-left[ Or let's combine some things... .footnotesize[ ```r names <- c("math average","sd pupillevel") posterior_samples(Model_math_naive) %>% select(b_Intercept,sigma) %>% set_names(names) %>% pivot_longer(everything()) %>% mutate(name = factor(name, levels = c("math average", "sd pupillevel"))) %>% ggplot(aes(x = value, y = 0)) + * stat_halfeye(.width = .89, normalize = "panels") + scale_y_continuous(NULL, breaks = NULL) + xlab("marginal posterior") + facet_wrap(~name, scales = "free") ``` ] ] .pull-right[ <!-- --> ] --- class: inverse, center, middle # Model 2: let's introduce schools --- ## A more complex model Let's assume that part of the variance can be ascribed to schools <br> a typical .pink[null model] in multilevel analysis `$$\begin{aligned} y_{ij} \sim & N(\mu,\sigma_{e_{ij}})\\ \mu & = \beta_0 + u_j \\ u_j \sim & N(0,\sigma_{u_{j}})\\ \end{aligned}$$` --- class: left, middle Let's fit in .pink[`brms`] ```r # If necessary define (some of the) priors yourself priorMath <- c(set_prior("normal(500,50)", class = "Intercept")) Model_math <- brm( * math ~ 1 + (1|IDSCHOOL), # Recognise the lme4 style notation data = Student_FL19, family = "gaussian", prior = priorMath, # Set our prior cores = 4, # Use 4 cores of pc for estimation seed = 2021, # Make our analyses Reproducible sample_prior = T # Also save samples of the prior ) ``` --- class: left, middle .small[ ```r summary(Model_math) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: math ~ 1 + (1 | IDSCHOOL) ## Data: Student_FL19 (Number of observations: 4655) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~IDSCHOOL (Number of levels: 147) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 27.75 1.95 24.15 31.78 1.00 1299 2376 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 531.71 2.50 526.82 536.50 1.00 900 1691 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 62.13 0.65 60.86 63.43 1.00 9331 3066 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- class: left Let's see the samples and calculate the .pink[ICC] each time: ```r head( posterior_samples(Model_math) %>% mutate(ICC = (sd_IDSCHOOL__Intercept/(sd_IDSCHOOL__Intercept + sigma))) %>% select(b_Intercept, sigma, sd_IDSCHOOL__Intercept, ICC), 10 ) ``` ``` ## b_Intercept sigma sd_IDSCHOOL__Intercept ICC ## 1 530.2516 63.61148 27.42303 0.3012378 ## 2 533.3364 60.70196 28.55842 0.3199451 ## 3 528.1693 63.56052 28.95115 0.3129459 ## 4 528.9997 60.86778 29.10308 0.3234723 ## 5 528.6217 61.42548 31.29036 0.3374866 ## 6 532.3404 62.69834 27.04696 0.3013747 ## 7 530.8171 60.81000 29.01396 0.3230091 ## 8 529.1805 62.00777 26.48913 0.2993227 ## 9 531.2647 62.03283 27.28977 0.3055192 ## 10 528.6474 62.19574 31.86373 0.3387615 ``` --- Let's plot our parameter estimates including .pink[ICC] & make use of .pink[`stat_dotsinterval()`] .footnotesize[ ```r names <- c("math average","sd pupillevel","sd schoollevel", "ICC") posterior_samples(Model_math) %>% * mutate(ICC = (sd_IDSCHOOL__Intercept/(sd_IDSCHOOL__Intercept + sigma))) %>% select(b_Intercept,sigma,sd_IDSCHOOL__Intercept, ICC) %>% set_names(names) %>% pivot_longer(everything()) %>% mutate(name = factor(name, levels = c("math average", "sd schoollevel", "sd pupillevel", "ICC"))) %>% ggplot(aes(x = value, y = 0)) + * stat_dotsinterval(.width = .89, normalize = "panels", quantiles = 100) + scale_y_continuous(NULL, breaks = NULL) + xlab("marginal posterior") + facet_wrap(~name, scales = "free") ``` ] --- <!-- --> --- .pull-left[ .footnotesize[ ```r posterior_samples(Model_math) %>% pivot_longer(starts_with("r_IDSCHOOL")) %>% mutate(sigma_i = value) %>% ggplot(aes(x = sigma_i, y = reorder(name, sigma_i))) + stat_pointinterval(point_interval = mean_qi, .width = .89, size = 1/6) + scale_y_discrete(expression(italic(j)), breaks = NULL) + labs(subtitle = "Schoollevel residuals and 89% CI", x = expression(italic(u)[italic(j)])) ``` ] ] .pull-right[ <!-- --> ] --- ```r library(sjPlot) tab_model(Model_math) ``` <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">1ST PLAUSIBLE VALUE<br>MATHEMATICS</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Intercept</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">531.72</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">526.82 – 536.50</td> </tr> <tr> <td colspan="3" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3860.03</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub> <sub>IDSCHOOL</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">770.05</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.17</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N <sub>IDSCHOOL</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">147</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">4655</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.000 / 0.149</td> </tr> </table> --- class: inverse, center, middle # Model 3: time to predict --- Let's introduce 2 explanatory variables: - .pink[Girl] - .pink[Schools' Socioeconomic Composition (SSC)] `$$\begin{aligned} & y_{ij} \sim N(\mu,\sigma_{e_{ij}})\\ & \mu = \beta_0 + \beta_1\text{Girl}_{ij} + \beta_2\text{SSC}_{j} +u_j \\ & u_j \sim N(0,\sigma_{u_{j}})\\ \end{aligned}$$` --- ## Our prior knowledge - Girl: we expect girls to score lower than boys - in 2015: girls scored 6 points lower - in 2011: girls scored 8 points lower <br> - SSC: we expect that - schools with "More Affluent" composition score higher than - schools with "Neither more Affluent nor more Disadvantaged" composition, who will score higher than - schools with "More Disadvantaged" composotion --- class: left, middle Let's fit in .pink[`brms`] ```r priorMathModel2 <- c(set_prior("normal(500,50)", class = "Intercept"), set_prior("normal(-7,5)", class = "b", coef = "Girl"), set_prior("normal(-10,10)", class = "b", coef = "SSC2"), set_prior("normal(-20,10)", class = "b", coef = "SSC3") ) Model2_math <- brm(math ~ 1 + Girl + SSC + (1|IDSCHOOL), data = Student_FL19, family = "gaussian", prior = priorMathModel2, cores = 4, seed = 2021, ) ``` --- class: left, middle .small[ ```r tab_model(Model_math, Model2_math, dv.labels = c("Model 1 (null model)", "Model 2")) ``` <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Model 1 (null model)</th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Model 2</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Intercept</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">531.72</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">526.82 – 536.50</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">548.82</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">543.79 – 554.06</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Girl</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-12.71</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-16.31 – -9.04</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">SSC: SSC2</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-16.53</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-25.70 – -7.25</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">SSC: SSC3</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-40.74</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-49.73 – -30.86</td> </tr> <tr> <td colspan="5" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3860.03</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">3865.48</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">770.05 <sub>IDSCHOOL</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">426.84 <sub>IDSCHOOL</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.17</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.10</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">147 <sub>IDSCHOOL</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">133 <sub>IDSCHOOL</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">4655</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">4257</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.000 / 0.149</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.062 / 0.154</td> </tr> </table> ] --- .pull-left[ .footnotesize[ ```r names <- c("More Affluent", "Neither more Affluent nor more Disadvantaged", "More Disadvantaged") posterior_samples(Model2_math) %>% mutate(Affluent = b_Intercept, Neutral = (b_Intercept + b_SSC2) , Disadvantaged = (b_Intercept + b_SSC3)) %>% select(Affluent, Neutral, Disadvantaged) %>% set_names(names) %>% pivot_longer(everything()) %>% mutate(name = factor(name, levels = c("More Disadvantaged", "Neither more Affluent nor more Disadvantaged", "More Affluent"))) %>% ggplot(aes(x = value, y = 0)) + stat_dotsinterval(.width = .89, normalize = "panels", quantiles = 100) + scale_y_continuous(NULL, breaks = NULL) + xlab("marginal posterior") + facet_wrap(~name, scales = "free") + ggtitle(label = "Average math scores according to SSC", subtitle = "Posterior distribution and 89% CI") ``` ]] .pull-right[ <!-- --> ] --- class: inverse, center, middle # Let's compare models --- Visual posterior predictive check .pull-left[ .small[ ```r p1 <- pp_check(Model_math_naive, type = "dens_overlay", nsamples = 20) + ggtitle("Model_math_naive") p2 <- pp_check(Model_math, type = "dens_overlay", nsamples = 20) + ggtitle("Model_math") p3 <- pp_check(Model2_math, type = "dens_overlay", nsamples = 20) + ggtitle("Model2_math") library(patchwork) p1 + p2 + p3 + plot_layout(nrow = 3, byrow = F) + plot_annotation(title = 'Posterior Predictive Checks') ``` ] ] .pull-right[ <!-- --> ] --- .pink[Leave One Out - Cross Validation (loo)] .pull-left[ .footnotesize[ ```r loo_Model_math_naive <- loo(Model_math_naive) loo_Model_math <- loo(Model_math) loo_Model2_math <- loo(Model2_math) loo_Model_math_naive ``` ``` ## ## Computed from 4000 by 4655 log-likelihood matrix ## ## Estimate SE ## elpd_loo -26203.7 46.5 ## p_loo 1.9 0.1 ## looic 52407.4 93.0 ## ------ ## Monte Carlo SE of elpd_loo is 0.0. ## ## All Pareto k estimates are good (k < 0.5). ## See help('pareto-k-diagnostic') for details. ``` ] ] .pull-right[ .footnotesize[ ```r loo_Model_math ``` ``` ## ## Computed from 4000 by 4655 log-likelihood matrix ## ## Estimate SE ## elpd_loo -25891.6 46.3 ## p_loo 125.7 2.8 ## looic 51783.1 92.7 ## ------ ## Monte Carlo SE of elpd_loo is 0.1. ## ## All Pareto k estimates are good (k < 0.5). ## See help('pareto-k-diagnostic') for details. ``` ```r loo_Model2_math ``` ``` ## ## Computed from 4000 by 4257 log-likelihood matrix ## ## Estimate SE ## elpd_loo -23674.0 44.5 ## p_loo 103.6 2.4 ## looic 47348.0 89.1 ## ------ ## Monte Carlo SE of elpd_loo is 0.1. ## ## All Pareto k estimates are good (k < 0.5). ## See help('pareto-k-diagnostic') for details. ``` ] ] --- ## To conclude - Bayesian modelling is made simple by .pink[`brms`] - Great add-ons are .pink[`tidybayes`] and .pink[`bayesplot`] ## What's next - .pink[`brms`] is more versatile than lme4 allowing also to estimate - multivariate models - finite mixture models - generalized models - splines - ... - we all should consider to do Bayesian modelling in the (near) future --- class: inverse, left, middle ## My ambition I'll try to blog the coming months on Bayesian Analyses [https://svendemaeyer.netlify.app/] ## Some great sources - if you have time: "A student's guide to Bayesian Statistics", by Ben Lambert - inspiring blogs: - Solomon Kurz: [https://solomonkurz.netlify.app/post/] - Research on visualization & prior generation: [https://mucollective.northwestern.edu/] --- class: center background-image: url(camylla-battani-AoqgGAqrLpU-unsplash.jpg) background-size: contain .pink[Questions?]