class: center, middle <style> .center2 { margin: 0; position: absolute; top: 50%; left: 50%; -ms-transform: translate(-50%, -50%); transform: translate(-50%, -50%); } </style> <style type="text/css"> .right-column{ padding-top: 0; } .remark-code, .remark-inline-code { font-family: 'Source Code Pro', 'Lucida Console', Monaco, monospace; font-size: 90%; } </style> <div class="my-logo-left"> <img src="img/UA-eng-hor-1-RGB.jpg" width="90%"/> </div>

# Workshop: Bayesian analyses in R

.font160[ .SW-greenD[Sven De Maeyer] ]

[University of Antwerp / Faculty of Social Sciences]

.font80[ .UA-red[ 13/09/2021 ] ]

---
background-image: url(despicable_me2.jpeg)
background-size: contain
class: inverse

## .yellow[About me]

- University of Antwerp; Faculty of Social Sciences; dept. Training and Education Sciences
- PhD on methodological issues in research on School Effectiveness
- research on a broad range of topics (e.g. psychometrics, writing research, learning strategies, education for sustainable development, linguistics, speech development of impaired children, ...)
- my main focus shifted recently to Comparative Judgement and Learning from comparisons
- for the past two years, discovering Bayesian Statistics & `brms`

<br>
<br>
<br>

Blog: https://svendemaeyer.netlify.app/

Twitter: @svawa

GitHub: https://github.com/Sdemaeyer2

---
background-image: url(we_like_you_too.jpg)
background-size: contain
class: inverse, center, bottom

## .green[What about you?]

.black[Who has some experience with Bayesian analyses?]

.black[... with `Stan`?]

.black[ ... with `brms`?]

<br>
<br>
<br>

---
name: topicslide

# Topics

1. Aims of the workshop & some practical stuff ([let's go there](#aims))
2. What's Bayesian? ([let's go there](#Whats_Bayesian))
3. Why .UA-red[`brms`]? ([let's go there](#why_brms))
4. .UA-red[`brms`] basic example ([let's go there](#brms_example))
5. Let's get practical: .SW-greenD[the example data] ([let's go there](#example_data))
6.
Let's slightly increase complexity ([let's go there](#add_complexity))

---
class: inverse-green, center, middle
name: aims

# 1. Aims & practical stuff

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## My aims are ...

<br>
<br>

- to give some background on Bayesian statistics without getting too technical (I hope)
- to get you familiar with a basic workflow for a Bayesian analysis making use of .UA-red[ `brms`] and other great packages
- to let you apply what's learned on your data (so it's hands-on)
- to supply you with .UA-red[ `R code`] that may inspire you
- to share some great sources

---
## All material is shared...

<br>

On GitHub there is a dedicated repository where you'll find:

- the Rmarkdown file for the slides
- saved models from the slides (some take too long to estimate 'live')
- data used for the exercises

<br>

The GitHub repository can be accessed using the following link: [https://github.com/Sdemaeyer2/Turku_brms_2021]

<br>

The slides are on my Talks site [https://slides-sdemaeyer.netlify.app/]

---
## Zoom

<br>

Recordings

<br>

Please ask questions and interrupt! Just give a shout!

<br>

Stuck with some exercises? `\(\rightarrow\)` Share your screen and I will try to help

---
## Your own data

Day 2:

.center2[
- some dedicated time to apply stuff to your own data
- be prepared:
  - think about some (simple) models (and possible priors...)
  - prepare your data to avoid a lot of data issues on day 2
- for the brave: try some modelling at home before day 2
]

---
name: Prerequisites

## Prerequisites

- be familiar with R
- have `CmdStan` & `cmdstanr` up and running (see https://mc-stan.org/cmdstanr/articles/cmdstanr.html)
- have `brms` installed
- have `tidyverse` installed

I will often use the .SW-greenD[functional programming] style making use of the pipe operator (works if `tidyverse` is loaded):

```r
A %>% B
```

This reads as .UA-red[take A] and then .UA-red[do B]

An example:

```r
c(1,2,3,4,5) %>% mean()
```

reads as .UA-red[create a vector of the values 1, 2, 3, 4 and 5] and then .UA-red[calculate the mean]

Want to learn dplyr through nice GIFs: [hop to this great tweet feed](#tweet_dplyr)

---
class: inverse-green, center, middle
name: Whats_Bayesian

## What's Bayesian inference about?

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## First things first...

<br>
<br>

What is <b>.SW-greenD[statistical inference]</b> about?

<br>
<br>

In your definition, what is a <b>.SW-greenD[statistical model]</b>?

<br>
<br>

What are <b>.SW-greenD[parameters]</b>?

---
## Frequentist vs. Bayesian inference

.pull-left[
<img src="figures_inferences_NHST.jpeg" width="110%" height="110%" />
]

.pull-right[
<img src="figures_inferences_Bayes.jpeg" width="110%" height="110%" />
]

---
## To get us started...

.left-column[
<img src="Van_Helsing_1931.png" width="110%" height="110%" />
]

.right-column[
- prof. dr. Von Helsing (a famous Vampire Hunter)
- How many Vampires are out there?
- Time for his famous .UA-red[Vampire Test]
- Sample of 200 citizens
- Result: 10.5% = Vampire
- Ok, but what about the population?
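
The Bayesian updating shown on the following slides can be sketched in a few lines of R with a grid approximation for a single proportion. The counts here are my assumption (10.5% of 200 = 21 vampires); this is an illustration, not Von Helsing's actual protocol:

```r
# Grid approximation sketch (assumed counts: 21 vampires in 200 tests)
p_grid <- seq(0, 1, length.out = 1000)    # candidate vampire proportions
prior  <- rep(1, length(p_grid))          # flat prior: no knowledge at all
likelihood <- dbinom(21, size = 200, prob = p_grid)
posterior  <- likelihood * prior
posterior  <- posterior / sum(posterior)  # rescale so it sums to 1

# summarize: the 90% most probable values
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)
quantile(samples, c(.05, .95))
```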
]

---
## What a Frequentist approach can tell us

Make a Confidence Interval for the 10.5%:

- Estimate the sampling error:
  * sampling error: 0.0217
- Calculate upper and lower limit of the 95% CI:
  * lower: 0.0625
  * upper: 0.1475
- Inference:

> If we drew many samples of size 200 and computed a 95% CI for each, then 95% of those intervals would contain the true percentage of vampires. For our sample, the interval runs from 6.25% to 14.75%.

---
## The Bayesian approach

.left-column[
Before Von Helsing gathers data he has no knowledge at all!

If we visualize this 'prior knowledge' it could look like this.]

.right-column[
<img src="Part1_files/figure-html/unnamed-chunk-9-1.png" width="576" />
]

---
## The Bayesian approach

.left-column[
Von Helsing starts testing ...

The first person tested is NOT a Vampire!

So, 100% Vampires is already less probable. Time to adjust our knowledge
]

.right-column[
<img src="Part1_files/figure-html/unnamed-chunk-10-1.png" width="576" />
]

---
## The Bayesian approach

.left-column[
After testing the whole sample of 200 citizens, Von Helsing's knowledge is updated:
]

.right-column[
<img src="Part1_files/figure-html/unnamed-chunk-11-1.png" width="576" />
]

---
## The Bayesian approach

.left-column[
We get a probability for each possible percentage of vampires.

* plot it in a .UA-red[probability density] plot
* summarize: e.g.
the 90% most probable values are situated between 7.5% and 14.8%
]

.right-column[
<img src="Part1_files/figure-html/unnamed-chunk-12-1.png" width="576" />
]

---
## Advantages & Disadvantages of Bayesian analyses

<br>

Advantages:

- Natural approach to express uncertainty
- Ability to incorporate prior knowledge
- Increased model flexibility
- Full posterior distribution of the parameters
- Natural propagation of uncertainty

<br>

Disadvantages:

- Slow speed of model estimation
- Some reviewers don't understand you (<i>"give me the p-value"</i>)

.footnote[[*] Slight adaptation of a slide from Paul Bürkner's presentation available on YouTube [https://www.youtube.com/watch?v=FRs1iribZME] ]

---
## Bayes' Theorem

.pull-left[
<img src="prior_data_posterior.png" width="50%" height="50%" />
]

.pull-right[
$$ P(\theta|data) = \frac{P(data|\theta)P(\theta)}{P(data)} $$

<br>

with

- `\(P(data|\theta)\)` : the .UA-red[likelihood] of the data given our model `\(\theta\)`
- `\(P(\theta)\)` : our .UA-red[prior] belief about model parameters
- `\(P(\theta|data)\)`: the .UA-red[posterior] probability of model parameters
]

.footnote[meme from https://twitter.com/ChelseaParlett/status/1421291716229746689?s=20]

---
## Likelihood

.left-column[
`\(P(data|\theta)\)` sometimes also written as `\(L(\theta|data)\)`
]

.right-column[
<br>

Actually this is our .UA-red[<b>model</b>] part

<br>

with `\(\theta\)` being the model and all parameters in the model
]

---
## Likelihood

.left-column[
Example: .SW-greenD[<i>What if we want to model how fast people can run a 10K?</i>]

<img src="running.jpg" width="80%" height="80%" />
]

.right-column[
Data = 10 observations (Running times for 10K in minutes)

```
## [1] 52 54 58 48 41 49 72 53 64 62
```

The model (<i>normal distribution</i>): `\(y_i \sim N(\mu, \sigma)\)`

The parameter values that maximize the likelihood of our data:

.footnotesize[

```r
Mean_RT <- mean(RT, na.rm = T)
Mean_RT
```

```
## [1] 55.3
```

```r
Sd_RT <- sd(RT, na.rm = T)
Sd_RT
```

```
## [1] 8.957306
```

]
]

---
## Prior

.center2[
.SW-greenD[
Expression of our prior knowledge (belief) about probable parameter values as a probability density function
]

<br>
<br>

><i>"For Bayesians, the data are treated as fixed and the parameters vary. [...] Bayes' rule tells us that to calculate the posterior probability distribution we must combine a likelihood with a prior probability distribution over parameter values."</i>
<br>
(Lambert, 2018, p. 88)
]

---
## Prior

.left-column[
Example: .SW-greenD[<i>What if we want to model how fast people can run a 10K?</i>]

<img src="running.jpg" width="80%" height="80%" />
]

.right-column[
`\(y_i \sim N(\mu, \sigma)\)`

Express our prior beliefs about .UA-red[ `\(\mu\)` ] and .UA-red[ `\(\sigma\)` ]

.UA-blue[
> .Large[🤔] <i> Please share your thoughts: what are possible values of the population average and standard deviation for our example?] </i>
]

---
## Prior

.pull-left[
### Uninformative / Vague

When .SW-greenD[objectivity] is crucial and you want to <i> .SW-greenD[let the data speak for itself...
</i>
]
]

.pull-right[
### Informative

When including significant information is crucial

- previously collected data
- results from former research/analyses
- data of another source
- theoretical considerations
]

---
## Prior

.left-column[
Example: .SW-greenD[<i>What if we want to model how fast people can run a 10K?</i>]

<img src="running.jpg" width="80%" height="80%" />
]

.right-column[
<b>Vague priors for `\(\mu\)` </b>

<img src="Part1_files/figure-html/unnamed-chunk-19-1.png" width="504" />
]

---
## Say we have the following priors

```r
par(mfrow=c(2,2))
curve( dnorm( x , 50 , 20 ) , from=1 , to=200 ,xlab="mu", main="Prior for mu")
curve( dunif( x , 1 , 40 ) , from=-10 , to=50 ,xlab="sigma", main="Prior for sigma")
```

<img src="Part1_files/figure-html/unnamed-chunk-20-1.png" width="504" />

---
## Calculate the Posterior by hand (aka .UA-red[grid approximation])

.pull-left[

```r
# sample some values for mu and sigma
mu.list <- seq(from = 30, to = 90, length.out=200)
sigma.list <- seq(from = 2, to = 30, length.out = 200)
post <- expand.grid(mu = mu.list, sigma = sigma.list)

# Calculate the loglikelihood of the data
# for each parameter value
post$LL <- sapply(1:nrow(post), function(i) sum(
  dnorm( RT ,
         mean=post$mu[i] ,
         sd=post$sigma[i] ,
         log=TRUE )
))
```
]

.pull-right[

```r
# Calculate posterior as product
# of LL and Prior
# but you see a '+ sign'
# because we put everything on the
# log scale
# (to avoid getting zeros
# due to underflow in R)
post$prod <- post$LL +
  dnorm(post$mu, 50 , 10 , TRUE) +
  dunif(post$sigma , 1 , 40 , TRUE)

# Re-scale the posterior
# to the probability scale
post$prob <- exp(post$prod - max(post$prod) )
```
]

---
## Visualise the posterior distribution

<i> Make a contour plot </i>

.pull-left[

```r
# Sampling going on
# So for reproducibility
set.seed(1975)

post %>%
  # sample 10000 rows
  # with replacement
  # higher prob = higher chance to
  # be sampled
  sample_n(size = 10000,
           replace = TRUE,
           weight = prob) %>%
  # create the plot
  ggplot(aes(x = mu, y =
sigma)) +
  geom_density_2d_filled() +
  theme_minimal()
```
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-24-1.png" width="432" />
]

---
## Visualise the posterior distributions

<i> Or sample from posterior and plot simple density plots </i>

```r
sample.rows <- sample(1:nrow(post), size=10000, replace=TRUE, prob=post$prob)
sample.mu <- post$mu[sample.rows]
sample.sigma <- post$sigma[sample.rows]
```

.pull-left[

```r
plot(density(sample.mu),main="mu")
```

<img src="Part1_files/figure-html/unnamed-chunk-26-1.png" width="288" />
]

.pull-right[

```r
plot(density(sample.sigma),main="sigma")
```

<img src="Part1_files/figure-html/unnamed-chunk-27-1.png" width="288" />
]

---
class: inverse-green, center, middle
name: why_brms

## 3. Why `brms`?

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## Imagine a 'simple' linear model

<br>

`$$\begin{aligned}
& RT_{i} \sim N(\mu_{i},\sigma_{e})\\
& \mu_{i} = \beta_0 + \beta_1*\text{Weight}_{i} + \beta_2*\text{Height}_{i} + \beta_3*\text{Age}_{i} + \beta_4*\text{WeeklyTrainingHours}_{i} + \beta_5*\text{Gender}_{i} \\
\end{aligned}$$`

<br>

So you can get a REALLY LARGE number of parameters!

---
## Markov Chain Monte Carlo - Why?

Complex models `\(\rightarrow\)` Large number of parameters `\(\rightarrow\)` exponential number of combinations!

<br>

The posterior becomes intractable with grid approximation

<br>

We will approximate the 'joint posterior' .SW-greenD[by 'smart' sampling]

<br>

Samples of combinations of parameter values are drawn

<br>

BUT: .SW-greenD[samples will not be random!]

---
## MCMC - demonstrated

.center2[
The following link brings you to an interactive tool that lets you get the basic idea behind MCMC sampling:

https://chi-feng.github.io/mcmc-demo/app.html#HamiltonianMC,standard
]

---
## Side-step: conjugate priors

Some prior probability distributions can be <i>'more easily'</i> combined with a certain likelihood!
These are called .SW-greenD[<b>conjugate</b>] priors

<b>PRO: </b> You can actually analytically calculate the posterior. So: no need for all the sampling and approximation of the posterior.

A list of conjugate likelihood-prior pairs: https://en.wikipedia.org/wiki/Conjugate_prior

<b>BUT: </b>

><i>"Simplicity comes with a cost! In most real-life examples of inference, the constraint of choosing likelihood-prior conjugate pairs is too restrictive and can lead us to use models that inadequately capture the variability in the data."</i> (Lambert, 2018, p. 209)

---
## Software

<br>

- different dedicated software/packages are available: JAGS / BUGS / Stan

<br>

- most powerful is .UA-red[Stan]! Specifically the *Hamiltonian Monte Carlo* algorithm makes it the best choice at the moment

<br>

- .UA-red[Stan] is a probabilistic programming language that uses C++

---
## Example of Stan code

.scriptsize[

```
## // generated with brms 2.14.4
## functions {
## }
## data {
##   int<lower=1> N;  // total number of observations
##   vector[N] Y;  // response variable
##   int prior_only;  // should the likelihood be ignored?
## }
## transformed data {
## }
## parameters {
##   real Intercept;  // temporary intercept for centered predictors
##   real<lower=0> sigma;  // residual SD
## }
## transformed parameters {
## }
## model {
##   // likelihood including all constants
##   if (!prior_only) {
##     // initialize linear predictor term
##     vector[N] mu = Intercept + rep_vector(0.0, N);
##     target += normal_lpdf(Y | mu, sigma);
##   }
##   // priors including all constants
##   target += normal_lpdf(Intercept | 500,50);
##   target += student_t_lpdf(sigma | 3, 0, 68.8)
##     - 1 * student_t_lccdf(0 | 3, 0, 68.8);
## }
## generated quantities {
##   // actual population-level intercept
##   real b_Intercept = Intercept;
## }
```
]

---

---
class: inverse-green, center, middle
name: brms_example

## 4.
`brms` basic example

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## `brms` syntax

Very, very similar to `lme4` and in line with typical R-style writing up of a model ...

.pull-left[
`lme4`

```r
Model <- lmer(
  y ~ x1 + x2 + (1|Group),
  data = Data,
  ...
)
```
]

.pull-right[
`brms`

```r
Model <- brm(
  y ~ x1 + x2 + (1|Group),
  data = Data,
*  family = "gaussian",
*  backend = "cmdstanr"
  ...
)
```
]

Notice:

- `family = "gaussian"` indicates the "likelihood" function we will use
- `backend = "cmdstanr"` indicates the way we want to interact with Stan and C++

---
## Let's retake the example on running

.left-column[
Example: .SW-greenD[<i>What if we want to model how fast people can run a 10K?</i>]

<img src="running.jpg" width="80%" height="80%" />
]

.right-column[
The simplest model looked like:

$$ RT_i \sim N(\mu,\sigma_e) $$

In `brms` this model is:

```r
RT <- c( 52, 54, 58, 48, 41, 49, 72, 53, 64, 62 )

# First make a dataset from our RT vector
DataRT <- tibble(RT)

Mod_RT1 <- brm(
  RT ~ 1, # We only model an intercept
  data = DataRT,
  backend = "cmdstanr",
  seed = 1975
)
```

.UA-red[<b> .Large[👉] Try it yourself and run the model ... (don't forget to load the necessary packages: `brms` & `tidyverse`) </b> ]
]

---
<img src="RT_Model1_Estimation.jpg" width="60%" height="60%" />

---
## But ...
.center2[
<img src="MCMC_Skeleton.jpg" width="50%" height="50%" />
]

---
## So

For the workshop all models are in the .SW-greenD[models] folder

You can normally read model information with `readRDS`

Here's an example:

```r
Mod_RT1 <- readRDS(here("models", "Mod_RT1.RDS"))
```

This model information was saved with the `saveRDS` function:

```r
saveRDS(Mod_RT1, here("models", "Mod_RT1.RDS"))
```

.footnote[Notice that I use the package `here` allowing me to access files anywhere in the RStudio Projects folder using the `here( )` command (no fuss with paths on your computer)]

---
# MCMC = sampling

.left-column[
MCMC results will differ each time you run them

<br>

Of course, not by much if your model is good!

<br>

So, what if you want to be reproducible?
]

.right-column[

```r
Mod_RT1 <- brm(
  RT ~ 1, # We only model an intercept
  data = DataRT,
  backend = "cmdstanr",
*  seed = 1975 # make sure we get the same results each time :-)
)
```
]

---
## Good old `summary( )` function

```r
summary(Mod_RT1)
```

```
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: RT ~ 1 
##    Data: DataRT (Number of observations: 10) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
*## Intercept    55.10      2.82    49.68    60.90 1.00     2000     2180
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
*## sigma     9.47      2.32     6.17    15.08 1.00     2445     2318
## 
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

---
## Samples & chains?
```
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: RT ~ 1 
##    Data: DataRT (Number of observations: 10) 
*## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
*##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    55.10      2.82    49.68    60.90 1.00     2000     2180
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     9.47      2.32     6.17    15.08 1.00     2445     2318
## 
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

<i>By default `brms` sets .SW-greenD[4 chains of 2000 iterations] of which .SW-greenD[1000 iterations/chain are warm-up] </i>

---
## Samples & chains: what are 'burn-in' iterations

.center2[
<img src="mcmc_fases_p1.png" width="100%" height="100%" />
]

---
## Samples & chains: what are 'burn-in' iterations

.center2[
<img src="mcmc_fases_p2.png" width="100%" height="100%" />
]

---
## Samples & chains: what are 'burn-in' iterations

.center2[
<img src="mcmc_fases_p3.png" width="100%" height="100%" />
]

---
## Good old `plot( )` function

.pull-left[

```r
plot(Mod_RT1)
```

<img src="Part1_files/figure-html/unnamed-chunk-44-1.png" width="432" />
]

.pull-right[
.UA-red[Left panel] the posterior distributions (<i>later we will see other more informative ways to plot the information in the posterior</i>)

.UA-red[Right panel] the convergence of each parameter `\(\rightarrow\)` should look like a caterpillar

<img src="catterpillar.jpeg" width="50%" height="50%" />
]

---
## Model Convergence

.pull-left[
<img src="Vethari_paper.jpg" width="99%" height="99%" />
]

.pull-right[
- `\(\widehat R\)` < 1.01 for each parameter estimate
- at least 4 chains are recommended
- Effective Sample Size (ESS) > 400 to rely on `\(\widehat R\)`
]

---
##
Let's inspect the output again

```
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: RT ~ 1 
##    Data: DataRT (Number of observations: 10) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    55.10      2.82    49.68    60.90 1.00     2000     2180
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     9.47      2.32     6.17    15.08 1.00     2445     2318
## 
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

---
class: inverse-green, center, middle
name: example_data

## 5. Let's get practical: the example dataset

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## Judging 'argumentative texts'

---
## Procedure

- 26 high school teachers (Dutch) voluntarily participated
- each did 10 comparisons of 2 argumentative texts from 10th graders
- 3 batches of comparisons with random allocation of judges to one of the batches
- all batches had a similar composition of comparisons regarding the characteristics of the pairs; the actual pairs, however, were not the same
- Tobii TX300 dark pupil eye-tracker with a 23-inch TFT monitor (max. resolution of 1920 x 1080 pixels)
- data sampled binocularly at 300 Hz

---
## AOIs

---
## The data
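
As a sketch of what the data look like, the durations can be loaded and glimpsed like this. The file name `Durations_CJ.Rdata` in the repo's `data` folder is my assumption, mirroring how `Data_CJ2.Rdata` is loaded later:

```r
# Assumed file name, analogous to the later load(here("data", "Data_CJ2.Rdata"))
library(here)
library(tidyverse)

load(here("data", "Durations_CJ.Rdata"))

# One row per participant x comparison x AOI,
# with durations in seconds and on the log scale
Durations_CJ %>%
  select(Partic, Compar, AOI_Type, Dur_Seconds, Dur_Log) %>%
  glimpse()
```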
---
class: inverse-green, center, middle
name: add_complexity

## 6. Let's slightly increase complexity

.footnote[[<i> get me back to the topic slide</i>](#topicslide)]

---
## Question 1

.UA-red[<i><b> How much time do people spend in a text (in total) when comparing two texts? </i></b>]

A simple model could look like:

$$ log(Time_i) \sim N(\mu,\sigma_e) $$

with `\(Time_i\)` being the total time for each combination of participant, comparison and AOI_Type

.small[

```r
Durations_CJ %>%
  group_by(Partic, Compar, AOI_Type) %>%
  summarize(
    Dur_Seconds = mean(Dur_Seconds),
    Dur_Log = mean(Dur_Log)
  ) %>%
  head(5)
```

```
## # A tibble: 5 x 5
## # Groups:   Partic, Compar [3]
##   Partic Compar      AOI_Type       Dur_Seconds Dur_Log
##    <dbl> <chr>       <chr>                <dbl>   <dbl>
## 1      1 DHH 214-425 AOI_LEFT_TEXT         7.12    6.61
## 2      1 DHH 214-425 AOI_RIGHT_TEXT       15.3     8.55
## 3      1 DLL 508- 51 AOI_LEFT_TEXT         4.61    7.31
## 4      1 DLL 508- 51 AOI_RIGHT_TEXT        3.28    6.63
## 5      1 DMH 414-609 AOI_LEFT_TEXT         1.93    6.82
```
]

---
## Question 1

.UA-red[<i><b> How much time do people spend in a text (in total) when comparing two texts?
</i></b>] A simple model could look like: $$ log(Time_i) \sim N(\mu,\sigma_e) $$ ---- .UA-red[<b> .left-column[.Super[π] ] .right-column[ - Try it yourself and run the model - Inspect convergence - Interpret the results </b> ] ] --- ## Question 1 First we make a dataset ```r Data_CJ1 <- Durations_CJ %>% group_by(Partic, Compar, AOI_Type) %>% summarize( Dur_Seconds = mean(Dur_Seconds), Dur_Log = mean(Dur_Log) ) ``` Then we can run the model ```r Model1_CJ <- brm( Dur_Log ~ 1, data = Data_CJ1, family = "gaussian", backend = "cmdstanr", seed = 1975 ) ``` --- ## Question 1 ```r Model1_CJ <- readRDS(here("models", "Model1_CJ.RDS")) summary(Model1_CJ) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: Dur_Log ~ 1 ## Data: Data_CJ1 (Number of observations: 499) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 7.65 0.06 7.54 7.76 1.00 3411 2340 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 1.23 0.04 1.15 1.31 1.00 3446 2663 ## ## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ??? Convergence is clearly ok! 
--- ## Question 1 ```r plot(Model1_CJ) ``` <img src="Part1_files/figure-html/unnamed-chunk-53-1.png" width="432" /> --- ## Further exploration of the posterior Pull the 4000 samples of our parameter values from the posterior with .UA-red[`posterior_samples()`] ```r posterior_samples(Model1_CJ) %>% head(10) ``` ``` ## b_Intercept sigma Intercept lp__ ## 1 7.64887 1.15629 7.64887 -812.378 ## 2 7.63948 1.21292 7.63948 -810.850 ## 3 7.67498 1.22842 7.67498 -810.910 ## 4 7.59150 1.23610 7.59150 -811.445 ## 5 7.68766 1.24512 7.68766 -811.191 ## 6 7.59683 1.22045 7.59683 -811.293 ## 7 7.69482 1.24441 7.69482 -811.274 ## 8 7.61122 1.19426 7.61122 -811.343 ## 9 7.62011 1.16751 7.62011 -812.042 ## 10 7.61525 1.17405 7.61525 -811.850 ``` .footnote[*with the `head()` function we ask to only show the first 10 rows] --- ## Time to unpack the plotting power of `ggplot2` .center2[ First we set a custom theme that makes our plots look 'better' ```r theme_set(theme_linedraw() + theme(text = element_text(family = "Times"), panel.grid = element_blank())) ``` ] --- ## Time to unpack the plotting power of `ggplot2` .pull-left[ Package .UA-red[`tidybayes`] is a wrapper including functions from .UA-red[`ggdist`] that open a lot of plotting options .footnotesize[ ```r posterior_samples(Model1_CJ) %>% # Get draws of the posterior select(b_Intercept) %>% # Only select our Intercept pivot_longer(everything()) %>% # Rearrange the result to plot ggplot(aes(x = value, y = name)) + # Let's plot... * stat_pointinterval(.width = .89) + # 89% CI xlab("marginal posterior") # Set our x-axis label ``` ] ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-57-1.png" width="504" /> ] --- ## Your turn .center2[ .UA-red[<b> .left-column[.Super[π] ] - Make a similar plot for the estimate of the standard deviation - With a CI of 80% - Interpret the results </b> ] ] --- ## Solution... 
.pull-left[
.footnotesize[

```r
posterior_samples(Model1_CJ) %>%       # Get draws of the posterior
*  select(sigma) %>%                   # Only select sigma
  pivot_longer(everything()) %>%       # Rearrange the result to plot
  ggplot(aes(x = value, y = name)) +   # Let's plot...
*  stat_pointinterval(.width = .80) +  # 80% CI
  xlab("marginal posterior")           # Set our x-axis label
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-59-1.png" width="504" />
]

---
## Another type of plot with some more information in it

Let's use .UA-red[`stat_interval()`] from the `ggdist` package

.pull-left[
.scriptsize[

```r
posterior_samples(Model1_CJ) %>%
  select(b_Intercept) %>%
  ggplot(aes(x = b_Intercept, y = 0)) +
*  stat_interval(
*    .width = c(.95,.90,.50) # The CI's to plot
*  ) +
  xlab("marginal posterior") +
  scale_y_continuous( # No y-axis information
    NULL,
    breaks = NULL
  ) +
  scale_color_brewer()
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-61-1.png" width="432" />
]

---
## Your turn

.center2[
.UA-red[<b>
.left-column[.Super[👉] ]
- Make a similar plot for the estimate of the standard deviation
- With a CI of 50%, 80%, 89%, 95%
</b>
]
]

---
## Solution

.pull-left[
.scriptsize[

```r
posterior_samples(Model1_CJ) %>%
  select(sigma) %>%
  ggplot(aes(x = sigma, y = 0)) +
*  stat_interval(
*    .width = c(.95, .89,.80,.50) # The CI's to plot
*  ) +
  xlab("marginal posterior") +
  scale_y_continuous(
    NULL,
    breaks = NULL
  ) +
  scale_color_brewer()
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-63-1.png" width="432" />
]

---
## We can also integrate plots for mu and sigma

And we introduce the .UA-red[`stat_halfeye()`] function of `ggdist`

.pull-left[
.footnotesize[

```r
names <- c("mu","sigma")

posterior_samples(Model1_CJ) %>%
  select(b_Intercept,sigma) %>%
  set_names(names) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = value, y = 0)) +
*  stat_halfeye(.width = .89,
               normalize = "panels") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name,
             scales = "free")
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-65-1.png" width="504" />
]

---
## Your turn

.center2[
.UA-red[<b>
.left-column[.Super[👉] ]
- Try to replicate the graph
- But with other CI's (e.g. for 50%, 90% and 95%)
</b>
]
]

---
## Solution

.pull-left[
.footnotesize[

```r
names <- c("mu","sigma")

posterior_samples(Model1_CJ) %>%
  select(b_Intercept,sigma) %>%
  set_names(names) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = value, y = 0)) +
*  stat_halfeye(
*    .width = c(.50, .90, .95),
    normalize = "panels") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name,
             scales = "free")
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-67-1.png" width="504" />
]

---
## How to get rid of the log?

We can simply do calculations with our posterior samples!

So, we can just use `exp()`:

.pull-left[
.footnotesize[

```r
names <- c("mu_log_scale","mu_sec", "sigma")

posterior_samples(Model1_CJ) %>%
  select(b_Intercept,sigma) %>%
*  mutate(
*    mu_sec = exp(b_Intercept)
*  ) %>%
  # we do the select() again to reorder the columns
  # this is not super necessary, but then look out
  # with setting the names of the columns!
*  select(b_Intercept, mu_sec, sigma) %>%
  set_names(names) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = value, y = 0)) +
  stat_halfeye(.width = c(.5, .9, .95),
               normalize = "panels") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("marginal posterior") +
  facet_wrap(~name,
             scales = "free")
```
]
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-69-1.png" width="504" />
]

---
## Question 2

.UA-red[<i><b> Does the time that people spend reading a text differ for both texts if we control for the length of the text?
</i></b>]

A model could look like:

`$$\begin{aligned}
& log(Time_{i}) \sim N(\mu_{i},\sigma_{e})\\
& \mu_{i} = \beta_0 + \beta_1*\text{Right_Text}_{i} + \beta_2*\text{Text_Length}_{i} \\
\end{aligned}$$`

```r
load(here("data", "Data_CJ2.Rdata"))
head(Data_CJ2, 6)
```

```
## # A tibble: 6 x 7
## # Groups:   Partic, Compar [3]
##   Partic Compar      AOI_Type       Dur_Seconds Dur_Log Right_Text Text_Length
##    <dbl> <chr>       <chr>                <dbl>   <dbl>      <dbl>       <dbl>
## 1      1 DHH 214-425 AOI_LEFT_TEXT         7.12    6.61          0        2.72
## 2      1 DHH 214-425 AOI_RIGHT_TEXT       15.3     8.55          1        2.48
## 3      1 DLL 508- 51 AOI_LEFT_TEXT         4.61    7.31          0        3.11
## 4      1 DLL 508- 51 AOI_RIGHT_TEXT        3.28    6.63          1        3.13
## 5      1 DMH 414-609 AOI_LEFT_TEXT         1.93    6.82          0        1.78
## 6      1 DMH 414-609 AOI_RIGHT_TEXT        1.53    6.33          1        3.89
```

---
## Your turn

.center2[
.UA-red[<b>
.left-column[.Super[👉] ]
- Try to run the model with `brms`
- Check model convergence
- Make one plot to visualise the posteriors for the regression parameters
</b>
]
]

---
## Solution

```r
Model2_CJ <- brm(
  Dur_Log ~ 1 + Text_Length + Right_Text,
  data = Data_CJ2,
  family = "gaussian",
  backend = "cmdstanr",
  seed = 1975
)
```

---
## Solution

Visual checking of convergence

.pull-left[

```r
plot(Model2_CJ)
```
]

.pull-right[
<img src="Part1_files/figure-html/unnamed-chunk-74-1.png" width="432" />
]

---
## Solution: check output

.small[

```r
summary(Model2_CJ)
```

```
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: Dur_Log ~ 1 + Text_Length + Right_Text 
##    Data: Data_CJ2 (Number of observations: 499) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept       6.34      0.16     6.02     6.66 1.00     4980     3109
## Text_Length     0.50      0.05     0.40     0.59 1.00     4968     3327
## Right_Text     -0.33      0.10    -0.52    -0.14 1.00     4424     2890
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.11      0.04     1.04     1.18 1.00     4759
3210 ## ## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- ## Solution: create plot on posteriors for beta's .pull-left[ ```r posterior_samples(Model2_CJ) %>% select( b_Intercept, b_Text_Length, b_Right_Text ) %>% pivot_longer(everything()) %>% ggplot(aes(x = value, y = 0)) + stat_halfeye( .width = c(.5, .9, .95), normalize = "panels") + scale_y_continuous(NULL, breaks = NULL) + xlab("marginal posterior") + facet_wrap(~name, scales = "free") ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-77-1.png" width="432" /> ] --- ## Let's introduce `mcmc_areas()` from the `bayesplot` .pull-left[ ```r mcmc_areas( Model2_CJ, pars = c("b_Text_Length", "b_Right_Text" ), prob = 0.89 ) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-79-1.png" width="432" /> ] --- ## or `mcmc_areas_ridges` from `bayesplot` .pull-left[ ```r mcmc_areas_ridges( Model2_CJ, pars = c("b_Text_Length", "b_Right_Text" ), prob = 0.89 ) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-81-1.png" width="432" /> ] --- ## or `mcmc_intervals` from `bayesplot` & let's play around with the `color_scheme_set()` that works with all `bayesplot` functions... .pull-left[ ```r color_scheme_set("red") mcmc_intervals( Model2_CJ, pars = c("b_Text_Length", "b_Right_Text"), prob = c(0.5, 0.89, 0.95) ) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-83-1.png" width="432" /> ] --- ## CI's (ETI's) vs. HDI's <img src="Part1_files/figure-html/unnamed-chunk-84-1.png" width="504" /> --- ## Alternative for p-values? --- ## Alternative for p-values? 
.pull-left[ <img src="Kruchke_2018.jpg" width="99%" height="99%" /> ] .pull-right[ <b>.UA-red[ROPE] </b>: Region Of Practical Equivalence .SW-greenD[ <i> > Set a region of parameter values that can be considered equivalent to having no effect </i> ] - for standardized effect sizes the advised default is the range -0.1 to 0.1 - this equals half of a small effect size (Cohen, 1988) - all parameter values in that range are set equal to no effect ] --- ## Alternative for p-values? .pull-left[ <img src="Kruchke_2018.jpg" width="99%" height="99%" /> ] .pull-right[ <b>.UA-red[ROPE + HDI rule] </b> - 95% of HDI out of ROPE `\(\rightarrow\)` `\(H_0\)` rejected - 95% of HDI all in ROPE `\(\rightarrow\)` `\(H_0\)` accepted - 95% of HDI partially out of ROPE `\(\rightarrow\)` undecided ] --- ## HDI + ROPE with `brms` results We can use the `equivalence_test()` function of the `bayestestR` package: ```r equivalence_test(Model2_CJ) ``` ``` ## # Test for Practical Equivalence ## ## ROPE: [-0.12 0.12] ## ## Parameter | H0 | inside ROPE | 95% HDI ## ---------------------------------------------------- ## Intercept | Rejected | 0.00 % | [ 5.99 6.63] ## Text_Length | Rejected | 0.00 % | [ 0.41 0.59] ## Right_Text | Rejected | 0.00 % | [-0.52 -0.14] ``` --- ## Or the `parameters` package This package has the nice flexible `model_parameters()` function!
.footnotesize[ ```r posterior_samples(Model2_CJ) %>% model_parameters(rope_range=c(-0.12,0.12)) ``` ``` ## # Fixed Effects ## ## Parameter | Median | 89% CI | pd | % in ROPE ## ----------------------------------------------------------------- ## b_Intercept | 6.34 | [ 6.07, 6.59] | 100% | 0% ## b_Text_Length | 0.50 | [ 0.42, 0.57] | 100% | 0% ## b_Right_Text | -0.34 | [ -0.49, -0.17] | 99.95% | 1.43% ## sigma | 1.11 | [ 1.05, 1.16] | 100% | 0% ## Intercept | 7.65 | [ 7.57, 7.73] | 100% | 0% ## lp__ | -761.62 | [-763.69, -760.08] | 100% | 0% ``` ] `pd` stands for <i>.SW-greenD[probability of direction]</i> --- ## Now it's getting more fun We can combine `model_parameters()` with `plot()` because the `see` package deals with the output! .pull-left[ ```r posterior_samples(Model2_CJ) %>% select( b_Right_Text, b_Text_Length ) %>% model_parameters( ) %>% plot(show_labels = TRUE, size_text = 3) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-90-1.png" width="432" /> ] --- ## Now it's getting more fun And another one ... We can combine `equivalence_test` with `plot()` as well thanks to the `see` package! .pull-left[ ```r equivalence_test(Model2_CJ) %>% plot( ) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-92-1.png" width="432" /> ] --- <i>"Hey! You promised mixed effects models"</i> ## Question 3 .UA-red[<i><b> Does the time that people spend reading a text differ between the two texts if we control for the length of the text? </i></b>] In the previous analyses we neglected that we have multiple observations per person!
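That nesting is easy to verify by counting rows per participant. A minimal sketch of the check with `dplyr` (shown here on a toy tibble with made-up values so it runs standalone; with the workshop data you would pipe `Data_CJ2` into the same `count()` call):

```r
library(dplyr)

# Toy data mimicking the structure of Data_CJ2 (values are made up)
toy <- tibble(
  Partic  = c(1, 1, 1, 2, 2, 3),            # participant id
  Dur_Log = c(6.6, 8.5, 7.3, 6.6, 6.3, 6.8) # log reading time
)

# count() returns one row per participant with the number of observations n
toy %>% count(Partic)
```

`Data_CJ2 %>% count(Partic)` shows how many of the 499 observations each of the 26 participants contributes: exactly the repeated-measures structure that a random intercept accounts for.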
A mixed effects model could look like: `$$\begin{aligned} & log(Time_{ij}) \sim N(\mu_{ij},\sigma_{e})\\ & \mu_{ij} = \beta_0 + \beta_1*\text{Right_Text}_{ij} + \beta_2*\text{Text_Length}_{ij} + \mu_j\\ & \mu_j \sim N(0,\sigma_{u}) \end{aligned}$$` --- ## Adding a random term to the model ```r Model3_CJ <- brm( * Dur_Log ~ Text_Length + Right_Text + (1|Partic), data = Data_CJ2, family = "gaussian", backend = "cmdstanr", seed = 1975 ) ``` ---- .UA-red[<b> .left-column[.Super[π] ] - Estimate the model - Check the output - Interpret the results </b> ] --- ## Solution .footnotesize[ ```r Model3_CJ <- readRDS(here("models", "Model3_CJ.RDS")) summary(Model3_CJ) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: Dur_Log ~ 1 + Text_Length + Right_Text + (1 | Partic) ## Data: Data_CJ2 (Number of observations: 499) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## *## Group-Level Effects: *## ~Partic (Number of levels: 26) *## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS *## sd(Intercept) 0.62 0.11 0.45 0.87 1.00 1014 1218 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 6.63 0.19 6.25 7.01 1.00 1543 2156 ## Text_Length 0.39 0.04 0.31 0.48 1.00 6940 2934 ## Right_Text -0.35 0.08 -0.52 -0.19 1.00 6448 2957 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.95 0.03 0.89 1.01 1.00 5550 2644 ## ## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- ## What do we get? If we want to plot information on the variances (or sd) we have to know the names of the elements: .footnotesize[ ```r posterior_samples(Model3_CJ) %>% str() ``` ``` ## 'data.frame': 4000 obs.
of 59 variables: ## $ b_Intercept : num 6.63 6.79 6.66 6.74 6.57 ... ## $ b_Text_Length : num 0.419 0.396 0.404 0.347 0.373 ... ## $ b_Right_Text : num -0.309 -0.403 -0.288 -0.258 -0.286 ... *## $ sd_Partic__Intercept : num 0.505 0.531 0.537 0.637 0.589 ... ## $ sigma : num 0.928 0.937 0.939 0.936 0.874 ... ## $ Intercept : num 7.73 7.78 7.72 7.65 7.54 ... *## $ r_Partic[1,Intercept] : num -0.141 -0.673 -0.112 0.07 -0.57 ... *## $ r_Partic[2,Intercept] : num 0.2899 -0.0303 0.5392 0.2669 0.5086 ... *## $ r_Partic[3,Intercept] : num 0.619 0.767 0.996 1.152 0.998 ... *## $ r_Partic[4,Intercept] : num -0.3145 0.0216 -0.3654 0.1749 0.0874 ... *## $ r_Partic[5,Intercept] : num -0.3268 -0.0846 0.1697 -0.0618 -0.1935 ... *## $ r_Partic[6,Intercept] : num 0.492 0.142 0.256 0.332 0.393 ... *## $ r_Partic[7,Intercept] : num -0.617 -0.777 -0.576 -0.444 -0.248 ... *## $ r_Partic[8,Intercept] : num 0.1582 -0.1739 -0.0464 0.0666 0.1937 ... *## $ r_Partic[9,Intercept] : num 0.1173 -0.4262 -0.2635 -0.4233 0.0362 ... *## $ r_Partic[10,Intercept]: num -0.0433 0.2303 -0.0641 0.3779 0.061 ... *## $ r_Partic[11,Intercept]: num 0.0683 0.4425 0.1351 0.0419 0.4656 ... *## $ r_Partic[12,Intercept]: num -0.0778 0.5016 0.1428 0.1423 0.1359 ... *## $ r_Partic[13,Intercept]: num 0.533 0.211 0.403 0.8 0.465 ... *## $ r_Partic[14,Intercept]: num -0.0189 0.7486 0.2406 0.8047 0.8817 ... *## $ r_Partic[15,Intercept]: num 0.552 0.526 0.492 0.843 0.495 ... *## $ r_Partic[16,Intercept]: num -1.06 -1.61 -1.61 -1.53 -1.62 ... *## $ r_Partic[17,Intercept]: num 0.293 -0.11 -0.366 0.151 0.181 ... *## $ r_Partic[18,Intercept]: num 0.358 0.562 0.7 0.415 0.357 ... *## $ r_Partic[19,Intercept]: num -0.898 -0.246 -0.3 -0.456 -0.226 ... *## $ r_Partic[20,Intercept]: num -0.892 -0.2 -0.697 -0.315 -0.5 ... *## $ r_Partic[21,Intercept]: num 0.3688 -0.1558 0.099 0.6076 0.0982 ... ## $ r_Partic[22,Intercept]: num 0.2282 -0.4344 -0.1846 -0.0366 -0.0649 ... 
## $ r_Partic[23,Intercept]: num 0.749 0.76 0.743 0.904 0.707 ... ## $ r_Partic[24,Intercept]: num -0.792 -0.671 -0.647 -0.726 -0.701 ... ## $ r_Partic[25,Intercept]: num -0.302 -0.384 -0.14 -0.312 -0.306 ... ## $ r_Partic[26,Intercept]: num -0.599 -1.12 -0.663 -1.047 -1.045 ... ## $ lp__ : num -726 -730 -726 -722 -725 ... ## $ z_1[1,1] : num -0.28 -1.267 -0.208 0.11 -0.968 ... ## $ z_1[1,2] : num 0.5736 -0.0571 1.0042 0.4188 0.8632 ... ## $ z_1[1,3] : num 1.22 1.44 1.86 1.81 1.69 ... ## $ z_1[1,4] : num -0.6223 0.0406 -0.6806 0.2744 0.1483 ... ## $ z_1[1,5] : num -0.647 -0.159 0.316 -0.097 -0.329 ... ## $ z_1[1,6] : num 0.974 0.267 0.476 0.52 0.667 ... ## $ z_1[1,7] : num -1.221 -1.464 -1.072 -0.697 -0.421 ... ## $ z_1[1,8] : num 0.3131 -0.3275 -0.0864 0.1046 0.3287 ... ## $ z_1[1,9] : num 0.2322 -0.8027 -0.4908 -0.6641 0.0614 ... ## $ z_1[1,10] : num -0.0858 0.4337 -0.1193 0.5929 0.1036 ... ## $ z_1[1,11] : num 0.1351 0.8334 0.2516 0.0658 0.7903 ... ## $ z_1[1,12] : num -0.154 0.945 0.266 0.223 0.231 ... ## $ z_1[1,13] : num 1.055 0.398 0.751 1.255 0.788 ... ## $ z_1[1,14] : num -0.0374 1.4099 0.4482 1.2623 1.4965 ... ## $ z_1[1,15] : num 1.091 0.99 0.916 1.323 0.84 ... ## $ z_1[1,16] : num -2.1 -3.02 -3 -2.4 -2.75 ... ## $ z_1[1,17] : num 0.579 -0.208 -0.682 0.237 0.307 ... ## $ z_1[1,18] : num 0.708 1.059 1.304 0.651 0.606 ... ## $ z_1[1,19] : num -1.777 -0.463 -0.559 -0.715 -0.384 ... ## $ z_1[1,20] : num -1.765 -0.376 -1.298 -0.494 -0.848 ... ## $ z_1[1,21] : num 0.73 -0.293 0.184 0.953 0.167 ... ## $ z_1[1,22] : num 0.4516 -0.818 -0.3438 -0.0574 -0.1101 ... ## $ z_1[1,23] : num 1.48 1.43 1.38 1.42 1.2 ... ## $ z_1[1,24] : num -1.57 -1.26 -1.21 -1.14 -1.19 ... ## $ z_1[1,25] : num -0.598 -0.723 -0.261 -0.49 -0.52 ... ## $ z_1[1,26] : num -1.19 -2.11 -1.23 -1.64 -1.77 ... ``` ] --- # Your turn! .UA-red[<b> .left-column[.Super[π] ] - Make a plot to summarise the posterior for sigma & the sd for participants - If your quick! 
Play around with different options we saw </b> ] --- ## A possible solution: .pull-left[ ```r posterior_samples(Model3_CJ) %>% select( sigma, sd_Partic__Intercept ) %>% model_parameters( ) %>% plot(show_labels = TRUE, size_text = 3) ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-97-1.png" width="432" /> ] --- ## Make plot for random effects We can also make plots for all mu_j's (aka: the specific residuals for each participant) .pull-left[ ```r posterior_samples(Model3_CJ) %>% select( starts_with("r_Partic") ) %>% pivot_longer( starts_with("r_Partic") ) %>% mutate( mu_j = value ) %>% ggplot( aes(x = mu_j, y = reorder(name, mu_j)) ) + stat_pointinterval( point_interval = mean_qi, .width = .89, size = 1/6 ) + scale_y_discrete("") ``` ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-99-1.png" width="432" /> ] --- ## Make plot for random effects Remember! We can simply calculate with the posterior! So, we can plot the differences between our participants by using mu_j and the Intercept estimate (6.63). 
Sum them together and take the `exp()`. .pull-left[ .small[ ```r posterior_samples(Model3_CJ) %>% select( starts_with("r_Partic") ) %>% pivot_longer( starts_with("r_Partic") ) %>% mutate( * mu_j = exp(6.63 + value) ) %>% ggplot( aes(x = mu_j, y = reorder(name, mu_j)) ) + stat_pointinterval( point_interval = mean_qi, .width = .89, size = 1/6 ) + scale_y_discrete("") + scale_x_continuous("Reading time (in seconds)") ``` ] ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-101-1.png" width="432" /> ] --- ## Another great package: `sjPlot` .pull-left[ ```r library(sjPlot) tab_model(Model3_CJ) ``` ] .pull-right[ <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Dur Log</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Intercept</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.63</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.25 – 7.01</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Text Length</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.39</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.31 – 0.48</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top;
text-align:left; ">Right Text</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.34</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.52 – -0.19</td> </tr> <tr> <td colspan="3" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.90</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub> <sub>Partic</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.39</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.30</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N <sub>Partic</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">26</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">499</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; 
padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.133 / 0.405</td> </tr> </table> ] --- ## Combine models output in a table ```r tab_model(Model2_CJ, Model3_CJ) ``` <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Dur Log</th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Dur Log</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Intercept</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.34</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.02 – 6.66</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.63</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.25 – 7.01</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Text Length</td> <td style=" 
padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.50</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.40 – 0.59</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.39</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.31 – 0.48</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Right Text</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.34</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.52 – -0.14</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.34</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.52 – -0.19</td> </tr> <tr> <td colspan="5" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2"> </td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.90</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2"> </td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.39 <sub>Partic</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; 
padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2"> </td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.30</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2"> </td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">26 <sub>Partic</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">499</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">499</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">R<sup>2</sup> Bayes</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.186</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.133 / 0.405</td> </tr> </table> --- ## How to tweak the estimation? 
.SW-greenD[Set the number of chains; the number of iterations; the number of warm-up iterations] ```r Model3_CJ <- brm( Dur_Log ~ Text_Length + Right_Text + (1|Partic), data = Data_CJ2, family = "gaussian", backend = "cmdstanr", * iter = 3000, * chains = 8, * warmup = 1500, seed = 1975 ) ``` Tip: running fewer chains with more iterations per chain gives more precise parameter estimates, but keep several chains so that convergence diagnostics (Rhat) remain meaningful --- ## How to tweak the estimation? .SW-greenD[Increase efficiency] .small[ ```r Model3_CJ <- brm( Dur_Log ~ Text_Length + Right_Text + (1|Partic), data = Data_CJ2, family = "gaussian", backend = "cmdstanr", iter = 3000, chains = 8, warmup = 1500, cores = 4, seed = 1975 ) ``` ] Modern computers have multiple cores `\(\rightarrow\)` let them work simultaneously --- ## Question 4 <i> Hey! It's still the same question! </i> .UA-red[<i><b> Does the time that people spend reading a text differ between the two texts if we control for the length of the text? </i></b>] In the previous analyses we neglected that we have multiple observations per pair of texts! # Your turn! You have the power! .left-column[  ] .right-column[ .UA-red[<b> .left-column[.Super[π] ] - Do the necessary analyses - Make a plot to summarise the posterior for sigma & the sd's - If you're quick: play around with the different options we saw...
</b> ] ] --- ## Solution: ```r Model4_CJ <- brm( Dur_Log ~ 1 + Text_Length + Right_Text + (1|Partic) + (1|Compar), data = Data_CJ2, family = "gaussian", backend = "cmdstanr", cores = 4, seed = 1975 ) ``` --- .footnotesize[ ```r Model4_CJ <- readRDS(here("models", "Model4_CJ.RDS")) summary(Model4_CJ) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: Dur_Log ~ 1 + Text_Length + Right_Text + (1 | Partic) + (1 | Compar) ## Data: Data_CJ2 (Number of observations: 499) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~Compar (Number of levels: 30) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.14 0.07 0.01 0.28 1.00 1179 1770 ## ## ~Partic (Number of levels: 26) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.63 0.11 0.45 0.88 1.00 967 1194 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 6.63 0.19 6.23 6.99 1.00 1304 2184 ## Text_Length 0.39 0.04 0.31 0.48 1.00 4341 3143 ## Right_Text -0.35 0.09 -0.52 -0.18 1.00 4095 2984 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.94 0.03 0.88 1.00 1.00 3609 2829 ## ## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- ## But is it a good model? 
.center2[ Two complementary procedures: - posterior-predictive check - compare models with .SW-greenD[<i>leave-one-out cross-validation</i>] ] --- # Posterior-predictive check A visual check that can be performed with `pp_check()` from `brms`: .small[ ```r pp_check(Model4_CJ) ``` <img src="Part1_files/figure-html/unnamed-chunk-109-1.png" width="360" /> ] --- ## Model comparison with loo cross-validation `\(\sim\)` AIC or BIC in Frequentist statistics `\(\widehat{elpd}\)`: "expected log predictive density" (a higher `\(\widehat{elpd}\)` implies better model fit without being sensitive to overfitting!) .footnotesize[ ```r loo_Model2 <- loo(Model2_CJ) loo_Model3 <- loo(Model3_CJ) loo_Model4 <- loo(Model4_CJ) Comparison<- loo_compare(loo_Model2, loo_Model3, loo_Model4) print(Comparison, simplify = F) ``` ``` ## elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic ## Model4_CJ 0.0 0.0 -693.8 17.7 33.5 2.4 1387.7 ## Model3_CJ -0.4 1.5 -694.3 17.8 25.7 1.9 1388.5 ## Model2_CJ -67.0 11.2 -760.8 15.7 4.1 0.3 1521.7 ## se_looic ## Model4_CJ 35.3 ## Model3_CJ 35.6 ## Model2_CJ 31.3 ``` ] --- ## What about setting your own priors? `brms` has some default priors programmed (uninformative or weakly informative priors)! But: .UA-red[WAMBS] (when to <b>W</b>orry and how to <b>A</b>void <b>M</b>isuse of <b>B</b>ayesian <b>S</b>tatistics) (van de Schoot et al., 2021) <i> .SW-greenD[ > "Ensure the prior distributions and the model or likelihood are well understood and described in detail in the text. Prior-predictive checking can help identify any prior-data conflict" ] </i> --- ## What about setting your own priors? Let's first see which priors `brms` sets by default in the model we defined.
The `get_prior()` function shows them: .footnotesize[ ```r get_prior( Dur_Log ~ 1 + Text_Length + Right_Text + (1|Partic) + (1|Compar), data = Data_CJ2, family = "gaussian") ``` ``` ## prior class coef group resp dpar nlpar bound ## (flat) b ## (flat) b Right_Text ## (flat) b Text_Length ## student_t(3, 7.5, 2.5) Intercept ## student_t(3, 0, 2.5) sd ## student_t(3, 0, 2.5) sd Compar ## student_t(3, 0, 2.5) sd Intercept Compar ## student_t(3, 0, 2.5) sd Partic ## student_t(3, 0, 2.5) sd Intercept Partic ## student_t(3, 0, 2.5) sigma ## source ## default ## (vectorized) ## (vectorized) ## default ## default ## (vectorized) ## (vectorized) ## (vectorized) ## (vectorized) ## default ``` ] --- ## Visualise priors Imagine you want to visualise the prior for the `Intercept` & also visualise an alternative prior: `\(N(6,2)\)` .pull-left[ .footnotesize[ ```r library(metRology) # to use dt.scaled() # Create a vector with possible Dur_log values # and call this vector x x <- seq(1,20,by=.1) # probability density for our own prior # saved in the vector custom custom <- dnorm(x,6,2) # probability density for brms prior # saved in the vector brms_default brms_default <- dt.scaled(x,3,7.5,2.5) # Plot both tibble(x,custom,brms_default) %>% pivot_longer( c(custom,brms_default), names_to = "Prior") %>% ggplot( aes(x = x, y = value, lty = Prior)) + geom_line() + scale_x_continuous( "Dur_log",limits = c(1,20)) + scale_y_continuous( "probability density") ``` ] ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-113-1.png" width="432" /> ] --- ## Change priors in `brms` We use `set_prior()` to define priors and `prior = ...
` in the `brm()` part .footnotesize[ ```r # Create an object with the definition of the priors you want to define manually priors_M4 <- c( set_prior("normal(6,2)", class = "Intercept")) Model4b_CJ <- brm( Dur_Log ~ 1 + Text_Length + Right_Text + (1|Partic) + (1|Compar), data = Data_CJ2, family = "gaussian", backend = "cmdstanr", * prior = priors_M4, cores = 4, seed = 1975 ) ``` ] If we want to add other manually defined priors as well, we can do so like this: .footnotesize[ ```r priors_Manually <- c( set_prior("normal(6,2)", class = "Intercept"), set_prior("normal(0,3)", class = "b", coef = "Right_Text"), ... ) ``` ] --- ## Let's think about the effects we expect Bürkner: .SW-greenD[ <i> > The default prior for population-level effects (including monotonic and category specific effects) is an improper flat prior over the reals. </i> https://rdrr.io/cran/brms/man/set_prior.html ] If you know that the sd of `Dur_Log` equals 1.22, what could be a good prior (not too informative) for the effect of: - `Text_Length` - `Right_Text` --- ## Some possibilities Option 1: `\(N(0,2)\)` or Option 2: `\(N(0,4)\)` .pull-left[ .footnotesize[ ```r x <- seq(-8,8,by=.1) # probability densities for both option1 <- dnorm(x,0,2) option2 <- dnorm(x,0,4) # Plot both tibble(x, option1, option2) %>% pivot_longer( c(option1, option2), names_to = "Prior") %>% ggplot( aes(x = x, y = value, lty = Prior)) + geom_line() + scale_x_continuous( "Dur_log",limits = c(-8,8)) + scale_y_continuous( "probability density") ``` ] ] .pull-right[ <img src="Part1_files/figure-html/unnamed-chunk-117-1.png" width="432" /> ] --- ## Your turn! .UA-red[<b> .left-column[.Super[π] ] - Re-estimate Model 4 with a custom prior for both regression effects - Use `tab_model()` to summarize both models: this model and the model with the `brms` default priors - Big changes?
</b> ] Remember:  --- ## Possible solution ```r # Set the priors priors_M4b <- c( set_prior("normal(0,2)", class = "b", coef = "Right_Text"), set_prior("normal(0,2)", class = "b", coef = "Text_Length") ) # Estimate the model Model4b_CJ <- brm( Dur_Log ~ 1 + Text_Length + Right_Text + (1|Partic) + (1|Compar), data = Data_CJ2, family = "gaussian", prior = priors_M4b, backend = "cmdstanr", cores = 4, seed = 1975 ) # Summarize both models side by side tab_model(Model4_CJ, Model4b_CJ) ``` --- ## Possible solution <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Dur Log</th> <th colspan="2" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">Dur Log</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI (95%)</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Intercept</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.63</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.23 – 6.99</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">6.64</td> <td style=" padding:0.2cm; text-align:left; 
vertical-align:top; text-align:center; ">6.25 – 7.01</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Text Length</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.39</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.31 – 0.48</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.39</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.30 – 0.48</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">Right Text</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.35</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.52 – -0.18</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.35</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-0.51 – -0.18</td> </tr> <tr> <td colspan="5" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.88</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.88</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.02 <sub>Compar</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; 
padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.02 <sub>Compar</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;"></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.39 <sub>Partic</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.39 <sub>Partic</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.32</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.32</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">26 <sub>Partic</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">26 <sub>Partic</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;"></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">30 <sub>Compar</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">30 <sub>Compar</sub></td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; 
padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">499</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="2">499</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.133 / 0.418</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="2">0.132 / 0.417</td> </tr> </table> --- ## Where to get help? Now that you're ready to take your first steps in .SW-greenD[Bayesian Mixed Effects Models], there is a high probability that this will happen at some point: .pull-left[ <img src="computer-says-no.jpeg" width="99%" height="99%" /> ] .pull-right[ This is a great source to learn more about `Stan`, `brms` & `Hamiltonian MCMC`: https://discourse.mc-stan.org/t/divergent-transitions-a-primer/17099] --- ## To wrap up: why `brms`? - .UA-red[`brms`] is more versatile than `lme4`, allowing you also to estimate - multivariate models - finite mixture models - generalized mixed effects models - splines - ...
- .UA-red[`brms`] as a stepping stone to learn `Stan` --- class: inverse-blue, center, middle ## Some references & appendices .footnote[[<i> get me back to the topic slide</i>](#topicslide)] --- ## Some books <img src="cover_Lambert.jpg" width="90%" height="90%" /> --- ## Some books <img src="cover_rethinking2.jpg" width="90%" height="90%" /> --- ## Some free online books - Bayes Rules!: https://www.bayesrulesbook.com/ - Last Friday I learned about this book: https://vasishth.github.io/bayescogsci/book/ --- ## Rens van de Schoot In <i>Nature Reviews</i> <img src="Rens_NatureReviews.jpg" width="90%" height="90%" /> --- ## THE Podcast .center2[ If you like running - like I do - this could be a great companion on your run! https://www.learnbayesstats.com/ ] --- ## Site on creating the graphs .center2[ There are many blogs and websites that you can consult if you want to find out more about making graphs. <br> One that I often fall back on is: <br> http://mjskay.github.io/tidybayes/ ] --- ## Exemplary paper with an associated 'Walkthrough in R': https://rpubs.com/jensroes/765467 <img src="Paper_Jens.jpg" width="90%" height="90%" /> --- class: inverse, center, bottom background-image: url(marko-pekic-IpLa37Uj2Dw-unsplash.jpg) background-size: contain # Questions? Do not hesitate to contact me! sven.demaeyer@uantwerpen.be --- class: center, middle name: tweet_dplyr <blockquote class="twitter-tweet"><p lang="en" dir="ltr"><a href="https://twitter.com/hashtag/dplyr?src=hash&ref_src=twsrc%5Etfw">#dplyr</a> in 6 tweets. Using the starwars dataset.<br><br>1.
select() columns<a href="https://twitter.com/hashtag/tidyverse?src=hash&ref_src=twsrc%5Etfw">#tidyverse</a> <a href="https://twitter.com/hashtag/RStats?src=hash&ref_src=twsrc%5Etfw">#RStats</a> <a href="https://t.co/juGZbGeT9t">pic.twitter.com/juGZbGeT9t</a></p>— Hari (@illustratedbyte) <a href="https://twitter.com/illustratedbyte/status/1432416532391514113?ref_src=twsrc%5Etfw">August 30, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> [back to Prerequisites](#Prerequisites)
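--- ## Appendix: a minimal `brms` call The versatility of `brms` praised earlier builds on one familiar ingredient: `lme4`-style formula syntax. The sketch below is not part of the workshop materials; the data frame `ratings` and the variables `score`, `textlength` and `participant` are hypothetical stand-ins, so replace them with your own data. The `brm()` arguments shown (`formula`, `data`, `family`, `prior`, `cores`) are the ones used throughout the workshop.

```r
# A minimal sketch of a Bayesian mixed effects model with brms.
# NOTE: `ratings`, `score`, `textlength` and `participant` are
# hypothetical names used for illustration only.
library(brms)

fit <- brm(
  score ~ textlength + (1 | participant),   # lme4-style formula: random intercept per participant
  data   = ratings,
  family = gaussian(),                      # swap for e.g. bernoulli() or cumulative()
  prior  = prior(normal(0, 1), class = b),  # weakly informative prior on the slope
  cores  = 4                                # run the 4 chains in parallel
)

summary(fit)        # posterior summaries for fixed and random effects
```

Because `brms` compiles this model to `Stan` code behind the scenes (inspect it with `stancode(fit)`), the same call pattern extends to the multivariate, mixture, and spline models listed above.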