In this post we imagine a fictional country called Dansia to illustrate how multilevel regression and post-stratification can help us to make sense of data from non-representative surveys.
Previously I wrote about our efforts to monitor vaccine hesitancy through askNivi country polls. In our initial round in Kenya, we estimated that the probability is greater than 95% that at least 70% of people would be likely to get vaccinated if a vaccine were available for free today.
Skeptical readers might be thinking, “This is based on data from 486 people who responded to ads on Facebook. How is this representative of the country?” 🤔 Good question!
The gold standard approach for estimating characteristics of some population has long been probability-based representative sampling where every person in the target population has a chance of being invited to participate. In the global health world, the best examples come from the Demographic and Health Surveys (DHS) Program. DHS surveys involve between 5,000 and 30,000 households in a country and are typically conducted every 5 years. They cover dozens of topics and produce nationally-representative estimates through multistage cluster sampling. But there’s a catch: DHS surveys take 18 to 20 months (and a lot of money) to design, conduct door-to-door, and analyze. Despite their many strengths, DHS surveys are not an option when you need answers quickly.
A popular alternative is to contact a random sample of phone numbers and conduct the survey via SMS or phone. For instance, researchers might partner with a telecommunications company to get a list of all phone subscribers and randomly select numbers from this list, also known as a sampling frame. Another approach is called random digit dialing, where researchers randomly generate a list of phone numbers and try each number on the list (some will work, some won’t). While low cost and quick, poor coverage and high non-response can make these methods no better than non-representative convenience samples.1
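To make random digit dialing concrete, here's a minimal sketch of how a frame of candidate numbers might be generated in R. The "07" prefix and eight-digit suffix are made up for illustration; a real study would match the country's numbering plan.

# sketch: generate 500 random candidate phone numbers
# (the "07" prefix and 8-digit format are hypothetical)
set.seed(123)
rdd_list <- sprintf("07%08d", sample.int(1e8, size = 500) - 1)
head(rdd_list)

Many of these numbers won't be in service, which is part of why coverage and response rates suffer with this approach.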
Falling response rates in traditional public opinion polling, along with the growing ease of collecting data online, have led researchers to develop techniques for making convenience samples more representative (Wang et al. 2015). Before exploring one such approach, it will help to build some intuition about why just relying on convenience samples without any adjustment is usually a bad idea.
Imagine a country with 1 million residents. Let’s call it Dansia. Since we’re imagining things, let’s pretend that the good people of Dansia conducted a census last week of all 1 million residents and the data are ready to share. As luck would have it, the census included a question about our key dependent variable—frequency of mask wearing in public among people ages 15 to 64. This means we can calculate the true frequency in the population. And since we know the “real” answer, we can compare how different sampling approaches recover the true value.
library(tidyverse)
library(ggtext)
library(gtsummary)
library(gt)

# get the population data and custom ggplot function
load(url("https://www.dropbox.com/s/0u3lblqfv85hch3/pop_df.RData?dl=1"))

pop_df %>%
  dplyr::select(-ageCat) %>%
  mutate(gender = factor(gender,
                         levels = c("male", "female", "non-binary"),
                         labels = c("Male", "Female", "Non-binary"))) %>%
  tbl_summary(by = gender,
              label = c(age ~ "Mean age (SD)",
                        dv ~ "Wears mask"),
              statistic = list(all_continuous() ~ "{mean} ({sd})",
                               all_categorical() ~ "{p}%")) %>%
  add_overall() %>%
  bold_labels() %>%
  modify_footnote(update = everything() ~ NA) %>%
  as_gt() %>%
  tab_header(title = "Population of Dansia, Ages 15 to 64") %>%
  gt::tab_style(
    style = list(gt::cell_fill(color = "#4FBFA5"),
                 gt::cell_text(color = "white", weight = "bold")),
    locations = gt::cells_body(columns = vars(stat_0), rows = 3:6)
  )
Population of Dansia, Ages 15 to 64

| Characteristic | Overall, N = 681,732 | Male, N = 323,970 | Female, N = 323,758 | Non-binary, N = 34,004 |
|---|---|---|---|---|
| **Mean age (SD)** | 39 (14) | 39 (14) | 39 (14) | 39 (14) |
| **Wears mask** | | | | |
| never | 26% | 33% | 18% | 26% |
| rarely | 27% | 33% | 21% | 27% |
| sometimes | 22% | 19% | 26% | 22% |
| often | 25% | 16% | 35% | 25% |
Remember, we only know this distribution of mask wearing in the population because we’re pretending we have timely census results. This would not happen in reality (and the census is not going to ask about mask wearing). In reality, we’d need to infer information about the population from a sample of people we contact. So let’s do that. First we’ll draw a random sample, followed by a convenience sample.
The gold standard approach to sampling from this population would be to draw a random sample from a sampling frame. Let’s assume we could do this using a phone book and that we contacted 500 randomly selected people. Let’s also assume that we achieved 100% coverage and 0% non-response!
library(tidyverse)
library(ggtext)

# get the population data and custom ggplot function
load(url("https://www.dropbox.com/s/0u3lblqfv85hch3/pop_df.RData?dl=1"))

# draw random sample
set.seed(12345)
sample_rand <- pop_df %>%
  sample_n(500)

# calculate response frequency in the population
overall_pop <- pop_df %>%
  group_by(dv) %>%
  count() %>%
  ungroup() %>%
  mutate(p = n/sum(n)) %>%
  mutate(source = "true population") %>%
  dplyr::select(source, dv, p)

# calculate response frequency in the random sample
overall_sample_rand <- sample_rand %>%
  group_by(dv) %>%
  count() %>%
  ungroup() %>%
  mutate(p = n/sum(n)) %>%
  mutate(source = "random sample") %>%
  dplyr::select(source, dv, p)

# combine and plot (uses custom ggplot function that comes with data)
overall_pop %>%
  bind_rows(overall_sample_rand) %>%
  mrp_plot(., plotTitle = "Comparing a <span style = 'color:#005BE7;'>random sample</span> to the <span style = 'color:#4FBFA5;'>true population</span>",
           annLabel = "Random sample is\nclose to the true value",
           ann_x = 3.3, ann_y = 0.10, annColor = "#005BE7",
           arrow_x_start = 3.6, arrow_y_start = 0.10,
           arrow_x_end = 4, arrow_y_end = 0.23, arrowColor = "#005BE7",
           fills = c("#005BE7", "#4FBFA5"), colors = c("#005BE7", "#4FBFA5"))
As you can see in Figure 1, our random sample gets pretty close to the true population values. Of course this is just one possible sample of 500 people from Dansia, but it illustrates the point that random samples are usually great (well, at least when coverage and response rates are high).
Now let’s pull together a convenience sample and see how it compares to the population. “Convenience” means that we take who we get. Suppose we posted an advertisement online for people ages 15 to 64 to complete our survey. Some people in this age range would see the ad, most would not (poor coverage). And as it goes with ads, some people who see it will click, but most will not (high non-response).
# draw a convenience sample that is mostly young men
# ----------------------------------------------------
# 50 older men
sample_obs_m1 <- pop_df %>%
  filter(gender=="male") %>%
  filter(age>35) %>%
  sample_n(50)

# 400 younger men
sample_obs_m2 <- pop_df %>%
  filter(gender=="male") %>%
  filter(age<=35) %>%
  sample_n(400)

# 45 women
sample_obs_f <- pop_df %>%
  filter(gender=="female") %>%
  sample_n(45)

# 5 individuals who identified as non-binary
sample_obs_n <- pop_df %>%
  filter(gender=="non-binary") %>%
  sample_n(5)

# combine to make a sample of 500
sample_obs <- sample_obs_f %>%
  bind_rows(sample_obs_m1) %>%
  bind_rows(sample_obs_m2) %>%
  bind_rows(sample_obs_n)
tbl_pop <- pop_df %>%
  dplyr::select(-ageCat, -dv) %>%
  mutate(gender = factor(gender,
                         levels = c("male", "female", "non-binary"),
                         labels = c("Male", "Female", "Non-binary"))) %>%
  tbl_summary(label = c(age ~ "Mean age (SD)",
                        gender ~ "Gender"),
              statistic = list(all_continuous() ~ "{mean} ({sd})",
                               all_categorical() ~ "{p}%")) %>%
  bold_labels() %>%
  modify_footnote(update = everything() ~ NA)

tbl_rand <- sample_rand %>%
  dplyr::select(-ageCat, -dv) %>%
  mutate(gender = factor(gender,
                         levels = c("male", "female", "non-binary"),
                         labels = c("Male", "Female", "Non-binary"))) %>%
  tbl_summary(label = c(age ~ "Mean age (SD)",
                        gender ~ "Gender"),
              statistic = list(all_continuous() ~ "{mean} ({sd})",
                               all_categorical() ~ "{p}%")) %>%
  bold_labels() %>%
  modify_footnote(update = everything() ~ NA)

tbl_obs <- sample_obs %>%
  dplyr::select(-ageCat, -dv) %>%
  mutate(gender = factor(gender,
                         levels = c("male", "female", "non-binary"),
                         labels = c("Male", "Female", "Non-binary"))) %>%
  tbl_summary(label = c(age ~ "Mean age (SD)",
                        gender ~ "Gender"),
              statistic = list(all_continuous() ~ "{mean} ({sd})",
                               all_categorical() ~ "{p}%")) %>%
  bold_labels() %>%
  modify_footnote(update = everything() ~ NA)

tbl_merge(list(tbl_pop, tbl_rand, tbl_obs),
          tab_spanner = c("**Population**", "**Random Sample**",
                          "**Convenience Sample**"))
| Characteristic | Population, N = 681,732 | Random Sample, N = 500 | Convenience Sample, N = 500 |
|---|---|---|---|
| **Gender** | | | |
| Male | 48% | 49% | 90% |
| Female | 47% | 48% | 9.0% |
| Non-binary | 5.0% | 3.4% | 1.0% |
| **Mean age (SD)** | 39 (14) | 37 (14) | 29 (11) |
Our imaginary ad must have appealed more to younger, male audiences. Whereas males make up about half of the population and random sample, they are 90% of the convenience sample!
Now let’s estimate mask wearing in Dansia from this convenience sample that we know is younger and more male on average than the population.
# calculate response frequency in the convenience sample
overall_sample_obs <- sample_obs %>%
  group_by(dv) %>%
  count() %>%
  ungroup() %>%
  mutate(p = n/sum(n)) %>%
  mutate(source = "biased sample, unadjusted") %>%
  dplyr::select(source, dv, p)

# combine and plot
overall_pop %>%
  bind_rows(overall_sample_rand) %>%
  bind_rows(overall_sample_obs) %>%
  mrp_plot(., plotTitle = "Comparing a <span style = 'color:#F17DB1;'>convenience sample</span> to the <span style = 'color:#4FBFA5;'>true population</span>",
           annLabel = "Convenience sample is\nfar from the true value",
           ann_x = 3.3, ann_y = 0.05, annColor = "#F17DB1",
           arrow_x_start = 3.6, arrow_y_start = 0.05,
           arrow_x_end = 4, arrow_y_end = 0.11, arrowColor = "#F17DB1",
           fills = c("white", "#005BE7", "#4FBFA5"),
           colors = c("#F17DB1", "#005BE7", "#4FBFA5"))
Figure 2 is a bit depressing for #teamConvenience. The clue to why is in the first table (remember our fake census?): men in the population—particularly young men—endorsed mask wearing less often than women. Since our convenience sample skews male and younger, these anti-mask views are over-represented in our dataset. If we stopped here, we’d come to the wrong conclusion about the frequency of mask wearing in the population.
This, friends, is why convenience samples are hard to interpret when the goal is to make some inference about a target population.
But all is not lost. Take heart in this example from Wang et al. (2015) (pdf).
Andrew Gelman2 and colleagues ran daily voter polls on the Xbox gaming platform in the run-up to the 2012 U.S. Presidential election. This was an opt-in poll. Every day, gamers were presented with 3 to 5 questions, including, “If the election were held today, who would you vote for?” The first time a person participated, they were also asked to answer several demographic questions.
Like our fake example from Dansia, this real sample skewed male and young compared to U.S. voters. Men represented 93% of the Xbox sample, but only 47% of voters. Very non-representative.
But in a very different twist, the answer Wang et al. (2015) reached with this dataset lined up nicely with representative polls. How did they do it?3 Mister P.
Monica Alexander gives a great overview of MRP in this post. I used her code to get started, so consider popping over there to learn from a card-carrying demographer.
When you have non-representative samples like the ones from Dansia (our fictional country) or Wang et al. (2015)’s Xbox survey, one approach to making the data more representative is to reweight—that is, post-stratify—the results by the known population weights. All you need is a census or a representative survey that includes the variables you want to use for post-stratification.
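In equation form, the post-stratified estimate is just a population-weighted average of the cell-level estimates:

$$\hat{\theta}^{PS} = \frac{\sum_{j=1}^{J} N_j \, \hat{\theta}_j}{\sum_{j=1}^{J} N_j}$$

where $\hat{\theta}_j$ is the estimate in cell $j$ and $N_j$ is the number of people in cell $j$ according to the census or representative survey.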
The first step in post-stratification is to tabulate the count of the population (in the census or the representative survey) that falls into each “cell”. For instance, in the Dansia example there are 30 combinations of gender (3 levels: male, female, non-binary) and 5-year age categories between ages 15 and 64 (10 levels). Everyone in the country falls into one and only one cell.
Population of Dansia, Ages 15 to 64: cell counts by gender and 5-year age category

| 5-Year Age Category | Male, N = 323,970 | Female, N = 323,758 | Non-binary, N = 34,004 |
|---|---|---|---|
| 15 | 34,457 | 34,142 | 3,556 |
| 20 | 34,372 | 34,525 | 3,680 |
| 25 | 34,245 | 34,474 | 3,624 |
| 30 | 34,432 | 34,660 | 3,636 |
| 35 | 34,524 | 34,373 | 3,455 |
| 40 | 30,843 | 30,546 | 3,173 |
| 45 | 30,897 | 30,750 | 3,244 |
| 50 | 31,046 | 30,971 | 3,269 |
| 55 | 30,953 | 31,039 | 3,269 |
| 60 | 28,201 | 28,278 | 3,098 |
The overall population of Dansians ages 15 to 64 is 681,732. We can divide each cell count by this figure to get the percentages (cell weights). For instance, men ages 15 to 19 make up 5.1% of the target population, but a whopping 20% of the convenience sample. This means that we need to down-weight responses from 15 to 19 year old men in the Dansia survey.
Population of Dansia, Ages 15 to 64: cell percentages by gender and 5-year age category

| 5-Year Age Category | Male, N = 323,970 | Female, N = 323,758 | Non-binary, N = 34,004 |
|---|---|---|---|
| 15 | 5.05% | 5.01% | 0.52% |
| 20 | 5.04% | 5.06% | 0.54% |
| 25 | 5.02% | 5.06% | 0.53% |
| 30 | 5.05% | 5.08% | 0.53% |
| 35 | 5.06% | 5.04% | 0.51% |
| 40 | 4.52% | 4.48% | 0.47% |
| 45 | 4.53% | 4.51% | 0.48% |
| 50 | 4.55% | 4.54% | 0.48% |
| 55 | 4.54% | 4.55% | 0.48% |
| 60 | 4.14% | 4.15% | 0.45% |
In the next code chunk, I run the same calculation as above but format the result as a long tibble called overall_prop to use in a later step.
pop_df %>%
  mutate(weight = 1) %>%
  group_by(gender, ageCat) %>%
  summarise(n = sum(weight)) %>%
  ungroup() %>%
  dplyr::select(gender, ageCat, n) %>%
  mutate(overall = "overall") %>%
  group_by(overall) %>%
  mutate(prop = n/sum(n)) %>%
  ungroup() %>%
  {. ->> overall_prop} %>%
  arrange(ageCat) %>%
  slice(1:9)
# A tibble: 9 x 5
gender ageCat n overall prop
<fct> <chr> <dbl> <chr> <dbl>
1 male 15 34457 overall 0.0505
2 female 15 34142 overall 0.0501
3 non-binary 15 3556 overall 0.00522
4 male 20 34372 overall 0.0504
5 female 20 34525 overall 0.0506
6 non-binary 20 3680 overall 0.00540
7 male 25 34245 overall 0.0502
8 female 25 34474 overall 0.0506
9 non-binary 25 3624 overall 0.00532
Finally, before we move on to the MR in MRP, let’s think about why stratification alone might not be sufficient. The next table shows the cell counts from our non-representative Dansia survey of 500 people. Most of the gender × ageCat cells have fewer than 10 people. For instance, there are only 3 women in the 20 to 24 age category. This is an example of a sparse cell. There are lots of sparse cells in this survey because the respondents were 90% men. With post-stratification alone, we’d have to put a lot of weight on a handful of individuals: these 3 women would be the stand-in for all women in this age range (the sketch after the table makes this concrete).
Dansia Convenience Sample, Ages 15 to 64: counts by gender and 5-year age category

| 5-Year Age Category | Male, N = 450 | Female, N = 45 | Non-binary, N = 5 |
|---|---|---|---|
| 15 | 100 | 9 | 0 |
| 20 | 95 | 3 | 2 |
| 25 | 96 | 3 | 0 |
| 30 | 91 | 5 | 0 |
| 35 | 23 | 4 | 2 |
| 40 | 6 | 6 | 1 |
| 45 | 9 | 3 | 0 |
| 50 | 13 | 2 | 0 |
| 55 | 9 | 5 | 0 |
| 60 | 8 | 5 | 0 |
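Here is a minimal sketch of what post-stratification with no model would look like, reusing sample_obs and the overall_prop weights from earlier: compute the observed response distribution within each cell, then weight by the census cell proportions. The estimate for women 20 to 24 would rest entirely on those 3 respondents, and empty cells would contribute nothing at all.

# sketch: direct post-stratification, no model
# response distribution within each observed cell
cell_est <- sample_obs %>%
  group_by(gender, ageCat, dv) %>%
  count() %>%
  ungroup() %>%
  group_by(gender, ageCat) %>%
  mutate(p_cell = n/sum(n)) %>%
  ungroup()

# weight each cell estimate by its census proportion and sum
# note: empty cells drop out, so sparse cells carry all the weight
cell_est %>%
  left_join(overall_prop %>% dplyr::select(gender, ageCat, prop),
            by = c("gender", "ageCat")) %>%
  group_by(dv) %>%
  summarise(p = sum(p_cell*prop))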
The MR before the P stands for multilevel regression. Just as Mister comes before P, multilevel regression can be used prior to post-stratification to avoid relying too heavily on information from sparse cells.
For instance, we learned in the previous section that there were only 3 women in the 20 to 24 age category in the 500-person convenience sample. Two of these 3 people said they “never” or “rarely” wear a mask in public. Rather than applying post-stratification weights to this result (66.7%), which is based on just 3 people, we can use multilevel regression to pull the estimate toward the other groups. This partial pooling is a key feature of multilevel regression.
Working with the survey data from our 500-person convenience sample, we can model the association between our outcome (frequency of wearing a mask) and our independent variables: gender and 5-year age category. Just like Dr. Alexander does in her post, we’ll use the {brms} package in R to fit a Bayesian model. The difference is that we have an ordinal outcome (mask wearing frequency: never, rarely, sometimes, often), so instead of using family=bernoulli(), we’ll use family = cumulative("probit") to fit an ordinal regression model (Bürkner and Vuorre 2019).4
library(brms)

fit <- brm(dv ~ (1|gender) + (1|ageCat),
           data = sample_obs,
           family = cumulative("probit"),
           control = list(adapt_delta = 0.99),
           cores = 4,              # depends on your machine
           backend = "cmdstanr")   # or rstan, see footnote
brms code is similar to glmer code, so if you can write a glmer model you are most of the way there.5 In the above chunk we fit a Bayesian ordinal regression model on the convenience sample data (sample_obs) and save the results to the fit object.6
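If you want to peek under the hood before post-stratifying, brms provides summary() and ranef(); neither step appears in the workflow below, but they are handy for checking the partial pooling:

# optional: inspect the fitted model
summary(fit)  # thresholds and group-level standard deviations
ranef(fit)    # deviations for each gender and age category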
In the last steps, we’ll use the results of the regression to predict the mask wearing outcome for all of our subgroups.
library(tidybayes)

overall_sample_est <- fit %>%
  add_predicted_draws(newdata = overall_prop,
                      allow_new_levels = TRUE) %>%
  mutate(dv = .prediction,
         p = 1) %>%
  mutate(.prediction_prop = p*prop) %>%
  group_by(.draw, dv) %>%
  summarise(p = sum(.prediction_prop)) %>%
  group_by(dv) %>%
  mean_qi(p) %>%
  mutate(source = "biased sample, adjusted") %>%
  dplyr::select(source, dv, p) %>%
  mutate(dv = factor(dv,
                     levels = c("never", "rarely", "sometimes", "often"),
                     labels = c("never", "rarely", "sometimes", "often"),
                     ordered = TRUE))
The add_predicted_draws() function takes as inputs our fit object and our overall_prop object, which is a tibble with cell proportions and subgroup labels.
# A tibble: 30 x 5
gender ageCat n overall prop
<fct> <chr> <dbl> <chr> <dbl>
1 male 15 34457 overall 0.0505
2 male 20 34372 overall 0.0504
3 male 25 34245 overall 0.0502
4 male 30 34432 overall 0.0505
5 male 35 34524 overall 0.0506
6 male 40 30843 overall 0.0452
7 male 45 30897 overall 0.0453
8 male 50 31046 overall 0.0455
9 male 55 30953 overall 0.0454
10 male 60 28201 overall 0.0414
# … with 20 more rows
add_predicted_draws() plugs the subgroup information like “male” and age category “15” into the model we fit in the previous step and predicts the value of the dependent variable 4,000 times for all 30 subgroup combinations. This results in a tibble with 120,000 rows: 4,000 draws for each of the 30 cells (a quick check below the output confirms this).
step1 <- fit %>%
  add_predicted_draws(newdata = overall_prop,
                      allow_new_levels = TRUE) %>%
  dplyr::select(gender, ageCat, n, prop, .draw, .prediction)

step1
# A tibble: 120,000 x 8
# Groups: gender, ageCat, n, overall, prop, .row [30]
overall .row gender ageCat n prop .draw .prediction
<chr> <int> <fct> <chr> <dbl> <dbl> <int> <fct>
1 overall 1 male 15 34457 0.0505 1 rarely
2 overall 1 male 15 34457 0.0505 2 never
3 overall 1 male 15 34457 0.0505 3 never
4 overall 1 male 15 34457 0.0505 4 rarely
5 overall 1 male 15 34457 0.0505 5 rarely
6 overall 1 male 15 34457 0.0505 6 rarely
7 overall 1 male 15 34457 0.0505 7 never
8 overall 1 male 15 34457 0.0505 8 never
9 overall 1 male 15 34457 0.0505 9 rarely
10 overall 1 male 15 34457 0.0505 10 never
# … with 119,990 more rows
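As a quick sanity check (not part of the original workflow), each of the 30 gender × ageCat cells should appear exactly 4,000 times:

# expect 30 rows, each with n = 4000
step1 %>%
  ungroup() %>%
  count(gender, ageCat)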
Next, we set the predicted value, p, to 1 for each row,7 multiply by the cell proportion that we obtained from the census (post-stratification), and sum these weights by draw (4,000) and level of the dependent variable (4). This results in 4,000 estimates of the frequency distribution of the dependent variable.
step2 <- step1 %>%
  mutate(dv = .prediction,
         p = 1) %>%
  mutate(.prediction_prop = p*prop) %>%
  group_by(.draw, dv) %>%
  summarise(p = sum(.prediction_prop))

step2
# A tibble: 15,988 x 3
# Groups: .draw [4,000]
.draw dv p
<int> <fct> <dbl>
1 1 never 0.212
2 1 rarely 0.268
3 1 sometimes 0.138
4 1 often 0.383
5 2 never 0.389
6 2 rarely 0.247
7 2 sometimes 0.147
8 2 often 0.217
9 3 never 0.257
10 3 rarely 0.251
# … with 15,978 more rows
Lastly, the mean_qi() function creates point estimates and credible intervals for each level of the dependent variable. (As an aside, step2 has 15,988 rows rather than 4,000 × 4 = 16,000 because in a few draws one level of the outcome was never predicted for any cell, so that draw-by-level combination is simply absent.)
step3 <- step2 %>%
  group_by(dv) %>%
  mean_qi(p)

step3
# A tibble: 4 x 7
dv p .lower .upper .width .point .interval
<fct> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 never 0.227 0.0558 0.405 0.95 mean qi
2 rarely 0.333 0.147 0.540 0.95 mean qi
3 sometimes 0.182 0.0461 0.355 0.95 mean qi
4 often 0.259 0.0911 0.448 0.95 mean qi
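Because step3 carries .lower and .upper bounds, one quick way to visualize the uncertainty in the adjusted estimates is a simple pointrange plot (a sketch, not part of the original figure code):

# plot adjusted estimates with 95% credible intervals
ggplot(step3, aes(x = dv, y = p, ymin = .lower, ymax = .upper)) +
  geom_pointrange(color = "#F17DB1") +
  labs(x = NULL, y = "Proportion",
       title = "Adjusted estimates with 95% credible intervals") +
  theme_minimal()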
Let’s recap this workflow:

1. We fit a multilevel regression model in {brms} using data from a convenience sample of 500 individuals (sample_obs).
2. The model included only variables that we also have population data for [~ (1|gender) + (1|ageCat)]. This is important. If these variables do not exist in the population stratification data, you can’t do the next step.
3. We used fit to predict the dependent variable from aggregated data about our population [add_predicted_draws()]. We’re not predicting the DV for individuals. We’re predicting it for subgroup cells. For sparse cells like gender=="female" & ageCat=="15", the multilevel regression pulls the estimates toward the other cells.

And now let’s look at the results. Figure 4 plots these adjusted estimates against the results of the other approaches we reviewed in this post (unadjusted, random sample).
# combine and plot
overall_pop %>%
  bind_rows(overall_sample_rand) %>%
  bind_rows(overall_sample_obs) %>%
  bind_rows(overall_sample_est) %>%
  mutate(source = factor(source,
                         levels = c("biased sample, unadjusted",
                                    "biased sample, adjusted",
                                    "random sample",
                                    "true population"))) %>%
  mrp_plot(., plotTitle = "Comparing different sample estimates to the <span style = 'color:#4FBFA5;'>true population</span>",
           annLabel = "Unadjusted estimate is\nfar from the true value",
           ann_x = 3.3, ann_y = 0.05, annColor = "#F17DB1",
           arrow_x_start = 3.6, arrow_y_start = 0.05,
           arrow_x_end = 4, arrow_y_end = 0.11, arrowColor = "#F17DB1",
           fills = c("white", "#F17DB1", "#005BE7", "#4FBFA5"),
           colors = c("#F17DB1", "#F17DB1", "#005BE7", "#4FBFA5")) +
  annotate("text", x = 4.2, y = .40,
           label = "Adjusted estimates are\nclose to the true value",
           size = rel(3), color = "#F17DB1") +
  geom_curve(aes(x = 3.9, y = .40, xend = 3.8, yend = .26),
             color = "#F17DB1", arrow = arrow(type = "open",
                                              length = unit(0.15, "inches")),
             curvature = -.5, angle = -100, ncp = 15)
Compared to the unadjusted estimates straight from the convenience sample, it appears that adjusting the estimates reduces bias across all levels of the dependent variable.8
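One way to make “reduces bias” concrete is to line up each set of estimates against the truth and compute the average absolute error (a quick check that reuses objects created above):

# mean absolute error of each approach vs. the true population values
overall_sample_rand %>%
  bind_rows(overall_sample_obs) %>%
  bind_rows(overall_sample_est) %>%
  left_join(overall_pop %>%
              rename(p_true = p) %>%
              dplyr::select(dv, p_true),
            by = "dv") %>%
  mutate(abs_error = abs(p - p_true)) %>%
  group_by(source) %>%
  summarise(mean_abs_error = mean(abs_error))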
I opened this post with a reference to a recent opt-in poll we conducted in Kenya about COVID-19 vaccine hesitancy. We used data from a non-representative (convenience) sample of 486 people to estimate the percentage of people who would be likely to get vaccinated if a vaccine were available for free today. I imagined the reaction of a skeptical reader:
Skeptical readers might be thinking, “This is based on data from 486 people who responded to ads on Facebook. How is this representative of the country?”
The short answer is, “It’s probably not.” Without adjustment, raw summaries from convenience samples like this are likely to be biased and misleading. But if suitable data about the target population exist (a census or a nationally representative survey), techniques like Mister P can make the data more representative.
To learn more, see posts by Monica Alexander and Andrew Gelman, as well as Rohan Alexander’s reading list under “Next Steps”.
See this gist.
Bürkner, Paul-Christian, and Matti Vuorre. 2019. “Ordinal Regression Models in Psychology: A Tutorial.” Advances in Methods and Practices in Psychological Science 2 (1): 77–101.
Lau, Charles Q, Ansie Lombaard, Melissa Baker, Joe Eyerman, and Lisa Thalji. 2019. “How Representative Are SMS Surveys in Africa? Experimental Evidence from Four Countries.” International Journal of Public Opinion Research 31 (2): 309–30.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (3): 980–91.
Evidence from several African countries comparing SMS surveys to face-to-face surveys suggests that results may not be representative of the general population—particularly with respect to education—even after poststratification adjustments (Lau et al. 2019).↩︎
I’m calling out the “senior” author Gelman specifically because I first learned about Mister P from his blog.↩︎
See this post by Rohan Alexander for a nice review of Wang et al. (2015).↩︎
Credit for the assist goes to Matthew Kay, author of the {tidybayes} package, for contributing this helpful answer to my question on the Stan Forum on ordinal outcomes with MRP. And of course, we owe a big debt of gratitude to Paul Bürkner for his amazing package brms and for lots of handy tutorials, like his tutorial on ordinal regression (Bürkner and Vuorre 2019).↩︎
By default, brms models are fit with rstan. However, Andrew Heiss tweeted the other day that switching to cmdstanr will speed up the model runs. Boy was he right. To use this backend, you have to first install {cmdstanr} and then run the function install_cmdstan().↩︎
We could also model the interaction of age group and gender. (h/t Staffan Betner)↩︎
I like how this figure packs in a lot of information, but it’s incomplete without a way to represent uncertainty in the estimates. I’m hopeful that readers might have some ideas for additional visualizations.↩︎