Developmental curve comparison

Introduction

The goal of this analysis was to better understand the shape of the developmental curve. To do this we used linear mixed-effects models, so that variability across individual meta-analyses could be accounted for.

Summary of investigations:

  • Meta-analytic mixed-effects models are promising, but…
  • The random effects structure (specifically, whether paper effects are nested within meta-analysis) makes a big difference to the implied curve shape.
  • Subsetting to only studies with multiple age groups improves performance.

Simple random effects structure

Below, our data set is shown separated by meta-analysis, along with four candidate curve types: 1) linear, 2) logarithmic, 3) quadratic, and 4) a model with both linear and logarithmic predictors.

We ran mixed-effects regressions for our four models, with meta-analysis (data set) included as a random intercept. We then ran chi-squared (likelihood ratio) tests to compare the models to one another. The three single-predictor models have the same number of parameters, so the likelihood ratio tests between them are uninformative (zero degrees of freedom); by AIC, the linear model is the best of the three. The two-predictor model (linear and log) significantly outperformed both the single-predictor linear model and the single-predictor logarithmic model.

# Packages used in this section: lme4 for lmer(), knitr for kable(),
# dplyr for data manipulation.
library(lme4)
library(knitr)
library(dplyr)

# Four candidate models, each with a random intercept for meta-analysis (dataset)
# and effect sizes weighted by their inverse variance.
lin_lmer <- lmer(d_calc ~ mean_age + 
                (1 | dataset), 
                weights = 1/d_var_calc, 
                REML = F, data = metalab_data)

log_lmer <- lmer(d_calc ~ log(mean_age) + 
                (1 | dataset), 
                weights = 1/d_var_calc, 
                REML = F, data = metalab_data)

quad_lmer <- lmer(d_calc ~ I(mean_age^2) + 
                 (1 | dataset), 
                 weights = 1/d_var_calc, 
                 REML = F, data = metalab_data)

lin_log_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                    (1 | dataset), 
                    weights = 1/d_var_calc, 
                    REML = F, data = metalab_data)

kable(coef(summary(lin_lmer)))

  term            Estimate   Std. Error    t value
  (Intercept)    0.2657694    0.0460356   5.773127
  mean_age       0.0000998    0.0000283   3.524569

kable(coef(summary(log_lmer)))

  term             Estimate   Std. Error     t value
  (Intercept)     0.3722238    0.0792106   4.6991675
  log(mean_age)  -0.0100233    0.0111321  -0.9003996

kable(coef(summary(quad_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.3115005    0.0461291   6.752799
  I(mean_age^2)   0.0000000    0.0000000   1.167441

kable(coef(summary(lin_log_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.4212793    0.0788645   5.341813
  mean_age        0.0001291    0.0000308   4.186278
  log(mean_age)  -0.0294871    0.0120632  -2.444380

kable(anova(lin_lmer, log_lmer))

  model      npar        AIC        BIC     logLik   deviance   Chisq   Df   Pr(>Chisq)
  lin_lmer      4   5068.739   5092.715  -2530.369   5060.739      NA   NA           NA
  log_lmer      4   5080.186   5104.162  -2536.093   5072.186       0    0           NA

kable(anova(lin_lmer, quad_lmer))

  model       npar        AIC        BIC     logLik   deviance   Chisq   Df   Pr(>Chisq)
  lin_lmer       4   5068.739   5092.715  -2530.369   5060.739      NA   NA           NA
  quad_lmer      4   5079.621   5103.597  -2535.811   5071.621       0    0           NA

kable(anova(log_lmer, quad_lmer))

  model       npar        AIC        BIC     logLik   deviance       Chisq   Df   Pr(>Chisq)
  log_lmer       4   5080.186   5104.162  -2536.093   5072.186          NA   NA           NA
  quad_lmer      4   5079.621   5103.597  -2535.811   5071.621   0.5645005    0           NA

kable(anova(lin_lmer, lin_log_lmer))

  model          npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_lmer          4   5068.739   5092.715  -2530.369   5060.739         NA   NA           NA
  lin_log_lmer      5   5064.807   5094.777  -2527.404   5054.807   5.931607    1    0.0148717

kable(anova(log_lmer, lin_log_lmer))

  model          npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  log_lmer          4   5080.186   5104.162  -2536.093   5072.186         NA   NA           NA
  lin_log_lmer      5   5064.807   5094.777  -2527.404   5054.807   17.37879    1     3.06e-05

Focusing on the linear, log, and two-predictor (linear and log) models, we built predictions for all complete meta-analyses. The plot, separated by meta-analysis, is presented below. The linear and two-predictor models are quite similar, differing mainly at very early ages.

metalab_data_preds <- metalab_data %>%
  select(dataset, short_name, d_calc, d_var_calc, mean_age, study_ID) %>%
  filter(complete.cases(.)) %>%
  mutate(lin_pred = predict(lin_lmer)) %>%
  mutate(log_pred = predict(log_lmer)) %>%
  mutate(lin_log_pred = predict(lin_log_lmer))
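
A minimal sketch of one way to draw the per-meta-analysis figure, assuming ggplot2 is available (the aesthetics here are illustrative choices, not the original plotting code):

# Sketch: raw effect sizes (point size ~ precision) with the two-predictor
# model's predicted curve overlaid, faceted by meta-analysis.
library(ggplot2)

ggplot(metalab_data_preds, aes(x = mean_age)) +
  geom_point(aes(y = d_calc, size = 1/d_var_calc), alpha = .3) +
  geom_line(aes(y = lin_log_pred), colour = "blue") +
  facet_wrap(~ short_name, scales = "free_x") +
  theme(legend.position = "none")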

Overall this method looks pretty good. The linear predictor in the two-predictor model appears to smooth the logarithmic curve at the extremes of the age range, and the random effects capture cross-data-set variability.
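
One way to check the latter claim is to inspect the estimated variance components and per-meta-analysis intercepts directly; a quick sketch using lme4's accessors (output not shown here):

# Variance components and per-meta-analysis intercept deviations
# for the two-predictor model.
VarCorr(lin_log_lmer)         # dataset-level and residual variance
ranef(lin_log_lmer)$dataset   # estimated intercept offset for each meta-analysis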

Complex random effects structure (nested paper effects)

We then expanded the random effects structure to include paper (study_ID) nested within meta-analysis (data set), capturing the fact that each meta-analysis draws on a specific set of papers and may include multiple effect sizes from the same paper. Here we focused on the two-predictor model. The simpler random effects model from above is repeated for easy comparison. A chi-squared test finds a significant difference between the models, with the more complex random effects structure yielding a lower AIC.

metalab_data <- metalab_data %>%
  select(d_calc, d_var_calc, mean_age, dataset, study_ID) %>%
  filter(complete.cases(.))

lin_log_paper_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                          (1 | dataset / study_ID),
                          weights = 1/d_var_calc, 
                          REML = F, data = metalab_data)
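
For reference, lme4 treats the nesting shorthand (1 | dataset / study_ID) as two separate intercept terms, so the following fit is equivalent (the _explicit name is just for illustration):

# (1 | dataset / study_ID) is shorthand for a dataset intercept plus a
# paper-within-dataset intercept; this fits the same model.
lin_log_paper_lmer_explicit <- lmer(d_calc ~ mean_age + log(mean_age) + 
                                   (1 | dataset) + (1 | dataset:study_ID),
                                   weights = 1/d_var_calc, 
                                   REML = F, data = metalab_data)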

kable(coef(summary(lin_log_paper_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.2779933    0.0959522   2.897206
  mean_age        0.0001041    0.0000351   2.962590
  log(mean_age)   0.0046257    0.0156777   0.295050

kable(coef(summary(lin_log_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.4212793    0.0788645   5.341813
  mean_age        0.0001291    0.0000308   4.186278
  log(mean_age)  -0.0294871    0.0120632  -2.444380

kable(anova(lin_log_lmer, lin_log_paper_lmer))

  model                npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_log_lmer            5   5064.807   5094.777  -2527.404   5054.807         NA   NA           NA
  lin_log_paper_lmer      6   4831.536   4867.500  -2409.768   4819.536   235.2713    1            0

We then predicted from this new, more complex model and compared it to the model without the nested paper random effect. Initial testing found that including the nested random effect when predicting led to massive overfitting, so only the meta-analysis random intercept was used to compute predictions (via the re.form argument). Below is the figure comparing the two curves. The curves are very similar, suggesting that while nesting paper significantly improves the model, visually the difference is small.

lin_log_paper_preds <- predict(lin_log_paper_lmer, re.form = ~ (1 | dataset))

metalab_data_preds <- metalab_data_preds %>%
  mutate(lin_log_paper_pred = lin_log_paper_preds)
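
As a rough, illustrative numeric complement to the figure (this check is a sketch, not part of the original analysis), the two sets of predictions can be compared directly:

# How close are the predictions with and without the nested paper effect?
with(metalab_data_preds,
     c(correlation   = cor(lin_log_pred, lin_log_paper_pred),
       max_abs_diff  = max(abs(lin_log_pred - lin_log_paper_pred)),
       mean_abs_diff = mean(abs(lin_log_pred - lin_log_paper_pred))))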

Complex random effects structure (random slope)

Another way to make the random effects structure more complex is to add a random slope. Two more models were built with a random slope of log mean age, one without paper nesting and one with it. Summaries of the new models with the random slope, alongside the previous models without it, are below. Model comparison found that the model with the nested structure and the slope was better than the simplest model (no nesting, no slope). The model with both the nested structure and the slope was also better than the model with only the slope. Based on this, the best model appears to be one that includes both the nested random effects structure and a random slope of log mean age.

lin_log_slope_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                          (log(mean_age)| dataset),
                          weights = 1/d_var_calc, 
                          REML = F, data = metalab_data)

lin_log_paper_slope_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                          (log(mean_age)| dataset / study_ID),
                          weights = 1/d_var_calc, 
                          REML = F, data = metalab_data)

kable(coef(summary(lin_log_slope_lmer)))

  term              Estimate   Std. Error      t value
  (Intercept)     -0.1692284    0.3833667   -0.4414271
  mean_age         0.0000860    0.0000579    1.4857497
  log(mean_age)    0.0718651    0.0678465    1.0592310

kable(coef(summary(lin_log_paper_slope_lmer)))

  term              Estimate   Std. Error      t value
  (Intercept)     -0.3787363    0.4301714   -0.8804311
  mean_age         0.0001336    0.0000788    1.6953685
  log(mean_age)    0.1099003    0.0772974    1.4217855

kable(coef(summary(lin_log_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.4212793    0.0788645   5.341813
  mean_age        0.0001291    0.0000308   4.186278
  log(mean_age)  -0.0294871    0.0120632  -2.444380

kable(coef(summary(lin_log_paper_lmer)))

  term             Estimate   Std. Error    t value
  (Intercept)     0.2779933    0.0959522   2.897206
  mean_age        0.0001041    0.0000351   2.962590
  log(mean_age)   0.0046257    0.0156777   0.295050

kable(anova(lin_log_lmer, lin_log_slope_lmer))

  model                npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_log_lmer            5   5064.807   5094.777  -2527.404   5054.807         NA   NA           NA
  lin_log_slope_lmer      7   4996.794   5038.752  -2491.397   4982.794   72.01276    2            0

kable(anova(lin_log_lmer, lin_log_paper_slope_lmer))

  model                      npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_log_lmer                  5   5064.807   5094.777  -2527.404   5054.807         NA   NA           NA
  lin_log_paper_slope_lmer     10   4733.850   4793.789  -2356.925   4713.850   340.9573    5            0

kable(anova(lin_log_paper_lmer, lin_log_slope_lmer))

  model                npar        AIC        BIC     logLik   deviance   Chisq   Df   Pr(>Chisq)
  lin_log_paper_lmer      6   4831.536   4867.500  -2409.768   4819.536      NA   NA           NA
  lin_log_slope_lmer      7   4996.794   5038.752  -2491.397   4982.794       0    1            1

kable(anova(lin_log_paper_lmer, lin_log_paper_slope_lmer))

  model                      npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_log_paper_lmer            6   4831.536   4867.500  -2409.768   4819.536         NA   NA           NA
  lin_log_paper_slope_lmer     10   4733.850   4793.789  -2356.925   4713.850   105.6859    4            0

kable(anova(lin_log_slope_lmer, lin_log_paper_slope_lmer))

  model                      npar        AIC        BIC     logLik   deviance      Chisq   Df   Pr(>Chisq)
  lin_log_slope_lmer            7   4996.794   5038.752  -2491.397   4982.794         NA   NA           NA
  lin_log_paper_slope_lmer     10   4733.850   4793.789  -2356.925   4713.850   268.9445    3            0

As before, predictions were built for the new models and plotted over the data. Overall the models look quite similar in terms of their predicted fit.

lin_log_slope_preds <- predict(lin_log_slope_lmer, re.form = ~ (log(mean_age) | dataset))
lin_log_paper_slope_preds <- predict(lin_log_paper_slope_lmer, re.form = ~ (log(mean_age) | dataset))


metalab_data_preds <- metalab_data_preds %>%
  mutate(lin_log_slope_pred = lin_log_slope_preds) %>%
  mutate(lin_log_paper_slope_pred = lin_log_paper_slope_preds)
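
One way to overlay all of the fitted curves in a single faceted figure is to reshape the prediction columns to long format; a sketch assuming tidyr and ggplot2 are available (not the original plotting code):

# Put the *_pred columns in long format so each model gets its own line,
# then overlay them on the raw effect sizes by meta-analysis.
library(tidyr)
library(ggplot2)

metalab_data_preds %>%
  pivot_longer(ends_with("_pred"), names_to = "model", values_to = "pred") %>%
  ggplot(aes(x = mean_age)) +
  geom_point(aes(y = d_calc), alpha = .2) +
  geom_line(aes(y = pred, colour = model)) +
  facet_wrap(~ short_name, scales = "free_x")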

Subset to only those papers with multiple ages

One concern with using a random slope when paper is nested within meta-analysis is that not all papers included more than one age group. To be sure that the random slope is appropriate, here we look only at papers where more than one age group was tested.

Two age groups

To start, we arbitrarily required at least two age groups more than two months apart. Below we build a new model using only data from papers with at least two such age groups. The structure of the model is the same as the model with papers nested in meta-analysis and a random slope of log mean age.

multiage_data <- metalab_data_preds %>%
  group_by(dataset, study_ID) %>%
  mutate(count_ages = length(unique(floor(mean_age/2)))) %>%
  filter(count_ages > 1)
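
As a quick, illustrative sanity check on the filter (a sketch; these counts are not reported elsewhere in this analysis), we can tally how many papers and effect sizes remain:

# How many papers and effect sizes survive once single-age papers are dropped?
multiage_data %>%
  ungroup() %>%
  summarise(n_papers       = n_distinct(paste(dataset, study_ID)),
            n_effect_sizes = n())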

lin_log_paper_slope_multiage_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                                     (log(mean_age) | dataset / study_ID),
                                     weights = 1/d_var_calc, 
                                     REML = F, data = multiage_data)

kable(coef(summary(lin_log_paper_slope_multiage_lmer)))

  term              Estimate   Std. Error      t value
  (Intercept)     -0.4028799    0.4600573   -0.8757167
  mean_age         0.0001897    0.0001020    1.8593404
  log(mean_age)    0.1037670    0.0847375    1.2245709

Predictions were calculated for the new model and visually compared to the simple model fit to the full data set, with no nested random effects and no random slope. The plot below suggests that, for this reduced data set, the model with nested random effects and a random slope differs noticeably from the simpler model.

lin_log_paper_slope_multiage_preds <- predict(lin_log_paper_slope_multiage_lmer, re.form = ~ (log(mean_age)| dataset))

multiage_data <- multiage_data %>%
  ungroup %>%
  select(dataset, short_name, study_ID, d_calc, mean_age, d_var_calc, lin_log_pred) %>%
  mutate(lin_log_paper_slope_multiage_pred = lin_log_paper_slope_multiage_preds)
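
As a rough numeric complement to the visual comparison (a sketch; the wrmse helper below is a hypothetical convenience, not something defined elsewhere in this report), each set of predictions can be scored against the observed effect sizes on this subset:

# Inverse-variance-weighted RMSE of each model's predictions on the reduced data.
wrmse <- function(obs, pred, w) sqrt(weighted.mean((obs - pred)^2, w))

multiage_data %>%
  summarise(
    simple_model  = wrmse(d_calc, lin_log_pred, 1/d_var_calc),
    complex_model = wrmse(d_calc, lin_log_paper_slope_multiage_pred, 1/d_var_calc))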

Three age groups

We repeated the same analysis, further reducing the data set to only those papers that included at least three different age groups.

multiage_3_data <- metalab_data_preds %>%
  group_by(dataset, study_ID) %>%
  mutate(count_ages = length(unique(floor(mean_age/2)))) %>%
  filter(count_ages > 2)

lin_log_paper_slope_multiage_3_lmer <- lmer(d_calc ~ mean_age + log(mean_age) + 
                                     (log(mean_age) | dataset / study_ID),
                                     weights = 1/d_var_calc, 
                                     REML = F, data = multiage_3_data)

kable(coef(summary(lin_log_paper_slope_multiage_3_lmer)))

  term              Estimate   Std. Error      t value
  (Intercept)     -0.4158974    0.4996795   -0.8323283
  mean_age         0.0002249    0.0001189    1.8910389
  log(mean_age)    0.1015908    0.0916035    1.1090279

Predictions were again calculated for the new model and visually compared to the simple model fit to the full data set, with no nested random effects and no random slope. The plot below suggests that for this further reduced data set, the difference between the two models becomes even more apparent.

lin_log_paper_slope_multiage_3_preds <- predict(lin_log_paper_slope_multiage_3_lmer, re.form = ~ (log(mean_age)| dataset))

multiage_3_data <- multiage_3_data %>%
  ungroup %>%
  select(dataset, short_name, study_ID, d_calc, mean_age, d_var_calc, lin_log_pred) %>%
  mutate(lin_log_paper_slope_multiage_3_pred = lin_log_paper_slope_multiage_3_preds)

Conclusions

In this report we sought the best curve to describe developmental data. In terms of fixed effects, the best fit appears to come from a model that includes both mean age and log mean age as predictors. We further improved the fit by using more complex random effects structures in our regressions: 1) nesting paper within meta-analysis, and 2) including a random slope of log mean age. This structure was most appropriate when we looked only at papers with at least two or three age groups, ensuring that the random slope applied to all data analyzed.