Using Historical and Projected Climate Data to Forecast Child Malnutrition

In the final post in our series on future estimation, we build a model of child malnutrition from historical data and plug in projected climate scenario data to forecast a future level of child malnutrition.

Tags: Climate, Scenarios, Temperature, R, Nutrition, Forecasts
Authors: Rebecca Luttinen (IPUMS Global Health Data Analyst) and Jessie Pinchoff (IPUMS Researcher)

Published: October 24, 2025

This post concludes our series on how to use future estimation in child health research. The previous posts introduced the proper terminology, showcased how to use Demographic and Health Survey (DHS) data to project a population and its vital rates, and introduced a source of climate scenario data. Now we will apply what we have learned from this series to create a forecast of child malnutrition. Information like this can inform humanitarian interventions that take preventive action against future child malnutrition.

Refresher: What is a forecast?

Our first post in this series parsed some of the terminology of future estimation. We defined forecasts as “short-term, numeric model estimates of physical phenomena which can have accuracy assessments embedded in them.” To create a forecast, analysts and researchers develop a quantitative model of an outcome using both historical and current data. For example, if a model predicts the number of children with acute malnutrition within a given region, a forecast would be the number of children with acute malnutrition that the model predicts for a future timeframe. Predictions can be short-term (weeks to months in the future) or long-term (out to the year 2050 or 2100). For the far future, we use different scenarios to give a range of possible outcomes, while in the short term we can likely assume most conditions will stay the same.

Tying in Climate Scenario Data

In the third post in this series, we introduced climate scenarios. Data from a climate scenario can be used as an input to a quantitative model to generate a forecast (described in the first post) under the specific climate conditions of that scenario. Before we can use the climate scenario data as an input to our model, we need to build the forecasting model with historical climate data. Below we demonstrate a workflow that accomplishes this.

Building a Forecast

Data Harmonization

This post proceeds in two steps. First, we build a quantitative model from Kenya 2022 DHS data. Second, we plug climate scenario data into the model and generate forecasts.

We begin by loading the libraries we will use for this analysis.

library(haven)
library(tidyverse)
library(sf)
library(terra)
library(stats)

We have already demonstrated how to format DHS and climate data in earlier posts (e.g., here), so we will skip over how to format climate and DHS data together and get right into temporally harmonizing these data sources and modeling. First, we will read in a .Rda file, which is an R data file. This file contains already-formatted child health and household information from the 2022 Kenya DHS survey, as well as the monthly average maximum temperature from CHIRTS for a 10 kilometer buffer zone around each DHS cluster. If you are not familiar with how to harmonize climate and DHS data, read the post linked at the start of this paragraph.

The DHS is currently reviewing applications for data access. Apply for and access DHS data here.


#load DHS data

load("data/DHS/childdatafilter.Rda")

#load climate data

load("data/ken_chirts/ken_chirts_long_filter_bp.Rda")

Now that we have loaded our child health and climate information, we must make sure our data has the proper temporal format. In this blog post we follow an anticipatory action framework, which involves acting ahead of predicted hazards to reduce their impact at various time-frames, such as 1, 3, 6, 9, or 12 months in advance. For the purpose of this blog post, we use the average temperature for the 1-3 months prior to the date of the DHS survey.

Note

Note: In this example, we have renamed our DHS cluster variable to “ID” in both our climate and child health datasets. This differs from IPUMS DHS and from the raw variable names provided by the DHS.

First, we will merge our child health and climate information by the DHS cluster variable, ID, to get all of our information into one dataset.

#merge with DHS 
surveytempreframe <- merge(childdatafilter, ken_chirts_long_filter, by="ID")

Then we can use this information to create new variables for the temperature information 1-3 months before the survey date.


#create new variables for 1-3 months before survey

surveytempreframe$presurveytemp1<-ifelse(surveytempreframe$CHIRTSCMC== (surveytempreframe$surveydate-1), surveytempreframe$AV_TEMP, NA)

surveytempreframe$presurveytemp2<-ifelse(surveytempreframe$CHIRTSCMC== (surveytempreframe$surveydate-2), surveytempreframe$AV_TEMP, NA)

surveytempreframe$presurveytemp3<-ifelse(surveytempreframe$CHIRTSCMC== (surveytempreframe$surveydate-3), surveytempreframe$AV_TEMP, NA)

This dataset now contains temporal variation in addition to geographic variation, but we only need the temporal information pertinent to the DHS survey. We can restructure our dataset to keep only what our analysis requires. We use the contains() function to gather our three new variables into a single variable in long format, since each of them contains the string “surveytemp.” The old variable names will be stored in a new column called “monthbeforesurvey” and the monthly average maximum value in Celsius will be stored in a new column called “temperature.”


#pivot long

surveytempreframe2<-surveytempreframe %>%
  pivot_longer(
    cols = contains("surveytemp"),
    names_to = "monthbeforesurvey",
    names_prefix = "temp",
    values_to = "temperature",
    values_drop_na = TRUE)

We just used pivot_longer() to put our climate information in long format, which also dropped the rows where a lagged temperature was missing. One more step is needed before we can merge this information with our DHS data: we need the climate information back in wide format, with a separate variable for each lagged month. We can use the pivot_wider() function to accomplish this, specifying three ID columns that should not be spread into the wide format. caseid is a unique identifier for each woman surveyed by the DHS. We also use the birthorder column to identify each individual birth, because women can report up to five births prior to the DHS survey. And lastly, we use the DHS cluster variable.


#pivot wider
ken_chirts_wide2 <- surveytempreframe2 %>%
  pivot_wider(id_cols = c(caseid, birthorder, ID), names_from = monthbeforesurvey, values_from = temperature)


#merge with DHS again
tempandDHSwithsurvey <- merge(childdatafilter, ken_chirts_wide2, by= c('caseid', 'birthorder', 'ID'))
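The long-then-wide round trip may look redundant, but together the two pivots collapse the three sparse lag columns into a single complete row per birth. Here is a minimal toy sketch, with made-up temperature values, of what they accomplish:


#toy example: three sparse rows collapse into one complete row per birth

toy <- tibble(ID = 1, caseid = "a", birthorder = 1,
              presurveytemp1 = c(30.1, NA, NA),
              presurveytemp2 = c(NA, 29.4, NA),
              presurveytemp3 = c(NA, NA, 28.7))

toy %>%
  pivot_longer(contains("surveytemp"), names_to = "monthbeforesurvey",
               values_to = "temperature", values_drop_na = TRUE) %>%
  pivot_wider(id_cols = c(caseid, birthorder, ID),
              names_from = monthbeforesurvey, values_from = temperature)

The result is one row per (caseid, birthorder, ID) combination with all three lagged temperatures filled in.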

Now that we have a full dataset, we can begin modeling. First, we will drop rows with missing age and sex information from our dataset.


#drop children without accurate age information
tempandDHSwithsurveyfilter<-tempandDHSwithsurvey%>%
  filter(curchildage!='NA')

#drop children without accurate sex information
#(note that we filter the already-filtered dataset so the first step is not undone)

tempandDHSwithsurveyfilter<-tempandDHSwithsurveyfilter%>%
  filter(childsex!='NA')

Bivariate Analysis

Then we can test for bivariate associations between our temperature values and our outcome: weight-for-age. Below, we do this for each month separately.


#correlation tests

#1 month before survey
cor.test(tempandDHSwithsurveyfilter$w4age, tempandDHSwithsurveyfilter$presurveytemp1 )
#> 
#>  Pearson's product-moment correlation
#> 
#> data:  tempandDHSwithsurveyfilter$w4age and tempandDHSwithsurveyfilter$presurveytemp1
#> t = -13.697, df = 14192, p-value < 2.2e-16
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#>  -0.13043032 -0.09795672
#> sample estimates:
#>       cor 
#> -0.114224

#2 months before survey
cor.test(tempandDHSwithsurveyfilter$w4age, tempandDHSwithsurveyfilter$presurveytemp2 )
#> 
#>  Pearson's product-moment correlation
#> 
#> data:  tempandDHSwithsurveyfilter$w4age and tempandDHSwithsurveyfilter$presurveytemp2
#> t = -13.917, df = 14192, p-value < 2.2e-16
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#>  -0.13223468 -0.09977481
#> sample estimates:
#>        cor 
#> -0.1160357

#3 months before survey
cor.test(tempandDHSwithsurveyfilter$w4age, tempandDHSwithsurveyfilter$presurveytemp3 )
#> 
#>  Pearson's product-moment correlation
#> 
#> data:  tempandDHSwithsurveyfilter$w4age and tempandDHSwithsurveyfilter$presurveytemp3
#> t = -13.834, df = 14192, p-value < 2.2e-16
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#>  -0.13155195 -0.09908686
#> sample estimates:
#>        cor 
#> -0.1153502

The correlation between child weight-for-age and the monthly average daily maximum temperature per DHS cluster is significant for each month 1-3 months before the survey.

Next, we create a variable that is the average over these 1-3 months.


tempandDHSwithsurveyfilter$avtemppresurvey<-((tempandDHSwithsurveyfilter$presurveytemp1+ tempandDHSwithsurveyfilter$presurveytemp2+ tempandDHSwithsurveyfilter$presurveytemp3)/3)
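As a side note, the same average can be computed with base R’s rowMeans(), which also makes it easy to tolerate an occasional missing month by setting na.rm = TRUE:


#equivalent using base R; add na.rm = TRUE to average over whichever lags are present

tempandDHSwithsurveyfilter$avtemppresurvey <- rowMeans(
  tempandDHSwithsurveyfilter[, c("presurveytemp1", "presurveytemp2", "presurveytemp3")]
)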

Multivariate Multilevel Modeling

Now we will build a multivariate model that takes into account the multilevel structure of our dataset. We include several variables that are known to influence child health outcomes: household urban/rural status, maternal educational attainment, maternal age, child age, child sex, maternal marital status, and whether the household has a finished floor or not.

Note

Research questions that require multi-level analysis involve outcomes that are clustered by time or space. This perspective requires consideration at the person-level and some other structural level. Multi-level data can include 2 or more levels. DHS surveys include information at the household, cluster, regional, and national levels. Read more here.

Why use a multilevel model instead of ordinary least squares (OLS) regression? If data are clustered at a structural level, we cannot assume independence of observations, a key assumption of OLS.
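For comparison, here is a minimal sketch of the pooled OLS counterpart (the ols_fit name is ours). It ignores the clustering, so its standard errors, particularly for cluster-level predictors like temperature, would generally be too small:


#for comparison only: pooled OLS that ignores the cluster structure

ols_fit <- lm(w4age ~ urban + educationlevel + curchildage + childsex +
                married + AGE + floor + avtemppresurvey,
              data = tempandDHSwithsurveyfilter)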

Multilevel modeling can be accomplished in R using the lme4 package. The last term in the formula, (1 | ID), is how we allow the intercept to vary by DHS cluster, identified by the ID variable.


library(lme4)

wastingmlm <- lmer(w4age ~ urban + educationlevel + curchildage+ childsex + married + AGE + floor + avtemppresurvey + (1 | ID), data = tempandDHSwithsurveyfilter)

summary(wastingmlm)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: w4age ~ urban + educationlevel + curchildage + childsex + married +  
#>     AGE + floor + avtemppresurvey + (1 | ID)
#>    Data: tempandDHSwithsurveyfilter
#> 
#> REML criterion at convergence: 41734.9
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -4.5910 -0.6247  0.0046  0.6229  4.9335 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  ID       (Intercept) 0.08981  0.2997  
#>  Residual             1.06497  1.0320  
#> Number of obs: 14077, groups:  ID, 1670
#> 
#> Fixed effects:
#>                             Estimate Std. Error t value
#> (Intercept)                -0.027917   0.114754  -0.243
#> urbanurban                  0.114733   0.031287   3.667
#> educationlevelno education -0.700616   0.041547 -16.863
#> educationlevelprimary      -0.374286   0.033177 -11.282
#> educationlevelsecondary    -0.215657   0.032811  -6.573
#> curchildage                -0.134463   0.006419 -20.949
#> childsexmale               -0.116956   0.019824  -5.900
#> marriedmarried/ cohabiting  0.061776   0.038015   1.625
#> marriednot married          0.045038   0.050275   0.896
#> marriedwidowed             -0.079050   0.073061  -1.082
#> AGE                         0.007984   0.001493   5.348
#> floorunfinished            -0.239385   0.022700 -10.546
#> avtemppresurvey            -0.006765   0.003139  -2.155

An alternative way to present our results is with the modelsummary() function from the modelsummary package.

library(modelsummary)
#> Warning: package 'modelsummary' was built under R version 4.5.1

modelsummary(wastingmlm, stars = TRUE, title = 'Multilevel Regression Model Results')
Multilevel Regression Model Results

                                   (1)
(Intercept)                     -0.028
                                (0.115)
urbanurban                       0.115***
                                (0.031)
educationlevelno education      -0.701***
                                (0.042)
educationlevelprimary           -0.374***
                                (0.033)
educationlevelsecondary         -0.216***
                                (0.033)
curchildage                     -0.134***
                                (0.006)
childsexmale                    -0.117***
                                (0.020)
marriedmarried/ cohabiting       0.062
                                (0.038)
marriednot married               0.045
                                (0.050)
marriedwidowed                  -0.079
                                (0.073)
AGE                              0.008***
                                (0.001)
floorunfinished                 -0.239***
                                (0.023)
avtemppresurvey                 -0.007*
                                (0.003)
SD (Intercept ID)                0.300
SD (Observations)                1.032
Num.Obs.                         14077
R2 Marg.                         0.106
R2 Cond.                         0.175
AIC                              41764.9
BIC                              41878.2
ICC                              0.1
RMSE                             1.01

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
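The ICC reported here is simply the share of total variance attributable to the cluster-level random intercept. As a quick check, we can recompute it from the variance components shown in the model summary above:


#recompute the ICC from the variance components (about 0.078, which rounds to 0.1)

vc <- as.data.frame(VarCorr(wastingmlm))
vc$vcov[1] / sum(vc$vcov)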

Our multivariate linear model with a random intercept for each DHS cluster shows a negative, significant relationship between the average maximum temperature for the 1-3 months prior to the survey and child weight-for-age z-score.

Now that we have built a model predicting child weight-for-age z-score, we will feed it climate scenario data to forecast weight-for-age z-scores per DHS cluster for the year 2050.

Integrating Climate Scenario Data into our Workflow

We will begin this process by reading in the climate scenario data. For this example, we will use SSP5-8.5. Read more about this SSP and how to download and format it in order to use it with DHS data in the post before this one.

#read in climate scenario data

load("C:/Users/Rebecca/Downloads/ken_Tmax_585_2050_spatial_mean_sf.Rda")

load("C:/Users/Rebecca/Downloads/ken_Tmax_585_2049_spatial_mean_sf.Rda")

We grab only a subset of months from 2049 because we are creating a forecast for 2050, not 2049. We need any 2049 information at all because we are using an anticipatory action framework, which involves studying how climate trends prior to 2050 influence child malnutrition in 2050.

First, we select the months we need from 2049, which are stored in columns 21:23 in this example, and join them to the 2050 data. Then, as we mentioned earlier, we have renamed our DHS cluster variable to ID, so we rename it here as well to streamline any further merging.

#make a subset of the 2049 data since we only need a few months from this year
ken_Tmax_585_2049<-select(ken_Tmax_585_2049_spatial_mean_sf, 21:23)

climatescenariodata<-st_join(ken_Tmax_585_2050_spatial_mean_sf, ken_Tmax_585_2049)

#rename DHS cluster to ID for merging

climatescenariodata<-climatescenariodata%>%
  rename('ID'= 'DHSCLUST')

Now we will need to create a dataframe with the same structure as the one we used to fit our model. We will need to reorganize our climate scenario data into averages of the 1-3 months prior to the month the DHS survey was administered. We can do this by calculating the survey month from our century-month-code (CMC) variable, where CMC = 12 × (year − 1900) + month.


#use the century-month code formula to calculate the month

tempandDHSwithsurveyfilter$surveymonth<-(tempandDHSwithsurveyfilter$surveydate-(12*(2022-1900)))
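As a quick sanity check on this arithmetic, here is how a single CMC value decodes back into a calendar year and month; CMC 1468, one of the survey dates we will see below, decodes to April 2022:


#decode one CMC value back into year and month

cmc <- 1468
year <- 1900 + (cmc - 1) %/% 12
month <- cmc - 12 * (year - 1900)
c(year = year, month = month)
#>  year month 
#>  2022     4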

Next, we will create a new dataframe for the year 2050 by manipulating our original dataframe. We will hold all of our DHS variables constant, meaning we will use the same values as in the original dataframe.


#make a subset of the dataframe with predictors for analysis

build2050frame<-select(tempandDHSwithsurveyfilter, caseid, birthorder, ID, urban, married, curchildage, childsex, educationlevel, AGE, floor)

#add value for year as 2050 and value for outcome as NA

build2050frame$year <- rep(2050, nrow(build2050frame))

build2050frame$w4age <- rep(NA, nrow(build2050frame))

We will, however, swap our historical climate information for the climate scenario information.

Next, we need to merge our climate scenario information with this new dataframe for 2050. The climate scenario data is integral to generating a forecast: it is the only predictor that differs from the original dataframe, since the DHS covariates are held constant and the outcome is set to NA.

First, we select a subset of our climate scenario data that only includes the year-months that we need. In this example, these variables are in columns 21:35.


climatescenarioselect<-select(climatescenariodata, ID, 21:35)

#drop geometry 
climatescenarioselect_df <- climatescenarioselect%>% 
  st_drop_geometry()

At this point, we have built a dataframe that contains the DHS cluster column and the projected average monthly maximum temperature for select months in 2049 and 2050. We now have to restructure this into the 1-3-months-prior-to-the-survey-date format, as we did above with the historical information.

We can start by using pivot_longer() to put the projected temperature data into long format. In this example, we create a scenario_DATE variable which stores the year and month of each projection. Then we can calculate the CMC for this variable.

#make data long

climatescenario_long <- climatescenarioselect_df%>%
  pivot_longer(
    cols = -ID, # Do not pivot the ID col
    names_to = "scenario_DATE", # Rename output columns
    values_to = "AV_TEMP"
  ) %>%
  mutate(
    scenario_DATE= str_replace(scenario_DATE, "ym_", ""),
    scenario_DATE = ym(scenario_DATE))


#calculate the CMC

climatescenario_long<-climatescenario_long%>%
  mutate(scenarioCMC=(year(scenario_DATE) - 1900) * 12 + month(scenario_DATE))

Now we will change our DHS survey date variable to make it as if the survey were collected in 2050. We can do this by editing the CMC to match 2050. First, we extract the exact survey dates from our dataset; this variable is in CMC format.


#first see what the unique values of the survey date are

unique(tempandDHSwithsurveyfilter$surveydate)
#> [1] 1468 1469 1470 1471 1467 1466

#another way to check this is by creating a month variable

#calculate the month from  the cmc

tempandDHSwithsurveyfilter<-tempandDHSwithsurveyfilter%>%
  mutate(month= surveydate- 12 *(2022-1900))

unique(tempandDHSwithsurveyfilter$month)
#> [1] 4 5 6 7 3 2

Now that we know the exact survey months, we can write code to create a ‘future’ survey date.


#now use case_when to get the CMC for these specific months in 2049-2050

tempandDHSwithsurveyfilter<-tempandDHSwithsurveyfilter%>%
  mutate(futuresurveydate= case_when(month==2~ 1802,
                          month==3~ 1803,
                          month==4~ 1804,
                          month==5~ 1805,
                          month==6~ 1806,
                          month==7~ 1807))
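Since the CMC is just 12 × (year − 1900) + month, an equivalent approach that avoids hard-coding each month would be:


#equivalent: compute the 2050 CMC directly from the survey month

tempandDHSwithsurveyfilter<-tempandDHSwithsurveyfilter%>%
  mutate(futuresurveydate= 12 * (2050 - 1900) + month)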

Now that we have a future survey date in our dataset, we can lag our climate scenario information.


#merge with DHS 
climatescenarioreframe <- merge(tempandDHSwithsurveyfilter, climatescenario_long, by="ID")

#create new variables for 1-3 months before future survey date


climatescenarioreframe$prefuturesurveytemp1<-ifelse(climatescenarioreframe$scenarioCMC== (climatescenarioreframe$futuresurveydate-1), climatescenarioreframe$AV_TEMP, NA)


climatescenarioreframe$prefuturesurveytemp2<-ifelse(climatescenarioreframe$scenarioCMC== (climatescenarioreframe$futuresurveydate-2), climatescenarioreframe$AV_TEMP, NA)


climatescenarioreframe$prefuturesurveytemp3<-ifelse(climatescenarioreframe$scenarioCMC== (climatescenarioreframe$futuresurveydate-3), climatescenarioreframe$AV_TEMP, NA)

Now we will restructure the dataset one last time and merge it with the dataframe for 2050 that we have created.

#pivot long

climatescenarioreframe2<-climatescenarioreframe %>%
  pivot_longer(
    cols = contains("futuresurveytemp"),
    names_to = "cmcinreftosurvey",
    names_prefix = "temp",
    values_to = "temperature",
    values_drop_na = TRUE)

#pivot wider
scenario_wide2 <- climatescenarioreframe2%>%
  pivot_wider(id_cols = c(caseid, ID, birthorder), names_from = cmcinreftosurvey, values_from = temperature)


#merge with 2050 df 
df_2050_scenariodata<- left_join(build2050frame, scenario_wide2, by= c('caseid', 'birthorder', 'ID'))

Next, since we fit our model using the average maximum temperature for the 1-3 months prior to the survey, we need to create the same average variable from our climate scenario data.

#calculate the average temperature for 1-3 months prior to survey

df_2050_scenariodata$avtemppresurvey<-((df_2050_scenariodata$prefuturesurveytemp1+ df_2050_scenariodata$prefuturesurveytemp2+ df_2050_scenariodata$prefuturesurveytemp3)/3)

How Much Hotter Is 2050 than 2022?

Before we generate our weight-for-age forecasts, let’s visualize how much hotter the temperature projected for 2050 is than the temperature observed in 2022. For the sake of comparison, let’s visualize the difference between the average monthly maximum temperature in January 2022 and in January 2050.


#2022
Jan2022<-ken_chirts_long_filter%>%
  filter(CHIRTSCMC==1465)%>%
  rename('AV_TEMP_2022'='AV_TEMP')

#2050
Jan2050<-climatescenario_long%>%
  filter(scenarioCMC==1801)%>%
  rename('AV_TEMP_2050'='AV_TEMP')

#merge them 

temptogether<-left_join(Jan2022, Jan2050)
#> Joining with `by = join_by(ID)`

#add the geometry back in

justgps<-select(ken_Tmax_585_2049_spatial_mean_sf, ID=DHSCLUST, geometry)
 
temp20222050wgeo<-left_join(temptogether, justgps)
#> Joining with `by = join_by(ID)`

#convert to spatial frame (overwriting the data frame so the delta below is added to the sf object)

temp20222050wgeo<-st_sf(temp20222050wgeo)

#calculate the difference between them

temp20222050wgeo$delta<-temp20222050wgeo$AV_TEMP_2050-temp20222050wgeo$AV_TEMP_2022

Next, we read in an administrative boundary file for Kenya and map the delta using ggplot() together with a package inspired by the filmmaker Wes Anderson, which provides color palettes reminiscent of the colors you see in his films. Read more about this package here.

Code
#read in Kenya borders

ken_borders<-st_read("data/geo_ke1989_2014/geo_ke1989_2014.shp")
#> Reading layer `geo_ke1989_2014' from data source 
#>   `C:\Users\Rebecca\Documents\Projects\dhs-research-hub\posts\2025-10-24-forecasting-pt4\data\geo_ke1989_2014\geo_ke1989_2014.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 9 features and 3 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 33.90983 ymin: -4.680056 xmax: 41.90684 ymax: 5.033421
#> Geodetic CRS:  WGS 84


# Load the wesanderson package 
library(wesanderson)
#> Warning: package 'wesanderson' was built under R version 4.5.1


# Create a continuous gradient from Zissou1, a reference to 'The Life Aquatic'
gb_palette <- colorRampPalette(wesanderson::wes_palette("Zissou1", type = "continuous"))(100)

ggplot(temp20222050wgeo) +
  geom_sf(aes(geometry = geometry, color = delta), size = 5, alpha = 0.8) +
  geom_sf(data = ken_borders, color = "black", fill = NA, linewidth = 0.4) +
  scale_color_gradientn(colors = gb_palette) +
  labs(
    title = "Delta between Monthly Average Maximum Temperature \nPredicted in January 2050 and Observed in January 2022",
    color = "Delta (°C)"
  ) +
  theme_minimal(base_size = 20)+
theme(
  plot.title = element_text(size = 14)
)

Generate the Forecasts

Now that we have our dataframe for 2050, we can generate fitted values of our outcome, weight-for-age, for 2050. We use the predict() function to apply the model we built earlier to the new dataframe, in which we substituted climate scenario data for our historical climate data.


#generate the predictions

forecasts_2050 <-predict(wastingmlm, newdata = df_2050_scenariodata, allow.new.levels = TRUE)
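If you wanted a quantity closer to a count of affected children for planning purposes, a hypothetical extension (the names here are ours; −2 SD is the conventional weight-for-age cutoff for underweight) is to count, per DHS cluster, how many sampled children are forecast to fall below that threshold:


#hypothetical extension: per-cluster counts of children forecast below -2 SD

underweight_2050 <- df_2050_scenariodata %>%
  mutate(pred_w4age = forecasts_2050) %>%
  group_by(ID) %>%
  summarise(n_below_cutoff = sum(pred_w4age < -2, na.rm = TRUE))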

Comparing our Observed and Forecasted Values

Now let’s compare our forecasted weight-for-age z-scores for 2050 under SSP5-8.5 with our observed weight-for-age z-scores from 2022.

To visualize the differences, we will place the observed 2022 values alongside the values predicted for 2050, given the SSP5-8.5 assumptions.


# Combine observed and predicted values (this relies on df_2050_scenariodata
# preserving the row order of tempandDHSwithsurveyfilter)
forecasts_and_observed <- data.frame(Predictedw4age=forecasts_2050, caseid=df_2050_scenariodata$caseid, ID=df_2050_scenariodata$ID, observedw4age=tempandDHSwithsurveyfilter$w4age)

Now let’s plot the distribution of our observed and predicted weight-for-age scores.

Code
# Reshape to long format
forecasts_and_observed_long <- forecasts_and_observed %>%
  pivot_longer(cols = c(Predictedw4age, observedw4age), names_to = "predicted_observed", values_to = "value")


forecasts_and_observed_long <- forecasts_and_observed_long %>%
  filter(!is.na(value))

# Histogram with facet_wrap
ggplot(forecasts_and_observed_long, aes(x = value)) +
  geom_histogram(binwidth = 0.5, fill = "skyblue", color = "white") +
  facet_wrap(~ predicted_observed, scales = "fixed", labeller = labeller(predicted_observed = c(
    "observedw4age" = "Observed values, 2022",
    "Predictedw4age" = "Forecasted values, 2050, SSP5-8.5"
  ))) +
  labs(title = "Distribution of weight-for-age z-scores",
       x = "Weight-for-age z-score",
       y = "Count") +
  theme_minimal(base_size = 20) +
  theme(
    plot.title = element_text(size = 14)
  )

Our forecasts are concentrated mainly below zero, meaning that our forecasted sample has a higher concentration of low weight-for-age z-scores than our observed sample from 2022.

Our historical model showed that a higher monthly average maximum temperature per DHS cluster in the 1-3 months before the survey is associated with a lower weight-for-age z-score. In addition, the projected temperature under SSP5-8.5 in 2050 is higher, on average, than the observed temperature in 2022. It is therefore reasonable that the forecasted weight-for-age values would be more negative than those we observed.

Conclusion

This post is the last of our series on climate and population projections. Together, the posts in this series introduced the proper terminology for future estimation, demonstrated how to carry out demographic methods like calculating vital rates and population projections, and showed how climate data can be integrated into these workflows. This series used data from the Kenya 2022 DHS; however, these methods can be applied in other regions with other data sources as well. In this final post, we found that the child population surveyed by the Kenya 2022 DHS would experience lower weight-for-age z-scores on average if they were surveyed in 2050, given SSP5-8.5 assumptions. This information can be used to mitigate the future risks of increased child malnutrition as temperatures continue to rise in Kenya.

Looking ahead

Our next blog post will be the second in our series on dietary diversity, a topic with clear implications for child health. When a child has regular access to nutritious foods, their risk of experiencing indicators of malnutrition, such as wasting or stunting, is greatly reduced.