Abstract
Traditional integrated assessment models assume parametric climate damage functions that may miss nonlinearities, heterogeneity, and dynamic effects on investment. This thesis develops a data-driven climate damage function for capital formation by estimating the predictive relationship between climate conditions and future gross fixed capital formation (% GDP) across 125 countries over 1982–2019. We construct a panel dataset by combining daily ERA5 climate reanalysis data (accessed via the Copernicus Climate Data Store API and aggregated to yearly country-level variables including temperature anomalies, extreme heat days, frost days, precipitation, and solar radiation) with economic indicators from the World Bank World Development Indicators and the Penn World Table. A twelve-step preprocessing pipeline handles missing data, multicollinearity, and distributional skewness, and applies within-country demeaning to isolate temporal variation in investment from cross-country structural differences. We compare eleven models spanning four paradigms: linear (OLS, fixed effects), tree-based (XGBoost, Random Forest), feedforward neural networks (MLP), and deep sequence models (LSTM, GRU, TCN), across forecast horizons of 1, 3, 6, 9, and 10 years. All models are validated with expanding-window time-series cross-validation and formally compared using Diebold-Mariano, Model Confidence Set, and Clark-West hypothesis tests. Results reveal horizon-dependent model selection: OLS dominates at short horizons (H = 1, 3), while a Temporal Convolutional Network (TCN) achieves the lowest RMSE at long horizons (H = 9, 10). Clark-West tests show that climate variables significantly improve short-horizon predictions (p < 0.003), but their linear predictive power fades at longer horizons, where sequence models capture the climate signal through temporal patterns.
We quantify a counterfactual climate penalty of -0.50 percentage points at H = 10, with middle-income countries bearing the heaviest burden (-0.69 pp). This middle-income vulnerability is consistent with the dual exposure of transitioning economies to both climate-sensitive primary sectors and capital-intensive industrialization. The long-horizon estimate is model-sensitive, with TCN and LSTM yielding estimates of opposite sign, underscoring the importance of model uncertainty in damage function estimation. These findings suggest that climate damage estimation for investment requires horizon-specific modeling approaches, and that the single parametric damage functions used in current integrated assessment models may inadequately capture the nonlinear and temporal nature of climate impacts on capital formation.
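The abstract names two steps that are easy to misread without an example: within-country demeaning and expanding-window validation at a forecast horizon H. The following is a minimal illustrative sketch of both, not the thesis's actual pipeline. The function names, the `gfcf` column name, the row layout, and the rule that a training year t is usable only once its target year t + H has been observed are all assumptions made for this example.

```python
from collections import defaultdict


def demean_within_country(rows, value_key, group_key="country"):
    """Subtract each country's own sample mean from its observations.

    `rows` is a list of dicts; the key names are hypothetical.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    demeaned = []
    for row in rows:
        country_mean = sums[row[group_key]] / counts[row[group_key]]
        demeaned.append({**row, value_key: row[value_key] - country_mean})
    return demeaned


def expanding_window_splits(years, horizon, min_train_years):
    """Yield (train_years, origin) pairs for a given forecast horizon.

    Assumed rule: a feature year t may be used for training at forecast
    origin `origin` only if its target year t + horizon is already
    observed (t + horizon <= origin). The training window grows by one
    year per fold; the fold's test target is year origin + horizon.
    """
    ordered = sorted(set(years))
    last_year = ordered[-1]
    for origin in ordered:
        if origin + horizon > last_year:
            break  # the test target would fall outside the sample
        train_years = [y for y in ordered if y + horizon <= origin]
        if len(train_years) >= min_train_years:
            yield train_years, origin


# Toy panel (made-up numbers, for illustration only).
panel = [
    {"country": "A", "year": 2000, "gfcf": 20.0},
    {"country": "A", "year": 2001, "gfcf": 24.0},
    {"country": "B", "year": 2000, "gfcf": 30.0},
]
demeaned_panel = demean_within_country(panel, "gfcf")
folds = list(expanding_window_splits(range(2000, 2010), horizon=3, min_train_years=2))
```

In a real pipeline the country means would presumably be computed on the training years of each fold only, so that no information from the test period leaks into the demeaned features; the sketch computes them over all rows for brevity.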
Advisor
Naseef Mansoor
Committee Member
Michael Spencer
Committee Member
Christophe Veltsos
Date of Degree
2026
Language
English
Document Type
Thesis
Degree
Master of Science (MS)
Program of Study
Computer Information Science
Department
Computer Information Science
College
Science, Engineering and Technology
Recommended Citation
Wicaksono, P. (2026). Data-driven climate damage functions for capital formation: Estimating the climate penalty using machine learning [Master’s thesis, Minnesota State University, Mankato]. Cornerstone: A Collection of Scholarly and Creative Works for Minnesota State University, Mankato. https://cornerstone.lib.mnsu.edu/etds/1606/
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.