How to Deal With Missing Financial Data
The problem of missing financial data is widespread yet often overlooked. A recent research paper by Bryzgalova et al. (2022) offers an interesting insight into the structure of missing financial data. Examining a dataset of the 45 most popular characteristics in asset pricing, the authors find that missing data is frequent for almost every characteristic and affects all kinds of firms: small, large, young, mature, profitable, or in financial distress. Requiring multiple characteristics to be observed simultaneously makes the problem even worse. Moreover, the data is not missing at random; missing values cluster both cross-sectionally and over time. This may lead to selection bias and makes the most common ad-hoc approaches, such as imputing the median, invalid. In light of these findings, the authors propose a novel imputation method based on Principal Component Analysis (PCA).
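To make the contrast between median imputation and a PCA-style approach more concrete, here is a minimal Python sketch. It is not the procedure from Bryzgalova et al. (2022). It uses a generic iterative low-rank (truncated SVD) imputation as a stand-in and assumes a single firms-by-characteristics matrix with NaN marking missing entries. The function names `median_impute` and `pca_impute` and the parameters `n_components`, `n_iter`, and `tol` are hypothetical and chosen only for illustration.

```python
import numpy as np


def median_impute(X):
    """Ad-hoc baseline: fill each column's NaNs with that column's observed median."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    medians = np.nanmedian(X, axis=0)      # per-characteristic median, ignoring NaNs
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = medians[cols]
    return filled


def pca_impute(X, n_components=2, n_iter=200, tol=1e-8):
    """Generic iterative PCA imputation (an illustrative assumption, not the paper's method).

    Start from the median fill, then repeatedly fit a rank-`n_components`
    approximation and overwrite only the missing entries with its values.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = median_impute(X)
    if not missing.any():
        return filled
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        k = n_components
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mean
        change = np.max(np.abs(low_rank[missing] - filled[missing]))
        filled[missing] = low_rank[missing]   # observed entries are never modified
        if change < tol:
            break
    return filled


if __name__ == "__main__":
    # Toy panel: 4 firms x 3 characteristics, two values missing.
    X = np.array([
        [1.0, np.nan, 3.0],
        [2.0, 4.0, 6.0],
        [3.0, 6.0, np.nan],
        [4.0, 8.0, 12.0],
    ])
    print(median_impute(X))
    print(pca_impute(X, n_components=1))
```

The median fill ignores how characteristics move together, whereas the low-rank fill borrows information from the other characteristics observed for the same firm. Neither sketch addresses the clustering of missing values over time that the paper highlights.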
https://quantpedia.com/how-to-deal-with-missing-financial-data/