High-dimensional data in economics and their (robust) analysis

Kalina, Jan

doi:10.5937/sjm12-10778

Serbian Journal of Management

2017, vol. 12, iss. 1, pp. 157-169

article language: English
Review Paper

Jan Kalina

Institute of Computer Science, Czech Academy of Sciences & Institute of Information
Theory and Automation, Czech Academy of Sciences, Czech Republic

e-mail: kalina@cs.cas.cz

Abstract

This work is devoted to statistical methods for the analysis of economic data with a large number of variables. It reviews references documenting that such data are increasingly available in various theoretical and applied economic problems and that their analysis can hardly be performed with standard econometric methods. The paper focuses on high-dimensional data, that is, data in which a large number of variables is accompanied by a small number of observations, and gives an overview of recently proposed methods for their analysis in the context of econometrics, particularly in the areas of dimensionality reduction, linear regression and classification analysis. Further, the performance of various methods is illustrated on a publicly available benchmark data set on credit scoring. Unlike in the work of other authors, robust methods designed to be insensitive to the presence of outlying measurements are also used. Their strength is revealed after the original data are artificially contaminated by noise. In addition, the performance of various methods for a prior dimensionality reduction of the data is compared.

This work is licensed under a Creative Commons Attribution 4.0 License.

Keywords

Econometrics, high-dimensional data, dimensionality reduction, linear regression, classification analysis, robustness.
