Paper title
Predicting conditional probability distributions of redshifts of Active Galactic Nuclei using Hierarchical Correlation Reconstruction
Paper authors
Abstract
While prediction usually focuses on point values, real data often only allows predicting conditional probability distributions, with capabilities bounded by the conditional entropy $H(Y|X)$. If uncertainty is additionally estimated, the predicted value can be treated as the center of a Gaussian or Laplace distribution, an idealization that can be far from the complex conditional distributions of real data. This article applies the Hierarchical Correlation Reconstruction (HCR) approach to inexpensively predict quite complex (e.g. multimodal) conditional probability distributions by independent MSE estimation of multiple moment-like parameters, from which the conditional distribution can be reconstructed. Using linear regression for this purpose gives interpretable models, with coefficients describing the contributions of features to conditional moments. This article extends the original approach, especially by using Canonical Correlation Analysis (CCA) for feature optimization and l1 "lasso" regularization, focusing on the practical problem of predicting the redshift of Active Galactic Nuclei (AGN) based on the Fourth Fermi-LAT Data Release 2 (4LAC) dataset.
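
The idea of reconstructing a conditional density from independently regressed moment-like parameters can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the target is rank-normalized to [0, 1], the density is expanded in an orthonormal shifted-Legendre polynomial basis, each coefficient is fitted by ordinary least squares (no CCA or lasso), and the data are synthetic. The function names `legendre_basis`, `fit_hcr` and `predict_density` are hypothetical.

```python
import numpy as np

def legendre_basis(u):
    """Orthonormal shifted Legendre polynomials f_1..f_4 on [0, 1] (f_0 = 1 is implicit)."""
    u = np.asarray(u, dtype=float)
    return np.stack([
        np.sqrt(3) * (2 * u - 1),
        np.sqrt(5) * (6 * u**2 - 6 * u + 1),
        np.sqrt(7) * (20 * u**3 - 30 * u**2 + 12 * u - 1),
        3.0 * (70 * u**4 - 140 * u**3 + 90 * u**2 - 20 * u + 1),
    ], axis=-1)

def fit_hcr(X, u):
    """Fit each moment-like coefficient a_j(x) ~ E[f_j(u) | x] by separate least squares.

    Returns a (d + 1, m) matrix: one column of linear coefficients per basis
    function, with the intercept in row 0.
    """
    A = np.column_stack([np.ones(len(X)), X])
    F = legendre_basis(u)
    beta, *_ = np.linalg.lstsq(A, F, rcond=None)
    return beta

def predict_density(beta, x, grid, floor=0.1):
    """Reconstruct rho(u | x) = 1 + sum_j a_j(x) f_j(u) on a grid of midpoints in [0, 1]."""
    a = np.concatenate([[1.0], x]) @ beta
    rho = 1.0 + legendre_basis(grid) @ a
    rho = np.maximum(rho, floor)   # the polynomial can go negative; clip to a small floor
    return rho / rho.mean()        # renormalize so the midpoint-rule integral is 1

# Synthetic data (assumed): the first feature drives the target, the second is noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
y = X[:, 0] + 0.5 * rng.normal(size=n)

# Rank-normalize the target to (0, 1) so its marginal distribution is nearly uniform.
u = (np.argsort(np.argsort(y)) + 0.5) / n

beta = fit_hcr(X, u)
grid = (np.arange(200) + 0.5) / 200
rho_low = predict_density(beta, np.array([-1.5, 0.0]), grid)
rho_high = predict_density(beta, np.array([1.5, 0.0]), grid)
```

Each column of `beta` is an independent linear model for one moment-like parameter, which is what makes the coefficients directly readable as feature contributions. In this sketch the predicted density lives on the rank-normalized scale; mapping it back to the original units would require the empirical quantile function of the target.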
