ISPRS Journal of Photogrammetry and Remote Sensing

The Naive Overfitting Index Selection (NOIS): A new method to optimize model complexity for hyperspectral data

Abstract

The growing number of narrow spectral bands in hyperspectral remote sensing improves the capacity to describe and predict biological processes in ecosystems. However, it also makes it challenging to fit empirical models to such high-dimensional data, which often contain correlated and noisy predictors. Because the sample sizes available to train and validate empirical models do not seem to be increasing at the same rate, overfitting has become a serious concern. Overly complex models overfit by capturing more than the underlying relationship and by fitting random noise in the data. Many regression techniques claim to overcome these problems by constraining complexity through different strategies, such as limiting the number of terms in the model, creating latent variables, or shrinking parameter coefficients. This paper proposes a new method, named Naive Overfitting Index Selection (NOIS), which uses artificially generated spectra to quantify relative model overfitting and to select an optimal model complexity supported by the data. The robustness of the new method is assessed by comparing it to traditional model selection based on cross-validation. The optimal model complexity is determined for seven regression techniques, including partial least squares regression, support vector machine, artificial neural network, and tree-based regressions, using five hyperspectral datasets. The NOIS method selects less complex models, which achieve accuracies similar to those selected by cross-validation. The NOIS method reduces the chance of overfitting, thereby avoiding models whose accurate predictions are valid only for the data used and which are too complex to support inferences about the underlying process. (C) 2017 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
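The general idea in the abstract (fit models of increasing complexity to artificial spectra that carry no information about the response, and treat any apparent fit as overfitting) can be illustrated with a minimal sketch. The abstract does not say how the artificial spectra are generated or how the index is computed, so everything here is an assumption made for illustration: the random-walk spectra, principal component regression as the model, the index defined as one minus the ratio of training RMSE to the intercept-only RMSE, and the 0.1 tolerance. The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def make_artificial_spectra(n_samples, n_bands, rng):
    # Assumption: smooth, band-correlated curves built as cumulative sums
    # of Gaussian noise. They are unrelated to any response variable.
    steps = rng.normal(size=(n_samples, n_bands))
    return np.cumsum(steps, axis=1)

def pcr_train_rmse(X, y, k):
    # Principal component regression with k components; returns the
    # training RMSE (no held-out data is involved).
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]
    coef, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    resid = yc - scores @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def overfitting_index(y, complexities, n_bands, n_repeats=20, seed=0):
    # Illustrative index (an assumption, not the paper's definition):
    # 1 - RMSE_train / RMSE_null, averaged over artificial datasets.
    # Since the spectra are pure noise, any value above 0 is overfitting.
    rng = np.random.default_rng(seed)
    null_rmse = float(np.std(y))  # RMSE of the intercept-only model
    index = np.zeros(len(complexities))
    for _ in range(n_repeats):
        X_art = make_artificial_spectra(len(y), n_bands, rng)
        for i, k in enumerate(complexities):
            index[i] += 1.0 - pcr_train_rmse(X_art, y, k) / null_rmse
    return index / n_repeats

def select_complexity(index, complexities, tolerance=0.1):
    # Keep the most complex setting whose index stays within the
    # (hypothetical) tolerance; fall back to the simplest setting.
    allowed = [k for k, v in zip(complexities, index) if v <= tolerance]
    return max(allowed) if allowed else min(complexities)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.normal(size=40)              # stand-in response values
    ks = list(range(1, 16))              # candidate numbers of components
    idx = overfitting_index(y, ks, n_bands=200)
    print(select_complexity(idx, ks))
```

In this sketch the index can only grow as components are added, because each larger model nests the smaller one, so the tolerance acts as a cap on complexity that depends on the sample size rather than on the observed spectra.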
