Sacrificing information for the greater good: how to select photometric bands for optimal accuracy
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Sacrificing information for the greater good: how to select photometric bands for optimal accuracy. / Stensbo-Smidt, Kristoffer; Gieseke, Fabian Cristian; Igel, Christian; Zirm, Andrew Wasmuth; Pedersen, Kim Steenstrup.
In: Monthly Notices of the Royal Astronomical Society, Vol. 464, No. 3, 2017, p. 2577-2596.
RIS
TY - JOUR
T1 - Sacrificing information for the greater good
T2 - how to select photometric bands for optimal accuracy
AU - Stensbo-Smidt, Kristoffer
AU - Gieseke, Fabian Cristian
AU - Igel, Christian
AU - Zirm, Andrew Wasmuth
AU - Pedersen, Kim Steenstrup
PY - 2017
Y1 - 2017
AB - Large-scale surveys make huge amounts of photometric data available. Because of the sheer amount of objects, spectral data cannot be obtained for all of them. Therefore it is important to devise techniques for reliably estimating physical properties of objects from photometric information alone. These estimates are needed to automatically identify interesting objects worth a follow-up investigation as well as to produce the required data for a statistical analysis of the space covered by a survey. We argue that machine learning techniques are suitable to compute these estimates accurately and efficiently. This study promotes a feature selection algorithm, which selects the most informative magnitudes and colours for a given task of estimating physical quantities from photometric data alone. Using k nearest neighbours regression, a well-known non-parametric machine learning method, we show that using the found features significantly increases the accuracy of the estimations compared to using standard features and standard methods. We illustrate the usefulness of the approach by estimating specific star formation rates (sSFRs) and redshifts (photo-z's) using only the broad-band photometry from the Sloan Digital Sky Survey (SDSS). For estimating sSFRs, we demonstrate that our method produces better estimates than traditional spectral energy distribution (SED) fitting. For estimating photo-z's, we show that our method produces more accurate photo-z's than the method employed by SDSS. The study highlights the general importance of performing proper model selection to improve the results of machine learning systems and how feature selection can provide insights into the predictive relevance of particular input features.
U2 - 10.1093/mnras/stw2476
DO - 10.1093/mnras/stw2476
M3 - Journal article
VL - 464
SP - 2577
EP - 2596
JO - Royal Astronomical Society. Monthly Notices
JF - Royal Astronomical Society. Monthly Notices
SN - 0035-8711
IS - 3
ER -
ID: 167193838
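
The abstract describes selecting the most informative magnitudes and colours for a k nearest neighbours regressor. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a greedy forward selection scored by leave-one-out error (the abstract does not say which search strategy the paper uses), and it runs on a synthetic catalogue rather than SDSS photometry. All function names (`knn_predict`, `loo_error`, `forward_select`, `make_toy_catalogue`) are hypothetical.

```python
import itertools
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = by_distance[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def loo_error(rows, targets, columns, k):
    """Leave-one-out mean squared error of k-NN using only `columns`."""
    projected = [[row[c] for c in columns] for row in rows]
    total = 0.0
    for i, query in enumerate(projected):
        rest_x = projected[:i] + projected[i + 1:]
        rest_y = targets[:i] + targets[i + 1:]
        total += (knn_predict(rest_x, rest_y, query, k) - targets[i]) ** 2
    return total / len(rows)


def forward_select(rows, targets, n_features, k=3):
    """Greedily add the column that gives the lowest leave-one-out error.

    Assumption: greedy forward search; other search strategies are possible.
    """
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < n_features:
        best = min(
            remaining,
            key=lambda c: loo_error(rows, targets, selected + [c], k),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def make_toy_catalogue(n=60, seed=0):
    """Synthetic stand-in data: 3 magnitudes plus their 3 pairwise colours."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        mags = [rng.uniform(15.0, 22.0) for _ in range(3)]
        colours = [
            mags[i] - mags[j] for i, j in itertools.combinations(range(3), 2)
        ]
        rows.append(mags + colours)
        # Toy target that depends only on the colour m0 - m1.
        targets.append(0.1 * (mags[0] - mags[1]))
    feature_names = ["m0", "m1", "m2", "m0-m1", "m0-m2", "m1-m2"]
    return rows, targets, feature_names


if __name__ == "__main__":
    rows, targets, names = make_toy_catalogue()
    chosen = forward_select(rows, targets, n_features=2)
    print([names[c] for c in chosen])
```

Because the toy target is built from a single colour, the selection should favour that colour over the raw magnitudes, which mirrors the abstract's point that a small, well-chosen feature set can outperform the standard one. Leave-one-out scoring keeps the sketch short; on a real catalogue a held-out validation set or k-fold cross-validation would be a more practical choice.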