Selective Imputation for Multivariate Time Series Datasets with Missing Values
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Selective Imputation for Multivariate Time Series Datasets with Missing Values. / Blazquez-Garcia, Ane; Wickstrom, Kristoffer; Yu, Shujian; Mikalsen, Karl Oyvind; Boubekki, Ahcene; Conde, Angel; Mori, Usue; Jenssen, Robert; Lozano, Jose A.
In: IEEE Transactions on Knowledge and Data Engineering, Vol. 35, No. 9, 2023, pp. 9490-9501.
RIS
TY - JOUR
T1 - Selective Imputation for Multivariate Time Series Datasets with Missing Values
AU - Blazquez-Garcia, Ane
AU - Wickstrom, Kristoffer
AU - Yu, Shujian
AU - Mikalsen, Karl Oyvind
AU - Boubekki, Ahcene
AU - Conde, Angel
AU - Mori, Usue
AU - Jenssen, Robert
AU - Lozano, Jose A.
N1 - Publisher Copyright: © 1989-2012 IEEE.
PY - 2023
Y1 - 2023
AB - Multivariate time series often contain missing values for reasons such as failures in data collection mechanisms. Since these missing values can complicate the analysis of time series data, imputation techniques are typically used to deal with this issue. However, the quality of the imputation directly affects the performance of downstream tasks. In this paper, we propose a selective imputation method that identifies a subset of timesteps with missing values to impute in a multivariate time series dataset. This selection, which will result in shorter and simpler time series, is based on both reducing the uncertainty of the imputations and representing the original time series as well as possible. In particular, the method uses multi-objective optimization techniques to select the optimal set of points, and in this selection process, we leverage the beneficial properties of the Multi-task Gaussian Process (MGP). The method is applied to different datasets to analyze the quality of the imputations and the performance obtained in downstream tasks, such as classification or anomaly detection. The results show that much shorter and simpler time series are able to maintain or even improve both the quality of the imputations and the performance of the downstream tasks.
KW - imputation
KW - irregular sampling
KW - missing data
KW - Multivariate time series
UR - http://www.scopus.com/inward/record.url?scp=85148442415&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2023.3240858
DO - 10.1109/TKDE.2023.3240858
M3 - Journal article
AN - SCOPUS:85148442415
VL - 35
SP - 9490
EP - 9501
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
IS - 9
ER -
ID: 364497888