Selective Imputation for Multivariate Time Series Datasets with Missing Values

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

  • Ane Blazquez-Garcia
  • Kristoffer Wickstrom
  • Shujian Yu
  • Karl Oyvind Mikalsen
  • Ahcene Boubekki
  • Angel Conde
  • Usue Mori
  • Robert Jenssen
  • Jose A. Lozano

Multivariate time series often contain missing values for reasons such as failures in data collection mechanisms. Since these missing values can complicate the analysis of time series data, imputation techniques are typically used to deal with this issue. However, the quality of the imputation directly affects the performance of downstream tasks. In this paper, we propose a selective imputation method that identifies a subset of timesteps with missing values to impute in a multivariate time series dataset. This selection, which results in shorter and simpler time series, is based on both reducing the uncertainty of the imputations and representing the original time series as well as possible. In particular, the method uses multi-objective optimization techniques to select the optimal set of points, and in this selection process, we leverage the beneficial properties of the Multi-task Gaussian Process (MGP). The method is applied to different datasets to analyze the quality of the imputations and the performance obtained in downstream tasks, such as classification or anomaly detection. The results show that much shorter and simpler time series can maintain or even improve both the quality of the imputations and the performance of the downstream tasks.
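
The trade-off described in the abstract can be illustrated with a small toy sketch. This is not the authors' method: it swaps the Multi-task Gaussian Process for a plain single-output Gaussian process with a squared-exponential kernel, uses the number of dropped timesteps as a crude stand-in for how well the original series is represented, and enumerates all candidate subsets instead of running a multi-objective optimizer. The function names, the kernel settings, and the toy time points are all assumptions made for illustration.

```python
import itertools
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of time points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def posterior_variance(t_obs, t_query, length_scale=1.0, noise=1e-3):
    """GP posterior variance at t_query given observations at t_obs.

    Assumes a unit-variance prior; a single-output GP stands in for the MGP.
    """
    K = rbf_kernel(t_obs, t_obs, length_scale) + noise * np.eye(len(t_obs))
    Ks = rbf_kernel(t_query, t_obs, length_scale)
    solved = np.linalg.solve(K, Ks.T)          # K^{-1} Ks^T
    return 1.0 - np.sum(Ks * solved.T, axis=1)

def pareto_front(objectives):
    """Indices of non-dominated rows when every objective is minimized."""
    objectives = np.asarray(objectives, dtype=float)
    keep = []
    for i, p in enumerate(objectives):
        no_worse = np.all(objectives <= p, axis=1)
        strictly_better = np.any(objectives < p, axis=1)
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return keep

# Toy series (hypothetical): values observed at these times, missing at the others.
t_obs = np.array([0.0, 1.0, 2.0, 6.0, 7.0])
t_missing = np.array([3.0, 4.0, 5.0])
var_missing = posterior_variance(t_obs, t_missing)

# Candidate selections: every non-empty subset of the missing timesteps.
candidates = [
    subset
    for size in range(1, len(t_missing) + 1)
    for subset in itertools.combinations(range(len(t_missing)), size)
]

# Objective 1: mean imputation uncertainty over the selected timesteps.
# Objective 2: number of missing timesteps dropped (proxy for representation loss).
objectives = [
    (var_missing[list(s)].mean(), len(t_missing) - len(s)) for s in candidates
]

front = pareto_front(objectives)
for i in front:
    selected_times = t_missing[list(candidates[i])]
    print(selected_times, objectives[i])
```

Each printed row is one non-dominated selection: no other candidate has both lower mean uncertainty and fewer dropped timesteps. Exhaustive enumeration is only feasible for a handful of missing timesteps; realistic datasets would need a proper multi-objective search.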

Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 9
Pages (from-to): 9490-9501
Number of pages: 12
ISSN: 1041-4347
DOI
Status: Published - 2023

Bibliographic note

Publisher Copyright:
© 1989-2012 IEEE.

ID: 364497888