Scalable robust principal component analysis using Grassmann averages
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Scalable robust principal component analysis using Grassmann averages. / Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 11, 2016, pp. 2298-2311.
RIS
TY - JOUR
T1 - Scalable robust principal component analysis using Grassmann averages
AU - Hauberg, Søren
AU - Feragen, Aasa
AU - Enficiaud, Raffi
AU - Black, Michael J.
PY - 2016
Y1 - 2016
N2 - In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
AB - In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
KW - Gaussian processes
KW - data handling
KW - image processing
KW - principal component analysis
KW - GA
KW - Gaussian data
KW - Grassmann Average
KW - Grassmann manifold
KW - RGA
KW - Robust Grassmann Average
KW - TGA
KW - Trimmed Grassmann average
KW - average subspace
KW - computer vision
KW - data size
KW - data verification
KW - grassmann averages
KW - one dimensional subspace
KW - pixel outliers
KW - robust PCA
KW - scalable robust principal component analysis
KW - simple algorithm
KW - zero mean dataset
KW - Approximation methods
KW - Complexity theory
KW - Computer vision
KW - Estimation
KW - Manifolds
KW - Principal component analysis
KW - Robustness
KW - Dimensionality reduction
KW - robust principal component analysis
KW - subspace estimation
U2 - 10.1109/TPAMI.2015.2511743
DO - 10.1109/TPAMI.2015.2511743
M3 - Journal article
C2 - 26731634
VL - 38
SP - 2298
EP - 2311
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN - 0162-8828
IS - 11
ER -
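The abstract describes the core idea of the Grassmann Average (GA): each zero-mean observation spans a one-dimensional subspace, and the average of these subspaces approximates the leading principal component, computed by iteratively sign-aligning the observations with the current estimate. The sketch below is a minimal, unofficial illustration of that iteration in NumPy; the function name, convergence tolerance, and weighting-by-norm choice are assumptions, not the authors' reference implementation.

```python
import numpy as np

def grassmann_average(X, n_iter=100, tol=1e-10, seed=0):
    """Illustrative sketch of the Grassmann Average of a zero-mean dataset.

    Each row x_i of X spans a 1-D subspace. We average these subspaces by
    repeatedly flipping each unit direction to agree in sign with the
    current estimate q, then re-averaging (weighted by the row norms,
    an assumed weighting) and renormalizing.
    """
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(X, axis=1)
    U = X / norms[:, None]          # unit directions (points on the Grassmannian)
    q = rng.standard_normal(X.shape[1])
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        signs = np.sign(U @ q)       # align each subspace representative with q
        signs[signs == 0] = 1.0
        q_new = (signs * norms) @ U  # weighted average of aligned directions
        q_new /= np.linalg.norm(q_new)
        if 1.0 - abs(q_new @ q) < tol:  # converged up to sign
            return q_new
        q = q_new
    return q
```

For Gaussian data with one dominant variance direction, the returned unit vector should closely match the leading principal component, consistent with the equivalence the abstract states; the trimmed variant (TGA) would replace the weighted mean with a per-coordinate trimmed mean to gain robustness to pixel outliers.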