Multivariate Outlier Detection Using Robust Statistics
Authors:
K. J. Tvarlapati1, K. A. Hoo2, M. J. Piovoso3, and R. Hajare4
1Department of Chemical Engineering, University of South Carolina, Columbia, SC
2Department of Chemical Engineering, Texas Tech University, Lubbock, TX
3Graduate School, Penn State University, Malvern, PA
4ExxonMobil, Houston, TX
Abstract
Robust multivariate methods for handling outliers in the data are essential, especially
when process data are used to validate mechanistic models, to develop regression models, and in applications such as controller design and process monitoring. Gross outliers are easily detected by
simple methods such as range checking. A multivariate outlier, however, is very difficult to discern, and techniques that rely on
data to generate empirical models may produce erroneous results when such outliers are present.
In this work, a methodology to perform multivariate outlier replacement in the score
space generated by Principal Component Analysis is proposed. The objective is to find an
accurate estimate of the covariance matrix of the data so that a Principal Component Analysis model might be developed that can
then be used for monitoring and fault detection and identification. The methodology uses the concept of winsorization
to provide robust estimates of the mean (location) and the standard deviation (scale) iteratively, yielding a robust set of
data. The paper develops the approach, discusses the concept of robust statistics and winsorization, and presents the procedures
for robust multivariate outlier filtering. One simulated and two industrial examples are provided to demonstrate the approach.
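
The idea of winsorizing in the score space can be illustrated with a minimal sketch. It is not the procedure developed in the paper; it only shows the general pattern under stated assumptions: scores are computed from an SVD-based PCA that keeps all components, the location and scale of each score are estimated by the median and the scaled median absolute deviation (MAD), and scores are clipped at `k = 3` scale units. The function name `winsorize_in_score_space`, the parameters `k`, `max_iter`, and `tol`, and the choice of median/MAD are all hypothetical.

```python
import numpy as np


def winsorize_in_score_space(X, k=3.0, max_iter=50, tol=1e-8):
    """Illustrative sketch: iteratively clip PCA scores at robust limits.

    Assumptions (not taken from the paper): median/MAD as robust location
    and scale, a clipping threshold of k scale units, all components kept.
    Returns the cleaned data, its mean, and its covariance matrix.
    """
    Xw = np.asarray(X, dtype=float).copy()
    for _ in range(max_iter):
        mean = Xw.mean(axis=0)
        Xc = Xw - mean
        # PCA via SVD of the centered data; rows of Vt are the loadings.
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        T = Xc @ Vt.T  # scores, one column per principal component

        # Robust location and scale of each score column.
        loc = np.median(T, axis=0)
        scale = 1.4826 * np.median(np.abs(T - loc), axis=0)
        # A zero MAD would collapse a column, so skip clipping there.
        scale = np.where(scale > 0, scale, np.inf)

        # Winsorize: pull extreme scores back to the limits, keep the rest.
        T_w = np.clip(T, loc - k * scale, loc + k * scale)

        # Map the winsorized scores back to the original variables.
        X_new = mean + T_w @ Vt
        change = np.max(np.abs(X_new - Xw))
        Xw = X_new
        if change < tol:
            break
    return Xw, Xw.mean(axis=0), np.cov(Xw, rowvar=False)


# Usage: two strongly correlated variables plus one multivariate outlier
# that lies inside each variable's range but violates the correlation.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([z, z + 0.1 * rng.normal(size=200)])
X[0] = [2.0, -2.0]
X_clean, mean_clean, cov_clean = winsorize_in_score_space(X)
```

In this sketch the outlying observation is replaced rather than deleted, so the cleaned data set keeps its original size. The covariance of the cleaned data could then serve as the basis for a PCA model.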
Publication: Computers & Chemical Engineering, 26, pp 17-39, 2002