Process Control and Optimization Consortium

 Updated: 06/27/05 06:19 PM     

 

Multivariate Outlier Detection Using Robust Statistics 

Authors:

K. J. Tvarlapati (1), K. A. Hoo (2), M. J. Piovoso (3), and R. Hajare (4)
(1) Department of Chemical Engineering, University of South Carolina, Columbia, SC
(2) Department of Chemical Engineering, Texas Tech University, Lubbock, TX
(3) Graduate School, Penn State University, Malvern, PA
(4) ExxonMobil, Houston, TX

Abstract

Robust multivariate methods for dealing with problems caused by outliers in the data are essential, especially when process data are used to validate mechanistic models, to develop regression models, and in applications such as controller design and process monitoring. Gross outliers are easily detected by simple methods such as range checking; however, a multivariate outlier is very difficult to discern, and techniques that rely on data to generate empirical models may produce erroneous results if such outliers go undetected.
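To illustrate why range checking can miss a multivariate outlier, the following minimal sketch uses two strongly correlated variables and a suspect point that lies inside the range of each variable but violates their correlation. The data, the function name mahalanobis_sq_2d, and the use of the squared Mahalanobis distance as the multivariate measure are assumptions made for this illustration; they are not taken from the paper.

```python
import statistics


def mahalanobis_sq_2d(point, xs, ys):
    """Squared Mahalanobis distance of `point` from the sample (xs, ys)."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^T S^{-1} d written out for the 2x2 case.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical reference data: y follows x closely.
xs = [float(i) for i in range(11)]
ys = [x + (0.2 if i % 2 == 0 else -0.2) for i, x in enumerate(xs)]

suspect = (8.0, 2.0)   # breaks the x-y relationship
typical = (5.0, 5.0)   # consistent with the x-y relationship

# Univariate range check: the suspect point passes on both variables.
in_range = (min(xs) <= suspect[0] <= max(xs)) and (min(ys) <= suspect[1] <= max(ys))

d2_suspect = mahalanobis_sq_2d(suspect, xs, ys)
d2_typical = mahalanobis_sq_2d(typical, xs, ys)
print(in_range, d2_suspect, d2_typical)
```

The suspect point passes the range check, yet its distance from the bulk of the data, measured relative to the covariance structure, is far larger than that of a typical point.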

This work proposes a methodology for multivariate outlier replacement in the score space generated by Principal Component Analysis (PCA). The objective is to obtain an accurate estimate of the covariance matrix of the data so that a PCA model can be developed and then used for monitoring and for fault detection and identification. The methodology uses winsorization to estimate the mean (location) and the standard deviation (scale) robustly and iteratively, yielding a robust set of data. The paper develops the approach, discusses the concepts of robust statistics and winsorization, and presents the procedures for robust multivariate outlier filtering. One simulated example and two industrial examples demonstrate the approach.
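The idea of iterative winsorization can be sketched for a single vector of values, for example the scores on one principal component. This is a generic illustration and not the authors' procedure: the starting values (median and scaled median absolute deviation), the cutoff k = 1.5, the convergence tolerance, and the function names are all assumptions made here.

```python
import statistics


def winsorize(values, lo, hi):
    """Replace values outside [lo, hi] with the nearest bound."""
    return [min(max(v, lo), hi) for v in values]


def robust_location_scale(values, k=1.5, max_iter=100, tol=1e-8):
    """Iteratively winsorize and re-estimate location and scale.

    Returns (location, scale, winsorized_values).
    """
    # Assumed starting point: median and scaled MAD.
    loc = statistics.median(values)
    scale = 1.4826 * statistics.median([abs(v - loc) for v in values])
    for _ in range(max_iter):
        clipped = winsorize(values, loc - k * scale, loc + k * scale)
        new_loc = statistics.fmean(clipped)
        new_scale = statistics.stdev(clipped)
        converged = abs(new_loc - loc) < tol and abs(new_scale - scale) < tol
        loc, scale = new_loc, new_scale
        if converged:
            break
    return loc, scale, winsorize(values, loc - k * scale, loc + k * scale)


# Hypothetical scores with one gross outlier (50.0).
scores = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 50.0]
loc, scale, cleaned = robust_location_scale(scores)
print(statistics.fmean(scores), loc, scale)
```

In this sketch the outlier is replaced by a value at the winsorization bound rather than deleted, so the data set keeps its original length. Note that the standard deviation of winsorized data underestimates the scale of the uncontaminated data unless a consistency correction is applied; the sketch omits that correction for brevity.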

Publication: Computers & Chemical Engineering, 26, pp. 17-39, 2002

Corresponding Author:    Karlene A. Hoo

©  Texas Tech University.  All Rights Reserved.