
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data

Overview
Specialty: Biology
Date: 2017 Dec 23
PMID: 29270197
Citations: 8
Abstract

Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute neighborhood distances between observations and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples become similar, so every sample may look like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble K-nearest neighbor graph (K-NNG) based anomaly detector. Exploiting its capacity for nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset and to represent the data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made jointly by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves detection accuracy and reduces computational complexity.
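The second stage described in the abstract, several KNN-based detectors built on random subsets and combined into one prediction, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the scoring rule or how the detectors are combined, so the mean distance to the k nearest neighbors and the simple average across detectors are assumptions, as are the function names, parameter defaults, and synthetic data. The DAE encoding step is omitted: `Z_train` stands in for nominal data that has already been mapped to the compact subspace.

```python
import numpy as np


def knn_score(train, queries, k):
    """Mean Euclidean distance from each query to its k nearest training points.

    Assumed scoring rule for illustration; larger means more anomalous.
    """
    # Pairwise distances, shape (n_queries, n_train).
    dists = np.linalg.norm(queries[:, None, :] - train[None, :, :], axis=2)
    # Partition so that the k smallest distances occupy the first k columns.
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


def ensemble_knn_scores(train, queries, n_detectors=10, subset_size=64, k=5, seed=0):
    """Average the KNN scores of detectors built on random subsets of `train`.

    Averaging is an assumed combination rule, chosen for simplicity.
    """
    rng = np.random.default_rng(seed)
    subset_size = min(subset_size, len(train))
    k = min(k, subset_size)
    total = np.zeros(len(queries))
    for _ in range(n_detectors):
        # Each detector sees a different random subset, drawn without replacement.
        idx = rng.choice(len(train), size=subset_size, replace=False)
        total += knn_score(train[idx], queries, k)
    return total / n_detectors


# Synthetic stand-ins for DAE-encoded features (hypothetical data).
rng = np.random.default_rng(42)
Z_train = rng.normal(0.0, 1.0, size=(500, 8))      # nominal training codes
Z_nominal = rng.normal(0.0, 1.0, size=(20, 8))     # unseen nominal codes
Z_anomalous = rng.normal(8.0, 1.0, size=(20, 8))   # codes shifted far away

s_nom = ensemble_knn_scores(Z_train, Z_nominal)
s_anom = ensemble_knn_scores(Z_train, Z_anomalous)
print("mean nominal score:", s_nom.mean())
print("mean anomalous score:", s_anom.mean())
```

Because each detector computes distances only to `subset_size` points rather than to the whole training set, the cost per query scales with `n_detectors * subset_size`, which is one plausible reading of the reduced computational complexity the abstract reports. In practice a threshold on the averaged score, chosen on nominal data, would turn the scores into predictions.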

Citing Articles

A Survey of Advanced Border Gateway Protocol Attack Detection Techniques.

Scott B, Johnstone M, Szewczyk P. Sensors (Basel). 2024; 24(19).

PMID: 39409453 PMC: 11479385. DOI: 10.3390/s24196414.


A Novel Unsupervised Video Anomaly Detection Framework Based on Optical Flow Reconstruction and Erased Frame Prediction.

Huang H, Zhao B, Gao F, Chen P, Wang J, Hussain A. Sensors (Basel). 2023; 23(10).

PMID: 37430742 PMC: 10221939. DOI: 10.3390/s23104828.


A hybrid anomaly detection method for high dimensional data.

Zhang X, Wei P, Wang Q. PeerJ Comput Sci. 2023; 9:e1199.

PMID: 37346598 PMC: 10280180. DOI: 10.7717/peerj-cs.1199.


Anomaly Detection Framework for Wearables Data: A Perspective Review on Data Concepts, Data Analysis Algorithms and Prospects.

Sunny J, Patro C, Karnani K, Pingle S, Lin F, Anekoji M. Sensors (Basel). 2022; 22(3).

PMID: 35161502 PMC: 8840097. DOI: 10.3390/s22030756.


Developing an Embedding, Koopman and Autoencoder Technologies-Based Multi-Omics Time Series Predictive Model (EKATP) for Systems Biology research.

Liu S, You Y, Tong Z, Zhang L. Front Genet. 2021; 12:761629.

PMID: 34764986 PMC: 8576451. DOI: 10.3389/fgene.2021.761629.

