
A Network-based Positive and Unlabeled Learning Approach for Fake News Detection

Overview
Journal: Mach Learn
Date: 2021 Nov 24
PMID: 34815619
Citations: 2
Abstract

Fake news can spread rapidly among internet users and deceive a large audience. Because of these characteristics, it can have a direct impact on political and economic events. Machine learning approaches have been used to assist fake news identification. However, since the spectrum of real news is broad, hard to characterize, and expensive to label because of its high update frequency, One-Class Learning (OCL) and Positive and Unlabeled Learning (PUL) emerge as interesting approaches for content-based fake news detection, as they require a smaller set of labeled data than traditional machine learning techniques. In particular, network-based approaches are well suited to fake news detection since they allow information from different aspects of a publication to be incorporated into the problem modeling. In this paper, we propose a network-based approach based on Positive and Unlabeled Learning by Label Propagation (PU-LP), a one-class and transductive semi-supervised learning algorithm that performs classification by first identifying potential interest and non-interest documents in the unlabeled data and then propagating labels to classify the remaining unlabeled documents. We assessed the performance of our proposal considering homogeneous (only documents) and heterogeneous (documents and terms) networks. Our comparative analysis considered four OCL algorithms extensively employed in one-class text classification (k-Means, k-Nearest Neighbors Density-based, One-Class Support Vector Machine, and Dense Autoencoder) and another traditional PUL algorithm (Rocchio Support Vector Machine). The algorithms were evaluated on three news collections, considering balanced and extremely unbalanced scenarios. We used Bag-of-Words and Doc2Vec models to transform news into structured data.
Results indicated that PU-LP approaches are more stable and achieve better results than other PUL and OCL approaches in most scenarios, performing similarly to semi-supervised binary algorithms. Also, the inclusion of terms in the news network leads to better results, especially when news items are distributed in the feature space according to veracity and subject. News representation using Doc2Vec achieved better results than the Bag-of-Words model, both for algorithms based on the vector-space model and for those based on the document similarity network.
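
The two-stage idea described in the abstract (first pick likely interest and non-interest documents from the unlabeled data, then propagate labels over a document network) can be illustrated with a minimal sketch. This is not the authors' PU-LP implementation. It assumes a toy corpus, a Bag-of-Words representation with cosine similarity, a symmetric k-NN document network, and a plain mean-similarity ranking for choosing reliable seeds. The function names, the parameters `k`, `n_reliable`, and `iters`, and the example documents are all hypothetical.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-Words: map a document to its term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_graph(vectors, k):
    """Symmetric k-NN similarity network as {node: {neighbor: weight}}."""
    n = len(vectors)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        sims = [(cosine(vectors[i], vectors[j]), j) for j in range(n) if j != i]
        for w, j in sorted(sims, reverse=True)[:k]:
            if w > 0:  # skip neighbors that share no terms
                graph[i][j] = w
                graph[j][i] = w
    return graph

def pu_label_propagation(docs, positive_idx, k=2, n_reliable=1, iters=50):
    """Toy positive-and-unlabeled classifier (illustrative assumption only).

    Assumes 2 * n_reliable <= number of unlabeled documents.
    """
    vectors = [bow(d) for d in docs]
    graph = knn_graph(vectors, k)
    unlabeled = [i for i in range(len(docs)) if i not in positive_idx]

    # Stage 1 (simplified): rank unlabeled documents by their mean
    # similarity to the labeled interest documents; the top ones become
    # reliable interest seeds, the bottom ones reliable non-interest seeds.
    def affinity(i):
        total = sum(cosine(vectors[i], vectors[p]) for p in positive_idx)
        return total / len(positive_idx)

    ranked = sorted(unlabeled, key=affinity, reverse=True)
    seeds = {i: 1.0 for i in list(positive_idx) + ranked[:n_reliable]}
    seeds.update({i: -1.0 for i in ranked[-n_reliable:]})

    # Stage 2: propagate scores over the network, clamping the seeds.
    scores = {i: seeds.get(i, 0.0) for i in range(len(docs))}
    for _ in range(iters):
        new = {}
        for i in scores:
            if i in seeds:
                new[i] = seeds[i]
            elif graph[i]:
                total = sum(graph[i].values())
                new[i] = sum(w * scores[j] for j, w in graph[i].items()) / total
            else:
                new[i] = 0.0  # isolated node: no evidence either way
        scores = new
    return {i: "interest" if scores[i] > 0 else "non-interest" for i in unlabeled}

# Hypothetical toy corpus; only document 0 is labeled (interest class).
DOCS = [
    "miracle cure shocking secret doctors hate",
    "shocking miracle cure secret revealed",
    "secret miracle diet shocking trick",
    "parliament passes budget bill vote",
    "budget vote parliament committee report",
    "committee report budget bill debate",
]

labels = pu_label_propagation(DOCS, positive_idx=[0])
print(labels)
```

Only one labeled document is needed here because the seeds chosen in stage 1 supply both classes for the propagation in stage 2. A heterogeneous variant, as mentioned in the abstract, would add term nodes linked to the documents that contain them; that extension is not shown.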

Citing Articles

A review of semi-supervised learning for text classification.

Duarte J, Berton L Artif Intell Rev. 2023; :1-69.

PMID: 36743267 PMC: 9887265. DOI: 10.1007/s10462-023-10393-8.


A systematic literature review and existing challenges toward fake news detection models.

Nirav Shah M, Ganatra A Soc Netw Anal Min. 2022; 12(1):168.

PMID: 36407554 PMC: 9663194. DOI: 10.1007/s13278-022-00995-5.
