Classification of fake news on Twitter data

Original title (Czech):	Klasifikace fake news na datech z Twitteru

Thesis title:	Classification of fake news on Twitter data
Author:	Dumnov, Kirill
Thesis type:	Bachelor thesis
Supervisor:	Chudán, David
Opponents:	Smutný, Zdeněk
Thesis language:	Czech
Abstract:	The spread of misinformation on social media platforms, particularly Twitter, poses a significant challenge to public debate. This thesis presents a classification system that identifies misinformation tweets using natural language processing (NLP). The system combines machine learning models, such as recurrent neural networks (RNN), logistic regression, and Random Forest, with NLP techniques including tokenization and lemmatization. The work addresses challenges such as the diversity of misinformation, the quality and quantity of training data, and reliance on linguistic features alone. Experimental evaluation indicates that the system is effective at identifying misinformation tweets, suggesting its potential to help mitigate the spread of misinformation on social media platforms and to strengthen public discourse.
Keywords:	NLP; misinformation; fake news classification; Twitter

Information about study

Study programme:	Applied Informatics
Type of study programme:	Bachelor's study programme
Assigned degree:	Bc.
Institutions assigning academic degree:	Vysoká škola ekonomická v Praze (Prague University of Economics and Business)
Faculty:	Faculty of Informatics and Statistics
Department:	Department of Information and Knowledge Engineering

Information on submission and defense

Date of assignment:	3 June 2022
Date of submission:	7 May 2023
Date of defense:	12 June 2023
Identifier in the InSIS system:	https://insis.vse.cz/zp/80858/podrobnosti

Files for download

Main text
80858_dumk00.pdf, 775.7 kB

Opponent's review
78425_xsmuz00.pdf, 58.2 kB

Supervisor's review
80858_xchud01.pdf, 56.2 kB