Development of a Sentiment Analysis Model for Evaluating Open Source Reviews on CCleaner

Thesis title:	Development of a Sentiment Analysis Model for Evaluating Open Source Reviews on CCleaner
Author:	Flores Trochez, Luis Diego
Thesis type:	Diploma thesis
Supervisor:	Ziaei Nafchi, Majid
Opponents:	Sudzina, František
Thesis language:	English
Abstract:	The increasing volume of user-generated content on digital platforms has made automated analysis essential for understanding consumer perceptions, identifying product issues, and supporting decision-making. Mobile applications in particular receive large quantities of short, informal reviews that contain valuable information about user satisfaction and expectations. This thesis develops and evaluates a sentiment analysis framework for classifying user reviews of the CCleaner Android application, with the goals of assessing model performance, identifying dominant themes, and providing data-driven recommendations for product improvement. The study applies a multi-model approach that combines traditional machine learning algorithms, specifically Logistic Regression and Support Vector Machines, with a deep learning architecture based on Long Short-Term Memory (LSTM) networks. The methodology includes comprehensive preprocessing, feature extraction using TF-IDF and word embeddings, model training, and evaluation with accuracy, precision, recall, and F1 score. In addition, K-Means clustering is used to uncover underlying themes in user feedback and to complement the results of sentiment classification. The findings indicate that the LSTM model outperforms the traditional machine learning methods, achieving the highest overall classification accuracy and demonstrating a strong capability for interpreting the sequential structure of short app reviews. Topic modeling results reveal recurring themes related to performance, usability, device optimization, and advertising concerns. The combined insights highlight key areas for product refinement and show the practical value of sentiment analysis for supporting the development of mobile applications. The thesis concludes with recommendations for future research and model enhancements.
Keywords:	LSTM; Logistic Regression; Topic Modeling; Machine Learning; Support Vector Machine; Deep Learning; CCleaner; K-Means Clustering; Natural Language Processing (NLP); App Reviews; Sentiment Analysis
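
As a rough illustration of two steps named in the abstract (TF-IDF feature extraction and evaluation with accuracy, precision, recall, and F1 score), the following is a minimal standard-library sketch. It is not the thesis's implementation: the function names `tfidf` and `binary_metrics`, the whitespace tokenizer, the smoothed IDF variant, and the label string "positive" are all assumptions made here for illustration only.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumptions (not from the thesis): lowercase whitespace tokenization,
    relative term frequency, and a smoothed IDF of ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In this sketch, a term that appears in fewer reviews receives a higher weight than an equally frequent term that appears in many reviews, which is the property that makes TF-IDF features useful as input to classifiers such as Logistic Regression or Support Vector Machines.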

Information about study

Study programme:	Information Systems Management/Data and Business
Type of study programme:	Master's study programme
Assigned degree:	Ing.
Institutions assigning academic degree:	Vysoká škola ekonomická v Praze
Faculty:	Faculty of Informatics and Statistics
Department:	Department of Systems Analysis

Information on submission and defense

Date of assignment:	2. 12. 2024
Date of submission:	1. 12. 2025
Date of defense:	15. 1. 2026
Identifier in the InSIS system:	https://insis.vse.cz/zp/90605/podrobnosti

Files for download

Main text
90605_flol00.pdf, 1 MB

Opponent's review
88364_sudf01.pdf, 128.7 kB

Supervisor's review
90605_ziam00.pdf, 139.3 kB