| Thesis title: |
Development of a Sentiment Analysis Model for Evaluating Open Source Reviews on CCleaner |
| Author: |
Flores Trochez, Luis Diego |
| Thesis type: |
Diploma thesis |
| Supervisor: |
Ziaei Nafchi, Majid |
| Opponents: |
Sudzina, František |
| Thesis language: |
English |
| Abstract: |
This study develops a machine learning-based sentiment analysis model to automate the evaluation of user reviews for CCleaner and two of its competitors. By collecting over 3,000 reviews from the Google Play Store and applying preprocessing, classification, and clustering techniques, the study compares the performance of Logistic Regression, Support Vector Machine (SVM), and Long Short-Term Memory (LSTM) models for sentiment classification. Results indicate that LSTM consistently outperforms traditional models across accuracy, precision, recall, and F1-score metrics. In addition, K-Means clustering reveals five dominant feedback themes, aiding product teams in pinpointing areas of user concern and satisfaction. The findings show the practical value of automated sentiment analysis in enhancing user experience, reducing manual effort, and informing decision makers of current user concerns. |
| Keywords: |
CCleaner; LSTM; Logistic Regression; Natural Language Processing (NLP); App Reviews; K-Means Clustering; Topic Modeling; Sentiment Analysis; Machine Learning; Support Vector Machine; Deep Learning |
Information about study
| Study programme: |
Information Systems Management/Data and Business |
| Type of study programme: |
Master's study programme |
| Assigned degree: |
Ing. |
| Institutions assigning academic degree: |
Vysoká škola ekonomická v Praze |
| Faculty: |
Faculty of Informatics and Statistics |
| Department: |
Department of Systems Analysis |
Information on submission and defense
| Date of assignment: |
2. 12. 2024 |
| Date of submission: |
22. 7. 2025 |
| Date of defense: |
2025 |
Files for download
The files will be available after the defense of the thesis.