Comparison of U.S. states based on demographic, economic and social indicators

Thesis title: Comparison of U.S. states based on demographic, economic and social indicators
Author: Ramos Cardenas, Jennifer Alejandra
Thesis type: Diploma thesis
Supervisor: Miskolczi, Martina
Opponents: Hon, Filip
Thesis language: English
Abstract:
This thesis examines whether U.S. states can be grouped into meaningful clusters when demographic, economic, and social characteristics are analyzed simultaneously. Using data from the American Community Survey and the Centers for Disease Control and Prevention for 2022, a dataset of 30 state-level indicators was compiled and reduced to 23 non-redundant variables through correlation analysis. Principal Component Analysis was applied to the reduced dataset, retaining four components that explained 78.58% of the total variance. These components were used as inputs for K-means cluster analysis, which identified five distinct groups of states. The optimal number of clusters was determined using the silhouette score criterion, and the solution was validated through comparison with Ward's hierarchical clustering, showing 74% agreement after label matching. The five typologies identified are: aging states with moderate socioeconomic conditions; a small heterogeneous group with unique demographic and institutional characteristics; states with lower socioeconomic outcomes concentrated in the South and lower Midwest; high-income and highly urbanized states; and states with strong labor market performance located mainly in the Mountain West and Plains. Comparison with the U.S. Census Bureau's nine geographic divisions showed partial overlap, with some clusters displaying clear geographic concentration while others cut across regional boundaries. The findings suggest that a multidimensional approach can identify structural patterns among U.S. states that are not captured by single-indicator analyses or traditional geographic classifications.
Keywords: demographic indicators; state-level analysis; United States; cluster analysis; principal component analysis; socioeconomic classification; K-means clustering; regional comparison
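
The abstract reports 74% agreement between the K-means and Ward solutions "after label matching" but does not say how the matching was done. Cluster labels are arbitrary, so cluster 0 in one solution may correspond to cluster 3 in the other, and the two label sets have to be aligned before agreement can be counted. The following is a minimal sketch of one way to do this, assuming both solutions have the same number of clusters and trying every relabeling by brute force (feasible for five clusters, which gives 120 permutations). The function name and the toy labels are hypothetical and are not taken from the thesis.

```python
from itertools import permutations


def agreement_after_matching(labels_a, labels_b):
    """Share of items on which two clusterings agree, after relabeling
    labels_b with the permutation that best matches labels_a."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    ids_a = sorted(set(labels_a))
    ids_b = sorted(set(labels_b))
    if len(ids_a) != len(ids_b):
        raise ValueError("this sketch assumes the same number of clusters")

    best_hits = 0
    for perm in permutations(ids_a):
        # Map each label used in clustering B to a label used in clustering A.
        mapping = dict(zip(ids_b, perm))
        hits = sum(1 for a, b in zip(labels_a, labels_b) if mapping[b] == a)
        best_hits = max(best_hits, hits)
    return best_hits / len(labels_a)


# Toy labels for eight items (illustrative only, not the thesis data).
kmeans_labels = [0, 0, 1, 1, 2, 2, 2, 0]
ward_labels = [2, 2, 0, 0, 1, 1, 0, 2]
print(agreement_after_matching(kmeans_labels, ward_labels))  # 0.875
```

For a larger number of clusters, brute force becomes impractical and an assignment solver over the contingency table would be the usual substitute.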

Study information

Study programme / field: Economic Data Analysis/Data Analysis and Modeling
Type of study programme: Master's degree programme
Degree awarded: Ing.
Degree-granting institution: Vysoká škola ekonomická v Praze
Faculty: Fakulta informatiky a statistiky
Department: Katedra demografie

Submission and defense information

Date of assignment: 1 October 2025
Date of submission: 25 June 2026
Date of defense: 2026

Files for download

Files will be available only after the thesis defense.
