Design and Evaluation of a Model for Continuous Monitoring of Data Reliability

Title: Design and Evaluation of a Model for Continuous Monitoring of Data Reliability
Author: Erbaşı Koşalay, Ayça
Thesis type: Diploma thesis
Supervisor: Kučera, Jan
Reviewer: Karkošková, Soňa
Language: English
Abstract:
This thesis designs, implements, and evaluates a continuous monitoring model for data reliability in a cloud-native Snowflake data platform, using Datadog for monitoring and dashboards and Slack and PagerDuty for alert routing. Organisations increasingly depend on analytical data products, yet data that is late, incomplete, or rule-breaking undermines decisions and creates operational and financial risk; despite this, reliability monitoring in modern data platforms often remains ad hoc and reactive. Following a Design Science Research (DSR) methodology structured around relevance, design, and rigour cycles, the research derives eight design requirements from the problem context and the literature and produces a reusable artefact: a compact catalogue of Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for the freshness, completeness, validity, volume, and schema dimensions; a four-tier severity taxonomy (P1–P4); a pipeline ownership registry; an escalation and routing matrix; and runbook templates. The model is implemented with Snowflake Tasks and Datadog monitors and evaluated using a synthetic dataset and four demonstration scenarios, complemented by an expert interview. The evaluation shows that the SRE SLI/SLO abstraction transfers directly to data pipeline monitoring, that detection latency is bounded by the monitor evaluation window, and that automated severity-based routing reduces manual triage. The principal contribution is the domain transfer of Site Reliability Engineering principles to data pipeline observability, delivered as a deployable and transferable operating model.
Keywords: data quality; Snowflake; Datadog; Slack; PagerDuty; service levels; data reliability; SLI and SLO; observability; incident management; MTTD; MTTR; runbooks; design science research

Study information

Study programme / field: Information Systems Management/Data and Business
Type of study programme: Master's study programme
Degree awarded: Ing.
Degree-granting institution: Vysoká škola ekonomická v Praze (Prague University of Economics and Business)
Faculty: Fakulta informatiky a statistiky (Faculty of Informatics and Statistics)
Department: Katedra informačních technologií (Department of Information Technologies)

Submission and defence information

Date of assignment: 23. 10. 2025
Date of submission: 24. 6. 2026
Date of defence: 2026

Files for download

The files will be available only after the thesis defence.
