Automation of Descriptive Data Mining: The Use of External Data in the Evaluation Phase

Author: Nekvapil, Viktor
Thesis type: Doctoral thesis
Supervisor: Rauch, Jan
Reviewers (opponents): Kliegr, Tomáš; Kléma, Jiří; Popelínský, Lubomír

Thesis information

Title: Automation of Descriptive Data Mining: The Use of External Data in the Evaluation Phase
Thesis type: Doctoral thesis
Language: English
Abstract: Data mining has reached a mature state. A plethora of algorithms is available, and they are used more and more in everyday business. The benefits of descriptive data mining are less clear, but it often represents a necessary first step in a predictive task. There is therefore a distinct need to automate the descriptive task and thus reduce costs that are hard to justify from a business perspective. This thesis strives to contribute to this issue.

The thesis focuses on automating the descriptive data mining task, specifically its evaluation phase. Automating data mining requires encoding a significant amount of domain knowledge. The thesis tries to use a cheaper source of domain knowledge than domain experts, namely processed external data (External Knowledge), which can be either a company's internal data or publicly available resources such as open data.

The main aim of the thesis is to propose a new way of automating the evaluation phase of the descriptive data mining task based on external data. The thesis provides a comprehensive overview of data mining, domain knowledge and automation. The newly proposed External Knowledge Framework includes two approaches to using External Knowledge. The first approach, called the Explanation System, offers the user additional knowledge that could help in evaluating the results. The second approach, referred to as SEI-formulas, further automates the evaluation phase of the descriptive data mining task and prepares automatic conclusions based on the consequences or contradictions of the defined SEI-formula. An SEI-formula is a pre-defined relationship between an attribute from the resulting pattern and an attribute from External Knowledge.

Both the Explanation System and the SEI-formulas are implemented using a custom-made Python engine; an SQLite database is used to store External Knowledge. The source code of the implementation is publicly available on GitHub. The proof-of-concept solution employs association rules as the resulting patterns whose evaluation is to be automated. Association rules are mined using the 4ft-Miner procedure of the LISp-Miner system. The proposed artifacts have been evaluated using the methods of Experiments, Scenarios, Functional testing, Dynamic analysis and Comparison. The evaluation indicates that the External Knowledge Framework can be used to automate the evaluation phase of the descriptive data mining task in the financial services industry domain. Furthermore, the integration of the External Knowledge Framework and FOFRADAR has been proposed.
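The abstract describes the approach only in outline, so the following is a minimal illustrative sketch, not the thesis's implementation. It assumes a hypothetical SQLite table `external_knowledge` with a made-up attribute `salary_index`, made-up district names and counts, and a hypothetical expectation ("a salary index at or above a threshold goes with good loans") standing in for an SEI-formula. The only standard element is the confidence of an association rule computed from the four-fold table frequencies `a` and `b`.

```python
import sqlite3


def build_external_knowledge(conn):
    """Create a toy External Knowledge table (illustrative values only)."""
    conn.execute(
        "CREATE TABLE external_knowledge ("
        "district TEXT PRIMARY KEY, salary_index REAL)"
    )
    conn.executemany(
        "INSERT INTO external_knowledge VALUES (?, ?)",
        [("district_A", 1.3), ("district_B", 0.8)],
    )


def confidence(a, b):
    """Confidence of an association rule from its four-fold table.

    a = rows satisfying both antecedent and succedent,
    b = rows satisfying the antecedent but not the succedent.
    """
    return a / (a + b)


def lookup(conn, district):
    """Return the external attribute for a district, or None if unknown."""
    row = conn.execute(
        "SELECT salary_index FROM external_knowledge WHERE district = ?",
        (district,),
    ).fetchone()
    return None if row is None else row[0]


def evaluate(conn, rule, min_conf=0.7, index_threshold=1.0):
    """Compare a mined rule with a hypothetical expectation.

    Assumed expectation (not taken from the thesis): a salary index at or
    above index_threshold goes with good loans in that district.
    """
    if confidence(rule["a"], rule["b"]) < min_conf:
        return "rule too weak"
    index = lookup(conn, rule["district"])
    if index is None:
        return "no external knowledge"
    return "consequence" if index >= index_threshold else "contradiction"


conn = sqlite3.connect(":memory:")
build_external_knowledge(conn)

# Hypothetical rules of the form District(x) => Loan(good).
rule_a = {"district": "district_A", "a": 90, "b": 10}
rule_b = {"district": "district_B", "a": 80, "b": 20}

print(evaluate(conn, rule_a))
print(evaluate(conn, rule_b))
```

In this sketch a rule that agrees with the assumed expectation is labelled a consequence and one that disagrees a contradiction; how the thesis actually defines and evaluates SEI-formulas is not stated in the abstract.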
Keywords: data mining; descriptive data mining; automation; evaluation phase of data mining task; external data

Study information

Study programme and field of study: Applied Informatics/Applied Informatics
Type of study programme: Doctoral study programme
Degree awarded: Ph.D.
Degree-granting institution: University of Economics, Prague
Faculty: Faculty of Informatics and Statistics
Department: Department of Information and Knowledge Engineering
Institution archiving and providing access to the thesis: University of Economics, Prague

Submission and defence information

Date of assignment: 8 July 2013
Date of submission: 16 February 2019
Date of defence: 2019
Defence result: -

Files for download

Main thesis: The file will be published after the defence.

Data from the InSIS system

Identifier: The identifier link works only for defended theses.