Assessing Policy Optimization agents using Algorithmic IQ test

Title:	Assessing Policy Optimization agents using Algorithmic IQ test
Author:	Zeman, Petr
Thesis type:	Diploma thesis
Supervisor:	Vadinský, Ondřej
Opponent:	Berka, Petr
Thesis language:	English
Abstract:	This thesis briefly introduces the history and ideology behind the formalised evaluation of intelligence before focusing on the workings of the Algorithmic Intelligence Quotient (AIQ) test. As AIQ is closely related to the Reinforcement Learning framework, the following chapter is dedicated to the principles of this framework and introduces some popular agents before examining in more depth the agents I have chosen for implementation: Vanilla Policy Gradient and Proximal Policy Optimization. The practical part of this thesis first presents the history of the prototype implementation of the AIQ test and describes how its code works. This is followed by my update of the code base from Python 2 to Python 3, fixes for running the code on the Windows operating system, the introduction of a system for logging agent failures, and a few other minor tweaks. Next, the thesis focuses on the complexities of introducing agents based on modern Reinforcement Learning architectures into the AIQ test. As AIQ works with environments in a way that is inverse to the modern standards of OpenAI Gym and its variations, it was necessary to transform the agents' code into a state compatible with the AIQ test. The agents are first tested to find good default values of a newly introduced parameter; the acquired data are then used with various statistical methods to compare the new agents against each other and against the originally implemented agents. Because the results of the comparison between the newly implemented agents resemble those of other existing benchmarks, they support the viability of AIQ as an evaluation tool. The comparison with the originally implemented agents gives interesting insight into the intelligence of the various agents.
Keywords:	Algorithmic Intelligence Quotient test; Reinforcement Learning; Policy Gradient agents; Agent Intelligence Evaluation; Vanilla Policy Gradient; Proximal Policy Optimization

Title:	Assessing Policy Optimization agents using Algorithmic IQ test
Author:	Zeman, Petr
Thesis type:	Diploma thesis
Supervisor:	Vadinský, Ondřej
Opponent:	Berka, Petr
Thesis language:	English
Abstract:	This diploma thesis first introduces the reader to the history and ideology of formalised intelligence testing before turning to the principles of the Algorithmic IQ (AIQ) test. As AIQ has much in common with the Reinforcement Learning framework, the following chapter is dedicated to the principles of this framework. The chapter also introduces some popular agents, followed by a deeper analysis of the agents chosen for this thesis: Vanilla Policy Gradient and Proximal Policy Optimization. The practical part of the thesis first presents the history of the prototype implementation of the Algorithmic IQ test, including a description of its code. This is followed by a description of my work on porting the test code from Python 2 to Python 3, fixes for the Windows operating system, the implementation of a system for logging agent failures, and a few other minor changes. The next part of the thesis focuses on the problems of implementing agents based on modern Reinforcement Learning architectures in the AIQ test. As the AIQ test works with environments inversely to the modern standards of OpenAI Gym and its variations, it was necessary to transform the agents' code into a state able to cooperate with the AIQ test. Finally, the implemented agents were tested to find suitable default values of a new parameter, and the values obtained were used for further tests. The results of the newly implemented agents were compared with each other using statistical methods. Thanks to their similarity to other tests, these results support the usability of AIQ as a tool for testing intelligence. The subsequent comparison of the new agents with the originally implemented ones provides interesting information about their intelligence.
Keywords:	Algorithmic IQ test; Reinforcement Learning; Policy Gradient agents; Vanilla Policy Gradient; Proximal Policy Optimization; Agent Intelligence Evaluation

Study information

Study programme / field:	Knowledge and Web Technologies
Type of study programme:	Master's degree programme
Degree awarded:	Ing.
Degree-granting institution:	Prague University of Economics and Business (Vysoká škola ekonomická v Praze)
Faculty:	Faculty of Informatics and Statistics
Department:	Department of Information and Knowledge Engineering

Submission and defence information

Date of assignment:	2 March 2022
Date of submission:	27 June 2023
Date of defence:	6 September 2023
Identifier in the InSIS system:	https://insis.vse.cz/zp/80035/podrobnosti

Files for download

Main thesis
80035_zemp02.pdf, 1.3 MB

Public appendix
26613_zemp02.rar, 1.2 MB

Opponent's review
79626_berka.pdf, 49.8 kB

Supervisor's evaluation
80035_xvado00.pdf, 61.9 kB