Assessing Policy Optimization agents using Algorithmic IQ test

Thesis title:	Assessing Policy Optimization agents using Algorithmic IQ test
Author:	Zeman, Petr
Thesis type:	Diploma thesis
Supervisor:	Vadinský, Ondřej
Opponents:	Berka, Petr
Thesis language:	English
Abstract:	This thesis briefly introduces the history and ideology behind the formalised evaluation of intelligence before focusing on the workings of the Algorithmic Intelligence Quotient (AIQ) test. As AIQ is closely related to the Reinforcement Learning framework, the following chapter is dedicated to the principles of this framework and introduces some popular agents before examining in more depth the agents I have chosen for implementation: Vanilla Policy Gradient and Proximal Policy Optimization. The practical part of this thesis first introduces the history of the prototype implementation of AIQ and describes how its code works. This is followed by my update of the code base from Python 2 to Python 3, fixes for running the code on the Windows operating system, the introduction of a system for logging agent failures, and a few other minor tweaks. Next, the thesis focuses on the difficulties of introducing agents based on modern Reinforcement Learning architectures into the AIQ test. As AIQ works with environments in a way that is inverse to the modern standard of OpenAI Gym and its variations, it was necessary to transform the agents' code into a state compatible with the AIQ test. The agents are first tested to find good default values of a newly introduced parameter; the acquired data are then used with various statistical methods to compare the new agents against each other and against the originally implemented agents. Because the comparison between the newly implemented agents yields results similar to other existing benchmarks of these agents, it supports the viability of AIQ as an evaluation tool. The comparison with the originally implemented agents gives interesting insight into the intelligence of the various agents.
Keywords:	Algorithmic Intelligence Quotient test; Reinforcement Learning; Policy Gradient agents; Agent Intelligence Evaluation; Vanilla Policy Gradient; Proximal Policy Optimization

Thesis title:	Assessing Policy Optimization agents using Algorithmic IQ test
Author:	Zeman, Petr
Thesis type:	Diploma thesis
Supervisor:	Vadinský, Ondřej
Opponents:	Berka, Petr
Thesis language:	English
Abstract:	This diploma thesis first introduces the reader to the history and ideology of formalised intelligence testing before turning to the principles of the Algorithmic IQ (AIQ) test. Because AIQ has much in common with the Reinforcement Learning framework, the following chapter is dedicated to the principles of that framework. The chapter also presents some popular agents, followed by a deeper analysis of the agents chosen for this thesis: Vanilla Policy Gradient and Proximal Policy Optimization. The practical part of the thesis first presents the history of the prototype implementation of the Algorithmic IQ test, including a description of its code. This is followed by a description of my work on porting the test's code from Python 2 to Python 3, fixes for the Windows operating system, the implementation of a system for logging agent errors, and a few other minor changes. The next part of the thesis focuses on the problems of implementing agents based on modern Reinforcement Learning architectures in the AIQ test. Because the AIQ test works with environments in a way that is inverse to the modern standards of OpenAI Gym and its variations, it was necessary to transform the agents' code into a state able to cooperate with the AIQ test. Finally, the implemented agents were tested to find suitable default values of a new parameter, and the obtained values were used for further tests. The results of the newly implemented agents were compared with each other using statistical methods. These results, thanks to their similarity to other tests, support the usability of AIQ as a tool for testing intelligence. The subsequent comparison of the new agents with the originally implemented ones provides interesting information about their intelligence.
Keywords:	Algorithmic IQ test; Reinforcement Learning; Policy Gradient agents; Vanilla Policy Gradient; Proximal Policy Optimization; Agent Intelligence Evaluation

Information about study

Study programme:	Knowledge and Web Technologies (Znalostní a webové technologie)
Type of study programme:	Master's study programme
Assigned degree:	Ing.
Institutions assigning academic degree:	Vysoká škola ekonomická v Praze
Faculty:	Faculty of Informatics and Statistics
Department:	Department of Information and Knowledge Engineering

Information on submission and defense

Date of assignment:	2. 3. 2022
Date of submission:	27. 6. 2023
Date of defense:	6. 9. 2023
Identifier in the InSIS system:	https://insis.vse.cz/zp/80035/podrobnosti

Files for download

Main text
80035_zemp02.pdf, 1.3 MB Download

Public annex
26613_zemp02.rar, 1.2 MB Download

Opponent's review
79626_berka.pdf, 49.8 kB Download

Supervisor's review
80035_xvado00.pdf, 61.9 kB Download