623.714 (22W) Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning
Overview
For further information on on-site teaching, see: https://www.aau.at/corona.
- Lecturer
- Course title (English) Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning
- Course type Lecture-course (course with continuous assessment)
- Course model On-site course
- Semester hours 2.0
- ECTS credits 4.0
- Registrations 17 (max. 30)
- Organisational unit
- Language of instruction English
- Possible language(s) of assessment English
- Course start 09.11.2022
- eLearning to the Moodle course
Time and location
Course description
Intended learning outcomes
This course introduces Artificial Intelligence (AI) and optimization in a fun, easy, interesting, immersive, and hands-on way. Reinforcement Learning (RL) is part of a decades-long trend within AI and machine learning back toward simple general principles. Of all forms of machine learning, RL is the closest to the kind of learning that humans and other animals do, and many of its core algorithms were originally inspired by biological learning systems. RL has also given back, both through a psychological model of animal learning that better matches some of the empirical data, and through an influential model of parts of the brain's reward system. A good way to understand reinforcement learning is to consider some of the examples and possible applications that have guided its development:
- A master chess player makes a move. The choice is informed both by planning (anticipating possible replies and counterreplies) and by immediate, intuitive judgments of the desirability of particular positions and moves.
- An adaptive controller adjusts the parameters of a petroleum refinery's operations in real time. The controller optimizes the yield/cost/quality trade-off on the basis of specified marginal costs without sticking strictly to the setpoints originally suggested by engineers.
- A gazelle calf struggles to its feet minutes after being born. Half an hour later it is running at 32 kilometers per hour.
- A mobile robot decides whether it should enter a new room in search of more trash to collect or start trying to find its way back to its battery recharging station. It makes its decision based on the current charge level of its battery and how quickly and easily it has been able to find the recharger in the past.
- Phil prepares his breakfast. Closely examined, even this apparently mundane activity reveals a complex web of conditional behavior and interlocking goal-subgoal relationships: walking to the cupboard, opening it, selecting a cereal box, then reaching for, grasping, and retrieving the box. Other complex, tuned, interactive sequences of behavior are required to obtain a bowl, spoon, and milk carton. Each step involves a series of movements to obtain information and to guide reaching and locomotion. Rapid judgments are continually made about how to carry the objects or whether it is better to ferry some of them to the dining table before obtaining others. Each step is guided by goals, such as grasping a spoon or getting to the refrigerator, and is in service of other goals, such as having the spoon to eat with once the cereal is prepared and ultimately obtaining nourishment. Whether he is aware of it or not, Phil is accessing information about the state of his body that determines his nutritional needs, level of hunger, and food preferences.
All of these examples involve interaction between an agent and its environment. The agent observes the environment and decides to perform an action. In response to that action, the agent receives a reward from the environment together with the environment's updated state.
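The observe, act, receive-reward cycle described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not course material: `CorridorEnv`, `random_agent`, and `run_episode` are hypothetical names, and the corridor task and its reward values are invented for the example.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CorridorEnv:
    """Hypothetical toy environment: walk along a corridor to its right end."""
    length: int = 5
    position: int = 0

    def reset(self) -> int:
        """Put the agent back at the left end and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply an action (0 = left, 1 = right); return (state, reward, done)."""
        move = 1 if action == 1 else -1
        # Walls clamp the position to the range [0, length - 1].
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        # Assumed reward scheme: small cost per step, bonus at the goal.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def random_agent(state: int, rng: random.Random) -> int:
    """A placeholder policy that ignores the state and acts at random."""
    return rng.choice([0, 1])


def run_episode(env: CorridorEnv, rng: random.Random, max_steps: int = 100):
    """Run one episode of the agent-environment loop."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = random_agent(state, rng)        # agent decides on an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps
```

Calling `run_episode(CorridorEnv(), random.Random(0))` returns the accumulated reward and the number of steps taken. A learning agent would replace `random_agent` with a policy that improves from the rewards it observes.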
Teaching methodology
Lectures will primarily be delivered through in-class sessions. In addition, a Discord channel will be available for a live broadcast of the in-class sessions and for discussions/Q&A.
Content
The course consists of lectures that cover concepts from the following topics. Each lecture will cover material from one or more topics. Lectures will be held at regular intervals.
- Topic 1: Reinforcement Learning - an introduction
- Topic 2: Course Materials, Supplementary Resources, and Development Environment
- Topic 3: Tabular Methods
- Topic 4: Dynamic Programming
- Topic 5: Monte-Carlo & Temporal Difference and Q-Learning
- Topic 6: Policy Gradients
- Topic 7: The Actor-Critic Method
- Topic 8: Deep Q-Network - an Overview
- Topic 9: Further Exploration
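As a taste of the tabular methods and Q-learning named in Topics 3 and 5, the standard Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)], can be sketched as follows. This is a minimal sketch and not taken from the course materials: the corridor task, the function names `step` and `train`, and all hyperparameter values are assumptions chosen for illustration.

```python
import random

N_STATES = 5       # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    # Assumed reward scheme: small cost per step, bonus at the goal.
    return next_state, (1.0 if done else -0.1), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target; a terminal state contributes no future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q
```

After training, reading off `max(ACTIONS, key=lambda a: q[s][a])` for each non-terminal cell `s` gives the greedy policy, which on this toy task should be to move right everywhere. Deep Q-Networks (Topic 8) replace the table `q` with a neural network approximation.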
Expected prior knowledge
- Python Programming
- Statistics
Literature
The course will closely follow Deep Reinforcement Learning in Action from Manning Publications:
> Zai, Alexander, and Brandon Brown. Deep Reinforcement Learning in Action. Manning Publications, 2020.
Additionally, we will refer to Reinforcement Learning: An Introduction from MIT Press [OPTIONAL]:
> Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.
Examination information
Examination method(s)
Project work and oral exam
Examination content
The complete curriculum taught throughout the course. Students will be given a project and will have up to 3 weeks to solve it. The project will be evaluated through an oral exam that requires students to demonstrate the knowledge acquired in the course and the skills applied in solving the project.
Assessment criteria/standards
The final assessment will be based on the oral exam. The following criteria will be evaluated:
- Project code, installation guide, documentation
- Trained models
- Understanding of concepts taught in the course
- Ability to use and explain the concepts used in their project
Grading scheme
Grade Grading scheme
Position in the curriculum
- Master's programme Applied Informatics (SKZ: 911, Version: 13W.1)
  - Subject: Distributed Multimedia Systems (elective subject)
    - Selected Topics in Distributed Multimedia Systems (2.0h VK / 4.0 ECTS)
      - 623.714 Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning (2.0h VC / 4.0 ECTS)
- Master's programme Informatics (SKZ: 911, Version: 19W.1)
  - Subject: Distributed System (elective subject)
    - Further courses from the chosen specialisation subject (0.0h XX / 12.0 ECTS)
      - 623.714 Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning (2.0h VC / 4.0 ECTS), completion recommended in the 1st or 2nd semester
- Doctoral programme Doctoral Programme in Technical Sciences (SKZ: 786, Version: 12W.4)
  - Subject: Coursework pursuant to § 3 para. 2a of the curriculum (compulsory subject)
    - Coursework pursuant to § 3 para. 2a of the curriculum (16.0h XX / 32.0 ECTS)
      - 623.714 Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning (2.0h VC / 4.0 ECTS)
- Master's programme Game Studies and Engineering (SKZ: 992, Version: 17W.2)
  - Subject: Restricted elective (elective subject)
    - Module: Game Engineering
      - 4.1 Fundamental Topics I in Distributed Multimedia Systems (0.0h VC / 4.0 ECTS)
        - 623.714 Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning (2.0h VC / 4.0 ECTS), completion recommended in the 1st, 2nd, or 3rd semester
- Master's programme Game Studies and Engineering (SKZ: 992, Version: 17W.2)
  - Subject: Restricted elective (elective subject)
    - Module: Game Engineering
      - 4.1 Selected Topics in Distributed Multimedia Systems (0.0h VC / 4.0 ECTS)
        - 623.714 Selected Topics in Distributed Multimedia Systems: Hands-on Reinforcement Learning (2.0h VC / 4.0 ECTS), completion recommended in the 1st, 2nd, or 3rd semester