{"id":951,"date":"2017-12-08T15:38:22","date_gmt":"2017-12-08T15:38:22","guid":{"rendered":"http:\/\/krakoras.net\/blog\/?p=951"},"modified":"2018-07-20T10:08:05","modified_gmt":"2018-07-20T10:08:05","slug":"ai-programming","status":"publish","type":"post","link":"http:\/\/krakoras.net\/blog\/?p=951","title":{"rendered":"Deep Q-learning in practice"},"content":{"rendered":"<p>I tried to program an agent that plays the game <strong>SpaceInvaders<\/strong> using Deep Q-learning (<a href=\"https:\/\/keon.io\/deep-q-learning\/\" class=\"autohyperlink\">keon.io\/deep-q-learning\/<\/a>).<\/p>\n<p><img loading=\"lazy\" width=\"162\" height=\"242\" class=\"alignnone size-full wp-image-953 \" src=\"http:\/\/krakoras.net\/blog\/wp-content\/uploads\/2017\/12\/img_5a2aafa292a47.png\" alt=\"\" \/><\/p>\n<p>The theoretical background and some interesting articles on deep Q-learning can be found at<\/p>\n<ul>\n<li><a href=\"https:\/\/keon.io\/deep-q-learning\/\" class=\"autohyperlink\">keon.io\/deep-q-learning\/<\/a><\/li>\n<li><a href=\"https:\/\/www.intelnervana.com\/demystifying-deep-reinforcement-learning\/\" class=\"autohyperlink\">www.intelnervana.com\/demystifying-deep-reinforcement-learning\/<\/a><\/li>\n<\/ul>\n<p>I started from the code and the video &#8216;Deep Q Learning for Video Games &#8211; The Math of Intelligence #9&#8217;<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/llSourcell\/deep_q_learning\/blob\/master\/03_PlayingAgent.ipynb\" class=\"autohyperlink\">github.com\/llSourcell\/deep_q_learning\/blob\/master\/03_PlayingAgent.ipynb<\/a><\/li>\n<\/ul>\n<p>Using the libraries<\/p>\n<ul>\n<li><strong>TensorFlow<\/strong>\u00a0(https:\/\/www.tensorflow.org\/),<\/li>\n<li><strong>Keras<\/strong>\u00a0(https:\/\/keras.io\/) and<\/li>\n<li>the <strong>Gym<\/strong> simulator\u00a0(https:\/\/github.com\/openai\/gym)<\/li>\n<\/ul>\n<p>I managed to reach a score of <strong>455 points<\/strong>. 
The best score achieved is about 5800 (<a href=\"https:\/\/gym.openai.com\/envs\/SpaceInvaders-v0\/\" class=\"autohyperlink\">gym.openai.com\/envs\/SpaceInvaders-v0\/<\/a>), but that result uses the &#8220;Asynchronous Actor-Critic Agents (A3C)&#8221; algorithm: <a href=\"https:\/\/medium.com\/emergent-future\/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2\" class=\"autohyperlink\">medium.com\/emergent-future\/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2<\/a>. I will try it as soon as my laptop stops whirring like crazy :).<\/p>\n<p>My source code is on disk at<\/p>\n<ul>\n<li>c:\\Users\\honza\\Documents\\Projekty\\20171208_ai_deep-q-learning\\<\/li>\n<li>The best agent is saved in the model game_agent_model_scoring_235.h5, 2017\/12\/8 17:00<\/li>\n<\/ul>\n<p><strong>Other interesting links<\/strong><\/p>\n<p>* AI PyGame test environment &#8211; <a href=\"http:\/\/pygame-learning-environment.readthedocs.io\/en\/latest\/user\/games\/waterworld.html\" class=\"autohyperlink\">pygame-learning-environment.readthedocs.io\/en\/latest\/user\/games\/waterworld.html<\/a><br \/>\n* AI Doom test environment (no Windows port) &#8211; <a href=\"https:\/\/github.com\/openai\/doom-py\" class=\"autohyperlink\">github.com\/openai\/doom-py<\/a><br \/>\n* Paper &#8211; <a href=\"https:\/\/arxiv.org\/abs\/1312.5602\" class=\"autohyperlink\">arxiv.org\/abs\/1312.5602<\/a><\/p>\n<p><strong>Videos<\/strong><\/p>\n<p><iframe loading=\"lazy\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/79pmNdyxEGo?feature=oembed\" frameborder=\"0\" allow=\"autoplay; encrypted-media\" allowfullscreen><\/iframe><\/p>\n<p><iframe loading=\"lazy\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/L4KBBAwF_bE?feature=oembed\" frameborder=\"0\" allow=\"autoplay; encrypted-media\" allowfullscreen><\/iframe><\/p>\n<p><strong>Dependencies to install<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/www.tensorflow.org\/install\/install_windows\" class=\"autohyperlink\">www.tensorflow.org\/install\/install_windows<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/Kojoley\/atari-py\/releases\" class=\"autohyperlink\">github.com\/Kojoley\/atari-py\/releases<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I tried to program an agent that plays the game SpaceInvaders using Deep Q-learning (keon.io\/deep-q-learning\/). The theoretical background and some interesting articles on deep Q-learning can be found at keon.io\/deep-q-learning\/ <a href=\"http:\/\/www.intelnervana.com\/demystifying-deep-reinforcement-learning\/\" class=\"autohyperlink\">www.intelnervana.com\/demystifying-deep-reinforcement-learning\/<\/a> I started from the code and the video &#8216;Deep Q Learning for Video Games &#8211; The Math of Intelligence #9&#8217; <a href=\"http:\/\/github.com\/llSourcell\/deep_q_learning\/blob\/master\/03_PlayingAgent.ipynb\" class=\"autohyperlink\">github.com\/llSourcell\/deep_q_learning\/blob\/master\/03_PlayingAgent.ipynb<\/a> Using the libraries TensorFlow\u00a0(https:\/\/www.tensorflow.org\/), Keras\u00a0(https:\/\/keras.io\/) and the Gym simulator\u00a0(https:\/\/github.com\/openai\/gym) I managed to reach&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0},"categories":[3,9,8],"tags":[],"_links":{"self":[{"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/951"}],"collection":[{"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=951"}],"version-history":[{"count":11,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/951\/revisions"}],"predecessor-version":[{"id":1035,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/951\/revisions\/1035"}],"wp:attachment":[{"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=951"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=951"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/krakoras.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=951"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}