Index
Reinforcement Learning
One of the learning paradigms within machine learning.
Algorithms
Gato / 2022
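Gato, announced by DeepMind in May 2022, is a general-purpose multimodal AI that handles not only text and image tasks but can also carry out a variety of actions, such as game play and robot control.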
Policy-Space Response Oracles / PSRO / 2023
- Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinforcement Learning
- [2023]
- arxiv.org
Controllability-aware Skill Discovery / CSD / 2023
- Controllability-Aware Unsupervised Skill Discovery
- [2023]
- arxiv.org
Reusable Slotwise Mechanisms / RSM
- Reusable Slotwise Mechanisms
- [2023]
- arxiv.org
Scaled Q-learning / 2023
- Pre-training generalist agents using offline reinforcement learning
Techniques and Tricks
Imitation Learning
- Imitation Learning
Meta Reinforcement Learning
Adaptive Agent / AdA / 2023
- Human-Timescale Adaptation in an Open-Ended Task Space
- [2023]
- arxiv.org
Transformer
- Reinforcement learning using Transformer
Curriculum Reinforcement Learning / CRL
GRADIENT / 2023
- Curriculum Reinforcement Learning using Optimal Transport via Gradual Domain Adaptation
- [2023]
- arxiv.org
ELLM / Exploring with LLMs / 2023
- Guiding Pretraining in Reinforcement Learning with Large Language Models
- [2023]
- arxiv.org
Intrinsic Performance
- [DL Paper Reading Group] Scaling laws for single-agent reinforcement learning
Offline
Cal-QL / 2023
- Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
- [2023]
- arxiv.org
Synthetic Experience Replay / SynthER / 2023
- Synthetic Experience Replay
- [2023]
- arxiv.org
Dataset / Benchmark
ManiSkill2 / 2023
- ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills
- [2023]
- arxiv.org
Research
- Can Wikipedia Help Offline Reinforcement Learning?
- [2022]
- arxiv.org
In offline reinforcement learning with Transformers, models pre-trained on text corpora can transfer to unrelated downstream tasks (e.g., Atari games).
- NeurIPS 2022 Participation Report, Part 2
- Reinforcement learning
- Offline reinforcement learning
- blog.recruit.co.jp
On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning
- [2022]
- arxiv.org
The Role of Baselines in Policy Gradient Optimization
- [2023]
- arxiv.org
The Phenomenon of Policy Churn
- [2022]
- arxiv.org
Applications to Natural Language Processing / NLP
Environments
Alexa Arena / 2023
- Alexa Arena: A User-Centric Interactive Platform for Embodied AI
- [2023]
- arxiv.org
- github.com
References
Synthesizing Physical Character-Scene Interactions
- [2023]
- arxiv.org
Reinforcement Learning: An Introduction
Books
Websites
CS 294: Deep Reinforcement Learning, Spring 2017
Reinforcement Learning Course: Hands-On, Step By Step, And Free