fastrl
================
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
[CI](https://github.com/josiahls/fastrl/actions?query=workflow%3A%22Fastrl+Testing%22)
[PyPI](https://pypi.python.org/pypi/fastrl)
[Docker: fastrl](https://hub.docker.com/repository/docker/josiahls/fastrl)
[Docker: fastrl-dev](https://hub.docker.com/repository/docker/josiahls/fastrl-dev)
> Warning: This is in alpha, and so uses the latest torch and, very
> importantly, torchdata. The base API, while semi-stable, may change in
> future versions, and there are no promises of backward compatibility.
> For the time being, it is best to hard-pin versions of the library.
> Warning: Even before fastrl==2.0.0, all models should converge
> reasonably fast; however, the HRL models `DADS` and `DIAYN` will need
> re-balancing and some extra features that the respective authors used.
# Overview
Fastai has been amazing for computer vision and tabular learning; one would
wish the same were true for RL. The purpose of this repo is to provide a
framework that is as easy as possible to get started with, while also being
designed for testing new agents.
This version of fastrl is essentially a wrapper around
[torchdata](https://github.com/pytorch/data).
It is built around 4 pipeline concepts (half of which come from fastai),
illustrated with a short sketch after the list:
- DataLoading/DataBlock pipelines
- Agent pipelines
- Learner pipelines
- Logger plugins
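For intuition, here is a minimal sketch of the pipeline idea in plain
`torchdata` (this is not fastrl's own API; `IterableWrapper` and the
functional `map`/`batch` stages are standard torchdata). fastrl's DataBlock,
agent, learner, and logger pipelines are layered on top of this same
composable-datapipe mechanism.
``` python
# Plain torchdata sketch (not fastrl-specific): a pipeline is a chain of
# small, composable DataPipes.
from torchdata.datapipes.iter import IterableWrapper

pipe = IterableWrapper(range(10))        # a source datapipe
pipe = pipe.map(lambda x: {'state': x})  # a transform stage
pipe = pipe.batch(4)                     # a batching stage

for batch in pipe:
    print(batch)
```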
Documentation is served at https://josiahls.github.io/fastrl/ and is
generated directly from the notebooks in this repo via `nbdev`.
Basic DQN example:
``` python
from fastrl.loggers.core import *
from fastrl.loggers.vscode_visualizers import *
from fastrl.agents.dqn.basic import *
from fastrl.agents.dqn.target import *
from fastrl.data.block import *
from fastrl.envs.gym import *
# `L` (used below) comes from fastcore; it may already be re-exported by the
# imports above, but importing it explicitly does no harm.
from fastcore.all import L
import torch
```
``` python
# Set up the loggers
logger_base = ProgressBarLogger(epoch_on_pipe=EpocherCollector,
                                batch_on_pipe=BatchCollector)
# Set up the core NN
torch.manual_seed(0)
model = DQN(4,2)
# Set up the Agent
agent = DQNAgent(model,[logger_base],max_steps=10000)
# Set up the DataBlock
block = DataBlock(
    # Merge 2 steps into 1 and skip every other step
    GymTransformBlock(agent=agent,nsteps=2,nskips=2,firstlast=True),
    # A second pipeline that also captures images, for visualizing runs in VS Code
    (GymTransformBlock(agent=agent,nsteps=2,nskips=2,firstlast=True,n=100,include_images=True),VSCodeTransformBlock())
)
dls = L(block.dataloaders(['CartPole-v1']*1))
# Set up the Learner
learner = DQNLearner(model,dls,logger_bases=[logger_base],bs=128,max_sz=20_000,nsteps=2,lr=0.001,
                     batches=1000,
                     dp_augmentation_fns=[
                        # Plug in the Target DQN datapipes
                        TargetModelUpdater.insert_dp(),
                        TargetModelQCalc.replace_dp()
                     ])
learner.fit(10)
# learner.validate()
```
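A note on the `DataBlock` above: it is given two pipeline specs, so
`block.dataloaders(...)` yields more than one dataloader. The first
`GymTransformBlock` is the plain training stream, while the second, paired
with `include_images=True` and `VSCodeTransformBlock`, appears intended to
capture rendered frames so runs can be visualized directly in the notebook
output.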
# What's new?
As we learned how to support as many RL agents as possible, we found that
`fastrl==1.*` was severely limited in the models it could support.
`fastrl==2.*` leverages the `nbdev` library for better documentation and
more relevant testing, and uses `torchdata` as its base library. We are also
building on the work of the `ptan`<sup>1</sup> library as a close reference
for PyTorch-based reinforcement learning APIs.
<sup>1</sup> “Shmuma/Ptan.” GitHub, 2020,
https://github.com/Shmuma/ptan. Accessed 13 June 2020.
## Install
### PyPI
The commands below install the alpha build of fastrl.
**CUDA Install**
`pip install fastrl==0.0.* --pre --extra-index-url https://download.pytorch.org/whl/nightly/cu113`
**CPU Install**
`pip install fastrl==0.0.* --pre --extra-index-url https://download.pytorch.org/whl/nightly/cpu`
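As the warning at the top suggests, it is best to hard-pin an exact
pre-release rather than using the `0.0.*` wildcard. A sketch of what that
looks like (the version number is a placeholder, not a recommendation):
``` bash
# Hard-pin an exact pre-release; replace 0.0.X with the version you tested against
pip install "fastrl==0.0.X" --pre --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```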
### Docker (highly recommended)
Install
[Nvidia-Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker)
and [docker-compose](https://docs.docker.com/compose/install/), then run:
``` bash
docker-compose pull && docker-compose up
```
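If you prefer not to use the repo's compose file, the published images can
also be pulled directly; a sketch, assuming the images linked above and a
`latest` tag (check Docker Hub for the tags that actually exist):
``` bash
# Pull and run the dev image directly ("latest" tag is an assumption; see Docker Hub)
docker pull josiahls/fastrl-dev:latest
docker run --gpus all -it --rm josiahls/fastrl-dev:latest
```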
## Contributing
After you clone this repository, please run `nbdev_install_hooks` in
your terminal. This sets up git hooks that clean the notebooks, removing
the extraneous metadata stored in them (e.g. which cells you ran) that
causes unnecessary merge conflicts.
Before submitting a PR, check that the local library and notebooks
match. The script `nbdev_clean` can let you know if there is a
difference between the local library and the notebooks.
- If you made a change to the notebooks in one of the exported cells, you can
  export it to the library with `nbdev_build_lib` or `make fastai2`.
- If you made a change to the library, you can export it back to the notebooks
  with `nbdev_update_lib`.
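Taken together, a typical contribution loop looks roughly like this (a sketch
using the command names referenced above; exact command names may differ
between nbdev versions):
``` bash
nbdev_install_hooks   # one-time after cloning: install the git hooks
nbdev_build_lib       # export notebook changes into the library
nbdev_update_lib      # sync library edits back into the notebooks
nbdev_clean           # clean notebook metadata before committing
```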