# Deep RL PyTorch
[Singularity Hub](https://singularity-hub.org/collections/2581)
This repo contains implementations of popular deep RL algorithms, together with a unified interface for training and evaluation, unified model saving, and visualization. It can serve as a good starting point when implementing a new RL algorithm in PyTorch.
## Getting started
If you want to base your algorithm on this repository, start by installing it as a package:
```
pip install git+https://github.com/jkulhanek/deep-rl-pytorch.git
```
If you want to run the attached experiments yourself, feel free to clone this repository:
```
git clone https://github.com/jkulhanek/deep-rl-pytorch.git
```
All dependencies are prepared in a Docker container. If you have nvidia-docker enabled, you can use this image. To pull and start it, run:
```
docker run --runtime=nvidia --net=host -it kulhanek/deep-rl-pytorch:latest bash
```
From there, you can either clone your own repository containing your experiments or clone this one.
## Concepts
All algorithms are implemented as base classes; in your experiment, you subclass them. The `deep_rl.core.AbstractTrainer` class is the base of all trainers, and every algorithm inherits from it. Each trainer can be wrapped in several wrappers (classes extending `deep_rl.core.AbstractWrapper`), which take care of saving, logging, terminating the experiment, and so on. All experiments should be registered using the `@deep_rl.register_trainer` decorator, which then wraps the trainer with the default wrappers; this behaviour can be controlled by passing arguments to the decorator. Any registered trainer (experiment) can be run by calling `deep_rl.make_trainer(<name>).run()`, as sketched below.
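As a rough illustration of this workflow, the sketch below registers a hypothetical experiment and runs it. Only `A2CTrainer`, `register_trainer`, and `make_trainer` are taken from this README; the decorator options, the registered name, and any constructor arguments are assumptions and may differ from the actual API.
```python
# Illustrative sketch only; the exact decorator options, trainer constructor
# arguments, and the registered name are assumptions, not the documented API.
import deep_rl
from deep_rl.a2c import A2CTrainer


@deep_rl.register_trainer()  # default wrappers; options could be passed here
class MyExperiment(A2CTrainer):
    name = 'my-experiment'  # hypothetical identifier used for lookup


if __name__ == '__main__':
    # Builds the registered trainer (with its default wrappers) and starts training.
    deep_rl.make_trainer('my-experiment').run()
```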
## Implemented algorithms
### A2C
A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C) [2] which, according to OpenAI [1], gives equal performance. It is, however, more efficient in its GPU utilization.
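For intuition, the objective that both A2C and A3C optimize combines a policy-gradient term weighted by the advantage, a value-regression term, and an entropy bonus. The snippet below is a generic illustration of that loss in PyTorch, not the exact code used in this repository.
```python
import torch
import torch.nn.functional as F


def a2c_loss(log_probs, values, returns, entropies,
             value_coef=0.5, entropy_coef=0.01):
    """Generic synchronous advantage actor-critic loss (illustrative only).

    log_probs: log pi(a_t | s_t) for the actions taken, shape [T, N]
    values:    critic estimates V(s_t),                  shape [T, N]
    returns:   n-step bootstrapped returns R_t,          shape [T, N]
    entropies: policy entropy at each step,              shape [T, N]
    """
    advantages = returns - values
    # Policy-gradient term; advantages are treated as constants.
    policy_loss = -(advantages.detach() * log_probs).mean()
    # Critic regression towards the bootstrapped returns.
    value_loss = F.mse_loss(values, returns)
    # Entropy bonus encourages exploration.
    entropy_loss = -entropies.mean()
    return policy_loss + value_coef * value_loss + entropy_coef * entropy_loss
```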
Start your experiment by subclassing `deep_rl.a2c.A2CTrainer`.
Several models are included in `deep_rl.a2c.model`. You may want to use at least some of the helper modules contained in this package when designing your own experiment.
In most of the models, initialization is done according to [3].
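For reference, orthogonal initialization in the sense of [3] can be applied to a PyTorch model as in the generic sketch below; this is an illustration of the technique, not the exact helper used by the bundled models.
```python
import torch.nn as nn


def init_orthogonal(module, gain=1.0):
    """Apply orthogonal initialization [3] to the linear/conv layers of a module."""
    for layer in module.modules():
        if isinstance(layer, (nn.Linear, nn.Conv2d)):
            nn.init.orthogonal_(layer.weight, gain=gain)
            if layer.bias is not None:
                nn.init.zeros_(layer.bias)
    return module
```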
### Asynchronous Advantage Actor Critic (A3C) [2]
This implementation uses multiprocessing. It comes with two optimizers: RMSprop and Adam.
### Actor Critic using Kronecker-Factored Trust Region (ACKTR) [1]
This is an improvement of A2C described in [1].
## Experiments
> Coming soon
## Requirements
The following packages must be installed before using the framework for your own algorithm:
- OpenAI baselines (can be installed by running `pip install git+https://github.com/openai/baselines.git`)
- PyTorch
- Visdom (`pip install visdom`)
- Gym (`pip install gym`)
- Matplotlib (`pip install matplotlib`)
The following packages must be installed prior to running the experiments:
- DeepMind Lab
- Gym[atari]
## Sources
This repository is based on the work of several other authors, to whom we would like to express our thanks.
- https://github.com/openai/baselines/tree/master/baselines
- https://github.com/ikostrikov/pytorch-a2c-ppo-acktr/tree/master/a2c_ppo_acktr
- https://github.com/miyosuda/unreal
- https://github.com/openai/gym
## References
[1] Wu, Y., Mansimov, E., Grosse, R.B., Liao, S. and Ba, J., 2017. Scalable trust-region method for deep reinforcement learning using kronecker-factored approximation. In Advances in neural information processing systems (pp. 5279-5288).
[2] Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D. and Kavukcuoglu, K., 2016, June. Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928-1937).
[3] Saxe, A.M., McClelland, J.L. and Ganguli, S., 2013. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120.