# DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
This repository contains the code for the paper [DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems](https://openreview.net/forum?id=C-xa_D3oTj6), published at ICLR 2023 with perfect review scores (8, 8, 8, 10) and a notable-top-25% rating. See [here](https://sites.google.com/view/dep-rl) for videos.
The work was performed by Pierre Schumacher, Daniel F.B. Haeufle, Dieter Büchler, Syn Schmitt and Georg Martius.
If you just want to see the code for DEP, take a look at `deprl/dep_controller.py`, `deprl/custom_agents.py` and `deprl/env_wrapper/wrappers.py`.
<p align="center">
<img src=https://user-images.githubusercontent.com/24903880/229783811-44c422e9-3cc3-42e4-b657-d21be9af6458.gif width=250>
<img src=https://user-images.githubusercontent.com/24903880/229783729-d068e87c-cb0b-43c7-91d5-ff2ba836f05b.gif width=214>
<img src=https://user-images.githubusercontent.com/24903880/229783370-ee95b9c3-06a0-4ef3-9b60-78e88c4eae38.gif width=214>
</p>
## Abstract
Muscle-actuated organisms are capable of learning an unparalleled diversity of
dexterous movements despite their vast number of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show
similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by our finding that common
exploration noise strategies are inadequate in synthetic examples of overactuated
systems. We identify differential extrinsic plasticity (DEP), a method from the
domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast
learning of reaching and locomotion in musculoskeletal systems, outperforming
current approaches in all considered tasks in sample efficiency and robustness.
## Installation
We provide a python package for easy installation:
```
pip install deprl
```
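If you want to verify the installation, a minimal sanity check is:
```python
# Minimal post-install check: the import should succeed and the
# installed version can be read from the package metadata.
from importlib.metadata import version

import deprl  # noqa: F401

print(version("deprl"))
```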
If you would like to use `jax` with CUDA support, which significantly speeds up learning, we recommend running:
```
pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```
or similar afterwards.
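To confirm that JAX actually picked up the GPU, a quick check is:
```python
# Lists the devices JAX can use; with working CUDA wheels this should
# include a GPU device rather than only the CPU.
import jax

print(jax.devices())
```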
The humanreacher environment can be installed with
```
pip install git+https://github.com/P-Schumacher/warmup.git
```
OstrichRL can be installed from [here](https://github.com/vittorione94/ostrichrl).
## Experiments
The major experiments (humanreacher reaching and ostrich running) can be reproduced with the provided config files.
Simply run from the root folder:
```
python -m deprl.main experiments/ostrich_running_dep.json
python -m deprl.main experiments/humanreacher.json
```
to train an agent. Model checkpoints will be saved in the `output` directory.
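If you want to see which settings a run will use before launching it, a small sketch for inspecting a config file is shown below; the printed keys simply mirror whatever the JSON contains.
```python
# Print the contents of an experiment config before starting a run.
import json

with open("experiments/humanreacher.json") as f:
    config = json.load(f)

for key, value in config.items():
    print(f"{key}: {value}")
```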
The progress can be monitored with:
```
python -m tonic.plot --path output/folder/
```
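If you prefer to inspect the logs yourself, the sketch below reads the CSV log that Tonic writes into the run folder; the exact file name and column names may differ for your run, so print the available columns first.
```python
# Rough sketch for custom plotting of training logs (assumes Tonic wrote
# a log.csv into the run folder; adjust path and column names to your run).
import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("output/folder/log.csv")
print(log.columns.tolist())  # see which metrics were recorded

# The column names below are assumptions; replace them with ones from the printout.
log.plot(x="train/steps", y="test/episode_score")
plt.show()
```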
To execute a trained policy, use:
```
python -m deprl.play --path output/folder/
```
See the [TonicRL](https://github.com/fabiopardo/tonic) documentation for details.
Be aware that ostrich training can be seed-dependent, as seen in the plots of the original publication.
## Pure DEP
If you want to see pure DEP in action, run the following bash scripts after installing the ostrichrl and warmup environments.
```
bash play_files/play_dep_humanreacher.sh
bash play_files/play_dep_ostrich.sh
bash play_files/play_dep_dmcontrol_quadruped.sh
```
You might see a more interesting ostrich behavior by disabling episode resets in the ostrich environment first.
## Environments
The ostrich environment can be found [here](https://github.com/vittorione94/ostrichrl) and is installed automatically via poetry.
The arm environment [warmup](https://github.com/P-Schumacher/warmup) is also installed automatically by poetry and can be used like any other gym environment:
```
import gym
import warmup  # registers the humanreacher environment with gym

env = gym.make("humanreacher-v0")

for ep in range(5):
    ep_steps = 0
    state = env.reset()
    while True:
        # sample a random muscle activation and step the environment
        next_state, reward, done, info = env.step(env.action_space.sample())
        env.render()
        if done or (ep_steps >= env.max_episode_steps):
            break
        ep_steps += 1
```
The humanoid environments were simulated with [SCONE](https://scone.software/doku.php?id=start). A ready-to-use RL package will be released in cooperation with GOATSTREAM at a later date.
## Source Code Installation
We recommend an installation with [poetry](https://python-poetry.org/) to ensure reproducibility.
While [TonicRL](https://github.com/fabiopardo/tonic) with PyTorch is used for the RL algorithms, DEP itself is implemented in JAX. We *strongly* recommend using a GPU to speed up the computation of DEP. On systems without a GPU, give the TensorFlow version of TonicRL a try! Alternatively, we provide a pip_requirements file.
1. Make sure to install poetry and deactivate all virtual environments.
2. Clone the repository
```
git clone https://github.com/martius-lab/depRL
```
3. Go to the root folder and run
```
poetry install
poetry shell
```
That's it!
The build has been tested with:
```
Ubuntu 20.04 and Ubuntu 22.04
CUDA 12.0
poetry 1.4.0
```
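After the install, a quick check that both the PyTorch and JAX backends are usable (device names vary between systems):
```python
# Post-install check: both backends should import and report the available hardware.
import torch
import jax

print("torch CUDA available:", torch.cuda.is_available())
print("jax devices:", jax.devices())
```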
### Troubleshooting
* A common error with poetry is a faulty interaction with the python keyring, resulting in a `Failed to unlock the collection!` error. The same issue can also make dependency resolution take very long (more than 60 s).
If this happens, try appending
```
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
```
to your `.bashrc`. You can also try prepending it to the poetry command: `PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring poetry install`.
* If you get an error related to your `ptxas` version, your CUDA environment is not set up correctly and you should install the cuda-toolkit. If you don't have admin rights on your workstation, the easiest way is to do this via conda.
We recommend running
```
conda install -c conda-forge cudatoolkit-dev
```
* In any other case, first try to delete the `poetry.lock` file and the virtual env `.venv`, then run `poetry install` again.
Feel free to open an issue if you encounter any problems.
## Citation
Please use the following citation if you build on our work:
```
@inproceedings{schumacher2023:deprl,
  title = {DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems},
  author = {Schumacher, Pierre and Haeufle, Daniel F.B. and B{\"u}chler, Dieter and Schmitt, Syn and Martius, Georg},
  booktitle = {Proceedings of the Eleventh International Conference on Learning Representations (ICLR)},
  month = may,
  year = {2023},
  url = {https://openreview.net/forum?id=C-xa_D3oTj6},
  month_numeric = {5}
}
```