Dopamine – Research framework for fast prototyping of reinforcement learning algorithms

By Kirti Bakshi



Over the past few years, reinforcement learning (RL) research has seen a number of significant advances, and this progress matters well beyond the benchmarks themselves: the algorithms behind it are also applicable to other domains, such as robotics.

Developing such advances often requires iterating quickly over a design, frequently with no clear direction, and disrupting the structure of established methods. However, most existing RL frameworks do not offer the combination of flexibility and stability that researchers need to iterate effectively on RL methods and explore new research directions whose benefits may not be immediately obvious.

Further, reproducing results with existing frameworks is often time-consuming, which can lead to scientific reproducibility issues down the line. This article therefore presents a new TensorFlow-based framework: Dopamine, a research framework for fast prototyping of reinforcement learning algorithms that aims to provide flexibility, stability, and reproducibility for new and experienced RL researchers alike.

What are the principles that Dopamine is based on?

Dopamine takes its name and inspiration from one of the main components of reward-motivated behaviour in the brain, reflecting the strong historical connection between neuroscience and reinforcement learning research, and it aims to enable the kind of speculative research that can drive radical discoveries.

To do so, the framework was designed with the following considerations in mind:

1- Ease Of Use:

Clarity and simplicity are the two key considerations in the framework's design. The code provided is compact and well-documented. This is achieved by focusing on a mature, well-understood benchmark, the Arcade Learning Environment, and four value-based agents:

  • DQN,
  • C51,
  • A simplified, carefully curated variant of the Rainbow agent,
  • The Implicit Quantile Network agent, which was presented at the International Conference on Machine Learning (ICML).
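All four are value-based agents: each learns an action-value estimate Q(s, a) and acts mostly greedily with respect to it. As a rough illustration of the action-selection step these agents share, here is a minimal epsilon-greedy sketch in plain Python; the function name and interface are hypothetical, not Dopamine's actual API.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Epsilon-greedy action selection over a list of Q-value estimates.

    With probability epsilon, pick a uniformly random action (exploration);
    otherwise pick the action with the highest estimated value (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon = 0 the choice is purely greedy.
print(epsilon_greedy([0.1, 0.9, 0.3], 0.0, random.Random(0)))  # prints 1
```

In practice, DQN-style agents anneal epsilon from a high value toward a small one over the course of training, trading exploration for exploitation.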

2- Reproducibility:

The team is particularly sensitive to the importance of reproducibility in reinforcement learning research. To this end, their code is provided with full test coverage that serves as an additional form of documentation.

3- Benchmarking:

It is very important for new researchers to be able to benchmark their ideas against established methods quickly. As such, the team provides the full training data of the four provided agents, across the 60 games supported by the Arcade Learning Environment, available as Python pickle files for agents trained with their framework and as JSON data files for comparison with agents trained in other frameworks.
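For the JSON data in particular, a comparison script can stay lightweight. The sketch below uses only the standard library to average per-iteration episode returns; the record layout shown is a hypothetical example and would need adjusting to the actual schema of the published files.

```python
import json

# Hypothetical schema, one record per training iteration
# (the real files' field names may differ).
raw = """[
  {"iteration": 0, "train_episode_returns": [10.0, 12.0]},
  {"iteration": 1, "train_episode_returns": [15.0, 17.0]}
]"""

def mean_return_per_iteration(records):
    """Map each iteration number to its average training episode return."""
    return {
        r["iteration"]: sum(r["train_episode_returns"]) / len(r["train_episode_returns"])
        for r in records
    }

print(mean_return_per_iteration(json.loads(raw)))  # {0: 11.0, 1: 16.0}
```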

In addition to this, they also provide a website where the user can visualize the training runs for all provided agents on all 60 games quickly.

Conclusion:

Now that we know what Dopamine is and what it relies on, its design principles can be summarized as:

  • Easy experimentation: Make it easier for new users to run benchmark experiments.
  • Flexible development: Make it easier for new users to try out research ideas.
  • Compact and reliable: Provide implementations of a few battle-tested algorithms.
  • Reproducible: Facilitate reproducibility in results.
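These principles show up in how experiments are configured: Dopamine specifies agent and runner hyperparameters in gin-config files, so an experiment is described by a small, readable text file rather than scattered flags. The fragment below is an illustrative sketch in the spirit of Dopamine's DQN config; the exact binding names and values may differ between versions.

```
# Illustrative gin-config bindings for a DQN experiment
# (binding names and values are examples, not an exact copy of dqn.gin).
DQNAgent.gamma = 0.99           # discount factor
DQNAgent.update_horizon = 1     # n-step return length
Runner.num_iterations = 200     # outer training loop iterations
Runner.training_steps = 250000  # environment steps per iteration
```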

The team is currently using the framework in their own research and finding that it gives them the flexibility to iterate quickly over many ideas.

With such a design, it is hoped that the framework's flexibility and ease of use will empower researchers to try out new ideas, and it will be exciting to see what the larger community makes of it!


More Information: GitHub
