8 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| ritchie | 569d5623b0 | ga bridge | 2017-12-26 13:48:09 +01:00 |
| ritchie | 1b0ebe6683 | goal proper added in her | 2017-12-16 16:34:27 +01:00 |
| ritchie | c7d6fea511 | fixed reward function -> less deflection is rewarding | 2017-12-16 15:26:44 +01:00 |
| ritchie | 599420dbe6 | no double elements anymore in her_deep_Q_bridge.ipynb. deploy instead of last action | 2017-12-16 15:12:27 +01:00 |
| vik | c21a0681d0 | popleft bug buffer fixed and double deep q learning added | 2017-11-06 16:17:50 +01:00 |
| vik | 86341c51ab | more stable learning due to target network | 2017-11-06 11:01:38 +01:00 |
| Ritchie | 7e7d931adc | Q algorithm learns | 2017-11-04 13:17:22 +01:00 |
| Ritchie | 40dcf31329 | policy network openai gym flagpole | 2017-10-31 22:30:20 +01:00 |