| Author | Commit | Message | Date |
|---|---|---|---|
| ritchie | 569d5623b0 | ga bridge | 2017-12-26 13:48:09 +01:00 |
| ritchie | 1b0ebe6683 | goal proper added in her | 2017-12-16 16:34:27 +01:00 |
| ritchie | c7d6fea511 | fixed reward function -> less deflection is rewarding | 2017-12-16 15:26:44 +01:00 |
| ritchie | 599420dbe6 | no double elements anymore in her_deep_Q_bridge.ipynb. deploy instead of last action | 2017-12-16 15:12:27 +01:00 |
| vik | c21a0681d0 | popleft bug buffer fixed and double deep q learning added | 2017-11-06 16:17:50 +01:00 |
| vik | 86341c51ab | more stable learning due to target network | 2017-11-06 11:01:38 +01:00 |
| Ritchie | 7e7d931adc | Q algorithm learns | 2017-11-04 13:17:22 +01:00 |
| Ritchie | 40dcf31329 | policy network openai gym flagpole | 2017-10-31 22:30:20 +01:00 |