Commit Graph

19 Commits

Author | SHA1 | Message | Date
ritchie | 1b0ebe6683 | goal proper added in her | 2017-12-16 16:34:27 +01:00
ritchie | c7d6fea511 | fixed reward function -> less deflection is rewarding | 2017-12-16 15:26:44 +01:00
ritchie | 599420dbe6 | no double elements anymore in her_deep_Q_bridge.ipynb. deploy instead of last action | 2017-12-16 15:12:27 +01:00
vik | 8e6f17f036 | log regression formula update | 2017-11-15 13:26:45 +01:00
vik | c21a0681d0 | popleft bug buffer fixed and double deep q learning added | 2017-11-06 16:17:50 +01:00
vik | 86341c51ab | more stable learning due to target network | 2017-11-06 11:01:38 +01:00
Ritchie | 7e7d931adc | Q algorithm learns | 2017-11-04 13:17:22 +01:00
Ritchie | 40dcf31329 | policy network openai gym flagpole | 2017-10-31 22:30:20 +01:00
Ritchie | 5b68b7ec43 | logistic regression | 2017-09-10 13:54:17 +02:00
Ritchie | f351e89421 | readme | 2017-09-03 12:50:43 +02:00
Ritchie | 7dfe24aa76 | vanilla_mlp | 2017-07-13 11:28:12 +02:00
Ritchie | 36ca828e6b | No activation prime | 2017-07-01 12:20:00 +02:00
Ritchie | 43a8098031 | learning for both nn working | 2017-07-01 12:01:07 +02:00
Ritchie | ea28e2ee29 | Backprop working | 2017-06-30 18:13:59 +02:00
Ritchie | 70b4cc7bc9 | Backprop not exploding | 2017-06-30 17:49:09 +02:00
Ritchie | 5c56c91a4a | backprop though not working | 2017-06-30 17:03:58 +02:00
Ritchie | a3c38d2ad7 | Various layer network. Feed forward | 2017-06-30 14:11:02 +02:00
Ritchie | c21622525c | cross entropy delta | 2017-06-25 22:05:55 +02:00
Ritchie | add5381051 | first | 2017-06-24 21:13:41 +02:00