| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-04 | 3.4 kB | |
| v5.2.0 source code.tar.gz | 2026-03-04 | 7.1 MB | |
| v5.2.0 source code.zip | 2026-03-04 | 7.3 MB | |
| Totals: 3 Items | | 14.4 MB | 2 |
5.2.0 (2026-03-04)
Bug Fixes
- adjust CrossQ frames from learning curves — Ant 3M, Humanoid 2M, Swimmer 5M (f251cba)
- align CrossQ MuJoCo specs with actual benchmark runs for reproducibility (2cf11b0)
- align CrossQ reproduce table max_frame with actual run data (a4d3026)
- canonical numeric substitution and unsubstituted var validation in specs (1114fb5)
- cap CrossQ max_frame at SAC levels for fair comparison (1aae308)
- CartPole 200K→300K frames, Humanoid iter=2→4 for higher UTD (f0f1dc0)
- CartPole revert to 200K frames, add training_iter=2 for more gradients (bda2611)
- CartPole training_iter=2 for moderate UTD bump (1323766)
- CartPole training_iter=4, BRN warmup=2000 for more gradients (f57d843)
- CartPole training_start_step=5000 for better initial buffer diversity (3e6102d)
- correct 4 BENCHMARKS.md discrepancies found in audit (7cfc6c5)
- CrossQ InvDblPend critic [1024]→[512], Humanoid 3.5M→4M frames (92803f7)
- restore PPO minibatch_size=64 and hardcoded Atari max_frame (d8f5078)
- revert CartPole spec to match arc run that scored 405 (d8695ec)
- revert pinned memory to diagnose consistent score regression (21beecb)
- update dstack configs for 0.20.x compatibility (ca91e73)
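
The spec fix in 1114fb5 mentions two behaviors: writing substituted numbers in a canonical form, and rejecting specs that still contain unsubstituted variables. The sketch below illustrates the idea only. The `${name}` placeholder syntax and the names `canonical_number` and `substitute` are assumptions made for illustration, not the project's actual API.

```python
import re

# Hypothetical placeholder syntax: ${name}. The real spec format may differ.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def canonical_number(value):
    """Render a numeric value canonically: whole numbers lose exponent and decimal point."""
    number = float(value)
    if number.is_integer():
        return str(int(number))
    return repr(number)


def substitute(spec_text, variables):
    """Replace ${name} placeholders, then fail if any placeholder is left over."""

    def replace(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)  # keep it so the validation below can report it
        raw = variables[name]
        try:
            return canonical_number(raw)
        except (TypeError, ValueError):
            return str(raw)  # non-numeric values are inserted unchanged

    result = PLACEHOLDER.sub(replace, spec_text)
    leftover = sorted(set(PLACEHOLDER.findall(result)))
    if leftover:
        raise ValueError(f"unsubstituted spec variables: {leftover}")
    return result
```

With this sketch, `substitute('{"max_frame": ${max_frame}}', {"max_frame": "3e6"})` yields `{"max_frame": 3000000}`, and a spec with a placeholder that has no matching variable raises `ValueError` instead of being passed on silently.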
Features
- bump version to 5.2.0 — CrossQ algorithm (6e0f68b)
- CrossQ algorithm — SAC without target networks + BatchRenorm (945625b)
- CrossQ benchmark specs for all environments (37a25ea)
- CrossQ improvement specs for ⚠️ envs (InvPend, InvDblPend, Hopper) (82cf6c4)
- integrate optimization branch — pinned memory, profiler, PPO minibatch (2bdd565)
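
The CrossQ entry (945625b) describes the algorithm as SAC without target networks plus BatchRenorm. A toy sketch of that idea follows, assuming the usual CrossQ formulation: current and next inputs go through the critic in one joint forward pass, so the normalization statistics cover both halves, and the bootstrap value comes from the same critic rather than from a target network. Plain batch normalization stands in for BatchRenorm, and the entropy term, twin critics, and gradient handling are omitted. All names here are hypothetical and none of this is taken from the release's code.

```python
import math


def batch_norm(column, eps=1e-5):
    """Normalize one scalar feature column using statistics of the whole batch."""
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    return [(x - mean) / math.sqrt(var + eps) for x in column]


def critic(features, weight=1.0, bias=0.0):
    """Toy critic: batch-normalize scalar features, then apply a linear layer."""
    return [weight * x + bias for x in batch_norm(features)]


def crossq_targets(feat, next_feat, rewards, dones, gamma=0.99):
    """Compute Q-values and TD targets from a single joint forward pass."""
    n = len(feat)
    q_all = critic(feat + next_feat)  # batch statistics span both halves
    q, q_next = q_all[:n], q_all[n:]
    # Bootstrap from the same critic; a real implementation would stop gradients here.
    targets = [r + gamma * (1.0 - d) * qn for r, d, qn in zip(rewards, dones, q_next)]
    return q, targets
```

Calling `crossq_targets([0.0, 1.0], [2.0, 3.0], [1.0, 1.0], [0.0, 1.0])` returns the current Q-values and their targets; for the terminal transition the target reduces to the reward alone.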