
SLM Lab 5.2.0 (2026-03-04)

Bug Fixes

  • adjust CrossQ frames from learning curves — Ant 3M, Humanoid 2M, Swimmer 5M (f251cba)
  • align CrossQ MuJoCo specs with actual benchmark runs for reproducibility (2cf11b0)
  • align CrossQ reproduce table max_frame with actual run data (a4d3026)
  • canonical numeric substitution and unsubstituted var validation in specs (1114fb5)
  • cap CrossQ max_frame at SAC levels for fair comparison (1aae308)
  • CartPole 200K→300K frames, Humanoid iter=2→4 for higher UTD (f0f1dc0)
  • CartPole revert to 200K frames, add training_iter=2 for more gradients (bda2611)
  • CartPole training_iter=2 for moderate UTD bump (1323766)
  • CartPole training_iter=4, BRN warmup=2000 for more gradients (f57d843)
  • CartPole training_start_step=5000 for better initial buffer diversity (3e6102d)
  • correct 4 BENCHMARKS.md discrepancies found in audit (7cfc6c5)
  • CrossQ InvDblPend critic [1024]→[512], Humanoid 3.5M→4M frames (92803f7)
  • restore PPO minibatch_size=64 and hardcoded Atari max_frame (d8f5078)
  • revert CartPole spec to match arc run that scored 405 (d8695ec)
  • revert pinned memory to diagnose consistent score regression (21beecb)
  • update dstack configs for 0.20.x compatibility (ca91e73)
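The spec fix above (1114fb5) mentions canonical numeric substitution and validation of unsubstituted variables. The sketch below illustrates that general idea only; it is not SLM Lab's code. The `${name}` placeholder syntax, the helper names `canonical` and `substitute`, and the example spec keys are all assumptions made for illustration.

```python
import json
import math
import re

# Hypothetical placeholder syntax: ${name}. This is an assumption, not confirmed by the release notes.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def canonical(value: str) -> str:
    """Render a numeric string in one canonical form; leave non-numeric strings unchanged."""
    try:
        number = float(value)
    except ValueError:
        return value
    if not math.isfinite(number):
        return value  # inf/nan are not valid JSON numbers, so pass them through untouched
    if number.is_integer():
        return str(int(number))  # "3e5" -> "300000"
    return repr(number)  # "0.50" -> "0.5"


def substitute(spec_text: str, variables: dict[str, str]) -> dict:
    """Fill placeholders in a JSON spec string, then fail loudly if any remain."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            return match.group(0)  # keep the placeholder so validation can report it
        return canonical(variables[name])

    filled = PLACEHOLDER.sub(replace, spec_text)
    leftover = sorted(set(PLACEHOLDER.findall(filled)))
    if leftover:
        raise ValueError(f"unsubstituted spec variables: {leftover}")
    return json.loads(filled)


# Hypothetical spec fragment: a quoted string variable and a bare numeric variable.
spec_text = '{"env": "${env}", "max_frame": ${max_frame}}'
spec = substitute(spec_text, {"env": "CartPole-v1", "max_frame": "3e5"})
print(spec)
```

Writing `3e5` as `300000` before parsing means the value loads as an integer rather than a float, and raising on leftover placeholders surfaces a missing variable at load time instead of partway through a run.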

Features

  • bump version to 5.2.0 — CrossQ algorithm (6e0f68b)
  • CrossQ algorithm — SAC without target networks + BatchRenorm (945625b)
  • CrossQ benchmark specs for all environments (37a25ea)
  • CrossQ improvement specs for ⚠️ envs (InvPend, InvDblPend, Hopper) (82cf6c4)
  • integrate optimization branch — pinned memory, profiler, PPO minibatch (2bdd565)
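The CrossQ entry above describes the algorithm as SAC without target networks plus BatchRenorm. The sketch below illustrates the core idea as described in the CrossQ paper: the critic evaluates the current and next state-action batches in one joint forward pass, so the normalization statistics cover both, and the next-state value comes from the same critic instead of a target network. It is a NumPy illustration under simplifying assumptions, not SLM Lab's implementation: it uses plain batch normalization rather than BatchRenorm, a single critic with one hidden layer, and omits SAC's entropy term and the gradient step. All function and parameter names are hypothetical.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    """Normalize each feature with statistics of the batch it is given (plain BN, no learned scale)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def critic_forward(x, w1, w2):
    """One-hidden-layer critic: linear -> batch norm -> ReLU -> linear."""
    hidden = np.maximum(batch_norm(x @ w1), 0.0)
    return (hidden @ w2).squeeze(-1)


def crossq_td_targets(obs_act, next_obs_act, reward, done, w1, w2, gamma=0.99):
    """Return (q, target) from a single joint pass, with no target network."""
    batch = obs_act.shape[0]
    # Concatenate so batch statistics are computed over both current and next inputs.
    joint = np.concatenate([obs_act, next_obs_act], axis=0)
    q_joint = critic_forward(joint, w1, w2)
    q, next_q = q_joint[:batch], q_joint[batch:]
    # In training, the target would be treated as a constant (no gradient through next_q).
    target = reward + gamma * (1.0 - done) * next_q
    return q, target


rng = np.random.default_rng(0)
obs_act = rng.normal(size=(4, 3))       # 4 transitions, 3 concatenated state-action features
next_obs_act = rng.normal(size=(4, 3))
w1 = rng.normal(size=(3, 8))
w2 = rng.normal(size=(8, 1))
reward = np.ones(4)
done = np.ones(4)                        # all terminal, so the target reduces to the reward
q, target = crossq_td_targets(obs_act, next_obs_act, reward, done, w1, w2)
```

In a full agent, the critic loss would be the squared difference between `q` and the stop-gradient `target`, with the actor update following SAC.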
Source: README.md, updated 2026-03-04