...The repository is organized as a set of tables that list corpora along with their languages, total hours, number of speakers, download links, and licenses, giving practitioners a quick way to find data that matches their needs. It emphasizes free and truly “open” datasets, favoring those released under Creative Commons or community-friendly data licenses, though it also lists corpora that are accessible for research and many commercial uses. The catalog covers well-known resources such as Mozilla Common Voice, Yesno, LJ Speech and numerous Nordic and parliamentary speech corpora, along with their license variants like CC-0 and CC-BY. ...