META@lemmy.zip to Technology@beehaw.org • Sarah Silverman and other authors are suing OpenAI and Meta for copyright infringement, alleging that they're training their LLMs on books via Library Genesis and Z-Library (English)
GPT-3 is 800GB while the entirety of the English Wikipedia is around 10GB compressed. So yeah, it doesn't store every detail of everything, but LLMs do memorize a lot of things verbatim. Also see https://bair.berkeley.edu/blog/2020/12/20/lmmem/
Sounds good, but there isn't any consumer equipment that can handle 2GB/s (that's 16Gb/s). Even 10 Gigabit Ethernet switches are super expensive, and I don't think we have anything faster than 10Gb/s in the consumer networking space at all.
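The gigabyte-vs-gigabit comparison above is easy to trip over, so here's a minimal sketch of the conversion (the function name and the assumption of decimal SI prefixes, as is conventional for network link rates, are mine):

```python
# Sanity check: does a 2 GB/s transfer rate fit on a 10 Gigabit Ethernet link?
# Network link rates are quoted in bits per second; storage rates in bytes.

def gbytes_to_gbits(gb_per_s: float) -> float:
    """Convert a rate in gigabytes/s to gigabits/s (1 byte = 8 bits)."""
    return gb_per_s * 8

target = gbytes_to_gbits(2.0)   # 2 GB/s -> 16 Gb/s
link_10gbe = 10.0               # 10GbE line rate in Gb/s

print(f"2 GB/s = {target} Gb/s")
print(f"Fits on a single 10GbE link: {target <= link_10gbe}")
```

So even a full 10GbE link, before any protocol overhead, covers only about 1.25GB/s of the 2GB/s being discussed.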