REAL Talk With Sam Holcman

Large data sources with bad data. Smaller data sources with good data. Which is better?

6 min • 26 July 2023

Large bad data: obviously not good. Some have recognized this as “model collapse”, where bad data causes more bad data to be generated. Small bad data: nothing needs to be said here. Small good data (vetted, with known provenance), perhaps kept within your enterprise walls, is perhaps the way to go until large, good training data appears. When will that happen? I do not think it is on the horizon.
