Could the reddit API changes have to do with ChatGPT rather than third party apps?

gotofritz@beehaw.org · edit-2 1 year ago

Could the reddit API changes have to do with ChatGPT rather than third party apps?

iMeddles@fedia.io · 1 year ago

Charging for their API is reasonable in answer to the LLM data scrapers. The amount they’re charging and the speed of the changes are not reasonable, however, IMO.

JohnDClay@sh.itjust.works · 1 year ago

The original announcement said they were making exceptions for applications that gave back to Reddit. I and many others hoped that was basically everyone who wasn’t AI scraping. But seems like they got greedy while they were at it and decided to kill everything

spoonful@beehaw.org · 1 year ago

Reddit data is public and can be easily web scraped. Reddit doesn’t own it. Spez is just throwing random memes in to distract people.

gotofritz@beehaw.org · 1 year ago

I am sorry but you don’t know what you are talking about. These things are regulated by legal documents, you don’t just wake up one morning and say “trust me bro, their data is public”

If you go and read their T&Cs, they explicitly state that scraping is forbidden without prior written consent. They only allow access to their data via APIs, which of course they charge for

The fact that it can be easily scraped is neither here nor there, if they catch you they can sue you

spoonful@beehaw.org · edit-2 1 year ago

Nah, Terms of Service are not enforceable through a browsewrap agreement in the US and most of the EU. You can’t implicitly agree to a legal document just by looking at something.

Check out the hiQ Labs v. LinkedIn case, which went to the 9th Circuit and set the precedent for this. LinkedIn lost.

deegeese@sopuli.xyz · edit-2 1 year ago

99% of LLMs have pirated content and will continue to regurgitate pirated content until there is enough money at stake for a big lawsuit.

gotofritz@beehaw.org · 1 year ago

Getty is already suing the Dall-E creators, and someone is suing MS for Copilot; so it’s already started

deegeese@sopuli.xyz · 1 year ago

Again, big money users will get sued, everyone else will scrape with impunity.

j4k3@lemmy.world · 1 year ago

The value of LLM’s has changed drastically in favor of open source since the Meta weights leak. The proprietary model looks pretty much wrecked now, at least as far as I understand the leaked internal memo from a google researcher last month.

https://www.semianalysis.com/p/google-we-have-no-moat-and-neither

MarPan@lemmy.world · 1 year ago

This is a fascinating read, thank you very much for sharing.

gotofritz@beehaw.org · 1 year ago

Oh I’m not saying they are doing the right thing or that it was the correct decision. Just speculating whether LLMs is what kicked off the whole thing

j4k3@lemmy.world · 1 year ago

I’m saying the premise that LLM’s have anything to do with it is either an incompetent failure to keep up with LLM developments, or a pack of lies.

gotofritz@beehaw.org · edit-2 1 year ago

I disagree, it’s still too early and a bit presumptuous to make such conclusive statements

whofearsthenight@beehaw.org · 1 year ago

Could they have something to do with it? Yes, for sure. But the thing is that they didn’t have to do any of this the way they did. They could have made an API plan that allowed third party apps to still exist/thrive, and also charge big companies that just want to use reddit to train LLM’s. Change the pricing/terms based around this idea. They deliberately went after third party apps, and then doubled and tripled down on it in the face of massive backlash. If spez was competent, he would have been able to better pivot this conversation and make it about training LLM’s for megacorps, but he didn’t and even then it would have still been bullshit that is easily seen past.

shortwavesurfer@monero.house · 1 year ago

Yes, but it could have been handled better. If AI was the problem they could have gone the route of only allowing API access after an application process, so they know who is using it, and everyone else trying to use it would get denied until they were assigned a key

Scrubbles@poptalk.scrubbles.tech · 1 year ago

This right here. They could have made a licensing agreement that is based on the classification your use falls into. Apps have one pricing model, LLMs have another. This is just lazy and greedy.

jay@beehaw.org · 1 year ago

100% and they also didn’t need to be total tools about it. giving a month window is a joke, being snarky assholes answering AMAs, telling their user base that profitability is the only thing that matters to them.

Surprising nobody, Reddit continues to make really awful business decisions. This is just another nail in their coffin.

dawnerd@lemm.ee · 1 year ago

No. Data scrapers will still scrape the site as long as they want to be indexed by search engines. IMO charging for API access is fine when reasonable. Lying about why you’re doing it isn’t.

hendrik@lemmy.ml · 1 year ago

This is mainly just being used as a pretext.

z2k_@lemmy.nz · 1 year ago

Yes but imo it would be easy to separate LLM and 3rd party apps since 3rd party apps have users sign in independently. They chose to also target 3rd party apps and take them down.

Schelleberg@feddit.de · 1 year ago

I’m very sure that this is the case. Reddit is pissed they gave away all the content as training data for free while struggling to monetize their platform adequately.

But I suspect the damage is already done. There are projects like “Orca” from Microsoft that skip much of the learning process from source data by using ChatGPT and GPT-4.

They missed the timing but are too stubborn and double down on it

damn@lemmy.fmhy.ml · 1 year ago

Why not both? I think they see this as an opportunity to kill two birds with one stone.

Crotaro@beehaw.org · 1 year ago

Surprisingly tough question. On one hand, I don’t think every ex-Reddit user should go “Nah, it’s too late, fam” because then it wouldn’t even make sense for the devs to make any changes if they had no chance of regaining their userbase. On the other hand, I feel like even if they made really good changes, I would still always be on edge waiting for the bad thing to happen (pretty much what I imagine an abusive relationship to be like).

rubythulhu@beehaw.org · 1 year ago

Yup. AI consumers are more profitable than 3rd party apps. Why focus on tiered pricing when you can just name a price point everyone has to pay that only huge AI companies are willing to pay?

Reddit gets their content for free. Reselling it at a high price to AI/ML consumers is an easy way to turn free content into profit with almost no effort.

Klinkertinlegs@beehaw.org · 1 year ago

I think the LLM wave hit, they saw dollar signs, and they made a change without thinking it through, but then they were backed into a corner between money and avoiding outrage, but greed won over.

nob0dy@beehaw.org · 1 year ago

They could have created better licensing models. It does rely on people honoring the agreements, but besides countries that disregard IP I think it’s a viable model. Their business is social media, not curating datasets.

Kris@lemmy.world · 1 year ago

Yes but nothing’s stopping scraping of reddit content from the front end

gotofritz@beehaw.org · 1 year ago

Technically not (well, they can make it harder), but they can sue them for doing it

jpv@beehaw.org · 1 year ago

Sure, but they could do the same thing with an API. Make scraping for LLMs against the TOS; not personal use. I really do think (as the OP says) it’s two birds with one stone.
