Forget security – Google's reCAPTCHA v2 is exploiting users for profit | Web puzzles don't protect against bots, but humans have spent 819 million unpaid hours solving them

ForgottenFlux@lemmy.world · 2 months ago

repungnant_canary@lemmy.world · 2 months ago

It is undoubtedly a new piece of research, but the cause is always the same: corporations exploit people because they are taken out of government and democratic control effectively everywhere.

Some corporations employ more people and have bigger budgets than some countries and they often influence people’s lives more than the government. Yet they’re effectively electoral monarchies where electors and monarchs are just a bunch of rich assholes who respond to nobody.

Only when we change that system will those headlines stop.

Sunkblake@lemmy.world · edit-2 2 months ago

Is it only 7200 people solving reCAPTCHA every hour for the past 13 years? Feels like it should be more?
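A quick back-of-the-envelope check of that figure, using only the 819 million hours from the headline and the 13 years mentioned above. The result is an average number of people solving at any given moment, not a count of distinct people. The 10 seconds per solve is purely an assumption for illustration.

```python
# Figures taken from the thread: 819 million hours spread over 13 years.
TOTAL_HOURS = 819_000_000
YEARS = 13
HOURS_PER_YEAR = 365 * 24  # ignoring leap days

# Person-hours spent per wall-clock hour, i.e. the average number of
# people solving a captcha at any given moment.
average_concurrent_solvers = TOTAL_HOURS / (YEARS * HOURS_PER_YEAR)

# ASSUMPTION (not from the thread): a single solve takes about 10 seconds.
SECONDS_PER_SOLVE = 10
solves_per_hour = average_concurrent_solvers * 3600 / SECONDS_PER_SOLVE

print(round(average_concurrent_solvers))  # roughly 7200
print(round(solves_per_hour))             # a few million under the assumption above
```

So 7200 is better read as "7200 people busy with a captcha at every moment, around the clock", which under the assumed solve time works out to millions of solves per hour.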

kingthrillgore@lemmy.ml · edit-2 2 months ago

Remember the good old days when it was just malformed text you had to solve? I miss those days. AI was complete garbage and they had to use farms of eyeballs to solve them for bots, making it a costly operation. We’ve now totally gotten away from all of that.

0laura@lemmy.world · 2 months ago

that was also to train ai.

dan@upvote.au · edit-2 2 months ago

No it wasn’t… It was human-assisted OCR to help digitize books. Initially for Project Gutenberg, but then for Google Books once Google acquired it in 2009.

gentooer@programming.dev · 2 months ago

OCR is a form of AI.

wreckedcarzz@lemmy.world · 2 months ago

I thought this was old news 20 years ago?

FierySpectre@lemmy.world · 2 months ago

I mean, duh? With proof of work captchas existing, there’s no reason to have those image selection captchas… Ever…

How those work is by having the server generate a puzzle. Server side this is cheap to generate, while client side solving is “hard”. The server can even choose the difficulty of the puzzle, and even set it dynamically. This means that when your website is under light load the captcha can be really easy/fast to solve. If your website is under attack however the captcha can be set to take seconds to solve.
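A minimal sketch of the scheme described above, assuming a hashcash-style puzzle: the server hands out a random challenge, and the client searches for a nonce whose SHA-256 hash starts with enough zero bits. The function names, the 8-byte nonce encoding, and the load thresholds in `pick_difficulty` are illustrative assumptions, not the API of any particular captcha product.

```python
import hashlib
import os


def make_challenge() -> bytes:
    """Server side: generating a puzzle is just drawing random bytes (cheap)."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: checking a solution costs a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute force, about 2**difficulty_bits hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


def pick_difficulty(requests_per_second: float) -> int:
    """Hypothetical policy: raise the difficulty as load increases."""
    if requests_per_second < 50:
        return 12
    if requests_per_second < 500:
        return 16
    return 20


challenge = make_challenge()
difficulty = pick_difficulty(requests_per_second=10)
nonce = solve(challenge, difficulty)
print(verify(challenge, nonce, difficulty))  # True
```

Each extra difficulty bit doubles the expected client work while the server's verification cost stays at one hash, which is what makes the dynamic difficulty cheap to operate.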

MonkderVierte@lemmy.ml · 2 months ago

Does this work?

https://addons.mozilla.org/de/firefox/addon/noptcha/

I Cast Fist@programming.dev · 2 months ago

Judging from the reviews, it doesn’t

MonkderVierte@lemmy.ml · 2 months ago

Ah, right, there are reviews too.

ohmyiv@lemmy.world · 2 months ago

I tried it before. It worked for me on one small game website for account creation. After that it was more or less useless on any other site. It has a weird focus thing where it’ll try to solve the captcha before you can enter in login details so if by chance the extension works, you’ll fail the login anyways.

It still needs work. I think if the dev can work out those issues it could be great. Until then, it’s pretty much worthless.

HiramFromTheChi@lemmy.world · 2 months ago

There’s nothing that can express my disdain for Google’s reCaptcha.

😒 We’re training its AI models
😒 It’s free labor for Google
😒 Sometimes it wants the corner of an object, sometimes it doesn’t
😒 Wildly inconsistent
😒 Always blurry and hard to see
😒 Seemingly endless
😒 It’s the robot asking us humans if we’re the robots

TypicalHog@lemm.ee · 2 months ago

I always thought they are just getting the training data for AI using these.

Flying Squid@lemmy.world · 2 months ago

I had to deal with one yesterday that wouldn’t let me in no matter what I did.

So it isn’t even good at figuring out who isn’t a robot.

icedterminal@lemmy.world · 2 months ago

Solving too fast. I shit you not. Sometimes you have to go really slow. Like you’re 80 and can’t see very well trying to discern what’s in those boxes.

aaaaace@lemmy.blahaj.zone · 2 months ago

Try the headphone option.

brbposting@sh.itjust.works · 2 months ago

Finally heard a clear audio CAPTCHA for the first time in my life this past month. It was glorious. There was slight garbling before and after the characters were read, but that’s it.

Besides that singular experience, all audio CAPTCHAs have been utterly 100% impossible to interpret. Blaring white noise followed by a small squeak of “threeve” or “eleventeen”.

cley_faye@lemmy.world · 2 months ago

reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling

That’s funny, because when I’m faced with this, I keep adding or removing one of the images at random and it keeps accepting the result as OK.

Pulptastic@midwest.social · 2 months ago

I like this strategy.

serenissi@lemmy.world · 2 months ago

The objective of reCAPTCHA (or any captcha) isn’t to detect bots. It is more about stopping automated requests and rate limiting. The captcha is ‘defeated’ if the time it takes to solve it, whether by a human or a bot, is less than expected. Now humans are very slow, hence they can’t beat them anyway.

tb_@lemmy.world · 2 months ago

I thought captchas worked in a way where they provided some known good examples, some known bad examples, and a few examples which aren’t certain yet. Then the model is trained depending on whether the user selects the uncertain examples.

Also it’s very evident what’s being trained. First it was obscured words for OCR, then Google Maps screenshots for detecting things, now you see them with clearly machine-generated images.

nickwitha_k (he/him)@lemmy.sdf.org · 2 months ago

There are much better ways of rate limiting that don’t steal labor from people.

serenissi@lemmy.world · 2 months ago

hCaptcha and Microsoft CAPTCHA all do the same. Can you give an example of some that can’t easily be overcome just by better compute hardware?

nickwitha_k (he/him)@lemmy.sdf.org · 2 months ago

The problem is the unethical use of software that does not do what it claims and instead uses end users for free labor. The solution is not to use it. For rate limiting a proxy/load-balancer like HAProxy will accomplish the task easily. Ex:

serenissi@lemmy.world · 2 months ago

And what will you do if a person in a CGNAT is DoSing/scraping your site while you want others to access? IP based limiting isn’t very useful, both ways.

smb@lemmy.ml · 2 months ago

[…] reCAPTCHA […] isn’t to detect bots. It is more of stopping automated requests […]

which is bots. bots make automated requests, and anything that makes automated requests can also be called a bot (e.g. web crawlers are called bots too and, if they are well-behaved, also respect robots.txt, which has “bots” in its name for this very reason; “bots” is short for “robots”). using different words does not change the reality behind them, but it may suggest that someone is trying to put one over on someone else.

serenissi@lemmy.world · 2 months ago

There isn’t a good way to tell human users from scripts without adding too much friction to normal use. Also, bots are sometimes welcome and useful; it’s a problem when someone tries to mine data in large volume or effectively DoS the server.

Forget bots, there exist centers in India and other countries where you can employ humans to do ‘automated things’ (youtube like count, watch hour for example) at the same expense of bots. There are similar CAPTCHA services too. Good luck with those :)

Only rate limiting is the effective option.

smb@lemmy.ml · 2 months ago

Only rate limiting is the effective option.

i doubt that. you could maybe ratelimit per IP, and the abusers will change their IP whenever needed. if you ratelimit the whole service across all users in the world, then your service becomes useless exactly as fast as your ratelimiter is effective. if you ratelimit the actions of logged-in users, then your ratelimiting is limited by your ability to identify fake or duplicate accounts, where captchas are not helpful at all.

“at the same expense of bots”: they might be cheap, but i doubt that anyway; bots don’t need sleep.

i was answering about that wording (that captchas were “not” about bots but about “stopping automated requests”) and that automated requests “are” bots instead.

call centers are neither bots nor automated requests (the opposite IS their advantage) and thus have no relation to what i was specifically saying in reply to that post that suggested automated requests and bots would be different things in this context.

i wasn’t talking about the effectiveness of captchas either, or whether bots should be banned or not, only about bots being automated requests (and vice versa) from the perspective of the platform stopping bots, and that using different words for things (claiming something like “X isn’t X, it is really U!”* or that automated requests aren’t bots) does not change the reality of the thing itself.

*) unrelated to any (a-)social media platform

serenissi@lemmy.world · 2 months ago

stopping automated requests

yeah my bad. I meant too many automated requests. Both humans and bots generate spam, and the issue is a high influx of it. Legitimate users also use bots, and that is by no means harmful. That is why you do not encounter a captcha every time you visit a Google page, and why a couple of scraping scripts don’t run into problems. Recaptcha (or hcaptcha, say) triggers when there is a high volume of requests coming from the same IP. Instead of blocking everyone out to protect their servers, they can allow slower requests so legitimate users face minimal hindrance.

Most google services nowadays require accounts with stronger (like cell phone) verification so automated spam isn’t a big deal.

smb@lemmy.ml · 1 month ago

since bots are better at solving captchas and human-powered services exist that solve them, the only ones negatively affected by captchas are regular legitimate users. the bad guys use bots or services and are done. regular users have to endure them while no security is added. as for the influx, i guess it is much like having a better lock on the front door: if your lock is a bit better than your neighbour’s, theirs is more likely to be forced open than yours. it might help you, but it’s not real security, only a relative and very subjective feeling of “security”.

being slower than the wolves also isn’t so bad as long as you are not the slowest in your group (some people say)… so doing a bit more than others is always a good choice (just don’t set that bar too low, like using crowdsnakeoil for anything)

serenissi@lemmy.world · 1 month ago

the bad guys use bots or services and are done. regular users have to endure while no security is added

put in other words, common users can’t easily become the ‘bad guy’, i.e. the cost of an attack is higher, hence fewer script kiddies and automated attacks. You want to reduce the number. These protections are nothing for botnet owners or other high profile bad actors.

ps: recaptcha (or captcha in general) isn’t a security feature. At most it can be a safety feature.

interdimensionalmeme@lemmy.ml · 2 months ago

When they slow fade in the picture, I add one more software engineer to my kill list.

Appoxo@lemmy.dbzer0.com · 2 months ago

In case you didn’t know: This is already a thing, with pictures slowly fading in for selecting stuff like traffic cones or buses.

kingthrillgore@lemmy.ml · 2 months ago

I will gladly solve a reCAPTCHA for you today if you pay me for it today.

BangCrash@lemmy.world · 2 months ago

There’s platforms that do that.

I can pay a service to auto solve captcha and anything that can’t be solved will be pushed to a human to solve.

Never actually used it but it was interesting learning it existed

gradyp@awful.systems · 2 months ago

I honestly thought it was common knowledge that these things were essentially free labor for training AI.

dan@upvote.au · 2 months ago

The original reCAPTCHA from Carnegie Mellon University was helping to digitize books. It showed one known word and one unknown word, and if enough people answered the second word with the same answer, that’d be marked as the correct value.

thrawn@lemmy.world · 2 months ago

It’s basically always been outsourcing labor while checking. I guess they don’t want to provide that service for free.

But now that it doesn’t work, all it does is attempt to source free labor by refusing to show what you want to see. Cloudflare’s verification doesn’t show the puzzle because it’s not trying to make money off you.

Also, the books one reminds me of 4chan’s attempt to hijack it. Wasn’t a fan of the way they did it, but the intent was interesting.

lud@lemm.ee · 2 months ago

V3 of the Google one doesn’t always show a puzzle to you. In fact it’s designed to not be noticed by users at all. Whether that is successful or not is a different discussion.

thrawn@lemmy.world · 2 months ago

It might well be if it’s being used, but the site itself still uses v2 a lot. I get the picture one a lot when searching things up.

That actually makes me feel all the more strongly that it’s just there to extract free labor— they have something else, but still use v2 for what seems like most purposes

lud@lemm.ee · 2 months ago

the site

What site?

I assume it’s up to the website owner to implement V3 and not Google. V3 also has puzzles but only when it’s not sure. I rarely see captchas so I don’t really have anything to complain about.

xuv@lemmy.blahaj.zone · 2 months ago

I expect they mean the site google.com, because that’s been my experience. Whenever I get captcha’d there for using a VPN (which is getting more and more common), I always see the Maps image style captcha. Like 60% of the time it tells me I’m wrong anyway and I just give up.

lud@lemm.ee · 2 months ago

Alright, I don’t use google.com

thrawn@lemmy.world · 2 months ago

Yeah my b, I get captcha’d for VPN use. It’s almost always the “train our self driving car” one, and it tells me I’m wrong all the time too. Very frustrating

Google's reCAPTCHA v2 just labor exploitation, boffins say