It’s fine if the photo is either shopped or a false-perspective illusion. It could even be a drawing. The idea is that this sort of picture imposes a lot of barriers for the bot in question:
must be able to parse language
must be able to recognise objects in a picture, even out-of-proportion ones
must be able to guesstimate the size of those objects, based on nearby ones
must handle real-world knowledge, such as “X only fits Y if X is smaller than Y”
must handle hypothetical, unrealistic scenarios, such as “what if there was a kitty this big?”
Each of those barriers decreases the likelihood of a bot being able to solve the question.
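To make the compounding concrete, here is a rough sketch. The per-barrier success rates are made-up numbers for illustration, not measurements, and it assumes the barriers are independent of each other, which is a simplification. The names `barrier_success` and `overall_success` are hypothetical.

```python
from math import prod

# Hypothetical chance that a bot clears each barrier (illustrative values only).
barrier_success = {
    "parse language": 0.9,
    "recognise out-of-proportion objects": 0.8,
    "guesstimate relative sizes": 0.7,
    "apply real-world knowledge": 0.8,
    "handle unrealistic hypotheticals": 0.6,
}

def overall_success(rates):
    """Chance of clearing every barrier, assuming they are independent."""
    return prod(rates)

p = overall_success(barrier_success.values())
print(f"Chance of solving the whole question: {p:.2f}")  # about 0.24
```

Even though the bot clears each individual barrier more often than not in this toy setup, it only gets through all five about a quarter of the time.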