Chat GPT appears to hallucinate or outright lie about everything

Buttflapper@lemmy.world · 18 days ago

Chat GPT appears to hallucinate or outright lie about everything

cheddar@programming.dev · 18 days ago

It’s incorrect to ask chatgpt such questions in the first place. I thought we’ve figured that out 18 or so months ago.

ABCDE@lemmy.world · 18 days ago

Why? It actually answered the question properly, just not to the OP’s satisfaction.

ramirezmike@programming.dev · 18 days ago

because it could have just as easily confidentiality said something incorrect. You only know it’s correct by going through the process of verifying it yourself, which is why it doesn’t make sense to ask it anything like this in the first place.

ABCDE@lemmy.world · 18 days ago

I mean… I guess? But the question was answered correctly, I was playing Beat Saber on my 1060 with my Vive and Quest 2.

ramirezmike@programming.dev · 17 days ago

It doesn’t matter that it was correct. There isn’t anything that verifies what it’s saying, which is why it’s not recommended to ask it questions like that. You’re taking a risk if you’re counting on the information it gives you.

medgremlin@midwest.social · 17 days ago

(To be fair, I did manage to run Half Life: Alyx and Beat Saber on a 1060)

aberrate_junior_beatnik@lemmy.world · 18 days ago

ChatGPT does not “hallucinate” or “lie”. It does not perceive, so it can’t hallucinate. It has no intent, so it can’t lie. It generates text without any regard to whether said text is true or false.

GetOffMyLan@programming.dev · 18 days ago

Hallucinating is the term for when ai generate incorrect information.

aberrate_junior_beatnik@lemmy.world · 18 days ago

I know, but it’s a ridiculous term. It’s so bad it must have been invented or chosen to mislead and make people think it has a mind, which seems to have been successful, as evidenced by the OP

GetOffMyLan@programming.dev · edit-2 18 days ago

At no point does OP imply it can actually think and as far as I can see they only use the term once and use it correctly.

If you are talking about the use of “lie” that’s just a simplification of explaining it creates false information.

From the context there is nothing that implies OP thinks it has a real mind.

You’re essentially arguing semantics even though it’s perfectly clear what they mean.

aberrate_junior_beatnik@lemmy.world · 18 days ago

OP clearly expects LLMs to exhibit mind-like behaviors. Lying absolutely implies agency, but even if you don’t agree, OP is confused that

It did not simply analyze the best type of graphics card for the situation

The whole point of the post is that OP is upset that LLMs are generating falsehoods and parroting input back into its output. No one with a basic understanding of LLMs would be surprised by this. If someone said their phone’s autocorrect was “lying”, you’d be correct in assuming they didn’t understand the basics of what autocorrect is, and would be completely justified in pointing out that that’s nonsense.

helenslunch@feddit.nl · 18 days ago

Well, you’re wrong. Its right a lot of the time.

You have a fundamental misunderstanding of how LLMs are supposed to work. They’re mostly just text generation machines.

In the case of more useful ones like Bing or Perplexity, they’re more like advanced search engines. You can get really fast answers instead of personally trawling the links it provides and trying to find the necessary information. Of course, if it’s something important, you need to verify the answers they provide, which is why they provide links to the sources they used.

ABCDE@lemmy.world · 18 days ago

Perplexity has been great for my ADHD brain and researching for my master’s.

conciselyverbose@sh.itjust.works · 18 days ago

Except they also aren’t reliable at parsing and summarizing links, so it’s irresponsible to use their summary of a link without actually going to the link and seeing for yourself.

It’s a search engine with confabulation and extra steps.

helenslunch@feddit.nl · 18 days ago

Except they also aren’t reliable at parsing and summarizing links

Probably 90%+ of the time they are.

so it’s irresponsible to use their summary

You missed this part:

if it’s something important

hamms@lemmy.world · 18 days ago

I think this article does a good job of exploring and explaining how LLM attempts at text summarization could be more accurately described as “text shortening”; a subtle but critical distinction.

conciselyverbose@sh.itjust.works · 18 days ago

90% reliability is not anywhere remotely in the neighborhood of acceptable, let alone good.

No, I didn’t miss anything. All misinformation makes you dumber. Filling your head with bullshit that may or may not have any basis in reality is always bad, no matter how low the stakes.

helenslunch@feddit.nl · 18 days ago

Agree to disagree, I suppose.

conciselyverbose@sh.itjust.works · edit-2 18 days ago

You can’t just handwave away your deliberate participation in making humanity dumber by shoveling known bullshit as a valid source of truth.

helenslunch@feddit.nl · 18 days ago

I guess it’s a good thing I’m not doing that, then.

conciselyverbose@sh.itjust.works · 18 days ago

Wasting a ridiculous amount of energy for the sole purpose of making yourself dumber is literally all you’re doing every single time you use an LLM as a search engine.

ABCDE@lemmy.world · 18 days ago

Yes and no. 1060 is fine for basic VR stuff. I used my Vive and Quest 2 on one.

vxx@lemmy.world · 17 days ago

I think we shouldn’t expect anything other than language from a language model.

linearchaos@lemmy.world · 17 days ago

I don’t want to sound like an AI fanboy but it was right. It gave you minimum requirements for most VR games.

No man Sky’s minimum requirements are at 1060 and 8 gigs of system RAM.

If you tell it it’s wrong when it’s not, it will wake s*** up to satisfy your statement. Earlier versions of the AI argued with people and it became a rather sketchy situation.

Now if you tell it it’s wrong when it’s wrong, It has a pretty good chance of coming back with information as to why it was wrong and the correct answer.

VinS@sh.itjust.works · 17 days ago

Well I asked some questions yesterday about classes of DAoC game to help me choose a starter class. It totally failed there attributing skills to wrong class. When poking it with this error it said : you are right, class x don’t do Mezz, it’s the speciality of class Z.

But class Z don’t do Mezz either… I wanted to gain some time. Finally I had to do the job myself because I could not trust anything it said.

linearchaos@lemmy.world · 17 days ago

God I loved DAoC, Play the hell of it back in it’s Hey Day.

I can’t help but think it would have low confidence on it though, there’s going to be an extremely limited amount of training data that’s still out there. I’d be interested in seeing how well it fares on world of Warcraft or one of the newer final fantasies.

The problem is there’s as much confirmation bias positive is negative. We can probably sit here all day and I can tell you all the things that it picks up really well for me and you can tell me all the things that it picks up like crap for you and we can make guesses but there’s no way we’ll ever actually know.

VinS@sh.itjust.works · 17 days ago

I like it for brainstorming while debbuging, finding funny names, creating stories “where you are the hero” for the kids or things that don’t impact if it’s hallucinating . I don’t trust it for much more unfortunately. I’d like to know your uses cases where it works. It could open my mind on things I haven’t done yet.

DAoC is fun, playing on some freeshard (eden actually, started one week ago, good community)

webghost0101@sopuli.xyz · 18 days ago

This is an issue with all models, also the paid ones and its actually much worse then in the example where you at least expressed not being happy with the initial result.

My biggest road block with AI is that i ask a minor clarifying question. “Why did you do this in that way?” Expecting a genuine answer and being met with “i am so sorry here is some rubbish instead. “

My guess is this has to do with the fact that llms cannot actually reason so they also cannot provide honest clarification about their own steps, at best they can observe there own output and generate a possible explanation to it. That would actually be good enough for me but instead it collapses into a pattern where any questioning is labeled as critique with logical follow up for its assistant program is to apologize and try again.

Tellore@lemmy.world · 18 days ago

I’ve also had similar problem, but the trick is if you ask it for clarifications without it sounding like you imply them wrong, they might actually try to explain the reasoning without trying to change the answer.

webghost0101@sopuli.xyz · 18 days ago

I have tried to be more blunt with an underwhelming succes.

It has highlighted some of my everyday struggles i have with neurotypicals being neurodivergent. There are lots of cases where people assume i am criticizing while i was just expressing curiosity.

gravitas_deficiency@sh.itjust.works · 18 days ago

The “i” in LLM stands for intelligence

ipkpjersi@lemmy.ml · 18 days ago

Don’t use them for facts, use them for assisting you with menial tasks like data entry.

maniclucky@lemmy.world · 18 days ago

Best use I’ve had for them (data engineer here) is things that don’t have a specific answer. Need a cover letter? Perfect. Script for a presentation? Gets 95% of the work done. I never ask for information since it has no capability to retain a fact.

Oka@sopuli.xyz · 17 days ago

If I narrow down the scope, or ask the same question a different way, there’s a good chance I reach the answer I’m looking for.

https://chatgpt.com/share/ca367284-2e67-40bd-bff5-2e1e629fd3c0

filister@lemmy.world · edit-2 18 days ago

And you as an analytics engineer should know that already? I am using some LLMs on almost a daily basis, Gemini, OpenAI, Mistral, etc. and I know for sure that if you ask it a question about a niche topic, the chances for the LLM to hallucinate are much higher. But also to avoid hallucinating, you can use different prompt engineering techniques and ask a better question.

Another very good question to ask an LLM is what is heavier one kilogram of iron or one kilogram of feathers. A lot of LLMs are really struggling with this question and start hallucinating and invent their own weird logical process by generating completely credibly sounding but factually wrong answers.

I still think that LLMs aren’t the silver bullet for everything, but they really excel in certain tasks. And we are still in the honeymoon period of AIs, similar to self-driving cars, I think at some point most of the people will realise that even this new technology has its limitations and hopefully will learn how to use it more responsibly.

bane_killgrind@slrpnk.net · 18 days ago

They seem to give the average answer, not the correct answer. If you can bound your prompt to the range of the correct answer, great

If you can’t bind the prompt it’s worse than useless, it’s misleading.

Brkdncr@lemmy.world · 18 days ago

TIL ChatGPT is taking notes off my ex.

Petter1@lemm.ee · 18 days ago

For such questions you need to use a LLM that can search the web and summarise the top results in good quality and shows what sources are used for which parts of the answer. Something like copilot in bing.

emmy67@lemmy.world · 18 days ago

Or, the words “i don’t know” would work

Petter1@lemm.ee · 17 days ago

I don’t think LLM can do that very well, since there are very little people on the internet admitting that they don’t know about anything 🥸😂

Funny thing is, that the part of the brain used for talking makes things up on the fly as well 😁 there is great video from Joe about this topic, where he shows experiments done to people where the two brain sides were split.

https://youtu.be/_TYuTid9a6k?si=PylqvQ24QHWw_6PN

emmy67@lemmy.world · 17 days ago

Funny thing is, that the part of the brain used for talking makes things up on the fly as well 😁 there is great video from Joe about this topic, where he shows experiments done to people where the two brain sides were split.

Having watched the video. I can confidently say you’re wrong about this and so is Joe. If you want an explanation though let me know.

Petter1@lemm.ee · edit-2 17 days ago

Yes please! Hope you commented that on Joe‘s Video so he can correct himself in a coming video

SomeGuy69@lemmy.world · 17 days ago

People would move to the competition LLM that does always provide a solution, even if it’s wrong more often. People are often not as logical and smart as you wish.

r_se_random@sh.itjust.works · 18 days ago

The copilot app doesn’t seem to be any better.

r_se_random@sh.itjust.works · 18 days ago

Petter1@lemm.ee · 17 days ago

At least it gives you links to validate the info it serves you I’d say. LLM can do nothing about bad search results, the search algorithm works a bit different and is its own machine learning process.

But I just recognised, that chatGPT as well can search the web, if you prompt in the right way, and then it will give you the sources as well

r_se_random@sh.itjust.works · 17 days ago

But that also discredits me from ever asking an LLM a question which I don’t already know the answer to. If I have to go through the links to get my info, we already have search engines for it.

The entire point of LLM with Web search was to summarise the info correctly which I have seen them fail at, continuously and hilariously.

Petter1@lemm.ee · 17 days ago

Yea, but I prefer just writing what I am thinking instead of keywords. And more often than not, it feels like I get to answer more quickly as if I just used a search engine. But of course, I bet there are multiple people, that find stuff faster on web search engines, than me with LLM, it is just for me the faster way to find what I search.

Kazumara@discuss.tchncs.de · 18 days ago

It did not simply analyze the best type of graphics card for the situation.

Yes it certainly didn’t: It’s a large language model, not some sort of knowledge engine. It can’t analyze anything, it only generates likely text strings. I think this is still fundamentally misunderstood widely.

leftzero@lemmynsfw.com · 18 days ago

I think this is still fundamentally misunderstood widely.

The fact that it’s being sold as artificial intelligence instead of autocomplete doesn’t help.

Or Google and Microsoft trying to sell it as a replacement for search engines.

It’s malicious misinformation all the way down.

Christer Enfors@lemm.ee · 17 days ago

Agreed. As far as I know, there is no actual artificial intelligence yet, only simulated intelligence.