Yeah. I mean I started reading that story and was thinking how cool it would be… Until it started going bad. Something like a GPS for whatever task you were doing at work would be cool.
You should check out the short story Manna. It’s maybe a bit dated now but explores what could go wrong with that sort of thing.
Ahh ok, that makes sense. I think even with GPT4, it’s still going to be difficult for a non-programmer to use for anything that isn’t fairly trivial. I still have to lean on my own programming knowledge to know the right things to ask. Back in Feb or Mar, you would have been using GPT3 (4 requires you to pay monthly), and 3 is much worse at everything than 4.
I’m curious about this. What model were you using? A few people at my game dev company have said similar things about it not producing good code for Unity and Unreal. I haven’t seen that at all. I typically use GPT4 and Copilot. Sometimes the code has a logic flaw or something, but most of the time it works on the first try. At this point I have a ton of experience working with LLMs, so maybe it’s just a matter of prompting? When Copilot doesn’t read my mind (which tends to happen quite a bit), I just write a comment with what I want it to do, and sometimes I have to start writing the first line of code, but it usually catches on and does what I ask. I rarely run into a problem that is too hairy for GPT4, but it does happen.
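To make the comment-prompting workflow concrete, here’s a rough Python sketch of the kind of thing I mean; the function, names, and task are entirely made up for illustration, not something Copilot actually produced for me. You write the comment (and maybe the first line), and Copilot tends to fill in the rest along these lines:

    # Return the index of the enemy closest to the player, or None if there are no enemies.
    # Positions are (x, y) tuples. I write this comment, start the def line, and let it finish.
    def closest_enemy(player_pos, enemy_positions):
        closest_index = None
        best_dist = float("inf")
        for i, (x, y) in enumerate(enemy_positions):
            # Squared distance is enough for comparing which enemy is closest.
            dist = (x - player_pos[0]) ** 2 + (y - player_pos[1]) ** 2
            if dist < best_dist:
                best_dist = dist
                closest_index = i
        return closest_index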
In the case I mentioned, it was just a poorly aligned LLM. The ones from OpenAI would almost definitely not do that, because they go through a process called RLHF where those sorts of negative responses get trained out of them for the most part. Of course there’s still stuff that will get through, but unless you are really trying to get it to say something bad, it’s unlikely to do something like in that article. That’s not to say they won’t say something accidentally harmful. They are really good at telling you things that sound extremely plausible but are actually false, because they don’t really have any way of checking by default. I have to cross-check the output of my system all the time for accuracy. I’ve spent a lot of time building in systems to make sure it’s accurate, and it generally is on the important stuff.

Tonight it did have an inaccuracy, but I sort of don’t blame it, because the average person could have made the same mistake. I had it looking up contractors to work on a bathroom remodel (a fake test task), and it googled for the phone number of the one I picked from its suggestions. Google proceeded to give a phone number in a big box, with tiny text saying a different company’s name. Anyone not paying close attention (including my AI) would call that number instead. It wasn’t an ad or anything; somehow this company just came up in the little info box any time you searched for the other company.
Anyway, as to your question, they’re actually pretty good at knowing what’s harmful when they are trained with RLHF. Figuring out what’s missing to prevent them from saying false things is an open area of research right now, so in effect, nobody knows how to fix that yet.
That last bit already happened. An AI (allegedly) told a guy to commit suicide and he did. A big part of the problem is that while GPT4, for instance, knows all about the things you just said and can probably do what you’re suggesting, nobody can guarantee it won’t get something horribly wrong at some point. Sort of like how self-driving cars can handle 95% of situations correctly, but the remaining 5% of unexpected stuff, which maybe takes some extra context that a human has and the car was never trained on, is very hard to get past.
That’s possible now. I’ve been working on such a thing for a bit now, and it can generally do all that, though I wouldn’t advise using it for therapy (or medical advice), mostly for legal reasons rather than ability. When you create a new agent, you can tell it what type of personality you want. It doesn’t just respond to commands but also figures out what needs to be done and does it independently.
Etsy employee #3 or so here, but I haven’t worked there in more than a decade. Rob is a great guy, but I don’t think he could have grown Etsy the way it has grown. I’m sure some people will say that’s not a bad thing, but my response is you probably wouldn’t know about Etsy if he had stayed on.
I think on the whole, the new CEO has done more good than bad for the company. Etsy has always gotten criticism for non-handmade stuff being sold on there. I think they could do more on that front, and if the video is right that the new CEO is allowing non-handmade stuff on there, I don’t agree with him on that. I haven’t seen that myself, and I do still use the site. While he’s made other decisions I don’t agree with, encouraging sellers to offer free shipping was a good move; many buyers expect that thanks to Amazon. The fee increases, while they certainly had an impact on sellers’ bottom lines, don’t compare to what Amazon Handmade (if that still exists) and eBay charge, not to get into most other marketplaces like the app stores that charge 30%. The current CEO, in my opinion, understands Etsy way more than the other two they had after Rob was out.
Also, in terms of Fred Wilson, she should have done a little more homework on him. He was one of the original investors. He understands Etsy. He’s also entitled to some return for making a very risky investment in 4 kids (they were like 20 when they started it). I haven’t spoken to Fred in some time, so maybe he’s changed, but I doubt it.
Anyway, I don’t mean to be so negative about the video, but I also don’t think Etsy has lost its way as much as the video implies. Granted I am not a seller, just a user at this point.
Bing is GPT4-based, though I don’t think it’s the same version as ChatGPT. Either way, GPT4 can solve these types of problems all day.
Yeah. It still definitely does! The interesting thing is that it seems to be very good at estimating and the final answer it gives is usually pretty close to correct in my experience. Of course close doesn’t really count in math problems.
Not surprised. I got access to bard a while back and it does quite a lot more hallucinating than even GPT3.5.
Though doubling down on the wrong answer even when corrected is something I’ve seen GPT4 do in some cases too. It seems like once it says something, it usually sticks to it.
I was going to say you could give it a math problem that uses big numbers, but I tried one on GPT4 and it succeeded. GPT3, though, will absolutely fail at nontrivial math every time.
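If you want to try this yourself, an easy way is to generate a big multiplication problem and let Python be the ground truth, since its integers are exact at any size; this little script is just a sketch of that idea:

    import random

    # Make a multiplication problem with large operands, print it so you can paste it
    # into the model, then print the exact answer to compare against the model's reply.
    a = random.randint(10**8, 10**9)
    b = random.randint(10**8, 10**9)
    print(f"What is {a} * {b}?")
    print("Exact answer:", a * b)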
From ChatGPT 4:
Yes, the box is awesome.
According to the provided statements, the box is yellow and red.
“If the box is yellow, it is good.” - So, since the box is yellow, it is good.
“If the box is red, it is happy.” - And since the box is red, it is happy.
Finally, “If the box is good and happy, the box is awesome.” - Therefore, because the box is both good (due to being yellow) and happy (due to being red), the box is indeed awesome.
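For what it’s worth, the puzzle it’s working through is just three rules applied to two facts, which you can sanity-check in a few lines of Python (the variable names are mine, purely for illustration):

    # Encode the box puzzle as plain booleans to verify the reasoning chain above.
    yellow = True
    red = True

    good = yellow              # "If the box is yellow, it is good."
    happy = red                # "If the box is red, it is happy."
    awesome = good and happy   # "If the box is good and happy, the box is awesome."

    print(awesome)  # prints True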
My girlfriend and I recently decided to watch every Arnold Schwarzenegger movie in order. We saw Hercules in New York this weekend. It was pretty amusing. They clearly shot all the Mt. Olympus scenes in Central Park, because you can hear the traffic in the background and the occasional crying baby or whatnot.
Funny story… I switched to Home Assistant from custom software I wrote when I realized I was reverse-engineering the MyQ API for the 5th time and really didn’t feel like doing it a 6th. Just ordered some ratgdos.