Companies are going all-in on artificial intelligence right now, investing millions or even billions into the area while slapping the AI initialism on their products, even when doing so seems strange and pointless.
Heavy investment and increasingly powerful hardware tend to mean more expensive products. To discover if people would be willing to pay extra for hardware with AI capabilities, the question was asked on the TechPowerUp forums.
The results show that over 22,000 people, a massive 84% of the overall vote, said no, they would not pay more. More than 2,200 participants said they didn’t know, while just under 2,000 voters said yes.
But instead of relying on the GPU to power it, the dedicated AI chip did the work. Like, it had its own distinct chip on the graphics card that would handle the upscaling.
I forget who demoed it, and searching for anything related to “AI” and “upscaling” gets buried under results about what they’re already doing.
That’s already the Nvidia approach: upscaling runs on the tensor cores.
And no, it’s not something magical, it’s just matrix math. AI workloads are lots of convolutions on gigantic, low-precision floating-point matrices. Low precision because neural networks are robust against random perturbation, and more rounding is exactly that: random perturbations. There’s no point in spending electricity and heat on high precision if it doesn’t make the output any better.
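To put a rough number on the rounding point, here is a minimal sketch (not taken from any real model): it rounds random weights and inputs to 16-bit floats and compares a matrix-vector product against the full-precision result. The matrix size, the Gaussian values, and the helper names `to_fp16`, `matvec`, and `relative_error` are all assumptions made for illustration, and the accumulation still happens in Python's 64-bit floats, so this only shows the effect of rounding the operands.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # struct's "e" format is binary16; packing and unpacking performs the rounding.
    return struct.unpack("e", struct.pack("e", x))[0]


def matvec(matrix, vector):
    """Plain matrix-vector product, accumulated in 64-bit floats."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def relative_error(reference, approx):
    """Euclidean norm of the difference divided by the norm of the reference."""
    num = sum((r - a) ** 2 for r, a in zip(reference, approx)) ** 0.5
    den = sum(r ** 2 for r in reference) ** 0.5
    return num / den


random.seed(0)  # fixed seed so the illustration is repeatable
rows, cols = 16, 256  # hypothetical layer size
weights = [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
inputs = [random.gauss(0.0, 1.0) for _ in range(cols)]

exact = matvec(weights, inputs)

# Same computation, but with every operand rounded to half precision first.
weights16 = [[to_fp16(w) for w in row] for row in weights]
inputs16 = [to_fp16(v) for v in inputs]
rounded = matvec(weights16, inputs16)

err = relative_error(exact, rounded)
print(f"relative error from fp16 rounding: {err:.2e}")
```

The error comes out small because the individual rounding errors have random signs and partly cancel across the 256 terms of each sum, which is the "rounding is just more random perturbation" argument in miniature.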
The kicker? Those tensor cores are less complicated than ordinary GPU cores. For general-purpose hardware, and that also includes consumer-grade GPUs, it’s way more sensible to make sure the ALUs can deal with 8-bit floats and leave everything else the same. That stuff is going to be standard by the next generation of even potatoes: every SoC with an included GPU has enough oomph to sensibly run reasonable inference loads. And by “reasonable” I mean actually quite big; as far as I’m aware, Firefox’s inbuilt translation, for example, runs on the CPU, the models are small enough.
Nvidia, OTOH, is very much in the market for AI accelerators and figured it could corner the upscaling market and sell another new generation of cards by making their software rely on those cores even though it could run on the other cores. As AMD demonstrated, AMD’s own upscaling also runs on Nvidia hardware.
What’s actually special sauce in that area are the RT cores, that is, accelerators for ray casting through BVHs (bounding volume hierarchies). That’s indeed specialised hardware, but those things are nowhere near fast enough to compute enough rays for even remotely tolerable outputs, which is where all that upscaling/denoising comes into play.
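For a sense of what that kind of hardware repeats millions of times per frame while walking a spatial tree, here is a minimal sketch of a ray-versus-box check using the standard "slab" method. It assumes axis-aligned bounding boxes and plain 3-tuples for points and directions; the function name `ray_hits_box` is made up for illustration, and real hardware does this in fixed-function units rather than anything resembling Python.

```python
import math


def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does a ray (starting at origin, going forward) hit an axis-aligned box?"""
    t_near, t_far = 0.0, math.inf  # only the forward half of the line counts
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must already be between them.
            if o < lo or o > hi:
                return False
            continue
        # Distances along the ray at which it crosses the two planes of this axis.
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs seen so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


unit_box = ((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
print(ray_hits_box((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), *unit_box))   # aimed at the box
print(ray_hits_box((0.0, 0.0, -5.0), (0.0, 1.0, 0.0), *unit_box))   # passes beside it
print(ray_hits_box((0.0, 0.0, -5.0), (0.0, 0.0, -1.0), *unit_box))  # pointing away
```

A tree traversal runs this test against a node's box and only descends into the children when it passes, so most of the scene gets skipped for any single ray.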
Nvidia’s tensor cores are inside the GPU; this was outside the GPU but on the same card (the PCB looked like an abomination). If I remember right, in total it used slightly less power but performed about 30% faster than normal DLSS.
Found it.
https://www.neowin.net/news/powercolor-uses-npus-to-lower-gpu-power-consumption-and-improve-frame-rates-in-games/
I can’t find a picture of the PCB though; that might have been a leak pre-reveal, and now that it’s revealed, good luck finding it.
Having to send full frames off of the GPU for extra processing has got to come with some extra latency/problems compared to just doing it actually on the GPU… and I’d be shocked if they have the motion vectors and other engine data that DLSS gets, since that would require the games to be specifically modified for this adaptation. IDK, but I don’t think we have enough details about this to really judge whether it’s useful or not, although I’m leaning on the side of ‘not’ for this particular implementation. They never showed any actual comparisons to DLSS either.
As a side note, I found this other article on the same topic where they obviously didn’t know what they were talking about and mixed up frame rates and power consumption; it’s very entertaining to read.
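A back-of-the-envelope sketch of that latency worry, with loud caveats: nothing is known here about the actual link between the GPU and the extra chip, so the bandwidth below is a purely hypothetical figure, as are the 4K RGBA frame format and the helper names `frame_bytes` and `transfer_ms`. It only shows how the arithmetic would go, not what this card does.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def transfer_ms(num_bytes: int, bytes_per_second: float) -> float:
    """Time to move num_bytes over a link, ignoring protocol overhead and queuing."""
    return num_bytes / bytes_per_second * 1000.0


# HYPOTHETICAL link speed, chosen only to make the arithmetic concrete.
ASSUMED_LINK_BYTES_PER_S = 16e9

size = frame_bytes(3840, 2160)           # one 4K frame at 4 bytes per pixel
one_way = transfer_ms(size, ASSUMED_LINK_BYTES_PER_S)
round_trip = 2 * one_way                 # out to the extra chip and back again
budget_60fps = 1000.0 / 60.0             # time available per frame at 60 FPS

print(f"frame size: {size / 1e6:.1f} MB")
print(f"round trip: {round_trip:.2f} ms of a {budget_60fps:.2f} ms frame budget")
```

Under these assumed numbers the copy alone eats a noticeable slice of each frame before any processing happens, which is why keeping the work on the GPU die is the more obvious design.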
I’ve been trying to find some better/original sources [1] [2] [3] and from what I can gather it’s even worse. It’s not even an upscaler of any kind, it apparently uses an NPU just to control clocks and fan speeds to reduce power draw, dropping FPS by ~10% in the process.
So yeah, I’m not really sure why they needed an NPU to figure out that running a GPU at its limit has always been wildly inefficient. Outside of getting that investor money of course.
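The inefficiency at the limit follows from the usual rule of thumb that dynamic power scales with voltage squared times frequency. A minimal sketch, assuming that approximation holds and ignoring static/leakage power entirely; the 10% voltage reduction paired with the 10% clock reduction is a made-up figure for illustration, not a measurement of any card, and `relative_dynamic_power` is a hypothetical helper.

```python
def relative_dynamic_power(freq_scale: float, volt_scale: float) -> float:
    """Dynamic power relative to baseline, using the approximation P ~ C * V^2 * f."""
    return volt_scale ** 2 * freq_scale


# HYPOTHETICAL operating point: 10% lower clock, which lets voltage drop 10% too.
freq_scale = 0.90
volt_scale = 0.90

power = relative_dynamic_power(freq_scale, volt_scale)
perf_per_watt = freq_scale / power  # treating performance as proportional to clock

print(f"power: {power:.3f}x baseline")
print(f"performance per watt: {perf_per_watt:.2f}x baseline")
```

With those assumed scales, giving up about a tenth of the clock cuts dynamic power by roughly a quarter, which is the trade any undervolting guide has described for years.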
OK, I guess it’s just kinda similar to dynamic overclocking/underclocking with a dedicated NPU. I don’t really see why a tiny $2 microcontroller or just the CPU can’t accomplish the same task, though.