When it comes to graphics cards, more is nearly always better. But what if you can only make the chips so big?
Simple: add another chip! (Not so easy, as you'll soon learn.) Here's a short stroll through the story of multi-GPU graphics cards, the true giants of performance, power, and price.
The Dawn of the Dual
It wasn't always like this. A few of the very first 3D graphics cards sported multiple chips, although these weren't truly GPUs. 3dfx's Voodoo 1, launched in 1996, had two processors on the circuit board, but one only handled textures, while the other blended pixels together.
Like so many early 3D accelerators, it required a separate card altogether for 2D work. Companies such as ATI, Nvidia, and S3 focused their chip development on incorporating all of the individual processors into a single design.
The Oxygen chips would turn the vertices into triangles, rasterize the frame, and then texture and color the pixels (read our rendering 101 guide for an overview of the process). But why four of them? Why didn't 3DLabs just make one huge, super-powerful chip?
Before we head back in time to the beginning of our story, let's examine how the vast majority of graphics cards are equipped these days. On the circuit board, once you've removed the cooling system, you'll find an extremely large chunk of silicon: the graphics processing unit (GPU).
All of the calculations and data handling required to accelerate 2D, 3D, and video processing are done by that one chip. The only other notable components you'll find are DRAM modules, dedicated for the GPU to use, and some voltage regulators.
Once the powerhouse of the professional rendering industry, 3DLabs built their reputation on monstrous devices such as the Dynamic Pictures Oxygen 402, shown below.
This card has two large chips, bottom left, for handling 2D processing and video output, plus four accelerators (hidden beneath heatsinks) for all of the 3D workload. In those days, vertex processing was done on the CPU, which then passed the remainder of the rendering to the graphics card.
Image: Wikipedia
The Voodoo 1: incredible power for its time. Image: VGA Museum
The more pieces of silicon a card sported, the more costly it was to produce, which is why consumer-grade models quickly switched to single chips only. However, professional graphics cards of the same era as the Voodoo 1 frequently took a multi-chip approach.
What Multi-GPU Offers
To understand why 3DLabs opted for so many processors, let's take a broad overview of the process of producing and displaying a 3D image. In theory, all of the calculations required can be done on a CPU, but CPUs are designed to handle random, branching tasks issued in a linear manner.
3D graphics is far more straightforward, but many of the stages require a large amount of parallel work, something that's not a CPU's strength. And if the CPU is tied up handling the rendering of a frame, it can't really be used for anything else.
This is why graphics processors were developed: the preparatory work for a 3D frame is still done on the CPU, but the math for the graphics itself is done on a highly specialized chip. The image below represents the timeline for a sequence of four frames, where the CPU issues the required tasks at set intervals.
And there's still a notable delay between the first frame being issued and it appearing on the screen, all caused by the fact that a single GPU still has to process the entire frame.
With this approach, each frame is processed far quicker, reducing the delay between the CPU issuing the work and it being displayed. The overall frame rate might not be any better than with AFR (possibly even a little worse), but it is more consistent.
Another way of looking at it is that the GPU's frame rate is lower than the CPU's. A more powerful graphics chip would obviously reduce the time needed to render a frame, but if there are engineering or manufacturing limits to how good you can make one, what are the alternatives?
Well, there are two: (1) use another GPU to begin the next frame while the first is still processing the current one, or (2) split the workload of a single frame across several chips. The first approach is usually called alternate frame rendering, or AFR for short.
The other approach involves splitting the rendering work across two or more GPUs, sharing out areas of the frame in blocks (split frame rendering) or alternating lines of pixels (as used by the Dynamic Pictures Oxygen card).
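To make the two approaches concrete, here's a minimal C++ sketch: a toy timeline model with invented timings rather than real hardware figures, simulating when frames finish under a single GPU, two-GPU AFR, and two-GPU split-frame rendering.

```cpp
// Toy timeline model of single-GPU, AFR, and split-frame rendering.
// All timings below are invented purely for illustration.
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    const double cpu_interval = 10.0; // CPU issues a new frame every 10 ms (assumed)
    const double render_time  = 25.0; // one GPU needs 25 ms per full frame (assumed)
    const double sfr_overhead = 2.0;  // cost of splitting/merging a frame (assumed)
    const int    frames       = 6;

    double single_free = 0.0;              // when the lone GPU is next idle
    std::vector<double> afr_free(2, 0.0);  // two GPUs, frames alternate (AFR)
    double sfr_free = 0.0;                 // two GPUs share each frame (SFR)
    const double sfr_time = render_time / 2.0 + sfr_overhead;

    printf("frame  issued  single     AFR     SFR\n");
    for (int f = 0; f < frames; ++f) {
        const double issued = f * cpu_interval;

        const double s = std::max(issued, single_free) + render_time;
        single_free = s;

        const double a = std::max(issued, afr_free[f % 2]) + render_time;
        afr_free[f % 2] = a;

        const double p = std::max(issued, sfr_free) + sfr_time;
        sfr_free = p;

        printf("%5d  %6.1f  %6.1f  %6.1f  %6.1f\n", f, issued, s, a, p);
    }
    return 0;
}
```

Running it shows the AFR column keeping up with the CPU's cadence, but each individual frame still takes the full 25 ms to appear, and the gaps between frames alternate unevenly (a small preview of the micro stuttering covered later). The SFR column gets every frame out sooner and at a steady pace, matching the trade-off described above.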
Both of these strategies can also be carried out using several graphics cards instead of several processors on one card. Technologies such as AMD's CrossFire and Nvidia's SLI are still around, but have fallen heavily out of favor in the general consumer market.
The above diagram shows roughly how this works in practice. You can see that the time gap between frames appearing on the monitor is smaller, compared to using just one GPU. The overall frame rate is better, although it's still slower than the CPU's.
Those instructions, and information about what data is required, are then issued to the graphics processor to grind through. If this takes longer than the time required for the next frame to be set up, there will be a delay in displaying the next frame until the first one is finished.
A modest and standard dual graphics card setup
Enter the Dragon(s)
However, for this article we're only interested in multi-GPU products, i.e., graphics cards packing two or more processors, so let's dig into them.
3DLabs' hulking multi-GPU cards were incredibly capable, but also painfully expensive: the Oxygen 402 retailed at $3,695, almost $6k in today's money! There was one company, though, that offered a product sporting two graphics chips at an affordable price.
The MAXX sported two of their twin-pipeline Rage 128 Pro chips, each with 32 MB of SDRAM to work with. It used the AFR technique to push frames out, but it was significantly outperformed by Nvidia's GeForce 256 DDR, which cost $299, about $20 more than its competitor.
Another company also interested in exploring multi-GPU products was 3dfx. They had already pioneered a method of linking two graphics cards together (known as scan-line interleave, SLI) with their earlier Voodoo 2 models.
Roughly a year after the appearance of ATI's MAXX, they brought the Voodoo 5 5500 to the masses, selling at just under $300.
Incidentally, the GeForce 256 was the first graphics card to be marketed as having a GPU.
In 1999, two years on from the Oxygen 402, ATI Technologies launched the Rage Fury MAXX. This Canadian fabless company had been in the graphics business for over 10 years by this point, and their Rage series of chips was well liked.
ATI's Rage Fury MAXX. Remember when coolers were always this tiny? Image: Wikipedia
The term itself had been in circulation before this card appeared, but if we take a GPU to be a chip that handles all of the calculations in the rendering sequence (vertex transforms and lighting, rasterization, texturing and pixel blending), then Nvidia was certainly the first to make one.
3dfx's Voodoo 5 5500
The VSA-100 chips on the board were twin-pipelined like the Rage 128 Pros, but supported more features and had a wider memory bus. Sadly, the product was late to market and not entirely problem-free; worse still, it was only a little better than the GeForce 256 DDR and quite a lot slower than its successor, the GeForce 2 GTS (which was also cheaper).
While it significantly improved how data could be transferred to and from the card, thanks to a direct connection to system memory, the interface wasn't designed to have multiple devices using it.
Even XGI Technology, a spin-off from the venerable chipset company SiS, attempted to join the show with their Volari Duo V8 Ultra cards. Sadly, despite the hardware's early promise, the performance didn't live up to the product's ultra-cool name!
ATI continued to experiment with dual-GPU products, although few were ever publicly released, and 3dfx was ultimately bought by Nvidia before they had the chance to improve their VSA-100 chips. Their SLI technology was incorporated into Nvidia's graphics cards, although only in name: Nvidia's version was quite different under the hood.
Multi-GPU cards either needed an extra chip to act as the AGP device and manage the data streams to the GPUs, or, as in the case of the Voodoo 5, one of the GPUs would handle all of those tasks. This typically created problems around bus stability, and the only way around them was to run the interface at a lower rate.
The Volari Duo did nothing to stave off XGI's rapid demise, and ATI had far more success with their single-GPU products, such as the Radeon 9800 XT and 9600 Pro.
While some of the performance deficit could be blamed on the graphics processors themselves, the interface used by the card didn't help. The Accelerated Graphics Port (AGP) was a specialized version of the old PCI bus, created solely for graphics cards.
A collection of XGI Volari Duos.
You'd think that everybody would just give up. After all, who would want to try to sell an expensive, underwhelming graphics card?
Nvidia, that's who.
After briefly dabbling with dual-GPU cards in 2004 and 2005, the California-based graphics giant released more serious efforts in 2006: the GeForce 7900 GX2 and 7950 GX2.
The Race for Excess
Their approach was unconventional, to say the least. Rather than taking two GPUs and fitting them to a single circuit board, Nvidia essentially took two GeForce 7900 GT cards and bolted them together.
Two GPUs, two PCBs: the GeForce 7900 GX2. Image: Wikipedia
In 2015, Microsoft launched Direct3D 12, a graphics API used to streamline the programming of game engines. Among the new features offered was improved support for multiple GPUs, and while it could potentially remove a lot of the problems, it needs to be fully implemented by the developers.
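As a rough illustration of what "explicit" multi-GPU looks like under Direct3D 12, the minimal sketch below queries a device for its linked GPU nodes and creates one command queue per node. It's only the setup step, not a full renderer, and error handling has been stripped to the bare minimum.

```cpp
// Minimal sketch: enumerate linked GPU nodes in Direct3D 12 and create a
// command queue per node. Windows only; link against d3d12.lib and dxgi.lib.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <cstdio>
#include <vector>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    factory->EnumAdapters1(0, &adapter); // first adapter; a real app would choose carefully

    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device)))) {
        printf("No D3D12-capable adapter found.\n");
        return 1;
    }

    // Linked-node adapters (SLI/CrossFire-style pairs) report more than one node.
    UINT nodes = device->GetNodeCount();
    printf("GPU nodes on this adapter: %u\n", nodes);

    std::vector<ComPtr<ID3D12CommandQueue>> queues(nodes);
    for (UINT i = 0; i < nodes; ++i) {
        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type     = D3D12_COMMAND_LIST_TYPE_DIRECT;
        desc.NodeMask = 1u << i; // bind this queue to GPU node i
        device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queues[i]));
    }
    return 0;
}
```

From there it becomes the application's job, not the driver's, to split work across the nodes and copy resources between them, which is precisely the burden most studios declined to take on.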
Priced at a totally unreasonable $1,399, it had a thermal design power (TDP) rating of 580 W. To put that into some kind of perspective, our 2015 test system for graphics cards drew less than 350 W for the whole setup, including the graphics card.
Power requirements had been a problem for AMD's dual-GPU cards for a number of years, but they reached insane levels in 2015 with the Radeon R9 390 X2.
The GeForce GTX Titan Z, despite its enormous single-card performance, was an exercise in hubris and greed. Coming in at just shy of three thousand dollars, nothing about it made any sense whatsoever.
The 2014 Titan Z boasted peak theoretical figures of 5 TFLOPS of FP32 compute and 336 GB/s of memory bandwidth, to name just two. Just four years later, Nvidia released the GeForce RTX 2080 Ti, which boasted values of 13.45 TFLOPS and 616 GB/s for the same metrics, on a single chip and for less than half the cost of the Titan Z.
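A quick back-of-the-envelope calculation shows just how lopsided that comparison is. The Titan Z figures below come from the paragraph above; the RTX 2080 Ti launch price of $1,199 is an assumption based on its Founders Edition pricing.

```cpp
// Back-of-the-envelope FP32-throughput-per-dollar comparison.
// The $1,199 RTX 2080 Ti price is an assumption (Founders Edition).
#include <cstdio>

int main() {
    const double titanz_tflops = 5.0,   titanz_price = 2999.0;
    const double ti2080_tflops = 13.45, ti2080_price = 1199.0;

    // Convert to GFLOPS per dollar for easier-to-read numbers.
    const double titanz_value = titanz_tflops * 1000.0 / titanz_price;
    const double ti2080_value = ti2080_tflops * 1000.0 / ti2080_price;

    printf("Titan Z    : %.2f GFLOPS per dollar\n", titanz_value);
    printf("RTX 2080 Ti: %.2f GFLOPS per dollar\n", ti2080_value);
    printf("Improvement: %.1fx in four years\n", ti2080_value / titanz_value);
    return 0;
}
```

That works out to roughly a 6.7x jump in compute per dollar, which is the gap the next paragraph is complaining about.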
Nvidia GeForce GTX 295.
Many rendering passes and on-the-fly resources are needed for a modern game.
Simply put, Nvidia was charging you more than double the price for the Titan Z! Its only saving grace was that the maximum power draw was less than 400 W. Of course, that's no perk at all when the rest of the product is so outrageous, but it was far better than the competition.
At $999, the GeForce GTX 690 was not only twice the cost of their next-best offering, the GeForce GTX 680, but it performed exactly the same as two 680 cards linked together (both setups using a modified SLI approach), and worse than two AMD Radeon HD 7970s (which used AFR).
Despite launching at $599, the likes of the 7950 GX2 could often be found cheaper than that, and more importantly, it was more economical than buying two separate 7900 GTs. It also proved to be the fastest consumer graphics card on the market at the time.
Goodbye and thanks for all the frames.
Soul Calibur, Thief, Unreal Tournament: three classic late-'90s games.
For the next six years, ATI and Nvidia fought for the GPU performance crown, releasing various dual-processor models of differing price and performance. Some were very expensive, such as the Radeon HD 5970 at $699, but they always had the speed chops to back up the price.
Sporting the same GPUs as the GeForce GTX 780 Ti, but clocked a little slower to keep power consumption down, it performed no better than two of those cards running in SLI, and their launch price was $699 each.
Excess defined: the $2,999 Nvidia Titan Z
This is where the frame rate drops right down for a very brief period, before recovering and then repeating the pattern throughout the scene. It's so quick that it's often hard to capture, even through careful benchmarking, but it's clearly noticeable during gameplay.
And it's not just the games that need extra work to make them use multiple GPUs effectively. Roughly eight years ago, AMD and Nvidia began to introduce CrossFire/SLI profiles into their drivers. These were nothing more than hardware configurations activated upon detecting a specific title, but over the years their role expanded; for example, certain shaders might be replaced before being compiled by the driver, in order to minimize issues.
Fewer Is More
The user of a multi-GPU card (or multiple cards, for that matter) will probably experience a higher overall frame rate compared to using just one GPU, but the increased variation between frames manifests in the form of micro stuttering.
Exit the Dragon(s)
Its only real benefit was power consumption, being around 300 W at most; by comparison, just one HD 7970 used up to 250 W. The escalating price tags and power requirements hadn't quite reached their zenith, though.
One of the last multi-GPU graphics cards: AMD's Radeon Pro Duo
Some were quite fairly priced: Nvidia's GeForce GTX 295, for instance, launched at $500. Not hugely affordable in 2009, but given that the then year-old GTX 260 had an MSRP of $450, and the fact that the GTX 295 outperformed two of them, it was a relative bargain.
Micro stuttering is a problem inherent to multi-GPU systems, and while there are numerous tricks that can be used to reduce its impact, it's not possible to remove it entirely.
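Because micro stuttering hides behind healthy average frame rates, it only shows up when you look at frame-to-frame deltas rather than the mean. Here's a small sketch that does exactly that on a made-up capture of frame times; the numbers are invented for illustration, with the alternating short/long cadence that AFR setups tend to produce.

```cpp
// Detecting micro stutter from frame times: a high average FPS can hide
// large frame-to-frame swings. The sample data below is invented, showing
// the short/long alternation typical of AFR.
#include <cstdio>
#include <cmath>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> frame_ms = {10.1, 23.8, 9.7, 24.2, 10.3, 23.5, 9.9, 24.0};

    const double mean =
        std::accumulate(frame_ms.begin(), frame_ms.end(), 0.0) / frame_ms.size();
    printf("Average frame time: %.1f ms (%.0f FPS) - looks fine on paper\n",
           mean, 1000.0 / mean);

    // Mean absolute frame-to-frame delta: the quantity your eyes actually notice.
    double delta_sum = 0.0;
    for (size_t i = 1; i < frame_ms.size(); ++i)
        delta_sum += std::fabs(frame_ms[i] - frame_ms[i - 1]);
    const double mean_delta = delta_sum / (frame_ms.size() - 1);

    printf("Mean frame-to-frame swing: %.1f ms (%.0f%% of the average)\n",
           mean_delta, 100.0 * mean_delta / mean);
    return 0;
}
```

A single GPU delivering the same ~59 FPS would show a swing close to zero, which is why the multi-GPU experience can feel worse despite identical benchmark numbers.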
For a time, they served a niche market very well, but the extreme power demands and jaw-dropping prices are something nobody wants to see again.
By this stage, Nvidia had bailed out of the multi-GPU race entirely, and even AMD only tried a few more models, most of which were aimed at the professional market.
But surely there must still be a market for them? Nvidia's $1,000+ GeForce RTX 2080 Ti sold very well for such an expensive card, so it can't simply be a question of price. It's a similar story with power: the Radeon Pro Duo from 2016 had a TDP of 350 W, 40% less than the R9 390 X2.
The real killer of multi-GPU cards isn't the software requirements, nor the micro stuttering: it's the rapid development of single-chip models that's stolen their thunder.
Technology races typically lead to excess, though, and while enthusiast-level graphics cards have always been a drain on the wallet, Nvidia took it to a whole new level in 2012.
It's Not You, It's Me
Nvidia's GeForce GTX 690 stripped bare. Image: Tom's Hardware
By this time, AGP had been replaced by PCI Express, and despite this interface still being point-to-point, the use of a PCIe switch allowed multiple devices to be readily supported on the same slot. The result was that multi-GPU cards could now run with far more stability than they ever could on the older AGP.
Up to that point, dual-GPU cards usually comprised GPUs from the lower section of the top-end spectrum, to keep power and heat levels in check. They were also priced in such a way that, while clearly setting them apart from the rest of the product range, the cost-to-performance ratio could still be justified.
Given how complex a modern title is, the additional task of altering the engine to better utilize a dual-GPU card is unlikely to be taken up by many teams; the user base for such products is going to be very small.
Yes, it really does have four 8-pin PCIe connectors … the 580 W AMD Radeon R9 390 X2.
Consoles such as the Sega Dreamcast had titles with some seriously impressive graphics, but PC games were generally more muted. All of them used similar rendering methods to produce the visuals: simply lit polygons, with one or two textures applied to them. Graphics processors of the time were limited in their abilities, so the graphics had to be too.
And yet, the only one you can now buy is found in Apple's Mac Pro: an AMD Radeon Pro Vega II Duo, for an eye-watering $2,400. The reason behind the apparent death of the multi-GPU graphics card lies not in the products themselves (although that does play a part), but in how they're used.
When the Rage Fury MAXX appeared in 1999, true 3D games (using polygons and textures) were still fairly new. id Software's Quake was only three years old, for example.
It's the same story with AMD's products. The Radeon RX 5700 XT launched in 2019, four years after the appearance of the R9 390 X2, and for just $400, you got 91% more FP32 throughput and 30% more bandwidth.
The result of all these graphical improvements was that the workload for a GPU became increasingly variable.
But as the hardware began to advance, developers started to use more intricate techniques. A single 3D frame might require numerous rendering passes to produce the final image, or the contents of one pass might be reused in later frames.
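That last point is what hurts AFR the most: if frame N reads a buffer that frame N-1 produced, and the two frames live on different GPUs, the data has to be copied across before rendering can continue. A toy model of that stall, with invented pass counts, timings, and copy cost, might look like this:

```cpp
// Toy model: why cross-frame dependencies hurt alternate frame rendering.
// Pass count, timings, and the copy cost are invented for illustration.
#include <algorithm>
#include <cstdio>

int main() {
    const double pass_ms = 4.0; // cost of one rendering pass (assumed)
    const int    passes  = 4;   // e.g. shadows, geometry, lighting, post
    const double copy_ms = 3.0; // cross-GPU copy of last frame's buffer (assumed)
    const int    frames  = 4;

    double gpu_free[2] = {0.0, 0.0};
    double prev_done   = 0.0;

    for (int f = 0; f < frames; ++f) {
        const int gpu = f % 2; // AFR: even frames on GPU 0, odd frames on GPU 1

        // A temporal effect (motion blur, TAA, etc.) reads last frame's
        // output, which lives on the other GPU, so this frame can't start
        // until that buffer is finished *and* has been copied across.
        const double start =
            (f == 0) ? 0.0 : std::max(gpu_free[gpu], prev_done + copy_ms);
        const double done = start + passes * pass_ms;

        printf("frame %d on GPU %d: start %5.1f ms, done %5.1f ms\n",
               f, gpu, start, done);
        gpu_free[gpu] = done;
        prev_done     = done;
    }
    return 0;
}
```

In this model the second GPU ends up adding nothing: the dependency serializes the frames, and the copy makes each one arrive later than a single GPU working alone would manage. This is exactly the failure mode that drivers tried to paper over with per-game profiles.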
Using multiple GPUs better in Direct3D 12. Image: Nvidia
So are dual or quad GPU cards gone for good? Most likely, yes. Even in the professional workstation and compute markets, there's little need for them, as it's far simpler to replace a single faulty GPU in a cluster than to lose several chips at the same time.
Nvidia focused heavily on improving their single-GPU products after the GTX 690, but in mid-2014, they gave us what would be their last multi-GPU offering, for now at least.
Reading the release notes for any set of modern drivers will clearly show the addition of new multi-GPU profiles, and the remaining bugs associated with such systems.