I did a double take when I saw the block diagrams. Sixteen unified memory controllers ringing the die like a stadium crowd? If those are 32-bit each, you’re staring at a 512-bit bus on a mainstream gaming Radeon again. That’s the kind of spec that makes GPU nerds like me grin and immediately reach for a napkin to scribble bandwidth math. Sure, these are leaks and not gospel, but the shape of this rumor hits a nerve: AMD might be gearing up to bulldoze its bandwidth bottlenecks, feed a lot more compute, and actually push high-end ray tracing without wheezing. About time.
The source is AnandTech forum regular Kepler_L2, who has a solid track record: they pegged core counts and memory on the RDNA 4 top chip (what became the RX 9070 XT) months before it was public. This time, the diagrams show four chips under an “AT” codename umbrella (Alpha Trion, allegedly): a 96‑CU beast at the top (likely AT0), and three mortals beneath it, AT2 (40 CUs), AT3 (24 CUs), and AT4 (12 CUs). There’s even chatter that the flagship could eclipse the 9070 XT by 50%+, positioning it to square up against Nvidia’s rumored RTX 6080. All unconfirmed, obviously, but the architecture choices they’re hinting at are the most interesting part anyway.
My first impression wasn’t “ooh, big number go brrr.” It was more, “wait, is AMD actually prioritizing feeding the cores this time?” Because a fat memory bus changes the kind of GPU you can build. It’s not the only path to performance, but it’s one of the most bluntly effective ones—especially if you want to keep ray tracing and AI workloads from choking at high resolutions.
I’ve been daily-driving a 7900 XTX in one rig and a 4070 Ti Super in a small-form-factor build. The biggest difference I feel in modern games isn’t raw raster, it’s how quickly each card runs out of breath when the VRAM and bandwidth demands spike—think path tracing in Cyberpunk 2077, heavy texture packs, or modded Starfield. You can muscle through a lot with smarts (caches, compression, frame gen), but there are moments where raw throughput wins. A 512-bit bus is a sledgehammer.
Do some back-of-the-napkin math: even with last-gen GDDR6 at 24Gbps, 512-bit nets ~1.5TB/s of bandwidth (512 x 24 / 8 = 1536 GB/s). If GDDR7 at 28-32Gbps shows up on this card, you’re eyeballing ~1.8–2.0TB/s. That’s daft for a gaming GPU—daft in a very good way—especially if AMD keeps some form of Infinity Cache onboard. The cache reduces trips to VRAM; the fat bus slams the door on worst-case misses. It’s the “belt and suspenders” approach to memory starvation.
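If you want to redo the napkin math yourself, the whole thing fits in a few lines of Python. To be clear, the 512-bit width and the per-pin speeds below are the leak's numbers plus my own assumptions, not confirmed specs.

```python
# Napkin math: peak bandwidth (GB/s) = bus width in bits * data rate in Gbps / 8.
# The 512-bit width and the per-pin speeds are assumptions pulled from the leak, not confirmed specs.

def peak_bandwidth_gbs(bus_bits: int, gbps_per_pin: float) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_bits * gbps_per_pin / 8

for label, gbps in [("GDDR6 20 Gbps", 20), ("GDDR6 24 Gbps", 24),
                    ("GDDR7 28 Gbps", 28), ("GDDR7 32 Gbps", 32)]:
    print(f"512-bit bus, {label}: {peak_bandwidth_gbs(512, gbps):.0f} GB/s")

# 512-bit + 24 Gbps GDDR6 -> 1536 GB/s (~1.5 TB/s); 32 Gbps GDDR7 -> 2048 GB/s (~2.0 TB/s)
```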
It took me a minute to understand why I’m so excited about the memory controllers rather than the 96-CU headline. CUs aren’t a universal yardstick. AMD’s been running dual-issue compute units since RDNA 3, and on paper you can double a lot of internal numbers without seeing double the frame rate. What clicks here is pairing more (and hopefully smarter) compute with a plan to keep them busy at 4K with lots of ray queries and big textures. If AMD wants to go toe-to-toe with Nvidia’s “80-class,” this is where you start.
Right now, AMD’s top RDNA 4 gaming card—the RX 9070 XT—trades punches with Nvidia’s 70-class and is great value in the right build. But there’s a void above it. Nvidia’s been dictating the halo narrative since the 4090, and even the 80-class has felt untouchable price-wise. Rumors that AMD’s RDNA 5 is targeting an “80-class” fight rather than a direct 90-class slugfest sound… sane. Win back mindshare where it’s still brutally competitive, not where every extra 5% performance costs a kidney in die area and power.
We’ve also seen this dance before: a generation lands big raster gains, RT still lags, AI upscaling moves the goalposts. If AMD’s flagship really is 50%+ over 9070 XT at stock—and there’s a rumored ~10% IPC/efficiency bump between like-for-like RDNA 4 and RDNA 5—that’s a recipe for comfortable 4K raster and “actually playable” RT settings without falling off a cliff. Whether it’s enough to match Nvidia’s latest RT hardware is another story, but the bandwidth and VRAM story would finally be unambiguously in AMD’s favor if these numbers are right.
Let’s decode CUs in practical terms. Each compute unit houses SIMD arrays for shaders, dedicated ray traversal hardware, and those AI/Matrix accelerators AMD’s been tucking in since RDNA 3. Ninety-six of them is 50% more than the 64-CU 9070 XT. But a simple 96 vs 64 comparison is misleading unless we know three things: how the CUs balance against the render backends, how the RT hardware actually scales, and what the AI blocks deliver in practice.
The diagram places six CUs behind each Render Backend (RB), which loosely maps to ROP throughput—the bits that finalize pixels. That ratio matters for 4K fill rate and post-processing. If AMD balanced ROPs better this gen, you’re less likely to find yourself memory and ROP-bound at high resolutions with heavy AA and post effects. I’ve hit those walls on both my 7900 XTX and 4070 Ti Super, just in different games for different reasons.
AMD’s RT hardware has improved generation-over-generation, but Nvidia still enjoys a comfortable lead in many titles—especially those with heavy path tracing or denoising pipelines tuned for CUDA. More CUs means more RT blocks, but RT scaling isn’t linear with CU count; traversal efficiency, cache behavior, and BVH builder performance all matter. The bright side of a 512-bit bus is that RT’s worst-case behavior—lots of random memory fetches—hurts less. If RDNA 5 also brings a smarter RT pipeline and better denoisers in drivers, this could be the first Radeon in a while where turning on RT doesn’t feel like dropping a gear on the highway.
On the AI side, AMD’s “AI accelerators” per CU are real, but the game ecosystem largely optimizes for DLSS today. FSR has come a long way, and the newer iterations are much improved, but DLSS still wins in consistency and image stability in many titles. If RDNA 5 adds stronger matrix throughput and more robust software support, great; that can help FSR’s quality ceiling and future AI-assisted features. But this is the one area where I’ll keep my skepticism hat on until I see frame-time charts and image comparisons. AI features are as much about software and model quality as they are about silicon.
My biggest personal bias: I like headroom. I run texture mods. I poke at Blender. I tinker with Stable Diffusion and local LLMs when the gaming backlog guilt gets too loud. On my 7900 XTX (24GB), I can load stupidly high-res assets without worrying; on my 4070 Ti Super (16GB), I sometimes have to dial back mods or watch frametime spikes as the driver shuffles memory. A 32GB (or even 48GB) Radeon is catnip for that kind of mixed workload.
For pure gaming at 1440p, 16GB is still fine. But path tracing, 4K with high-quality RT, and emerging texture standards are merciless. If AMD puts 32GB on its flagship and keeps bandwidth high, it changes the vibe of “max everything and go” builds. It also opens the door for prosumer crossover—content creators and AI hobbyists who balk at workstation pricing but need more than 16–20GB. I’ve hit 20–22GB in some Stable Diffusion graphs; 24GB works, but 32GB would be a no-brainer comfort zone.
AT2 allegedly has 40 CUs and just six memory controllers (so ~192-bit), while AT3 has fewer CUs (24) but more controllers (eight, so ~256-bit). That’s not the typical “more compute = wider bus” we see in clean product stacks. My read: AMD is experimenting with bandwidth-to-compute ratios per price tier to avoid the failure mode we’ve seen too often—midrange parts getting crushed at higher resolutions because their VRAM and bandwidth are an afterthought.
If AT3 really gets a 256-bit bus at the 24‑CU tier, it could be a sleeper pick for 1440p ultra with better minimums than you’d expect, especially in newer RT-forward titles. And if the rumor about LPDDR5X ever materialized (say, for mobile or ultra-budget desktop variants), a wide bus could offset LPDDR’s latency/bandwidth characteristics enough to make it viable. I’m not betting my build on that, but I can squint and see the product planning logic.
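To make that odd ratio concrete, here's a minimal sketch of bandwidth per CU across the rumored tiers, assuming 32-bit controllers and a flat 24 Gbps GDDR6 across the stack (both assumptions; none of this is confirmed):

```python
# Bandwidth-per-CU for the rumored stack, assuming 32-bit controllers and 24 Gbps GDDR6 everywhere.
# CU and controller counts come from the leaked diagrams; everything else is my assumption.
rumored_stack = {
    "AT0": {"cus": 96, "controllers": 16},
    "AT2": {"cus": 40, "controllers": 6},
    "AT3": {"cus": 24, "controllers": 8},
}
GBPS = 24  # assumed per-pin data rate

for name, chip in rumored_stack.items():
    bus_bits = chip["controllers"] * 32
    bw_gbs = bus_bits * GBPS / 8
    print(f"{name}: {bus_bits}-bit, {bw_gbs:.0f} GB/s total, {bw_gbs / chip['cus']:.1f} GB/s per CU")

# AT0: 512-bit, 1536 GB/s, 16.0 per CU
# AT2: 192-bit,  576 GB/s, 14.4 per CU
# AT3: 256-bit,  768 GB/s, 32.0 per CU  <- the 24-CU part gets the richest ratio
```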
I’ve learned to care less about the packaging religion and more about what the product lets me do. Chiplets can help with yield and cost; monolithic can help with latency and routing. The diagram calling out “unified memory controllers” doesn’t answer the packaging question either way. But the presence of 16 controllers implies a beefy board with lots of memory packages around the die—think 16 modules for 32GB (if 2GB chips) or 16 higher-density packages for 48GB. That’s a big-boy PCB and cooler no matter how you slice it.
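The capacity side of that claim is just multiplication, but here it is spelled out. The 2 GB and 3 GB densities are standard GDDR6/GDDR7 package sizes; which one AMD would actually use is pure guesswork on my part.

```python
# 16 memory packages (one per rumored controller) at common per-package densities.
# 2 GB packages are standard GDDR6/GDDR7; 3 GB packages exist for newer GDDR7. Neither is confirmed here.
PACKAGES = 16
for density_gb in (2, 3):
    print(f"{PACKAGES} packages x {density_gb} GB = {PACKAGES * density_gb} GB VRAM")
# 16 x 2 GB = 32 GB, 16 x 3 GB = 48 GB
```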
Practical concern: 16 chips means more power and more heat spread across the board. Add in the likely 2.0–2.5GHz+ core clocks and you’re in triple-slot, heavy heatsink land, maybe with vapor chamber and a backplate that actually needs thermal pads. I don’t mind that; I just hope AMD nails fan acoustics and coil whine. Comparing my two rigs, the 7900 XTX I use is quieter under load, but the coil whine roulette is real on both sides of the aisle.
I will die on this hill: I want AMD’s next high-end card to adopt the 12V‑2×6 connector cleanly, not a mix of daisy-chained 8-pins with adapters. The cable drama era needs to end, and tidy cable management in modern cases is way easier with a single, well-engineered power plug. If this rumored flagship is genuinely “80-class competitive,” it’ll pull enough power that proper connectors and quality cables matter for both safety and aesthetics. Less spaghetti, more sanity.
I’ve had solid experiences with AMD drivers the last couple of years, especially for raster performance and features like HYPR‑RX and Anti‑Lag. The AV1 encoder is finally where it needs to be for streaming, and ReLive is fine for most creators. But Nvidia still has the better “it just works” story in a few pro workflows and in RT-heavy games at launch. If AMD really brings a bandwidth monster to market, they need to back it with killer day-one profiles, consistent frame pacing, and FSR quality that closes the remaining gap to DLSS in the tough scenes (vegetation, high-frequency detail, thin geometry).
The other piece is game dev relations. A 512-bit bus and 32GB VRAM can let engines pursue higher-res textures and more ambitious RT techniques. But engines don’t auto-magically scale. If AMD seeds hardware and tooling early and loudly, we might see a wave of titles where Radeon owners get more than just “it runs.” I want to see AMD showing up in patch notes with explicit optimizations called out—less marketing, more measurable wins.
If you’re sitting on an RX 6800 XT, RTX 3080, or newer and you’re happy at 1440p, I wouldn’t slam the brakes on your upgrade plans solely because of this rumor. Today’s 70/70‑Ti class cards are excellent at that resolution and often cheaper than launch. If you’re targeting 4K with RT enabled and you’re allergic to compromises, this is a “maybe wait” moment. A 32GB, bandwidth-rich Radeon could finally make 4K RT settings feel less like flipping a self-destruct switch, and it might put genuine price pressure on Nvidia’s next-gen stack.
For creators and AI hobbyists: if your work actually uses >16GB VRAM (you know who you are), the possibility of a 32GB flagship at a consumer price point is huge. Even if you don’t buy AMD, the existence of that card can move the whole market. I’ve done enough Blender cycles and SDXL workflows to know that extra VRAM doesn’t just make things “faster,” it makes them “possible” without resorting to tiling or CPU fallbacks.
I won’t pretend to know AMD’s BOM or margins. But given the parts list implied by 16 memory controllers and 32–48GB VRAM, the flagship won’t be cheap. My gut says AMD aims at the “80-class” price tier rather than 90-class—think a premium price, but not moonshot. The midrange AT2 and AT3 feel like the make-or-break value plays. If AT3 ships with a 256-bit bus at mainstream pricing, a lot of gamers who are sick of VRAM roulette will flock there. AT4 is the volume card that needs to be efficient, quiet, and “just run eSports without drama.”
This rumor thread gets tangled because AMD’s recent CUs are dual-issue. Some people see “48” when the diagram shows “24,” because they’re thinking about two instructions per clock per CU—functionally “doubling” parts of the CU. The clean way to think about it: count CUs as blocks, not issue lanes. Doubling issue width is great for throughput if the workload feeds it, but it’s not the same as doubling CUs with their own fixed-function units and caches. If AT3 truly has 24 CUs behind a 256-bit bus, it could still punch above its weight in bandwidth-bound games.
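Here's a tiny sketch of why the distinction matters for paper TFLOPS. The 64 FP32 lanes per CU and the 2.5 GHz clock are borrowed from RDNA 3 as placeholder assumptions; RDNA 5's internals are unknown.

```python
# Dual-issue doubles peak FP32 math per CU on paper; it does not double the CUs,
# their RT blocks, or their caches. Lane count and clock are RDNA 3-style placeholders.

def peak_fp32_tflops(cus: int, clock_ghz: float, dual_issue: bool) -> float:
    lanes_per_cu = 64                             # FP32 lanes per CU (RDNA 3-style assumption)
    ops_per_clock = 2 * (2 if dual_issue else 1)  # FMA = 2 ops, dual-issue doubles it
    return cus * lanes_per_cu * ops_per_clock * clock_ghz / 1000

print(peak_fp32_tflops(24, 2.5, dual_issue=False))  # ~7.7 TFLOPS
print(peak_fp32_tflops(24, 2.5, dual_issue=True))   # ~15.4 TFLOPS, same 24 physical CUs

# Real games rarely feed the second issue slot perfectly, which is why "24 dual-issue CUs"
# and "48 CUs" are not interchangeable.
```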
If the AT0 flagship really ships with 32GB VRAM and a 512-bit bus, that’s my next 4K build anchor—no question. I’d pair it with a high-airflow case (think front mesh, at least two 140mm intakes), a quality 850–1000W PSU with native 12V‑2×6, and a CPU that won’t bottleneck in simulation-heavy games (a modern 8C/16T+ chip). If I’m targeting the best price-to-sanity ratio at 1440p, AT3 looks compelling on paper: fewer CUs but a wider bus for stable minimums, ideally packing 12–16GB VRAM. I’d throw that into a compact case with a 650–750W PSU and call it a day.
So what hooks me about this particular leak? It’s the coherence. A lot of leaks scream “more cores!” and stop. This one whispers “more cores, yes, but also the memory system to feed them, and the backend to finish the pixels.” The diagram’s block-level balance (CUs per RB, central command/scheduler, the L2, the ring of memory controllers) reads like a team optimizing for practical next-gen gaming workloads, not just chasing synthetic benchmarks. That’s why I’m optimistically skeptical rather than eyeroll skeptical.
Do I think this flagship needs GDDR7 to shine? Not necessarily. GDDR6 at 24Gbps on a 512-bit bus is already bonkers. GDDR7 would push it into “why are we even arguing” territory, but it’s not a requirement for a great card. What matters more is how the cache hierarchy, compression, and memory timings play together. If AMD nails the balance, you won’t be able to tell which memory type it uses from the seat of your chair—you’ll just notice the frame-time graphs are flat and the settings sliders stay to the right.
If AMD shows up with a credible “80-class” challenger and aggressive midrange cards that don’t skimp on VRAM, the used market is going to get spicy. Expect a wave of 24GB 7900-series cards at better prices, which is great news for creators and modders. Nvidia owners hovering on the fence might wait to see how the RT story shakes out, but even then, competition tends to nudge prices and bundles. I still remember when 8GB was “enough” for 1440p and how quickly that changed once games had room to stretch.
Is the 512-bit bus confirmed?
Not at all. The diagrams show 16 memory controllers; if they’re 32-bit each, that’s 512-bit. Controller width isn’t confirmed.
Will it use GDDR7?
Unknown. Even fast GDDR6 would be monstrous on a 512-bit bus. GDDR7 would be icing.
Is 32–48GB VRAM realistic?
Yes, with 16 memory packages it’s plausible depending on chip densities. Whether that’s the default SKU or a premium variant is the real question.
How do 96 CUs compare to Nvidia’s cores?
Different architectures, different math. The 50%+ claim vs RX 9070 XT is the meaningful takeaway—not a 1:1 core count duel.
Should I wait?
If you want 4K with RT and lots of headroom, and you’re not in a rush, waiting makes sense. For 1440p today at a good price, current 70/70‑Ti class cards remain great buys.
I’m not crowning a paper champion. But if these block diagrams are even directionally accurate, AMD is finally building the kind of GPU I’ve wanted from them since the first RDNA: compute that’s properly fed, VRAM that’s future-proof, and a product stack that looks tuned for the way we actually play and create in 2025. The specs are spicy; the execution is what matters. Your move, Radeon.
Cautiously optimistic. A 96‑CU, bandwidth-heavy RDNA 5 flagship with 32–48GB VRAM could put AMD back in the “80-class” fight and force Nvidia to sharpen its pencils. The midrange shapes up to be the real value story if the wider buses stick. All eyes on power, RT/software polish, and pricing.