Wednesday, 25 April 2012




The Impact of Process Technology on Kepler’s Efficiency

Posted: 25 Apr 2012 11:11 AM PDT


Kepler's impressive efficiency has been considered a major achievement by press and PC gamers, but a key part of the story has never been told.

As the leader of the engineering team that worked with TSMC for three years to manufacture Kepler, I'd like to shed some light on the impact of 28nm process technology on Kepler's efficiency.

The 28nm GeForce GTX 680 die measures 294 mm².

Kepler was an ambitious project because it introduced a new architecture at the same time as a new silicon process technology node. This is a bit like designing a new jet engine using exotic materials that are still in development. Much like the engineers at Pratt & Whitney, we focused intensely on power efficiency: delivering the best performance per energy unit (watts in our case, gallons of jet fuel in theirs).

The advancement that TSMC offered was a new, optimized process technology. Kepler is manufactured using TSMC’s 28nm high performance (HP) process, the foundry's most advanced 28nm process, which uses its first-generation high-K metal gate (HKMG) technology and second-generation SiGe (silicon germanium) straining. HKMG uses a gate insulator film with a high dielectric constant, which reduces power by cutting gate leakage compared to the previous-generation SiON gate. SiGe straining is a chemical process that strains the silicon crystal lattice to improve carrier mobility, and with it the effective frequency of the transistor. Both advances improve the transistor's performance per watt, which translates to a more power-efficient system.

Using TSMC's 28nm HP process enabled us to reduce active power by about 15 percent and leakage by about 50 percent compared to 40nm, resulting in an overall improvement in power efficiency of about 35 percent (see chart). Let me explain why this is so critical.

Today, the primary constraint on processor performance is the power consumption budget. So our goal is always to develop solutions that deliver the highest performance within a fixed power budget. Having a more efficient process enabled us to add more processing cores, thus increasing performance. Put simply, greater efficiency equals greater performance and optimal performance per watt.
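
The relationship between the reductions quoted above and the resulting headroom can be illustrated with a little arithmetic. The sketch below is a simplified model, not NVIDIA's methodology: it assumes total chip power is just active power plus leakage, that the roughly 15 and 50 percent reductions apply to each part independently, and that power scales linearly with core count at fixed clocks. The split between active and leakage power (`active_share`) and the baseline core count are hypothetical inputs; neither is given in this post.

```python
import math


def relative_power(active_share, active_cut=0.15, leakage_cut=0.50):
    """Power at 28nm relative to 40nm for the same work.

    active_share is the (hypothetical) fraction of 40nm power that is
    active/switching power; the remainder is treated as leakage.
    """
    if not 0.0 <= active_share <= 1.0:
        raise ValueError("active_share must be between 0 and 1")
    leakage_share = 1.0 - active_share
    return (active_share * (1.0 - active_cut)
            + leakage_share * (1.0 - leakage_cut))


def perf_per_watt_gain(active_share):
    """Fractional gain in performance per watt at equal performance."""
    return 1.0 / relative_power(active_share) - 1.0


def cores_in_budget(baseline_cores, active_share):
    """Cores that fit in the original power budget.

    Assumes power scales linearly with core count at fixed clocks,
    which is a simplification.
    """
    return math.floor(baseline_cores / relative_power(active_share))


# Sweep a few hypothetical active/leakage splits.
for share in (0.5, 0.7, 0.9):
    print(f"active share {share:.0%}: "
          f"power x{relative_power(share):.3f}, "
          f"perf/W +{perf_per_watt_gain(share):.1%}, "
          f"cores per 100 baseline: {cores_in_budget(100, share)}")
```

In this simplified model, an active share of about 70 percent gives a performance-per-watt gain of roughly 34 percent, close to the figure quoted above. That is a property of the model and its assumed inputs, not a statement about Kepler's actual power breakdown.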

Maximizing the efficiency of 28nm while developing a new architecture required us to change our silicon process development model with TSMC. In previous process nodes we had worked independently, with TSMC preparing the process and NVIDIA working on the design. TSMC engineers would do their best to build a volume process platform, and NVIDIA would implement our designs following the guidelines of process design rules and electrical performance.

For Kepler, we began working with TSMC three years before our product tape-out (when the processor design is complete and ready for manufacturing). Together we created a Production Qualification Vehicle (PQV) to allow the TSMC process engineers and our internal design engineers to optimize the process before the product tape-out. Through repeated prototyping, we were able to optimize both the process and design, creating a more efficient Kepler design rather than simply a chip in a standard 28nm process.

TSMC's 28nm HP process, seen here under an electron microscope, is 30 percent smaller than 40nm and about 35 percent more energy efficient.

We're extremely proud of what we accomplished with Kepler. It combines NVIDIA's world-class GPU engineering with TSMC's very best 28nm process. But while Kepler was a key milestone, it is one point in a continuum. We continue to improve on what we developed and continue our collaboration with TSMC. In fact, we recently received our first version of an enhanced PQV for 20nm from TSMC. That process will yield even greater efficiency for NVIDIA's next next-generation GPUs.

Contest: What would you do with a petaflop supercomputer?

Posted: 25 Apr 2012 11:11 AM PDT


What if every university had a computing cluster capable of delivering a petaflop of performance (equivalent to about 20,000 typical laptops)? Better yet, what if every university department – biology, chemistry, physics, and others – had one?

I'm certain that this would fundamentally increase the pace of scientific discovery around the world.

Computing is now the third pillar of science, along with theory and experimentation. Giving every researcher access to a dedicated high performance computing facility could lead to breakthroughs in renewable energy, climate prediction, the development of lighter and tougher materials, or even pave the way to find cures for some of society's most devastating and persistent diseases.

By the end of 2012, NVIDIA will launch a Tesla GPU based on our new Kepler architecture that will enable any university in the world to build a petaflop supercomputer that will fit into nearly any university's data center and budget.

To support this launch, we are inviting proposals from researchers around the world on what you would do if you had access to a dedicated petaflop supercomputer. We will award exclusive early access to this new Kepler-based Tesla GPU to the three best proposals we receive.

What types of studies would you conduct?  Which scientific problems would you tackle?

To enter this contest, post a detailed answer to the above questions in the comments section. Be sure to include a link to your research, if available. Or if you prefer, email your submission to  You should also include your contact information (at least email address and affiliation) so that we can reach you if you win.

NVIDIA will use the proposals to highlight the potential of petaflop supercomputing, so please do not submit anything you consider confidential.

We will announce the top three entries at NVIDIA's upcoming GPU Technology Conference (GTC 2012) in May, and notify the winners directly. Also, look for a blog post announcing the winning entries after GTC.

You can find the complete list of contest rules here.

Good luck!
