This week IBM, Samsung, New York State, and GlobalFoundries announced a new high capacity chip made with a combination of silicon and germanium.
Are IBM et al. leading us in the right direction? As the width of connections on chips approaches the diameter of individual silicon atoms, EUV etch stations and changes in deposition technology are just the tip of the CAPEX impact required to transition and follow the consortium’s lead. At approximately $2B per fab in the near future, who can afford to follow? What ripples will this cause in the ecosystem of silicon equipment manufacturing, and at the commodity pricing of today’s market, can ASPs tolerate this new move? Even though Intel occasionally mentions 7 nm, there seems to be no defined roadmap to get there. Consortiums and research are good things. However, we now have to figure out practical steps to get to the future the consortium has described.
When we look at the history of the PC industry, we see that while Moore’s Law is fantastic, it is always outpaced by consumer demand. Market-expanding software solutions can be developed faster than hardware solutions, but they are frequently performance constrained by the limits of running on general purpose processors. Eventually IHVs see a large enough market and have time to develop custom silicon to parallelize the process. This lag between when the problem is first noticed and when it’s solved in silicon can be referred to as the “Wilson Gap”, a phrase coined by some Microsoft employees who worked with me and quoted my assessment, “Information consumer appetite/demand will always outpace CPU capability”, which I stated in a meeting regarding complex computational transforms.
By doing a simple analysis of this “Wilson Gap” over a series of technologies we can see some very interesting patterns:
*Note: This illustration is based on 2011 estimates
The vertical axis represents the number of years a particular technology was on the market in software-only form before it was introduced in silicon as an ASIC (Application-Specific Integrated Circuit). Based on this data I would like to postulate that companies like Microsoft and Google have a direct bearing on these figures, and that in many cases they can significantly reduce the Wilson Gap. But first, let’s review the situation a little further.
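The measure described above can be written down as a small calculation. The following is a minimal sketch, assuming the gap is simply the year of first ASIC availability minus the year of first software-only availability. The class name, field names, and the two sample entries are hypothetical placeholders for illustration, not the 2011 estimates behind the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    name: str
    software_year: int  # first shipped as software on a general purpose CPU
    silicon_year: int   # first shipped as a dedicated ASIC

    @property
    def wilson_gap(self) -> int:
        """Years spent in software-only form before reaching silicon."""
        return self.silicon_year - self.software_year


def rank_by_gap(techs):
    """Return technologies ordered from the largest gap to the smallest."""
    return sorted(techs, key=lambda t: t.wilson_gap, reverse=True)


# Placeholder values only; these are NOT real market data.
examples = [
    Technology("hypothetical parser", 2001, 2003),
    Technology("hypothetical codec", 2000, 2004),
]

for tech in rank_by_gap(examples):
    print(f"{tech.name}: {tech.wilson_gap} years")
```

Replacing the placeholder entries with real introduction dates would reproduce the vertical axis of the illustration.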
How the SW Industry Fights the Wilson Gap
While the flexibility of the general purpose CPU offers imaginative engineers the ultimate design surface, it likewise has the inherent limitation that code must be reduced to a lowest common denominator, that being the CPU instruction set. Time and again, this limitation has caused a Wilson Gap between what consumers want and what the PC platform is inherently able to deliver.
For Many of Today’s Needs Moore’s Law is too Slow
As the previous graph illustrates, the Wilson Gap was a limiting factor in the potential market for specific technologies, when the CPU was not fast enough for the consumer demand of floating point operations. Likewise, at various times throughout PC history, the CPU has not kept up with demand for:
Digital Signal Processing (DSP)
3D Graphics
SSL Processing (encompassing 3DES, RSA, AES)
MPEGx Encoding/Decoding
Windows Media Encoding/Decoding
TCP/IP offloading
XML Parsing and Canonicalization
ASICs help reduce the Wilson Gap
When Moore’s Law is too slow we traditionally rely on ASICs to fill the Wilson Gap. In all of the examples above (Math Coprocessor, DSP, 3D, 3DES, RSA, MPEG, etc.) we now have fairly low-cost ASICs that can solve the performance issue. These (ASIC) processors will typically accelerate a task, off-load a task or perform some combination of the two. For the remainder of this paper we’ll use the term “accelerate” to include acceleration that encompasses CPU off-loading. However, the total time to solution and time to money for these ASICs are far too long for current industry economic conditions.
The Downside to ASIC Solutions
Unfortunately ASICs are inherently slow to market and are a very risky business proposition. For example, the typical ASIC takes 8 to 12 months to design, engineer and manufacture. Thus their target technologies must be under extremely high market demand before companies will make the bet and begin the technology development and manufacturing process. As a result, ASICs will always be well behind the curve of information consumer requirements served by cutting edge software.
Another difficulty faced in this market is that ASIC or Silicon Gate development is very complex, requiring knowledge of VHDL or Verilog. The efficient engineering of silicon gate-oriented solutions requires precision in defining the problem space and architecting the hardware solution. Both of these precise processes take a long time.
FPGAs further reduce the Wilson Gap
A newer approach to reducing the Wilson Gap that is gaining popularity is the use of Field Programmable Gate Arrays (or FPGAs). FPGAs provide an interim solution between ASICs and software running on a general purpose CPU. They allow developers to realign the silicon gates on a chip and achieve performance benefits approaching those of ASICs, while at the same time allowing the chip to be reconfigured with updated code or a completely different algorithm. Modern development tools are also coming online that reduce the complexity of programming these chips by adding parallel extensions to the C language, and then compiling C code directly to gate patterns. One of the most popular examples of this is Handel-C (out of Oxford).
The Downside to FPGA Solutions
Typically FPGAs run at 50% to 70% of the speed of an identical ASIC solution. However, FPGAs are more typically geared to parallelizing algorithms, are configurable so as to receive updates, and leverage a shorter development cycle (http://www.xilinx.com/products/virtex/asic/methodology.htm). These factors combine to extend the lifespan of a given FPGA-based solution further than an ASIC solution.
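To make the 50% to 70% figure concrete, here is a rough sketch of what it implies for run time. It assumes "speed" means throughput, so a task that takes t seconds on an ASIC takes t divided by the speed fraction on the FPGA. The function name and the 7-second sample task are hypothetical, chosen only for illustration.

```python
def fpga_time_range(asic_seconds, low_fraction=0.5, high_fraction=0.7):
    """Estimate (best, worst) FPGA run time for a task with a known ASIC run time.

    Assumes the FPGA delivers between low_fraction and high_fraction of the
    ASIC's throughput, so time scales as asic_seconds / fraction.
    """
    if asic_seconds < 0:
        raise ValueError("asic_seconds must be non-negative")
    if not 0 < low_fraction <= high_fraction <= 1:
        raise ValueError("fractions must satisfy 0 < low <= high <= 1")
    best = asic_seconds / high_fraction   # FPGA at 70% of ASIC speed
    worst = asic_seconds / low_fraction   # FPGA at 50% of ASIC speed
    return best, worst


# Hypothetical task that takes 7 seconds on an ASIC.
best, worst = fpga_time_range(7.0)
print(f"FPGA estimate: {best:.1f} s to {worst:.1f} s")
```

Under these assumptions the FPGA takes roughly 1.4 to 2 times as long as the ASIC, which is the cost being traded for reconfigurability and a shorter development cycle.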
A Repeating Pattern
Looking at the market for hardware accelerators over the past 20 years we see a repeating pattern of:
First implemented on the general purpose CPU
Migrated to ASIC/DSP once the market is proven
Next the technology typically takes one of two paths:
The ASIC takes on a life of its own and continues to flourish (such as 3D graphics) outside of the CPU (or embedded back down on the standard motherboard)
The ASIC becomes obsolete as Moore’s Law brings the general purpose CPU up to par with the accelerator by including the new instructions required.
Now let’s examine two well known examples in the Windows space where the Wilson Gap has been clearly identified and hardware vendors are in the development cycle of building ASIC solutions to accelerate our bottlenecks.
Current Wilson Gaps
Our first example is Windows Media 9 decoding: ASIC hardware is on its way thanks to companies such as ATI, NVIDIA and others. This will allow the playback of HD-resolution content such as the new Terminator 2 WM9 DVD on slower performance systems. Another example is TCP Offload Engines (TOE), which have recently arrived on the scene. Due to the extensibility of both the Windows Media and networking stacks, both of these technologies are fairly straightforward to implement.
Upcoming Wilson Gaps – Our Challenge
However, moving forward the industry faces other technologies which don’t have extensibility points for offloading or acceleration. This lack of extensibility has led to duplication of effort across various product teams: not duplication in a competitive sense (which is usually good), but a symbiotic duplication of effort that increases the cost of maintenance and security.