Monolithic 3D Inc., the Next Generation 3D-IC Company
  • Home
  • Technology
    • Technology
    • Papers, Presentations and Patents
    • Overview >
      • Background
      • Why Monolithic 3D?
      • Paths to Monolithic 3D
      • Applications
    • Ion-Cut: The Building Block
    • Monolithic 3D Logic >
      • RCAT
      • HKMG
      • Laser Annealing
      • RCJLT
      • 3D Embedded RAM
      • 3D Gate Array
      • FPGA
      • Ultra Large Integration - Redundancy and Repair
    • Monolithic 3D Memory >
      • 3D DRAM
      • 3D Resistive Memories
      • 3D Flash
    • Monolithic 3D Electro-Optics >
      • 3D Image Sensors
      • 3D Micro-Displays
  • 3D-IC Edge
    • 3D-IC Edge
  • News & Events
    • News & Events
    • S3S15 Game Change 2.0 Video/P
    • Webcast
    • Webinar
    • Press Releases
    • In the News
    • Upcoming Events
  • About Us
    • About Us
    • History
    • Team
    • Careers
    • Contact Us
  • Blog
  • Simulators

A 1,000x Improvement in Computer Systems by Bridging the Processor-Memory Gap

5/23/2018

0 Comments

 
Picture
We have a guest contribution from Zvi Or-Bach, the President and CEO of MonolithIC 3D Inc.
​For over 4 decades the gap between computer processing speed and memory access has grown at about 50% per year, to more than 1,000x today. This provides an excellent opportunity to enhance the single-core system performance. An innovative 3D integration technology combined with re-architecting the integrated memory device is proposed to bridge the gap and enable a 1,000 x improvement in computer systems.
ABSTRACT
For over 4 decades the gap between computer processing speed and memory access has grown at about 50% per year, to more than 1,000x today. This provides an excellent opportunity to enhance the single-core system performance. An innovative 3D integration technology combined with re-architecting the integrated memory device is proposed to bridge the gap and enable a 1,000 x improvement in computer systems. The proposed technology utilizes processes that are widely available and could be integrated in products within a very short time.
Keywords—processor-memory gap; 3D memory;
1. INTRODUCTION
The Fig. 1 illustrates the growing gap between processing and memory access [1,2]
Picture
​The source for this gap is directly related to the gap between transistor performance progress vs. on-chip interconnect delay as illustrated in Fig. 2 [3,4]
Picture
In most computer systems the processor is being bought from a processor vendor (Intel, AMD, Nvidia,…) while the memory is being bought from memory vendors (Micron, Samsung, Hynix,…). In many cases multiple memory devices are being integrated into a memory module, often called DIMM (such as Fig. 3), which is then integrated into a computer system with the processor(s). This integration is driven by the very different semiconductor technology knowhow, and manufacturing infrastructure required, for processors vs. memories.
Picture
​Printed circuit board (‘PCB’) is used to connect processor to memory adding significantly to the ‘gap,’ as illustrated in Fig. 4 [5]
Picture
​This gap has been articulated in many forms including in a presentation titled “Why we need Exascale and why we won't get there by 2020” [6] which included Figs. 5. There is a tremendous power cost in moving data to and from an off-chip memory, and even an ‘integrated’ memory.
Picture
2. PREVIOUSLY PROPOSED SOLUTIONS
A joint research by teams from Stanford University, Berkeley and Carnegie Mellon was summarized in their paper “Energy-Efficient Abundant-Data Computing: The N3XT 1,000×” illustrated in Fig. 6 [7].
Picture
​Yet it is very unclear if and when any of the proposed new technologies – CNT, RRAM, STT-MRAM - would be available and in high volume production and ready to be used for manufacturing of computer systems.
​
3D integration using TSV has been considered as another attractive option to bridge the gap. A paper by D. H. Woo concluded “On average, for single-threaded memory intensive applications, the speedups range from 1.53 to 2.14 compared to a conventional 2D architecture” [8]. This is far less than the monolithic 3D work done by Stanford. In Fig. 12 Micron provides a chart comparing TSV vs. DDR3. Micron has formed an industry consortium named Hybrid Memory Cube – “HMC,” to leverage TSV for bridging the memory gap.
Picture
​A table comparing 2.5D and TSV memory stacking technologies is presented in Fig. 7.
​
While the various approaches offer clear performance improvements, they are far from what is suggested by the N3XT factor of 1,000x. It would seem that the inherent limitations (density, delay) of the TSV technology are the limiting factors. We therefore propose a novel 3D integration that offers more than 1,000x better vertical connectivity to enable comprehensive bridging of the processor memory gap, while still utilizing widely available commercial processes that make it attractive for fast industry adoption.
​
​3. BRIDGING THE GAP WITH MONOLITHIC 3D INTEGRATION
In TSV technology the layer thickness of each layer in the stack is tens of microns (~50 μm) and the via through each stacked layer is about 5 microns in diameter. Monolithic 3D integration enables stacked layers of tens of nanometers (~50 nm) with vias that are of similar size of regular vias between metal layers. All currently known 3D integration techniques that provide such vertical connectivity require major process and equipment changes at the wafer fab. The following approach is a monolithic 3D innovation that overcomes this challenge.
​
Or-Bach in “Modified ELTRAN® - A Game Changer for Monolithic 3D” presented a 3D integration that does not require process change but would require a special porous base substrate [9]. Here we propose an alternative substrate could be far more readily available (Fig. 8).
Picture
Epitaxial wafer with silicon over SiGe are already the preferred substrates of future nodes for silicon nano-wires, with gate-all-around in which the SiGe layer is used as a sacrificial layer. We suggest a reverse use of such epitaxial layer, in which the SiGe would function as an etch-selective​ ‘cut’ layer by functioning as an etch stop for a back grinding and etch-back process sequence. The 3D technique of flip, bond and etch-back of an SOI donor had been practiced by MIT Lincoln Lab for many years [10, 11]. Use of a SiGe ‘cutable’ substrate is an attractive alternative as SOI wafers are quite expensive. The SiGe ‘cutable’ substrate could be processed as a regular wafer through the fab all through the process, including BEOL interconnection layers. Then it could be flipped over and bonded to a target wafer. A simple grind and etch-back operation using the SiGe layer an etch stop follows, which could be later removed by a follow-up etch step. Accordingly, the transferred layer could be made as thin as desired, down to less than 100 nm of silicon, removing by grinding and etch back almost all of the 700 micron of the original substrate. Fig. 16 illustrates selective etching of silicon allowing SiGe to serve as an etch stop [12, 13, 14].
Picture
​Currently available production worthy wafer bonders can support such layer transfers with less than 200nm (3σ) misalignment [15].
This fine grained 3D integration could be used for reengineering memory product into two strata, one for arrays of bit cells and one for memory control circuitry, as is illustrated in Fig. 10.
Picture
​Such densely connected 3D partitioning would reduce memory costs as bit-cell processing is completely disconnected and different from memory control circuitry processing. Manufacturing memory and logic on separate wafers would reduce overall costs, while enabling a paradigm shift in the logic and memory interface.
​
Having the memory peripheral circuitry not in the periphery of the device but rather on top of it, allows cost effective breaking up of the memory array into an array of small memory units, such as 200 μm x 200 μm units, each with its own word-lines and bit-lines.
Additionally, multiple memory strata could be vertically integrated to offer much larger memory capacity for the same footprint. Fig. 11 illustrates a region of such memory strata having two adjacent memory units with word-lines or bit-lines traveling in-between, and a per-layer selector, and nano-pads for vertical connectivity.
Picture
​Fig. 12a illustrates a vertical connectivity region having landing pads large enough to cover the bonding misalignment. Fig. 12b illustrates the region with overlaying via connecting these pads to the word-lines or the bits-lines, and Fig. 12c illustrates the overlying pinning pads.
Picture
​Memory strata could be constructed by integrating such connectivity structure with the block diagram of Fig. 11 to form a stackable memory structure, which could be produced as a generic memory substrate. Hybrid wafer bonding could be used to provide vertical connectivity between layers in the stack. The required distance between two adjacent units (Fig. 11) could be less than 1 μm, resulting in less than 0.5% ​overhead for this 3D memory tiling and connectivity structure.

Fig. 13 illustrates a vertical cross-section view of 4 strata distributing the top select signal to each stratum in the stack.
​
This connectivity structure opens up many usage options, including redundancy to overcome defects. The memory is constructed by stacks of strata, each constructed as array of units connected in parallel with a select signal per stratum. One of these strata could serve as a redundancy stratum with per unit select allowing repair at the unit level. In addition, multiple memory access options could be enabled from high speed local access to a global—albeit somewhat slower—access to large array of units. An additional advantage of this architecture is having one (or two) memory control strata to service multiple memory strata.
Picture
​Fig. 14 illustrates a 3D computer system utilizing the technologies presented here. The base silicon is a carrier substrate which also provides cooling to the main multi-core computing stratum. Through the first thermal isolation layer, the computer stratum is connected to the multi-unit lower memory control stratum, which controls the multi-unit memory array strata. Overlaying the memory strata is an upper memory control stratum which provides a second access to the same memory strata. Through a second thermal isolation layer a second computing stratum could be connected to the upper memory control stratum. The second computing stratum could be the Input/Output computing stratum communicating with external devices utilizing another communications stratum. The communications stratum could utilize wired, wireless, optical or other channels to communicate with external devices. An upper heat removal apparatus could overlay the communications stratum.
Picture
​4. SUMMARY
We have presented a 3D integration technology combined with memory architecture that could utilize existing processes and infrastructure to bridge the processor memory gap. Using these concepts would enable the current connectivity of about 100 wires at average 20 mm length to be replaced with 100,000 wires with average 20 μm length, with the corresponding 1,000x improvements in computation speed, power, and cost. Additional advantages would be reduction of overall system costs, establishing a generic memory fabric with build in full repair capability at factory and field.
​
5. REFERENCES
[1] Hennessy, John L. and David A. Patterson. Computer Architecture: A Quantitative Approach. 4th ed., p. 289. Elsevier, 2007.
[2] McCalpin, John D. “Memory Bandwidth and System Balance in HPC Systems”, Invited Talk, International Conference for High Performance Computing, Networking, Storage, and Analysis, 2016.
[3] Wu, Banqiu, and Ajay Kumar. "Extreme ultraviolet lithography and three dimensional integrated circuit—A review." Applied Physics Reviews 1.1, 2014.
[4] Yeap, Geoffrey. "Smart mobile SoCs driving the semiconductor industry: Technology trend, challenges and opportunities." In Electron Devices Meeting (IEDM), 2013.
[5] Sun, Jack Y-C. "System scaling and collaborative open innovation." VLSI Technology (VLSIT), 2013 Symposium on. IEEE, 2013.
[6] Simon, Horst. "Why we need Exascale and why we won’t get there by 2020." Optical Interconnects Conference, Santa Fe, New Mexico. 2013.
[7] Aly, Mohamed M. Sabry, et al. "Energy-efficient abundant-data computing: The n3xt 1,000 x." Computer 48.12 (2015): 24-33.
[8] Woo, Dong Hyuk, et al. "An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth." High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on. IEEE, 2010.
[9] Or-Bach, Zvi, et al. "Modified ELTRAN®—A game changer for Monolithic 3D." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), 2015 IEEE. IEEE, 2015.
​[10] Chen, C. K., et al. "3D-enabled heterogeneous integrated circuits." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), 2013 IEEE. IEEE, 2013.
[11] Chen, C. K., et al. "SOI-enabled three-dimensional integrated-circuit technology." SOI Conference (SOI), 2010 IEEE International. IEEE, 2010.
[12] Orlowski, Marius, et al. "(Invited) Si, SiGe, Ge, and III-V Semiconductor Nanomembranes and Nanowires Enabled by SiGe Epitaxy." ECS Transactions 33.6 (2010): 777-789.
[13] Borenstein, J. T., et al. "Silicon germanium epitaxy: a new material for MEMS." MRS Proceedings. Vol. 657. Cambridge University Press, 2000.
[14] Taraschi, Gianni, et al. "Ultrathin strained Si-on-insulator and SiGe-on-insulator created using low temperature wafer bonding and metastable stop layers." Journal of The Electrochemical Society 151.1 (2004): G47-G56.
[15] Uhrmann, T. "Monolithic IC Integration-Key alignment specifications for high process yield." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), IEEE. 2014.
0 Comments

    Search Blog


    Meet the Bloggers


    Follow us


    To get email updates subscribe here:


    Recommended Links

    3D IC Community
    3D IC LinkedIn Discussion Group

    Recommended Blogs

    • 3D InCites by Francoise von Trapp
    • EDA360 Insider by Steve Leibson
    • Insights From the Leading Edge by Phil Garrou
    • SemiWiki by Daniel Nenni, Paul Mc Lellan, et al.

    Archives

    March 2022
    December 2021
    August 2021
    August 2018
    July 2018
    May 2018
    October 2017
    September 2017
    December 2016
    September 2016
    August 2016
    November 2015
    October 2015
    September 2015
    July 2015
    June 2015
    May 2015
    April 2015
    March 2015
    February 2015
    October 2014
    September 2014
    August 2014
    July 2014
    June 2014
    May 2014
    April 2014
    March 2014
    February 2014
    January 2014
    December 2013
    November 2013
    October 2013
    September 2013
    August 2013
    July 2013
    March 2013
    February 2013
    January 2013
    December 2012
    November 2012
    October 2012
    August 2012
    June 2012
    May 2012
    April 2012
    March 2012
    February 2012
    January 2012
    December 2011
    November 2011
    October 2011
    September 2011
    August 2011
    July 2011
    June 2011
    May 2011
    April 2011
    March 2011

    Categories

    All
    3d Design And Cad
    3dic
    3d Ic
    3d Nand
    3d Stacking
    3d Technology
    Brian Cronquist
    Dean Stevens
    Deepak Sekar
    Dram
    Education
    Heat Removal And Power Delivery
    Industry News
    Israel Beinglass
    Iulia Morariu
    Iulia Tomut
    Monolithic3d
    Monolithic 3d
    MonolithIC 3D Inc.
    Monolithic 3d Inc.
    Monolithic 3d Technology
    Moore Law
    Outsourcing
    Paul Lim
    Repair
    Sandisk
    Semiconductor
    Semiconductor Business
    Tsv
    Zeev Wurman
    Zvi Or Bach
    Zvi Or-Bach

    RSS Feed

© Copyright MonolithIC 3D Inc. , the Next-Generation 3D-IC Company, 2012 - All Rights Reserved, Patents Pending