"We moved to 3D starting with Trigate (FinFET) at 22 nm node, but an even better example is our announcement in May of a 96-layer, 4-bit-per-cell NAND flash that packs up to 1 terabit of information per die. This is a true post-Dennard example of packing increasing functions into a die without feature scaling. Over time, we expect logic to also move more toward 3D." [read full article]
The blog post "The Continuing Evolution of Moore's Law," written by Dr. Michael Mayberry, chief technology officer of Intel Corporation, was posted on August 2nd on the EE Times website. He wrote that, in the future, logic will move more toward 3D.
EV Group (EVG), a supplier of wafer bonding and lithography equipment for the MEMS, nanotechnology and semiconductor markets, today unveiled the new SmartView® NT3 aligner, which is available on the company’s industry benchmark GEMINI® FB XT integrated fusion bonding system for high-volume manufacturing (HVM) applications. Developed specifically for fusion and hybrid wafer bonding, the SmartView NT3 aligner provides sub-50-nm wafer-to-wafer alignment accuracy — a 2-3X improvement — as well as significantly higher throughput (up to 20 wafers per hour) compared to the previous-generation platform. - Original article from Solid State Technology Magazine.
ABSTRACT
For over four decades the gap between computer processing speed and memory access has grown at about 50% per year, to more than 1,000x today. This provides an excellent opportunity to enhance single-core system performance. An innovative 3D integration technology, combined with re-architecting the integrated memory device, is proposed to bridge the gap and enable a 1,000x improvement in computer systems. The proposed technology utilizes processes that are widely available and could be integrated in products within a very short time.
Keywords: processor-memory gap; 3D memory
1. INTRODUCTION
Fig. 1 illustrates the growing gap between processing and memory access [1,2]. The source of this gap is directly related to the gap between transistor performance progress and on-chip interconnect delay, as illustrated in Fig. 2 [3,4].
In most computer systems the processor is bought from a processor vendor (Intel, AMD, Nvidia, …) while the memory is bought from memory vendors (Micron, Samsung, Hynix, …). In many cases multiple memory devices are integrated into a memory module, often called a DIMM (such as in Fig. 3), which is then integrated into a computer system with the processor(s). This integration is driven by the very different semiconductor technology know-how and manufacturing infrastructure required for processors vs. memories. A printed circuit board (‘PCB’) is used to connect the processor to the memory, adding significantly to the ‘gap,’ as illustrated in Fig. 4 [5].
This gap has been articulated in many forms, including in a presentation titled “Why we need Exascale and why we won't get there by 2020” [6], which included Fig. 5. There is a tremendous power cost in moving data to and from an off-chip memory, and even an ‘integrated’ memory.
2. PREVIOUSLY PROPOSED SOLUTIONS
A joint research effort by teams from Stanford University, Berkeley and Carnegie Mellon was summarized in their paper “Energy-Efficient Abundant-Data Computing: The N3XT 1,000×,” illustrated in Fig. 6 [7]. Yet it is very unclear if and when any of the proposed new technologies (CNT, RRAM, STT-MRAM) would be available in high-volume production and ready to be used for manufacturing computer systems.
3D integration using TSV has been considered as another attractive option to bridge the gap. A paper by D. H. Woo concluded: “On average, for single-threaded memory intensive applications, the speedups range from 1.53 to 2.14 compared to a conventional 2D architecture” [8]. This is far less than what the monolithic 3D work done by Stanford suggests. In Fig. 12 Micron provides a chart comparing TSV vs. DDR3. Micron has formed an industry consortium named Hybrid Memory Cube (“HMC”) to leverage TSV for bridging the memory gap. A table comparing 2.5D and TSV memory stacking technologies is presented in Fig. 7.
While the various approaches offer clear performance improvements, they are far from what is suggested by the N3XT factor of 1,000x. It would seem that the inherent limitations (density, delay) of the TSV technology are the limiting factors. We therefore propose a novel 3D integration that offers more than 1,000x better vertical connectivity to enable comprehensive bridging of the processor memory gap, while still utilizing widely available commercial processes that make it attractive for fast industry adoption.
3. BRIDGING THE GAP WITH MONOLITHIC 3D INTEGRATION
In TSV technology the thickness of each layer in the stack is tens of microns (~50 μm) and the via through each stacked layer is about 5 microns in diameter. Monolithic 3D integration enables stacked layers of tens of nanometers (~50 nm) with vias similar in size to the regular vias between metal layers.
All currently known 3D integration techniques that provide such vertical connectivity require major process and equipment changes at the wafer fab. The following approach is a monolithic 3D innovation that overcomes this challenge.
Or-Bach, in “Modified ELTRAN® - A Game Changer for Monolithic 3D,” presented a 3D integration that does not require process change but would require a special porous base substrate [9]. Here we propose an alternative substrate that could be far more readily available (Fig. 8). Epitaxial wafers with silicon over SiGe are already the preferred substrates of future nodes for silicon nano-wires with gate-all-around, in which the SiGe layer is used as a sacrificial layer. We suggest a reverse use of such an epitaxial layer, in which the SiGe would function as an etch-selective ‘cut’ layer by serving as an etch stop for a back-grinding and etch-back process sequence. The 3D technique of flip, bond and etch-back of an SOI donor has been practiced by MIT Lincoln Lab for many years [10, 11]. Use of a SiGe ‘cutable’ substrate is an attractive alternative, as SOI wafers are quite expensive.
The SiGe ‘cutable’ substrate could be processed as a regular wafer all the way through the fab, including the BEOL interconnection layers. Then it could be flipped over and bonded to a target wafer. A simple grind and etch-back operation using the SiGe layer as an etch stop follows; the SiGe layer could later be removed by a follow-up etch step. Accordingly, the transferred layer could be made as thin as desired, down to less than 100 nm of silicon, with grinding and etch-back removing almost all of the 700 microns of the original substrate. Fig. 16 illustrates selective etching of silicon allowing SiGe to serve as an etch stop [12, 13, 14]. Currently available production-worthy wafer bonders can support such layer transfers with less than 200 nm (3σ) misalignment [15].
This fine-grained 3D integration could be used for reengineering a memory product into two strata, one for arrays of bit cells and one for memory control circuitry, as illustrated in Fig. 10. Such densely connected 3D partitioning would reduce memory costs, as bit-cell processing is completely disconnected and different from memory control circuitry processing. Manufacturing memory and logic on separate wafers would reduce overall costs, while enabling a paradigm shift in the logic and memory interface. Having the memory peripheral circuitry not in the periphery of the device but rather on top of it allows cost-effective breaking up of the memory array into an array of small memory units, such as 200 μm x 200 μm units, each with its own word-lines and bit-lines. Additionally, multiple memory strata could be vertically integrated to offer much larger memory capacity for the same footprint.
Fig. 11 illustrates a region of such memory strata having two adjacent memory units with word-lines or bit-lines traveling in-between, a per-layer selector, and nano-pads for vertical connectivity. Fig. 12a illustrates a vertical connectivity region having landing pads large enough to cover the bonding misalignment. Fig. 12b illustrates the region with overlaying vias connecting these pads to the word-lines or the bit-lines, and Fig. 12c illustrates the overlying pinning pads. Memory strata could be constructed by integrating such a connectivity structure with the block diagram of Fig. 11 to form a stackable memory structure, which could be produced as a generic memory substrate. Hybrid wafer bonding could be used to provide vertical connectivity between layers in the stack. The required distance between two adjacent units (Fig. 11) could be less than 1 μm, resulting in less than 0.5% overhead for this 3D memory tiling and connectivity structure. Fig. 13 illustrates a vertical cross-section view of 4 strata distributing the top select signal to each stratum in the stack.
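The tiling overhead quoted above can be checked with simple geometry. The following is a minimal sketch, not the paper's own calculation: the function names are hypothetical, and it assumes square 200 μm units separated by a uniform gap, which is only one possible reading of the layout described in the text.

```python
def linear_overhead(unit_um: float, gap_um: float) -> float:
    """Fraction of one tile pitch (unit + gap) taken by the gap, along one axis."""
    return gap_um / (unit_um + gap_um)


def area_overhead(unit_um: float, gap_um: float) -> float:
    """Fraction of tile area lost if the same gap runs along both axes (assumption)."""
    pitch = unit_um + gap_um
    return 1.0 - (unit_um / pitch) ** 2


if __name__ == "__main__":
    unit = 200.0  # unit edge in micrometres, taken from the text
    # Gap widths in micrometres; the text only says "less than 1 um".
    for gap in (1.0, 0.5):
        print(
            f"gap={gap} um: linear={linear_overhead(unit, gap):.3%}, "
            f"area={area_overhead(unit, gap):.3%}"
        )
```

Under these assumptions, a 1 μm gap costs just under 0.5% along one axis, and the two-axis area overhead also drops below 0.5% once the gap is about 0.5 μm.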
This connectivity structure opens up many usage options, including redundancy to overcome defects. The memory is constructed from stacks of strata, each constructed as an array of units connected in parallel with a select signal per stratum. One of these strata could serve as a redundancy stratum with per-unit select, allowing repair at the unit level. In addition, multiple memory access options could be enabled, from high-speed local access to a global, albeit somewhat slower, access to a large array of units. An additional advantage of this architecture is having one (or two) memory control strata to service multiple memory strata.
Fig. 14 illustrates a 3D computer system utilizing the technologies presented here. The base silicon is a carrier substrate which also provides cooling to the main multi-core computing stratum. Through the first thermal isolation layer, the computer stratum is connected to the multi-unit lower memory control stratum, which controls the multi-unit memory array strata. Overlaying the memory strata is an upper memory control stratum which provides a second access to the same memory strata. Through a second thermal isolation layer a second computing stratum could be connected to the upper memory control stratum. The second computing stratum could be the Input/Output computing stratum communicating with external devices utilizing another communications stratum. The communications stratum could utilize wired, wireless, optical or other channels to communicate with external devices. An upper heat removal apparatus could overlay the communications stratum.
4. SUMMARY
We have presented a 3D integration technology combined with a memory architecture that could utilize existing processes and infrastructure to bridge the processor memory gap.
Using these concepts would enable the current connectivity of about 100 wires at an average length of 20 mm to be replaced with 100,000 wires with an average length of 20 μm, with the corresponding 1,000x improvements in computation speed, power, and cost. Additional advantages would be a reduction of overall system costs and the establishment of a generic memory fabric with built-in full repair capability at the factory and in the field.
5. REFERENCES
[1] Hennessy, John L. and David A. Patterson. Computer Architecture: A Quantitative Approach. 4th ed., p. 289. Elsevier, 2007.
[2] McCalpin, John D. "Memory Bandwidth and System Balance in HPC Systems." Invited Talk, International Conference for High Performance Computing, Networking, Storage, and Analysis, 2016.
[3] Wu, Banqiu, and Ajay Kumar. "Extreme ultraviolet lithography and three dimensional integrated circuit - A review." Applied Physics Reviews 1.1, 2014.
[4] Yeap, Geoffrey. "Smart mobile SoCs driving the semiconductor industry: Technology trend, challenges and opportunities." Electron Devices Meeting (IEDM), 2013.
[5] Sun, Jack Y-C. "System scaling and collaborative open innovation." VLSI Technology (VLSIT), 2013 Symposium on. IEEE, 2013.
[6] Simon, Horst. "Why we need Exascale and why we won’t get there by 2020." Optical Interconnects Conference, Santa Fe, New Mexico, 2013.
[7] Aly, Mohamed M. Sabry, et al. "Energy-efficient abundant-data computing: The N3XT 1,000×." Computer 48.12 (2015): 24-33.
[8] Woo, Dong Hyuk, et al. "An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth." High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on. IEEE, 2010.
[9] Or-Bach, Zvi, et al. "Modified ELTRAN® - A game changer for Monolithic 3D." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), 2015 IEEE. IEEE, 2015.
[10] Chen, C. K., et al. "3D-enabled heterogeneous integrated circuits." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), 2013 IEEE. IEEE, 2013.
[11] Chen, C. K., et al. "SOI-enabled three-dimensional integrated-circuit technology." SOI Conference (SOI), 2010 IEEE International. IEEE, 2010.
[12] Orlowski, Marius, et al. "(Invited) Si, SiGe, Ge, and III-V Semiconductor Nanomembranes and Nanowires Enabled by SiGe Epitaxy." ECS Transactions 33.6 (2010): 777-789.
[13] Borenstein, J. T., et al. "Silicon germanium epitaxy: a new material for MEMS." MRS Proceedings. Vol. 657. Cambridge University Press, 2000.
[14] Taraschi, Gianni, et al. "Ultrathin strained Si-on-insulator and SiGe-on-insulator created using low temperature wafer bonding and metastable stop layers." Journal of The Electrochemical Society 151.1 (2004): G47-G56.
[15] Uhrmann, T. "Monolithic IC Integration - Key alignment specifications for high process yield." SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), IEEE, 2014.
We have a guest contribution from Zvi Or-Bach, the President and CEO of MonolithIC 3D Inc.
Next week, as part of the IEEE S3S 2017 program, we will present a paper (18.3) titled “A 1,000x Improvement in Computer Systems by Bridging the Processor Memory Gap”. The paper details a monolithic 3D technology that is low-cost and ready to be rapidly deployed using the current transistor processes. In that talk, we will also describe how such an integration technology could be used to improve performance and reduce power and cost of most computer systems, suggestive of a 1,000x total system benefit. This game-changing technology will also be presented at the CoolCube open workshop, a free satellite event of the conference's 3DI program.
In an interesting coincidence, DARPA just came out with a call for a >50x improvement in SoC performance. The 3DSoC DARPA solicitation reads: “As noted above, the 3DSoC technology demonstrated at the end of the program (3.5 Years) should also have the following characteristics: – Capability of > 50X the performance at power when compared with 7nm 2D CMOS technology.” The 3DSoC program goal of 50x is set to allow proposals based on a US-built device at the 90nm node, compared against a 7nm computer chip using conventional 2D technologies. Looking at the table below, we can see that if 7nm technology is used the benefit would be over 300x.
This represents a paradigm shift for the computer industry and the high-tech world, as normal scaling would provide a 3x improvement at best. The emergence of AI and deep learning systems makes memory access a key challenge for future systems, and indicates the far larger benefits offered by monolithic 3D integration.
The following charts were presented by the 3DSoC program manager Linton Salmon at the 3DSoC proposers day. The program calls for the use of monolithic 3D to overcome the current weakest link in computers – the memory wall.
Leading to the 3DSoC solicitation was work done by Stanford, MIT, Berkeley and Carnegie Mellon.
Proposals are due by Nov 6.
There is a unique opportunity to hear the 3DSoC DARPA Program Manager, Dr. Linton Salmon, articulate the program and what DARPA is looking for during his invited talk at the S3S 2017 conference next week.
We have a guest contribution from Zvi Or-Bach, the President and CEO of MonolithIC 3D Inc.
As recently reported, in an effort to initiate resurgence of the U.S. electronics industry, some $500-$800 million will be invested in post-Moore's Law technologies.
Quoting from the BAA: “As noted above, the 3DSoC technology demonstrated at the end of the program (3.5 Years) should also have the following characteristics:
The following charts were presented by the 3DSoC program manager Linton Salmon at the 3DSoC proposers day. The program calls for the use of monolithic 3D to overcome the current weakest link in computers – the memory wall. This is illustrated in the following slides:
Leading to the 3DSoC solicitation was work done by Stanford, MIT, Berkeley and Carnegie Mellon, summarized in the following two slides.
Proposals are due by Nov 6. Everyone has a unique opportunity to hear an invited talk about the program on October 16 from the 3DSoC DARPA Program Manager, Dr. Linton Salmon, during this year’s IEEE S3S conference at the Hyatt Regency at the San Francisco Airport. The IEEE S3S conference has a dedicated focus on monolithic 3D technologies.
3D integration is considered expensive and monolithic 3D is considered extremely challenging requiring high technology, investment, and major changes to the foundry process. In an interesting entanglement, the S3S 2017 program includes a paper by us, MonolithIC 3D Inc., in which we will present a monolithic 3D technology that is low cost and ready to be rapidly deployed using the current transistor processes. In that talk we will also describe how such an integration technology could be used to improve performance and reduce power and cost of most computer systems, suggestive of a 1,000x total system benefit. I hope to see you there to talk further about this upcoming disruptive change.
We have a guest contribution from Zvi Or-Bach, the President and CEO of MonolithIC 3D Inc.
A technology that could bridge the processor memory gap, Monolithic 3D has DARPA's attention. The agency wants proposals by Nov 6. Learn more at an upcoming IEEE conference.
On Sept 13 the Defense Advanced Research Projects Agency (DARPA) launched a giant funding effort to ensure the United States will sustain the pace of electronic innovation vital to both a flourishing economy and a secure military. Under the banner of the Electronics Resurgence Initiative (ERI), some $500-$800 million will be invested in post-Moore’s law technologies.
Among those is the 3DSoC program. “The overall goal of the Three Dimensional Monolithic System-on-a-Chip (3DSoC) program,” DARPA wrote in a statement, “is to develop 3D monolithic technology that will enable > 50X improvement in SOC digital performance at power. 3DSoC aims to drive research in process, design tools, and new compute architectures for future designs while utilizing U.S. fabrication capabilities.” This is illustrated in the following chart:
The foundation for the 3DSoC program was formed in prior DARPA research work performed by Stanford University in collaboration with Berkeley and CMU, summarized in “Energy-Efficient Abundant-Data Computing: The N3XT 1,000×.”
The underlying problem, whose resolution would enable these orders-of-magnitude improvements, is often called “The Memory Wall,” illustrated by the following chart from Hennessy and Patterson:
The main source of the gap is the limited number of long wires connecting memory to processor. This is driven by the fact that the process line for memory is typically very different from that for processors, resulting in memory chips being aggregated into memory modules and connected to the processor using a printed circuit board or, at best, a carrier substrate. A technology that could bridge this gap, Monolithic 3D, has the potential to provide more than 1,000x better computers.
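The wire argument can be made concrete with the figures from the S3S paper summary reproduced earlier on this page: about 100 wires averaging 20 mm today versus 100,000 wires averaging 20 μm with monolithic 3D. The sketch below is only a first-order illustration. The `Link` class and `compare` function are hypothetical names, and the energy line rests on the assumption that wire capacitance, and so switching energy per bit, scales linearly with wire length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A processor-memory connection described by wire count and average length."""
    wires: int
    length_um: float  # average wire length in micrometres


def compare(before: Link, after: Link) -> dict:
    """Return first-order improvement ratios between two links."""
    length_ratio = before.length_um / after.length_um
    return {
        "wire_count_gain": after.wires / before.wires,
        "length_reduction": length_ratio,
        # Assumption: energy per bit is proportional to wire length.
        "energy_per_bit_reduction": length_ratio,
    }


if __name__ == "__main__":
    pcb = Link(wires=100, length_um=20_000.0)     # about 100 wires at 20 mm
    mono3d = Link(wires=100_000, length_um=20.0)  # 100,000 wires at 20 um
    for name, value in compare(pcb, mono3d).items():
        print(f"{name}: {value:,.0f}x")
```

Both the wire count and the wire length change by a factor of 1,000 with these inputs, which is where the 1,000x figure comes from; real system gains would depend on much more than this simple ratio.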
DARPA kept the target enhancement at only 50x to allow the use of chips manufactured in older domestic fabs (~90nm) instead of off-shore 7nm fab lines. Proposals are due by Nov 6, giving everyone a unique opportunity to hear an invited talk about the program on October 16 from the 3DSoC DARPA Program Manager, Dr. Linton Salmon, during this year’s IEEE S3S 2017 at the Hyatt Regency at the San Francisco Airport. The IEEE S3S conference has a dedicated focus on monolithic 3D technologies. It provides an unparalleled opportunity to catch up quickly with the broad spectrum of monolithic technologies. At the conference’s start, Al Fazio, Intel Senior Fellow, will deliver a plenary talk on how 3D NAND and 3D XPoint ended up being the trailblazing monolithic 3D IC technologies that have matured to volume production, taking over the fast-growing memory market. The first day will conclude with two 3D IC focus sessions comprising a mix of invited and submitted papers covering exotic technologies and the use of the emerging nano-wire transistors for 3D scaling. The first half of the second day includes a collaborative event organized by Qualcomm and CEA Leti, the COOLCUBE/3DVLSI Open Workshop. The second half includes an open 3D tutorial providing full coverage of the various 3D integration technologies, from TSV to Sequential Integrations. The third day is made up of four sessions of invited and submitted talks on monolithic and other forms of 3D integration. These sessions include a talk by MonolithIC 3D Inc. in which we will present a monolithic 3D technology that is ready to be rapidly deployed using the current transistor process. We will also describe how such an integration technology could be used to improve performance and reduce power and cost of most computer systems, suggestive of a 1,000x total system benefit.
In addition, the IEEE S3S conference includes full coverage of SOI and low-power technologies, making it the place to be and to learn about alternative technologies to dimensional scaling. I am looking forward to seeing you at the S3S from October 16th thru 19th, 2017.
We have a guest contribution from Zvi Or-Bach, the President and CEO of MonolithIC 3D Inc.
Learn all about Monolithic 3D at IEEE S3S.
On Sept 13 DARPA came out with the Electronics Resurgence Initiative (ERI) programs. Quoting: “with an eye toward the times we now live in, [Gordon Moore] laid out the technical directions to explore when the conditions under which scaling will be the primary means for advancement are no longer met. A trio of simultaneously-released ERI BAAs—this one among them—parallel the research areas detailed on page three of Moore’s paper: materials and integration, architecture, and design. These new page-three-inspired investments, along with a series of related investments from the past year, comprise the overall Electronics Resurgence Initiative.”
Among these programs is the “Three Dimensional Monolithic System-on-a-Chip (3DSoC): Develop 3D monolithic technology that will enable > 50X improvement in SoC digital performance at power.” In perfect timing, this year’s IEEE S3S 2017 at the Hyatt Regency at the San Francisco Airport will feature a comprehensive showcase for monolithic 3D IC technologies. At the start, Al Fazio, Intel Senior Fellow, will give a plenary talk on how 3D NAND and 3D XPoint™ came to be the trailblazing monolithic 3D IC technologies that have matured to volume production, taking over the fast-growing memory market. The first day will end with two 3D IC focus sessions comprising a mix of invited and submitted papers covering exotic technologies and the use of the emerging nano-wire transistor for 3D scaling. The first half of the second day includes a collaborative event organized by Qualcomm and CEA Leti – the COOLCUBE/3DVLSI Open Workshop. The second half will include an open 3D tutorial providing full coverage of the various 3D integration technologies, from TSV to Sequential Integrations. The third day of the conference will be a full day with four sessions of invited and submitted talks on monolithic and other forms of 3D integration. These sessions will include a talk by us, MonolithIC 3D Inc., in which we will present a monolithic 3D technology that is ready to be rapidly deployed using the current transistor process. In that talk we will also describe how such an integration technology could be used to improve performance and reduce power and cost of most computer systems, suggestive of a 1,000x total system benefit. In addition, the IEEE S3S conference will have full coverage of SOI and low-power technologies, making it the place to be and to learn about alternative technologies to dimensional scaling. I am looking forward to seeing you at the S3S from October 16th thru 19th, 2017.
As we come to the end of 2016, the MonolithIC 3D Inc. team would like to share its holiday greetings by wishing you a Merry Christmas and a Happy New Year. We end this year with great recognition from Solid State Technology magazine. Our CEO's recent blog post, “Moore’s Law did indeed stop at 28nm,” was number 1 among its top stories of 2016. You can access the entire article here.
As we predicted two and a half years ago, the industry is bifurcating: just a few products pursue scaling to 7nm while the majority of designs stay on 28nm or older nodes.
Our March 2014 blog post, “Moore’s Law has stopped at 28nm!”, has recently been reconfirmed. At the time we wrote: “From this point on we will still be able to double the amount of transistors in a single device but not at lower cost. And, for most applications, the cost will actually go up.” This reconfirmation can be found in the following IBS cost analysis table slide, presented at the early Sept FD-SOI event in Shanghai.
Gate costs continue to rise each generation for FinFETs, IBS predicts.
As reported by EE Times in “Chip Process War Heats Up,” quoting Handel Jones of IBS: “28nm node is likely to be the biggest process of all through 2025.”