Archive for the ‘General News’ Category

Semicon China – SMIC Shows off 28-nm HKMG Development

Saturday, March 24th, 2012 by Dick James

Another foundry goes gate-last

In the opening keynote at Semicon China today, Dr. Tzu-Yin Chiu, CEO of SMIC, gave a run-through of their technology portfolio, and in doing so let out a few details of their sub-40 nm process development.

SMIC's application portfolio

It appears that they are actually shipping some 40-nm pilot product for revenue, and to keep the ARM-world happy, they will have Cortex A9 cores running at 1.2 GHz by the end of the year.

Snapshot of advanced node work at SMIC

Scheduled for mid-2013, their 28-nm offering will be both high-k, metal gate and poly/SiON, and feature one of the smallest SRAM cell sizes to date.

SMIC 28-nm schedule

The images are all distinctly fuzzy thanks to the challenges of using a phone camera at some distance from a dimly-lit screen, but they show what I’m talking about. It appears that the gate-last structure has more in common with TSMC’s 28-nm structure than Intel’s 32-nm, and also that the NMOS and PMOS labels have been reversed.

SMIC 28-nm gate structures and SRAM cell

In all the other gate-last HKMG transistors we have seen, the thick TiN and Ta layers are in the PMOS (you have to squint to distinguish them in this image, but they are there), and I wouldn’t expect SMIC’s to be any different. We can also see the tell-tale notch at the bottom of the gate edge that indicates that the gate dielectrics were formed before the dummy poly gate was put down.  At less than 0.13 sq. microns the SRAM cell is the smallest that I know of – TSMC is 0.15, and Intel 0.17 sq. microns.

Just for comparison, here’s a pair of composite images of Intel’s 45-nm transistors and TSMC’s 28-nm transistors. You can clearly see the notches at the bottom of the gate structures that I refer to above.

TSMC's 28-nm (right) and Intel's 45-nm gate-last transistors

The inclusion of a poly/SiON variant (presumably low-power) at 28 nm puts them on a par with TSMC and UMC, and leaves GLOBALFOUNDRIES as the only major foundry without an announced non-HKMG LP process at that node. If the rumours about GloFo second-sourcing the Qualcomm S4 (currently on TSMC’s poly/SiON 28LP line) are true, presumably they’ll have to develop one.

GloFo’s FinFETS are Better than Intel’s! Musings from on the Road..

Monday, March 19th, 2012 by Dick James

This confident statement came from Subramani (Subi) Kengeri of GLOBALFOUNDRIES (GloFo) during the panel session in the GloFo/IBM/Samsung Common Platform Technology Forum (CPTF), held Wednesday in the Santa Clara Convention Center. I’m currently on one of my periodic road trips, and this one has given me the chance to sit in on the CPTF – last year I had to make do with the online version.

Towards the end of the panel discussions, the host, Jaga Jagannathan of IBM, asked Subi “How do you stack up against Intel? – especially in the SoC/smartphone space?”

This clearly took Subi by surprise, but after some preamble, he focused on FinFET development, which AMD, then GloFo, have been working on for the last ten years.  In conjunction with customer input, they have been focusing their finFET efforts to optimise the (14 nm) process for mobile SoCs. He said that this was what would differentiate them from Intel, and in that space “We believe we have a much better finFET, that is optimised for mobile SoCs”.

CPTF Panel Session - Jaga on the left, Subi third from the right. Source: Common Platform

Of course time will tell, and the CPTF 14-nm process will likely not show up for three or four years, while we are waiting for Intel’s imminent launch of their trigate product.

The panel session has been put online, so you can see it by going here; register if you need to, then select Agenda and click the relevant link; if you want to see this particular Q & A, move the slider to the 52:30 timepoint.

Also during the discussion, Subi stated that GLOBALFOUNDRIES is in production for 32-nm HKMG, and is running the full flow of the 20-nm (gate-last) process in their Malta, NY fab.

Earlier in the day he had given one of the keynote talks, and it was then that he gave the logic for the move to finFET at 14-nm that was a major theme of the day.  It boils down to the fact that by the time you get to the 20-nm node, there are no more knobs to turn to crank up the performance of a transistor.  In order to mitigate the short-channel effects and increase drive current, a 3D fully-depleted structure is needed. GloFo regards the mobile sector as one of the big drivers for leading-edge process development these days, so their finFET efforts have been focused in the mobile SoC arena, with a multiple Vt process in development.

Another nugget from the day was the public announcement that Samsung is in full production with their 32-nm HKMG process, and it appears in Austin as well as Korea.  I was hoping that we might see it in the new iPad, but we’ve now confirmed that the A5x chip is 45-nm. I guess we’ll have to wait for one of the new phones or tablets that will be out soon. Actually, that includes TVs too – Samsung had a TV with gesture recognition on the show floor, powered by a 32-nm HKMG processor, and that’s due out next month as well.

The following day I was at an Intel analyst meeting, but that’s under NDA so I can’t say too much; but it’s not letting too much out to say that it reinforced their messages from CES and the Mobile World Congress that there will be a big push on Ultrabooks and mobile phones.  Next month expect a huge marketing campaign for Ultrabooks – it was described as “epic” and “cinematic” at CES. Even now we’re seeing all sorts of product announcements by the OEMs, including plenty with the 22-nm Ivy Bridge chip inside.

At the moment I’m in Shanghai taking in the China Semiconductor Technology International Conference and Semicon China. I’m presenting on “Recent Innovations in Leading-Edge Silicon Devices”; hopefully it will get a good reception. And we’ll see if there’s anything blog-worthy this week. In the meantime I tweet @ChipworksDick if anything is noteworthy.

ISSCC – Intel’s Ivy Bridge, Rosepoint, Near-Threshold Techniques

Thursday, February 23rd, 2012 by Dick James

Contributed by Vincent Karam.

Kicking off the afternoon of day 1 was the Ivy Bridge paper (3.1); the processor contains 1.4 billion transistors in an area of 160 mm² (for the die with four cores and two graphics segments). The IVB dies were shown in four configurations: 4+2 (4 cores, 2 graphics), 2+2, 4+1 and 2+1.

Here were some of the chip’s major highlights:

Quad-core with Intel Hyper-threading Technology

Next Generation Intel HD Graphics with DirectX 11 support

Dual channel DDR3-1600 or DDR3L-1333 interface

Integrated PCIe

Support for 3 displays simultaneously

Up to 8 MB cache memory

Same Thermal Design Power as predecessor

Next up was Intel’s 32nm Atom SoC with integrated WiFi codenamed Rosepoint (paper 3.4). Intel says it’s the first 32nm SoC with a WiFi transceiver and two Atom cores on the same die. They were able to get the Atom cores and the WiFi transceiver to get along nicely by choosing Atom processor frequencies such that their harmonics didn’t land in the WiFi frequency band.
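The frequency-planning idea can be illustrated with a short sketch. It only checks whether integer harmonics of a candidate clock land inside an RF band; the band limits (the 2.4 GHz ISM band), the harmonic count, and the function names are assumptions for illustration, not details from Intel's paper.

```python
# Hypothetical helper: which harmonics of a digital clock fall in an RF band?
# Assumption: the band of interest is the 2.4 GHz ISM band (2400-2483.5 MHz).
WIFI_BAND_MHZ = (2400.0, 2483.5)


def harmonics_in_band(clock_mhz, band=WIFI_BAND_MHZ, max_harmonic=10):
    """Return the harmonic numbers n for which n * clock_mhz lies inside the band."""
    low, high = band
    return [n for n in range(1, max_harmonic + 1) if low <= n * clock_mhz <= high]


def is_band_safe(clock_mhz, band=WIFI_BAND_MHZ, max_harmonic=10):
    """True if none of the first max_harmonic harmonics land in the band."""
    return not harmonics_in_band(clock_mhz, band, max_harmonic)


if __name__ == "__main__":
    for clock in (800.0, 1000.0, 1200.0, 1300.0):
        print(clock, harmonics_in_band(clock), is_band_safe(clock))
```

A real design would also have to consider spur levels, clock spreading, and the other WiFi bands; this sketch only captures the "keep the harmonics out of the band" rule of thumb.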

Intel announces Atom-based WiFi chip Source: Intel

This was an obvious example of the direction Intel would like to push RF, into a scalable technology that keeps up with Intel’s fabs, so think digital CMOS. For most RF designers, including myself, it’s hard to imagine, but it’s something all RF designers will have to come to grips with.  Over the past few years Intel has been converting traditionally analog blocks to fully digital circuits (LNA ISSCC’01, Synthesizer VLSI’10, T/R Switch ISSCC’11 to name a few). On Tuesday Intel will also be presenting an all digital PA and Phase modulator, so you can add those to the list.

Intel’s third microprocessor project, code-named Claremont, seems to have received more media attention for reasons other than those intended. This was a 32nm processor that demonstrated NTV (Near Threshold Voltage) operation as a means to optimize computational speed and energy (3.6). Although Intel used a solar cell to power the chip, Intel says they do not have any intention of producing solar powered CPUs (at least in the near future). Power consumption can be as low as 2 mW, and it can operate from as little as 280 mV up to the conventional 1.2 V.

Intel's Claremont processor using NTV technology Source: Intel

ISSCC – Samsung Presents Dual/Quad-Core 32-nm Exynos Processor

Thursday, February 23rd, 2012 by Dick James

Contributed by Mike Christie

At ISSCC 2012, Samsung presented paper 12.1 in the Multimedia and Communication SoCs session, entitled “A 32 nm High-k Metal Gate Application Processor with GHz Multi-Core CPU.”

In this presentation, Samsung discussed its next generation Exynos application processor – we just recently finished a functional analysis report on the Exynos 4210. Samsung have already announced the Exynos 4212 (dual-core Cortex A9) and Exynos 5250 (dual-core Cortex A15), both of which are 32 nm HKMG.  Presumably, this one will be another in the Exynos series.

The specs on this processor include: two or four ARMv7-A CPU cores, with a shared 1 MB cache memory, each with a full hardware vector floating-point unit and 64-bit ARM NEON single instruction multiple data (SIMD) engine; a GPU containing quad-core pixel processors, a single geometry processor, and its own 128 KB of dedicated L2 cache memory; and a dual channel DRAM interface capable of up to 6.4 GB/s and supporting LPDDR2, DDR2, and DDR3 400 MHz.

Block Diagram of Samsung's 32 nm Application Processor (source: Samsung/ISSCC)

The specs themselves are not particularly surprising, but the techniques Samsung has used to increase performance and reduce power consumption stand out.

The first major innovation is the switch from a poly-Si/SiON process to the HKMG process.  Their tests indicate that this process greatly reduces the leakage current in comparison to the old process (100x in gate leakage and 10x in overall leakage) with a 40% improvement in performance.

There are four power domains on the die: CPU, GPU, memory I/F, and media IPs. Each of these is further divided into power subdomains. For example, in the CPU domain, each of the cores is its own power domain, as is each half of the cache memory. This allows for various power schemes to be used, depending on the system requirements.

Samsung has introduced a number of methods of balancing power and performance on the die, so that only the hardware needed for an application is used. The first of these is called DVFS, or dynamic voltage frequency scaling. The DVFS unit adjusts the operating voltage and frequency of the active blocks to meet the performance requirements of an application. By doing this, they have improved the battery life (for AP and DRAM only) by 34% to 50%.
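To see why scaling voltage and frequency together saves power, here is a minimal sketch using the standard first-order CMOS relation for dynamic power, P ≈ C·V²·f. That relation is textbook material rather than something taken from Samsung's paper, and the operating-point table, effective capacitance, and function names are hypothetical.

```python
def dynamic_power(c_eff_farads, vdd_volts, freq_hz):
    """First-order CMOS dynamic power estimate: P = C * V^2 * f (in watts)."""
    return c_eff_farads * vdd_volts ** 2 * freq_hz


# Hypothetical (frequency in Hz, supply voltage in V) operating points.
OPERATING_POINTS = [(200e6, 0.9), (800e6, 1.0), (1200e6, 1.1), (1500e6, 1.2)]


def pick_operating_point(required_hz, points=OPERATING_POINTS):
    """Pick the slowest point that still meets the demand, else the fastest one."""
    for freq, vdd in sorted(points):
        if freq >= required_hz:
            return freq, vdd
    return max(points)


if __name__ == "__main__":
    c_eff = 1e-9  # assumed effective switched capacitance, in farads
    for demand in (150e6, 500e6, 1.4e9):
        freq, vdd = pick_operating_point(demand)
        print(demand, freq, vdd, dynamic_power(c_eff, vdd, freq))
```

Because voltage enters as a square, dropping to a lower operating point when the workload allows it cuts power by more than the frequency reduction alone would.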

The second method of balancing performance and power consumption is by means of body biasing. By using a Forward Body Bias, performance can be increased by as much as 13.5%, and by means of a Reverse Body Bias, it is possible to instead reduce the leakage current on a transistor, thereby reducing the power consumption.

Finally, there is the addition of a thermal management unit (TMU) which monitors the temperature of the system through thermal sensors, and maintains a constant temperature by throttling the various blocks, as necessary, through the DVFS controller. When the temperature in a block is determined to be too high, the performance will be reduced in order to maintain the temperature below a threshold. This not only helps to prevent burnout of the device, but also helps to reduce power consumption by as much as 32%.
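The throttling behaviour described here can be sketched as a simple control step that moves a block between performance levels. The trip temperature, the hysteresis band, and the one-level-at-a-time stepping are all assumptions made for illustration; the article does not describe how Samsung's TMU actually decides.

```python
def next_level(level, temp_c, max_level, trip_c=85.0, hysteresis_c=5.0):
    """Return the next performance level for a block given its temperature.

    Levels run from 0 (slowest) to max_level (fastest). The thresholds are
    hypothetical: step down at or above trip_c, step back up only once the
    block has cooled to trip_c - hysteresis_c, otherwise hold.
    """
    if temp_c >= trip_c:
        return max(level - 1, 0)          # too hot: throttle one step
    if temp_c <= trip_c - hysteresis_c:
        return min(level + 1, max_level)  # cool enough: restore one step
    return level                          # inside the hysteresis band: hold


if __name__ == "__main__":
    level = 3
    for temp in (70.0, 86.0, 90.0, 83.0, 78.0):
        level = next_level(level, temp, max_level=3)
        print(temp, level)
```

The hysteresis band is there to stop the level from oscillating when the temperature hovers near the trip point.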

Thermal Management Unit in Samsung's Application Processor (source: Samsung/ISSCC)

Samsung did not want to comment on the die size or many details of the process, as the part is soon to be released to the press, but they displayed the new chips at the demo sessions, along with their previous 45 nm generation, to show the improvements in performance.

ISSCC 2012 – Hynix eliminates dummy cells in 6F2 DDR3

Tuesday, February 21st, 2012 by rwilliamson

Contributed by Mike Christie.

The International Solid-State Circuits Conference (ISSCC) is in full swing. Chipworks is attending to track the newest ideas in circuits and chips for 2012 and is blogging a few notable highlights.

Hynix presented a paper entitled ‘A 1.2V 23nm 6F2 DDR3 SDRAM with Local-Bitline Sense Amplifier, Hybrid LIO Sense Amplifier and Dummy-Less Array Architecture’.

While the author touted a number of circuit innovations, including a modified sense amplifier and LIO amplifier, the most interesting modification they discussed was the removal of the need for dummy cells in the memory array.

In current DDR3 SDRAMs using 6F2 architecture the edge arrays are only half utilized as the sense amplifiers are located between arrays and require the bitline capacitance to be balanced.  This means that there are actually thirty-three arrays, two of which are only 50% in use. Therefore over 3% of the memory cells on the die are unusable. In fact, some DRAMs have an even higher fraction of cells that are unusable. For instance, in the Elpida 46nm 2Gb LP DDR2 SDRAM we are currently analyzing in our labs we see a full array as 25 sub-blocks (with 2 of these sub-blocks only ½ usable). As such, a full 4% of the cells on this DRAM serve no function.
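The percentages quoted above follow from simple arithmetic: two half-used edge arrays amount to one whole array of wasted cells. A minimal sketch of that calculation (the function name is only illustrative):

```python
def unusable_fraction(total_arrays, half_used_arrays=2):
    """Fraction of cells wasted when some arrays are only 50% usable."""
    return 0.5 * half_used_arrays / total_arrays


if __name__ == "__main__":
    # 33 arrays with two half-used edge arrays (the generic 6F2 case above)
    print(f"{unusable_fraction(33):.2%}")
    # 25 sub-blocks with two half-used (the Elpida part mentioned above)
    print(f"{unusable_fraction(25):.2%}")
```

With 33 arrays the wasted fraction is 1/33, just over 3%; with 25 sub-blocks it is 1/25, exactly 4%.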

Die Photo of Elpida B4064B2PF LP DDR2 SDRAM

Hynix have proposed using only thirty-two memory arrays, and modifying the sense amplifiers for the bitlines that terminate on the outside edge of the array.  In order to balance the bitline capacitance, there is an offset, which is dynamic and based on the data that is being sensed from the memory array. In the cut-throat world of commodity DRAM pricing this 3% cell usage gain (which would translate to about a 1.5% chip area reduction) should have a meaningful impact on product competitiveness.