Virtualization and the need for faster processors and more memory
Developments in virtualization have allowed many servers to be replaced by one server.
However, this development requires faster servers with more memory per server.
Each “virtual machine” (VM) that is run on a server requires a certain amount of memory.
If you have a fast processor, you can run more VMs on that one server – but then you need a correspondingly larger amount of memory to go with them.
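The arithmetic here is simple but worth making concrete. A quick sketch (all numbers below are hypothetical examples, not figures from this article):

```python
# Rough sketch of the VM-count arithmetic: CPU lets you *schedule* more VMs,
# but memory is the hard cap. All numbers are illustrative assumptions.

def max_vms(server_ram_gb: int, ram_per_vm_gb: int, reserved_gb: int = 4) -> int:
    """How many VMs fit on one server, after reserving RAM for the hypervisor."""
    return (server_ram_gb - reserved_gb) // ram_per_vm_gb

print(max_vms(server_ram_gb=96, ram_per_vm_gb=4))   # 23 VMs
print(max_vms(server_ram_gb=192, ram_per_vm_gb=4))  # 47 VMs
```

Doubling the installed memory roughly doubles the VM count – which is why a faster processor only pays off if the server can also hold (and keep at full speed) the extra memory.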
Processor vs. memory scaling
Processor speed/capability has been going up – driven by the availability of multi-core processors, which allow easy “scaling” of processor power.
Server memory, however, has not scaled as easily.
Adding more memory tends to degrade the signals on the memory bus – which limits the maximum achievable speed.
As a result, when you add more memory (so you can run more VMs on your very fast server), the memory speed drops – and much of the advantage of having a fast processor that can run many more VMs is lost.
You cannot install the extra memory for all those VMs without a performance-crippling slowdown in memory speed (down from 1333MHz to 1066MHz or even 800MHz).
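The capacity-vs-speed tradeoff above can be sketched with the speed steps the article quotes (1333 → 1066 → 800 MHz). The exact DPC-to-speed mapping varies by platform and DIMM type, so this table is illustrative only:

```python
# Capacity vs. speed as DIMMs-per-channel (DPC) increases, using the
# speed steps quoted in the article. The mapping below is an illustrative
# assumption for standard RDIMMs; real platforms vary.

RDIMM_SPEED_AT_DPC = {1: 1333, 2: 1066, 3: 800}  # MT/s, illustrative

def config(dpc: int, channels: int = 4, dimm_gb: int = 16):
    """Per-socket capacity and achievable speed at a given DPC."""
    capacity_gb = dpc * channels * dimm_gb
    return capacity_gb, RDIMM_SPEED_AT_DPC[dpc]

for dpc in (1, 2, 3):
    cap, speed = config(dpc)
    print(f"{dpc} DPC: {cap:3d} GB per socket at DDR3-{speed}")
```

Tripling capacity (1 DPC to 3 DPC) costs roughly 40% of the memory speed – this is the slowdown that load reduction is meant to eliminate.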
Netlist intellectual property
Netlist (NLST), which owns much of the IP behind “load reduction” and “rank multiplication” has had a solution available for some time – called HyperCloud (now available on Romley servers as IBM HCDIMM or HP HDIMM/HP Smart Memory HyperCloud).
Intel was pushing LRDIMMs (Load Reduced DIMMs) for Romley to solve the same problem. However prior to Romley no one had LRDIMMs available to evaluate – and their performance issues were only confirmed starting early 2012 (primarily from the Inphi LRDIMM blog).
As it stands now, both LRDIMMs and NLST HyperCloud are available from IBM and HP.
The problem is LRDIMMs are infringing on NLST IP – as is the DDR4 standard (which incorporates LRDIMM), as explained elsewhere here.
The differences are:
– LRDIMMs require a BIOS modification before the motherboard can understand LRDIMMs
– HyperCloud is plug and play and requires no BIOS updates of the motherboard (demonstrated on pre-Romley systems)
– LRDIMMs have a “5 ns latency penalty” compared to RDIMMs (from Inphi LRDIMM blog).
– NLST HyperCloud has latency similar to RDIMMs (a huge advantage) – a rather significant “4 clock latency improvement” over LRDIMMs (quote from a Netlist Craig-Hallum conference)
– LRDIMMs cannot deliver 1333MHz at 3 DPC (peak at 1066MHz at 3 DPC)
– NLST HyperCloud delivers 1333MHz at 3 DPC
– LRDIMMs are not interoperable with standard RDIMMs
– NLST HyperCloud are interoperable with standard RDIMMs (even though IBM and HP are marketing the HyperCloud in all-HyperCloud configurations – perhaps because you get maximum load reduction this way)
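The 3 DPC speed difference in the list above translates directly into peak bandwidth. A back-of-envelope sketch (peak per-channel DDR3 bandwidth is the transfer rate times the 8-byte bus width; the 4-channel socket count is an assumption typical of Romley, not a figure from the article):

```python
# Peak-bandwidth gap at 3 DPC, per the speeds quoted above:
# LRDIMM tops out at 1066 MT/s, HyperCloud sustains 1333 MT/s.
# channels=4 is an illustrative assumption for a Romley socket.

def peak_bw_gb_s(mt_s: int, channels: int = 4) -> float:
    """Theoretical peak bandwidth per socket: rate x 8-byte bus x channels."""
    return mt_s * 8 * channels / 1000  # GB/s

lrdimm = peak_bw_gb_s(1066)      # ~34.1 GB/s
hypercloud = peak_bw_gb_s(1333)  # ~42.7 GB/s
print(f"advantage: {hypercloud / lrdimm - 1:.0%}")  # ~25%
```

A roughly 25% peak-bandwidth advantage at full 3 DPC loading, before accounting for the latency difference.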
Inphi LRDIMMs infringement and execution
Despite the expected weaknesses in LRDIMMs, Intel persisted in its support for LRDIMMs right up to the Romley launch.
However the companies which traditionally supply buffer chips for memory modules have been less convinced.
Of the top 3 buffer chipset makers:
– IDTI has prudently scaled back its rhetoric over the course of a few quarters (as evidenced by the waning enthusiasm for LRDIMMs in their conference calls).
– IDTI has postponed LRDIMMs to the end of 2012 (i.e. skipping Romley and targeting the Ivy Bridge series, according to their conference call).
– Texas Instruments has not been interested in LRDIMMs – possibly related to settlement in Netlist vs. Texas Instruments a couple of years ago (it seems Texas Instruments may have been the original leaker of NLST IP to JEDEC).
– Only Inphi has persisted with LRDIMMs and are currently the only supplier of buffer chipsets for LRDIMMs.
Inphi has also been the most aggressive of the three – having challenged NLST’s right to their IP in patent reexamination challenges at the USPTO.
However, these challenges have strengthened Netlist’s position – prior art that has already been considered in reexamination is far harder to raise again in court.
The USPTO found in its reexamination proceedings (after examining all the prior art put forward by Inphi) that ALL claims in the Netlist ‘537 and ‘274 patents survive reexamination.
This is a powerful signal of things to come, and it places Inphi at a disadvantage when Netlist vs. Inphi resumes (the case was stayed pending reexamination of the Netlist patents). If the case proceeds and the judge rules that infringing product should be recalled, that would be problematic for Inphi – and possibly for LRDIMMs as a whole.
In reality such a recall may not occur.
A possibility is that since DDR4 copies the Netlist IP to a considerably greater extent, that eventual licensing of Netlist IP for DDR4 may include a license for LRDIMMs.
However, it would still leave those who bought LRDIMMs uncomfortable holding an end-of-life product with little prospect of support.
However as things stand, the legal issues with LRDIMMs are moot, as the LRDIMMs underperform the Netlist products at IBM and HP.
What prompted Intel to push LRDIMMs so aggressively despite the IP violation
There could be several reasons for this – Intel may have been under pressure to have a load-reduced solution for memory in time for Romley.
Even though Netlist was demoing HyperCloud memory well before Romley (since it works on pre-Romley), the push by Intel for LRDIMMs (even if they were underperforming) created the competitive environment that ensured load reduction solutions would be available at Romley rollout.
Impact of Intel’s push for LRDIMMs on OEMs
Conversely for Netlist, the push by Intel of LRDIMMs ensured that the OEMs were prepared to push load reduction solutions prominently with Romley.
This is important because the availability of load reduction does not help sell more server boxes (in fact it reduces the need to buy more servers – something the OEMs would not normally be happy about).
But with a new line of servers – like Romley – for which Intel is already pushing a load reduction solution (LRDIMMs), it creates a new environment where end-users are informed and expecting the benefits of load reduction to be available.
For this reason you have IBM and HP supporting load reduction solutions – with LRDIMMs, and also prominently featuring the benefits of HyperCloud as the only memory delivering maximum speed. They are also pricing both products similarly (for example IBM sells both a 16GB LRDIMM and a 16GB HCDIMM).
Competitive advantage for load reduction solutions
IBM and HP comprise the bulk of server sales – esp. in the virtualization/data center space (high memory loading – thus consumers of load reduction).
If DELL and others do not compete at same speeds, they will suffer from lack of a competitive solution for virtualization.
What to expect in the future
Expect load reduction to be a standard part of memory – as it will be required:
– at 3 DPC when you use 16GB memory modules
– at 3 DPC but also at 2 DPC and possibly 1 DPC even when you use 32GB memory modules (because 32GB RDIMMs will only be 4-rank for a while)
And will be required for next-generation memory – for 2013 and beyond:
– works at lower voltages
– works at higher frequencies than 1333MHz and 1600MHz
Both these factors (lower voltages and higher frequencies) make memory loading impact WORSE – the speeds required are not achievable without load reduction.
Which is why load reduction will be even more essential for DDR4.
This is why DDR4 includes not only the LRDIMM (which copied NLST IP) features, but goes further and copies the NLST HyperCloud distributed buffer architecture.
Some time before DDR4 is finalized, the industry will have to license the relevant technology (esp. in light of Inphi’s inability to overturn Netlist IP at the USPTO).
LRDIMMs might get legal cover at that time (if they have sold in sufficient number for there to be a need for such licensing).
Netlist has said their IP will allow DDR4 to reach the speeds that are being planned. Without use of this technology there is no other way to achieve those speeds.
Load reduction will be needed prior to DDR4 also
As it stands load reduction is essential at 3 DPC when using 16GB memory modules.
It will become essential at 3 DPC and 2 DPC and possibly even 1 DPC when 32GB memory modules start to be used (they are already available for Romley, but the market for 32GB is expected to be 3%-5% of the memory market in 2012 according to various analysts).
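The reason load reduction becomes unavoidable as module sizes grow is simple capacity arithmetic. A sketch, assuming a 2-socket Romley box with 4 channels per socket (an illustrative configuration, not one named in the article):

```python
# Why load reduction matters more as module sizes grow: the capacity you
# sacrifice by dropping DPC (to preserve speed) grows with module size.
# sockets=2, channels=4 are illustrative assumptions for a Romley server.

def total_gb(dimm_gb: int, dpc: int, channels: int = 4, sockets: int = 2) -> int:
    """Total installed memory for a given module size and DIMMs-per-channel."""
    return dimm_gb * dpc * channels * sockets

print(total_gb(16, 3))  # 384 GB with 16GB modules at 3 DPC
print(total_gb(32, 2))  # 512 GB with 32GB modules at only 2 DPC
print(total_gb(32, 3))  # 768 GB with 32GB modules at 3 DPC
```

To populate the 768GB configuration without dropping memory speed, load reduction is required – and with 4-rank 32GB RDIMMs the rank loading bites even at 2 DPC, which is why the text above flags 2 DPC (and possibly 1 DPC) as well.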
At the moment the OEMs are supporting 32GB LRDIMMs (16GB LRDIMMs cannot compete against 2-rank 16GB RDIMMs).
Here is a comparison of 32GB LRDIMMs vs. 32GB HyperCloud (which will be available at IBM as 32GB HCDIMM and at HP as 32GB HP Smart Memory HyperCloud mid-2012):
– 32GB market is anticipated to be a much smaller market than the 16GB market.
– 32GB LRDIMMs are listed in the HP and IBM docs – which suggests they are available – however at least IBM had them listed as “Available later in 2012” – perhaps you can get them from IBM or HP right now, or from the resellers.
– 32GB LRDIMMs are expensive (much more than 2x the 16GB LRDIMMs)
– 32GB LRDIMMs are slower than 32GB HyperCloud.
– 32GB LRDIMMs have higher latency than 32GB HyperCloud.
– 32GB LRDIMMs have legal risk associated with them (of recall or cancellation)
– 32GB HyperCloud will also be cheaper than 32GB LRDIMMs – because the 32GB LRDIMMs use 4Gbit x 2 (DDP) DRAM, while the NLST 32GB HyperCloud uses monolithic 4Gbit DRAM and leverages Netlist’s Planar-X IP to build the 32GB module.
So in summary, this might have been Intel’s gameplan for LRDIMMs:
– Intel recognized the value of load reduction
– with increasing processor speed – the high memory loading problem was rapidly becoming mainstream (2 DPC users would be wanting to become 3 DPC users)
– virtualization and cloud computing all require large-memory servers
– therefore the need to push “load-reduction” into the mainstream market for Romley
– push LRDIMMs (even if Intel was unsure whether LRDIMMs infringed NLST IP) to ensure a load reduction solution would be delivered at the Romley rollout (both LRDIMMs and NLST HyperCloud are now available at IBM and HP)
– this Intel push for load reduction thus proved beneficial for the end-user