Memory for in-memory databases and SAP HANA

High memory loading and protection from power loss

In-memory database (IMDB) solutions move data off the hard drive (slow disk access) and into DRAM main memory (fast but expensive), so the server needs more memory.

Here the bottleneck is the high cost of memory – when you load 384GB of memory on a 2-socket server, the cost of memory dwarfs the server cost. For this reason compression is often used to reduce total memory requirements (it slows things down a bit, but is still faster than going to hard disk).

Even with compression, the total memory requirements for in-memory databases can get quite large.

Memory for in-memory databases

When you add a lot of memory to a server, it creates the “high memory loading” issues mentioned in other articles here – requiring load reduction and rank multiplication techniques (Netlist IP) – which can be addressed by using LRDIMMs/HyperCloud memory modules.

Memory choice remains the same as for virtualization servers – the OEMs generally have standardized memory. HP, for example, offers:

– HP Smart Memory RDIMM
– HP Smart Memory LRDIMM
– HP Smart Memory HyperCloud

All have the same type of error recovery features.

The IBM x3850 server addresses the “enterprise database” market:

http://www.redbooks.ibm.com/abstracts/tips0817.html?Open
IBM System x3850 X5 Product Guide

On the IBM x3850 server the currently qualified memory is:

– RDIMM
– LRDIMM

HyperCloud is currently not available on this server, but when it becomes available it would be preferable to the LRDIMMs (which have performance, latency, price and IP issues).

As examined in this article:

https://ddr3memory.wordpress.com/2012/06/29/infographic-memory-buying-guide-for-romley-2-socket-servers/
Infographic – memory buying guide for Romley 2-socket servers
June 29, 2012

Servers like the IBM x3850 can also have proprietary memory solutions available, like the IBM MAX5, which expands memory beyond the Intel PoR (plan of record). These solutions may introduce latency or speed penalties (going through the QPI interface to reach the MAX5 memory expansion card, for example, adds latency), but for in-memory database applications the end result may still be faster than using traditional database applications.

One solution would be to not use the proprietary memory expansion capabilities like MAX5 and go with the load reduction solutions like LRDIMMs/HyperCloud which are now available for Romley servers.

On current 2-socket servers with 24 DIMM slots, you can expand memory to 768GB running at 1333MHz (with 32GB HyperCloud when it becomes available mid-2012). With LRDIMMs you can reach 768GB at 1066MHz. 32GB RDIMMs (which will be 4-rank for the foreseeable future) cannot deliver 768GB because of rank limitations – they top out at 512GB at 800MHz. The 32GB HyperCloud modules, which use 4Gbit monolithic memory packages, should be cheaper than the 32GB RDIMMs and 32GB LRDIMMs.

With 4-socket servers, you can just double that number to 1.5TB running at an achievable speed of 1333MHz (with 32GB HyperCloud).
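The capacity figures above are simple slot arithmetic, which can be sketched as below. This is only an illustration: the function name is made up, and the layout of 4 memory channels per socket with 3 slots per channel is an assumption chosen to match the 24-slot figure. The 2 populated slots per channel for 4-rank RDIMMs is likewise inferred from the 512GB figure, not taken from a vendor document.

```python
def max_capacity_gb(sockets, channels_per_socket, dimms_per_channel, dimm_gb):
    """Total memory in GB for a fully populated configuration."""
    return sockets * channels_per_socket * dimms_per_channel * dimm_gb


# Assumed layout: 4 channels per socket, 3 slots per channel (24 slots on 2 sockets).
full_2s = max_capacity_gb(2, 4, 3, 32)    # 768 (32GB HyperCloud or LRDIMM in every slot)

# Assumption: 4-rank RDIMMs can only populate 2 of the 3 slots per channel.
rdimm_2s = max_capacity_gb(2, 4, 2, 32)   # 512

# 4-socket server, same per-socket layout.
full_4s = max_capacity_gb(4, 4, 3, 32)    # 1536, i.e. 1.5TB

print(full_2s, rdimm_2s, full_4s)
```

The speeds quoted (1333MHz, 1066MHz, 800MHz) are not modeled here – they depend on the module type and on how many slots per channel are populated.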

On the fragility of memory modules

Memory is susceptible to errors – errors caused by radiation (cosmic rays, for example) and other sources. In addition there can be differences between DRAM dies which make one DRAM have more errors than another.

For some background on DRAM error probabilities:

http://storagemojo.com/2009/10/10/nightmare-on-dimm-street/
Nightmare on DIMM street
by Robin Harris on Saturday, 10 October, 2009

http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf
DRAM errors in the wild: a large-scale field study

With large amounts of memory, the probability of a single-bit error SOMEWHERE in your 1.5TB of memory goes up (i.e. it is roughly double what it would be for 768GB, and so on). And the probability of two-bit errors goes up as well.
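To see why doubling the memory roughly doubles the chance of an error somewhere, here is a minimal sketch. It assumes every GB of memory fails independently with the same small probability over some fixed period. The value `p_per_gb` is purely hypothetical and is not taken from the field study linked above.

```python
def p_any_error(p_per_gb, capacity_gb):
    """Probability of at least one error somewhere in capacity_gb of memory,
    assuming each GB independently has probability p_per_gb of an error."""
    return 1.0 - (1.0 - p_per_gb) ** capacity_gb


p_per_gb = 1e-5  # hypothetical per-GB error probability, for illustration only

p_768 = p_any_error(p_per_gb, 768)
p_1536 = p_any_error(p_per_gb, 1536)
ratio = p_1536 / p_768

print(p_768, p_1536, ratio)
```

Under this model the ratio works out to exactly 2 minus the 768GB probability, so it sits just under 2 while errors are rare and drifts further below 2 as the per-GB probability grows.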

An analogy can be made with RAID5 – RAID5 was seen as dangerous to use after people realized that with extremely large hard disk sizes (and consequently very long times for RAID recovery), the probability of an error occurring DURING a RAID recovery operation could not be ignored.

Similar consideration needs to be given to DRAM memory errors and to recovery from those errors.

For in-memory database applications, where the whole database needs to be in memory, even a single error could invalidate the integrity of the whole database.

Enter servers with RAS capability

For these reasons, in-memory database applications like SAP HANA tend to favor servers that have additional capabilities for managing memory errors and for swapping in spares for faulty memory modules.

SAP HANA is an in-memory database solution being pushed by SAP to compete with Oracle – the emphasis on in-memory usage leads to huge improvements in database processing capability:

http://www.bluefinsolutions.com/insights/blog/sap_hana_adapt_or_die/
SAP HANA – Adapt or Die?
11 May 2011 Business Intelligence (BI), HANA, In-Memory, Emerging Technologies

http://www.sap.com/hana/index.epx
http://www.sap.com/hana/overview/index.epx

For background on how in-memory databases may become more common in the future:

http://storagegaga.com/sap-wants-to-kill-oracle/
SAP wants to kill Oracle
By cfheoh | May 5, 2012 | Acquisition, Oracle, SAP, Violin Memory

http://www.readwriteweb.com/cloud/2012/06/vmwares-database-play-disk-is-the-new-tape.php/
VMware’s Database Play: “Disk Is the New Tape”
Scott M. Fulton· June 7th, 2012

To address the memory error problem, Westmere (pre-Romley) and the new Intel E7 (Romley) systems have RAS capability (Reliability, Availability and Serviceability).

http://www.intel.com/content/www/us/en/servers/reliability-availability-and-serviceability-for-the-always-on-enterprise-paper.html
Intel® Processor-based Server Platforms: Enhanced RAS Capabilities

Servers like the IBM x3850 explicitly mention support for in-memory databases and SAP HANA in their docs:

http://www.redbooks.ibm.com/abstracts/tips0817.html?Open
IBM System x3850 X5 Product Guide

This IBM system, for example, mentions SAP’s HANA in-memory database capability prominently – and offers RAS. RAS-capable servers can do DRAM mirroring, and also have options for reserving DRAM memory modules as spares, which take over from memory modules that exceed a certain error count limit.
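The sparing idea (a reserved module takes over once another module exceeds an error count limit) can be sketched as a toy model. This is not how IBM or Intel implement it – on a real server this logic lives in the memory controller and firmware. The class, the field names and the limit value here are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SparingPolicy:
    """Toy model: swap in a spare when a module exceeds error_limit."""
    error_limit: int                      # hypothetical corrected-error limit
    spares: list                          # names of reserved spare modules
    error_counts: dict = field(default_factory=dict)
    remapped: dict = field(default_factory=dict)

    def record_corrected_error(self, dimm):
        """Count one corrected error on dimm.

        Returns the spare that takes over, or None if no swap happens."""
        if dimm in self.remapped:
            return None  # this module has already been replaced by a spare
        self.error_counts[dimm] = self.error_counts.get(dimm, 0) + 1
        if self.error_counts[dimm] > self.error_limit and self.spares:
            spare = self.spares.pop(0)
            self.remapped[dimm] = spare
            return spare
        return None


policy = SparingPolicy(error_limit=2, spares=["SPARE0"])
results = [policy.record_corrected_error("DIMM3") for _ in range(3)]
print(results)  # the third error exceeds the limit of 2, so SPARE0 takes over
```

Note that the spare is capacity you pay for but cannot use until something fails, and mirroring goes further by giving up half of the installed memory to hold the second copy.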

In addition the OEMs have various techniques (“Chipkill” from IBM for example) for dealing with the memory errors on the memory modules.

http://en.wikipedia.org/wiki/Chipkill
Chipkill

Most of these techniques are available on the full range of memory products offered by the OEM.

– RDIMM
– HyperCloud – which are compatible with RDIMMs
– LRDIMM – which are a new standard and incompatible with RDIMMs

Non-volatile DDR3 memory modules

In-memory databases have a greater vulnerability to power loss because a lot of data is in the volatile DRAM – when power goes, so does the data.

While the techniques mentioned above tackle memory errors, they will not help you in case of a power outage – where all the data in the DRAM will be lost.

In-memory databases store all their data in memory, and constantly writing that data out to slower secondary storage like SSDs or hard disks would slow things down. It therefore becomes very important to have features that can save memory module data AFTER a power loss event has been detected (instead of at some earlier point as a precaution).

For an examination of how non-volatile DDR3 memory could enable fast recovery after power loss:

https://ddr3memory.wordpress.com/2012/07/03/would-non-volatile-dram-have-reduced-amazon-outage/
Would non-volatile DRAM have reduced Amazon outage ?
July 3, 2012
