Would non-volatile DRAM have reduced Amazon outage ?

NVvault non-volatile DDR3 for Romley

UPDATE: 07/04/2012: Amazon took 3 hours to boot servers

Recently we heard about storms knocking out power at Amazon data centers.

The power outage was for 9 minutes – but the time for recovery was much longer.

Would the use of non-volatile DRAM memory modules have reduced the time to get back to a “consistent state” for end-users ?

http://allthingsd.com/20120630/storm-knocks-out-amazons-power-taking-down-instagram-netflix-pinterest/
Storm Knocks Out Amazon’s Power, Taking Down Instagram, Netflix, Pinterest
June 30, 2012 at 11:19 am PT
Arik Hesseldahl

Amazon confirmed that the power outage had affected its services in a statement on its AWS status dashboard at 8:40 pm PT. While power was restored only nine minutes later, technicians worked through the night to bring servers and customer data back to normal. The outage affected its Elastic Cloud Compute, Relational Database and Elastic Beanstalk services.

8:40 PM PDT We can confirm that a large number of instances in a single Availability Zone have lost power due to electrical storms in the area. We are actively working to restore power, Amazon technicians worked through the night to restore services and to recover data from applications running when the storm hit.

8:49 PM PDT Power has been restored to the impacted Availability Zone and we are working to bring impacted instances and volumes back online.

As of 8:38 am PT, about 12 hours after the power failure, Amazon said it was still working on restoring user data. It warned users to expect some inconsistent data, and gave detailed instructions on how to check to see if their data was damaged.

Jun 30, 8:38 AM PDT We are continuing our recovery efforts for the remaining EC2 instances and EBS volumes. We are beginning to successfully provision additional Elastic Load Balancers. As a result of the power outage, some EBS volumes may have inconsistent data. …

So what happened at Amazon ?

The power went out for 9 minutes and the reason they could not recover everyone’s data immediately was that they could not restore user data to a “consistent state”.

The user data being referred to is something like virtual machine instances – the Amazon servers run many instances on each of their virtualization servers.

Main memory lost on a single server leads to the loss of main memory for a number of these VM instances.

Thus when you lose the transient in-memory data for thousands of these instances, that becomes a huge recovery operation.

Using a DRAM to flash backup solution (which kicks in only in the case of power failure) one can ensure that the volatile data in the DRAM is backed up at the appropriate time – without having to resort to more complicated (and slower) solutions which cache the main memory while it is being used in normal operation.

Power failure will also affect mirrored RDIMM type strategies – since all RDIMMs lose power.

The value of non-volatile memory

On most servers, the volatile areas are:

– the processor state (which is backed up whenever you have a context switch between applications/processes) – so this is certainly backup-able
.
– the state of the RAM (which is in gigabytes and usually not backed up – where to store it that fast ?)
.
– then the state of any RAID cards (temporary RAID data stored there)

Then you have the SSD/Hard Disk storage which is non-volatile (although some cache information there could be volatile as well).

What if all the volatile information could be backed up in the case of power failure, and restored “9 minutes later” (when Amazon got power again) to the state the computers and their internal data were in ?

NVvault – RAID cards

The current generation of RAID cards already employs non-volatile memory so the temporary RAID data is not lost.

An example of this is Netlist’s NVvault memory – which has so far been used on Dell’s PERC RAID cards to store the temporary RAID data. These are used on Dell servers.

http://content.dell.com/us/en/business/d/help-me-choose/hmc-raid-controller-12g
Help Me Choose: RAID Controller

The NVvault memory modules are used with LSI MegaRAID CacheVault cards:

http://www.lsi.com/channel/marketing/Pages/LSI-MegaRAID-CacheVault-Technology.aspx
LSI MegaRAID CacheVault Technology

While RAID is designed to be resilient to hard disk loss – so that the remaining hard disks can take over and data can be recovered, there is a bit of data that is kept by the RAID processing itself. If this data is lost, it becomes very hard to restore data.

Netlist is the largest supplier of such non-volatile memory cards for RAID – and sold such cards in volume to Dell for its PERC RAID cards:

http://www.netlist.com/investors/investors.html
Fourth Quarter and Full Year 2011 Conference Call
Tuesday, February 28 5:00pm ET
http://78449.choruscall.com/netlist/netlist120228.mp3

at the 18:25 minute mark ..

As you know we introduced NVvault (NetVault) last year for the current Westmere generation and have been shipping to server and storage OEMs including DELL and Compellent for integration into the RAID subsystem.

at the 18:35 minute mark ..

In fact, we are the only supplier to ship in high volume to a major OEM for this class of data restoration product.

In the event of a power failure, the temporary RAID data held in the RAID memory is backed up immediately to onboard flash memory.

The flash memory resides ON the memory module itself – so the memory module contains BOTH DRAM and flash memory.

Power is supplied by an “ultracapacitor/supercapacitor”. This provides sufficient power to enable the DRAM to flash transfer, even if there is no power being supplied to the server.

The result is a battery-less, maintenance free solution – Netlist suggests a $500 per server per year cost savings over previous battery-backed solutions, because you avoid an annual or periodic technician visit to the server farm (to replace all the batteries attached to the memory modules).

The use of battery-less non-volatile memory creates the opportunity for use in consumer products (where no technician would be available to replace the batteries).

Netlist previously shipped battery-based NVvault, and later moved to battery-less (ultracapacitor/supercapacitor-powered) NVvault – both to Dell for PERC.

Processor state

Processor state is the most frequently saved data – this happens on every context switch, when the processor switches from running one application/process to another.

This is not that much data and is a solvable problem.

Main memory state

The main problem is the main memory – what do you do when you have 256GB or more of main memory (DRAM memory modules) and the power goes out ?

How can you back all that RAM up in a second – or a few seconds ?

The solution obviously has to be parallel – in order to scale to high memory loading applications – 768GB on a 2-socket server cannot be backed up in time if you do it serially. But it can be done in a small amount of time if the backup media is on the memory module itself – each memory module backs up its DRAM to onboard flash.

As it is with Netlist’s NVvault.

With backup media on the memory module itself, the solution is scalable – i.e. as you add memory modules to a server, they each (in parallel) are able to backup the DRAM data to the onboard flash on the same memory module.
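
A back-of-the-envelope sketch of why the backup has to be parallel. The module size, module counts and per-module flash write bandwidth below are illustrative assumptions, not Netlist specifications, and `backup_seconds` is a hypothetical helper. The point is only that the serial time grows with the number of modules while the per-module time does not.

```python
def backup_seconds(n_modules, module_gb, write_mb_per_s):
    """Return (serial, parallel) backup times in seconds.

    serial:   all DRAM is drained through a single backup device.
    parallel: every module copies its own DRAM to its own onboard flash.
    """
    mb_per_gb = 1024
    serial = n_modules * module_gb * mb_per_gb / write_mb_per_s
    parallel = module_gb * mb_per_gb / write_mb_per_s
    return serial, parallel


# Assumed figures: 8GB modules, 256 MB/s of flash write bandwidth per device.
for n_modules in (8, 24):
    serial, parallel = backup_seconds(n_modules, module_gb=8, write_mb_per_s=256)
    print(f"{n_modules} modules: serial {serial:.0f}s, parallel {parallel:.0f}s")
```

Under these assumptions the parallel time stays at the single-module figure no matter how many modules are installed, which is what makes the on-module approach scale.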

NVvault – DDR3 memory

In the paragraphs above we discussed Netlist’s NVvault DDR2 memory module that was used on Dell’s PERC RAID cards to store the RAID temporary data in case of a power failure.

It included onboard flash in sufficient quantity to backup the contents of the DRAM on the same memory module.

Netlist was selling DDR2 memory of this type for use on Dell PERC RAID cards.

Netlist has now expanded that capability to DDR3 memory modules for Romley.

What this means is that you could have a Romley server now running main memory that is non-volatile. In the case of power failure, the contents of the DRAM would be stored to onboard flash. After the “9 minutes of downtime” (as Amazon experienced), you could restore the state of main memory back to its original state on power-up – data copied earlier to flash would be restored to DRAM on the memory module and you would have restored memory state perfectly.

Anyone who has used a Windows-based computer is familiar with its “Hibernate” feature (Start Menu – Shutdown options). This allows you to save the running state of the processor/main memory to a dedicated file on the hard disk (hiberfil.sys) – so the disk needs enough free space to hold the saved contents of the DRAM in your computer/laptop.

With NVvault DDR3 memory you are saving the DRAM contents to flash on the same memory module. The only extra equipment you require is the ultracapacitor/supercapacitor connection to each DDR3 NVvault memory module in your server – this supplies power long enough for DRAM to flash data transfer to take place in the event of a power failure.
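
The save-on-power-fail and restore-on-power-up sequence described above can be sketched as a toy model. This is only an illustration of the idea, not vendor firmware: the `NVModule` class and its method names are hypothetical, and real modules do this in hardware under a controller.

```python
class NVModule:
    """Toy model of a DRAM module with onboard flash (not vendor firmware)."""

    def __init__(self, size):
        self.dram = bytearray(size)   # volatile: lost when power drops
        self.flash = bytearray(size)  # non-volatile: survives power loss
        self.saved = False

    def on_power_fail(self):
        # Runs on supercapacitor power: copy DRAM into onboard flash.
        self.flash[:] = self.dram
        self.saved = True

    def lose_power(self):
        # Once the capacitor is drained, DRAM contents are gone.
        self.dram = bytearray(len(self.dram))

    def on_power_up(self):
        # Restore DRAM from flash if a valid image was saved.
        if self.saved:
            self.dram[:] = self.flash
            self.saved = False


modules = [NVModule(16) for _ in range(4)]
for i, m in enumerate(modules):
    m.dram[0] = i + 1          # some in-memory state

for m in modules:              # each module backs itself up independently
    m.on_power_fail()
    m.lose_power()

for m in modules:              # power returns: each module restores itself
    m.on_power_up()

print([m.dram[0] for m in modules])
```

Each module only ever touches its own DRAM and flash, so nothing in the sequence depends on the hard disk or on the other modules.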

Moving from DDR2 to DDR3 NVvault

The potential for Netlist’s DDR2 NVvault was limited to sales related to Dell PERC RAID cards – these required 512MB-1GB DDR2 memory modules for use on LSI MegaRAID CacheVault products.

For DDR2 NVvault:

http://www.netlist.com/products/vault/nvvault-dd2/
Non-Volatile Cache Data Protection

Even though Netlist is the highest volume supplier in the industry for this type of product, the memory sizes are small (512MB-1GB). Thus the revenue per unit is relatively small.

Yet the majority of Netlist’s revenue is currently derived from this product. Netlist is at EBITDA breakeven just on the basis of the NVvault and flash products.

Netlist will add revenue from new products for Romley – HyperCloud (IBM/HP) and VLP memory (IBM blade servers) have just started ramping – this will add new revenue in 2012.

NVvault is also transitioning to Romley – Netlist’s DDR3 NVvault for Romley will greatly expand use and will also carry a higher per-unit price, as the DDR3 modules will be larger than the 512MB-1GB DDR2 modules that were sold for RAID card use.

For DDR3 NVvault:

http://www.netlist.com/products/vault/nvvault-dd3/
Non-Volatile DIMM for Cache Data Protection

Netlist collaboration with Intel

Netlist has pointed to NVvault use for data centers which would allow seamless recovery.

And they have discussed their collaboration with Intel to bring NVvault to Romley as DDR3 main memory on the server. Netlist points to Intel interest in the storage and RAID space – where a non-volatile memory module has value as a safe place for temporary/cache data storage, and also for direct storage (in the case of RAM hard disks).

http://www.netlist.com/investors/investors.html
UBS Global Technology and Services Conference
Thursday, November 17, 2011 9:00:00 AM ET
http://cc.talkpoint.com/ubsx001/111511a_im/?entity=63_EIUMYWQ

at the 16:15 minute mark:

So that wraps the .. HyperCloud IP part of our product line.

We’ll touch briefly on the NVvault – think of Vault as a safe – it’s a safe place to put your data.

So we created – along with work with .. uh .. several of our large OEMs .. DELL in particular .. a way to get rid of batteries in caching applications by using a combination of DRAM and flash.

at the 16:40 minute mark:

And oh and .. while doing that .. and we’ve done several generations of work WITH the battery .. so WE were trying to get rid of the battery as well as our customer’s trying to get rid of the battery .. we found a very viable solution today .. and we’ve expanded this into a family of products.

So we make these products for RAID caching and our new DDR3 NVvault is available to go directly into the memory bus (DIMM sockets/slots) for next-generation Intel servers.

at the 17:05 minute mark:

So we are working closely with Intel (INTC) on that. But it encompasses .. uh .. some of our IP in a digital controller .. we have put flash on one side, DRAM on the other and then you see over on the (referring to slides) .. on the left the little .. uh .. ultracapacitor backup.

So all that does is hold enough charge to mirror the data from the DRAM into the flash – it does that in about 30 seconds .. and then when the power comes back up on the system in about 4 seconds it pulls it right back into the DRAM and you are operating.

So how many of you have ever shut down your computer or had a power go out on you when you are in the middle of something ? You have something like this it would really protect you from that.

And you can imagine in a data center how important that is .. to be able to cache that.

at the 17:40 minute mark:

So you can see that NLST and Intel (INTC) are well-aligned – on the Vault product we are working directly to bring that to market.

That helps Intel move more into the storage and RAID adapter area – something they are interested in .. and along the HyperCloud, with the DDR4, Intel’s already proposing a distributed architecture .. uh .. to JEDEC.

at the 18:05 minute mark:

And .. that closes the technology .. uh .. gaps .. today.

at the 23:10 minute mark:

Chris Lopes:

Well that is a good question .. so .. the .. the collaboration with Intel (INTC) right now is primarily around our NVvault product for the Romley servers.

So we are building a combination DRAM and flash – which is similar to the “hybrid memory cube” .. uh .. although we match the densities .. identical ..

So 2GB, 4GB, 8GB of DRAM backed by 2GB or 4GB or 8GB of flash .. and that works right into a memory .. directly on the memory bus (i.e. DIMM sockets/slot) ..

at the 23:35 minute mark:

So if there are 24 sockets on that new Romley based server, you can fill all 24 (sockets/slots) with that and really create quite an effective .. you know .. virtual SSD .. uh .. running at DRAM memory bus speeds.

Uh .. the “hybrid memory cube” .. I’m seeing some interesting write-ups on that .. uh .. it seems to be geared at first for some more mobile applications .. difficult to get the densities there (probably means that for mobile applications it would be difficult to create large sized memories that fit in small form factor) ..

at the 24:00 minute mark:

Uh .. but that trend is a very positive one for us.

We have already looked at combining our IP on the multi-ranking (“rank multiplication”) with HyperCloud with the controller technology we are developing on flash .. to look at building something that might look similar to that .. uh .. with a combination DRAM and flash .. but NOT with an exact matching of .. densities (i.e. c.f. the “hybrid memory cube”).

at the 24:25 minute mark:

So you imagine you got large flash .. which is the lowest cost per bit .. uh .. memory out there .. buffered by high speed DRAM .. and you .. give you the best of both worlds ..

And there is some significant IP challenges .. uh .. to doing that .. uh .. effectively .. but we think we have a head start ..
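
The figures quoted in the transcript above can be combined into a rough estimate. The sketch below assumes (this is not stated in the call) that the 30-second backup and 4-second restore times apply to the largest 8GB module; the constant names are illustrative.

```python
SOCKETS = 24      # DIMM sockets on the server, as quoted
MODULE_GB = 8     # largest DRAM/flash density quoted
BACKUP_S = 30     # "about 30 seconds" DRAM-to-flash (assumed to be for 8GB)
RESTORE_S = 4     # "about 4 seconds" flash-to-DRAM (assumed to be for 8GB)

capacity_gb = SOCKETS * MODULE_GB
backup_mb_s = MODULE_GB * 1024 / BACKUP_S
restore_mb_s = MODULE_GB * 1024 / RESTORE_S

print(f"capacity with all sockets filled: {capacity_gb}GB")
print(f"implied per-module backup rate:  {backup_mb_s:.0f} MB/s")
print(f"implied per-module restore rate: {restore_mb_s:.0f} MB/s")
```

If the quoted times were measured on a smaller module, the implied rates scale down proportionally.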

Availability of a non-volatile DDR3 memory module would greatly simplify affairs for:

– in-memory databases – like SAP’s HANA
.
– RAID being done directly on the server – with non-volatile main memory for storing critical RAID information
.
– use in network devices which could recover state
.
– DRAM-based backup storage – or virtual SSD with faster than flash access speed
.
– non-volatile battery-less NVvault requiring little maintenance being used in consumer devices which need to maintain state

SAP HANA is an in-memory database solution being pushed by SAP to compete with Oracle – the emphasis on in-memory usage leads to huge improvements in database processing capability:

http://www.bluefinsolutions.com/insights/blog/sap_hana_adapt_or_die/
SAP HANA – Adapt or Die?
11 May 2011 Business Intelligence (BI), HANA, In-Memory, Emerging Technologies

http://www.sap.com/hana/index.epx
http://www.sap.com/hana/overview/index.epx

Netlist comments on the “data recovery capability on the main memory bus”:

http://www.netlist.com/investors/investors.html
Fourth Quarter and Full Year 2011 Conference Call
Tuesday, February 28 5:00pm ET
http://78449.choruscall.com/netlist/netlist120228.mp3

at the 18:50 minute mark ..

However the Romley will open up many other opportunities for DDR3 NVvault.

For the first time with Romley, the CPU will enable data recovery in main memory.

There are several advantages to having a data recovery capability on the main memory bus, compared to the traditional way of backing up data on the RAID card.

These include speed, efficiency and higher reliability.

Because of these advantages, coupled with the enormous market reach of the Romley platform, we believe that this will open a much larger customer base for the NVvault product line in the years ahead.

Netlist comments on the transition from DDR2 NVvault addressing pre-Romley RAID cards to DDR3 NVvault as main memory in Romley servers:

http://seekingalpha.com/article/592411-netlist-s-ceo-discusses-q1-2012-results-earnings-call-transcript
Netlist’s CEO Discusses Q1 2012 Results – Earnings Call Transcript
May 15, 2012

In its place we will be ramping up a multitude of brand-new products starting this quarter that will serve as the foundation of the company’s top-line and bottom-line growth over the next several years. As you know, HyperCloud, VLP, Planar-X RDIMMs and NV3 are all fruits of the company’s long-term vision and investment of tens of millions of dollars over the past several years.

Much like the PERC business, these will become high-volume products, but unlike the PERC, the average selling price and gross margin dollars will be an order of magnitude higher because these new memory products are high-density and high-performance. As such, we expect the business scale quite rapidly once these products gain traction in the marketplace.

Also, importantly, unlike the PERC which was a single product — single customer product limited to Dell servers, all of the new products are targeted at the entire server and storage space. Therefore, we expect to see the benefit of supplying numerous major customers in terms of our revenue profile and customer diversification.

Finally, while the PERC went up and down with the traditional two to three-year server cycle, HCDIMM, NV3 and Planar-X are multigenerational products based on fundamental groundbreaking IP. They will retain their competitive edge through DDR4 all the way to the end of the decade.

To understand the difference between DDR2 NVvault used in RAID cards (LSI MegaRAID CacheVault etc.) vs. DDR3 NVvault which can be used as general purpose main memory in servers, check out this paper:

http://research.microsoft.com/pubs/160853/asplos206-narayanan.pdf
Whole-System Persistence
Dushyanth Narayanan
Orion Hodson
Microsoft Research, Cambridge

Competitors

There are a couple of competitors – however Netlist is the largest supplier of such memory modules as pointed out above.

One is AgigA Tech (maker of AGIGARAM), a subsidiary of Cypress, which suggests use of non-volatile memory as a replacement for the need for UPS power:

http://www.agigatech.com/pdf/pdf_ProductBrief_DDR3.pdf

Another is Viking:

http://www.vikingmodular.com/products/arxcis/arxcis.html
DDR2 version

http://www.vikingmodular.com/products/arxcis/ddr2/ddr2.html
DDR3 version

However, these do not seem to have the extensive experience that Netlist has accumulated through its relationship with Dell on PERC RAID cards.

Netlist sold over one million battery-backed versions (which used to be called NetVault instead of NVvault) to Dell for use on their PERC RAID cards:

http://www.prnewswire.com/news-releases/netlist-ships-more-than-one-million-battery-backed-memory-modules-92948629.html
Netlist Ships More Than One Million Battery-Backed Memory Modules
May 6, 2010

UPDATE: 07/04/2012: Amazon took 3 hours to boot servers

http://www.pcmag.com/article2/0,2817,2406682,00.asp
Amazon Blames Power, Generator Failure for Outage
By Chloe Albanesius
July 3, 2012 05:42pm EST

It turns out that an abrupt power outage like that is pretty bad for the cloud. Though the backup generators finally started to restore power just 10 minutes into this second outage (power was fully restored 10 minutes after that), Amazon technicians soon discovered that it was going to take them about three hours to reboot affected servers in the data center and that this delay would be compounded by several bugs in their cloud software that they hadn’t known about.

It seems that even if the servers had non-volatile DRAM, it would take 3 hours just to reboot the servers.

Then there were load balancing and other issues that are pointed out in the article above.

There is mention of the 3 hour delay related to a “bottleneck in the server booting process”, and that there was a problem with routing of traffic between zones:

http://www.networkworld.com/news/2012/070312-aws-outages-260646.html
Amazon takes blame for outages, bugs and bottlenecks
Amazon’s market-leading cloud suffers outage caused by power failure, restarting bottlenecks and a multiple software bugs, bringing down Netflix and other customers Friday night
By Brandon Butler, Network World
July 03, 2012 12:01 PM ET

As a result, for more than an hour between 8:04 and 9:10 p.m. PDT on Friday, customers were unable to create new EC2 instances or EBS volumes. The “vast majority” of the instances came back online between 11:15 p.m. PDT and just after midnight, AWS says, but that was delayed somewhat because of a bottleneck in the server booting process due to the large number of reboot requests. AWS says removing the bottleneck is an area they will work to improve on in the case of a power failure.

That didn’t seem to work on Friday though. On Saturday, Cockcroft tweeted, “We only lost hardware in one zone, we replicate data over three. Problem was traffic routing was broken across all zones.”

Here is a timeline/explanation from Amazon of the events that transpired:

http://aws.amazon.com/message/67457/
Summary of the AWS Service Event in the US East Region
July 2, 2012

4 responses to “Would non-volatile DRAM have reduced Amazon outage ?”

  1. Pingback: Examining Netlist | ddr3memory

  2. One has to also address the cost issue.

    How much would people be willing to pay in order to reduce the recovery time?

    The cost differential between NVRAM and regular RAM is not small especially when multiplied across thousands of servers.
    The Cloud infrastructure provider will have to demand a premium for the servers that offer faster recovery time.
    Will end customers agree to pay?
    After all, failures happen only about 3x per year (so far).

    • Yes, you are right – that NVvault would be more expensive than regular memory.

      Plus NVvault type of products would not have max capacity – since you need space for the flash also – so current non-volatile memory modules usually have max of 4GB or 8GB capacity (instead of 16GB, 32GB which is the max for regular memory modules).

      However, theoretically, with the cost of flash probably a fraction of the cost of DRAM – it can be envisioned that a slight cost premium be paid to acquire the non-volatile version.

      It may not happen now, but I suspect it may become the norm later.

      I don’t know how Amazon would calculate the loss of a catastrophic event (even if it happens every 3 years).

      Are there use cases which WOULD require that type of fault tolerance ?

      Yes, the network devices and all that which is usually mentioned.

      However, I think in-memory databases would be one use which could benefit from it.

      Or “enterprise database” stuff – like the IBM x3850 type servers are designed for – they specifically mention SAP’s HANA in-memory database type of use.

      For example this server that David Watts mentioned – it has reference to use with SAP HANA – is designed for “enterprise database”:

      http://www.redbooks.ibm.com/abstracts/tips0817.html?Open

      IBM System x3850 X5 Product Guide

      In those use cases, they may be willing to pay the premium – if they are already paying it for the RAS – reliability, availability, serviceability etc. ..

  3. Pingback: Memory for in-memory databases and SAP HANA | ddr3memory
