Memory has its faultsand not only the human variety.
Hard drives, for instance, can hold terabytes cheaply. But
theyre slow. Random-access memory (RAM) is fast but expensive, and
data in RAM disappear the instant the power goes off. Flash memory
is faster than hard drives and cheaper than RAM, and it retains
information. But the way it handles data writes handicaps its
usefulness in server environments.
Still, the distinct advantages of flashparticularly its ability
to retain data when power is off, as well as its speedled two
Microsoft Research scientists, Sudipta
Sengupta and Jin
Li, to develop what they call FlashStore. Its a flash-based
bridge between RAM and a hard drive that helps to overcome the
handicaps of each while maximizing flash memorys capabilities and
minimizing its weaknesses.
FlashStore operates as a key-value store and uses a key to
access the value associated with each piece of a data record. It
supports the operations of read, write, update, and deletion of
such data records. A third collaborator on FlashStore, Biplob
Debnath, was a research intern at Microsoft and now works for EMC,
a network-storage and data-recovery firm.
FlashStore has potentially significant usefulness across a range
of computing applications, from server farms to cloud applications.
It already has shown it can speed up online gaming for Xbox LIVE players, as
well as data-intensive server applications.
Sudipta Sengupta
Flash memory sits conveniently in the huge gap between RAM and a
hard disk in terms of both cost and performance. With its
properties of low power consumption, physical ruggedness, and small
size, flash has enabled new experiences with many consumer
electronic devices for more than a decade. But it is only recently
that flash is seeing widespread adoption in desktop and server
applications, in the form of solid-state drives. New applications
of flash involve different storage-access patterns from those in
typical consumer devices and, because of flashs device properties,
pose new challenges to its ability to deliver sustained high
throughput and low latency.
Flash is great technology, but it requires intelligent software
to make the best use of it, says Sengupta, a researcher at
Microsoft Research Redmond. Thats where our work comes in.
Sengupta and Li have written several papers for technical
conferences on intelligent software design for utilizing flash
memory, one of which is
FlashStore: High Throughput Persistent Key-Value Store,
which appeared in the 36th International Conference on Very Large
Data Bases in September 2010.
Peculiarities of Flash Memory
In FlashStore, flash sits between a hard drive and RAM, acting
as a high-speed holding area for frequently used data. FlashStore
works by overcoming flash memorys drawbacks. These relate to the
way in which flash stores and manages data. In flash memory, data
is not as easily overwritten as it is on a hard drive. Flash memory
starts in an erased state, then collects data in page-sized units
at a time, where a page can vary in size from 512 bytes to eight
kilobytes, depending on the device.
But if you need to erase data, you cant do it a page at a time.
Typically, an erase block consists of 32 to 64 data pages. Li
compares the way flash works to a line of water buckets.
Jin Li
The write operation is like pouring water into a series of
buckets, says Li, a principal researcher with Microsoft Research
Redmonds Communication
and Collaboration Systems group. The cheapest flash devices are
designed so that if you need to erase a bankthe blockyou have to
drain the water from all the buckets. You cant drain just one
bucket.
To handle data writes and reads, the memory controller in a
flash-memory device uses a mapping system called Flash Translation
Layer (FTL), which maps the logical page address of the incoming
data to the actual physical page location of the data in flash
memory. If a page of data is rewritten, it is written to a new
location with an updated FTL entry.
During a process known in computing circles as garbage
collection, the flash controller takes valid data from a memory
block, as well as new data, and writes both to a new memory block.
The freed memory block then can be erased for storing new datain
effect, draining all the buckets in one memory block. But this
process isnt efficient, in large part because of the requirement to
reclaim space with pages that are rendered invalid by such
operations. In time, repeated erase cycles also degrade the life
expectancy of each flash memory block.
Flash also writes only to pagesnothing smaller. Data written to
a page that is less than the page size, typically 512 bytes to
eight kilobytes, make poor use of that page. Any attempt to write a
small chunk of data within a page means the controller will have to
move the existing data to another page, along with the new
data.
Flash works fine on consumer electronic devices such as cameras
and digital music players, where each photo or music file is at
least megabytes in size, and data are written to flash mostly
sequentially.
But flash performance drops drastically when data writes are
random, which is the case for many desktop computing and server
applications. Even todays high-performance, flash-based
disk-replacement devices experience a dramatic drop in access
latencies and throughput when performing random write
operations.
Then there is the cost aspect of flash memory, which is about 10
times more expensive than hard disk per GBand about 10 times less
expensive than RAM. Flash makes most sense for applications that
can identify and exploit the sweet spot between cost and
performance.
FlashStore Efficiency
To make flash work efficiently enough to be more attractive on
the price-performance tradeoff for heavy-duty server and cloud
computing, FlashStores creators had to create a more flash-friendly
data structure while avoiding random data writes to flash. They did
that by giving FlashStore three important design features.
First, FlashStore introduces flash as a cache in the memory
hierarchy between RAM and hard disk so that the flash memory holds
a working set of datainformation a computer is most apt to need.
FlashStore tracks data use so that, as data becomes less frequently
accessed, it eventually is sent back to the hard drive so that new,
fresh data can take its place. That makes better use of flash while
still getting much faster retrievals than are possible with a hard
drive.
Second, FlashStore is designed to eliminate random writes. It
organizes data in a log structure on flash so that new data sent to
flash does not lead to random writes and, hence, is not subject to
garbage collection by the device. FlashStore aggregates small
writes from the application into a write buffer in memory and
writes to flash when there is enough data to fill a flash page. By
making writes sequential to flash, FlashStore makes much better use
of flash-device architecture. If the application needs strict
data-durability guarantees, FlashStore can aggregate writes in
memory up to a timeout interval provided by the application or when
a flash page worth of data is amassed, whichever comes first.
Finally, FlashStore is designed to use RAM efficiently by
employing a specialized RAM index to access the data on flash. It
uses a hash-table-based index in RAM that is designed to save space
along both the vertical and horizontal dimensionsthe number of
slots and the size of each slot, respectively. To reduce the number
of slots, it uses a variant of cuckoo hashing that achieves high
hash-table load factors while keeping lookup times fast. To reduce
the size of each slot, it stores a compact key signature of about
two bytes in each slot instead of the full key, which could be tens
of bytes or larger, depending on the application. Each slot also
stores a pointer, typically four bytes, that points to the full
key-value pair record on flash. During key access, the pointer to
flash is followed only if the signature in the hash-table slot
matches that of the key being searched. These techniques keep the
RAM usage frugalat about six bytes per key-value pair, independent
of the key-value pair sizeand key-access times fast, involving at
most one flash read per lookup on the average.
In continuing work on a system calledSkimpyStash
, a project that will be described in a research paper to be
delivered during the 2011 conference of the Association for
Computing Machinerys Special Interest Group on Management of Data,
the Microsoft Research scientists have reduced the memory
requirements of FlashStore even more, about six-fold, to about one
byte per key-value pair. SkimpyStash reduces memory usage by making
a tradeoff with key-access times and by using multiple flash reads
per lookup, which is still orders of magnitude faster than
hard-disk lookup times. The tradeoff choice between memory usage
and key lookup times in SkimpyStash is adjustable and can be made
by the application using a simple parameter.
FlashStore Performance
FlashStores results are impressive. Server systems using
FlashStore can show as much as several tens of factors of increased
data throughput compared with an ordinary RAM/hard-drive
configuration and existing software that is not flash-aware, such
as a key-value store like Berkeley DB. When compared with simple
hard-disk replacement with flash but no changes in software, as in
using a flash-unaware key-value store, FlashStore demonstrated
several factors of improvement in performanceunderscoring the
impact of using flash-aware data structures and algorithms in
FlashStore. It could enable the creation of much less expensive and
more power-efficient computing designs. In some cases, Sengupta
says, system designers buy arrays of hard disks that remain only
partially filled with data, because they need additional disk heads
for higher input/output (IO) operations per second and not for disk
space. FlashStore can be used to absorb the IO intensive operations
at the flash layer, while leaving the hard drive to handle
operations involving large data size that amortize disk-seek time
overheads.
You could replace 10 to 20 hard drives with one flash drive
using FlashStore for such applications, he says. That gives you a
capital-expenditure savings, power savings, and
operational-expenditure savings, as well, and you also get much
faster throughput: an all-win situation on the three metrics of
price, power, and performance.
Applications That Benefit
FlashStore is showing its potential in practical applications.
Xbox LIVE Primetime 1 vs. 100, for example, operates a
challenging computing environmentwhen a player initiates actions
within a game environment, those actions must be logged and
communicated to all other participants in a timely manner to
maintain the responsiveness of the gaming experience. Game activity
can scale up or down rapidly, depending on the number of online
players. A games ability to track and update thousands of player
actions relies heavily on the back-end systems ability to map what
the players are doing and retrieve changes in game play.
The Xbox LIVE Primetime 1 vs. 100 back end uses database
servers running on hard drives to log and update game-state
changes. But, Sengupta says, this results in a lot of redundant
hardwareand latency issues as the drives retrieve and disseminate
data. A flash-based system would be ideal, he says, because flashs
speed could accelerate and improve the responsiveness of online
game play. It also could hold relevant data until it is no longer
needed, then send it to a hard drive for post-game analytics.
The FlashStore team replayed Xbox LIVE Primetime 1 vs.
100 game traces to test whether FlashStore could boost
performance. It didperforming operations 60 times faster than a
standard RAM/hard-drive configuration and five times as fast as a
flash-unaware key-value store running on a top-of-the line
commercial flash-based drive.
FlashStore also achieved significantly better results than a
RAM/hard drive or a flash drive when tested with the task of
performing data-chunk indexing for data deduplication, a method of
reducing storage-capacity needs by eliminating redundant data. The
FlashStore research team also has built a complete
flash-assisted-storage deduplication system, called ChunkStash. The
team reported its design and evaluation in a research paper
entitled ChunkStash:
Speeding Up Inline Storage Deduplication Using Flash
Memory, presented in 2010 during the USENIX Annual
Technical Conference.
A FlashStore-like application also might find utility in areas
such as ad-sponsored online searches, where relevant advertisements
need to be summoned in response to certain queries with tight time
constraints on advertisement selection and ranking. While RAM-based
data processing is a choice for such latency-sensitive
applications, a flash-based solution such as FlashStore could
provide acceptable performance in many cases at a fraction of the
hardware cost.
Flash memory has come a long way from its beginnings in consumer
electronic devices such as digital cameras and portable music
players. It is seeing widespread deployment in desktops and
servers, spanning consumer, enterprise, and cloud applications.
These developments present the computing industry with
opportunities to identify new applications for flash memoryas well
as challenges across the software and hardware stacks to get
maximum performance at lowest cost.
That, Sengupta says, shows the relevance and timeliness of our
efforts.