R&D - Technologies & Strategies for Research & Development

Editor's Take
Data Big Gulp
July 17, 2008

This morning I completed a long-overdue mailbox clean-up. You know, the intensive one that purges three-month-old messages with 300 KB attachments that you thought you were going to need soon but never did.

The effort blew away some 30 MB of not-really-useful data, greatly simplifying my digital life. Managing the home computer with the 120 GB hard drive is a different matter altogether. My troubles, however, pale in comparison to those of researchers who sequence genes or study samples using light-sheet fluorescent imaging. They have terabyte problems.

This week’s inaugural meeting of the Information Overload Research Group (IORG) in New York City seems to suggest there is a data pandemic, with the group calling this overload the “world’s greatest challenge to productivity.”

Certainly, the monolithic piles of 0s and 1s have already pestered high-level researchers, many of whom are producing monstrous data sets from physics R&D. For example, a University of Chicago team last fall produced the world’s largest compressible, homogeneous isotropic turbulence simulation. The effort generated 154 TB in 75 million files. The transfer of just 23 TB of this data to different computers took three weeks. Government-funded researchers are attempting to build distributed computer grids to help solve what has become a “petascale” problem, but these efforts are still in their infancy.

Even research on data overload itself has burgeoned in the past few years (IORG cites 16 notable studies on email overload since 1999), and most experts recognize that data management and storage will become a significant theoretical and engineering challenge in the coming years. This philosophically recursive R&D work reveals some obvious but still unfortunate findings. For example, an email that is not responded to within 24 hours (often this means an “8-hour” workday) will likely remain unanswered altogether. Companies such as Microsoft are developing probabilistic machine learning tools to help people triage email automatically and reduce the number of unnecessary emails.

No question, interruptions to productivity (such as the one I’m writing now!) are bad for efficiency, but I have a competing theory: the more data you have, the more likely you will find a solution.

You just need to learn how to find what you need. And delete the rest.

© 2008 Advantage Business Media. All rights reserved.