Another bit of Big Data humor, courtesy of Daniel Gutierrez, Managing Editor of insideBigData. This time, the topic is the (ir)rational fear of statistics:
As the pace of data generation increases, the amount of that data we want to store for some period will also increase. At some point (if it hasn’t happened already), the volume of data we want to store will exceed our manufacturing capacity for hard drives (mechanical and solid-state). Now is probably the time to start thinking about how we reintroduce tape libraries into our data processing stacks.
Given how much faster the streaming rate is for data coming off tape than for data pulled from a hard drive, I wonder how difficult it would be to create a MapReduce job that gets its input from locally attached tape drives instead of traditional storage. With enough up-front thought and design for that kind of processing, I think it would be a very interesting experiment.
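To make the idea a little more concrete, here is a minimal sketch of a word-count job that consumes its input strictly front to back, which is the access pattern a tape drive favors. It is an illustration under stated assumptions, not a real Hadoop job: the function names (`stream_records`, `map_record`, `reduce_counts`, `run_job`) and the block size are hypothetical, and an in-memory buffer stands in for the tape device.

```python
import io
from collections import Counter
from typing import BinaryIO, Iterable, Iterator

# Hypothetical read size; real tape block sizes depend on the drive and format.
BLOCK_SIZE = 256 * 1024


def stream_records(source: BinaryIO, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield newline-delimited records from a forward-only stream, never seeking."""
    leftover = b""
    while True:
        block = source.read(block_size)
        if not block:
            break
        lines = (leftover + block).split(b"\n")
        leftover = lines.pop()  # the last piece may be an incomplete record
        yield from lines
    if leftover:
        yield leftover


def map_record(record: bytes) -> Iterator[tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one record."""
    for word in record.decode("utf-8", errors="replace").split():
        yield word.lower(), 1


def reduce_counts(pairs: Iterable[tuple[str, int]]) -> Counter:
    """Reduce step: sum the emitted values per key."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def run_job(source: BinaryIO, block_size: int = BLOCK_SIZE) -> Counter:
    """Run map and reduce in a single sequential pass over the source."""
    pairs = (
        pair
        for record in stream_records(source, block_size)
        for pair in map_record(record)
    )
    return reduce_counts(pairs)


if __name__ == "__main__":
    # An in-memory buffer stands in for the tape; a real device would be
    # opened in binary mode instead (assumption: it behaves like a plain stream).
    fake_tape = io.BytesIO(b"tape is old\nTape is back\n")
    print(run_job(fake_tape, block_size=8))
```

The point of the design is that nothing ever seeks backwards: records are cut out of fixed-size blocks as they arrive, so the reader could keep a drive streaming rather than repositioning. Splitting the work across several drives, and handling tape-specific details such as file marks, is left out of this sketch.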
Tape is the oldest computer storage medium still in use. It was first put to work on a UNIVAC computer in 1951. But although tape sales have been falling since 2008 and dropped by 14% in 2012, according to the Santa Clara Consulting Group, tape’s decline has now gone into reverse: sales grew by 1% in the last quarter of 2012 and a 3% rise is expected this year.
As the article points out, ever since “Moneyball”, the search has been on for the next big Big Data play in sports. Because the signal-to-noise ratio keeps falling as data volumes grow, it has not yet been found.
How long will it be before the signal-to-noise problem of increasing data volumes causes a backlash against the Big Data phenomenon in financial services, insurance and business in general?
The more data we collect, the harder it is to filter signal from noise, according to renowned statistician Nate Silver. One sport where this truth recently became evident is cricket.