Big Data is Still Data

Yesterday I wrote a post entitled “Big Data’s Little Secret”. In that post, I argued that the ‘fear of missing out’ has become concomitant with the ever-quickening pace of technology’s advance. I further opined that, because of this fear, there is increasing potential that we’ll lose sight of the fact (if we haven’t already) that “Big Data” is still Data.

We’ve accumulated many, many years of expertise and understanding in the domains of Data and Data Management. We understand backup, recovery, and replication. Specialties and specialists have arisen from the sub-domains of data governance, security, and auditing. We recognize the business’s need for data when we consider its availability and plan for disaster preparedness. There are so many facets of data and its management that I could list only a very small percentage of them, yet I am hard pressed to come up with a single one that applies to what we classically call Data and not to “Big Data”.

It must have been serendipity, then, that I came across this Infographic at the IBM Big Data Hub.

From the Infographic:

There are certain things that cannot be overlooked when dealing with data. Best practices must be instituted for the care of big data just as they have long been in small data. […]


Enjoy …

Infographic: Taming Big Data - From IBM


Big Data’s Little Secret

In a recent Forbes magazine article, Howard Baldwin took an opportunity to whack the reader with an “obvious stick”. In this age of the ever-accelerating freight train that is Moore’s Law, it’s becoming easier every day to succumb to the fear of missing out. Nowhere is this more prevalent than with the topic of Big Data. My fear is that once we forget that “Big Data” is, at its core (and center and surface, for that matter), our long-familiar friend data, albeit with a new hairstyle, we face not the possibility but the likelihood of repeating the data disasters of the past. We have suffered for too many years and endured too much pain and anguish to allow a bit of buzzword excitement to blind us to reality or the obvious. We cannot afford to forget the data lessons of decades gone by.

Lest I be misunderstood, I am by no means advocating that we put on the brakes and convene an international summit and standards organization around Big Data before we continue on our merry way. I am saying, though, that if we back-burner any of the controls we currently have in place for the sake of expediency in this brave new Big Data world, we should do so with our eyes wide open and with a clear understanding of which controls we’re relaxing and why.

From the article:

Big data doesn’t make data management easier. It makes it harder. Companies that have had a difficult time mastering structured data aren’t going to magically master unstructured data. There are little stumbling blocks such as taxonomies, consistency, hierarchies, and so on that have always made getting to a single source of truth a challenge. Is it a zip code or a postal code? Is it a car, a truck, or a vehicle?

Without applying some rules, you could end up being more confused, with data that’s less reliable and less trustworthy than before. My advice: don’t start tackling big data unless you’re really confident that you’ve mastered data of any size.
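To make Baldwin’s zip code question a little more concrete, here is a minimal Python sketch of the sort of rule he is talking about: fold every spelling of a field into one agreed name and roll specific types up into a shared category. The alias table, the field names, and the `normalize_record` helper are all hypothetical, invented here for illustration; nothing below comes from the article or from any particular product.

```python
# Hypothetical alias table; a real one would come from a governed data dictionary.
FIELD_ALIASES = {
    "zip": "postal_code",
    "zip_code": "postal_code",
    "zipcode": "postal_code",
    "postal_code": "postal_code",
}

# Hypothetical roll-up: these specific types all count as a "vehicle".
VEHICLE_TYPES = {"car", "truck"}


def normalize_record(record):
    """Return a copy of record with canonical field names and a vehicle category."""
    normalized = {}
    for key, value in record.items():
        cleaned = key.strip().lower()
        canonical = FIELD_ALIASES.get(cleaned, cleaned)
        # Two aliases that disagree means there is no single source of truth.
        if canonical in normalized and normalized[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        normalized[canonical] = value

    kind = normalized.get("type")
    if isinstance(kind, str) and kind.lower() in VEHICLE_TYPES:
        normalized["category"] = "vehicle"
    return normalized
```

Fed `{"Zip": "30301", "type": "Truck"}` and `{"postal_code": "30301", "type": "car"}`, the helper files both records under `postal_code` and tags both as a `vehicle`. The hard part, of course, is not the dozen lines of code but getting everyone to agree on what goes in the tables, and that is true at any data size.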

Data Flow Diagram Example

Lastly, to Baldwin’s point: if you’ve not mastered ‘small data’, or at the very least gone and gotten the T-shirt, and you are still heading down the ‘Big Data’ trail with guns a-blazin’, I wish you luck. Give me a call when you’re mid-tunnel and discover that the light you’ve been heading toward is in fact an oncoming locomotive and not daylight, as you had hoped.

The article mentioned can be found at: Big Data’s Little Details – Forbes