07 Mar 2019

Much Ado about... Big Data

mishmash io


It's hard to write about Big Data. Especially when you want to avoid all the hype. There are interviews with business leaders, articles about success factors, there are coaches and their acronyms - the five V's of Big Data (the 6th 'V' is disputed), the 8 Z's, the 4 M's... '11 things I learned from Big Data'; studies on how much data is BIG data; treatises on whether unstructured data is bigger than structured data... Well, to say the least - it's confusing. Even to us.

I often wonder why we continue monitoring what's being said on the subject. And I honestly believe the term was chosen badly. It naturally makes you think about gigabytes, terabytes and petabytes, when in fact these don't matter. And I mean they don't matter at all.

The Internet would have us believe Google, Amazon, Facebook became huge because they were collecting data and so we should too. But no, it wasn't the size, nor the data. It was something else.

But before we get to what it was, let me tell you a story. It's a short one.

The first Data-driven business

17th century Europe. A young gentleman is very keen on making quick money. So much so that by the age of about 30 he develops a bad gambling habit. And he must have been a bad loser too; I'm sure he was asking himself the same question we all do - 'why always me?'. I'm sure of that because of what he did next. Unlike most of us, he actually sat down and began computing the answer. I doubt he won billions, but I can easily see him using the knowledge he got out of these computations to trick his more gullible opponents into placing bets at less favorable odds. Or in other words - he definitely got ahead of the competition.

Want to know the name? Rene Descartes. You see, the name does not matter much. The thing is - you can repeat that same story, changing the setting: the century, the place, etc., and you will, again and again, get another very well-known name. All gamblers, bad at losing, that we all learn about in school. Think of this in terms of branding - the verb 'to google' became synonymous with a certain quick way to find an answer to something you don't know. The adjective 'Cartesian' (yes, it comes from the name Descartes) denotes a certain approach to computing answers.

Let me repeat this, as it is important - those gamblers did not remain in history because they were flipping coins and dealing cards. Certainly not because they made money. They did so because, collectively, they showed us that even though there is a reality out there that will forever remain hidden from us - we can probe it and then deduce, by computation, what hidden factors are at play. We can understand what cannot be seen.
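If you'd like to see what 'deducing a hidden factor by computation' can look like, here is a minimal sketch in Python. It is only an illustration, not a reconstruction of anything those gamblers actually wrote down: the coin, its bias of 0.55 (`HIDDEN_BIAS`) and the helper `estimate_bias` are all made up for this example. The program never looks at the bias directly; it only sees the flips and computes an estimate with a rough 95% margin.

```python
import random


def estimate_bias(flips):
    """Estimate the probability of heads from observed flips.

    Returns the estimate and an approximate 95% margin of error
    (normal approximation to the binomial distribution).
    """
    n = len(flips)
    p_hat = sum(flips) / n  # True counts as 1, False as 0
    margin = 1.96 * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat, margin


# Hypothetical setup: a coin that is slightly unfair. This number is the
# "hidden factor"; the estimator below never gets to see it.
HIDDEN_BIAS = 0.55
rng = random.Random(7)  # fixed seed so the run is repeatable

# The only thing we can observe: the outcomes of 10,000 flips.
flips = [rng.random() < HIDDEN_BIAS for _ in range(10_000)]

p_hat, margin = estimate_bias(flips)
print(f"estimated chance of heads: {p_hat:.3f} +/- {margin:.3f}")
```

With enough flips the estimate settles close to the hidden value, which is exactly the kind of edge a gambler would want before placing the next bet.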

The second Data-driven business

1801, London. A 40 year old spy is collecting data on the hot business of the day - empire building. He was not a gambler; in fact he must have been very responsible in his job, because in his desire to help Britain beat France he produced the earliest known example of a modern market report - with pie charts and everything.

Want to know the name? Me too! So, I googled it - it's a certain William Playfair. Of course I don't want to ridicule his work; gathering intelligence does matter. But ironically even his name fits well the kind of opposition I want you to consider - what you see as reality VERSUS computing the hidden factors behind your observations. Playing fair VERSUS getting 'Cartesian'. Gathering intelligence VERSUS analysis.

One of the two approaches stuck, becoming a $30 bln/year business today. The other didn't. And this is an important distinction that I believe we should all consider every time we see the word 'analytics' in a business context. There are many vendors out there who will market their software as 'beautiful analytics', 'make decisions based on data', 'transform digitally' and so on and so forth. But the fact remains - one approach shows you your observations, the other digs for what remains hidden from view.

One is Big Pie Charts, the other is Big Data.

The similarities don't end there - Rene Descartes was flipping coins to collect data, yet his approach couldn't be more different. Here's an example why:

[Chart: Rene Descartes does pie charts]

Obvious, right? Want to make a business decision based on a chart like this? 'Cos he did. Although, if you noticed, in his story I did not say 'data'. I said 'computing'. Because this is what he did differently - he started computing how to make decisions.

And this is another major distinction in the world of business analytics today - a lot of us grab what's already there, tried and tested, solid and proven - the pie charts - even though this actually leaves us to our own reasoning to figure out what's hidden behind the data. Or in other words - it means we have to do what Descartes did.

A few of us though will gamble and say 'can I have something like... a million miniature, virtual Rene Descarteses who will be figuring out, day and night, what exactly is happening in reality and placing my best bets? Every single time?' Some of these few are Google, Facebook and Amazon.

How is that for a digital transformation? A computer making business decisions based on computational methods? Not me, not my best and greatest analysts and experts - a computer?!?

[Chart: Projections on digital transformation]

N.B. William Playfair is credited as the inventor of the bar chart too.

No way! How can I trust a computer?

And this is, in my opinion, why Descartes is NOT a household name. Why his work has to be literally 'taught into' us in school. As humans, we have somehow evolved to look for trust when decisions have to be made.

Hand on heart, have you never felt uneasy when you were presented with something like this:

[Chart: % Profit, based on sampled data]

Even though you know sampling is enough? Let me just casually drop another name here - Carl Friedrich Gauss. He's the one behind sampling. Hand on heart, next time you see sampled data, will you trust it just because you know it was due to him?

I bet not. And if I'm right - you have to give me one thing here - by adding just a little bit of computation I turned something you would have trusted into something that makes you feel uneasy.
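For the curious, here is a small Python sketch of why sampling can be enough. Everything in it is an assumption made for illustration: the 100,000 synthetic 'profit' figures (they are not the data behind any chart in this post), the sample size of 1,000 and the helper `sampled_vs_full`. It compares the average over all the records with the average over a random sample, and computes the standard error, which says how far off the sample is likely to be.

```python
import random
import statistics


def sampled_vs_full(population, sample_size, seed=0):
    """Compare the mean of a random sample with the mean of everything.

    Returns (full_mean, sample_mean, std_err), where std_err is the
    standard error of the sample mean: sample std dev / sqrt(sample size).
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # without replacement
    full_mean = statistics.fmean(population)
    sample_mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    return full_mean, sample_mean, std_err


# Synthetic data, invented for this sketch: 100,000 "% profit" values
# scattered around 12% with a spread of 5 percentage points.
data_rng = random.Random(42)
profits = [data_rng.gauss(12.0, 5.0) for _ in range(100_000)]

full_mean, sample_mean, std_err = sampled_vs_full(profits, 1_000)
print(f"mean of all 100,000 records: {full_mean:.2f}")
print(f"mean of a 1,000 sample:      {sample_mean:.2f} +/- {2 * std_err:.2f}")
```

The sample is 1% of the data, yet its average typically lands within a couple of standard errors of the full answer. The number itself barely changes; what changes is that you now have to trust a computation.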

You see, physicists trust those computational methods to uncover the hidden factors that can send people to the Moon and back (casually dropping Isaac Newton here), engineers trust those methods to make devices like the one you're using right now, geneticists... and countless others to whom you might attribute human progress. But they're not looking to make decisions, are they? Like, they have to uncover hidden factors and use them, not make decisions on how things should be.

In business and life in general - numbers are trusted, computation isn't.

Keep this in mind too.

As experts in deploying virtual Descarteses, Gausses and so on, we can tell you that you can't sell them by saying to clients 'with this - you don't have to make decisions', or by saying 'it works because it has computed that this is the only way you could have exactly these observations and not any others'; or by saying 'why are you even bothering with this, it's a computational problem, solvable by computers'; 'this will beat humans at analytics 10 out of 10 times'...

Also, in my experience - companies do understand Big Data is a Big Deal. But they don't go 'virtual-Cartesian', they still go Big Pie Charts, asking questions like - 'can we collect more data?' (read: so our pie charts will be more trustworthy), or 'can we have something where we can test hypotheses like ... are we going to sell more umbrellas when it's raining?' (read: we have to scale up guess-work) and so on.

The good thing about this blog is that we don't have to sell. So, we can talk about Big Data as it is. Stay tuned as we do our bit in demystifying Big Data.

In a series of posts we'll try to give you a better understanding of how broad the range of problems that can be solved by computing is (uncovering hidden motivational factors is just one of them); why Big Data is not a piece of software or a platform but actually a way of thinking and operating; why it's not a skill experts have but more akin to a business process; and why 'Big' and 'Data' denote the end of such a journey, not the beginning.

And most importantly, I hope, we will show you that the biggest digital transformation going on is the one that TRANS-FORMS the decision-making process in a company. Not the one that keeps the same old way of having reports and discussions in board-room meetings and merely makes the reports digital. It's the one where millions of decisions per second are made on behalf of the business.

Next time William continues to do intelligence. Rene is not impressed.

Author: Andrey Rusev, VP of Real-time Analytics


Categories: IBA Blog , mishmash of Big Data

Tags: #BigData #Analytics #Digital #Transformation
