So here we’ve got a cloud and it’s a word cloud and it just illustrates that data is everywhere.

We live in the 21st century.

And wherever you go there’s data, and in fact there’s even a term for it: data exhaust.

So as you go about your day you’re constantly leaving this data exhaust, whether it’s through the text messages that you’re sending, or you’re going on Facebook, or you’re just simply walking through the city and different antennas are picking you up and you’re leaving GPS data.

So you’re constantly leaving this data exhaust whether you are aware of it or not. And probably the only way to not leave data exhaust is to go live in the forest for a couple of days.

So that’s the reality we live in, and this data is just growing exponentially; it’s accumulating all the time. And let’s see how this has been happening throughout the history of the human race.

So here we go.

Since the dawn of time up until 2005 humans had created 130 exabytes of data.

So you’re probably sitting there thinking, Kirill, that sounds really cool, but it is very hard to imagine what the heck an exabyte is, and that means nothing to me.


Fair enough.

So let’s go and drill into this to understand what this actually represents.

So here we’ve got the letter A, right? So the letter A is a single letter, and a letter on a computer takes up one byte of data.

Now if we zoom out a thousand times we have a thousand letters, and a thousand letters represents about a page in a small book, and that fits into one kilobyte. Now if we zoom out another thousand times we have a thousand kilobytes, which is a megabyte, and that is about a book with 500 pages, double sided.

Now if we zoom out another thousand times we get a gigabyte. And believe it or not you can fit the human genome onto one gigabyte. It actually takes about 725 megabytes of space, so if you think about it you can encode a whole human being on a one-gigabyte USB.

You’ll probably say that, you know, a human is not just his genome; it’s not just his DNA, it’s also the life experiences that he has.

You know, all the things that he does in his life, you can’t put that onto the gigabyte.

Well, how about we zoom out another thousand times and we get one terabyte. And now if you take an HD camera and you walk around with a person for their whole life, so for about 80 years, every minute, every second of their life, and you film everything they do and you put all of that in HD video, that will fit into one terabyte.

So that’s already getting impressive, right? Now let’s zoom out another thousand times, and this is one of my favorite representations. So here we’ve got the Amazon rain forest.

It has about 1.4 billion acres of trees, and every acre has about 500 trees in it. So that makes it about 700 billion trees in the Amazon rain forest. So if you take all of those trees and, hypothetically, I’m not saying this should be done, but hypothetically, if you were to chop them down, turn them into paper, and fill every single page of that paper with letters, with text, then you will get about one to two petabytes of data. So that should give you an idea of how huge a petabyte is.
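The tree count quoted above is simple enough to check by hand, but here is a minimal Python sketch of the multiplication. The variable names are made up for illustration, and the two inputs are just the rough figures from the lecture, not measured values.

```python
# Rough figures as quoted in the lecture (approximations, not measurements).
acres = 1_400_000_000      # about 1.4 billion acres of rain forest
trees_per_acre = 500       # about 500 trees on every acre

# Total trees = acres * trees per acre.
total_trees = acres * trees_per_acre

print(f"{total_trees:,} trees")  # prints "700,000,000,000 trees"
```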

Now if you zoom out another thousand times, that’s where you get an exabyte. An exabyte is a thousand petabytes. So now this should be a bit more impressive: since the dawn of time up until 2005 humans have created 130 exabytes of data, and that includes everything. It includes all of the books written, all of the songs that were sung, all of the words that were spoken.
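The ladder of units walked through here can be sketched in a few lines of Python. This is only a sketch, assuming decimal (SI) prefixes, where every step up is a factor of 1,000 and the step between a terabyte and an exabyte is a petabyte. The names UNITS, bytes_in and humanize are made up for this illustration.

```python
# Decimal (SI) storage units: each step up the list is a factor of 1,000.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte"]


def bytes_in(unit: str) -> int:
    """Return how many bytes one `unit` holds, using powers of 1,000."""
    return 1000 ** UNITS.index(unit)


def humanize(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    for unit in reversed(UNITS):
        size = bytes_in(unit)
        if n_bytes >= size:
            return f"{n_bytes / size:g} {unit}s"
    return f"{n_bytes} bytes"


if __name__ == "__main__":
    # Print the whole ladder, from one byte up to one exabyte.
    for unit in UNITS:
        print(f"1 {unit:<9} = 10^{3 * UNITS.index(unit)} bytes")

    # The human genome figure and the 2005 total quoted in the lecture.
    print(humanize(725 * bytes_in("megabyte")))
    print(humanize(130 * bytes_in("exabyte")))
```

Keeping the units in one ordered list means each unit's size falls straight out of its position, which mirrors the "zoom out another thousand times" idea.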

Everything that humans have created was about 130 exabytes of data. Now in 2010 that number was already 1,200 exabytes. So you can see that in the following five years humans created so much more data. Then in the next five years, in 2015, this number became 7,900 exabytes.
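To get a feel for how fast that is, here is a rough Python sketch that turns the 2005, 2010 and 2015 totals into growth factors. It assumes growth is smooth within each five-year window, which is a simplification, and the names data_exabytes and growth_between are made up for the example.

```python
# Totals quoted in the lecture, in exabytes.
data_exabytes = {2005: 130, 2010: 1200, 2015: 7900}


def growth_between(start_year, end_year, totals=data_exabytes):
    """Return (overall growth factor, implied compound annual growth rate)."""
    factor = totals[end_year] / totals[start_year]
    years = end_year - start_year
    # Assumption: the same percentage growth every year within the window.
    annual_rate = factor ** (1 / years) - 1
    return factor, annual_rate


for start, end in [(2005, 2010), (2010, 2015)]:
    factor, rate = growth_between(start, end)
    print(f"{start} -> {end}: x{factor:.1f} overall, about {rate:.0%} per year")
```

Under that assumption the totals work out to growth of roughly half again every year, which is what makes the curve bend upward so sharply.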

Then in 2020 this number is estimated to be 40,900 exabytes of data. As you can imagine, these numbers are astonishing. But what’s even more astonishing is the growth, the exponential growth of data that we’re seeing. And if you visualize it, it looks something like this.

This is the reality that we live in, the world that we live in: how quickly data is growing every single year.

So this is an actual study. It was done by IDC and it was sponsored by EMC.

So this is all researched.

Now what is not researched is these lines that I’m going to draw now, but this is my impression of what’s going on in the world.

This is where I think we are as data scientists: our capacity, as data scientists, to process the data that’s in the world. And then if you draw the machines, this is the line that represents how much of the data that exists in the real world machines are actually using.

And so where does machine learning come into play? Well, machine learning presents a potential to use this data that is not being used. Only equipped with machine learning algorithms can you step up to that challenge and make sense of, and add value from, the data that is growing so quickly, in fact exponentially.

And that is why machine learning is actually a degree that is being taught at universities in fact Huddleston.

