BIG DATA

So what’s this?

It’s just a name for a problem that the industry is facing in this era.

The problem is that the data we generate is too big for any organization (Google, Facebook, Amazon, etc.) to handle in the ordinary way, but still they are handling it! So the question is: how are they managing?

You might be thinking: they are big companies, so they can buy any amount of storage, right? They can, but then another problem arises: I/O. The speed of reading and writing data on a single HDD/SSD is limited, so the more data you put on one device, the longer every read and write takes.
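To get a feel for the I/O problem, here is a rough back-of-the-envelope sketch. The throughput figure is an assumption picked only for illustration, not a measurement of any real disk, and the function name transfer_seconds is made up for this example.

```python
def transfer_seconds(size_gb: float, throughput_mb_per_s: float) -> float:
    """Seconds needed to stream size_gb through one disk at a steady throughput."""
    size_mb = size_gb * 1000  # decimal units: 1 GB = 1000 MB
    return size_mb / throughput_mb_per_s


# ASSUMPTION: an illustrative sustained throughput for one disk, not a benchmark.
ASSUMED_DISK_MB_PER_S = 150.0

# Time to write 1 TB (1000 GB) to that single disk.
one_tb_seconds = transfer_seconds(1000, ASSUMED_DISK_MB_PER_S)
print(f"1 TB on one disk: about {one_tb_seconds / 3600:.1f} hours")
```

Buying a bigger disk does not change this number, because the time depends on the throughput of the device and not on how much space it has.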

And it's not as if they only have to keep the data for 2–3 days. They have to keep it forever, so the problem of persistence comes up as well.

Solution

Distributed Storage

So what's this, again?
It’s a method of splitting the data across different servers and storing the pieces simultaneously, which saves time and gets around the I/O speed limit of a single disk.

You can imagine it yourself: if you save 10GB of data to a single storage device, it will take some time. But if you cut the data into ten 1GB parts and store each part on a different server, it will be much faster, because all 10 parts are being written at the same time instead of one big file being written to one disk.
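The idea above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: local folders stand in for separate servers, threads stand in for the network writes, and the names split_into_chunks, write_chunk, store_distributed, and read_back are made up for this example. Real tools such as Hadoop also handle replication and failures, which this sketch ignores.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the data into consecutive parts of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def write_chunk(node_dir: str, index: int, chunk: bytes) -> str:
    """Write one part into a folder that plays the role of a server."""
    path = os.path.join(node_dir, f"part-{index:04d}")
    with open(path, "wb") as f:
        f.write(chunk)
    return path


def store_distributed(data: bytes, node_dirs: list[str], chunk_size: int) -> list[str]:
    """Spread the parts over the nodes round-robin and write them concurrently."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=len(node_dirs)) as pool:
        futures = [
            pool.submit(write_chunk, node_dirs[i % len(node_dirs)], i, chunk)
            for i, chunk in enumerate(chunks)
        ]
        # Results are collected in submission order, so paths stay in part order.
        return [f.result() for f in futures]


def read_back(paths: list[str]) -> bytes:
    """Reassemble the original data by reading the parts in order."""
    parts = []
    for path in paths:
        with open(path, "rb") as f:
            parts.append(f.read())
    return b"".join(parts)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # ASSUMPTION: three local folders pretend to be three servers.
        nodes = [os.path.join(root, f"node-{n}") for n in range(3)]
        for node in nodes:
            os.makedirs(node)
        data = os.urandom(10 * 1024)  # 10 KB here instead of 10 GB
        paths = store_distributed(data, nodes, chunk_size=1024)
        print(len(paths), "parts written, intact:", read_back(paths) == data)
```

On one laptop the folders share the same disk, so this will not actually run faster. The speed-up in a real cluster comes from every server having its own disk and writing its part at the same time.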

And for that, we need some tools.
TOOLS:
Hadoop, Cassandra, Drill, and many more

Check out the following links to know more about the data that these tech giants are storing.

Facebook per day data
Google

A learner who is Learning From Anywhere