
Or "You have Big Data". Paraphrasing Greg Young's idea, "If you can put it on a Micro SD card that you can buy on Amazon, you don't have big data". So if your data fits in 128GB ($37.77 on Amazon right now) you don't need big data solutions.



I would say that the definition of what is big largely depends on the problem you're trying to solve. If that problem is finding keywords in text files, then your definition sounds about right. For other problems, even a couple of KB might be big. To me, big is when your dataset is too big to solve your problem in reasonable time on one machine.


Right. I'm not comfortable with this idea of big data being dependent on the actual file size of the data. There really are problems where approaches like reading files directly or using a relational database break down, and you need something more specialized, even if the dataset is just a couple of GBs. (So in my mind, big data is a reference to specific tools or approaches like map-reduce, etc.)
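
To make the map-reduce reference concrete, here's a toy sketch of the pattern in plain Python, not tied to any real framework; the input lines are made up for illustration. A map step emits partial results per record, and a reduce step merges them:

    from collections import Counter
    from functools import reduce

    def map_phase(line):
        # Map: emit partial word counts for one record.
        return Counter(line.split())

    def reduce_phase(acc, partial):
        # Reduce: merge partial counts into the accumulator.
        acc.update(partial)
        return acc

    lines = ["big data is big", "data is data"]
    counts = reduce(reduce_phase, map(map_phase, lines), Counter())
    print(counts)  # Counter({'data': 3, 'big': 2, 'is': 2})

The point of the shape is that the map and reduce steps can be spread across many machines when the data demands it; the structure, not the data size, is what makes it a "big data" approach.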


My personal criteria are along the lines of "if it doesn't fit on a single box (up to a cubic meter)" or "if it's so much that it's impractical to move around". If either holds, you can say you have big data.

If just maxing out the computer's RAM and CPU count solves your problem, then it's not big data.
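
As a rough sketch of what "just max out the CPU count" looks like on one box (the per-chunk work here is a stand-in for whatever the real job does):

    from multiprocessing import Pool

    def crunch(chunk):
        # Stand-in for the actual per-chunk computation.
        return sum(chunk)

    if __name__ == "__main__":
        chunks = [range(i, i + 1_000_000) for i in range(0, 8_000_000, 1_000_000)]
        with Pool() as pool:  # one worker per CPU core by default
            partials = pool.map(crunch, chunks)
        print(sum(partials))

If this kind of single-machine fan-out gets you to an answer in acceptable time, no cluster is needed.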


Thank you thank you thank you (all) for pointing out the obvious resume padding.

Suck relevant DB entries into memory. There is no step 2.
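
A minimal sketch of that step 1, assuming a hypothetical SQLite database with an invented `events` table; the column names and filter are made up for illustration:

    import sqlite3

    # Step 1: pull only the rows the job actually needs into memory.
    conn = sqlite3.connect("app.db")
    rows = conn.execute(
        "SELECT user_id, amount FROM events WHERE created_at >= ?",
        ("2024-01-01",),
    ).fetchall()
    conn.close()

    # Everything after this is plain in-memory work; no cluster required.
    total_by_user = {}
    for user_id, amount in rows:
        total_by_user[user_id] = total_by_user.get(user_id, 0) + amount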


Step 2 is obviously writing synchronisation primitives to flush it back to a permanent medium.

Step 3 is using a real database.

Circles into square pegs. ;)


Sure, but not if I'm just gronkulating a bunch of stuff to spit out a short summary "on the screen" (or equivalent).



