Big Data Tech and Analytics --- Introduction
1. Characteristics of Big Data: 4V
- Volume: From terabytes to exabytes to zettabytes of existing data to process
- Velocity: Batch, real-time, and streaming data; response times range from milliseconds to seconds
- Variety: Structured, semi-structured, unstructured, text, pictures, multimedia
- Veracity: Uncertainty due to data inconsistency & incompleteness, ambiguities, deception, model approximation
2. Bonferroni’s principle:
If you look in more places for interesting patterns than your amount of data will support, you are bound to find meaningless patterns that occur purely by chance.
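
A rough way to see the principle at work: if each "place" you look has a small probability of showing an interesting-looking pattern by pure chance, the expected number of bogus findings is about (number of places) x (that probability). The sketch below is only an illustration under assumed numbers, not part of the original notes. The function names, the 1000 noise features, the 25 samples per feature, and the 0.392 threshold (about 1.96 standard errors, so roughly a 5% chance rate) are all hypothetical choices.

```python
import random


def expected_chance_hits(n_places, p_chance):
    """Expected number of patterns that appear purely by chance."""
    return n_places * p_chance


def count_chance_hits(n_features, n_samples, threshold, seed=0):
    """Count pure-noise features whose sample mean looks 'interesting'.

    Every feature is drawn from a standard normal distribution, so the
    true mean is 0 and any feature flagged here is a false discovery.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_features):
        values = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        mean = sum(values) / n_samples
        if abs(mean) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    # Assumed setup: 1000 noise features, 25 samples each.
    # The standard error of each mean is 1/sqrt(25) = 0.2, so a
    # threshold of 0.392 flags about 5% of features by chance alone.
    print("expected by chance:", expected_chance_hits(1000, 0.05))
    print("observed in noise: ", count_chance_hits(1000, 25, 0.392))
```

With these assumed numbers, about 50 of the 1000 features should look "interesting" even though the data contains no real signal at all. If the number of patterns you actually find is not much larger than this chance count, the findings are most likely noise.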