摘要:
Motivation: finding "similar" sets in high-dimensional space Defination: Distance Measures: Aim to find "near neighbors" in high-dimensional space We 阅读全文
摘要:
1. Standard Architecture to solve the problem of big data computation Cluster of commodity Linux nodes Commodity network (ethernet) to connect them 2. 阅读全文
摘要:
1. Characteristics of Big Data: 4V Volume: From terabytes to exabyte to zetabytes of existing data to process Velocity: Batch data, real-time data, st 阅读全文