Data mining is an integral part of knowledge discovery in databases (KDD), which is the overall process of converting raw data into useful information.

 

The process of knowledge discovery in databases:

Input Data

-> Data Preprocessing (Feature Selection, Dimensionality Reduction, Normalization, Data Subsetting) (often the most laborious and time-consuming step)

-> Data Mining

-> Postprocessing (Filtering Patterns, Visualization, Pattern Interpretation)

-> Information
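
The flow above can be sketched as three chained functions. This is only a toy illustration, not a real mining algorithm: the function names (`preprocess`, `mine`, `postprocess`, `kdd_pipeline`), the shopping-basket data, and the `min_count` threshold are all assumptions made for the example. The "mining" step simply counts item pairs that occur together, and the postprocessing step filters the resulting patterns.

```python
from collections import Counter
from itertools import combinations


def preprocess(raw_transactions):
    """Normalize item names and drop empty items and empty transactions."""
    cleaned = []
    for transaction in raw_transactions:
        items = {item.strip().lower() for item in transaction if item.strip()}
        if items:
            cleaned.append(items)
    return cleaned


def mine(transactions):
    """Count how often each pair of items occurs in the same transaction."""
    pair_counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return pair_counts


def postprocess(pair_counts, min_count=2):
    """Filter patterns: keep pairs seen at least min_count times, most frequent first."""
    kept = [(pair, n) for pair, n in pair_counts.items() if n >= min_count]
    return sorted(kept, key=lambda entry: (-entry[1], entry[0]))


def kdd_pipeline(raw_transactions, min_count=2):
    """Input Data -> Preprocessing -> Data Mining -> Postprocessing -> Information."""
    return postprocess(mine(preprocess(raw_transactions)), min_count)


# Hypothetical raw input: inconsistent casing, stray spaces, empty entries.
raw = [
    ["Bread", " milk "],
    ["bread", "Milk", "eggs"],
    ["eggs", ""],
    [],
]
patterns = kdd_pipeline(raw)
print(patterns)
```

Without the preprocessing step, "Bread" and "bread" would be counted as different items and the pair would not reach the threshold, which is one reason preprocessing takes so much of the effort.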

 

The purpose of preprocessing: to transform the raw input data into a format appropriate for subsequent analysis.

Steps involved in data preprocessing:

1. fusing data from multiple sources;

2. cleaning data to remove noise and duplicate observations;

3. selecting records and features that are relevant to the data mining task at hand.
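
A minimal sketch of the three steps above, using plain Python lists of dictionaries. Everything specific here is an assumption for illustration: the helper names (`fuse`, `clean`, `select`), the record fields (`id`, `age`, `city`), and the rule that an age outside 0 to 120 counts as noise. Real noise handling depends on the data and the task.

```python
def fuse(*sources):
    """Step 1: combine records from several sources into one list."""
    return [dict(record) for source in sources for record in source]


def clean(records, required=("id", "age")):
    """Step 2: drop records with missing or implausible values, then duplicates."""
    seen_ids = set()
    cleaned = []
    for record in records:
        if any(record.get(key) is None for key in required):
            continue  # missing value
        if not 0 <= record["age"] <= 120:
            continue  # assumed plausibility range, treated here as noise
        if record["id"] in seen_ids:
            continue  # duplicate observation (same id already kept)
        seen_ids.add(record["id"])
        cleaned.append(record)
    return cleaned


def select(records, features, predicate):
    """Step 3: keep only the relevant records and the relevant features."""
    return [{f: r[f] for f in features} for r in records if predicate(r)]


# Two hypothetical sources that overlap and contain bad values.
source_a = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": -5, "city": "Rome"},
]
source_b = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 3, "age": 51, "city": "Lima"},
    {"id": 4, "age": None, "city": "Kyiv"},
]

fused = fuse(source_a, source_b)
cleaned = clean(fused)
selected = select(cleaned, features=("id", "age"), predicate=lambda r: r["age"] >= 40)
print(selected)
```

The order matters: fusing first lets the duplicate check catch records that appear in more than one source.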

 posted by Jiang, X.