Data mining is an integral part of knowledge discovery in databases (KDD), which is the overall process of converting raw data into useful information.
The process of knowledge discovery in databases:
Input Data
-> Data Preprocessing (Feature Selection, Dimensionality Reduction, Normalization, Data Subsetting); often the most laborious and time-consuming step
-> Data Mining
-> Postprocessing (Filtering Patterns, Visualization, Pattern Interpretation)
-> Information
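The stages above can be read as a chain of functions, each consuming the output of the previous one. The following is only a minimal sketch under stated assumptions: the function names (`preprocess`, `mine`, `postprocess`, `kdd`), the toy transaction data, and the choice of pair counting as the mining step are all illustrative and do not come from the original notes.

```python
from collections import Counter
from itertools import combinations


def preprocess(raw_transactions):
    """Normalize item names; drop empty items and in-transaction duplicates."""
    return [
        sorted({item.strip().lower() for item in t if item.strip()})
        for t in raw_transactions
    ]


def mine(transactions):
    """Toy mining step (an assumption): count co-occurring item pairs."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(t, 2))
    return counts


def postprocess(pair_counts, min_support):
    """Filter patterns: keep pairs seen at least min_support times."""
    kept = {pair: n for pair, n in pair_counts.items() if n >= min_support}
    # Most frequent first, ties broken alphabetically.
    return sorted(kept.items(), key=lambda kv: (-kv[1], kv[0]))


def kdd(raw, min_support=2):
    """Input data -> preprocessing -> data mining -> postprocessing."""
    return postprocess(mine(preprocess(raw)), min_support)


# Hypothetical raw input with inconsistent casing, whitespace, and repeats.
raw = [
    ["Bread", " milk "],
    ["bread", "Milk", "eggs"],
    ["eggs", ""],
    ["milk", "bread", "bread"],
]
patterns = kdd(raw)
print(patterns)
```

Note that without the preprocessing stage, "Bread" and "bread" would be counted as different items, which illustrates why that stage tends to take most of the effort.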
The purpose of preprocessing is to transform the raw input data into a format appropriate for the subsequent analysis.
Steps involved in data preprocessing:
1. fusing data from multiple sources;
2. cleaning data to remove noise and duplicate observations;
3. selecting records and features that are relevant to the data mining task at hand.
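The three steps above can be sketched in plain Python. This is a hedged illustration, not a definitive implementation: the helper names (`fuse`, `clean`, `select`), the record fields, and the sample data are hypothetical, and "noise" is simplified here to mean records with missing values in required fields.

```python
def fuse(*sources):
    """Step 1: fuse records from multiple sources into one list."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def clean(records, required):
    """Step 2: drop records with missing required fields and exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Assumption: a missing (None) required value counts as noise.
        if any(rec.get(field) is None for field in required):
            continue
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def select(records, features, predicate):
    """Step 3: keep relevant records and only the relevant features."""
    return [{f: rec[f] for f in features} for rec in records if predicate(rec)]


# Hypothetical data from two sources describing the same customers.
store_a = [
    {"id": 1, "age": 34, "city": "Oslo", "spend": 120.0},
    {"id": 2, "age": None, "city": "Bergen", "spend": 80.0},
]
store_b = [
    {"id": 1, "age": 34, "city": "Oslo", "spend": 120.0},
    {"id": 3, "age": 51, "city": "Oslo", "spend": 40.0},
]

fused = fuse(store_a, store_b)
cleaned = clean(fused, required=("age", "spend"))
selected = select(cleaned, features=("age", "spend"),
                  predicate=lambda rec: rec["city"] == "Oslo")
print(selected)
```

In practice the fusing step usually also has to reconcile differing schemas and identifiers across sources; the sketch assumes the sources already share one record layout.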