Summary

Building on the previous chapter on Data Understanding, this chapter has demonstrated some basic tasks that need to be performed in the data preprocessing (or data preparation) step. These tasks either result from the initial data quality assessment, such as discovering missing values, or are demanded by the next step of data analysis, such as correlation analysis to order the potential predictors by their predictive power. Attribute re-engineering is the task of making maximum use of the information contained in the given dataset, or of transforming the given attributes into the most appropriate form or type. The ultimate goal is to make the dataset ready for analysis.
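As a compact recap, the three kinds of task named above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the chapter: the toy records, the attribute names (`age`, `siblings`, `parents`, `fare`, `label`), the derived attribute `family_size`, the choice of median imputation, and all helper function names are assumptions made up for this example.

```python
from statistics import median


def count_missing(rows):
    """Return {attribute: number of rows where the value is None}."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (value is None)
    return counts


def impute_median(rows, name):
    """Replace missing values of one numeric attribute with its median."""
    fill = median(r[name] for r in rows if r[name] is not None)
    return [{**r, name: fill if r[name] is None else r[name]} for r in rows]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_predictors(rows, target):
    """Order attributes by absolute correlation with the target, strongest first."""
    ys = [r[target] for r in rows]
    scores = {
        name: abs(pearson([r[name] for r in rows], ys))
        for name in rows[0]
        if name != target
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy records (not real data); None marks a missing value.
rows = [
    {"age": 22, "siblings": 1, "parents": 0, "fare": 7.25, "label": 0},
    {"age": 38, "siblings": 1, "parents": 0, "fare": 71.28, "label": 1},
    {"age": None, "siblings": 0, "parents": 0, "fare": 8.46, "label": 0},
    {"age": 35, "siblings": 1, "parents": 0, "fare": 53.10, "label": 1},
    {"age": 4, "siblings": 1, "parents": 1, "fare": 16.70, "label": 1},
    {"age": 54, "siblings": 0, "parents": 0, "fare": 51.86, "label": 0},
]

# 1. Data quality assessment: discover missing values per attribute.
missing = count_missing(rows)

# 2. Repair: fill the missing ages (median imputation is one option among many).
rows = impute_median(rows, "age")

# 3. Attribute re-engineering: derive a new attribute from existing ones.
rows = [{**r, "family_size": r["siblings"] + r["parents"] + 1} for r in rows]

# 4. Correlation analysis: order potential predictors by strength of association.
ranking = rank_predictors(rows, "label")

print(missing)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Absolute Pearson correlation is used here only as a simple stand-in for "prediction power". It captures linear association with a numeric or binary target, so other measures may be more suitable for categorical attributes.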