The goal of data mining is to transform data into useful, sometimes actionable, information.
Which is NOT true?
Arrange the order of steps for data mining:
Prepare the Data
Visualize and Summarize the data
Perform Regression Analysis
Evaluate the Models and Select
Predict
The best way to fill in missing values is to use a linear model when two variables are highly correlated.
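As an illustration of the technique named in the statement above, here is a minimal sketch of filling a missing value from a correlated variable with a least-squares line. The function names (`fit_line`, `impute_with_linear_fit`), the use of `None` for a missing entry, and the sample numbers are all assumptions made for this example, not part of the original material.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def impute_with_linear_fit(xs, ys):
    """Fill missing entries of ys (marked None) using a line fitted on complete rows."""
    complete = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    slope, intercept = fit_line([x for x, _ in complete], [y for _, y in complete])
    filled = []
    for x, y in zip(xs, ys):
        if y is None and x is not None:
            # Predict the missing value from the correlated variable.
            filled.append(intercept + slope * x)
        else:
            filled.append(y)
    return filled


# Hypothetical, strongly correlated sample with one missing y value.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, None, 8.1, 9.9]
filled = impute_with_linear_fit(xs, ys)
```

The fit uses only the rows where both variables are present, so the imputed value is only as trustworthy as the correlation between the two variables in those rows.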
What is a good plot to create when your data doesn't follow a Gaussian distribution?
Optimizing a model means balancing a complex model that captures variability against a computationally efficient one.
We prune when the relative error with respect to the 1-SE rule is...
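For reference, the 1-SE rule as it is commonly stated picks the simplest tree whose cross-validated error is within one standard error of the lowest cross-validated error. The sketch below assumes a complexity table with fields named `nsplit`, `xerror`, and `xstd` (modeled on the columns of an rpart-style table); the function name `select_by_one_se` and the numbers are hypothetical.

```python
def select_by_one_se(cp_table):
    """Return the smallest tree whose cross-validated error is within
    one standard error of the minimum cross-validated error."""
    best = min(cp_table, key=lambda row: row["xerror"])
    threshold = best["xerror"] + best["xstd"]
    # Scan from the simplest tree (fewest splits) to the most complex.
    for row in sorted(cp_table, key=lambda row: row["nsplit"]):
        if row["xerror"] <= threshold:
            return row
    return best  # not reached: the minimum row always satisfies the threshold


# Hypothetical complexity table, made up for illustration.
cp_table = [
    {"nsplit": 0, "xerror": 1.00, "xstd": 0.050},
    {"nsplit": 1, "xerror": 0.70, "xstd": 0.045},
    {"nsplit": 2, "xerror": 0.62, "xstd": 0.043},
    {"nsplit": 4, "xerror": 0.60, "xstd": 0.042},
    {"nsplit": 7, "xerror": 0.61, "xstd": 0.042},
]
chosen = select_by_one_se(cp_table)
```

In this made-up table the lowest error belongs to the 4-split tree, but the 2-split tree falls inside the one-standard-error band, so the rule prefers it as the simpler model.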
A random forest is an ensemble of...