What Is A Data Mining Model?
A data mining model is created when you apply an algorithm to data. However, it is more than an algorithm or a metadata container. It is a set of data, patterns, and statistics that can be applied to new data to generate predictions and draw inferences about relationships in the data.
Data mining model architecture
A data mining model gets its data from a mining structure and analyzes it using a data mining algorithm. However, the mining model and the mining structure are two different things.
- Data mining model – stores information derived from statistical processing of the data, such as the patterns discovered during analysis. The model is empty until the data provided by the mining structure has been processed and analyzed. After processing, the model contains results, metadata, and bindings back to the mining structure. The metadata identifies the name of the model and the server where it is stored. It also defines the model: the columns from the mining structure used to build it, the algorithm used to analyze the data, and any filters applied during processing. These choices (data columns and their data types, filters, and algorithm) strongly affect the analysis results.
- Data mining structure – stores information that defines the source of the data.
The actual data is not stored in the mining model. The mining structure holds the data, while the mining model stores only summary statistics. If you applied filters to the data when training the model, the filter definitions are saved with the model object as well.
The mining model contains a set of bindings that point back to the data stored in the mining structure. If the mining structure has cached the data and the cache has not been cleared after processing, these bindings let you drill through from the results to the cases that support them.
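The split between structure and model can be illustrated with a minimal Python sketch. The classes `MiningStructure` and `MiningModel`, their fields, and the `drill_through` method are hypothetical names made up for this illustration and do not correspond to any real product API. The sketch assumes the structure keeps its cases cached after processing.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Optional


@dataclass
class MiningStructure:
    """Hypothetical stand-in: defines the data source and caches the cases."""
    columns: list
    cases: list = field(default_factory=list)


@dataclass
class MiningModel:
    """Hypothetical stand-in: holds summary statistics, never the cases."""
    name: str
    structure: MiningStructure      # binding back to the mining structure
    input_column: str
    stats: Optional[dict] = None    # empty until the model is processed

    def process(self) -> None:
        # Read the cases from the structure and keep only a summary of them.
        values = [case[self.input_column] for case in self.structure.cases]
        self.stats = {"count": len(values), "mean": mean(values)}

    def drill_through(self, predicate: Callable[[dict], bool]) -> list:
        # Follow the binding to fetch the cases that support a result.
        return [case for case in self.structure.cases if predicate(case)]


structure = MiningStructure(
    columns=["customer_id", "age"],
    cases=[
        {"customer_id": 1, "age": 30},
        {"customer_id": 2, "age": 40},
        {"customer_id": 3, "age": 50},
    ],
)
model = MiningModel(name="AgeSummary", structure=structure, input_column="age")
empty_before_processing = model.stats is None

model.process()
above_mean = model.drill_through(lambda case: case["age"] > model.stats["mean"])
```

Here the model object holds only the count and the mean. The individual customers are reachable solely through the binding to the structure, which is why drill-through stops working once the structure's cached data is cleared.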
Properties of a data mining model
The properties of each data mining model define the model and its metadata: the name, the description, the date the model was last processed, the permissions on the model, and any filters used for training.
Every mining model also has properties derived from the mining structure. These properties describe the columns of data used by the model. If a column used by the model is a nested table, a separate filter can be applied to that column as well. Additionally, each mining model has two special properties: algorithm and usage.
The algorithm property specifies the algorithm used to create the mining model. The algorithms available to you depend on the provider you use. The property applies to the mining model as a whole and can be set only once for each model.
The algorithm can be changed later. However, columns in the mining model may become invalid if the algorithm you choose does not support them. Whenever the algorithm property changes, the mining model must be reprocessed.
The usage property states how each column is used by the mining model. The column usage can be defined as Input, Predict, Predict Only, or Key. The usage property applies to individual mining model columns.
Therefore, it must be set individually for every column included in the model. If the mining structure contains a column that the model does not use, the usage of that column is set to Ignore.
E-mail addresses and customer names are examples of data that you might add to the mining structure but not use in the analysis. Keeping them in the structure lets you query them later without having to include them in the analysis phase.
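Per-column usage can be sketched in Python as follows. The `Usage` enum, the `resolve_usage` helper, and the column names are hypothetical and exist only for this illustration. The sketch assumes that any structure column the model does not list is treated as ignored.

```python
from enum import Enum


class Usage(Enum):
    """Illustrative column usages; the names mirror the options in the text."""
    KEY = "Key"
    INPUT = "Input"
    PREDICT = "Predict"
    PREDICT_ONLY = "Predict Only"
    IGNORE = "Ignore"


def resolve_usage(structure_columns, model_usage):
    """Return a usage for every structure column; unlisted columns are ignored."""
    unknown = set(model_usage) - set(structure_columns)
    if unknown:
        raise ValueError(f"columns not in structure: {sorted(unknown)}")
    return {col: model_usage.get(col, Usage.IGNORE) for col in structure_columns}


# Hypothetical structure: names and e-mail addresses are stored but not analyzed.
structure_columns = ["customer_id", "customer_name", "email", "age", "bought_bike"]
usage = resolve_usage(
    structure_columns,
    {
        "customer_id": Usage.KEY,
        "age": Usage.INPUT,
        "bought_bike": Usage.PREDICT,
    },
)

# Only columns with a non-ignored usage take part in the analysis.
analysis_columns = [col for col, u in usage.items() if u is not Usage.IGNORE]
```

The usage is a per-column setting, so the mapping has one entry for each column. The ignored columns stay in the structure and remain available for later queries.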
After you create the mining model, you can change the values of its properties. However, any change to the mining model, even one as small as renaming it, requires you to reprocess the model. You may get different results after reprocessing.
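One way to picture the reprocessing rule is a model object that marks itself as unprocessed whenever a property changes. This is only a sketch under that assumption: the `TrackedModel` class, its `processed` flag, and the property names are hypothetical, and a real server may track model state differently.

```python
class TrackedModel:
    """Hypothetical model whose results are invalidated by any property change."""

    _PROPERTIES = {"name", "algorithm", "description"}

    def __init__(self, name, algorithm, description=""):
        self.processed = False
        self.name = name
        self.algorithm = algorithm
        self.description = description

    def __setattr__(self, key, value):
        super().__setattr__(key, value)
        if key in self._PROPERTIES:
            # Any property change, even a rename, invalidates earlier results.
            super().__setattr__("processed", False)

    def process(self):
        # Stand-in for training; a real run would rebuild the statistics here.
        self.processed = True


model = TrackedModel(name="BikeBuyers", algorithm="decision_trees")
model.process()
was_processed = model.processed

model.name = "BikeBuyersRenamed"          # a small change to a property
needs_reprocessing = not model.processed  # the model must be processed again
```

The sketch models only the bookkeeping. It does not show why results can differ after reprocessing, which depends on the algorithm and the data.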