The correctness of a decision matters more today because decision points and their consequences have a growing influence on the business; it also depends on the strength of the analytical, verifiable data available to the decision maker as much as on that person's experience, ability, and knowledge. In other words, how firms store, classify, cleanse, process, and interpret data directly affects the decision maker's success. The increasing difficulty of the decision process has made it necessary to process more data and to process it faster. This raises another problem: data growing at this scale cannot be controlled and monitored manually.
For this reason, data is among the most important factors to monitor and analyze in today's economic system. Globalization, accelerated by the Internet, has raised the level of competition, lowered profit margins, and made customers harder to please. Companies have therefore turned to fine detail in order to differentiate themselves, and they need more data to make decisions in these delicate matters.
As a result, the volume of data produced by developing technologies has started to increase. According to IBM, 90% of the data produced has been created in the last 2 years. From an optimistic point of view, we can easily say that we live in the data age. On the other hand, those who say we are swimming in data garbage are not entirely wrong. Millions of data points are released on every side. Working with the data stacks in this complex structure and extracting their interpretable parts is possible with data mining. Data mining is used to extract cleaned data from the data stacks; more importantly, it is now used to uncover the intelligence and signal values that the data holds in order to complement the value chain. In other words, producing data depends on scattered records being correctly classified, formatted, and structured; producing information depends on that data being correctly ordered and analyzed; and producing intelligence depends on the emergence of multi-dimensional relationships between pieces of information. It takes a long dig to reach the bottom of the massive data mountain. Otherwise, people will continue to disappear into a mass of information produced from scattered data that is not really intelligence.
This is where data mining comes fully into play. Data mining is the use of a variety of statistical techniques to create predictions about events that are yet to occur. With these techniques, decisions can be anticipated before the processes occur, and the processes can be managed in advance in preparation for future situations and events.
The ultimate goal of data mining is prediction. Today, organizations use predictive modeling in decision-making processes across many fields. It is frequently used in marketing, banking, telecommunications, e-commerce, health, and insurance.
Data mining steps
First, the data are examined to find stable patterns and relationships between the variables; the patterns identified are then applied to a secondary data set to confirm the findings. Data mining has three main steps:
Verification and Modeling
Discovery and Preparation
Statistical analysis methods place certain requirements on the data. The data mining process therefore begins with data preparation. During data preparation, the data are:
Combined and cleaned,
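The combining and cleaning described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the record layout, the field names (customer_id, age, monthly_spend), and the cleaning rules (drop records with missing values, keep the first record per customer) are all assumptions made for the example.

```python
def combine_and_clean(*sources):
    """Merge record lists, then drop incomplete and duplicate records."""
    # Combine: flatten all source lists into one list of records.
    combined = [record for source in sources for record in source]

    cleaned = []
    seen_ids = set()
    for record in combined:
        # Clean: skip records with any missing (None) value.
        if any(value is None for value in record.values()):
            continue
        # Clean: keep only the first record seen for each customer_id.
        if record["customer_id"] in seen_ids:
            continue
        seen_ids.add(record["customer_id"])
        cleaned.append(record)
    return cleaned


# Hypothetical records from two source systems (assumed for illustration).
crm_records = [
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0},
    {"customer_id": 2, "age": None, "monthly_spend": 80.0},
]
web_records = [
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0},
    {"customer_id": 3, "age": 51, "monthly_spend": 45.5},
]

prepared = combine_and_clean(crm_records, web_records)
print(prepared)
```

In this sketch the record for customer 2 is dropped because its age is missing, and the second copy of customer 1 is dropped as a duplicate. Real preparation rules depend on the data and the statistical method to be used.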
Modeling and Evaluation
In this phase, various statistical models are applied to the prepared data to create predictions, and the best one is selected according to its performance. At the end of this phase, the pattern in the data is revealed. Many modeling techniques are available to apply to the prepared data. Evaluation methods are applied according to the chosen confidence interval to select the model that best describes the target.
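A minimal sketch of this select-the-best-model idea is given below. It assumes two candidate models (a mean baseline and a least-squares line), hypothetical data, and mean squared error on a held-out set as the evaluation method. The text mentions a confidence interval as the criterion; held-out error is swapped in here only to keep the example short.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    """Average squared difference between predictions and true values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical prepared data, split into training and held-out sets.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
valid_x = [7.0, 8.0]
valid_y = [14.1, 15.9]

# Fit every candidate on the training data and score it on held-out data.
candidates = {"mean_baseline": fit_mean, "linear": fit_line}
scores = {}
for name, fit in candidates.items():
    model = fit(train_x, train_y)
    scores[name] = mean_squared_error(model, valid_x, valid_y)

# The candidate with the lowest held-out error is selected.
best_name = min(scores, key=scores.get)
print(best_name, scores)
```

Scoring on data that was not used for fitting mirrors the earlier point about confirming patterns on a secondary data set: a model that only memorizes the training data will score poorly here.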
The application phase puts the selected model into daily use. In this last step, the model selected by analyzing historical data is applied to new, current data to generate estimates.
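As a rough illustration of this step, the sketch below stores the parameters of a selected model, retrieves them, and applies the model to new records. The linear model, its parameter values, the JSON storage format, and the record fields (id, x) are assumptions made for the example; a real deployment would use whatever model and storage the organization has chosen.

```python
import json

# Hypothetical parameters of the model chosen during evaluation.
selected_model = {"type": "linear", "slope": 2.0, "intercept": 0.5}

# Store the model so that a daily job can retrieve it later.
stored = json.dumps(selected_model)


def load_model(serialized):
    """Rebuild a prediction function from the stored parameters."""
    params = json.loads(serialized)
    return lambda x: params["slope"] * x + params["intercept"]


def score_new_records(model, records):
    """Attach an estimate to each new record without modifying the input."""
    return [{**record, "estimate": model(record["x"])} for record in records]


# New, current data arriving after the model was selected (assumed values).
new_records = [{"id": 101, "x": 3.0}, {"id": 102, "x": 10.0}]

model = load_model(stored)
estimates = score_new_records(model, new_records)
print(estimates)
```

Keeping the stored parameters separate from the scoring code means the model can be replaced after a new evaluation round without changing the daily job.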
The Cross-Industry Standard Process for Data Mining (CRISP-DM), one of the methodologies used to apply data mining in organizations, forms a loop by adding a Business Understanding step before these data mining steps.
In organizations, data analysts should use this knowledge to construct the problem definition before the discovery and preparation steps, taking into account the business situation and its priorities. The initial plan and the business objective are determined in this step.