Data Mining by Mehmed Kantardzic
- Author: Mehmed Kantardzic
How is data mining different from other typical applications of a data warehouse, such as Structured Query Language (SQL) and online analytical processing (OLAP) tools, which are also applied to data warehouses? SQL is a standard relational database language that is good for queries that impose some kind of constraint on the data in the database in order to extract an answer. In contrast, data-mining methods are good for queries that are exploratory in nature, trying to extract hidden, not-so-obvious information. SQL is useful when we know exactly what we are looking for and can describe it formally. We use data-mining methods when we know only vaguely what we are looking for. Therefore, these two classes of data-warehousing applications are complementary.
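The contrast above can be illustrated with a minimal sketch. It assumes a hypothetical `loans` table with invented columns and toy rows; none of these names or values come from the text. The first function asks a precisely specified SQL question. The second is only a crude stand-in for an exploratory method: it scans every attribute without being told in advance which one matters and ranks attribute values by default rate. Real data-mining methods are considerably more sophisticated.

```python
import sqlite3

# Hypothetical toy data: (income_band, region, owns_home, defaulted)
ROWS = [
    ("low", "north", 0, 1),
    ("low", "south", 0, 1),
    ("low", "north", 1, 0),
    ("high", "north", 0, 1),
    ("high", "south", 1, 0),
    ("high", "south", 1, 0),
    ("low", "south", 1, 0),
    ("high", "north", 0, 0),
]
COLUMNS = ("income_band", "region", "owns_home")


def build_db():
    """Create an in-memory database holding the toy loan records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE loans ("
        "income_band TEXT, region TEXT, owns_home INTEGER, defaulted INTEGER)"
    )
    conn.executemany("INSERT INTO loans VALUES (?, ?, ?, ?)", ROWS)
    return conn


def sql_question(conn):
    """Precise question: how many low-income borrowers defaulted?"""
    cur = conn.execute(
        "SELECT COUNT(*) FROM loans "
        "WHERE income_band = 'low' AND defaulted = 1"
    )
    return cur.fetchone()[0]


def exploratory_scan(conn):
    """Vague question: which attribute value goes with the highest default rate?"""
    results = []
    for col in COLUMNS:
        # Column names come from the fixed COLUMNS tuple, never from user input.
        query = f"SELECT {col}, AVG(defaulted) FROM loans GROUP BY {col}"
        for value, rate in conn.execute(query):
            results.append((rate, col, value))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    conn = build_db()
    print(sql_question(conn))
    for rate, col, value in exploratory_scan(conn):
        print(f"{col}={value}: default rate {rate:.2f}")
```

In this toy data the scan surfaces home ownership, an attribute the SQL question never mentioned, which is the sense in which the two approaches complement each other.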
OLAP tools and methods have become very popular in recent years because they let users analyze data in a warehouse by providing multiple views of the data, supported by advanced graphical representations. In these views, different dimensions of data correspond to different business characteristics. OLAP tools make it very easy to look at dimensional data from any angle or to slice-and-dice it. OLAP is part of the spectrum of decision support tools. Traditional query and report tools describe what is in a database. OLAP goes further; it is used to answer why certain things are true. The user forms a hypothesis about a relationship and verifies it with a series of queries against the data. For example, an analyst might want to determine the factors that lead to loan defaults. He or she might initially hypothesize that people with low incomes are bad credit risks and analyze the database with OLAP to verify (or disprove) this assumption. In other words, the OLAP analyst generates a series of hypothetical patterns and relationships and uses queries against the database to verify or disprove them. OLAP analysis is essentially a deductive process.
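The hypothesis-driven workflow described above can be sketched in a few lines. The records, field names, and the `rollup` helper below are hypothetical and only mimic what an OLAP tool does when it aggregates a measure over chosen dimensions; this is not the interface of any real OLAP product.

```python
from collections import defaultdict

# Hypothetical loan records; field names are illustrative only.
LOANS = [
    {"income": "low", "region": "north", "defaulted": True},
    {"income": "low", "region": "south", "defaulted": True},
    {"income": "low", "region": "south", "defaulted": False},
    {"income": "high", "region": "north", "defaulted": False},
    {"income": "high", "region": "south", "defaulted": True},
    {"income": "high", "region": "north", "defaulted": False},
]


def rollup(records, dims):
    """Return the default rate for each combination of the chosen dimensions."""
    counts = defaultdict(lambda: [0, 0])  # key -> [defaults, total]
    for record in records:
        key = tuple(record[d] for d in dims)
        counts[key][0] += int(record["defaulted"])
        counts[key][1] += 1
    return {key: defaults / total for key, (defaults, total) in counts.items()}


def hypothesis_holds(records):
    """Check the analyst's hypothesis: low-income borrowers default more often."""
    by_income = rollup(records, ["income"])
    return by_income[("low",)] > by_income[("high",)]


if __name__ == "__main__":
    print(rollup(LOANS, ["income"]))            # one view of the data
    print(rollup(LOANS, ["income", "region"]))  # a finer view of the same data
    print(hypothesis_holds(LOANS))
```

Note that the analyst supplies the hypothesis and chooses the dimensions; the tool only aggregates and reports, which is why the process is deductive.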
Although OLAP tools, like data-mining tools, provide answers that are derived from data, the similarity between them ends here. The derivation of answers from data in OLAP is analogous to calculations in a spreadsheet; because they use simple and given-in-advance calculations, OLAP tools do not learn from data, nor do they create new knowledge. They are usually special-purpose visualization tools that can help end users draw their own conclusions and decisions, based on graphically condensed data. OLAP tools are very useful for the data-mining process; they can be a part of it, but they are not a substitute.
1.6 BUSINESS ASPECTS OF DATA MINING: WHY A DATA-MINING PROJECT FAILS
Data mining in various forms is becoming a major component of business operations. Almost every business process today involves some form of data mining. Customer Relationship Management, Supply Chain Optimization, Demand Forecasting, Assortment Optimization, Business Intelligence, and Knowledge Management are just some examples of business functions that have been impacted by data-mining techniques. Even though data mining has succeeded in becoming a major component of various business and scientific processes, and in transferring innovations from academic research into the business world, the gap between the problems that the data-mining research community works on and real-world problems is still significant. Most business people (marketing managers, sales representatives, quality assurance managers, security officers, and so forth) who work in industry are interested in data mining only insofar as it helps them do their jobs better. They are uninterested in technical details and do not want to be concerned with integration issues; a successful data-mining application has to be integrated seamlessly into an application. Bringing an algorithm that is successful in the laboratory to an effective data-mining application with real-world data in industry or the scientific community can be a very long process. Issues like cost effectiveness, manageability, maintainability, software integration, ergonomics, and business process reengineering come into play as significant components of a potential data-mining success.
Data mining in a business environment can be defined as the effort to generate actionable models through automated analysis of a company’s data. In order to be useful, data mining must have a financial justification. It must contribute to the central goals of the company by, for example, reducing costs, increasing profits, improving customer satisfaction, or improving the quality of service. The key is to find actionable information, that is, information that can be utilized in a concrete way to improve the profitability of a company. For example, credit-card marketing promotions typically generate a response rate of about 1%. Practice shows that data-mining analyses improve this rate significantly. In the telecommunications industry, a major problem is churn, which occurs when customers switch carriers. When dropped calls, mobility patterns, and a variety of demographic data are recorded, and data-mining techniques are applied, churn is reduced by an estimated 61%.
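To make the idea of financial justification concrete, the following sketch compares a blanket mailing with a model-targeted one. Only the 1% baseline response rate comes from the text; the mailing cost, profit per response, targeted audience size, and the 3% targeted response rate are hypothetical numbers chosen purely for illustration.

```python
def campaign_profit(n_mailed, response_rate, profit_per_response, cost_per_mailing):
    """Net profit of a mailing campaign under a simple linear cost model."""
    revenue = n_mailed * response_rate * profit_per_response
    cost = n_mailed * cost_per_mailing
    return revenue - cost


# Baseline: mail everyone at the roughly 1% response rate cited in the text.
# All monetary figures below are assumptions, not data from the source.
baseline = campaign_profit(
    n_mailed=100_000, response_rate=0.01,
    profit_per_response=50.0, cost_per_mailing=1.0,
)

# Hypothetical targeted campaign: a model selects a smaller audience
# whose response rate is assumed to be 3%.
targeted = campaign_profit(
    n_mailed=20_000, response_rate=0.03,
    profit_per_response=50.0, cost_per_mailing=1.0,
)

if __name__ == "__main__":
    print(f"baseline profit: {baseline:.2f}")
    print(f"targeted profit: {targeted:.2f}")
```

Under these assumed figures the blanket mailing loses money while the targeted one is profitable; with different costs or response rates the conclusion could reverse, which is exactly why the justification has to be computed rather than assumed.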
Data mining does not replace skilled business analysts or scientists but rather gives them powerful new tools and the support of an interdisciplinary team to improve the job they are doing. Today, companies collect huge amounts of data about their customers, partners, products, and employees as well as their operational and financial systems. They hire professionals (either locally or outsourced) to create data-mining models that analyze the collected data and help business analysts create reports and identify trends, so that they can optimize their channel operations, improve service quality, and track customer profiles, ultimately reducing costs and increasing revenue. Still, there is a semantic gap between the data miner, who talks about regressions, accuracy, and ROC curves, and the business analyst, who talks about customer retention strategies, addressable markets, profitable advertising, and so on. Therefore, in all phases of a data-mining process,