The Top Ten Data Mining Questions

xiaoxiao, 2021-03-06

Abstract

While a myriad of different data mining techniques have been proposed, just a few simple questions can shed light on the key attributes and the power of each technique. In this paper, Information Discovery, Inc. analyses approaches to data mining, providing two sets of business and technical questions that dissect each technique.

Information Discovery, Inc. is the leading provider of large-scale, data mining oriented decision support software and solutions, introducing pattern management with its breakthrough Pattern Warehouse™ technology and offering two comprehensive product suites. The Data Mining Suite™ of products directly accesses very large multi-table SQL repositories to find powerful multi-form patterns. The Knowledge Access Suite™ incrementally stores these pre-mined patterns in a Pattern Warehouse™ for access by business users. The company also offers a wide range of discovery and data mining solutions, strategic consulting and warehouse architecture design, as well as customized solutions for banking, financial services, retail, consumer packaged goods, manufacturing and web-log analysis.

Introduction

The past year has seen a dramatic surge in the level of interest in data mining, with business users wanting to take advantage of the technology for a competitive edge. The IT departments in most Fortune 500 companies are suddenly tasked with responding to deployment questions relating to data mining. The growing interest in data mining has also resulted in the introduction of a myriad of commercial products, each described with a set of terms that sound similar but in fact refer to very different functionality and are based on distinct technical approaches.

The IT managers charged with the task of selecting a decision support system often face a challenge in responding to the needs of the business users, because the underlying concepts of data mining are far more complex than traditional query and reporting. To add to the pressure, the needs of the business users are usually urgent, requiring decisions to be made quickly.

However, while various approaches to data mining seem to offer distinct features and benefits, in fact just a few fundamental techniques form the basis of most data mining systems, and asking a few simple questions will help clarify the nature of each system. These questions need to be asked from both the business and the technical point of view.

A related business article on Measuring the Dollar Value of Mined Information illustrates how the benefits of a data mining system can be quantified as tangible corporate assets. A technical article on the Characterization of Data Mining Techniques separates the technologies used in most data mining systems into three classes (equations, logic and cross-tabulation) and shows how these techniques are used in some commercial products.

Here are two sets of "Top Ten Data Mining Questions" from business and technical perspectives. Each question has three parts which together highlight one specific aspect of a data mining system's power and capability. These questions aim to bring out the character of a data mining system and help business and technical users understand how to deploy such systems.

The Top Ten Data Mining Business Questions

The top ten business questions should be asked by business users about the benefits, quality and usability of the system. They are:

Question 1: Business Benefits
a) How will this system help us?
b) How well does this system work for our industry-specific applications?
c) What information can we get that we do not already have?

It is essential to ask this question again and again. You should, of course, get new refined information, but it is not enough just to know something: you should have information that allows you to "act" within the context of your industry. You should also measure the bottom-line dollar benefits delivered by a data mining system. See the paper "Measuring the Dollar Value of Mined Information" for a framework for this.

Question 2: Technical Know-how
a) How technically sophisticated do we need to be to use it?
b) Can business users operate it without calling the IS group all the time?
c) Is it as easy to use as an internet browser?

Business users should be empowered with direct, on-demand access to refined knowledge. They should not have to know statistics, yet should be given consistent and correct answers. The system interface should be as easy to use as a web browser.

Question 3: Understandability and Explanations
a) Are the results intuitive or difficult to understand?
b) Do we get clear explanations for any information item presented?
c) Will the explanations be in technical statistical terms or in a form that we can understand?

Results should be presented to business users in plain English, accompanied by graphs. The system should be able to explain each piece of information it presents in clear, English-like terms that business users can easily comprehend and use.

Question 4: Follow-up Questions
a) What kinds of follow-up questions can we ask of the system?
b) Do we need to go to an analyst for further question answering?
c) How fast can we drill down on the fly to see more patterns?

Responses to follow-up questions must be immediate. Business users should not need to use intermediaries such as analysts to get more information after they have seen some results. If follow-up questions take time and involve intermediaries, the business users' effectiveness will be impaired. Business users should get refined information as and when they need it.

Question 5: Business Users
a) How many business users can this system support?
b) Can the business users tailor their own questions for the system?
c) Can users utilize the knowledge for day-to-day decision making?

The system should be able to use the same fundamental knowledge to support a few hundred business users, each with a different group perspective. Yet all of these users must be given consistent answers as they ask their own questions. The information must be presented in such a way that it can be utilized for day-to-day actions.

Question 6: Accuracy, Completeness and Consistency
a) How accurate are the results the system delivers?
b) Can some patterns be missed by the system?
c) Are the results always consistent, or can 100 users get 100 different answers?

The system must cover a wide range of patterns and should provide high-quality information. The knowledge provided to business users should be derived from the entire data set (and not samples) in order to increase accuracy. All business users should access the same knowledge so that they all receive consistent answers, increasing the quality of corporate information.

Question 7: Incremental Analysis
a) Can we automatically analyze weekly or monthly data as it becomes available?
b) Can the system compare the month-to-month results and patterns by itself?
c) Can we get automatic pattern detection over time, every week or month?

The system should analyze data as it becomes available every week or month and perform ongoing trend analysis, highlighting the key items and influence ...
