Question 1: Business Applications of Data Mining (Web/Library research)

List some business applications of data mining techniques. Case studies and success stories will be helpful to you.

• See for example: http://www-01.ibm.com/software/success/cssdb.nsf/CS/STRD-8QSKHK?OpenDocument&Site=spss&cty=en_us
    o This is just one example of a place to find case studies and success stories. You should find at least one other reference to use for this assignment.
• Create a table that lists:
    o Each business application (including a brief description of the business objective, and the company or companies that have used or could use data mining for that application or objective).
        Example business application: ‘Predicting responses to a marketing campaign’.
    o The data mining technique(s) or algorithm(s) that were or could be helpful in achieving the business objective for each business application.
        Your syllabus lists some of the most popular data mining techniques.
    o Possible or typical outputs of data mining in that business application area.
        Example outputs: ‘Married women with one or more children are more likely to respond to the campaign’, ‘People who buy chili are more likely to buy antacids’, etc.
    o The web address of the page where you found the information, or the citation for the article or book where you found it. You are required to provide at least 3 different web addresses or citations.

(300-400 words [approx. 1 page]; 10 points)

Question 2: The Data Mining Process

Describe the industry-standard CRISP-DM data mining process model (http://www.dataminingtechniques.net/data-mining-tutorial/data-mining-processes/) and SAS’s SEMMA model (http://www.sas.com/offices/europe/uk/technologies/analytics/datamining/miner/semma.html).

• Choose one of your business applications from Question 1 and illustrate the use of either CRISP-DM or SEMMA for that application: e.g. in the ‘data preparation’ phase of CRISP-DM, describe the specific data sources you would use for that application, etc.

Further Instructions for Assignment 1

• For question 1, format your answer as a table with the headings ‘business objective’, ‘companies (that do or could pursue this data mining objective)’, ‘data mining algorithm/technique’, ‘sample output’, and ‘web address’, as requested in the question.
    o Give specific answers. For example, a business objective of ‘provide payment processing solutions’ is not specific enough; rather say ‘produce a model to score transactions and identify the transactions most likely to be fraudulent’.
    o Similarly, for outputs of data mining, ‘company lowers percentage of fraudulent transactions’ is fine as a general goal, but give more detail on possible specific outputs: e.g. give example rules that could be produced, like ‘large transactions by people in Queens who have held accounts for less than 6 weeks are likely to be fraudulent’.
• For question 2, on the application of data mining techniques, make sure to describe both the CRISP-DM and SEMMA processes. However, you only need to apply one of them (preferably CRISP-DM).
    o The goal is to describe how, specifically, each of the 6 stages of the process applies to your particular case.
    o More importantly, give specific actions for each phase of the data mining process when explaining how the process could be applied to your particular case.
        - For example, ‘understand data’ is not specific enough and would not earn you any of the points for the ‘application’ part of the question; instead, under the heading ‘data understanding’, give details like ‘gather customer data from internal database, including customer identifier, age, purchase history, …’.
        - Similarly, ‘prepare data’ is not detailed enough. Instead, under the heading ‘data preparation’, write ‘bin the customers into 5 equal bins by income attribute; compute aggregates for past 3 months, past 6 months, and past 12 months customer purchases, …’.
        - Under the heading ‘evaluation’, you might explain that a model that picks out fraudulent transactions with 98% recall and 80% precision is probably sufficient: low recall is costly because each fraudulent transaction that slips through the checks is expensive, whereas low precision is less costly because falsely rejected transactions do not lose much profit.
        - (Model evaluation is dealt with in more detail in lectures after the assignment is due; you should have picked up knowledge about evaluation criteria from the Two Crows reading.)
    o For the deployment phase (of CRISP-DM), explain how the company could exploit (profit from) the model produced and what specific actions they did or could take: e.g. they could use the model to score prospects and email high-scoring customers. The details will depend on the application you chose, but the important thing is to be specific and apply the process to your particular application.
• Always cite the source of any comparative performance figures, or seemingly unsubstantiated data, that you give.
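The ‘evaluation’ example above (98% recall, 80% precision) can be made concrete with a small calculation. The sketch below is illustrative only: the transaction counts and the per-error costs are hypothetical numbers chosen to reproduce those two figures, and the function names are not taken from any course material or library.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from confusion-matrix counts."""
    flagged = true_positives + false_positives        # everything the model rejected
    actual_fraud = true_positives + false_negatives   # everything that really was fraud
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual_fraud if actual_fraud else 0.0
    return precision, recall


def error_cost(false_positives, false_negatives,
               cost_per_false_alarm, cost_per_missed_fraud):
    """Total cost of the model's mistakes under assumed per-error costs."""
    return (false_positives * cost_per_false_alarm
            + false_negatives * cost_per_missed_fraud)


# Hypothetical test set containing 1,000 truly fraudulent transactions.
tp = 980   # fraud correctly flagged
fn = 20    # fraud that slipped through the checks
fp = 245   # legitimate transactions rejected by mistake

precision, recall = precision_recall(tp, fp, fn)

# Assumed costs: a false alarm loses a little profit, a missed fraud is expensive.
cost = error_cost(fp, fn, cost_per_false_alarm=5.0, cost_per_missed_fraud=500.0)

print(f"precision={precision:.2f} recall={recall:.2f} cost={cost:.0f}")
```

With these assumed costs, the 20 missed frauds account for most of the total even though there are far more false alarms, which is the reasoning behind accepting lower precision in exchange for high recall. In your own answer, replace the assumed costs with figures you can cite.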

