Software

Classifiers

Scenario: The data relate to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (bank term deposit) would be subscribed ('yes') or not ('no').

Launch Orange and load the Bank data.csv file. Check the data; y (bank term deposit subscription) is the target attribute. Using the Select Columns widget, move the month, day_of_week, and duration fields to the left (Ignored columns).

1. Train a decision tree using this data set. In the Tree widget, uncheck the Induce binary tree box. Next, using the Tree Viewer widget, view the decision tree up to a depth of 4 levels. Identify the paths (specify the rules) that have the highest likelihood of a term deposit subscription (take a screenshot of the paths). What is the subscription probability?

2. Build four classifiers for evaluation: Tree (make sure that the Induce binary tree box is still unchecked), Logistic Regression, SVM, and Random Forest (keep the default settings). Connect the classifiers to the Test and Score widget to evaluate their performance. Use Random Sampling with Repeat train/test = 10 and Training set size = 70%. Which of the four classifiers performs best with respect to each of the following evaluation measures: i) AUC, ii) classification accuracy (CA), iii) F1, iv) Precision, and v) Recall? (Take a screenshot of the entire Test and Score window.)

3. Based on the Test and Score results from question 2, generate the confusion matrices for the four classifiers. How many false negatives are there for each classifier? (Take screenshots.) How would you interpret the results as a whole?

4. Based on the Test and Score results from question 2, generate the lift charts for the four classifiers; note that the target class is "yes" (check the Cumulative Gains radio button to generate the lift charts). Which of the four has the highest lift when the company wants to target 30% of the overall customer population? (Take a screenshot.) What is the value of that lift?

5. Examine the results from questions 2, 3, and 4. Which of the four classifiers would you finally select for targeting customers, and why? Make sure to provide a compelling argument for your selection.

Number and store your answers, along with the corresponding screenshots, in a Word document called SW6 ###.docx (where ### is your 3-digit student number); save your Orange workflow file as SW6 ###.ows. Submit both of these files to Canvas.
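
The quantities asked for in questions 3 and 4 (false negatives, cumulative gain, and lift at a 30% targeting fraction) can be sanity-checked by hand. The following is a minimal Python sketch of those definitions, not part of the Orange workflow; the function names `confusion_counts` and `gain_and_lift` and the ten-customer toy data are illustrative assumptions, not values from Bank data.csv.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[str], predicted: Sequence[str],
                     positive: str = "yes") -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), treating `positive` as the target class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1  # a subscriber the model missed
            else:
                tn += 1
    return tp, fp, fn, tn


def gain_and_lift(actual: Sequence[str], scores: Sequence[float],
                  fraction: float, positive: str = "yes") -> Tuple[float, float]:
    """Cumulative gain and lift when contacting the top `fraction` of
    customers, ranked by predicted probability of the positive class."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    n_target = max(1, round(fraction * len(ranked)))
    total_pos = sum(1 for a in actual if a == positive)
    hits = sum(1 for _, a in ranked[:n_target] if a == positive)
    gain = hits / total_pos                   # share of all "yes" captured
    lift = gain / (n_target / len(ranked))    # improvement over random targeting
    return gain, lift


# Hypothetical toy data: 10 customers, 3 of whom subscribed.
actual = ["yes", "no", "yes", "no", "no", "no", "yes", "no", "no", "no"]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
predicted = ["yes" if s >= 0.5 else "no" for s in scores]

tp, fp, fn, tn = confusion_counts(actual, predicted)
gain, lift = gain_and_lift(actual, scores, fraction=0.30)
print(f"false negatives: {fn}")             # false negatives: 1
print(f"gain at 30%: {gain:.3f}")           # gain at 30%: 0.667
print(f"lift at 30%: {lift:.3f}")           # lift at 30%: 2.222
```

In this toy case, contacting the top-scoring 30% of customers reaches two of the three subscribers, so the lift is about 2.2 times what random targeting would achieve. The same reading applies to the chart in question 4: the classifier whose curve is highest at the 30% mark captures the most subscribers for that contact budget.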
