People Analytics

  1. Which of the variables listed above (second paragraph) would you use to define success of sales employees in order to develop your model? (10 points)

  2. The choice of independent variables for your model has to be based on their power to predict the dependent variable AND their availability in job candidates' resumes that you intend to screen. Which of the variables listed above (second paragraph) would you test as predictors of success in your model? (10 points)

  3. Which parameter of the model, i.e., sensitivity, specificity, precision, or accuracy, describes the ability of your model to identify candidates who are actually qualified? Which parameter of the model describes the ability of your model to identify candidates who are actually unqualified? Which parameter describes the ability of your model to correctly predict unknown candidates as being qualified? (5 points)

  4. What are the false positive rate and the false negative rate? Describe what false positive and false negative mean. (10 points)

  5. If your goal is to develop a model to screen resumes and identify candidates to be invited for an interview, which type of error is worse: false positive or false negative? Explain the rationale for your answer. (10 points)

  6. If you want to improve the performance of your model to identify candidates to be invited for an interview, which parameter (sensitivity, specificity, precision, accuracy) would you use to guide the selection of candidates? Explain your rationale based on what you are trying to achieve with your predictions. (10 points)

  7. If your goal is to develop a model to identify candidates who will receive a job offer, which type of error is worse: false positive or false negative? Explain the rationale for your answer. (10 points)

  8. If you want to improve the performance of your model to identify candidates to receive a job offer, which parameter would you use? Explain your rationale based on what you are trying to achieve with your predictions. (10 points)

  9. Fast forward one year. The company deployed the model that you developed with the 3 years of data on current employees, and people were hired based on your predictions. The Head of HR has now come back to you with a concern that not all of the new hires were "good". Twelve of the 100 people hired were not qualified, and did not work out. What parameter in the confusion matrix would you use to understand if your model worked better than you expected, as well as you expected, or worse than you expected? How well did the model work? Do you agree that accuracy is not the best parameter to use? If so, why not? Explain the rationale for your answers. (15 points)

  10. Describe any limitations and/or concerns associated with your approach for this new business opportunity. (10 points)

Sample Solution

   

Choosing Independent Variables

To develop a predictive model for sales employee success, the choice of independent variables should be based on their ability to predict the dependent variable (sales success) and their availability in job candidates' resumes. From the listed variables, the following would be suitable independent variables:

  1. Education: The level of education and relevant field of study can provide insights into a candidate's knowledge and skills related to sales.

  2. Previous Experience: Prior work experience in sales or related fields can indicate a candidate's ability to apply their knowledge and skills in a professional setting.

  3. Skills and Certifications: Possession of relevant sales skills, such as customer relationship management (CRM) software proficiency, and certifications can demonstrate a candidate's commitment to professional development.

  4. Personality Traits: Certain personality traits, such as extroversion, communication skills, and problem-solving abilities, can contribute to success in sales roles.

  5. Achievements and Awards: Recognizing awards, achievements, and successes in previous roles can provide evidence of a candidate's ability to excel in sales.

While these variables could provide valuable predictive insights, it's important to consider their availability in job candidates' resumes. Some information, such as personality traits, may require additional assessment methods beyond resume data.

Model Evaluation Parameters

  • Sensitivity: Sensitivity measures the ability of the model to correctly identify candidates who are actually qualified (True Positive Rate).

  • Specificity: Specificity measures the ability of the model to correctly identify candidates who are actually unqualified (True Negative Rate).

  • Precision: Precision measures the proportion of positive predictions that are actually correct (Precision = True Positives / (True Positives + False Positives)).

  • Accuracy: Accuracy measures the proportion of predictions that are correct (Accuracy = (True Positives + True Negatives) / (Total Population)).
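The four parameters above can all be read off one confusion matrix. The following is a minimal sketch in Python; the function name and the counts passed to it are hypothetical and chosen only for illustration, not taken from the assignment data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the four evaluation parameters from confusion-matrix counts.

    tp: qualified candidates predicted as qualified
    fp: unqualified candidates predicted as qualified
    tn: unqualified candidates predicted as unqualified
    fn: qualified candidates predicted as unqualified
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # share of actually qualified candidates found
        "specificity": tn / (tn + fp),   # share of actually unqualified candidates rejected
        "precision": tp / (tp + fp),     # share of "qualified" predictions that are right
        "accuracy": (tp + tn) / total,   # share of all predictions that are right
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=35, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are conditioned on the candidate's true status, while precision is conditioned on the model's prediction, which is why they answer different questions.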

Error Rates

  • False Positive Rate: The false positive rate (FPR) is the proportion of actually unqualified candidates that the model incorrectly predicts as qualified (False Positive Rate = False Positives / (False Positives + True Negatives)). It equals 1 − Specificity.

  • False Negative Rate: The false negative rate (FNR) is the proportion of actually qualified candidates that the model incorrectly predicts as unqualified (False Negative Rate = False Negatives / (False Negatives + True Positives)). It equals 1 − Sensitivity.

  • False Positive: A false positive occurs when the model predicts that a candidate is qualified when they are actually unqualified.

  • False Negative: A false negative occurs when the model predicts that a candidate is unqualified when they are actually qualified.
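As a rough illustration of the two error rates, the sketch below computes them from confusion-matrix counts and checks that each is the complement of specificity or sensitivity. The function name and the counts are hypothetical.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate) from confusion-matrix counts."""
    fpr = fp / (fp + tn)   # denominator: all actually unqualified candidates
    fnr = fn / (fn + tp)   # denominator: all actually qualified candidates
    return fpr, fnr


# Hypothetical counts, for illustration only.
tp, fp, tn, fn = 40, 10, 35, 15
fpr, fnr = error_rates(tp, fp, tn, fn)

specificity = tn / (tn + fp)
sensitivity = tp / (tp + fn)

print(f"FPR = {fpr:.3f}, 1 - specificity = {1 - specificity:.3f}")
print(f"FNR = {fnr:.3f}, 1 - sensitivity = {1 - sensitivity:.3f}")
```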

Error Severity in Resume Screening

In the context of resume screening, a false negative error (missing a qualified candidate) is generally considered more detrimental than a false positive error (inviting an unqualified candidate for an interview). This is because the cost of missing a qualified candidate is higher than the cost of interviewing an unqualified candidate.

A false negative error means that a potentially successful candidate may not get the opportunity to showcase their skills and experience, leading to a loss of talent for the company. On the other hand, a false positive error can be rectified through the interview process, where the candidate's true qualifications can be assessed.

Parameter Guidance for Resume Screening

For resume screening, sensitivity is the parameter to prioritize. A model with high sensitivity flags as many of the truly qualified candidates as possible, reducing the risk of missing top talent. This fits the objective of resume screening, which is to pass a wide pool of plausible candidates on to the interview stage, where any false positives can still be filtered out.

Error Severity in Job Offer Decisions

When making job offer decisions, a false positive error (extending an offer to an unqualified candidate) is more severe than a false negative error (rejecting a qualified candidate). This is because hiring an unqualified candidate can lead to performance issues, increased training costs, and potential turnover.

A false positive error in job offer decisions can have significant financial and reputational consequences for the company. Investing time and resources in training and onboarding an unqualified employee can prove costly and detrimental to productivity. Additionally, hiring an unqualified individual can damage the company's reputation and erode employee morale.

Parameter Guidance for Job Offer Decisions

In job offer decisions, precision is the parameter to prioritize. High precision means that most of the candidates the model predicts as qualified truly are qualified, which minimizes the risk of hiring an unqualified individual. This aligns with the goal of making hiring decisions that lead to high-performing employees.
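One common way to trade sensitivity against precision is to move the score cutoff above which a candidate is predicted as qualified. The sketch below assumes the model outputs a score between 0 and 1; the scores, labels, and cutoffs are all made up for illustration and do not come from the assignment.

```python
# Hypothetical model scores and true outcomes (1 = qualified, 0 = unqualified).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]


def sensitivity_and_precision(threshold):
    """Classify scores >= threshold as qualified and return (sensitivity, precision)."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    sensitivity = tp / (tp + fn)
    # Precision is undefined when nobody is predicted as qualified.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# A lenient cutoff favors sensitivity (interview screening);
# a strict cutoff favors precision (job offers).
low = sensitivity_and_precision(0.35)
high = sensitivity_and_precision(0.75)
print("lenient cutoff:", low)
print("strict cutoff:", high)
```

With these made-up numbers the lenient cutoff finds every qualified candidate at the cost of some false positives, while the strict cutoff produces no false positives but misses some qualified candidates.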

Model Evaluation after Deployment

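Of the 100 people hired, 88 were qualified (true positives) and 12 were not (false positives). These are the only two cells of the confusion matrix that can be observed after deployment: candidates the model rejected were never hired, so the true negatives and false negatives are unknown. Accuracy therefore cannot be computed from the post-deployment data, and even if it could, it would not isolate how often the model's "qualified" predictions are correct.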
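As a quick arithmetic check on the figures stated in the question (100 hires, 12 of them unqualified), the sketch below computes the share of hires who turned out to be qualified. Variable names are illustrative. Whether this is better or worse than expected depends on the precision measured on the original 3 years of employee data, a value the source does not give, so no comparison is attempted here.

```python
hired = 100                 # everyone hired was predicted "qualified" by the model
unqualified_hires = 12      # false positives, from the question
qualified_hires = hired - unqualified_hires   # true positives

# Precision = TP / (TP + FP); computable because both cells are observed.
observed_precision = qualified_hires / (qualified_hires + unqualified_hires)
print(f"observed precision: {observed_precision:.2f}")

# Accuracy = (TP + TN) / (TP + TN + FP + FN) would also need the outcomes of
# rejected candidates (TN and FN), which are not observed after deployment.
```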

Using precision as a more relevant metric, we see that the model's precision is 88% [(True Positives)/(True Positives + False Positives)], indicating that 88% of the candidates predicted as qualified were actually qualified. This suggests that the model
