Machine Learning – Collaborative Filtering
Use collaborative filtering techniques to predict which political party voters who have not yet been polled will vote for in an upcoming election.
Assume that we have a large data store of voters with many attributes for each voter. The attributes include age_group (with values of young, middle, old),
gender, income_bracket (with values of under_50K, 50_150K, 150_300K, over_300K), marital_status, number_of_children, profession (with many different
values), education_level (with values of no_high_school, high_school, bachelors, masters, doctor), number_of_automobiles, political_party, and state. Also
assume that many voters have already been polled and that the party each of them said they would vote for is stored in the data store as well.
a. Design a schema for a structured cloud table such as Accumulo to represent this data.
b. Write pseudocode for a function called VoterSimilarity() that determines the similarity of two voters, with the signature:
UserSimilarity similarity = VoterSimilarity(voterA, voterB);
Assume that facts such as gender or state, whose values are either identical or totally different, have a similarity value of either 0 or 1. Assume that attributes that have values over …
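
As an illustration of the 0-or-1 rule for facts such as gender or state, here is a minimal Python sketch. It is not a full answer to part b. Only the 0-or-1 rule comes from the text above. The names categorical_similarity and voter_similarity, the dictionary representation of a voter, the treatment of missing values as a mismatch, and the averaging of per-attribute scores are all assumptions made for this example. Attributes covered by the truncated rule are deliberately left out.

```python
# Hypothetical sketch: only the 0-or-1 rule for facts such as gender or state
# comes from the assignment text. The attribute list, the handling of missing
# values, and the averaging step are assumptions made for illustration.
CATEGORICAL_ATTRIBUTES = ("gender", "state")


def categorical_similarity(value_a, value_b):
    """Return 1.0 when both values are present and equal, otherwise 0.0."""
    if value_a is None or value_b is None:
        # Assumption: a missing value counts as a mismatch.
        return 0.0
    return 1.0 if value_a == value_b else 0.0


def voter_similarity(voter_a, voter_b, attributes=CATEGORICAL_ATTRIBUTES):
    """Average the per-attribute 0/1 scores (averaging is an assumption)."""
    if not attributes:
        return 0.0
    scores = [
        categorical_similarity(voter_a.get(name), voter_b.get(name))
        for name in attributes
    ]
    return sum(scores) / len(scores)


# Two made-up voters: same gender, different state.
voter_a = {"gender": "female", "state": "OH"}
voter_b = {"gender": "female", "state": "PA"}
print(voter_similarity(voter_a, voter_b))  # 0.5
```

With these two made-up voters, one of the two compared attributes matches, so the averaged score is 0.5. A weighted sum or a different combination rule would fit the stated 0-or-1 assumption equally well.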