(a) Assume each user is represented by the scores that user gave to all of these movies, with unrated movies set to 0. The number of features therefore equals the total number of movies (17,770) in the training set. Which similarity (or distance) measure would you use to cluster the users? Select one of “Simple Matching Coefficient”, “Jaccard”, “Cosine”, and “Euclidean distance”, and explain why. (Please limit your answer to 30 words.)
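To get a feel for how the four candidate measures behave on sparse rating vectors, the sketch below computes each of them on a small toy matrix. The data, the function names, and the choice to binarize ratings into rated/unrated for the Simple Matching Coefficient and Jaccard are all assumptions made for illustration; they are not part of the original question.

```python
import numpy as np

# Hypothetical toy data: 3 users x 8 movies, 0 means "not rated".
ratings = np.array([
    [5, 0, 0, 4, 0, 0, 0, 0],
    [4, 0, 0, 5, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0],
], dtype=float)


def smc(a, b):
    """Simple Matching Coefficient on rated/unrated indicators (assumed binarization)."""
    x, y = a > 0, b > 0
    return float(np.mean(x == y))


def jaccard(a, b):
    """Jaccard similarity on rated/unrated indicators; ignores 0-0 matches."""
    x, y = a > 0, b > 0
    union = np.sum(x | y)
    return float(np.sum(x & y) / union) if union else 0.0


def cosine(a, b):
    """Cosine similarity on the raw rating vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def euclidean(a, b):
    """Euclidean distance on the raw rating vectors (smaller means closer)."""
    return float(np.linalg.norm(a - b))


for i, j in [(0, 1), (0, 2)]:
    a, b = ratings[i], ratings[j]
    print(f"users {i},{j}: SMC={smc(a, b):.3f} Jaccard={jaccard(a, b):.3f} "
          f"cosine={cosine(a, b):.3f} euclidean={euclidean(a, b):.3f}")
```

Users 0 and 2 share no rated movies, yet their Simple Matching Coefficient is still 0.5 because the two vectors agree on the four movies neither user rated. With 17,770 features and most entries equal to 0, such shared zeros would make up nearly all of the comparison.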
1. The plot below displays a set of 2D data points that are uniformly distributed. Suppose you want to use K-Means to cluster these points into three groups. Please select the final cluster boundaries that you anticipate K-Means may produce.
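One way to explore this question empirically is to run K-Means on synthetic uniform data and inspect the resulting partition. The following is a minimal sketch of Lloyd's algorithm using NumPy only; the unit-square data, the sample size, the seeds, and the `kmeans` helper are assumptions for illustration and do not reproduce the plot from the question.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm: returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if the cluster is empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Hypothetical stand-in for the plotted data: 600 points uniform on the unit square.
data_rng = np.random.default_rng(42)
points = data_rng.uniform(0.0, 1.0, size=(600, 2))

centers, labels = kmeans(points, k=3)
print("cluster sizes:", np.bincount(labels, minlength=3))
print("centers:\n", centers)
```

Because each point is assigned to its nearest center, the boundary between any two clusters is a straight line segment (the perpendicular bisector between their centers). Different seeds can yield different partitions of the same uniform data.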