For the paper “Improving MapReduce Performance in Heterogeneous Environments”, answer the following questions about the experimental comparison of the Longest Approximate Time to End (LATE) scheduler and the original Hadoop scheduler. The following graphs are from the paper. They show the completion time of the MapReduce application for three versions of the scheduling algorithm (No Backups, Hadoop Native, and LATE), normalized to the Hadoop Native case. Explain the following: (1) Why does the ‘No Backups’ case perform so much better when there are no stragglers than when there are stragglers? (2) Why does the LATE scheduler perform better than Hadoop Native in both cases? That is, what are the main problems of the native Hadoop scheduler that the LATE scheduler addresses?