Top-two-box scores include responses to the two most favorable response options. For example, the 11-point Net Promoter question "How likely are you to recommend this product to a friend?" has top-two boxes of 9 and 10. This applies to standard Likert item options (strongly disagree to strongly agree) as well as to other response options, such as those ranging from "definitely will not purchase" to "definitely will purchase." While top-two-box scores are the most common, researchers may in some cases choose to use top-three-box or even top-four-box scores. The idea behind this practice is that you count only those respondents who express a strong attitude toward a statement.

Top-box and top-two-box scoring systems have the benefit of simplicity, but at the cost of losing information. Top-box scoring has its place for quickly assessing results, especially in stand-alone studies where there is no meaningful comparison or benchmark. When you go from 7 response options to 2, or from 11 to 2, a response of 1 becomes the same thing as a 5. Losing precision and variability means it's harder to track improvements.

The average rating on the website before the changes was 6.12 with a standard deviation of 2.71 (those are actual scores from 42 users).

Table 1: 42 actual responses to the question "How likely are you to recommend to a friend?"

The only difference here is a reduction in the number of respondents who completely hated the website (0's, 2's and 3's) and three 9's that changed to 10's. Yet the top-two-box score is still 19% and the Net Promoter Score is still -33%.

Even rather large changes can be masked when rating scale data with many options is reduced to two or three options. Only measurement purists will take issue with this practice.
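To make the masking effect concrete, here is a minimal Python sketch of how top-box shares and a Net Promoter Score might be computed from raw 0-to-10 responses. The function names and both data sets are hypothetical and made up for illustration; they are not the article's 42 responses. The sketch assumes the usual Net Promoter convention that 9's and 10's are promoters and 0's through 6's are detractors.

```python
from statistics import mean


def top_box_share(responses, boxes=2, scale_max=10):
    """Fraction of responses in the `boxes` most favorable options."""
    cutoff = scale_max - boxes + 1  # e.g. 9 for top-two-box on a 0-10 scale
    return sum(r >= cutoff for r in responses) / len(responses)


def net_promoter_score(responses):
    """Percent promoters (9-10) minus percent detractors (0-6), 0-10 scale."""
    promoters = sum(r >= 9 for r in responses)
    detractors = sum(r <= 6 for r in responses)
    return 100 * (promoters - detractors) / len(responses)


# Made-up illustrative data, NOT the responses from the article.
before = [0, 2, 3, 5, 6, 7, 8, 9, 9, 10]
# The lowest scores move up but stay at 6 or below; one 9 becomes a 10.
after = [4, 5, 5, 6, 6, 7, 8, 9, 10, 10]

for label, data in (("before", before), ("after", after)):
    print(
        label,
        "mean:", round(mean(data), 2),
        "top-two-box:", top_box_share(data),
        "NPS:", net_promoter_score(data),
    )
```

With these made-up numbers the mean rises from 5.9 to 7.0, while the top-two-box share stays at 30% and the Net Promoter Score stays at -20. The improvement happens entirely below the cutoff, so the collapsed scores cannot see it.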
In the absence of any benchmark or historical data, researchers and managers look at so-called top-box and top-two-box scores (boxes refer to the response options). Top-two-box scoring is popular for rating scales with between 7 and 11 points.

0, 0, 0, 2, 3, 3, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6.