Controversial in the Community

 
Author nfmungard
Partaker
#1 | Posted: 14 Feb 2019 13:41 
With the new statistics pages you can track which sites have a large variance and standard deviation in their ratings.

https://www.worldheritagesite.org/community/rating+stats

A high value in the standard deviation column indicates a large variance in ratings. The first entries are high primarily because they have few votes, but a site like Tarnowitz, ranging from 0.5 to 5.0 across 13 votes, is indicative of a genuinely wide array of opinions.

https://www.worldheritagesite.org/list/Tarnowskie+G%C3%B3ry+Lead-Silver+Mine

The outliers column shows the single vote that deviates most strongly beyond the standard deviation, i.e. the maximum of abs(avg - vote) - std deviation.

Dubrovnik has taken the lead due to a single 0.5 rating (2.85 below avg - std).
https://www.worldheritagesite.org/list/Dubrovnik

Yosemite is a close second with another 0.5 rating (2.78 below avg - std).
https://www.worldheritagesite.org/list/Yosemite+National+Park
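
For illustration, a minimal Python sketch of how the outlier value can be computed (the ratings list is made up, and this is not the site's actual code):

from statistics import mean, stdev

# hypothetical ratings for one site, in half-star steps from 0.5 to 5.0
ratings = [4.5, 4.0, 4.5, 5.0, 4.0, 4.5, 0.5]

avg = mean(ratings)
std = stdev(ratings)

# largest deviation of a single vote beyond one standard deviation,
# i.e. abs(avg - vote) - std, as described above
outlier = max(abs(avg - vote) - std for vote in ratings)
print(round(outlier, 2))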

Now, I am open-minded about disagreeing on the Bauhaus. And if a site has a large variance like Tarnowitz, so be it, even though there seems to be a clear national bias reflected in the higher ratings. But neither Yosemite nor Dubrovnik deserves 0.5 by any objective standard.

So to start the discussion:
* Should we remove outliers?
* Do we need to define more clearly what ratings mean and how they should be used? E.g. the temporary closure of Yosemite as a consequence of the 2019 shutdown is annoying, but that frustration shouldn't be applied to the rating. Meanwhile, if a site like Stoclet simply is not accessible, I think the argument is different.

What do you think?

Author meltwaterfalls
Partaker
#2 | Posted: 14 Feb 2019 14:00 | Edited by: meltwaterfalls 
Personally I'm in favour of keeping them in. The outliers can reveal a lot about a site or reviewer, and there is a chance that by removing them we take away some of the quirks of our membership.

The outliers will mostly be negated by the other votes anyway.

The only potential exceptions could be:

1. If there looks to be a concerted effort to dramatically affect a site's overall rating, e.g. a group of reviewers intentionally voting a site up or down
2. Anyone that gives the Bauhaus less than 4 stars should be discounted:)

I don't think 1 will really happen, so the effect of outliers will mostly be marginal.
And 2 is just common sense.

Author Zoe
Partaker
#3 | Posted: 14 Feb 2019 15:21 
I don't think removing outliers will be useful long term, but for sites with few votes this would make sense.
Some websites don't actually show a rating with fewer than e.g. 5 votes anyway. I voted on a WHS I found very poor, and as I was the only voter so far, it now sits in the bottom 10.

Unfortunately, visits are very different for everyone. E.g. Clyde visited the Roman buildings while they staged a festival, I had a horrible time when the Darjeeling train derailed, and Els visited some town in Spain when it was brutally hot. The influence is just there. Plus, I think some people take more time and/or have a guide, and thus get more out of a site and can rate it better, but those reviews shouldn't weigh more.

Author nfmungard
Partaker
#4 | Posted: 14 Feb 2019 18:50 
meltwaterfalls:
2. Anyone that gives the Bauhaus less than 4 stars should be discounted:)

Especially the one in Weimar...

meltwaterfalls:
The outliers will mostly be negated by the other votes anyway.

My goal was not to prohibit or delete votes, only to exclude stark outliers when calculating the score. The votes and the average would remain the same, but the score would be calculated as the average excluding votes that deviate by more than 2 standard deviations, minus the offset based on all votes.
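
Roughly, a Python sketch of that idea (assuming the ratings are available as a plain list; the small-sample offset is left out):

from statistics import mean, stdev

def trimmed_score_base(ratings, k=2.0):
    # average after dropping votes more than k standard deviations from the mean;
    # the real score would additionally subtract the small-sample offset
    if len(ratings) < 3:
        return mean(ratings)
    avg, std = mean(ratings), stdev(ratings)
    kept = [r for r in ratings if abs(r - avg) <= k * std]
    return mean(kept) if kept else avg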

Zoe:
I don't think removing outliers will be useful long term, but for sites with few votes this would make sense.
Some websites don't actually show a rating with fewer than e.g. 5 votes anyway.

Our sample size is rather small and will likely stay that way (<1000). Also note that with small sample sizes we already apply a negative weight to the score. E.g. Kamchatka would lead if it weren't for this.

Author paul
Partaker
#5 | Posted: 15 Feb 2019 03:02 
Nerd alert!

A few suggestions on ratings:

You could try using the median and the median absolute deviation to measure central tendency & dispersion. These are more robust against outliers.
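
A quick illustrative sketch in Python (not the site's code):

from statistics import median

def median_and_mad(ratings):
    # median absolute deviation: robust counterparts of mean and std deviation
    med = median(ratings)
    mad = median([abs(r - med) for r in ratings])
    return med, mad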

You are currently using an 11-point uni-polar interval scale for rating. Most rating scales use 5 or (less frequently) 7 points. The more points, the less accurate your rating will be, due to effects such as central tendency bias. Using half stars is also much more difficult for a respondent to mentally process (really).

Because the population is small you might try using a bayesian average.
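
A simple form of this pulls each site's average towards a prior such as the all-site mean; the prior values below are purely illustrative:

def bayesian_average(ratings, prior_mean=2.75, prior_weight=5):
    # small samples are pulled towards prior_mean; large samples dominate it
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)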

You could try positively weighting votes from "experts"; this is quite often done in rating scales!

You could try reducing the precision of your ratings, perhaps to 0.5, and then ordering using the "certainty" or simply the number of votes.

You could try forcing an objective and a subjective rating - both are valid. "Rate the site" and "Rate your visit", much like meltwaterfalls does in his reviews. This gives interesting insights - often respondents give ratings based on what they think they should think.

Finally you could be more explicit about what is being rated.

Author GaryArndt
Partaker
#6 | Posted: 15 Feb 2019 23:42 
Throwing out highs/lows is a common practice in many scoring systems for precisely this reason.

Author nfmungard
Partaker
#7 | Posted: 16 Feb 2019 05:26 
GaryArndt:
Throwing out highs/lows is a common practice in many scoring systems for precisely this reason.

The only question is when a value counts as too high/low. I would say std deviation + 1.0 rating point. As a consequence, 2-3 of my ratings would be eliminated (Bern and Salzburg), but that seems fine.
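
As a sketch of that cut-off (Python, illustrative only):

from statistics import mean, stdev

def drop_outliers(ratings, margin=1.0):
    # drop votes farther than (std deviation + margin) from the average
    if len(ratings) < 3:
        return list(ratings)
    avg, std = mean(ratings), stdev(ratings)
    return [r for r in ratings if abs(r - avg) <= std + margin]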

paul:
Nerd alert!

Not a problem with me.

paul:
You could try using the median and the median absolute deviation to measure central tendency & dispersion. These are more robust against outliers.

Not possible out of the box via SQL functions. I do think this could be interesting and need to check.

paul:
You are currently using an 11-point uni-polar interval scale for rating. Most rating scales use 5 or (less frequently) 7 points. The more points, the less accurate your rating will be, due to effects such as central tendency bias. Using half stars is also much more difficult for a respondent to mentally process (really).

Well... For me I see the following scale:
1* 0.5 -> Should not be on the list.
2* 1.0 -> Pretty miserable.
3* 1.5-2.0 -> Below standard
4* 2.5 -> Average
5* 3.0-3.5 -> Above Average
6* 4.0 -> Good
7* 4.5 -> Exceptional
8* 5.0 -> World Wonder

Maybe one could group 2.5 with 3.0, and 3.5 with 4.0, to get to a seven-point scale. What does anyone else think? I agree that 11 points seems a bit too much, especially in the middle.

paul:
Because the population is small you might try using a bayesian average.

We already apply a Wilson score lower bound to the average (at 25%) to compute the score. I think this is covered. I was also thinking about normalizing each voter and awarding points based on that.
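
For reference, one common way to adapt the Wilson lower bound to star ratings is to map them onto 0..1 first. The sketch below only approximates the idea; the z value used for "at 25%" is an assumption, not the site's exact formula:

from math import sqrt
from statistics import mean

def wilson_lower_bound(ratings, z=0.674):
    # z = 0.674 is roughly a one-sided 75% confidence level (25th percentile)
    n = len(ratings)
    if n == 0:
        return 0.0
    p = (mean(ratings) - 0.5) / 4.5  # map 0.5..5.0 ratings onto 0..1
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / denom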

paul:
You could try positively weighting votes from "experts", this is quite often used in rating scales!

Weighting per visited sites and reviews would be fun. I will see how I can make that happen. Maybe give one vote per 100 visited sites and per 50 reviews written?
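
A sketch of what that weighting could look like (the data layout and the base weight of 1 per voter are assumptions for illustration):

def weighted_average(votes):
    # votes: list of (rating, visited_count, review_count) tuples (hypothetical layout)
    # every voter gets weight 1, plus one extra per 100 visits and per 50 reviews
    total = weight_sum = 0.0
    for rating, visited, reviews in votes:
        w = 1 + visited // 100 + reviews // 50
        total += w * rating
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0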

paul:
You could try forcing an objective and a subjective rating - both are valid. "Rate the site" and "Rate your visit", much like meltwaterfalls does in his reviews. This gives interesting insights - often respondents give ratings based on what they think they should think.

I think it should stay with one scale and be "Rate the site", not "Rate the Visit".

paul:
Finally you could be more explicit about what is being rated.

That's why we are having this discussion ;)

Author elsslots
Admin
#8 | Posted: 16 Feb 2019 06:36 
nfmungard:
Well... For me I see the following scale:
1* 0.5 -> Should not be on the list.
2* 1.0 -> Pretty miserable.
3* 1.5-2.0 -> Below standard
4* 2.5 -> Average
5* 3.0-3.5 -> Above Average
6* 4.0 -> Good
7* 4.5 -> Exceptional
8* 5.0 -> World Wonder

Maybe one could group 2.5 with 3.0, and 3.5 with 4.0, to get to a seven-point scale. What does anyone else think? I agree that 11 points seems a bit too much, especially in the middle.

I think 1 - 5 is pretty intuitive, but I do not mind the half stars being available. Let's not make it too complicated.

nfmungard:
I think it should stay with one scale and be "Rate the site", not "Rate the Visit".

Agree. It was meant as "Rate the site" (which will always be influenced by the circumstances of the visit, as voters will not do the whole AB evaluation again but give a score based on their own impression, whether they did their homework or not).

Author meltwaterfalls
Partaker
#9 | Posted: 18 Feb 2019 10:53 
Whilst appreciating the nerdistry (this is the reason I feel at home on this website), I think the five-star system is intuitive, and I personally prefer having the nuance of the half stars.

I could live without it, but I think it worthwhile.

I speak as someone who previously used to rate my music collection on a 0-10 scale to 2 decimal places, so the difference between a 4.5 and a 5 is something I feel very at home with, and it can be revealing :)

Author nfmungard
Partaker
#10 | Posted: 18 Feb 2019 16:25 
My proposal: I will try to compute a score with outliers removed, and then we can compare the impact and decide whether it's worthwhile to implement.
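
Something along these lines, with made-up ratings just to show the comparison:

from statistics import mean, stdev

site_ratings = {
    "Dubrovnik": [4.5, 4.0, 4.5, 5.0, 4.5, 0.5],
    "Yosemite National Park": [5.0, 4.5, 5.0, 4.0, 4.5, 0.5],
}

for site, ratings in site_ratings.items():
    avg, std = mean(ratings), stdev(ratings)
    kept = [r for r in ratings if abs(r - avg) <= std + 1.0]  # cut-off proposed in #7
    print(f"{site}: all votes {avg:.2f}, outliers removed {mean(kept):.2f}")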
