Flaming_Spinach
Feature Editor
Article Name: Designing a Fair Rating System: Part 1
Author: Flaming_Spinach
Artist: Assiram41
Date: 4/19/2007
(Note: I make a few assumptions in this thread, based on the educated guess that 2000 will be the approximate cutoff rating for getting into Worlds this year, along with the findings in this thread, which say it takes at least 75 matches to reach that rating under the current system.)
PUI currently supports the use of the ELO Rating System to send people to Worlds. Although ELO is a relatively fair system, it does have its downsides. If this system is to be used next year, it is important that it is fair for everyone.
I have decided to divide this article into 2 parts. The first will be about changes that could be made to the algorithms and process of calculating ratings. The second will be about making sure everyone has a fair chance of winning an invite by Rating.
Let’s get into it.
The first thing we need to do is look at various rating systems, so we can get an idea of the shortcomings of the current system, and in what ways those shortcomings can be fixed.
1. Yearly ELO
Let’s start with the system we already have. This is a variant of the orthodox ELO, where everyone’s rating is reset to 1600 at the end of each season, mostly to prevent any kind of rating sitting.
BENEFITS:
-Rating sitting is virtually non-existent.
-ELO is approved by a vast number of gaming administrative bodies.
-Everyone starts off on equal footing at the beginning of the year.
PROBLEMS:
-People in areas with fewer events are at a massive disadvantage.
-It puts too much pressure on players, who can’t afford to have a bad tournament, as their rating would be ruined.
-Good players can lose to bad players on luck.
-Ratings in this system are usually not an accurate measure of a person’s skill for very long, as once a person’s rating gets high enough, it is reset to 1600.
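For reference, the ELO update behind this system can be sketched as below. The expected-score formula is the standard ELO one; the K-value of 32 and the function names are assumptions for illustration, not the values actually used at Pokemon events.

```python
START_RATING = 1600  # every player is reset to this at the end of each season


def expected_score(rating: float, opponent: float) -> float:
    """Standard ELO expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def update(winner: float, loser: float, k: float = 32) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one match.

    k = 32 is an assumed placeholder, not an official event K-value.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain
```

Under these assumptions, two evenly rated players exchange 16 points, while a 1600-rated player who beats a 2000-rated one takes about 29 points from them.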
2. Lifetime ELO
Very closely related to the one above. In this variation, ratings are not reset after each year, so they have a real ability to grow and become a true representation of a person’s skill.
BENEFITS:
-Ratings become highly accurate after a time.
-People in areas with fewer events can still compete.
-Every game means slightly less.
PROBLEMS:
-People in areas with fewer events need to work on their rating for a longer period of time.
-Rating sitting would be rampant. (Although, measures can be taken against this.)
-Players who are new to the game have no chance of winning a trip to Worlds until 2 or 3 years after they start.
3. The Glicko System
Glicko is a take-off on ELO, which attempts to calculate the standard deviation of a person’s rating as a function of activity.
BENEFITS:
-A measure of activity helps prevent any rating-sitting.
PROBLEMS:
-The system is FAR too complicated for a game like Pokemon.
-There are other, simpler ways of achieving the same end.
4. Masterpoints
This system is used by various Bridge associations, as well as Magic: The Gathering, to some degree. Masterpoints (Magic refers to them as Pro Points) is a system where everyone starts at 0, and can gain points by winning or placing well at certain events. A player’s score cannot decrease in this system; it only goes up.
BENEFITS:
-Score is determined by final placement in an event.
-Luck plays less of a factor, as 1 bad loss does not ruin your rating.
-Can easily be named ‘Power Points,’ for video-game tie-ins.
PROBLEMS:
-People in areas with fewer events are at a massive disadvantage. (More so than with the current system.)
-Unless they are reset after each year, new players will never have a chance.
5. ELO plus Masterpoints
I am sure that this has been attempted by some Organized Play system somewhere, but I can’t find any evidence of it. This is basically a system where you gain and lose points as usual from each match, but unlike ELO, at the end of each event, the high placers get ‘bonus points’ added onto their ratings. This helps cut down on the effects of bad luck, and favors people on how well they finish in a tournament, not necessarily how well they perform throughout the day.
BENEFITS:
-Those who place the highest will almost always win the most points.
-A single loss on bad luck in Swiss means less.
PROBLEMS:
-Ratings become non-zero-sum, so an error in calculation could go unnoticed for the entire season.
-People who win events usually win a large number of points anyway. This system caters to the minority.
-People in areas with fewer events are at a massive disadvantage.
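A minimal sketch of the bonus step, applied after the normal match-by-match ELO updates for an event are done. The bonus table, its values, and the function name are all hypothetical; nothing here comes from an existing Organized Play system.

```python
# Hypothetical bonus table: final placement -> bonus points (assumed values).
PLACEMENT_BONUS = {1: 20, 2: 12, 3: 8, 4: 8}


def apply_bonuses(ratings: dict[str, float], standings: list[str]) -> dict[str, float]:
    """Return new ratings with placement bonuses added after an event.

    `standings` lists player names from first place down.
    """
    updated = dict(ratings)  # copy, so the caller's ratings are left untouched
    for place, player in enumerate(standings, start=1):
        updated[player] += PLACEMENT_BONUS.get(place, 0)
    return updated
```

Because the bonuses are added without being taken from anyone, the total of all ratings grows by the sum of the bonuses awarded, which is the non-zero-sum problem listed above.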
6. ELO with a differently determined K-value
Simply put, the K-value of a match depends on the player’s rating, not the event at which it occurred: the higher a person’s rating gets, the lower the K-value of each of their matches becomes. The United States Chess Federation and FIDE both use this system.
BENEFITS:
-A higher K-value for low-ranked players allows faster growth for new players.
-A lower K-value for high-ranked players reduces the overall impact of bad luck, and makes ratings more stable.
PROBLEMS:
-Finding a fair way to determine what the K-value scale should be.
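To make the idea concrete, here is one possible shape for such a scale. The rating floors and K-values are invented for illustration and are not the figures used by the USCF, FIDE, or Pokemon Organized Play.

```python
# Hypothetical K-value scale: (minimum rating, K). Thresholds are assumptions.
K_SCALE = [(2000, 16), (1800, 24), (0, 32)]


def k_for_rating(rating: float) -> int:
    """Return the K-value for a player: the higher the rating, the lower the K."""
    for floor, k in K_SCALE:
        if rating >= floor:
            return k
    return K_SCALE[-1][1]  # ratings below every floor use the highest K
```

With this scale, a player at the 1600 starting rating plays at K = 32, while a player who has reached 2000 plays at K = 16.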
Those are the 6 basic options we have at this point in time. None of the systems is perfect, but each has its own individual strengths and weaknesses.
I think we should (obviously) focus on the key weaknesses of our current system. There are basically 2 major problems with the Pokemon ELO system right now. Those are:
A. Yearly Ratings. Ratings that reset each year leave players starting over from scratch all the time. Any player with an ‘actual’ rating of 2000 needs to play in a minimum of 75 matches JUST to get their rating back to where it belongs. Along with this, anyone who can’t hit that magic 75 matches mark simply CANNOT get a rating of 2000; and if they have some bad luck somewhere, it could take a lot more than 75 matches. Simply put, anyone in an area with fewer tournaments doesn’t stand a chance of getting to Worlds because they physically cannot get their rating high enough. We must find a way to fix this.
B. High-rated players losing on bad luck. Let’s face it, anyone can lose on a no-energy start. Losing on bad luck can completely ruin a tournament (rating-wise), and 2-3 ruined tournaments can ruin your whole season in a high-activity area. In a low-activity area, 1 ruined tournament can ruin a whole season. This essentially comes down to one thing: the K-values at some of the events this year are just too high. At some point, the amount a player stands to lose on bad luck just gets ridiculously high, and saying, ‘you got a no-energy start; minus 30 points’ can be disheartening to even the most experienced player.
After taking all possibilities into account, and consulting some friends (who are knowledgeable about both the Pokemon TCG and ELO) for their opinions, I am ready to make the following 2 suggestions for designing a fair rating system for the Pokemon TCG.
Here are my proposed fixes for both of those problems.
Problem: Yearly ratings
Fix: Bi-yearly ratings.
Very simple. You get to work on your rating for 2 years. This should let virtually everyone have equal chances of getting a high enough rating to qualify for Worlds.
Now the technicalities of it all: Each person would have 2 ratings. The first is the compound rating that was started last year, and is continuing to be worked on this year. The person’s second rating is their rating for this year, which starts at 1600. The compound rating is the one used to determine who goes to Worlds. At the end of the season, the person’s compound rating is retired, and their yearly rating turns into their new compound rating. They also get a new yearly rating, which starts out at 1600 again.
To put it another way, every year you start a new rating. That rating only matures (as far as Worlds invites are concerned) after 2 years.
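The bookkeeping for the two ratings could look something like this. The class and method names are hypothetical, and applying the same point change to both ratings is a simplifying assumption: in practice each rating would get its own ELO calculation.

```python
from dataclasses import dataclass

START_RATING = 1600.0


@dataclass
class PlayerRatings:
    compound: float = START_RATING  # started last season; decides Worlds invites
    yearly: float = START_RATING    # started this season

    def apply_match(self, delta: float) -> None:
        """Apply one match result to both ratings.

        Simplifying assumption: both ratings move by the same delta.
        """
        self.compound += delta
        self.yearly += delta

    def end_season(self) -> None:
        """Retire the compound rating and promote the yearly one."""
        self.compound = self.yearly
        self.yearly = START_RATING
```

A player who carries a compound rating of 1750 into the season and gains 12 points ends up at 1762 compound and 1612 yearly. After `end_season()`, the 1612 becomes the new compound rating and the yearly rating starts again at 1600.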
There are many other benefits a system like this has, including:
- Less emphasis on each and every game.
- More chances to get your rating up.
- Ratings would be far more accurate.
- New players who are working from behind in their first season would be on equal footing with everyone in their second season.
- If someone sits on a good compound rating, they may effectively ruin their chances for the next year, by falling too far behind the rest of the pack. (i.e., rating sitting is possible, but a bad idea.)
- Yearly rating would make the best tie-breaker imaginable.
- Consistent play is rewarded.
- Anyone who enters the season late still has a chance.
Problem: K-values at some events too high
Fix: Compound K-values
Another very simple one. The K-value used in each match is a combination of the K-value for the event, and the K-value determined by the players’ ratings.
For example:
K[sub]P[/sub] x K[sub]E[/sub] = K[sub]F[/sub]
Where:
K[sub]P[/sub] = The K-value based on the players’ ratings.
K[sub]E[/sub] = The K-value based on the event.
K[sub]F[/sub] = The final K-value used to determine ratings.
K[sub]E[/sub] would still be determined the same way as it is now, solely on the level of premier event that the match takes place at.
K[sub]P[/sub] would (imo) be determined by the rating of the higher-rated player in the match, since they are the one who stands to lose more. Determining it on the average rating, or difference in ratings, would do the exact opposite of what ELO attempts to accomplish. Basing it off of the lower-rated player’s rating would not accomplish the primary goal. The only option that makes sense is to base K[sub]P[/sub] on the higher-rated player’s rating.
This has more than just the effect of offering protection against luck for high-rated players. There is a whole list of all the benefits this change would bring:
- A higher K-value for low-rated players means that good players can rise to the top quickly at the beginning of the season.
- A lower K-value for higher-rated players prevents them from losing too many points on bad luck against a low-rated opponent.
- High-rated players will also have their ratings rise slower, preventing one person from sky-rocketing after a single excellent tournament.
- Anyone who enters the season late still has a chance.
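Here is a rough sketch of how the compound K-value could be computed. No numeric scales are specified above, so everything numeric is an assumption: K_P is treated as a multiplier between 0 and 1, and the event names, event K-values, and rating floors are invented for illustration.

```python
# Hypothetical scales, invented for illustration only.
EVENT_K = {"local": 16, "regional": 32, "national": 40}  # K_E by event level
PLAYER_K = [(2000, 0.5), (1800, 0.75), (0, 1.0)]         # K_P by rating floor


def player_k(rating_a: float, rating_b: float) -> float:
    """K_P, based on the higher-rated player in the match."""
    top = max(rating_a, rating_b)
    for floor, factor in PLAYER_K:
        if top >= floor:
            return factor
    return 1.0  # ratings below every floor get the full event K


def final_k(rating_a: float, rating_b: float, event: str) -> float:
    """K_F = K_P x K_E."""
    return player_k(rating_a, rating_b) * EVENT_K[event]
```

With these made-up numbers, a 2050-rated player facing a 1600-rated player at a "regional" event plays at K_F = 16 instead of 32, so a loss on a no-energy start costs half as much, and a win gains half as much.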
So, those are my suggestions. Two very simple changes, whose goal is to make things as fair as possible for everyone who plays this game.
ELO is by far the best start for a game like Pokemon, and with these changes, I believe the system can be the fairest it can be for our game.