
Designing a Fair Rating System: Part 1

The winning average that ELO produces is your skill-based winning average, and it is extremely accurate for skill-based games (e.g. chess). When a game adds an element of luck, the system becomes less accurate, but is still usable.

This is one of the problems I've always had with ratings.

This is not a finite game where skill can be tuned like that. It's so much more prone to luck and fluctuations that games like chess (like you mentioned) simply don't have. ELO gives you a prediction without luck. What happens when you play a deck that involves a lot of luck (Flariados or something) or a deck that has better luck than most (a t2 based deck)... ELO can't take that into account. The game has multiple levels of luck, not just a simple, basic luck, that ELO simply can't track.

Plug in the numbers and you get a win expectancy of 82%. What this means is that you are expected to win 5 out of 6 games against this player. That one loss is no more down to luck than any of your five expected wins is down to luck. The opponent is SUPPOSED TO WIN 1 in 6 games.
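For reference, a figure like that comes out of the standard Elo logistic formula. Below is a minimal sketch, assuming the usual 400-point scale. The post does not say which two ratings were plugged in, so the 1863 vs 1600 pairing is a hypothetical one chosen only because it lands near 82%.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo win expectancy for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


# Hypothetical ratings: a gap of about 263 points gives roughly 82%.
favourite = expected_score(1863, 1600)
underdog = expected_score(1600, 1863)
print(f"favourite: {favourite:.3f}, underdog: {underdog:.3f}")
```

The two expectancies always add up to 1, which is why the lower rated player's share of wins is built into the prediction rather than being an accident.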

Luck may be involved in deciding when that loss comes, but it's not luck that gave you the loss you are expected to lose.

It's not a case of 'why did this happen to me' so much as a case of 'why shouldn't it happen to me?'.

That is only the case if the person of that low rank has their ELO match their skill. What if it was someone like Seena who chose to sit out a lot of the season, only to come back for Nationals and dominate a field ranked high above him? Sure, these are rare instances, but important instances since they can, and probably will have decided at least one of the trips. With so few trips on the line it's important to implement as many safety nets as possible so that people don't get cheated out of trips due to preventable faults in the system.

It tries its best, but it ultimately fails. On top of that, it's so bad when phenomenal players don't play in many events and then come in for nationals or regionals and blow people out of the water. Jake B is a great example: he went to few tournaments, and his rating was not at ALL indicative of his skill.


I like your suggestions FS, but what about Worlds- would Worlds be counted? What would happen if events in the next two years might have to become capped or invite-only? How would those be counted in? I'm also unsure of your bi-yearly system. If I have a bad year in life, and can't make but 1 or 2 CCs, 1 state, 1 or 0 regional, and no nats and do poorly at an event or just average, I'd have to work so much harder next year to make up for my last year. I kind of like the idea of being able to say "oh well, I messed up, there's always next year." With the bi-yearly system, I have to say "Either I work ten times as hard to make up for last year, or I've wasted two years instead of just one."

I think the best fix is to just give MORE invites. It would make these issues less dramatic. Or perhaps give invites in a split method like they are somewhat doing this year. More invites at nationals and perhaps 1 at a regional would IMO make it more fair. It makes it more fair to people who simply can't make it to that many tournaments.

It's all a really debatable matter. I don't know the best fix, and I see faults in every system. I think less dramatization is a key fix, and so is some kind of influence of a player's past on their ELO so that good players who don't participate don't donk people out of rankings.
 
Ryan, early on in the season the system is behaving much more like a reward system than a rating system. Later on, when players have played in lots of tournaments, the comparison between ELO ratings as a predictor of match outcome should be more accurate. I'm reasonably sure that the reward component never quite disappears.

Regarding the luck factor: a switch to match play addresses much of the donk problem that single game swiss can cause. Though even here you can argue that the donk factor just leads to increased instability and uncertainty rather than causing an uncorrectable flaw in the ELO system. I would venture to suggest that such instability is not a desirable feature as the tournament season comes to a close.
 
Win-Loss Ratio Solution?

You could base the second k value on the win-loss ratios of players. For example:

1.) If the higher rated player wins, the winner gains the raw ELO points multiplied by the value p (that player's losses divided by his wins). The loser at the same time will lose the raw ELO points.

2.) If the higher rated player loses, the loser loses the raw ELO points multiplied by the same coefficient p. The winner at the same time will gain the raw ELO points.

The difference here is that the lower rated players still have the opportunity to make decent gains by playing higher rated players, while at the same time not hurting the higher rated player too much.
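The two rules above can be sketched roughly as follows. This is only an illustration: the function name, the K of 32, and the sample ratings and record are all assumptions, since the post gives none of them.

```python
def expected_score(rating, opponent):
    """Standard Elo win expectancy on the usual 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def modified_update(high, low, high_won, high_wins, high_losses, k=32.0):
    """Return (new_high, new_low) under the win-loss modifier idea.

    `high` is the higher rated player's rating. Only that player's change
    is scaled by p = losses / wins; the other player moves by the raw amount.
    """
    if high_wins == 0:
        raise ValueError("p is undefined for a player with no wins")
    p = high_losses / high_wins
    score = 1.0 if high_won else 0.0
    raw = k * (score - expected_score(high, low))  # raw Elo change for `high`
    return high + raw * p, low - raw


# Hypothetical upset: a 1900 player with a 30-10 record loses to a 1700 player.
new_high, new_low = modified_update(1900, 1700, False, 30, 10)
print(f"higher rated: {new_high:.1f}, lower rated: {new_low:.1f}")
```

One side effect worth noting: whenever p is not 1, the points one player gains no longer equal the points the other loses, so the system stops being zero sum.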

Basically what Flaming Spinach said, but adding a pitch for what the modifier can be.
 
1) Too much emphasis being placed on Cities and BR's (Too high a K-value) and too little being placed on States, Regionals, and Nationals.

There is quite enough placed on States and Regionals.

Just the fact that the larger tournaments (almost always) have more attendance and rounds means that they will have a bigger impact on ratings.



2) Large metropolitan areas get more cities and BR's at their disposal, again making the playing field uneven for those in a more rural setting.

WHICH IS EXACTLY THE PROBLEM BI-YEARLY RATINGS ATTEMPTS TO ADDRESS.



I can agree somewhat with the compound theory, but you must limit the number of tournaments that players may go to (say 3 Cities, 2 BR's, 1 State, and 1 Regional) to equal the other side of the equation and then steepen the K-value for the more important tournaments. Then, I think part of what you are saying would make more sense.

No.

Reducing tournaments and increasing K-value makes ratings more dependent on luck. If I can go 7-0 at my three cities this year, I WILL be #1 in the world. It doesn't matter if I really am that good, as long as I get lucky.

Ideally, we'd all play 500 matches, with a K-value of 10. Then we'd KNOW that the ratings mean something.
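A rough way to check that intuition is a small simulation. This sketch is a toy model built on assumptions, not on how POP actually rates events: the player has a fixed "true" skill of 1700, every opponent is rated exactly 1600, and the K-values of 40 and 10 are illustrative. It measures how widely the final rating scatters across many simulated seasons.

```python
import random
import statistics


def expected_score(rating, opponent):
    """Standard Elo win expectancy on the usual 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def final_rating(k, matches, rng, true_skill=1700.0, opponent=1600.0, start=1600.0):
    """Play `matches` games against a fixed-rating opponent; return the end rating."""
    rating = start
    win_chance = expected_score(true_skill, opponent)  # the player's real odds
    for _ in range(matches):
        score = 1.0 if rng.random() < win_chance else 0.0
        rating += k * (score - expected_score(rating, opponent))
    return rating


def spread(k, matches, trials=200, seed=1):
    """Standard deviation of the end rating over many simulated seasons."""
    rng = random.Random(seed)
    return statistics.pstdev([final_rating(k, matches, rng) for _ in range(trials)])


few_high_k = spread(k=40, matches=21)   # three 7-round Cities with a steep K
many_low_k = spread(k=10, matches=500)  # the "ideal" case described above
print(f"K=40, 21 matches: {few_high_k:.1f}   K=10, 500 matches: {many_low_k:.1f}")
```

A smaller spread means the final rating depends less on which way a handful of games happened to fall.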




I'm a perfect example, 'cause I went up almost 150 points due to Southern Plains.

I guess I'm not, since I got 3rd at NW Regionals, and got +2.43 on my rating.




It tries its best, but it ultimately fails. On top of that, it's so bad when phenomenal players don't play in many events and then come in for nationals or regionals and blow people out of the water. Jake B is a great example: he went to few tournaments, and his rating was not at ALL indicative of his skill.

Exactly. What if Jeremy Maron came to Nationals (with his 1600 rating), and ruined the ratings of a dozen good players?

Surely Jeremy's "true" rating is nowhere NEAR 1600. But the system says that it is. This problem needs to be fixed imo.



I like your suggestions FS, but what about Worlds- would Worlds be counted?

Yes. The K would probably be equal to Nationals, to prevent some phenomenal jumps.



What would happen if events in the next two years might have to become capped or invite-only?

You mean like States or Regionals?

In that case, bi-yearly ratings would do exactly what they're designed to do. They give people who can't make it to as many tournaments an equal chance of getting their rating up over an extended period of time.



If I have a bad year in life, and can't make but 1 or 2 CCs, 1 state, 1 or 0 regional, and no nats and do poorly at an event or just average, I'd have to work so much harder next year to make up for my last year. I kind of like the idea of being able to say "oh well, I messed up, there's always next year." With the bi-yearly system, I have to say "Either I work ten times as hard to make up for last year, or I've wasted two years instead of just one."

True, bi-yearly ratings would hurt these people. But, I think this would be an extreme case, and may never actually occur.



I think the best fix is to just give MORE invites.

Absolutely.

But, beyond that obvious point, there are still problems with the rating system which must be addressed.
 
If I have a bad year in life, and can't make but 1 or 2 CCs, 1 state, 1 or 0 regional, and no nats and do poorly at an event or just average, I'd have to work so much harder next year to make up for my last year. I kind of like the idea of being able to say "oh well, I messed up, there's always next year." With the bi-yearly system, I have to say "Either I work ten times as hard to make up for last year, or I've wasted two years instead of just one."

That sounds like my season so far.

1 CC and one State.


Though I have a very slight chance to make Nationals...
 
No.

Reducing tournaments and increasing K-value makes ratings more dependent on luck. If I can go 7-0 at my three cities this year, I WILL be #1 in the world. It doesn't matter if I really am that good, as long as I get lucky.

Ideally, we'd all play 500 matches, with a K-value of 10. Then we'd KNOW that the ratings mean something.

The quote said nothing about increasing the K-value across the board. It said that the difference between the K-values of various events (Cities, States, Regionals, Nats, BRs, etc) should be greater, to the point that Regionals and Nationals are worth approximately two States or 4-6 Cities/BRs. Currently, that is not the case. Also, keep in mind that while it's very easy to donk a City, BR, or even a State, it's MUCH HARDER to donk a Regionals or Nationals because of the difference in attendance and skill level between the events. Also, if you T4 Nationals, you've already got your trip, and your rating is pretty much meaningless.
 
Scrap K values altogether? It's not as daft as it sounds.

The K value is completely separate from Elo's use of the logistic equation to predict match outcome. Scrapping K values would need a lot more work to make it a reality. I made a very very very (is that enough 'very's LOL) crude attempt at such an approach here http://pokegym.net/forums/showthread.php?t=53108

If we want to improve the rating system then much more emphasis needs to be placed upon the actual match outcomes and much less on the change in rating from match to match. I would expect that this change of emphasis would significantly reduce the grinch effect that a late entrant to the ratings system can have.

Any system that attempts the above is likely to have to make multiple passes through the match records. Your early season results where everyone is currently considered identical would be treated in a very similar way to the late season result against the same player. Currently they are not.
 
After much thought, and playing in more events (seeing my rating rise and fall), I have to re-affirm my suggestion of making this a bi-yearly system.

The biggest problem with the system as it is, is that it doesn't actually reward 'consistent' play. It rewards those who can play in a lot of events. Not everyone can do that. If someone can't make it to enough tournaments, then they simply can't make Worlds. Is that right?



Now 2 examples for you...



I have an int'l friend who has played 73 rated matches and is currently sitting on a rating of ~1990. That should give him a good shot of making Worlds, right? Nope. He has no tournaments left to play in. He has no chance of making it to Worlds (unless 4 people above him all drop 20 points), all because he physically can not make it to enough tournaments to get his rating up to where it should be.

Next up, my current rating is ~1940. I only have 3 tournaments remaining before Nationals (and although I am planning to play, there are some difficulties with getting there on time). So really, I have 3 sure tournaments to get my rating up to 2000, in order to stand a chance at making Worlds. That works out to a 20-point gain per tournament. One losing tournament, and I'm effectively screwed. One bad start against a bad opponent, and that = GG me.



In both cases, the problem is fixed by a system (any system) that gives players more opportunities or time to work on their rating.

One season is not long enough. We need more time.
 
FS: as it currently stands, consistent late season play is most certainly rewarded. Just look at what is happening in Europe for proof.

Playing in lots of events is potentially brutal late season if, or rather when, you pick up that unlucky donk. The donks are always there waiting for you in single game swiss, and the more you play the more likely you are to get donked. There is no doubt in my mind that the donk is best addressed by switching to 45-minute match play for the swiss. This is standard practice in the Netherlands. The NL is showing us the way forward for how to reduce the donk in Pokemon. I'd also say that Nationals should be 45-minute match play in the swiss too if venue time permits. (sorry USA - you are just too big!)

As to your assertion that it isn't right that someone who can't play in a lot of tournaments can't make it to Worlds :rolleyes:, what would you offer in its place... win a few Cities undefeated and get an invite to Worlds??? The rating system has to be a reward for playing this season and not for sitting on your rating from last season. With bi-yearly ratings, a few players will attain 2200 ratings and become unassailable.

For those at the top of the table, late season losses are excessively penalised compared to early season losses. But then they will have picked up some early season wins which would have been excessively rewarded. I don't know if the two effects balance out, because the logistic equation is non-linear and early performance counts less than late season performance. I don't care much for the asymmetry that is currently present in when you lose to any given opponent. The asymmetry can be almost completely removed by making multiple passes through the tournament results.

Resetting the ratings system each year is the right thing to do. At the beginning of the season everyone has to have a chance at those invites and not be locked out by last year's players. We want to attract players to the game and not give them reasons to pick a different tcg.
 
NoPoke said:
There is no doubt in my mind that the donk is best addressed by switching to 45-minute match play for the swiss. This is standard practice in the Netherlands. The NL is showing us the way forward for how to reduce the donk in Pokemon. I'd also say that Nationals should be 45-minute match play in the swiss too if venue time permits.

I would not argue against this change if it were made, but I seriously doubt it ever happening in the USA.



As to your assertion that it isn't right that someone who can't play in a lot of tournaments can't make it to Worlds, what would you offer in its place... win a few Cities undefeated and get an invite to Worlds???

That's obviously not what I'm saying. We want people to play in more tournaments all the time. But the problem with the current system is that many people simply CAN'T make that many tournaments.

Look at the top 100 Masters players in NA right now. All of the top 14 players come from the USA. Is this system fair to the Canadians and Mexicans who have to compete with us?



The rating system has to be a reward for playing this season and not for sitting on your rating from last season.

Anyone who sits on their rating from the first season will have their score rotated out at the end of the second season. Is it worth sitting on your rating right now if you have to start over at 1600 next year?

That's one of the beauties of this system. Each person trying to get to Worlds has to worry about TWO ratings: this year's and next year's.



With bi-yearly ratings, a few players will attain 2200 ratings and become unassailable.

2200 will never happen in Pokemon.

I am looking at the DCI's ratings for Magic:TG right now. Only 2 players (out of 120,000, constructed BTW) have a rating exceeding 2200. They have played in 600 and 1051 matches, respectively.

Even if you look at those M:TG players who have a 2100 or better rating, the one with the fewest matches has 237. In Pokemon (at 7 matches per tournament), that comes to 34 tournaments to get that rating. aka, this probably won't happen any time soon.

I will admit that I don't know what DCI uses for the K-value, so that could skew the results slightly.



Even IF someone could reach 2200, their score will be rotated out at the end of the season, and they'll have to start re-accumulating points asap.



The asymmetry can be almost completely removed by making multiple passes through the tournament results.

I disagree completely.

What you're suggesting is basically doubling the K-values of every match.

While this may give some people in under-competitive areas a better chance, what you overall succeed in doing is making a system where luck is even more of a factor.

If someone loses to a 1600 player, why should that loss count TWICE?

If someone wins a CC 7-0, why should that count as 14-0?



The lower the K-value is, and the more matches you have, the more accurate the scores will be.



We want to attract players to the game and not give them reasons to pick a different tcg.

Agreed, but...

How many people stand a good chance of making Worlds in their first year? Not very many.

If this is a problem, I do support a number of trip-giving tournaments (Regionals or whatever) alongside Ratings.
 
I am one that would prefer a point system for each type of event that was set for top finishers. Battle Roads could award points for 1st through 4th depending on number of participants, then City Championships award 1st through 8th, then states award up to 16th and so on. Set how many positions can get points based on the same way things are done for determining rounds and such. That means top 2 in a Battle Road is for sure to get points (example 5 for 1st, 3 for 2nd, 2 for 3rd, and 1 for 4th) but if enough attend then up to 4 get points. City Championships then have top 4 get points for sure (points 10 for 1st, 8 for 2nd, 6 for 3rd, 5 for 4th, and so on) but up to 8 get points if enough people participate. This goes on for each level. This is not totally thought out and I am sure there are people who could figure out a good chart. This makes playing the whole tournament important and does not hurt one for having a bad day if a top player.
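As a sketch of how such a chart might be looked up, here is the Battle Road example only, since it is the one tier the post spells out in full (5/3/2/1, top 2 guaranteed, top 4 if enough attend). The attendance cutoff of 17 is a placeholder assumption; the post leaves that number open.

```python
# Points per finishing place at a Battle Road, as given in the post.
BATTLE_ROAD_POINTS = {1: 5, 2: 3, 3: 2, 4: 1}
GUARANTEED_PLACES = 2        # top 2 always score
FULL_PAYOUT_ATTENDANCE = 17  # hypothetical cutoff; not specified in the post


def battle_road_points(place: int, attendance: int) -> int:
    """Points for a finishing place; the payout widens when attendance is high."""
    paid_places = 4 if attendance >= FULL_PAYOUT_ATTENDANCE else GUARANTEED_PLACES
    if place > paid_places:
        return 0
    return BATTLE_ROAD_POINTS.get(place, 0)


print(battle_road_points(1, 8), battle_road_points(3, 8), battle_road_points(3, 20))
```

The other event tiers would need their own tables once someone fills in the remaining point values and cutoffs.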

I know some would not like that because it means people who play more get a chance at more points. That is something that will always be an issue, and I don't see a problem with someone who participates more and consistently getting rewarded for it. The trip to Worlds is a great goal, but people talk about people starting later in the season and new players being able to compete for a trip. I would prefer something implemented into Player Rewards that would give something to strive for other than the trip. I know people enjoy stamped cards and such, so how about setting markers: earning 10 points from Premier tournaments gets you a foil stamped card that has 1st Level or Marker 1 on it. The only way to get one is to participate in the Premier tournaments that offer the points. Then set the next level at something like 30 points for a different card that is foil stamped again. Keep adding on levels until you reach a mark that is decided to be high enough. The first marker is possible from doing well at a few Battle Roads and City Championships. Even the second marker is possible. You would have to play in more than a few to reach marker 3, or do well in a State Championship. I know a few people who know they will never play enough to get a trip but do well enough that earning such things would thrill them.

I keep hearing about the trip and know that it is important, but someone who plays consistently and makes the sacrifices necessary to travel to more events should have a better chance in my eyes than someone who plays just part of the season or certain events. I think that the best players are ones who participate a lot and give to the game. I did not see them at every event but they did show up to most that I was at. I know a number of people (I know I do as often as possible) that carpool to make events. These people are doing what I believe makes a top player who deserves a trip. I am not saying that someone who is one of the best Pokemon TCG players should not be able to win a trip, but just playing at a few events should not earn the player a trip. What I mean by earning a trip is through the points system. I do think that that player should have a chance to win a trip by playing in a tournament (or possibly a small number of tournaments) that rewards being able to outplay almost everyone. I think both being able to straight win a trip and earning a trip through consistent and active participation in Premier Events is what is needed. I feel that being able to start later with the current system and just fly up through the ratings is something that is not right. I feel the same about having to worry about playing because of possibly losing points. The Premier Events should be something people want to play in and not calculate if it is worth the risk or not. I know players this year that are not playing any till Nationals because of possible loss of points. That does not seem to me what POP is wanting, though I could be wrong. Hope to see something though that will inspire players to play more and not sit out or drop from tournaments because of a possible bad showing.
 
If you can't make it to lots of tournaments then you don't get a ratings invite.

I don't see anything wrong with that. Nothing at all. Zip. Nada. Change it and you automatically reward not playing.

2200: we already have players in Europe who can go a whole season of 60+ matches and yet only pick up 2 losses. That 30:1 ratio will guarantee those 2200+ ratings.

Two passes through the ratings does not double the K value. Nor does it give you the effect of doubling your losses and wins. Players will spread out more, but the ratings do settle down. If you don't believe me, then try it with a couple of players and a WWWWL record repeated multiple times. The two players' ratings make an asymptotic approach to the values they would have if the games were repeated an infinite number of times. This is what non-linear systems frequently do. Indeed, without this behaviour all sorts of things that we take for granted, like radios, just would not work. The world is non-linear, fortunately for us, even if our early mathematical schooling might have us believe otherwise. In real life 1+1 is usually slightly less than 2, sometimes dramatically so, and 1+1=1!
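The WWWWL experiment is easy to try. Here is a minimal sketch, assuming plain sequential Elo updates, both players starting at 1600, and an illustrative K of 16 (none of these specifics come from the post). It replays the same five results over and over and prints the rating gap after each number of passes.

```python
import math


def expected_score(rating, opponent):
    """Standard Elo win expectancy on the usual 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def replay(record, passes, k=16.0, start=1600.0):
    """Replay `record` (from player A's view) `passes` times; return both ratings."""
    a = b = start
    for _ in range(passes):
        for result in record:
            score = 1.0 if result == "W" else 0.0
            change = k * (score - expected_score(a, b))
            a, b = a + change, b - change
    return a, b


for passes in (1, 2, 10, 100, 400):
    a, b = replay("WWWWL", passes)
    print(f"{passes:>3} passes: gap = {a - b:6.1f}")

# Gap at which the stronger player's win expectancy is exactly 4 in 5.
target_gap = 400.0 * math.log10(4.0)
print(f"gap for an 80% expectancy: {target_gap:.1f}")
```

In this sketch the second pass adds less than the first, and the gap levels off in the neighbourhood of the 80% figure instead of growing without limit.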

Multiple passes through the data set remove the exaggerated effect of a late season loss. Late season donks will no longer cost you a ratings invite as they can do this season.

FWIW I believe that the ratings system is "under notice" right now. From a purely practical point of view I don't believe that it has two years of life left in it in its present incarnation, and also that we cannot wait two years for any fix to kick in. A lot of players look to the invite-through-competitions-only system that was in place last year and are quite correctly asking what was so wrong with that approach. A lot of players are questioning if the rating system is actually a step backwards.
 
Multiple passes through the data set remove the exaggerated effect of a late season loss. Late season donks will no longer cost you a ratings invite as they can do this season.

I can definitely see where you're coming from, but I still believe that it would not work.

If you make 2 passes, then a single phenomenal tournament (say, 11-0 at Regionals) could be worth 300 points or more. And someone who pulls off a strong streak in cities will suddenly gain ~2x as many points for it, creating the possibility of sitting on your rating after 8-10 tournaments.

What if you do more runs? 100 or so to reduce all ambiguity? Someone will go 10-0 and sit on their rating right there. That would ruin the game.



FWIW I believe that the ratings system is "under notice" right now. From a purely practical point of view I don't believe that it has two years of life left in it in its present incarnation, and also that we cannot wait two years for any fix to kick in. A lot of players look to the invite-through-competitions-only system that was in place last year and are quite correctly asking what was so wrong with that approach. A lot of players are questioning if the rating system is actually a step backwards.

I do not support a return to the event-only way of winning Worlds invites. If you want, I can re-live my horrific Gym Challenge series from last year.

I really believe people do not have a problem with the rating system itself. The reason people want to repeal the rating system is because we (the USA, since we speak up most often) have gone from 50 Gym Challenge invites last year to 8 Ratings Invites this year.

It is my belief that if the USA/NA had the top 50 win invites to Worlds, there would be no complaints.
 
FS, taking your 10-0 example, or to be more precise the infinity-0 example: true enough, that player's rating will climb through the roof and the opponents' will fall through the floor. It is possible to take this into account by not making an infinite number of passes through the results data. The goal is to have a reasonable estimate of player strength. It can still be an estimate.

At some point every player will pick up a loss. You are quite correct that we will have 10-0s and probably a few 20-0s, but the likelihood of multiple 30-0 players in the record is very small. Handling the 1-0s, 2-0s, 10-0s, and 20-0s can be done by capping their rating so that it isn't infinity but in fact reflects the uncertainty that a small number of results indicates. There are other ways of handling such rogues in the data, but essentially the aim would be to admit as few rogues as possible. Some rating systems have a two-tier approach: you have to play a minimum number of games before you are fully included in the system. Up until that point your results are weighted. I'll dig out a link.

I hate to say this, but it matters much more what the Junior players and their parents think of the rating system than the committed master players. The same goes for those with a more academic interest in such issues (such as myself). I can only make my opinions and thoughts count if I keep in mind the need to appeal to the target market for sales. My motivation is thus primarily to ensure that, though no system can ever be fair, it does not have features that will discourage new players or make retention of those players problematic. How long it takes to attain the invite, players complaining that it's not fun anymore, late season losses crippling an otherwise decent year's performance: all of these have to be considered as much as any technical benefits that a change may introduce.

Parents won't support the game if they believe that the system is hurting their children. No matter what I may wish, Pokemon tcg is seen by many as the children's tcg, Yugioh as the teenagers' tcg, and MtG as the 'adult' tcg.
 
I hate to say this, but it matters much more what the Junior players and their parents think of the rating system than the committed master players. The same goes for those with a more academic interest in such issues (such as myself). I can only make my opinions and thoughts count if I keep in mind the need to appeal to the target market for sales. My motivation is thus primarily to ensure that, though no system can ever be fair, it does not have features that will discourage new players or make retention of those players problematic. How long it takes to attain the invite, players complaining that it's not fun anymore, late season losses crippling an otherwise decent year's performance: all of these have to be considered as much as any technical benefits that a change may introduce.

Parents won't support the game if they believe that the system is hurting their children. No matter what I may wish, Pokemon tcg is seen by many as the children's tcg, Yugioh as the teenagers' tcg, and MtG as the 'adult' tcg.

My nephew lives with me, and Pokemon TCG is what I used to get him to work hard to learn to read because he was behind where he should have been. We started in the summer of '05. He is a Junior and I have been very proud of him. He actually was the last one to get a trip to Worlds from U.S. Nationals last year. He had come close at a couple of Gym Challenges. This was with him still having real problems remembering to do most of the steps during a turn. This year has had him doing much better but not having the fun he did last year. He really wants to win a trip (and is in the 1900 area) but at the same time doesn't like the way it is, because he has to be conscious that losing will cost him a lot while winning gains him little. That is why I proposed what I did, because it offers what I think would be more fun, and playing more (which he loves to do) does not mean he is taking risks that he might not want to take. He would get to play and enjoy the tournaments whether he wins, does great or has a terrible day. I have seen him have as much fun losing almost all his matches in a day as he has winning the tournament, because it was fun. He doesn't feel that way this year, as he learned that having a bad day can really hurt his chances at earning a trip.
 
Okay, found a link to a variant on ELO that uses provisional ratings in addition to ELO: http://www.daysofwonder.com/en/play/ranking4/ The ever-present donk in Pokémon still has to be addressed. Donks can be fun but they should form no part of a rating system. How to exclude them or ameliorate the donk factor has to be addressed. It's not easy.


robbgobb I'd bet that POP are aware of the following and are just as concerned about it as you and others
robbgobb said:
...as much fun losing....<SNIP>..He doesn't feel that way this year as he learned having a bad day is able to really hurt his chances at earning a trip.
 
I saw the quote of me here http://pokegym.net/forums/showpost.php?p=870453&postcount=36 NoPoke and was wondering if you altered the word "loosing" from "losing" and I know a little off topic but just curious because those words have different meanings when I look them up and I see a lot of people make the mistake of using "loose" rather than "lose". Sorry to interrupt the thread.

On subject. I really hope so NoPoke. I am waiting and hoping that by time Fall Battle Roads appear that there is a chance of having an improved system in place. My suggestion was one that seemed easier and offers some other goals to set for a player playing in Premier tournaments.
 
Okay, probably time to revisit some of the first post's assertions.

PROBLEMS:
1-People in areas with less events are at a massive disadvantage.
2-It puts too much pressure on players, who can’t afford to have a bad tournament, as their rating would be ruined.
3-Good players losing to bad players on luck.
4-Ratings in this system are usually not an accurate measure of a person's skill for very long, as once a person's rating gets high enough, it is reset to 1600.

1- I disagree with the term massive. They are at a disadvantage in that a ratings system really isn't appropriate for players that play in very few tournaments. But I don't see that the few players who choose not to play or are unable to play justifies rejecting a ratings based system that rewards players who actually DO play and attend lots of tournaments.

For me 1) is a non-problem, because it is inherent in any ratings based system.

2) ELO isn't putting the pressure on the players; it's the big prize that is on the line. We have had the rating system running for fun for the past few years and there was very little complaint about pressure to always win.

3) This one is valid. I'd rather not use the blanket term luck but prefer to think of the T2 T3 donk that is inherent in the design of pokemon tcg.

4) The system is both rating and reward. This is probably what POP intend rather than a pure skill rating system. Towards the end of the season when the ratings invites are determined the system is supposed to have shifted away from pure reward towards a more skill based indication of player strength. It remains to be seen if ELO rating is actually a reasonable predictor of match outcome.

So of the four problems that FS asserts two aren't problems with ELO at all (1&2) , one has yet to be established (4) and one is an issue that needs to be addressed(3).


Before continuing, let's just have a look at one of the alternatives: masterpoints.

Masterpoints isn't a rating system at all. It's pure reward. In the past, several tournaments offered a big reward to the winner (an invite to Worlds). Under a masterpoints system you would have to perform well at several big tournaments in order to get that same reward. You could easily view the old regionals system as an extreme example of masterpoints where only the winner gets the points, and everyone who manages to accumulate these masterpoints during the season gets an invite.

So although masterpoints technically isn't a rating system at all, it could replace the current rating system or be added to it. There are certainly players and parents who would be happy for a shift away from ratings invites because of the sour tournament experiences that some have had during this season. Masterpoints may be a key way to reintroduce the fun element back into Pokemon TCG tournaments without abandoning ratings invites completely.

Next some mathematical properties.

Zero sum - whatever I win, you lose. No more, no less. The ELO ratings are zero sum with a bias of 1600. POP refers to the amount at risk in any match as the stake. Having a zero sum system leads to some neat mathematical properties, but that should not be a motivation for a zero sum system. Probably the biggest motivation is that, because a player loses as much as an opponent wins, cooperation to manipulate the points should not occur to any great extent.
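
To make the zero sum property concrete, here is a minimal sketch of a standard Elo update. It assumes the textbook win expectancy formula and a stake (K) of 32; POP's actual K value is not stated in this thread, and the function names are made up for illustration.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo win expectancy for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, a_won, k=32):
    """Return both new ratings after one match.

    k is the stake. 32 is an assumed value, not POP's confirmed one.
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the pair's total never changes.
    return rating_a + delta, rating_b - delta


before = (1700, 1600)
after = update(*before, a_won=False)
print(sum(before), sum(after))  # both totals are the same
```

Because every match just moves points from one player to the other, the average rating of the pool stays pinned at the 1600 starting value as long as nobody enters or leaves.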

(non-) Transitive - Player A beats Player B, who in turn beats Player C. So is it reasonable to assume that A will beat C? If the game were transitive then this would be the case. Unfortunately Pokemon is not a transitive game, and it is not safe to conclude much about the A vs C match. The fact that Pokemon is not transitive makes it very difficult to compare the relative strengths of players who haven't played each other. This extends to comparing countries or regions.
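
A small sketch of what non-transitive means in practice. The win rates below are invented purely for illustration (they are not real matchup data), and the helper names are hypothetical. The cyclic table behaves like rock-paper-scissors, which no single rating number per player can describe.

```python
from itertools import permutations


def favoured(table, x, y):
    """True if x is favoured over y. The table stores each pairing once."""
    if (x, y) in table:
        return table[(x, y)] > 0.5
    return table[(y, x)] < 0.5


def is_transitive(table, players):
    """Check that 'x beats y' and 'y beats z' always implies 'x beats z'."""
    for x, y, z in permutations(players, 3):
        if favoured(table, x, y) and favoured(table, y, z) and not favoured(table, x, z):
            return False
    return True


# Made-up win rates: A beats B, B beats C, but C beats A.
cyclic = {("A", "B"): 0.65, ("B", "C"): 0.65, ("C", "A"): 0.65}
# Made-up win rates with a clean pecking order: A over B over C.
ordered = {("A", "B"): 0.65, ("B", "C"): 0.65, ("A", "C"): 0.65}

print(is_transitive(cyclic, ["A", "B", "C"]))   # False
print(is_transitive(ordered, ["A", "B", "C"]))  # True
```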



....to be continued as the car just failed its MOT!

... okay the car is now legal on the road once more..
 
2200 will never happen in Pokemon.

I am looking at the DCI's ratings for Magic:TG right now. Only 2 players (out of 120,000, constructed BTW) have a rating exceeding 2200. They have played in 600 and 1051 matches, respectively.

Even if you look at those M:TG players who have a 2100 or better rating, the one with the fewest matches has 237. In Pokemon (at 7 matches per tournament), that comes to 34 tournaments to get that rating. In other words, this probably won't happen any time soon.
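
The arithmetic behind that estimate, as a quick sketch. The 7 matches per tournament is the figure assumed in the post above; the function name is made up for illustration.

```python
import math


def tournaments_needed(matches, matches_per_tournament=7):
    """Whole tournaments required to play at least `matches` matches."""
    return math.ceil(matches / matches_per_tournament)


# Match counts quoted from the DCI ratings in the post above.
for matches in (237, 600, 1051):
    print(matches, "matches ->", tournaments_needed(matches), "tournaments")
```

At the same pace, the two players rated above 2200 would have needed 86 and 151 Pokemon tournaments.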

I will admit that I don't know what DCI uses for the K-value, so that could skew the results slightly.



Even IF someone could reach 2200, their score will be rotated out at the end of the season, and they'll have to start re-accumulating points asap.

Well, don't forget... MtG players have what's called "Friday Night Magic", in which they have a tournament every Friday, which also counts for their ratings.

So essentially they could have 52 tournaments of 4+ rounds each. And don't forget the PTQs and Regionals. Those account for almost 1/6th of all games played if you went to 3-4 PTQs and 1-2 Regionals. So for those players to attain those games played is not that hard.

(I used to play MtG before coming over to Pokemon when it was released, and I believe that the last time I looked I had over 300 games played, going to very few FNMs mind you, but going to 4 PTQs and 1 Regional in a 1 year period.)
 


Sorry NoPoke, I could not help but completely disagree with what you said to reject the first argument.

Being from Northern California, I have experienced the first one to a major degree.
It is not my lack of determination that keeps me away from tournaments; it is just the fact that my fourth closest Battle Road is a six hour drive. Correct me if I am wrong, but geographically, living in certain parts of California and Texas makes it near impossible to get yourself to extra tournaments. As for me, I think it is a bit unfair when I find myself driving 7 hours to my State Championship and 11 hours to Nevada, when some people don't even have to drive as long as I did for the first one to get to four. I am sorry because I am unmotivated and didn't go to Texas before I went to Nevada. I would have plenty of time, seeing how my home is a day's drive away.


I do not mean to come across as rude, but dude you cannot just call people unmotivated without knowing everyone's stories.
 