You want numbers I give you numbers

#0 - Dec. 15, 2008, 12:37 a.m.
Blizzard Post
GC mentioned that they like numbers, so I decided to analyze the current DPS performance statistically.

Method

Samples
- WWS reports of Patchwerk fights lasting between 2:30 and 3:00
- Filters: each guild's best fight only, and public reports only

Class selection
- All classes except DKs (due to the age of the class and lack of solid theorycraft) and rogues (due to the HAT bug)
- Only the top DPS of EACH class in a single report will be recorded to minimize the effect of stacking.
- Only the top 10 DPSers will be recorded in each report, to sample the most competitive players
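
The selection rules above could be sketched roughly as follows. This is only an illustration: the function name, the (player, class, dps) row layout, and the order of the steps (cut to the top 10 first, then drop excluded classes, then keep the best of each class) are assumptions, not something the post spells out.

```python
# Hypothetical sketch of the sampling rules; names and row layout are assumed.
EXCLUDED = {"Death Knight", "Rogue"}  # classes left out of the comparison


def select_samples(report, top_n=10):
    """Return {class: (player, dps)} for one report.

    report is a list of (player, cls, dps) tuples from a single fight.
    """
    # Keep only the top_n players by DPS.
    ranked = sorted(report, key=lambda row: row[2], reverse=True)[:top_n]
    best = {}
    for player, cls, dps in ranked:
        if cls in EXCLUDED:
            continue
        # ranked is in descending order, so the first hit per class is its top DPS.
        if cls not in best:
            best[cls] = (player, dps)
    return best


# Made-up rows, purely to show the call.
example_report = [
    ("A", "Hunter", 4000.0),
    ("B", "Hunter", 3800.0),
    ("C", "Rogue", 4200.0),
    ("D", "Mage", 3500.0),
]
print(select_samples(example_report))
```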

Statistics
- Paired sample T-test to compare hunter DPS to other classes
- Performed using SPSS version 16

Sample size
- Aiming for 100 reports and within those reports, at least 50 of each class
- I will continue to update the results

Results will be on the next post

Update: I am currently up to 30 reports
#279 - Dec. 16, 2008, 5:09 p.m.
Blizzard Post
Quote:
I really hope GC sees this and reconsiders the nerf.

The current DPS difference between hunters and other classes CAN ALL be accounted for by the usage of Readiness/double BW spec + heroism + Rake.

The nerf will make hunters totally uncompetitive in end game raids as the current data suggests.


I appreciate what the OP is trying to do here. Statistics can be complicated and tend to use some very precise language. One of the best things about stats in general is that the assumptions and limitations are very well laid out.

T-tests are most useful when you have two groups. The OP is correct in pairing off the various groups, but in reality we have more than two groups. A great way to use a t-test would be to compare hunter mean dps before and after the Steady Shot change. Then you are comparing two groups, trying to see if the change caused a significant difference (I would be surprised if it did not, since that was the whole point).
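
As a rough sketch of that two-group comparison, the snippet below computes a t statistic for "before" and "after" samples. It uses Welch's unequal-variance form, which is an assumption made for the illustration (the OP ran a paired test in SPSS), and the DPS values are invented placeholders, not measured data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder numbers only; real values would come from the collected reports.
before_change = [4100, 4250, 3980, 4300, 4150]
after_change = [3900, 3950, 3800, 4050, 3870]
t, df = welch_t(before_change, after_change)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Turning t and df into a p-value needs the t distribution, which a stats package would supply.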

If you have a general hypothesis overall (in this case that hunter dps is not statistically significantly different from that of other classes), you are probably better off using an analysis of variance, often called ANOVA. Briefly, a t-test compares two means, while an ANOVA compares the means of multiple groups by looking at how much of the variance lies between the groups versus within them. Because there are several classes involved, we have multiple groups.

Also note that ANOVA is only good at detecting if differences exist. If you want to figure out what the differences are, you need some sort of follow up test, like a Tukey.
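
For a sense of what a one-way ANOVA actually computes, here is a minimal sketch of the F statistic. The function name and the class DPS lists are made up for illustration, and the follow-up test is not shown; in practice both would usually come from a stats package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Placeholder DPS samples for three classes; not real data.
hunters = [4100, 4250, 3980, 4300]
mages = [3900, 4050, 3950, 4120]
warlocks = [3850, 4000, 3920, 3990]
f, df_b, df_w = one_way_anova_f([hunters, mages, warlocks])
print(f"F({df_b}, {df_w}) = {f:.2f}")
```

A large F only says that at least one group mean differs; it does not say which one.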

Please remember that WoW tends to generate very complicated data sets. What I mean is that it is difficult to predict what a given player’s dps will be. It varies enormously depending on skill, gear and the other classes (and their players) in the group. For example, you might find that below a certain gear level a spec is not competitive, but above a certain gear level they dominate. You might find that class X performs well only when specific buffs are present. I’m not confident the data are always normally distributed. All of those confounding factors can make simple statistical tests suggest trends that are not actually there. The tricky part is not in running the actual test but in deciding which points to throw out. Do you only include players who clearly know their class? Do you only throw out certain gear levels? Do you include groups with different numbers of healers or buffs? Which data you analyze is everything.

None of that is to say you should discount these data. We ask for numbers a lot, so it is awesome when players can deliver them. But you have to be very, very careful to not over-analyze them.

On the actual topic, we chose the new Steady Shot number after a great deal of research. But we also know that even the best statistical models and predictions sometimes fail to represent reality, which is why we provide a PTR to collect additional information. Is it possible we nerfed hunter dps too much? Of course. We have some confidence in our numbers, but rarely certainty. Certainty is a conversation ender. That’s not what we are about.

Quote:
Stats are never worthless. I think GC is a stats nut, to be honest.

Are statistics EVERYTHING? No, far from it. But numbers are helpful.


All true.
#403 - Dec. 17, 2008, 6 p.m.
Blizzard Post
Quote:
"All Else Being Equal" is a mantra that is frequently tossed around, but rarely true in practice.


I want to quote this for great justice. We see this all the time. "I saw a spreadsheet that said mages can do 5000 dps. My rogue never does 5000 dps. Therefore mages are OP."

Not trying to derail the thread. Stats are useful. Theoretical models are useful. In-game experiences are useful. None of them tells the whole story alone.