Who name selection criteria
Posted: Mon Oct 24, 2011 1:48 pm
Every time I've run CensusPlus I've felt uncomfortable about how the who name requests are created.
The ideal process makes the fewest requests needed to find every character currently online.
Currently Rollie uses:
local function GetNameLetters()
    return { "a", "b", "c", "d", "e", "f", "g", "i", "o", "p", "r", "s", "t", "u", "y" };
end
That is a total of 15 letters instead of the assumed worst case of all 26 English letters. The problem is that Blizzard can't limit names to strict ASCII.
Covering English, German, Spanish, Portuguese, and French takes 42 symbols in the Latin-1 character set, ignoring case (per Wikipedia).
And I've already spotted characters in use that are in the Latin-1 character set but are not used by any of the above languages.
The problem I see is that we are certainly making requests that return data we have already seen, and we are almost certainly missing the edge cases.
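To put numbers on both halves of that problem, here is a minimal sketch that checks a list of names against the current 15-letter list. It rests on assumptions that have not been verified: that a who request for a letter returns every name containing that letter, and that case folding only applies to ASCII (which is what string.lower does in stock Lua). CoverageReport is a hypothetical helper, and apart from Äçóòñ the sample names are made up.

```lua
-- Hypothetical helper: reports names that no query letter matches ("missed")
-- and names that more than one query letter matches ("duplicated").
-- ASSUMPTION: a who request for a letter matches any name containing it.
local function CoverageReport(names, letters)
    local missed, duplicated = {}, {}
    for _, name in ipairs(names) do
        -- string.lower only folds ASCII in the default "C" locale
        local lowered = string.lower(name)
        local hits = 0
        for _, letter in ipairs(letters) do
            -- plain find (4th argument true): the letter is never read as a pattern
            if string.find(lowered, letter, 1, true) then
                hits = hits + 1
            end
        end
        if hits == 0 then
            table.insert(missed, name)
        elseif hits > 1 then
            table.insert(duplicated, name)
        end
    end
    return missed, duplicated
end

local letters = { "a", "b", "c", "d", "e", "f", "g", "i", "o", "p", "r", "s", "t", "u", "y" }
local missed, duplicated = CoverageReport({ "Äçóòñ", "Rollie", "Zzz" }, letters)
-- missed holds "Äçóòñ" and "Zzz"; duplicated holds "Rollie"
```

Under those assumptions, a name made up entirely of accented letters is never requested at all, while an ordinary name comes back from several requests.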
I'm not yet sure how WhoLib handles the diacritics and ligatures used by most of the covered languages, or the rest of Blizzard's UTF-8 character set, which includes characters not in the above languages. Nor am I sure yet how the WoW API responds to single-character name requests, i.e. which character/case/accent combos it matches.
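Part of the uncertainty comes from how Lua itself sees these names. The sketch below shows what plain Lua 5.1 string functions do with an accented name, assuming the source is saved as UTF-8 and the default "C" locale is in effect; it says nothing about what WhoLib or the game server do. FirstUtf8Char is a hypothetical helper.

```lua
-- ASSUMPTION: file saved as UTF-8, stock Lua in the default "C" locale.
local name = "Äçóòñ"

-- '#' counts bytes, not characters: each of these five letters takes two bytes.
local byteLength = #name

-- The first character is the two-byte sequence 195, 132 (0xC3 0x84).
local b1, b2 = string.byte(name, 1, 2)

-- string.lower only folds ASCII here, so the accented capital stays as it is.
local unchanged = (string.lower(name) == name)

-- Hypothetical helper: returns the first UTF-8 character of a string,
-- taking the sequence length from the lead byte.
local function FirstUtf8Char(s)
    local lead = string.byte(s, 1)
    if not lead then return "" end
    local len = 1
    if lead >= 240 then len = 4
    elseif lead >= 224 then len = 3
    elseif lead >= 192 then len = 2 end
    return string.sub(s, 1, len)
end

local first = FirstUtf8Char(name)   -- the two bytes of "Ä"
```

So any code that takes "the first letter" with string.sub(name, 1, 1) would get half a character for a name like this one.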
I'm looking at letter frequency in character names by examining names from the top guilds (by membership) on realms in all languages, using Wowhead to get guild numbers and Blizzard's web API to download membership rosters.
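As a rough sketch of that frequency count, the function below tallies, for each UTF-8 character, how many names contain it at least once. It assumes the roster has already been downloaded into a plain list of names; CountNamesPerCharacter is a hypothetical helper, and apart from Äçóòñ the sample names are made up. Only ASCII is lowercased, so "Ñ" and "ñ" are counted separately.

```lua
-- Hypothetical helper: maps each UTF-8 character to the number of names
-- that contain it at least once.
local function CountNamesPerCharacter(names)
    local counts = {}
    for _, name in ipairs(names) do
        local seen = {}
        -- one lead byte plus any continuation bytes is one UTF-8 character
        for ch in string.gmatch(string.lower(name), "[\1-\127\194-\244][\128-\191]*") do
            if not seen[ch] then
                seen[ch] = true
                counts[ch] = (counts[ch] or 0) + 1
            end
        end
    end
    return counts
end

local counts = CountNamesPerCharacter({ "Äçóòñ", "Anna", "Ñandú" })
-- counts["a"] and counts["n"] are 2; every accented character appears once
```

Counting names rather than raw letter occurrences seems the more useful figure here, since a name only needs to be returned by one request.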
Already I've found a disturbing error.
From US - Cenarius - guild Infection
I see member Äçóòñ, who as of his last activity is a level 67 Draenei Priest and appears to have been in the 60s throughout late May and all of June.
Yet doing a character query here I get a level 16 priest seen only on May 23 of this year, even though Cenarius/Alliance has always been a well-covered faction, currently at 300 censuses in the last 30 days.
So the questions are:
Why has Äçóòñ been missed in census runs when he was very active during the month of May?
Or: why/how was he even found once?
I'm going to continue researching this problem, and I hope to be able to provide Rollie with some code that covers all names, maybe even with less duplication in the request pattern.
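One possible direction, offered only as a sketch and not as what CensusPlus does: treat the choice of request characters as a set cover problem and pick greedily, each time taking the character that matches the most names not yet covered. GreedyCover and the namesByChar table are hypothetical, the sample names other than Äçóòñ are made up, and the approach still assumes a request for a character returns every name containing it.

```lua
-- Greedy set cover sketch. namesByChar maps a character to the list of
-- names containing it (for example, built from downloaded guild rosters).
local function GreedyCover(namesByChar)
    local covered, chosen = {}, {}
    while true do
        local bestChar, bestGain = nil, 0
        for ch, names in pairs(namesByChar) do
            local gain = 0
            for _, name in ipairs(names) do
                if not covered[name] then gain = gain + 1 end
            end
            -- ties go to the smaller string so the result is deterministic
            if gain > bestGain or (gain > 0 and gain == bestGain and ch < bestChar) then
                bestChar, bestGain = ch, gain
            end
        end
        if not bestChar then break end   -- nothing left to gain
        table.insert(chosen, bestChar)
        for _, name in ipairs(namesByChar[bestChar]) do
            covered[name] = true
        end
    end
    return chosen
end

local chosen = GreedyCover({
    a = { "Anna", "Sara" },
    n = { "Anna" },
    s = { "Sara" },
    ["ñ"] = { "Äçóòñ" },
})
-- chosen is { "a", "ñ" }: two requests cover all three names
```

Greedy selection is not guaranteed to find the smallest possible set, but it is simple and would at least make sure a name like Äçóòñ gets a request of its own.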