Masterpoint Reports Decoded

Have you ever wondered what the ACBL knows about your playing habits? Or put another way, what data mining options are available to the ACBL? You can speculate or you can dig in and learn the truth as demonstrated below.

ACBLscore creates game files with names like YYMMDD.AC{A,E,M,L}. For example, 130825.ACA indicates an afternoon (ACA) game played on August 25, 2013. The game files contain information about all the pairs, including names, player numbers, partnership percentages, ranks in each flight, as well as the raw and matchpoint scores each pair achieves on each board. A game file can contain multiple events, e.g. an open and a limited game run concurrently. But the game files from a club are never sent to the ACBL. Instead, clubs generate a masterpoint report from the game files using ACBLscore and e-mail this masterpoint report to the ACBL. Usually this is done once a month. The ACBL only knows the information contained in the masterpoint report.

I’ll cut to the chase for readers not interested in how the information is obtained. First, if you do not win any masterpoints in an event, the ACBL has no knowledge that you played in the event. This means the ACBL does not even know how frequently each member plays but rather has only a lower bound from which to extrapolate. Similarly, they cannot compute your average percentage in a regular game or compute other similar performance metrics. Second, masterpoint records for pair games contain the player number, section, pair number, direction (N-S or E-W), partnership percentage, and masterpoints awarded. Also included are the strat and the rank in the strat that resulted in the masterpoint award, as well as whether the award is due to an overall or section ranking. For stratified events, ACBLscore ranks your partnership in your partnership’s strat and all higher strats. It also ranks your partnership in both your section and overall, except in small games with a Howell movement where there is only one section and thus only overall awards. Your masterpoint award is the highest of the possible awards. Although ACBLscore may list a pair as, say, 5th Overall in Flt A, 3rd in Section in Flt A, and 2nd in Section in Flt B, only the ranking information yielding the highest award is transmitted to the ACBL.

Masterpoint records do not include each player’s partner. However, since masterpoint records for the players in each partnership are always adjacent in the masterpoint report, partners are implicitly available. This means the ACBL could construct social graphs like the one in The Social Network, but on a larger scale, albeit only based on the events where you have won masterpoints. More simply, they could include your partner’s name on their online My Masterpoints listing. They might be able to do this immediately, or they might have to reprocess the many masterpoint reports already submitted in order to store the partnership information in their database.
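The adjacency argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of player numbers in the order their records appear for one event, each partnership occupies two consecutive records, and the function names `partnerships` and `partnership_counts` are made up for this example. Nothing here reflects how the ACBL actually processes reports.

```python
from collections import Counter

def partnerships(player_numbers):
    """Pair consecutive records: items 0 and 1, items 2 and 3, and so on.

    Assumes every partnership occupies two adjacent records in the report.
    """
    if len(player_numbers) % 2:
        raise ValueError("expected an even number of player records")
    return list(zip(player_numbers[0::2], player_numbers[1::2]))

def partnership_counts(events):
    """Count how often each pair of players appears together across events.

    `events` is a list of per-event player-number lists. The result is the
    edge list of a weighted partnership graph.
    """
    counts = Counter()
    for player_numbers in events:
        for a, b in partnerships(player_numbers):
            counts[frozenset((a, b))] += 1
    return counts
```

Because only award winners appear in the report, the resulting graph would undercount how often any two people actually play together.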

The results presented here are for pair games. I strongly suspect the results are similar for team games but I have not verified this. These results are for ACBLscore. The newer program ACBLscore+, currently under development, could make big changes.

Looking inside the masterpoint report

Masterpoint reports have a name like 14650638.LZH. The first six digits of the filename are the club number (146506), in this case the club number for the Sunday afternoon events run by the La Jolla unit. I suspect the 3 corresponds to the last digit of the year (2013) and that the 8 corresponds to the month (August). The .LZH file extension indicates a compressed file archive. The LZH archive format is similar to the ZIP format commonly used on Windows. However, Windows does not have built-in support for LZH archives the way that it does for ZIP archives. Installing the open source 7-Zip program is one easy way to open LZH archives.
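The suspected naming scheme can be written down as a small Python sketch. It assumes the layout guessed above (six-digit club number, one year digit, one month character) and leaves the month as a raw character, because this single example does not show how October through December would be encoded. The function name `decode_archive_name` is hypothetical.

```python
import re

def decode_archive_name(filename):
    """Split a report name such as '14650638.LZH' into its suspected parts.

    The layout is a guess: 6-digit club number, last digit of the year,
    then a one-character month code.
    """
    match = re.fullmatch(r"(\d{6})(\d)(\w)\.LZH", filename, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unexpected report name: {filename!r}")
    club, year_digit, month_code = match.groups()
    return {
        "club": club,                    # kept as text to preserve leading zeros
        "year_digit": int(year_digit),   # 3 would mean 2013 in this example
        "month_code": month_code,        # '8' would mean August; other codes unverified
    }
```

For the example file, `decode_archive_name("14650638.LZH")` yields club `146506`, year digit `3`, and month code `8`.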

The 14650638.LZH archive contains two files, F146506.308 and M146506.308, where both files are named by the club number (146506) and date (3 = 2013 and 08 = August). These are both text files. You can rename them to F146506.308.txt and M146506.308.txt so that they will open in Notepad or whatever program you have set as the default for opening text files. If you want to take a look at the actual files yourself, download this Zip archive of the original LZH archive and the two files extracted from it. The F file looks like this:

All fields appear to be fixed length. For readability, I’ve removed most trailing spaces from fixed length fields and then added spaces and line breaks to separate fields. The file is basically a listing of the events with separate records for each section (A or B). Each event record contains the club number (146506), submission year and month (201308), event date (130811 or 130825), the event name (Sunday Afternoon Pairs), and the director (Marc Matz or Paul Darin). My guess is that the first value in each row, shown in green, is a record sequence number for use by a database loading program where the first digit(s) indicate a record type; for example, the event records start at 40006. The numbers in blue are the masterpoint cutoffs for each strat, and 0000 indicates no upper limit for the Flt A strat, i.e. an open event.

The start of the file contains information about the masterpoint submitter, our club manager Bill Grant, our unit (526), district (22), and the version of ACBLscore used (W7.74). The meaning of some fields is still unknown.

Looking inside the Masterpoint (M) file

The M file contains the masterpoint awards. It begins with a short header that identifies the club: club number (146506), unit (526), district (22), club name (La Jolla And Beach), and ACBLscore version (W7.74).

10001 000 2000 8002013 00 146506 526 22 La Jolla And Beach W7.74

Events and the masterpoint awards for each event are listed chronologically. Shown below is the event header for the August 25, 2013 event, the second event in the report file. The header includes the event date (08252013), session number (20), club number (146506), event sanction (L308526B), unit (526), district (22), event name, the masterpoint cutoffs for each strat (in blue), and the strat designations (ABC). Each time slot, i.e. each combination of a day of the week and morning, afternoon, or evening, is assigned a session number; 20 corresponds to Sunday Afternoon.

For each event, the overall masterpoint winners are listed followed by the section winners. Within each group, the winners are listed in order of decreasing partnership percentage. Curiously, the pair results are grouped three partnerships at a time. This probably reflects how the database loader works. The masterpoint data, including the headers for each group of three partnerships, appear as below. I have added player names for each row for easy comparison with the posted result but the names are not actually included in the masterpoint M file. Moreover, there are only line breaks in the file after the results for each group of six players; the individual player results are separated by spaces rather than a new line and the fixed length fields in each player result are not separated at all. The actual formatting supports the hypothesis that the database loader handles up to six players per record processed. But as above, I have reformatted the white space and new lines for readability.

The start of each record contains the date (08252013), session number (20), club number (146506), event sanction (L308526B), unit (526), district (22), and number of player results contained in the record (6 or 4 in this example). The meaning of 000 and 604 is unknown and they may not in fact be two fields as speculatively shown.

The following table shows the fields in each masterpoint record.

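| PlayerNumber | 100×MP | OS | Strat | RankLow | RankHigh | Sec | Dir | Pair# | 100×Pct | Nbd | Player Name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R811117 | 00411 | 21 | 1 | 001 | 001 | B | 1 | 006 | 6174 | 026 | Eileen Heinrich |
| R811125 | 00411 | 21 | 1 | 001 | 001 | B | 1 | 006 | 6174 | 026 | Morton Heinrich |
| K108259 | 00308 | 21 | 1 | 002 | 002 | B | 2 | 009 | 6096 | 027 | Elaine Chan |
| O023010 | 00308 | 21 | 1 | 002 | 002 | B | 2 | 009 | 6096 | 027 | Mike Mezin |
| R351736 | 00231 | 21 | 1 | 003 | 003 | A | 1 | 005 | 5960 | 027 | Barbara Norman |
| O033954 | 00231 | 21 | 1 | 003 | 003 | A | 1 | 005 | 5960 | 027 | Chuck Wilson |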

Masterpoints are recorded as 100 times the masterpoint award, e.g. 00411 means 4.11 MP. Strat is the numerical index of the strat (flight) of the masterpoint award. For example, if the flights are A, B, and C then 1 ⇒ A, 2 ⇒ B, and 3 ⇒ C. The two ranking columns are usually the same; however, if there is a tie, for example for 4th and 5th places, RankLow would be 004 and RankHigh would be 005, and the ACBLscore reports shown to players would show the rank as 4/5. Sec is the section and Dir is the direction the pair sat, where 1 ⇒ N-S and 2 ⇒ E-W. The partnership percentage is shown as 100 times the percent, e.g. 6174 means 61.74%.
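The field rules above can be turned into a small decoder. The following Python sketch assumes that the fields appear in the file in the same order as the table columns and that the field widths equal the lengths of the sample values (7, 5, 2, 1, 3, 3, 1, 1, 3, 4, 3); both are inferences from the sample, not a documented format. The names `FIELDS`, `split_record`, and `decode_record` are made up for illustration.

```python
# (name, width) pairs. Widths and order are inferred from the sample values,
# not taken from any ACBL specification.
FIELDS = [
    ("player_number", 7), ("mp_x100", 5), ("os", 2), ("strat", 1),
    ("rank_low", 3), ("rank_high", 3), ("section", 1), ("direction", 1),
    ("pair_number", 3), ("pct_x100", 4), ("boards", 3),
]

DIRECTIONS = {"1": "N-S", "2": "E-W"}

def split_record(raw):
    """Cut one unseparated player result into its fixed-width fields."""
    fields, pos = {}, 0
    for name, width in FIELDS:
        fields[name] = raw[pos:pos + width]
        pos += width
    return fields

def decode_record(raw, strats="ABC"):
    """Convert the raw field text into readable values."""
    f = split_record(raw)
    low, high = int(f["rank_low"]), int(f["rank_high"])
    return {
        "player_number": f["player_number"],
        "masterpoints": int(f["mp_x100"]) / 100,    # 00411 -> 4.11
        "os": f["os"],                              # meaning uncertain, kept raw
        "strat": strats[int(f["strat"]) - 1],       # 1 -> A, 2 -> B, 3 -> C
        "rank": str(low) if low == high else f"{low}/{high}",  # ties as 4/5
        "section": f["section"],
        "direction": DIRECTIONS[f["direction"]],
        "pair_number": int(f["pair_number"]),
        "percentage": int(f["pct_x100"]) / 100,     # 6174 -> 61.74
        "boards": int(f["boards"]),
    }
```

Joining the first table row without separators and passing it to `decode_record` gives 4.11 MP, strat A, rank 1, section B, N-S, pair 6, 61.74%, and 26 boards.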

The meaning of the OS column is unclear. It appears to be 21 for an overall award and 11 for a section award. It is likely that additional codes are used for multi-session events at tournaments, where there are overall awards across all sessions, overall awards within each session, and section awards within each session. It is quite possible that the first digit indicates whether the award is overall (2) or section (1) and the second digit is the session number, where perhaps 0 is used for an overall award across all sessions, i.e. 20 would indicate an overall award across all sessions.
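Purely to make that guess concrete, here is how the OS code would decode if the hypothesis holds. Only the values 21 and 11 have actually been observed; the treatment of a second digit of 0 as "all sessions" is speculation, and the function name `decode_os` is hypothetical.

```python
def decode_os(code):
    """Interpret the two-digit OS field under the unverified guess above."""
    if len(code) != 2 or not code.isdigit():
        raise ValueError(f"unexpected OS code: {code!r}")
    # First digit: 2 = overall award, 1 = section award (observed values only).
    award_type = {"1": "section", "2": "overall"}.get(code[0], "unknown")
    # Second digit: guessed to be the session number, with 0 = across all sessions.
    session = int(code[1])
    return {
        "award_type": award_type,
        "session": session,
        "all_sessions": session == 0,
    }
```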

BTW, one shortcoming of the current ACBLscore is the failure to indicate exactly what a pair is awarded masterpoints for in a multi-session event. For a one session event, ACBLscore prints out a designation such as 0.71(OC), which means 0.71 masterpoints for the overall rank achieved in the C strat. For multi-session events, a pair receives the better of (1) the overall award from the partnership’s average percentage across all sessions; or (2) the sum of the partnership’s individual awards from each session. I think the problem is exactly how to indicate the award in the finite space of the dot matrix printouts that the ACBL has been hanging up for decades. Seriously, the ACBL must be one of the largest remaining users of dot matrix printers in the United States. Presumably ACBLscore+ will do better.

The Nbd column gives the number of boards the pair played. Presumably this is the sum of all boards played in sessions on which the masterpoint award is based; otherwise a two digit field would suffice. Everett Boyer figured this field out. I was originally puzzled because it was a three digit field and because the top pair mysteriously had a 026 instead of a 027. That it was the top pair was merely coincidental. They were unable to play one board on a round against a pair that is habitually slow. As for the 024 values, one section had a half table and therefore a sit-out.

Keep in mind that the player names are not included in the ACBL masterpoint report. I have just put them in the table for convenient comparison with the posted result.

The M file concludes with the following unknown information:

90016 0016 0000950700000000000000000000000000009507

How big is the ACBL database?

The number of rows in the largest table of a relational database is one measure of how large the database is. For the ACBL, the masterpoint award table is surely the largest table. Google searches limited to the ACBL website pull up documents from which the total table count in 2012 can be estimated as 3,120,000 for club games, 117,000 for STACs, 157,000 for sectionals, and 186,000 for regionals (see Overall Table Counts). The total for club games includes 906,000 online tables sanctioned by the ACBL to award masterpoints. Therefore, the total table count is roughly 3.6 million tables or 14.4 million player-games. Masterpoints are awarded to the top 40% of pairs. However, pairs may rank in multiple flights and in section as well as overall. In practice, perhaps 50% of all players in a field on average receive some masterpoints. This means the ACBL must add about 7 million rows per year to the masterpoint award table. The ACBL My Masterpoint page has a note that, “Masterpoint detail exists only since 1989.” From this I conclude that the ACBL only has detailed masterpoint records for the last 24 years. This means the total number of rows in the masterpoint award table is probably less than 170 million.
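The estimate above is simple arithmetic, reproduced here as a short Python sketch for anyone who wants to vary the inputs. The table counts, the 50% award fraction, and the 24 years come from the paragraph above; applying the 2012 volume to every year since 1989 is a simplifying assumption, so the result is a rough ceiling rather than a measurement.

```python
# 2012 table counts quoted above (club total already includes online tables).
TABLES_2012 = {
    "club": 3_120_000,
    "STAC": 117_000,
    "sectional": 157_000,
    "regional": 186_000,
}
PLAYERS_PER_TABLE = 4
YEARS_OF_DETAIL = 24   # detailed records exist only since 1989

total_tables = sum(TABLES_2012.values())            # 3,580,000, about 3.6 million
player_games = total_tables * PLAYERS_PER_TABLE     # 14,320,000
rows_per_year = player_games // 2                   # 50% win something: 7,160,000

# Unrounded inputs give a slightly larger figure than the rounded estimate.
total_rows_unrounded = rows_per_year * YEARS_OF_DETAIL   # 171,840,000
total_rows_rounded = 7_000_000 * YEARS_OF_DETAIL         # 168,000,000

print(f"{total_tables:,} tables, {rows_per_year:,} rows per year")
print(f"{total_rows_rounded:,} to {total_rows_unrounded:,} rows in total")
```

Either way the answer is on the order of 170 million rows. The unrounded figure lands slightly above that, so "less than 170 million" depends on rounding the yearly rate down to 7 million or on earlier years having had fewer tables than 2012.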

A 170 million row table is significant. Certainly one would want to think carefully about what indexes to build on it. But compared to the amount of relational data managed by a truly large entity like Google, Facebook, or Microsoft, the ACBL database is really quite small. The whole database should fit on a single DVD. Likewise, it would be cheap to buy enough memory to cache the entire table in memory for quick access by web services.

Security

It’s interesting to speculate on the security of the masterpoint submission process. First, keep in mind that any director could show players one result and then alter scores on enough boards to change the ranking, or simply change the player numbers assigned to various pairs, before the monthly masterpoint report for the club is generated. But this form of cheating would likely be caught quickly because there are so many players who check every fraction of a masterpoint they earn, a behavior that doesn’t seem to fall off even after such players reach life master or some higher rank.

But suppose a club manager added a masterpoint record to one of the events in the masterpoint report after generating it with ACBLscore but before submitting it to the ACBL. Actually it would be better to add a pair of records since an odd number of records might trip up the database loader. If the combination of pair number, section, and direction for the added records didn’t match any of the other masterpoint records for the game, there is a good chance that the ACBL would simply load the additional records and the actual players in the event would be none the wiser.

What defenses could the ACBL have against such tampering? Some kind of checksum in the masterpoint report is one possibility. I haven’t decoded all the fields in the masterpoint report file, and yet none of the leftover fields seem to have the high level of entropy that one would expect from a checksum, at least not a sophisticated one like SHA-1. The database loader is the other main line of defense. Loader defenses could take various forms. One would expect basic relational database table constraints to disallow more than two players from the same event having the same combination of pair number, section, and direction. There may also be basic checks such as flagging awards greater than 10 MP in a club game. Even better would be checking for any awards larger than the top possible masterpoint award given the number of tables in the event and the event classification. However, I didn’t find any fields in the masterpoint report that correspond to the number of tables in each event.
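The two simplest loader checks mentioned above could look something like the following Python sketch. It is a hypothetical illustration, not a description of the ACBL's actual loader, which I have not seen. It assumes each record for one event is a dictionary with `player_number`, `section`, `direction`, `pair_number`, and `masterpoints` keys, and the function name `find_suspicious` and the 10 MP threshold are just the examples from the text.

```python
from collections import Counter

def find_suspicious(records, max_club_award=10.0):
    """Flag two kinds of anomalies in the records of a single event.

    Returns (crowded_seats, oversized_awards):
    - crowded_seats: (section, direction, pair_number) combinations claimed
      by more than two players, which a pair game should never produce.
    - oversized_awards: player numbers whose award exceeds max_club_award.
    """
    seats = Counter(
        (r["section"], r["direction"], r["pair_number"]) for r in records
    )
    crowded_seats = sorted(seat for seat, n in seats.items() if n > 2)
    oversized_awards = [
        r["player_number"] for r in records if r["masterpoints"] > max_club_award
    ]
    return crowded_seats, oversized_awards
```

Note that neither check would catch the attack described earlier, where the added pair uses an unused combination of pair number, section, and direction and a plausible award.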

Better yet would be to have the database loader validate the awards based on the rankings and the number of tables in the event using the same formulas that ACBLscore uses. But this means maintaining code to do those calculations in both ACBLscore and the database loader. Given the continuous small tinkering with the masterpoint formulas, I suspect this is more trouble than the ACBL wants to take.

ACBLscore is old. So is the mainframe technology that the ACBL uses for maintaining the masterpoint database. Both are from an era when making software work at all was more important than worrying about every possible security hole. I’m not encouraging you to hack the ACBL but I bet it would work.