As I mentioned a few weeks ago, I had planned to start up a thread about using the collective technical know-how of the quizbowl community to help create tools for managing various aspects of quizbowl. Among the more pressing issues that I can think of that can use automation are tournament registration and scheduling, question submission, packet archiving, and scorekeeping.

What I'd like to do in this thread is plan and brainstorm about both the specific things we can do and the technologies we'd like to use to do them. There's a lot more technical talent around these days than there used to be, and I think now is a good time to think about how we can marshal this talent to do some useful things for the game. In particular, I would like to coordinate all these different efforts in such a way that people get a chance to do the stuff they're good at and also so that we're not repeating each other's work.

With that in mind, I would like to make two different proposals, one for the technology stack and one for the functionality I'd like to implement.

Tech stack:

Github. This is the sine qua non of collaborative development. Enough said, I think.

MySQL or MongoDB for backend storage. I don't have a really strong preference between these two storage paradigms. There's something to be said for the NoSQL approach in that it allows you to quickly change up your schema, but unless you plan on doing that, it probably doesn't buy you that much. The thing I like about SQL is that if you use Django as the backend framework, you directly get all the ORM goodies that come with Django, and that ORM support is not really well developed for the NoSQL databases. On the other hand, a NoSQL store like MongoDB allows you to do full-stack JavaScript.

Node or Django for backend framework. I think these are the two most viable options. As a Python guy, I have a strong preference for Django; one of the things I don't like about Node is that it can be annoying in any situation where you need to sequentially access the database several times, because you have to nest every one of those accesses in a callback. I also just don't like JavaScript all that much and really prefer the Pythonic paradigms to those of JS. But I've got enough facility with JS to be comfortable programming in it if needed. Node seems to scale quite well and can be used to do full-stack JS, which can be nice. I'm determined to avoid PHP, which I view as a shitty language that results in unmaintainable code.

jQuery and Bootstrap for the frontend. I should say here that I am not a frontend guy and you can see that from anything I have ever designed. Because of this I really prefer standardized UI packages that make it simple for me to make pretty layouts. No one's going anywhere without jQuery, though, so I assume that's gotta be part of any frontend design. If people have strong preferences about frontend UI tools, I'm perfectly happy to have those who know more about this than I do take the lead on it.

Another front-end possibility (somewhat orthogonal to the above) is a framework like Ember or Angular. In some ways I think this might be overkill, but it does get you a lot of "automagic" out of the box. I'm agnostic on this point.
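On the point above about sequential database access: in a synchronous Python backend, dependent queries are just statements that run one after another. Here is a minimal sketch of that, using the standard-library sqlite3 module as a stand-in for whatever database ends up being chosen. The table names, column names, and sample rows are all made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tournament (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE team (id INTEGER PRIMARY KEY, tournament_id INTEGER, name TEXT);
        CREATE TABLE player (id INTEGER PRIMARY KEY, team_id INTEGER, name TEXT);
        INSERT INTO tournament VALUES (1, 'Example Open');
        INSERT INTO team VALUES (1, 1, 'Team A'), (2, 1, 'Team B');
        INSERT INTO player VALUES (1, 1, 'Alice'), (2, 1, 'Bob'), (3, 2, 'Carol');
        """
    )
    return conn


def players_at(conn, tournament_name):
    """Run three dependent queries as plain sequential statements."""
    row = conn.execute(
        "SELECT id FROM tournament WHERE name = ?", (tournament_name,)
    ).fetchone()
    if row is None:
        return []
    team_rows = conn.execute(
        "SELECT id FROM team WHERE tournament_id = ? ORDER BY id", (row[0],)
    ).fetchall()
    names = []
    for (team_id,) in team_rows:
        player_rows = conn.execute(
            "SELECT name FROM player WHERE team_id = ? ORDER BY name", (team_id,)
        ).fetchall()
        names.extend(name for (name,) in player_rows)
    return names
```

Each step uses the result of the previous one without any nesting. In practice a single JOIN (or the ORM) would do this in one query; the three separate lookups are only there to show the control flow.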

Functionality:

Tournament management. To me, this is the currently most pressing need, driven in large part by the fact that ACF has been doing tournament management by spreadsheet, and that has required a great amount of manual work on our part. I envision a system in which teams can seamlessly register themselves for tournaments, hosts and TDs can see who is registered for what, and teams can get invoices for their registration, all in an automated fashion. Schedule construction is a feature that one would probably like to add once the above basic functionality is obtained.

Scorekeeping. Right now, we do all our scorekeeping on paper, necessitating a dedicated stats person who enters the scoresheets into SQBS. Since we are no longer living in the dark ages of 2004 when you could never know whether a university would have some reasonable level of wifi access, I think it's time we thought about moving to an online scorekeeping system. I have previously linked to a skeleton of such a system, which kind of works, but obviously needs to be much more rigorously tested.

Packet submission. This is a big one and a situation where security is absolutely paramount. I think that having online packet submission and a system where editors could construct and edit packets would be a huge benefit. Right now, all of this is done by hand in Google Docs, which mostly works, but necessitates a lot of manual labor on the part of the editors. HSAPQ has such a system in QEMS, but it has a slightly different purpose since it's for writers only, not submissions. Still, something like that which is friendly, easy to use, and has a lot of functionality, is something that we should strive for.

Searchable packet archiving. About a year ago I updated the code of QBDB to get away from PHP (ugh), and I think that's a good base for it; the real trick is having people volunteer to format old tournaments for import. This could be generalized into a Protobowl-style practice app or whatever. I view this as the lowest priority.

Ideally all of these different components would interface with each other seamlessly. That means that you register for a tournament, submit a packet, have a packet read to you, and have the score of your match kept, all without leaving the ecosystem. This is, to put it mildly, an ambitious vision, and it won't get realized in a few months, or possibly even in a few years. But I think we are at the point where the creation of such a toolkit would be of great benefit to everyone in the larger quizbowl community, and I think we also have the talent and human-power to make it happen.

What I'm looking for in this thread is some feedback on these topics, both from technical and non-technical people. Suggestions for functionality are most welcome. I am also looking for volunteers who would be willing to work on such a project. You don't have to commit your life to it, but you should be willing to do at least some non-trivial amount of work. It would be especially great if anyone who has some experience with front-end design would like to help out, because there are few things I dislike more than writing CSS rules or doing layout (and I'm sure it shows). Obviously this will be a collaborative project probably using Github, and we'll want to produce something that people can easily deploy on their own servers if they want. I want to focus on the registration/tournament management system first, and once I feel good about that working, I intend to deploy it on the ACF website, hopefully in time for Fall.

At LASA, we've forked Jerry's QBDB and have been building off of it for an archive. As of right now, it's not polished enough for release but so far we have around 25 college regular difficulty and below tournaments, some basic search functionality, and a random packet generator. Before we release, we want to improve search functionality and add several more tournaments (mainly high school and upper-level college tournaments). Our goal is to release at the beginning of May.

LASA's quiz bowl team has categorized over 15,000 tossups and bonuses to get started. We'd like to use these as a seed for a question category classifier (probably Naive Bayes). If anyone wants to build upon our site (or write the classifier using more sophisticated machine learning) you should email us at arnavNOSPAMsastryATgmailDOTcom.
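For anyone curious what a Naive Bayes category classifier might look like, here is a minimal standard-library sketch of the multinomial variant with add-one smoothing. It is not LASA's code; the class name, the tokenizer, and the four training sentences are invented for illustration, and a real version would train on the categorized question set (and probably use an existing library).

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial Naive Bayes over word counts, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of questions
        self.vocab = set()

    def fit(self, labeled_questions):
        for text, category in labeled_questions:
            words = tokenize(text)
            self.word_counts[category].update(words)
            self.doc_counts[category] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        words = [w for w in tokenize(text) if w in self.vocab]
        best_category, best_score = None, -math.inf
        for category in sorted(self.doc_counts):
            counts = self.word_counts[category]
            denom = sum(counts.values()) + self.alpha * vocab_size
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Toy training data, purely illustrative.
TRAINING = [
    ("This author wrote a novel about a whale", "Literature"),
    ("This poet wrote sonnets and plays", "Literature"),
    ("This element has atomic number six", "Science"),
    ("This particle carries the electromagnetic force", "Science"),
]
```

Usage would be along the lines of `NaiveBayesCategorizer().fit(TRAINING).predict("Which poet wrote this novel")`. Words never seen in training are ignored at prediction time, which is one common convention among several.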

Thanks for putting this up, Jerry. I've also been thinking along these lines recently. There are a fair number of things that could use some improvement. I think it would be a fruitful endeavor to solve a simpler problem and see how it scales out, learn any pain points, etc.

One 'problem' that I think could use some simplification is playtesting. Per the sage recommendations Andrew made, I think people have been doing a lot of good things; for instance, I saw how diligently Auroni was sending out emails for playtesting Nats questions, reminding people the evening of the playtesting, changing passwords for IRC channels, etc. People seem to do this kind of stuff all the time, and there would be some value in making things easier.

In my view the playtesting problem can be subdivided into:
1. Finding the right audience (current solution is to advertise on the forums)
1.a. Security - making sure people are who they claim to be.
2. Setting up infrastructure (a private IRC channel, password protected)
3. Sending out invitations and reminders.

One solution I can think of, at a very high level:
1. A 'portal' where people sign up (using a Facebook/Gmail/Yahoo/Twitter account). You enter the levels (HS, College-easy, College-regular, Open) and the topics (Science, Lit, R/M/P, Arts) you can playtest. The portal asks for the following info at signup: hsqb username, email, and phone.
2. The portal would also give the ability for people to create a tournament, enter in the topics with which they need playtesting help, and the level of the tournament.
3. Based on the info provided, the system comes up with a recommendation of people who might be of value playtesting things. From here, the editor can send out a quick message and check whether the folks are interested in playtesting (or, you know, whether they'd rather play the tournament), what times work, etc.
4. Once the people/date/times are finalized, the editor simply specifies a channel on slashnet and a password. Use some API calls (if possible) to get this set up.
5. The system sends an email reminder 24 hours before the playtesting event, and a text message 30 mins before the event.
6. Rinse and repeat for future rounds of playtesting.
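Steps 3 and 5 above can be sketched roughly as follows. This is only an illustration under assumptions: the `Playtester` record, the scoring rule (count of overlapping topics), and the function names are hypothetical, and actually sending the email or text message is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Playtester:
    username: str
    levels: set = field(default_factory=set)
    topics: set = field(default_factory=set)


def recommend_playtesters(playtesters, level, needed_topics):
    """Rank people who signed up for this level by how many needed topics they cover."""
    needed = set(needed_topics)
    scored = []
    for person in playtesters:
        overlap = len(person.topics & needed)
        if level in person.levels and overlap > 0:
            scored.append((overlap, person.username))
    # Most topic overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [username for _, username in scored]


def reminder_times(session_start):
    """Email 24 hours before the session, text message 30 minutes before."""
    return {
        "email": session_start - timedelta(hours=24),
        "text": session_start - timedelta(minutes=30),
    }
```

The editor would still make the final call on whom to invite; the ranking is just a starting list.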

I'm mostly agnostic in terms of the frontend/backend that would serve something like this. I am familiar with scripting in Python, but I haven't really written too many webapps. I've currently got something running on Heroku using Flask. It's pretty straightforward to get things set up, but I just haven't had time to get going on something like this.

Would appreciate any thoughts on whether an app like this would be useful, and if it is not, what would make it more useful, etc.

WildKard wrote:At LASA, we've forked Jerry's QBDB and have been building off of it for an archive. As of right now, it's not polished enough for release but so far we have around 25 college regular difficulty and below tournaments, some basic search functionality, and a random packet generator. Before we release, we want to improve search functionality and add several more tournaments (mainly high school and upper-level college tournaments). Our goal is to release at the beginning of May.

LASA's quiz bowl team has categorized over 15,000 tossups and bonuses to get started. We'd like to use these as a seed for a question category classifier (probably Naive Bayes). If anyone wants to build upon our site (or write the classifier using more sophisticated machine learning) you should email us at arnavNOSPAMsastryATgmailDOTcom.

-Arnav and Freed

Ah the joy of having time in HS.

But yeah, Multinomial (or regular old) Naive Bayes will work great for something like this. Hell, if I had some kind of test/train set, I could get a basic thing going a-la a project I did on the crunchbase companies database (it's not super sophisticated):

As a TD, there are often times when I cannot get everyone internet access - and this is true much more often than not. While I, of course, would love to see a good online scoring system, I would also really like a good replacement for SQBS that handles things more sensibly.

Cody wrote:As a TD, there are often times when I cannot get everyone internet access - and this is true much more often than not. While I, of course, would love to see a good online scoring system, I would also really like a good replacement for SQBS that handles things more sensibly.

I think SQBS is a good fallback in such cases. But I don't want to build two systems to bridge corner cases; that will result in fragmentation and nothing getting done. My preference is to build a fully online system and then worry about the situations where it can't be used.

I would certainly be up for contributing in some sort, though I will admit I don't really have any experience with collaborative projects.

Hopefully we can avoid doing another BeES; having half the projects completed is much better than having everything half-completed.

cody wrote:As a TD, there are often times when I cannot get everyone internet access - and this is true much more often than not. While I, of course, would love to see a good online scoring system, I would also really like a good replacement for SQBS that handles things more sensibly.

I really wish SQBS were open source. There are a few quality-of-life improvements that would make stats that much easier to do (enabling the round report by default, defaulting to 20 TUs/match, etc.).

I don't think I'll have time to actively contribute to this project, but I think it's great and would be happy to discuss how NAQT has approached many of these problems. If anyone is interested in that, please email me.

You might also want to talk to Jim Puls about Tournament Director, which NAQT uses for stats at HSNCT, MSNCT, and sometimes ICT, and might use for SSNCT. I think at some point Jim wanted to integrate much of what you're talking about into Tournament Director. I'll point him at this thread.

As I mentioned in the previous thread, I would definitely be willing to commit some time to these projects (starting in June or so). Unfortunately, I'm more of a back-end guy, too, but yeah. As another Python guy I'd like to cast a vote for Django over Node, for the same reasons you mention. Assuming that we do go with Django, I'm not sure that I see any compelling reasons to go the NoSQL route. Do you suppose it might be worth using Postgres rather than MySQL, though? I've only worked with the latter, myself, but I hear that Postgres is basically Pareto-superior to MySQL. No thoughts on front-end technology - I've barely interacted with jQuery/Angular/any other modern paradigm, so I don't really have an educated opinion as to which is better.

Re: packet archive - I'm no expert on this, but it seems like adding some sort of gamification to reward tasks like uploading old packets / tagging questions by category / adding metadata would be a great way to induce bored high schoolers et al. to do these things. Automatic classification is an excellent start, but the amount of data we're dealing with really isn't so large as to make by-hand classification prohibitive.

Re: offline scorekeeping - Cody's right that this is important. I think the best way to approach this at first would be to allow the online service to consume SQBS files, but in the long run, it would probably make more sense to have the online service be able to gracefully degrade to an offline web app that is basically SQBS except better, thus displacing SQBS altogether.

grapesmoker wrote:we'll want to produce something that people can easily deploy on their own servers if they want.

Obviously, this would be nice, but do you think anybody's going to really want to deploy these things themselves? Maybe the packet-submission/tournament-writing component, for reasons of security, but for everything else, it would probably make the most sense to have them centrally hosted under the aegis of some quizbowl organization or luminary.

jonah wrote:You might also want to talk to Jim Puls about Tournament Director, which NAQT uses for stats at HSNCT, MSNCT, and sometimes ICT, and might use for SSNCT. I think at some point Jim wanted to integrate much of what you're talking about into Tournament Director. I'll point him at this thread.

Hi!

Jonah is correct, these are very much in line with my long term goals for Tournament Director.

Today, it's rather HSNCT-centric, since I've had exactly no reason to build for anything else. But there's no reason that has to be the case.

The idea is that you can run it equally well locally or online and have the data sync from place to place. My latest project is a partial port to iOS that you can take with you on an iPad in each game room for live entry.

One corollary to Ashvin's statement that "I think the best way to approach this at first would be to allow the online service to consume SQBS files" is that we really need a better data interchange format, preferably one that allows variation in the amount of granularity (in a perfect world, everything from buzz points to just teams' records or even just their rankings). Human readability would be a plus. This has been a particular concern of Jim and me for a while; Tournament Director currently uses the Livestat format to pass around data, and if you don't know what that is, congratulations, you have identified part of the problem. Chris Carter posted something about this a few months ago, but I haven't looked at it yet.

Oh, yes, definitely. SQBS should only be a stopgap until such a time as we have a better data interchange format. Jonah, is a specification for LiveStat and/or code that handles LiveStat-formatted data available publicly? If so, could you point us to it?

I've looked at Chris Carter's proposal (Tournakit), and it seems reasonable at a brief glance - it's just plain old JSON following a particular schema. At the moment, the schema doesn't appear to allow for better granularity than SQBS, but that's one of the nice things about JSON - it's pretty easy to fix that.
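To make the granularity idea concrete, here is a rough sketch of one JSON record shape that can carry a team's result at three levels of detail: final score only, per-player tossup counts, or individual buzzes. This is not the Tournakit schema or any existing format; every field name here is an assumption made up for illustration.

```python
import json

# The same kind of record at three hypothetical levels of detail.
COARSE = {"name": "Team A", "score": 255}
MEDIUM = {
    "name": "Team B",
    "bonus_points": 50,
    "players": [
        # Tossup point value -> number of tossups answered at that value.
        {"name": "Alice", "tossups": {"15": 1, "10": 2, "-5": 1}},
    ],
}
FINE = {
    "name": "Team C",
    "bonus_points": 30,
    "buzzes": [
        # "word" is the buzz point within the tossup text.
        {"tossup": 1, "player": "Bob", "word": 42, "value": 15},
        {"tossup": 2, "player": "Carol", "word": 97, "value": 10},
        {"tossup": 5, "player": "Bob", "word": 12, "value": -5},
    ],
}


def team_score(team):
    """Compute a team's score from whichever level of detail is present."""
    if "score" in team:
        return team["score"]
    points = team.get("bonus_points", 0)
    if "buzzes" in team:
        points += sum(buzz["value"] for buzz in team["buzzes"])
    else:
        for player in team.get("players", []):
            points += sum(
                int(value) * count for value, count in player["tossups"].items()
            )
    return points


def roundtrip(record):
    """Serialize to human-readable JSON text and parse it back."""
    return json.loads(json.dumps(record, indent=2, sort_keys=True))
```

A consumer that only cares about records and rankings can call `team_score` on any of the three shapes, while a buzz-point analysis tool would require the finest one.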

Excelsior (smack) wrote:Oh, yes, definitely. SQBS should only be a stopgap until such a time as we have a better data interchange format. Jonah, is a specification for LiveStat and/or code that handles LiveStat-formatted data available publicly? If so, could you point us to it?

I've looked at Chris Carter's proposal (Tournakit), and it seems reasonable at a brief glance - it's just plain old JSON following a particular schema. At the moment, the schema doesn't appear to allow for better granularity than SQBS, but that's one of the nice things about JSON - it's pretty easy to fix that.

Livestat was a command-line Perl script that worked with append-only CSV files and a second Perl script that processed them into HTML. It's yesterday's news and you don't want to go there and Tournakit looks oh-so-much-better.

Excelsior (smack) wrote:Do you suppose it might be worth using Postgres rather than MySQL, though? I've only worked with the latter, myself, but I hear that Postgres is basically Pareto-superior to MySQL.

I don't see why not. I don't have any strong feelings about the two either way.

Excelsior (smack) wrote:Re: offline scorekeeping - Cody's right that this is important. I think the best way to approach this at first would be to allow the online service to consume SQBS files, but in the long run, it would probably make more sense to have the online service be able to gracefully degrade to an offline web app that is basically SQBS except better, thus displacing SQBS altogether.

Yes, I'd agree with that.

Excelsior (smack) wrote:Obviously, this would be nice, but do you think anybody's going to really want to deploy these things themselves? Maybe the packet-submission/tournament-writing component, for reasons of security, but for everything else, it would probably make the most sense to have them centrally hosted under the aegis of some quizbowl organization or luminary.

I think there are some people who might want to deploy their own servers for whatever reason. I don't see any obvious reason why, say, ACF couldn't host such a thing, but sometimes people have their own reasons for running separate sites.

jonah wrote:One corollary to Ashvin's statement that "I think the best way to approach this at first would be to allow the online service to consume SQBS files" is that we really need a better data interchange format, preferably one that allows variation in the amount of granularity (in a perfect world, everything from buzz points to just teams' records or even just their rankings). Human readability would be a plus. This has been a particular concern of Jim and me for a while; Tournament Director currently uses the Livestat format to pass around data, and if you don't know what that is, congratulations, you have identified part of the problem. Chris Carter posted something about this a few months ago, but I haven't looked at it yet.

I think the interchange format should be some form of JSON. XML sucks for human readability. I'm happy for it to be Chris' Tournakit or anything else for that matter, as long as it's consistent. This might be one reason to go NoSQL, because the documents stored are usually just JSON objects anyway.

gustavus.adolphus wrote:Going with NoSQL just because its native data model is JSON seems presumptuous to me, particularly when one could just traverse the database (Postgres or whatever) to create the JSON on the fly.

Certainly you can create the JSON on the fly. The advantage of the NoSQL paradigm is that you store the JSON objects natively. If you try and traverse the database to do this, you are adding an extra step into the logic; doesn't mean it can't be done, but it's not the "cleanest" thing either. In my view, the NoSQLs make a lot of sense when you have a lot of data that could be stored as documents or subdocuments; I don't think quizbowl really has such a structure, which is why the traditional SQL tables approach should work fine. Either way, it's really beneficial to have ORM, which exists for either storage paradigm. Node is more suited to doing stuff directly in JSON (or, really, using something like Mongoose for ORM), while Django translates tables to Python classes. As someone more comfortable with Python than JS, I prefer the latter option, but I'm just trying to cover all the different possibilities.
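For reference, "creating the JSON on the fly" from relational tables is a fairly small extra step. A minimal sketch, using the standard-library sqlite3 module as a stand-in for Postgres or MySQL; the `game` and `team_game` tables and their columns are hypothetical.

```python
import json
import sqlite3


def make_stats_db():
    """Create an in-memory database with two made-up tables and one game."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE game (id INTEGER PRIMARY KEY, round INTEGER);
        CREATE TABLE team_game (game_id INTEGER, team TEXT, score INTEGER);
        INSERT INTO game VALUES (1, 3);
        INSERT INTO team_game VALUES (1, 'Team A', 255), (1, 'Team B', 140);
        """
    )
    return conn


def game_to_document(conn, game_id):
    """Traverse the tables for one game and assemble a JSON-ready dict."""
    game = conn.execute(
        "SELECT id, round FROM game WHERE id = ?", (game_id,)
    ).fetchone()
    if game is None:
        return None
    teams = conn.execute(
        "SELECT team, score FROM team_game WHERE game_id = ? ORDER BY team",
        (game_id,),
    ).fetchall()
    return {
        "id": game[0],
        "round": game[1],
        "teams": [{"name": name, "score": score} for name, score in teams],
    }


def game_to_json(conn, game_id):
    """Serialize the assembled document as JSON text."""
    return json.dumps(game_to_document(conn, game_id), sort_keys=True)
```

With an ORM the traversal would be attribute access on model objects instead of raw SQL, but the shape of the step is the same: query, assemble a dict, serialize.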

Naturally, a lot of my input on this will relate to how the Quizbowl Resource Database fits into this technology platform. The database was designed with a similar goal in mind of unifying as much as possible into a single system; a lot of the functionality suggested thus far are things that I've long wanted to implement in the database or see implemented elsewhere. Of course, when I'm the only one who's really ever worked on the database, my full time job is in software development, and I spend so much time traveling to tournaments and taking a break from programming/quizbowl, it's no surprise I haven't devoted nearly as much time to it as I have wanted. (Too bad all the free time I had in high school/college was spent on sloppily-coded MSHSAA-style quizbowl influenced VB.NET/Access web apps that are now defunct...) As much as I've wanted to implement some of this stuff myself, it's obvious that I simply don't have the time for it, and it makes more sense for the qualified community as a whole to collaborate on a unified quizbowl software ecosystem.

The Quizbowl Resource Database was written almost entirely from scratch by me in PHP, using MySQL, with basically no frontend/backend libraries whatsoever other than what PHP (and phpBB) provide. Anyone joining the effort to maintain that would have to deal with PHP, the strange things I've come up with, etc. The most obvious benefit of the current system is that it is tightly integrated with the forums - it uses phpBB credentials to authenticate users so people who have a forum account don't have to maintain a separate account to use it, and it also allows us to generate those (hopefully not too annoying) emails asking people to post tournaments to the database when they announce them on the forums. It is also extremely convenient to have tournament listings, statistics reports, and question sets all in one place, and it would be nice to add on more functionality to that centralized system.

As much as I'd hate to see another one of my personal programming projects go defunct, it's becoming more apparent to me that it probably makes more sense to leverage the talents and availability of the community and come up with something new and improved rather than try to use what little time I have to maintain it myself or bring others on board and have them learn the quirks of the current system. With 4+ more years of industry experience since I first started the database, there are certainly some things I would do differently if I were to do it all over, and this community effort may be a good way to do that. I'm really glad this discussion is happening now, because with the end of the regular season in Missouri, I'm finally coming up with some free time to work on the database again, and it would be good to know if I'd be working on something that would last for a long time or if it's just going to be rendered obsolete by something else in a few months.

As for some of the features suggested in this thread:

grapesmoker wrote:Tournament management. To me, this is the currently most pressing need, driven in large part by the fact that ACF has been doing tournament management by spreadsheet, and that has required a great amount of manual work on our part. I envision a system in which teams can seamlessly register themselves for tournaments, hosts and TDs can see who is registered for what, and teams can get invoices for their registration, all in an automated fashion. Schedule construction is a feature that one would probably like to add once the above basic functionality is obtained.

I have wanted to implement this for several years now but it has never made it to the top of my task list. This functionality is a natural fit for the existing tournament database - this is the reason why pricing information on database entries is entered the way it is and not as a free-form text box, for instance. I was hoping to implement this by next fall, but then again, that's something I've said the last 2 summers. This is the kind of situation that I don't want to see fragmented across multiple systems; I would like to see this tightly integrated with the community's "master" tournament database, whether it be integrated into the existing database, or written in a way that the existing tournament database could eventually be replaced with it.

grapesmoker wrote:Scorekeeping. Right now, we do all our scorekeeping on paper, necessitating a dedicated stats person who enters the scoresheets into SQBS. Since we are no longer living in the dark ages of 2004 when you could never know whether a university would have some reasonable level of wifi access, I think it's time we thought about moving to an online scorekeeping system. I have previously linked to a skeleton of such a system, which kind of works, but obviously needs to be much more rigorously tested.

Cody wrote:As a TD, there are often times when I cannot get everyone internet access - and this is true much more often than not. While I, of course, would love to see a good online scoring system, I would also really like a good replacement for SQBS that handles things more sensibly.

Jonah wrote:One corollary to Ashvin's statement that "I think the best way to approach this at first would be to allow the online service to consume SQBS files" is that we really need a better data interchange format, preferably one that allows variation in the amount of granularity (in a perfect world, everything from buzz points to just teams' records or even just their rankings). Human readability would be a plus.

While SQBS importing would be good for importing archived statistics to the system, I think a better interchange format is a must to ensure maximum compatibility between applications. Like Cody, I definitely want to see an offline solution that integrates naturally with the system. I have never had internet access on my laptop when directing the NAQT Missouri Qualifier at Mizzou, and rarely have internet access on my laptop when doing stats at area high schools. When posting stats, I would much rather have people upload the raw data file and generate the HTML on the server side, but the obscurity of the SQBS format has pushed that extremely low on my priority list. A good interchange format would also make it much easier for people to get a raw data dump to use for their own purposes (for instance, something like HSQBRank that could pull raw data from a central database and calculate team rankings automatically). Ideally, this interchange format would support arbitrarily merging data from multiple sources as automatically as feasible (for inputting stats from multiple places, other services synchronizing data with the master database without wasting bandwidth getting redundant dumps, etc).
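One way the merging and bandwidth-saving synchronization described above could work is to give every game record a stable ID and a revision counter. The sketch below assumes exactly that; the `id` and `revision` fields and both function names are hypothetical, and real conflict handling (two people editing the same game offline) would need more thought than "highest revision wins".

```python
def merge_games(*sources):
    """Combine game records from several files, keeping the newest revision of each."""
    merged = {}
    for source in sources:
        for game in source:
            current = merged.get(game["id"])
            if current is None or game["revision"] > current["revision"]:
                merged[game["id"]] = game
    return sorted(merged.values(), key=lambda game: game["id"])


def changes_since(games, known_revisions):
    """Return only the games the other side lacks or holds at an older revision."""
    return [
        game
        for game in games
        if known_revisions.get(game["id"], -1) < game["revision"]
    ]
```

A client that already holds some games would send its `{id: revision}` map and receive only the output of `changes_since`, rather than a full dump.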

I would like to see a new statistics platform written from scratch that uses an extensible, well-documented data format/backend so that frontend applications could be developed for various platforms (web interface, standalone PC/Mac/etc. that could be used offline, iOS/Android/Windows Phone/etc.) and functionalities (stat viewing, SQBS data entry replacement, electronic scorekeeping, BEeS-like system, etc.), for whichever combinations of those things demand exists. It must be designed with rebracketed tournaments in mind, so you don't have to juggle multiple data files - this is by far my least favorite thing about SQBS.

------------------------
Here's an overview of my vision of how a unified system could work, which brings together some things that already exist in the current database and some other things proposed so far. (Note that anything I suggest as "automatic" should be an optional aid and the output from such "automatic" functions should be manually modifiable if necessary.)

A master database would contain information about all the various schools, organizations, people, etc, involved in quizbowl. This database could be used to generate contact lists for hosts to send out tournament announcements, hosts to contact willing staffers in the area, etc. This should definitely be designed in a way to reduce the risk of making this contact information readily available in bulk to spammers or other nefarious people. (For instance, by limiting access to email addresses to trusted registered users, etc.)

Tournament hosts announce their tournaments and add them to the tournament listing database, allowing people to register using an online registration form. These tournaments are linked to the question set they are hosted on.

Teams search for tournaments in their area and register for tournaments they'd like to attend. In the ideal case that all tournaments were in the database and used the database for registration, teams could be alerted if they've already registered for a tournament on that date, on that question set, if there's a closer tournament on that question set, etc.

Teams could also subscribe to tournament listings to be alerted when new tournaments are announced in their area.

Tournament hosts can use the registration system to keep track of which teams are registered, how many buzzers and staffers they have, etc. Invoices could be generated and sent to teams through the system.

Teams would enter their rosters in the online system before the tournament. For high school tournaments, the roster entry could prompt for college information, and the player database could effectively replace the existing Freshman Contact List.

The tournament listing could display a list of registered teams and rosters.
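The registration alerts mentioned above (same date, same question set, a closer site on the same set) could be checked with something like the following sketch. The `Tournament` fields are assumptions; in particular, `distance_miles` is taken to be a precomputed distance from the registering team, which a real system would have to derive from addresses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tournament:
    name: str
    day: date
    question_set: str
    distance_miles: float  # distance from the registering team (assumed precomputed)


def registration_warnings(new, already_registered, all_listed):
    """Build warning messages for a team about to register for `new`."""
    warnings = []
    for other in already_registered:
        if other.day == new.day:
            warnings.append(
                f"Already registered for {other.name} on {new.day.isoformat()}"
            )
        if other.question_set == new.question_set:
            warnings.append(
                f"Already registered for {other.name}, which uses {new.question_set}"
            )
    closer = [
        other
        for other in all_listed
        if other != new
        and other not in already_registered
        and other.question_set == new.question_set
        and other.distance_miles < new.distance_miles
    ]
    for other in sorted(closer, key=lambda t: t.distance_miles):
        warnings.append(f"{other.name} also uses {new.question_set} and is closer")
    return warnings
```

These are advisory only; the team can still go ahead and register.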

Before the tournament, a blank statistics data file could be generated from the entered rosters.

Optional enhancements:

The roster/team information could be passed through a system that analyzes past tournament statistics to generate a preliminary seeding.

The system could suggest a schedule based on the number of packets and teams, user-provided scheduling constraints, etc.

The TD can adjust the seeding as desired and then have teams placed into the schedule automatically, with an option to automatically swap teams to avoid teams from the same school being in the same pool, enhance geographic diversity, etc.

This data file could be fed to a schedule viewer that produces a schedule suitable for printing. The schedule could also be made available online so that attendees can look up the schedule on their phones/tablets.

Data entry could be handled multiple ways.

It could be done online from a web interface or Internet-connected program or app, either from a central stat person or from scorekeepers in each room.

One or more stat people could enter statistics offline in a manner similar to how SQBS works now, and these data files could be merged with each other and/or uploaded to the central database anytime.
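A minimal sketch of the merge step, assuming each data file is just a list of game records shaped like `{"round": 1, "scores": {"Team A": 300, "Team B": 150}}`. That record shape is hypothetical; it is not the SQBS format.

```python
def game_key(game):
    """Identify a game by its round and the set of teams that played."""
    return (game["round"], frozenset(game["scores"]))


def merge_games(*files):
    """Merge several lists of game records.

    Returns (merged, conflicts). A game entered identically in two files is
    kept once; a game entered with different scores is reported as a conflict
    and the first version seen is kept.
    """
    merged, conflicts = {}, []
    for games in files:
        for game in games:
            key = game_key(game)
            if key in merged and merged[key]["scores"] != game["scores"]:
                conflicts.append((merged[key], game))
            else:
                merged.setdefault(key, game)
    return list(merged.values()), conflicts
```

Conflicts are returned rather than resolved automatically, since only the stat person can tell which entry matches the scoresheet.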

A BEeS/Abacus-like frontend could eventually be used.

The schedule data and entered results could be fed to a feature that evaluates the possibility of ties for rebracketing occurring based on certain outcomes in the remaining rounds.
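For a small number of remaining games this can be done by brute force: enumerate every combination of winners and count how many leave a tie on wins across the rebracketing cutoff. The sketch below assumes every game has a winner and that only win totals matter; the function name and arguments are hypothetical.

```python
from itertools import product


def tie_scenarios(wins, remaining, cutoff):
    """Count outcomes that leave a tie across the cutoff.

    wins:      {team: wins so far}
    remaining: list of (team_a, team_b) games still to be played
    cutoff:    number of teams that advance to the top bracket
    Returns (tied_outcomes, total_outcomes).
    """
    tied = total = 0
    for outcome in product((0, 1), repeat=len(remaining)):
        final = dict(wins)
        for game, winner_index in zip(remaining, outcome):
            final[game[winner_index]] += 1
        ranked = sorted(final.values(), reverse=True)
        # A tie matters when the last team in and the first team out are level.
        if cutoff < len(ranked) and ranked[cutoff - 1] == ranked[cutoff]:
            tied += 1
        total += 1
    return tied, total
```

The number of outcomes doubles with each remaining game, so this is only practical for the last round or two.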

The statistics program could use the schedule to automatically rebracket teams based on the entered results. The schedule viewer could be used to print new schedules, or teams could just get the schedules online.

The tournament host uploads the file to the statistics repository after the tournament (if not throughout the day), making the report available to the public and the data available to anyone interested in using the data.

When a question set is posted after all mirrors have concluded, all teams that registered for a tournament using that set could be notified, if desired.

As mentioned and otherwise implied earlier, the central statistics repository would have data in a format that could be easily consumed to create rankings, playing histories for schools/teams/players, etc.

---------------------------
Anyway, I would love to be involved in this project with whatever time I can find to do so. I don't really have any experience with the various frameworks/libraries that exist today; my web-based programming experience is just several personal projects using PHP/MySQL (older projects in VB.NET), and my professional experience technology-wise is not particularly relevant to this project.

If this project involves replacing the existing Quizbowl Resource Database with a reboot based on this new ecosystem, I'd naturally be able to help out with this transition and to make sure all the data in the existing system is migrated. I would like to see the tight integration with the hsquizbowl forums continue because of the benefits that having a single account provides, and I imagine we could figure out a way to do that even if the new system is written in something other than PHP. If the system ends up being completely independent of hsquizbowl, I have been sitting on quizbowl.org for over 5 years now and I think this would be a great use for that domain for whoever will be hosting this service.

All of the work that Jeffery and others have put into thinking about this seems really great, and I'd love to help wherever I can. However, I think I should mention the second-system effect (http://en.wikipedia.org/wiki/Second-system_effect), as that seems (to me at least) to be a potential problem with the project.

Of course, I'm happy to work on whatever the community thinks up, so if you need any help with Python/.NET stuff (with a bit of frontend stuff), I would be happy to help out.

Jeff, thanks for your post. I agree with the list of features that you've outlined; most of those are part of my vision as well. It would be nice if we could integrate with the forums somehow, but for the moment I'm putting that particular effort on the back-burner. I think once we have real functionality to offer, we can start thinking about integration. As for PHP, the old version of QBDB was written in it, and it was a huge mess, so I don't want to go back to that again.

Just a thought: perhaps Jeff's (amazing) database functionality of the forums could be configured to accept the uniform output file that people want this new project to generate (instead of/in addition to the SQBS files it currently accepts). Because it's integrated with the forums in the way Jeff describes, I'm guessing it could be reconfigured to, for example, automatically post and update an announcement thread from a particular account as logistical data from Jerry's program is entered and updated, automatically post a link to stats as stat reports are posted, and serve (as it does now) as a clearinghouse for searchable stats links and packets.

In other words, instead of replacing the framework Jeff has created or worrying about integrating Jerry's project with the forums, why not use Jeff's database as a sort of adapter plug that would integrate Jerry's project with the forums? If this is technically unfeasible, obviously ignore.

theMoMA wrote:Just a thought: perhaps Jeff's (amazing) database functionality of the forums could be configured to accept the uniform output file that people want this new project to generate (instead of/in addition to the SQBS files it currently accepts). Because it's integrated with the forums in the way Jeff describes, I'm guessing it could be reconfigured to, for example, automatically post and update an announcement thread from a particular account as logistical data from Jerry's program is entered and updated, automatically post a link to stats as stat reports are posted, and serve (as it does now) as a clearinghouse for searchable stats links and packets.

In other words, instead of replacing the framework Jeff has created or worrying about integrating Jerry's project with the forums, why not use Jeff's database as a sort of adapter plug that would integrate Jerry's project with the forums? If this is technically unfeasible, obviously ignore.

This is probably the way we'll go when we get to it. A common data interchange format is one of the first things we need to establish, and once we do that, there's no reason why Jeff's database couldn't read it.

This is so far over my head as to be outright horrifying, but I am a HUGE fan of what you folks are trying to do here. Hoover High, down in Alabama, hosts two major tourneys each year (anywhere from 30-60 teams playing), full wi-fi and wired access, and a very cooperative tech management staff who will work with me to get things set up in advance. I would happily volunteer to field test any new system, or to contribute time/effort in any area where I can actually assist in this regard.

Joshua Rutsky
Coach, Hoover High School, Hoover, AL
Member of the Qwiz Team!

I haven't run a tournament before, so correct me if I'm wrong, but I don't think that these systems need to communicate in any way. They should have a consistent look and feel, a consistent interface, but we should be able to isolate the components. As much as possible should stay the same between the suite of tools (especially the way we store questions), but I think having three different teams in communication might be the fastest / most efficient way of doing this.

Our fork started off with NoSQL and MongoDB, but we soon switched over to PostgreSQL. The lack of JOINs in NoSQL may be a net positive at Facebook scale, but modeling tournaments, tossups, and packets seems to lend itself well to a relational database model. Right now, our site is using sequelize, which hasn't been great at representing complex queries. There are other node SQL ORMs like orm2 and waterline, but I haven't looked into them yet.

The case for Node is a single language everywhere, even if it isn't a nice one. Also, node has node-webkit, which might make it easy to create a website, and then compile it to Windows, Mac, and Linux for offline use. Does anyone know if Django has anything similar?

You can create tournaments, add brackets, add divisions, add teams, add rounds, and change all of these settings at any time. There are two different levels of access control - director, and moderator. Directors can do anything with a tournament, including deleting it. Moderators may only upload statistics using the Abacus interface you're probably familiar with. User accounts are done through Persona, so making an account is as easy as just logging in to your e-mail with Persona.
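The two access levels boil down to a very small permission check. This is only an illustration of the rule as described, not the actual Abacus code; the action names are made up.

```python
from enum import Enum


class Role(Enum):
    DIRECTOR = "director"
    MODERATOR = "moderator"


# Hypothetical action name; moderators may only upload statistics.
MODERATOR_ACTIONS = frozenset({"upload_statistics"})


def can(role, action):
    """Directors can do anything; moderators only what is whitelisted."""
    if role is Role.DIRECTOR:
        return True
    return role is Role.MODERATOR and action in MODERATOR_ACTIONS
```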

It isn't polished yet, so it might be a little difficult to use, but everything works and it has basically all the features of SQBS except for schedule generating (round robin, etc.).

Right now the statistics take a while for very large tournaments, but I'm currently implementing a way to reduce this time to a few seconds at most. A feature to export statistics to raw HTML that can be uploaded to hsquizbowl.org/db should be done within the week (hopefully by Saturday). That link is the development version, so it may go down at any time since I'm developing right now. I'm hoping to push to production in a few days (and open source it then too). Abacus runs on Flask and uses SQLAlchemy/sqlite for the database, and just Bootstrap for the front-end. I want to try out AngularJS in the coming months as well as WebSockets for statistics. I'm probably going to migrate to PostgreSQL sometime too.

The e-mail has been deprecated in favor of the database, and every time the moderator presses a score button (either tossup or bonus), the round statistics are saved into a cookie so if your browser crashes, you won't lose any data at all (though you still have to synchronize beforehand.) Just click Save Statistics>Load>Recovered Round.

Having a well-designed interchange format will give us a lot of flexibility. With a reliable way to exchange data between systems, we would be able to break this up into independent components that are responsible for specific tasks (stat keeping, question production, etc.) and use the common data format to exchange data between the components as needed.
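To give a feel for what such a format might look like, here is a sketch that round-trips a tournament through JSON. Every field name here, including `format_version`, is invented for illustration; the actual format is still to be agreed on.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Game:
    round_number: int
    scores: dict  # {team name: points}


@dataclass
class TournamentFile:
    name: str
    question_set: str
    teams: list = field(default_factory=list)
    games: list = field(default_factory=list)
    format_version: int = 1  # lets readers detect incompatible changes


def dump(tournament):
    """Serialize to a JSON string that any component could consume."""
    return json.dumps(asdict(tournament), sort_keys=True)


def load(text):
    """Rebuild a TournamentFile from the JSON produced by dump()."""
    raw = json.loads(text)
    raw["games"] = [Game(**g) for g in raw["games"]]
    return TournamentFile(**raw)
```

A plain-text format like this keeps the components independent: the stats program, the archive, and the central database only need to agree on the field names, not on a language or framework.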

Similar to Arnav, I see these primary components of a "complete" system, with flexibility to add additional components as necessary:
- Quizbowl Resource Database (tournament listings, etc)
- Question production program
- Statistics program/tournament manager
- Question archive/search engine

These components could be modules within a single application, completely independent applications, or a combination of both.

I see the Quizbowl Resource Database's role as being the "master" database that other components can push/pull data from. In addition to what it has now (entries for tournaments, tournament statistics, and question sets), it would also have information about people, organizations, and other entities that may need to be exchanged between components.

I think it makes sense for this central database to be responsible for tournament search, since so much in quizbowl revolves around tournaments. As I said earlier, I think adding the ability to register for tournaments through the QRD is a natural fit, since it would allow people to use the same system to find and register for tournaments, which would also have the benefit of allowing people with forum credentials to edit their registrations without having to maintain a separate account. If designed correctly, the data exchange format would not prevent others from running their own decentralized registration systems (such as if ACF wanted to handle registrations through their own website) and synchronizing data with the central database as needed. (Though, it would be nice to avoid fragmentation where possible.)

Of course, if decentralized parts of the system require authentication and we want to use a common login throughout the ecosystem, the existing use of forum credentials will obviously not work for that. If that were determined to be the case, forum integration becomes merely a peripheral component, and a complete rewrite of the central database designed specifically for this unified platform may be better for the project in the long run, and forum integration would likely become a separate component from the central database.

A question production system would handle all the aspects of set production. This would be able to output question sets in common formats like PDF or DOC, as well as a raw exchange format that could be consumed by an Abacus-like packet reader, packet archives/search engines, etc.

Statistics programs would naturally handle statistics entry, generating results in the common interchange format to be archived in the central database. The program would be able to operate on a completely standalone basis. Tournament registration information, a schedule module, and potentially historical statistics could be combined to generate a statistics file ready for data entry; this could be handled by a module in the central database, a module in the statistics program, and/or an entirely separate helper application.

The packet archive would ideally contain both downloadable packets and search engine capabilities. It would be able to take in DOC/PDF as well as raw question data in the interchange format. If the current database continued to use phpBB for authentication, it might make sense to leave the quizbowlpackets.com DOC/PDF archives there, but provide a convenient interface to allow separate search engines/question readers to pull packets (which would ideally also be available in raw format, either as part of the central database or from a separate component). I think having people upload packets directly is more convenient than emailing packets to a central person, and having the authentication adds a layer of protection from people trying to upload random stuff. Although, the benefits of having all packet archive functionality in a single system may outweigh the benefits of keeping quizbowlpackets.com as part of the QRD. Of course, if we use some other method for authentication that the question archive database could use, it might make more sense for packets to be uploaded there and that dedicated system to become quizbowlpackets.com.

This is just one potential approach; it's definitely ambitious, and implementing a good method of exchanging/synchronizing data between components may be quite challenging. I think keeping components modularized in some way, having a good data exchange mechanism, and clearly prioritizing what order to implement the various components/features are important to help us produce a flexible, maintainable suite of tools.

Jeff, one way to handle interchange is if you could create a REST API for the HSQB database. That way sending data between systems would be straightforward. As far as packet generation is concerned, PDFs are my favorite output; I don't know that there's any point in generating DOCs but I guess we could do that if people really wanted to. I don't think we want to worry about searching PDFs though; that seems like a lot of extra work on top of searching a plain-text database.

I like Raj's interface and the functionality is quite nice, although I still find a few things confusing about it. Raj, can you make that code public so we can all look at it?

It looks like all of us are mostly on the same page. I don't detect a lot of disagreement about what tools to use; there seems to be some slight preference for Postgres over MySQL, which is fine with me, and SQL in general over NoSQL, which I think also makes sense. To me, this also suggests using something like Django for the backend, because my impression is that Node is not as advanced in handling relational DBs as Django is; in Django, the ORM basically handles all the relational magic for you, which I like. But if I'm wrong about that and people have a strong preference for Node, I'm happy to work with that too. There also appears to be general agreement on front-end tools.

Here's what I suggest: let's have an online IRC meeting sometime next week and try to decide what the plan is for going forward. We have a number of projects that could be used as a skeleton for building around, and a lot of people who are motivated and can contribute across multiple areas. So let's get together and decide what we're going to do. I've created a Doodle poll to facilitate this, so if you're interested, please indicate your preferred times (all times are EST) and we'll go with what works best for the largest number of people.

For what it's worth (hint: not much,) I've been working on and off on packet parsing for a couple of years, although most of what I've come up with hasn't been coded yet. If anyone wants to compare notes, I can be reached at j(spam)alex(spam)malone(at)gmail.

grapesmoker wrote:Jeff, one way to handle interchange is if you could create a REST API for the HSQB database. That way sending data between systems would be straightforward. As far as packet generation is concerned, PDFs are my favorite output; I don't know that there's any point in generating DOCs but I guess we could do that if people really wanted to.

I think generating DOCs (or RTFs or something else that doesn't require much knowledge to edit) is a good idea. Some hosts like to make adjustments to questions, and it'd be good to support that (when the editors are okay with it).

Yeah, conversion between document formats seems to be mostly a solved problem. The program just needs to convert it to RTF or some other format, and then pandoc (http://johnmacfarlane.net/pandoc/) or something like it can deal with it.

There are a few areas that still need significant cleanup, particularly the tournament statistics calculations and the data validation method (I'm using voluptuous).
In regard to the first point, I completely revamped the system. Before, it was requesting data structures from the SQL database on the server and compiling them into HTML client-side; now it does everything server-side, storing HTML in the database (updated at the discretion of the tournament director), which the client requests. Obviously this is much faster for the user, but there are still a few things I want to work out, like storing the data as XML or another format besides HTML, as well as a queue system (I was thinking RQ [http://python-rq.org/]) to, for example, run statistics updates for all tournaments that need it every five minutes. When a director makes a change to a tournament or a game is uploaded, the tournament is submitted into the queue for a statistics update at the next five-minute mark. Alternatively, the TD can force a statistics update. In the future I want to try out websockets and see if they can streamline this process.
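The batching idea can be sketched without any particular queue library. This is plain Python, not RQ; the class and method names are hypothetical, and something external (a scheduler or cron job) is assumed to call `flush()` every five minutes.

```python
class StatsUpdateQueue:
    """Collect tournaments that need a statistics rebuild and run them in batches."""

    def __init__(self, rebuild):
        self._rebuild = rebuild  # callable taking a tournament id
        self._dirty = set()

    def mark_dirty(self, tournament_id):
        """Call when a director edits a tournament or a game is uploaded."""
        self._dirty.add(tournament_id)

    def force(self, tournament_id):
        """TD-requested immediate rebuild; also clears any pending update."""
        self._dirty.discard(tournament_id)
        self._rebuild(tournament_id)

    def flush(self):
        """Rebuild every pending tournament once; return how many were rebuilt."""
        pending, self._dirty = self._dirty, set()
        for tournament_id in sorted(pending):
            self._rebuild(tournament_id)
        return len(pending)
```

Because pending tournaments live in a set, ten uploads for the same tournament within one window still trigger only one rebuild.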

Another issue is compatibility with hsquizbowl.org/db. Currently I'm not able to upload my own HTML files (sample http://abacusquizbowl.com/loaded/1) to the database; it says that my file isn't valid SQBS, which of course it isn't. If hsquizbowl.org/db could open up the statistics uploads to more formats, that would be great.

Finally, there are some bugs on Firefox right now. It should work in Chrome/Safari fine, but I'll get to the Firefox bugs soon (select options aren't displaying for some reason).

So it looks to me like Monday at 9 is a pretty good time for almost everyone; I'm going to call the meeting for that hour. We'll be getting together in #quizbowl-software on Slashnet. The following things are on the agenda:

1) What technologies do we want to use? Frontend, backend, storage, etc.
2) What functionalities do we want to create?
3) Should we reuse any currently existing projects?
4) How should the work be divided?

Those are the basic large-picture bullet points that we probably want to iron out before we start working on anything, so please come prepared to talk about them.

Also, yes. User interface/experience is not my strong suit (as you can tell by the extensive bootstrap theming). I'm going to work on that soon, I have a lot of advice from the moderators at WeST (which we used Abacus for - statistics at http://abacusquizbowl.com/loaded/3 by the way, which were uploaded live during the tournament).

Hey folks: if you are a member of the quizbowl github group, you should probably join this email list so we can communicate via email. I have some things I want to share with those of you working on this project but I don't want to clutter up the board with posts or expose links to the world.