I am now testing the "wrong band reported" piece of the data scrubber. A quick check seems to indicate that it's correctly identifying all the cases reported to the forum over the last few weeks, as well as quite a few others.

Some quick statistics on the data filtering based on the last two weeks of data:

So, about 2.5% of spots are being filtered. Duplicate reports are multiple reports of the same station in the same band by the same reporter at the same timestamp, so if you are getting multiple decodes of the same station in the passband because of heterodyning in the rx or tx, this filter will keep only the strongest report.
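A minimal sketch of that duplicate rule, assuming each spot carries a timestamp, reporter, callsign, band and SNR. The `Spot` type, its field names and `keep_strongest` are hypothetical and are not the site's actual code or schema.

```python
from typing import NamedTuple


class Spot(NamedTuple):
    # Hypothetical record layout, for illustration only.
    timestamp: int  # start of the time slot, seconds since the epoch
    reporter: str   # callsign of the receiving station
    callsign: str   # callsign of the transmitting station
    band: str
    snr: int        # reported SNR in dB


def keep_strongest(spots):
    """Keep one spot per (timestamp, reporter, callsign, band): the highest SNR."""
    best = {}
    for spot in spots:
        key = (spot.timestamp, spot.reporter, spot.callsign, spot.band)
        if key not in best or spot.snr > best[key].snr:
            best[key] = spot
    return list(best.values())
```

Spots from the same station on a different band, or in a different time slot, have a different key and are left alone.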

Since the wrong band percentage is low (<1%), I'm inclined to just delete the incorrect band reports rather than try to fix them in place. The data is always there to do that later if it is somehow important.

Note for stations running simultaneous multiple tx or multiple rx in one band:

I have (finally) implemented a data scrubber for the spot database. It currently handles several situations:

1) delete bogus spots (bad callsigns)
2) handle bogus timestamps (delete if too old, or fix if it is an odd-minute report)
3) remove duplicate spots of the same station on the same band by a reporter, such as those caused by power line noise in tx or rx (it keeps the strongest report in a given time slot)
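The timestamp rule in item 2 might look something like the sketch below. Two things here are assumptions rather than the scrubber's real behaviour: the one-day `MAX_AGE` cutoff, and the choice to fix an odd-minute report by moving it back to the preceding even minute. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed cutoff; the real "too old" threshold is not stated.
MAX_AGE = timedelta(days=1)


def scrub_timestamp(reported, now):
    """Return a cleaned timestamp, or None if the spot should be deleted."""
    if now - reported > MAX_AGE:
        return None  # too old: delete the spot
    if reported.minute % 2 == 1:
        # Assumed fix: shift an odd-minute report back to the even minute.
        reported -= timedelta(minutes=1)
    return reported
```

For example, with `now` at 12:00 UTC, a report stamped 11:31 would come back as 11:30, and a report from three days earlier would come back as `None`.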

First of all, I wanted to report that the site changes made over the last month have helped stability considerably. Things seem to be running much more smoothly, resulting in far fewer load spikes (which were causing timeouts when posting spots as well as viewing pages).

I will be on holiday from Saturday the 17th through the 31st, and I will be pretty far out of touch for most of the time. I don't anticipate any problems while I'm gone.

I want to thank Trevor G0KTN and Stu K6TU (ex N6TTO) who have been helping with the site administration. Those two are the best points of contact for site issues while I'm gone. Stu has also offered to help with some future site enhancements. We've talked about a bunch of things we both would like to see done, and it will be great to have some help on the programming side.

I see Pavel, CO6WT (below), also just offered to help with some of the analytics, which is also something Stu and I have talked about. Thanks, Pavel, though I probably won't have time to work with you too much until January.

Along with the server move last night, I have begun to make some structural changes to the spot database to address some scalability issues and pave the way to support more data mining & analytics on the site. The major change is that I am in the process of separating the spot database into two tables: a realtime "live" table which will be limited to some recent history (I plan to start with 14 days), and an archive table containing the full history. Spots will go into the live table, which will be used for the standard database query page. New spots will migrate into the archive table in batches, every 30 minutes for now.
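To make the live/archive split concrete, here is a rough sketch using SQLite. The table names `live` and `archive`, the `id` and `ts` columns, and both function names are assumptions for illustration; the site's real database and schema are not shown here. It also assumes "migrate" means copying new rows into the archive and separately pruning the live table once rows pass the retention window.

```python
import sqlite3


def migrate_batch(conn, last_archived_id):
    """Copy live rows newer than last_archived_id into archive.

    Returns the new high-water mark (highest id now in archive).
    """
    with conn:  # one transaction per batch
        conn.execute(
            "INSERT INTO archive SELECT * FROM live WHERE id > ?",
            (last_archived_id,),
        )
        row = conn.execute("SELECT MAX(id) FROM archive").fetchone()
    return row[0] if row[0] is not None else last_archived_id


def prune_live(conn, cutoff_ts):
    """Drop live rows older than cutoff_ts (e.g. now minus 14 days)."""
    with conn:
        conn.execute("DELETE FROM live WHERE ts < ?", (cutoff_ts,))
```

A scheduled job could call `migrate_batch` every 30 minutes, remembering the returned id for the next run, and call `prune_live` less often.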

By separating the transactional "live" data, it will be easier to add more sophisticated search features to the archive database (date ranges, wildcard matching on calls (e.g., VK*), path analysis, extraction of data for download, etc.) without locking things up for new spots and auto-refresh pages. It also makes it easier to back up the archive without locking things up for new spots.
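As an illustration of the wildcard idea, one possible approach is to translate a pattern such as VK* into an SQL LIKE pattern. This is only a sketch: the `archive` table, its `callsign` column and both function names are hypothetical, and SQLite stands in for whatever database the site uses.

```python
import sqlite3


def wildcard_to_like(pattern):
    """Turn a '*' wildcard pattern into a LIKE pattern, escaping % and _."""
    escaped = (
        pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return escaped.replace("*", "%")


def find_calls(conn, pattern):
    """Return the distinct callsigns in archive that match the pattern."""
    rows = conn.execute(
        "SELECT DISTINCT callsign FROM archive "
        "WHERE callsign LIKE ? ESCAPE '\\' ORDER BY callsign",
        (wildcard_to_like(pattern),),
    )
    return [row[0] for row in rows]
```

Because this runs against the archive table only, a slow wildcard search would not hold up inserts into the live table.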

1) The chat page now uses user information set in fields from the user profile under "My Account" -> Edit -> WSPRnet. The old "User Info" link, which fetched and stored data from a different place, is no longer used. If you haven't done so, please fill out the WSPRnet info fields.

I'm in the process of doing some improvements to the database and associated code, and I wanted to explain several things I did today. Joe (K1JT) is in the process of developing a new version of WSPR, and some minor schema changes were required to support some of the new features (I won't give anything away at this point!). That's what I was just doing during the 30-minute work tonight.

Using the downloadable CSV data, it's pretty trivial to load the data into R for some decent analysis. I was able to load all 3+ million spots into a data frame, and I'm going to start to try to learn some things. My first pictures are trivial histograms of Received SNR, Transmitted Power, and SNR-Power (un-normalized path loss). These are across all 30m spots to date.
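For anyone who would rather not use R, the SNR-Power figure can be binned in a few lines of Python. This is a sketch only: it assumes the CSV has a header row with columns named `snr` (dB) and `power_dbm` (dBm), which may not match the actual download format, and the function name is made up.

```python
import csv
from collections import Counter


def snr_minus_power_histogram(csv_file, bin_width=5):
    """Count spots per bin of (SNR - transmitted power), in dB.

    csv_file is any open text file or iterable of CSV lines.
    Each key in the result is the lower edge of a bin.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        diff = int(row["snr"]) - int(row["power_dbm"])
        counts[(diff // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))
```

Reading row by row keeps memory use flat, so a file with several million spots should not need to fit in memory at once.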

This is a community blog, to which registered users have the ability to contribute (under "create content" -> "blog entry"). This can serve for more long-term community content than forum topics. Enjoy!

For issues with this site, email the WSPRNET Admin Team or post to the site forum. Downloads and more information about the WSPR program and the MEPT_JT mode, as well as other modes by Joe Taylor (K1JT), can be found at the WSJT Home Page.