TagLib, symlinks, and an optimized upload queue

Submitted by Nir on Tue, 01/20/2015 - 13:03

The biggest piece of news this time around is that I've managed to integrate TagLib, the super versatile audio file analysis and tagging library, into SoulseekQt. Finding TagLib was a pretty major happy accident. I was showing SoulseekQt to a co-worker, and his second question (after: can you search for FLAC files?) was, does it show FLAC file audio properties in search results? No, I said, we only really analyze MP3 files for audio properties. But that's a good idea. A-googling I went for a C/C++ library that can analyze FLAC files, and TagLib showed up almost immediately. On top of indexing FLAC file audio properties, TagLib also does that for (begin near copypasta:) MP3, MPC, MP4, ASF, AIFF, WAV, TrueAudio, WavPack, Ogg FLAC, Ogg Vorbis, Speex and Opus files. Goodbye my own personal MP3 analysis function, hello TagLib. Though audio attributes for all these file types should already show up in your own private share (I've only really tested MP3s, MP4s, FLACs and Oggs), the only audio attributes I'm currently indexing are bitrate and play length, the latter of which has been lost to us since the dawn of SoulseekQt. Whether the file is VBR or not, interestingly, is not information that's provided by TagLib, and so that no longer shows up, but I feel the benefits more than outweigh this loss. I'd be happy to add other audio attributes that you folks feel might be useful, so let me know.

Another bit of good news for those of you who prefer organizing their shared folders via symlinks: those are back in play. Scanning symlinked folders was disabled at some point because it would occasionally create infinite scan loops. I'm making a list of the real location of each scanned subfolder and checking it twice to prevent those kinds of loops, so hopefully this'll address that.

Finally, I've sort of reworked the way the transfers are processed on the upload queue. The recent "upload small files immediately" feature created a performance problem that led me to slow down the speed at which the client processes new upload requests. That was not ideal, so I moved things around and now hopefully the whole thing should be faster.

These are all very sensitive changes, so I'm expecting more problems than usual. I'll be fixing anything that comes up ASAP, so keep me posted.

1. In search results, the MP3 attribute is kbps and it's showing well. Time length is not shown for all MP3s, though.
2. For FLAC, the attribute column is empty. Why not fill it with "sample rate, length"?
3. .db is on my excluded list, but is still downloaded when a folder is selected from search results (from Browse it's blocked properly). Is this behavior correct? A .db file should at least be unchecked by default in the "download files" dialog, or not shown there at all if it is excluded from download.

Regarding 1 and 2, assuming that both relate to search results, bear in mind that this new functionality affects your own shares only right now. Once this new version is out we should hopefully see more and more extra attributes for shared files as users upgrade.

3 sounds like an unrelated but genuine bug. I'll look into it over the weekend at the latest.

Disabling Diagnostics did not fix my crashing problem with the last build (I posted a new crash log to my forum post), but I guess you can ignore that for now; so far so good on this build. Thanks for your amazing work!

That's interesting.....
Just looked at the file scan log, and noticed that it's doing a case-sensitive A-Z-a-z scan for the FLAC and APE files? The folder it had crashed on before is about 2/3rds of the way down the list. Goes from 'Yes' to 'arvo part' not long after, then finishes OK.

I think all of this TagLib is unnecessary. Things were fine so far, that's my opinion. It might even slow down slsk. But since it's already in the process, here are my suggestions: MP3 should have bitrate, and if it's VBR it should simply show V0, V1, V2 instead of the 140-217kbps or the like. I can't explain how awesome that would be. Lossless files like FLAC/WAV should only show bit depth and frequency in 24/96 form. I believe that's it. Why would I wanna see some unimportant intel? Length is really not that important; I mean, I would sooner make a size filter than a length filter.

+1 on showing LAME encoding settings (V0, V1, etc.) if that info is in files with LAME tags, but -1 on trying to infer such things from the bitrate of any other file. When encoding VBR, the bitrate varies greatly depending on how tonally and spatially complex the sound is. Quiet, mono, and tonally simple music can be encoded with the highest quality setting and still come out with a very low bitrate because that's all it took to achieve the target quality (by whatever objective measure the encoder uses for "quality"). Aside from the encoder-specific settings stored in LAME tags, the INFO or XING header on VBR files can have a "quality" value from 0 to 100, which may or may not be filled in by the encoder with anything it wants, and there's no standard so you can't compare the numbers produced by different encoders. The VBR header can often be found on CBR files as well, but CBR is normally encoded without regard to quality, so whatever "quality" number is written there is useless.

Bit depth and sample rate would be nice for lossless, I agree. Display format is not too important, although I agree if both are together, bit depth should come first.

Length is important for me. I appreciate it in search results because the file names often don't convey accurate (or any) info about song versions, especially for edits and compilation appearances.
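
For illustration, the display formats discussed in this thread (bit depth before sample rate for lossless files, plus play length) could be produced along these lines. The helper names formatLossless and formatLength are hypothetical, and nothing here reflects how SoulseekQt actually renders its attribute column.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: bit depth first, then sample rate in kHz, e.g. "24/96".
std::string formatLossless(int bitsPerSample, int sampleRateHz)
{
    std::ostringstream out;
    // Default stream formatting drops trailing zeros: 96000 -> 96, 44100 -> 44.1.
    out << bitsPerSample << '/' << (sampleRateHz / 1000.0);
    return out.str();
}

// Hypothetical helper: play length as minutes:seconds, e.g. 245 -> "4:05".
std::string formatLength(int totalSeconds)
{
    std::ostringstream out;
    out << (totalSeconds / 60) << ':'
        << std::setw(2) << std::setfill('0') << (totalSeconds % 60);
    return out.str();
}
```

A CD-quality FLAC would then show as 16/44.1, and a high-resolution one as 24/96.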

I also use TagLib in my application and I extended it to give me the VBR info. And yes, I'd like bitrate and song length to be as accurate as possible. If I'm looking for a longer version of a song, the bitrate itself doesn't help me much.
Thanks for your work.

The XING header is the main info header for VBR files, no matter what encoder was used. LAME also creates a XING header when encoding VBR files. The only other VBR header that was used, but has all but disappeared by now, is VBRI. For CBR files LAME creates an INFO header that is almost like XING but with very few changes.
However, TagLib already knows how to scan these headers. The only missing piece was that it should store the fact that the bitrate is VBR. That's what I added.
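
As a rough sketch of the distinction described above (not the commenter's actual TagLib patch), the three headers can be told apart by their 4-byte tags "Xing", "Info" and "VBRI". The names HeaderKind, classifyHeader and indicatesVbr are hypothetical, and the sketch assumes the caller has already located the header bytes inside the first MPEG frame, which is not shown.

```cpp
#include <string_view>

// Hypothetical classification of the info header found in the first frame.
enum class HeaderKind { None, Xing, Info, Vbri };

// `data` is assumed to start exactly at the header's 4-byte tag.
HeaderKind classifyHeader(std::string_view data)
{
    if (data.size() < 4) return HeaderKind::None;
    const std::string_view tag = data.substr(0, 4);
    if (tag == "Xing") return HeaderKind::Xing;  // main VBR info header
    if (tag == "Info") return HeaderKind::Info;  // written by LAME for CBR files
    if (tag == "VBRI") return HeaderKind::Vbri;  // older, now rare VBR header
    return HeaderKind::None;
}

// Treats Xing and VBRI as VBR markers, following the description above.
bool indicatesVbr(HeaderKind kind)
{
    return kind == HeaderKind::Xing || kind == HeaderKind::Vbri;
}
```

As noted earlier in the thread, a VBR header can sometimes be found on CBR files too, so a result like this is a hint rather than proof.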

This is weird, I modified XingHeader to extract quality information if it's present, but I seem to always be getting quality 0. I only tried with a few hundred MP3s though. This is the modified version of XingHeader::parse:

Scratch that, looks like TagLib only reads the first 16 bytes of the header, so it was returning 0 as a way of refusing to read outside the byte buffer. I changed it to read 120 bytes and now I'm getting quality info!
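
The 120-byte figure fits the commonly documented Xing layout: a 4-byte tag, 4 bytes of big-endian flags, then optional fields for frame count (4 bytes), byte count (4 bytes), a 100-byte seek table, and finally the 4-byte quality value. With every field present, quality occupies bytes 116 to 119. The following is a standalone sketch of that layout, not the modified XingHeader::parse mentioned above; xingQuality is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical standalone reader: returns the quality field if the header
// declares one and the buffer is long enough to contain it.
std::optional<std::uint32_t> xingQuality(const std::vector<std::uint8_t>& header)
{
    if (header.size() < 8) return std::nullopt;
    if (std::memcmp(header.data(), "Xing", 4) != 0 &&
        std::memcmp(header.data(), "Info", 4) != 0)
        return std::nullopt;

    // Reads a 32-bit big-endian integer starting at byte `off`.
    auto readBE32 = [&header](std::size_t off) {
        return (std::uint32_t(header[off]) << 24) |
               (std::uint32_t(header[off + 1]) << 16) |
               (std::uint32_t(header[off + 2]) << 8) |
               std::uint32_t(header[off + 3]);
    };

    const std::uint32_t flags = readBE32(4);
    if (!(flags & 0x8)) return std::nullopt;   // no quality field declared

    std::size_t offset = 8;                    // first optional field
    if (flags & 0x1) offset += 4;              // frame count
    if (flags & 0x2) offset += 4;              // byte count
    if (flags & 0x4) offset += 100;            // seek table (TOC)

    // Refuse to read past the end of the buffer, e.g. when only 16 bytes were read.
    if (header.size() < offset + 4) return std::nullopt;
    return readBE32(offset);
}
```

Given only the first 16 bytes of a header with all four flags set, this returns no value, which mirrors the behaviour described above.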

Are the shares rescanned on startup?
Assume they are, but that would not stop another user's list being invalid. I spotted a folder that had been moved weeks ago, showing up on the 'not found' diagnostics tab, so I did a rescan and around 40 files were listed as removed.
Now, I restarted the debug version a day or two back after 1.21 had crashed overnight, but I don't remember doing any housekeeping of that volume during that time.

It did make me notice two things. The file rescan window holds its z-index too high: even if another window is brought to the front during this, the rescan sits on top of it. The width also seems to hold the longest folder/file name size, even though that's a bit pointless.

Thanks! Yeah, there doesn't seem to be anything there other than the thread messages. May or may not have something to do with the crashes... these are probably file transfer threads. I've been working the last couple of weekends to transition SoulseekQt to non-threaded file transfers, so maybe that will make the client more stable. I'll post a link to a new build as soon as it's ready.

For symlinks and shortcuts, I'm curious to know what the indexing logic is. There are a lot of decisions to make about what to follow and what to ignore. Avoiding loops is only part of what needs to be done.

What if the target has NTFS 'system' or 'hidden' flags set or is actually missing or unreadable? Ignore the link, I hope.

What if the target of a public link is userlist-only, or not shared at all? The target better not be visible or accessible to anyone I haven't allowed it to be shared with.

Even loop avoidance causes problems. I assume you are avoiding indexing a folder if it was already indexed. What's the point of following folder links at all, then? And unless you delay symlink-following until after all the real folders have been scanned, you could end up with a situation where whichever path happened to be indexed first is the one that "wins". This can invalidate people's queued downloads if they wanted a folder that's no longer indexed by the time their turn comes up.

Windows shortcut filenames end in .lnk. If you transfer the target file but give it the link's name without stripping .lnk extension, the downloader's system will think it's a shortcut file. I saw this happening when shortcut following was enabled. You need to think about how to index and transfer .lnk files if you are following them like symlinks.
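
A minimal sketch of the name fix suggested here, assuming the shared name is derived from the shortcut's own file name; stripShortcutExtension is a hypothetical helper, not part of SoulseekQt or Qt.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: drops a trailing ".lnk" (any letter case) so the
// downloader does not receive a file that looks like a Windows shortcut.
std::string stripShortcutExtension(const std::string& name)
{
    const std::string ext = ".lnk";
    if (name.size() <= ext.size()) return name;  // nothing would remain
    const std::size_t start = name.size() - ext.size();
    for (std::size_t i = 0; i < ext.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[start + i]);
        if (std::tolower(c) != ext[i]) return name;  // not a .lnk name
    }
    return name.substr(0, start);
}
```

Names without the extension pass through unchanged, so the helper can be applied to every indexed entry.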

If nothing else, I suggest making a toggle for disabling the following of symlinks entirely, so that if there are problems or unexpected complications, we can continue to use Soulseek as we have been. I do want to help make symlinks work, but if they're causing problems, I need to be able to shut 'em off.

Qt makes working with shortcuts on Windows a lot more difficult than symlinks on Linux/OSX (you basically can't use a shortcut directly as a folder), so for the time being indexing shortcuts is turned off in the latest nightly builds. You make a lot of good points, but ultimately I'd like to offer the user one simple scheme or another: either index all symlinks or index no symlinks, configurable as an option. Hiding folders with the hidden or system attributes set is definitely a good idea. If I look into re-enabling shortcut indexing, I'll look into doing that as well.

Just can't test it out yet because you are only providing your nightlies for Windows and Mac ;-)
But take your time: we 'Buntu users are very used to waiting (while others are messing around with gedit 3.14, aptitude will still install v3.10 from years ago ;)