Topic: swill's pm search tool

Current Status
--------------
I have tested the tool on all the major OS platforms (Mac, Windows and Ubuntu) and the attached binaries work on all of them. I recommend you run the program from the command line the first time with the '-user' argument set. After that you can just double-click the executable to start the program on Mac and Windows. On Ubuntu I had to keep using the command line because double-clicking the executable did not create a new terminal instance for me. If you set the 'user' and 'pass' in the config file on Linux, double-clicking should start the application in the background.

You need to have the following checkbox checked in your GeekHack PM Settings:

You can use any of these settings for display, but I HIGHLY recommend using the 'All at once' option on the initial import to make the import faster, then you can change it back to whatever you want.

Let me know if you have questions or problems. This is an Alpha release, so be gentle...

Refer to the README section at the end of this post for a step-by-step guide for working with pm_search...

I have been developing a tool for BunnyLake to help him manage the crazy amount of PMs he gets. I have been developing it specifically for his use case to this point. I am close to delivering my first release to him, so I wanted to gauge other people's interest in this functionality.

Here is what the UI looks like:

Here are the search options:

Here is how it works:

The tool is distributed as a single binary with no local requirements. It will work on Mac, Windows (to be tested) and Linux (to be tested). When you run the tool the first time it will launch a web crawler which will crawl your GH PMs and download and index them locally. It will start a webserver on your computer and will open a browser window pointing to the local webserver. You will then be able to do searches. Every time you reload the page, the crawler will check whether there are any new PMs since your last refresh and will download and index them if there are.
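The refresh step described above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: it assumes PM ids grow over time and that a "last indexed PM" marker (like the `stored_pm` value shown in the tool's help output) is kept between runs. `select_new_pms`, `refresh_index` and the `index_pm` callback are hypothetical names.

```python
def select_new_pms(inbox_ids, stored_pm):
    """Return the PM ids newer than the last indexed one, oldest first.

    Assumption: PM ids increase over time, so "newer" means "larger id".
    """
    return sorted(pm_id for pm_id in inbox_ids if pm_id > stored_pm)


def refresh_index(inbox_ids, stored_pm, index_pm):
    """Index every new PM and return the updated marker.

    `index_pm` is a hypothetical callback that downloads and indexes one PM.
    """
    for pm_id in select_new_pms(inbox_ids, stored_pm):
        index_pm(pm_id)
        stored_pm = pm_id  # advance the marker only after the PM is indexed
    return stored_pm
```

Passing a marker of 0 makes every PM count as new, which mirrors the "to re-index, set to 0" behaviour listed in the help text.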

That's pretty much it. I will probably keep the tool in private beta for the first few weeks at least to validate its usability and such. After that I will likely open source the tool and make it more widely available if there is an interested audience. I would also like to verify with the Mods that this type of tool is not against some kind of terms of use or something which would limit my ability to open source it.

Setup on GeekHack
You need to have the following checkbox checked in your GeekHack Personal Message Settings in order for the tool to work (for now):

You can use any of these settings for display, but I HIGHLY recommend using the 'All at once' option on the initial import to make the import faster, then you can change it back to whatever you want.

Initial Run
You will need to specify a 'user' the first time you run the application. To do this, you will need to open a terminal and pass in the command line argument '-user=your_username' when you execute the binary.

Windows
- Start > run > cmd
- Find the executable in the explorer and drag it into the open terminal. This will copy the file path into the terminal.
- Add '-user=your_username' to that line and hit Enter.
- For a full list of commands use the '-h' argument.
- When finished, close the terminal and the application will be stopped.

Mac
- Applications > Utilities > Terminal
- Find the executable in Finder and drag it into the open terminal. This will copy the file path into the terminal.
- NOTE: If the above did not put the full file path in the terminal, you will likely have to add executable rights to the binary:
  $ chmod +x pm_search_darwin_amd64
- Add '-user=your_username' to that line and hit Enter.
- For a full list of commands use the '-h' argument.
- When finished, close the terminal and the application will be stopped (after you verify that you want to close).

Once entered, the 'user' will be stored in a config file in the folder '.pm_search' in your HOME directory.

For convenience, you can enter password in the format 'pass = yourpass' on a new line in the '.pm_search/pm_search.conf' file. The clear text password will be removed once you run the application and a new encrypted 'pass_hash' will be added to the config file.
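For anyone curious what reading such a file involves, here is a minimal sketch of a parser for 'key = value' lines. It only assumes the line format quoted above; `parse_config` and `render_config` are hypothetical helper names, and the tool itself may read its config differently.

```python
def parse_config(text):
    """Parse 'key = value' lines into a dict, skipping anything else."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # ignore blank or malformed lines
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def render_config(settings):
    """Write the settings back out, one 'key = value' pair per line."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())
```

An empty value such as 'pass = ' parses to an empty string, which matches how the cleared password shows up in the config after a run.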

The tool has built-in command line help. This also tells you what you can put in the config file (the same option names, without the leading '-').
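A rough sketch of how flags and config keys can share names, using Python's argparse. The option names and defaults are copied from the tool's help output quoted later in the thread; treating config values as the defaults that command line flags override is my assumption, and `parse_settings` is a hypothetical name.

```python
import argparse


def parse_settings(argv, config=None):
    """Parse Go-style '-name=value' flags, using config values as defaults.

    Assumption: a flag given on the command line overrides the config file.
    """
    config = config or {}
    parser = argparse.ArgumentParser(prog="pm_search")
    parser.add_argument("-page_size", type=int, default=config.get("page_size", 50),
                        help="Number of PMs to show on each page")
    parser.add_argument("-pass", default=config.get("pass", ""),
                        help="Your GH password")
    parser.add_argument("-pass_hash", default=config.get("pass_hash", ""),
                        help="Dynamic: Do not modify this...")
    parser.add_argument("-port", type=int, default=config.get("port", 8888),
                        help="The port the pm_search app should listen on")
    parser.add_argument("-stored_pm", type=int, default=config.get("stored_pm", 0),
                        help="Dynamic: Last indexed PM. To re-index, set to: 0")
    parser.add_argument("-user", default=config.get("user", ""),
                        help="Your GH username")
    # Return a plain dict because 'pass' is a Python keyword and cannot be
    # read as an attribute (args.pass is a syntax error).
    return vars(parser.parse_args(argv))
```

For example, `parse_settings(["-user=swill"], {"port": "9000"})` yields the user from the command line and the port from the config, converted to an integer.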

Since this app actually functions as a web server locally, you can just start it and leave it running and then bookmark the URL. This makes it much more convenient to use since you don't have to go find it, launch it and then search, you just open a new tab and click your bookmark.
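To illustrate the "local web server plus bookmark" idea, here is a minimal stand-in written with Python's standard library. It is not the tool's code: the page content is a placeholder, and `start_server`, `local_url` and `open_in_browser` are hypothetical names. The default port 8888 comes from the tool's help output.

```python
import http.server
import threading
import webbrowser

PAGE = b"<html><body><h1>pm_search (placeholder page)</h1></body></html>"


class PlaceholderHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the same placeholder page for every path.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_server(port=8888):
    """Start a server bound to localhost only, in a background thread."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), PlaceholderHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def local_url(server):
    """Return the URL you would bookmark for this server."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/"


def open_in_browser(server):
    """Open the default browser on the local page."""
    return webbrowser.open(local_url(server))
```

Binding to 127.0.0.1 keeps the page reachable only from the same machine, which seems like the sensible default for something that serves your PMs.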

On a Mac, the following is done in the terminal (Applications > Utilities > Terminal):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>pm_search</string>
    <key>Program</key>
    <string>/full/path/to/bin/pm_search</string>
    <key>RunAtLoad</key>
    <true/>
  </dict>
</plist>

Now every time you restart your computer, the pm_search app will start. The only thing that may not be desired is a browser window with pm_search will open on boot. I may add a config file option which would allow you to specify 'daemon = true' which would suppress this behavior, but we will see if other people actually use this first...
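If you would rather not hand-edit XML, the same property list can be generated with Python's standard plistlib module. This is just a sketch: `build_launch_agent` is a hypothetical helper, and the path is the same placeholder used above. The post does not say where the file was saved; launchd conventionally reads per-user jobs from ~/Library/LaunchAgents, so treat that location as an assumption.

```python
import plistlib


def build_launch_agent(program_path, label="pm_search"):
    """Return the XML bytes of a launchd job that starts the program at load."""
    job = {
        "Label": label,
        "Program": program_path,
        "RunAtLoad": True,
    }
    return plistlib.dumps(job)  # XML format is the default


plist_bytes = build_launch_agent("/full/path/to/bin/pm_search")
```

Writing `plist_bytes` to a file gives you the same three keys as the hand-written version.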

On Windows, you are on your own. I don't use Windows and have no interest in digging in ****... (I don't live on a farm anymore)

On Linux, you can easily make the app start at boot by doing the following:

$ sudo vim /etc/rc.local
# add the following line before the `exit 0` line
cd /full/path/to/the/bin/folder && ./pm_search &

What a great idea. I've been contemplating how I can better organize Tactile subscriptions since people just PM me to get them. Having this tool would be really helpful and I wouldn't have to go to a Google Docs Form or anything.

For something like your Tactile thing, you can make a label called Tactile Subscriptions then make a rule that if the subject contains Tactile Subscription it applies that label. Then tell anyone who wants a subscription to be sure and have Tactile Subscription in the subject line when they pm you.

Hmmm, I should look at how labels work in the PMs, that could potentially give you a LOT of control over the searching.

I get the twitches from passwords in the clear...or does it encrypt it in the config file? If not I'd be okay with running it on demand and supplying the info at the time. Admittedly, I'm more paranoid than most.

I don't even get that many PMs compared to many people around here and I already get lost trying to find something. Having them crawled and indexed is a really good idea.

The security professional twitchy about plaintext passwords? Ya don't say!

Wish there was a reliable cross-platform way to have per-user secrets, but without that it would be good to have a way at runtime to supply credential information.

This is something I am also concerned about, but for the first pass I didn't try to solve that problem. On Mac and Linux I put it in a hidden folder on your system. On Windows I plan to make it a hidden system file so you have to really be looking for it to find it (still need to do this). Unfortunately I can't one-way hash it because I need to be able to decrypt it in order to authenticate on GH. I am considering encrypting it, but if I open source the tool, anyone who actually gets onto your system will know the encryption algorithm from the source code, and if they have gotten that far they can easily write a tool to decrypt it.

Right now if you use the command line arguments the first time, it will actually save the credentials into the config file, so that does not really solve the problem in your case. What I will probably do is create encryption and then only save the encrypted pass to the config file and make sure that if the clear text pass is ever configured in the config file or via the command line it will not persist in the config file. This way if you change your password on GH, you can just pass in the 'pass' argument or put it in the config file and it will re-encrypt your password and then remove it from the config. Would this help you rest easy???

You can supply credentials at runtime. Unfortunately, in trying to make the tool 'convenient', I am saving those credentials into the config file. I could change that functionality though and not save credentials if passed in the command line...

There is a tool I could bundle in which would give pretty good security, but it is overkill and would add a lot of bloat to the install: https://www.vaultproject.io/

Edit: I can make it so you can enter your password via the command line and it won't be saved in bash history or in the config file with something like this (https://github.com/bgentry/speakeasy), maybe that is the best short term solution for offering security with minimal effort...

Totally understand where you are coming from. As people often say, if someone has access to the system, they are going to get the data they want pretty much no matter what. In this case probably from the browser long before a hidden config file, even if that config file is plain text. Your plan to encrypt the config and/or not store it in the config file sounds like it will work fine to me. I've seen much, much worse implemented at very large corporations!

For concerns about decryption, whether the tool is open source or not, adding a salt to the hash should solve that.

And I realize the irony of discussing encrypting the passwords when probably less than 6 months ago GH finally implemented HTTPS. Everyone's password was free and easy pickings if you snooped the wire!

Allow for both the user and pass to be either entered via command line arguments or through the config file. If you enter both the user and pass in either location, it will assume you want the password saved because you entered it in clear text. It will encrypt the pass and save it into a new field in the config file and set the pass to "" in the config file.

If you enter only the user via the command line, the software will assume you want the utmost security and will prompt for your password at application start without displaying the password at any time. In this case the software will not save the password in any way (even encrypted), so you will need to pass the user every time you launch the application and enter your pass at the command prompt.

If you are not super paranoid, you can enter the pass in clear text via the command line and it will be saved in an encrypted format for you. You can then clear your bash_history to remove any traces of the clear text pass and use the application without having to enter a pass again. If you are super paranoid, then you can just enter the user and be prompted for the hidden pass every time you use the application.
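The decision rules above can be sketched like this. It is an illustration of the described behaviour only: the encryption itself is left out because the thread does not say how it is done, and `resolve_credentials` plus the action names are hypothetical. The prompt uses Python's getpass, which does not echo what you type.

```python
import getpass


def resolve_credentials(cli_user="", cli_pass="", config=None, prompt=getpass.getpass):
    """Return (user, password, action) following the rules described above.

    action is one of:
      "save_encrypted": a clear-text pass was supplied, so encrypt and store it
      "use_stored":     no clear-text pass, but a stored 'pass_hash' exists
      "prompt_only":    ask at startup and never persist the password
    """
    config = config or {}
    user = cli_user or config.get("user", "")
    clear_pass = cli_pass or config.get("pass", "")
    if clear_pass:
        return user, clear_pass, "save_encrypted"
    if config.get("pass_hash"):
        # The caller would decrypt 'pass_hash'; that step is not shown here.
        return user, None, "use_stored"
    return user, prompt("Enter your GH password: "), "prompt_only"
```

The `prompt` parameter exists only so the function can be exercised without a terminal, for example by passing `lambda message: "typed"`.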

  -page_size=50: Number of PMs to show on each page
  -pass="": Your GH password
  -pass_hash="": Dynamic: Do not modify this...
  -port=8888: The port the pm_search app should listen on
  -stored_pm=0: Dynamic: Last indexed PM. To re-index, set to: 0
  -user="": Your GH username

$ ./bin/pm_search_darwin_amd64 -user=swill
Enter your GH password: ^C
$ # the user will be saved to the config file now, so I don't have to pass it again
$ ./bin/pm_search_darwin_amd64
Enter your GH password: ^C
$ # here I manually modify the config file to add my password in clear text
$ # passing the '-pass' arg would do the same thing, but I am showing output, so no
$ vim ~/.pm_search/pm_search.conf
$ ./bin/pm_search_darwin_amd64
^C
$ # not prompted anymore because the password config is pulled from the config file
$ cat ~/.pm_search/pm_search.conf
pass_hash = a687bf46d4bd2a2b07c3becauseiamnotanidiotd696269af4b7b3e1
stored_pm = 1195354
user = swill
pass =
$ # the clear text password has been removed from the config and replaced with a hashed pass

I think this implementation works well. Thanks for the feedback guys... Any final feedback on the security concerns?

Right now the implementation REQUIRES that you have the following PM setting set.

I will look into supporting this being unchecked, but for now, just check this box...

It supports all of the following display types.

I would recommend you use the "All at once" option when you do the initial import though, because it is about 4x faster than the next closest method. You can change the display settings back to your normal preference after the import; the performance difference is not noticeable when getting new PMs.

To everyone who is interested: I have updated the OP with the binaries for an Alpha version of the tool. I have tested on all major platforms and it is working. I have attempted to write a relatively extensive README to go with the binary, so please just ask in this thread if anything is unclear. If you get ANY errors, please bring them to my attention. The application log can be found at HOME/.pm_search/pm_search.log.

I set up the local webserver but my search queries don't give that many results (definitely fewer than there should be)

Did you wait for the "Getting PMs... This takes a few minutes..." message to go away?

Edit: BTW, if you interrupt execution during the initial download and index of the PMs, you will likely have to run the tool with the -stored_pm=0 argument in order to re-index everything correctly, as the interruption probably leaves the index in a broken state.

Gave it another test on my phone (Sailfish Linux) while waiting for an egg timer at work, worked perfectly and was just as quick as my netbook!

This is an alpha so there must be some bugs - I'm going to keep looking

Really? This worked on your phone? I didn't do anything to make it support mobile. I guess being Linux is the reason it worked. It won't work on any other mobiles because I don't think I have a binary that could run on them.

Um...Please bear with me. I'm really bad at this kinda stuff. I want to try the Alpha but I'm honestly not sure where to start.

Follow the instructions in the README in the OP. You will download the binary which best suits your operating system from the OP and follow the instructions. The instructions may not give enough detail, so if you have any questions at all, please just ask and I will do my best to guide you.

Glad to see people testing it though.

Yup! I've uploaded a video to youtube but it's taking too long and it's nearly 2am...

Out of interest, what is the ARM binary supposed to be run on, a Raspberry Pi?

Cool.

The ARM build is just because it is easy for me to cross-compile for it as well, so I just did all the easy chipset and OS combos.

Been messing with this and it's been amazing so far. I'll keep messing with it to see if I can find any issues. Thanks for making this swill!!

Awesome! Glad to hear people are appreciating the functionality. I really just started writing it because I wanted to try some stuff and a comment bunny made in conversation caught my attention. Once I validated the idea I pinged bunny to see if he would be interested in it if I built it, and he seemed stoked by the prospect.

I like building tools that either solve a problem use case or enable people to build cool ****. A professional enabler...

Is it possible to sort the search results after they come up? Like how you sort files in Windows by date or file size? Or is that a function of the search query?

The search determines the order based on the relevance of the terms you specify. Check the help ( ? ) to see all the options. Boosting is probably what you are looking for to give relative importance. Also, inclusion and exclusion is very powerful to reduce search results especially when combined with fields.
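As a rough illustration of how inclusion, exclusion, boosting and fields combine, here is a small query builder. It assumes a Lucene-style query string ('+term', '-term', 'field:term', 'term^2'); that syntax and the 'subject' field name are my assumptions, so check the tool's help ( ? ) for what it actually accepts. `build_query` is a hypothetical helper.

```python
def build_query(must=(), must_not=(), boosts=None, field=None):
    """Compose a query string from required, excluded and boosted terms.

    Assumed syntax (not confirmed for this tool):
      +term    the term must appear
      -term    the term must not appear
      term^N   the term counts N times as much for relevance
      field:   restricts the term to one field
    """
    def scoped(term):
        return f"{field}:{term}" if field else term

    parts = [f"+{scoped(term)}" for term in must]
    parts += [f"-{scoped(term)}" for term in must_not]
    parts += [f"{scoped(term)}^{weight}" for term, weight in (boosts or {}).items()]
    return " ".join(parts)
```

Since results are ordered by relevance, boosting a term is the closest thing to "sorting" under these assumptions: it pushes PMs containing that term toward the top.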

I will try to check out labels and see if I can add that as a field as well.