Meta

Update to the script I use for presence detection. Previously it took arguments for -s and -b (switch and bluetooth) and only checked 1 device, which required multiple cron entries, 1 for each device. That could lead to polling issues as more devices are added. This new script ‘should’ only poll once for all devices. There’s a pause while it searches for the first device, but consecutive devices fly through, so I think it’s working.

Had to use – as the separator. A : in the MAC address would be confusing, ; conflicts with bash, and a space would need accounting for in the argument parser, so – was just easier.
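As an illustration only (this is not the actual script; the function names and the hcitool lookup are my assumptions), the single-poll loop could look something like this:

```shell
#!/bin/bash
# Hypothetical sketch: all MAC addresses arrive as one argument, joined
# with "-" (":" already appears inside a MAC and ";" is special to bash).

# Split the single argument into one MAC per line.
split_devices() {
  printf '%s' "$1" | tr '-' '\n'
}

# One bluetooth lookup per device per poll.
check_all() {
  split_devices "$1" | while read -r mac; do
    if command -v hcitool >/dev/null 2>&1; then
      # hcitool name only returns the device name when it is in range
      name=$(hcitool name "$mac")
      [ -n "$name" ] && echo "$mac present" || echo "$mac away"
    else
      echo "$mac skipped (no hcitool)"
    fi
  done
}

check_all "$1"
```

Run from a single cron entry as ./presence.sh AA:BB:CC:11:22:33-DD:EE:FF:44:55:66 and every device is checked in one pass.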

I’ve been using a brilliant script by Dan Fountain for a few years as part of a WooCommerce barcode scanner Python program I made. It lets us update and process orders without needing to work in the admin interface 90% of the time.
For this we use a Raspberry Pi and a handheld portable barcode scanner. One of the big things needed was feedback from the Pi as to what it’s done or what it’s doing. I’ve attached an LCD screen and most recently added a whole web interface output (mainly for diagnostics), but when you’re scanning a bunch of orders, especially in bulk, you don’t want to be squinting at a very small LCD screen about 4m away. So I added speakers and TTS.

There are a few different things the barcode program can do (get order status, add a tracking code to the order), but the most important is updating the status of an order.
We start processing orders first thing in the morning by scanning a ‘bulk’ QR code, then scanning each order that’s on the printer. Once they’ve all been scanned, we scan another QR code, ‘Order Printed’. Quite simply this updates each order’s status from ‘processing’ to ‘printed’, which is important in case Google Cloud Print fails to print an order (it does from time to time); anything left in ‘processing’ needs checking.

Anyway, that’s not the important bit for this post! The important bit is the Python program giving audio feedback. While we could have gone for a TTS engine local to the Pi, the Google Translate option gives a far better sounding voice. The above scenario goes like this:
We scan Bulk mode
Pi says ‘Bulk Capture Active’
We scan the first order
Pi says ‘One’
We scan the second order
Pi says ‘Two’
and on
and on
Once all the orders are scanned, we then scan ‘Printed’
Pi says ‘Bulk Capture Finished. Processing x’, where x is the number of orders.
Pi says ‘One’
Pi says ‘Two’
etc. etc.
Then the Pi finally says ‘Finished Bulk Processing’.

Now there’s certainly the ability to pass all the orders in one go via the WooCommerce API, but we handle them as individual requests within Python so that we can do some order status checking before changing each one’s status. i.e. if an order has already been moved from ‘processing’ to, say, ‘cancelled’, we don’t want to move it again to ‘printed’; at that point the Pi would say ‘Error Processing Order xxxxxx’, where xxxxxx is the order number.

As you can see from the above flows, the actual text being read ends up being the same text over and over. The number ‘One’ can be read about 10 times as we bulk move things around queues. While it’s certainly possible to just fire the same thing at Google Translate over and over, it’s far nicer to play friendly and cache anything we can use again and again.

The code below is based on the awesome work of Dan Fountain, with the following updates:
Added Caching
Added MPG123 options (to speed up the play back a little)
Added a client to the wget request (without it, Google started blocking heavy requests when the cache is clear).
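Roughly, the shape of it is this (the cache path, the md5 naming and the exact translate_tts parameters are illustrative assumptions on my part, not the original script):

```shell
#!/bin/bash
# Cached TTS sketch in the spirit of Dan Fountain's speech script.
CACHE_DIR="${CACHE_DIR:-/tmp/tts-cache}"

# Deterministic cache filename for a phrase, so 'One' is only ever
# fetched from Google once however many times it is spoken.
cache_file() {
  printf '%s/%s.mp3' "$CACHE_DIR" "$(printf '%s' "$1" | md5sum | cut -d' ' -f1)"
}

say() {
  mkdir -p "$CACHE_DIR"
  file=$(cache_file "$1")
  if [ ! -f "$file" ]; then
    # -U sends a browser-style User-Agent; without a client set, Google
    # started blocking bursts of requests when the cache was cold.
    wget -q -U "Mozilla/5.0" -O "$file" \
      "https://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&tl=en&q=$(printf '%s' "$1" | sed 's/ /%20/g')"
  fi
  # mpg123 plays the cached mp3; extra mpg123 options can speed playback.
  mpg123 -q "$file"
}
```

say "Bulk Capture Active" downloads once; every repeat of the same phrase plays straight from the cache.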

We all (mostly) understand updates are important, and I’m sure there were only good intentions in setting it, but the ‘Connect your store to WooCommerce.com to receive extensions updates and support.’ nag notice is ridiculous. A non-dismissable notice should never be allowed. I get you don’t want people to just quickly click the dismiss button, so why not put an option at the bottom of the connect page: ‘Dismiss this alert. I understand what this means’. Even if it only dismissed for say 3 months and then you had to do it again, it would still be annoying but at least easier to deal with.

But no, in someone’s infinite wisdom they’ve decided you absolutely must connect your store and have no other option. Well, here’s the code to add to stop that nag notice.

** It is your own responsibility to keep your site up to date.
** Disabling this notice may disable other WooCommerce notices.

There are of course legitimate reasons why you wouldn’t want to connect your store; managing your updates your own way should always be allowed. So, devs, stop trying to dictate how you want/think things should run: choice is the key.

I’ve so far fixed some of the issues, such as rearranging the columns (why the actual fu*k status was moved I’ll never know or understand). The below code is probably not the best way to do it, but it works.

So that’s one issue solved. The next was that clicking anywhere in the row opens the order (yeah, I’m sure that’s nice, but if you rely on tapping a touchscreen, i.e. click and drag to scroll, then this causes problems). The following code adds the no-link class to the tr and stops this shitty behaviour.

This places the orders back at the top of the row, and stops the previously restored ‘items sold’ link jumping around. But I’d rather the new status text stayed in the middle, inline with the checkbox; hopefully we’ll get this back to an icon soon.

All of the above I’ve added to our custom plugin; you could do the same, or add them to your functions.php.

Outstanding issues:
1. Getting back the email address. There is some hope this may come back officially, but I’ll be fixing it for us tomorrow.
2. The status being text, not icons. I understand this makes more sense to new users, but if you have some long statuses like we do, the text doesn’t fit and we’ve got 10 statuses all looking the same. Having coloured icons worked for us, and if you weren’t sure, you could hover the mouse. I’ll be looking to get them back to icons tomorrow.
3. The date column, just why! Why would anyone think not putting the actual date and time of the order here is a good idea? The stupid ‘X mins ago’ is no use at all.

The new preview window looks good, but I really don’t see it getting much use; we need the important data up front. If it’s not that important, just open the order. The WooCommerce devs decided to screw with it, but I don’t think there’s an understanding that if you’re going as far as opening the preview window, then I’m pretty sure you were used to just editing the order, which is probably still going to be the case.

So summing up today: I’ve had a shit day of people moaning at me because some developers decided to improve something that really didn’t need touching. Doesn’t sound like every developer I’ve ever known 😂. I’m now getting something to eat before I go near everything I’d planned on working on today.

Spent far too long trying out different ways to get this to work. I need to set up a server listening on the local IP address to restrict things like phpldapadmin to internal requests, but I hit problems with nginx appending the location to the root path, and PHP having no idea where to get the files from.
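The usual culprit with this symptom is root vs alias: root appends the location onto the path, while alias maps the location directly. A sketch of the direction a fix takes; every path, IP and socket name here is an assumption for illustration:

```nginx
server {
    # bind to the internal IP only, so external requests never reach it
    listen 10.0.0.5:80;

    location /phpldapadmin {
        # with "root", nginx would append /phpldapadmin to the path again;
        # "alias" maps the location straight onto this directory
        alias /usr/share/phpldapadmin/htdocs;
        index index.php;

        location ~ \.php$ {
            include fastcgi_params;
            # $request_filename respects the alias;
            # $document_root$fastcgi_script_name would not
            fastcgi_param SCRIPT_FILENAME $request_filename;
            fastcgi_pass unix:/var/run/php5-fpm.sock;
        }
    }
}
```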

I spent a good few hours searching Google and trying to work out what had been installed that could be causing an issue with libc6. Though I was fairly certain nothing had been (only because this was 2 days after a migration to new servers, and I was hitting the problem on 2 of 4 servers with extremely similar setups).

Something was eating all the space. So I tried to run ncdu to look for large files (I know there are other ways, but I like ncdu), but I hadn’t installed it on this new server and couldn’t install it with apt-get broken.

Thinking /run was bound to be causing some issues (still not sure if it caused this particular one), I rebooted the server (bad move!). It locked up and had to be power cycled. Thankfully it’s a droplet, and with DigitalOcean I can power cycle easily (I did try the console but it wouldn’t connect).

Anyway, after the reboot /run started at 5% used but quickly grew to 70%. I did manage to install ncdu though, and with that I knew the problems I’d had with apt-get were being caused by a full /run.

After a quick (very quick) look at ncdu /run, I could see hhvm.hhbc taking up approx. 85MB+.

A quick check of the config showed hhvm was configured to do exactly that. So I adjusted the config to put the file in /var/cache/hhvm/hhvm.hhbc instead, and updated the systemd service script to create /var/cache/hhvm and set its owner.
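For reference, the two halves of that change look roughly like this (paths and the hhvm user are assumptions from a typical setup, not my exact files):

```ini
; hhvm ini - move the bytecode repo out of /run
hhvm.repo.central.path = /var/cache/hhvm/hhvm.hhbc
```

```ini
# systemd drop-in for the hhvm unit - create the cache dir before start
[Service]
ExecStartPre=/bin/mkdir -p /var/cache/hhvm
ExecStartPre=/bin/chown hhvm:hhvm /var/cache/hhvm
```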

Another reboot, everything seems fine and /run is now at 3% used.

And I’ve run apt-get upgrade successfully.

I’m thankful I noticed. I really thought I’d screwed something up on these 2 servers while migrating, and I could see another night of rebuilding new servers ahead of me.

Moral of the story: check you’re not out of space when you get weird errors (yes, the I/O errors should have rung some bells, but hey, it was 2am).
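If it helps anyone, the two standard commands worth firing off first when errors make no sense:

```shell
# Disk space per filesystem - tmpfs mounts like /run show up here too.
df -h
# Inode usage - "No space left on device" can mean inodes, not bytes.
df -i
```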

We’re watching 114,893 requests to just the one server. My first instinct was to add the offending IP address to our IPTables BLOCK list and just drop the traffic altogether. Except this no longer works with CloudFlare, since the IP addresses the firewall sees are theirs, not the offender’s!

No problem, CloudFlare deals with this(!?). I can just turn on ‘Under Attack’ mode and let them handle it. This is where you quickly learn it doesn’t really do much. Personally I got a lovely 5 second delay when accessing our websites with ‘Under Attack’ activated, but our servers were still being bombarded with requests. So I added the offending IP addresses to the firewall on CloudFlare. Surely that will block them from even reaching our servers! While I can’t say it had no effect, out of the IP addresses I added, some were still hitting our servers quite heavily.

So the question becomes ‘How can I drop the traffic at nginx level?’. Nginx is configured to restore the real IP addresses, so I should be able to block the real offenders here not CloudFlare.

Turns out to be pretty easy. Add:

### Block spammers and other unwanted visitors ###
include blockips.conf;

Just add a deny line to blockips.conf for each offending IP, then reload nginx (service nginx reload). I’d recommend testing your new config first (nginx -t).
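blockips.conf itself is nothing more than a list of deny directives; the addresses below are documentation examples, not real offenders:

```nginx
# /etc/nginx/blockips.conf
deny 203.0.113.45;       # a single IP
deny 198.51.100.0/24;    # or a whole range
```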

Now with that done on both servers, I’m still seeing requests, but they’re getting 403 errors and not allowed to hit any of the websites on our servers 🙂 After another 5 minutes of attacking they clearly gave up, as all the requests stopped.

But we’re not done. What if they just change IP addresses? We need this to be automatic. Enter Fail2Ban. I haven’t used it in some time, but I know what I need it to do:

Check all the websites log files for wp-login.php attempts

Write to /etc/nginx/blockips.conf

Reload nginx

Should be quite simple. Turns out it is, but it took hours to figure out how, since all the guides assume you’ll be using Fail2Ban to configure IPTables.
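For anyone trying the same, a sketch of the three pieces: a jail, a filter and a custom action that writes the deny lines. Every name, path and threshold here is illustrative and will need adapting:

```ini
# /etc/fail2ban/jail.local
[nginx-wplogin]
enabled  = true
filter   = nginx-wplogin
action   = nginx-blockips
logpath  = /var/log/nginx/*access.log
maxretry = 8
findtime = 1200
bantime  = 3600
```

```ini
# /etc/fail2ban/filter.d/nginx-wplogin.conf - match POSTs to wp-login.php
[Definition]
failregex = ^<HOST> .* "POST /wp-login\.php[^"]*" 200
```

```ini
# /etc/fail2ban/action.d/nginx-blockips.conf - maintain deny lines, reload nginx
[Definition]
actionstart =
actionstop  =
actioncheck =
actionban   = echo "deny <ip>;" >> /etc/nginx/blockips.conf && service nginx reload
actionunban = sed -i "/deny <ip>;/d" /etc/nginx/blockips.conf && service nginx reload
```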

It’s pretty simple but will need some tweaking. I’m not so sure 8 requests in 20 mins is a good threshold; we do have customers on some of our sites who regularly forget their password. The regex does look at the 200 code; I’ve read that a successful auth would actually give a 304. Not sure if this is correct, so it will need some testing. I also found information on how to change the code to a 403 for a failed login attempt. I think that would be a huge plus, but I’m not looking into it tonight.

A few tests using curl and I can see Fail2Ban has done its job: added the IP to the nginx blockips file and reloaded nginx 😀 I’m not too worried about syncing this across servers; as shown tonight they balance well, and I’m confident that within a few minutes of being bombarded they would both independently block the IP.

So there’s my morning of working around CloudFlare while still keeping some form of block list. Hope this helps someone. Please comment to let me know if you found any of this useful/worth reading.

Yesterday I received my barcode scanner. After a little playing, scanning everything in sight, I got to work programming a Raspberry Pi in Python to use the barcode scanner to update orders in WooCommerce.

I’m not going to go into full details in this post on how, but I will write it up soon.

A BIG problem I hit though was updating orders using the WooCommerce API. I kept getting error code 400, ‘No order data specified to edit order’. I’d followed the example but changed it to fit my code, i.e. don’t print the output but return it for error checking.

Searching Google for that exact phrase gave just 1 result, and it’s someone commenting that they’re getting the error with no response (and it’s 4 months old, on a 2 year old post).

After looking through the API code and trying to follow along (I’m not the best programmer, but I can generally follow what it’s trying to do), I found it looking for ‘order’.

So after looking at how the bulk update is done, and with a bit of playing, I found the order data needs to be wrapped in an ‘order’ key.
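In other words, the payload for a single order edit has to sit under an ‘order’ key, the same shape the bulk endpoint uses. As a sketch (field values here are examples), sending the fields bare:

```json
{"status": "printed"}
```

gives the 400, while wrapping them:

```json
{"order": {"status": "printed"}}
```

goes through.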

I’ve recently noticed a problem on 3 of my DigitalOcean servers: the APT package lists are not automatically updating every day. I try to keep all servers up to date and rely on Nagios to tell me when there are packages needing updates, which is the main reason I noticed something was broken.

The 3 servers in particular are newer builds than the rest of the system, and they don’t have near as much installed as the others, so at first I didn’t pay too much attention when other servers were going into warning state on Nagios indicating updates but these 3 weren’t. However, I would still connect to these servers and run my normal command:

A few times these servers did install updates, and I just thought it must have been my timing: the package lists hadn’t yet been updated by cron.daily.

But after this happened a few times, I decided not to run the above and see how long it would take for Nagios to throw an alert. It never did, and that got me a little worried.

Over the last few days I’ve been diagnosing what’s wrong. I started out by making sure cron was working properly, then kept an eye on the file timestamps:

ls -ltrh /var/lib/apt/lists/

Eventually I got to /etc/cron.daily/apt and checked through what it was doing on the working servers compared to the broken ones. I turned on VERBOSE and got a bit of info when running /etc/cron.daily/apt, but it seemed to exit quite quickly.
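For reference, the settings that script reads live under APT::Periodic; a typical working config looks something like this (the file name and values are examples; on some systems they’re split across 10periodic and 20auto-upgrades):

```
# /etc/apt/apt.conf.d/10periodic
APT::Periodic::Enable "1";
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Download-Upgradeable-Packages "0";
APT::Periodic::AutocleanInterval "7";
APT::Periodic::Verbose "2";
```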

Comparing it to a working server, the important bit seemed to be around

In the last few weeks I’ve been rebuilding servers & all services. I like to do this every so often to clear out any old junk, and take the opportunity to improve/upgrade systems and documentation.

This time around it’s kind of a big hit though. While I’ve had some services running with HA in mind, most would require some manual intervention to recover. So the idea this time was to complete what I’d previously started.

So far there’s 2 main changes I’ve made:

Move from MySQL Master/Master setup to MariaDB cluster.

Get Redis working as HA (which is why you’re here).

I’ll bore everyone in another post with the issues on the MariaDB cluster; this one concentrates on Redis.

The original setup was 2 Redis servers, 1 master and 1 slave, with the PHP session handler configured against a hostname listed in /etc/hosts. However, this time, as I’m running a MariaDB cluster, it kind of made sense to try out a Redis cluster (don’t stop reading yet). After reading lots, I decided on 3 servers, each running a master and 2 slaves. A picture here would probably help, but you’ll just have to visualize: 1, 2 & 3 are servers, Mx is master of x, Sx is slave of x. So I had 1M1, 1S2, 1S3, 2S1, 2M2, 2S3, 3S1, 3S2, 3M3.

This worked, in that if server 1 died its master role was taken over by either 2 or 3, and some Nagios checks and handlers brought it back as the master once it was back online. Now I have no idea if this was really a good setup (I didn’t test it for long enough), but 1 of the problems I encountered was where the PHP sessions ended up. I (wrongly) thought the PHP session handler would work with the cluster to PUT and GET the data, so I gave it all 3 master addresses. Reading up on Redis, if the server you’re asking doesn’t have the data, it will tell you which server does. So I thought it’s not really a problem if 1M1 goes down and the master becomes 2M1, because the other 2 masters will know and will say the data is over there. In manual testing this worked, but PHP sessions don’t seem to cope with being redirected (and this is also a problem later).

So after seeing this as a problem, I thought maybe a cluster was a bit overkill, and simplifying it to 1 master and 2 slaves would be fine.

I wiped out the cluster configuration and started again, but I knew I was also going to need Sentinel this time to manage which server is the master (I’d looked at it before but went for the cluster instead; time to read up again).

After getting a master up and running and then adding 2 slaves, I pointed PHP sessions at all 3 servers (again, a mistake). I was hoping (again) that the handler would be sensible enough to connect to each, and if it’s a slave (read only) detect that it can’t write and move on to the next. It doesn’t. It happily writes errors in the logs for UNKNOWN.

So I need a way for the session handlers to also know which is the current master, and just use this.

Basically this script is run whenever the master changes. (I need to add some further checks to make sure <role> and <state> are valid, but this is fine for quick testing.)
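A sketch of what that script can look like. Sentinel invokes a client-reconfig-script with <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>, so the promoted master’s IP arrives as the sixth argument; the hosts-file handling here is my illustration rather than my exact script:

```shell
#!/bin/bash
# Repoint the REDIS-MASTER hosts entry at the newly promoted master.
# Sentinel invokes this with:
#   <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>

update_master() {
  hosts_file=$1
  new_ip=$2
  # drop the old entry, append the new one
  sed -i '/REDIS-MASTER/d' "$hosts_file"
  printf '%s REDIS-MASTER\n' "$new_ip" >> "$hosts_file"
}

# Only act when sentinel actually passed the full argument list.
if [ "$#" -ge 6 ]; then
  update_master /etc/hosts "$6"
fi
```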

I’ve then changed the session path:

session.save_path = "tcp://REDIS-MASTER:6379"

and again in the pool.d/www.conf:

php_admin_value[session.save_path] = "tcp://REDIS-MASTER:6379"

Quite simply, I’m back to using a hosts entry which points at the master, but the good thing (which I wasn’t expecting) is that I don’t have to restart php-fpm for it to detect the change in IP address.

It’s probably not the most elegant way of handling sessions, but it works for me. The whole point of this is to be able to handle spikes in traffic by adding more servers, N3, N4, etc., and provided they run Sentinel with the script, they will each continue to point at the master.

I did think about writing this up as a step-by-step guide with each configuration I use, but the configs are pretty standard, and as it’s a bit more advanced than just installing a single Redis to handle sessions, I think the above info should really only be needed by anyone with an idea of what’s where anyway.

I still don’t really get why the Redis stuff for PHP doesn’t follow what the Redis server tells it, i.e. ‘the data is over there’. I suppose it will evolve. If you’re programming up your own use of Redis like the big boys, then you’ll be programming for that. I feel like I’m doing something wrong or missing something; I’m certain I’m not the only one who uses Redis for PHP sessions across multiple servers behind a load balancer, but there is very little I could find on Google beyond guides that just drop a single Redis instance into place. As that becomes a single point of failure, it’s pretty bizarre to me that every guide seems to go this way.

If anyone does read this and actually wants a step by step, let me know in the comments.

Just to finish off I’ll give an idea of my current (it’s always changing) topology.

2 Nginx Load Balancers (1 active, 1 standby)

2 Nginx web servers with PHP (running Syncthing to keep files in sync; I tried GlusterFS, which everyone raves about, but I found it crippling, so I’m still looking for a better solution).