Search This Blog

Posts

For many, many years, I was a massive fan of dedicated web hosting. I was VERY vocal about how you couldn't run a legitimate, professional business without using dedicated web hosting. And time and time again, I was proven right as people on shared web hosting came out of the woodwork in various places who had bet their business on shared hosting and lost - and sometimes they lost EVERYTHING including their business and all their customers!

Shared web hosting is still the bottom of the barrel, scummy/scammy money grab that it has always been and no respectable business should be caught dead running their web infrastructure on it. Period. That hasn't changed.

However, I have been watching a couple of new stars grow from infancy into its own over the past 8 years: Virtual Private Servers, aka VPS, and its newer, shinier cousin Cloud Hosting.

Dedicated web hosting is expensive. It has always been because you get a piece of hardware, a network drop, electricity, a transfe…

Today, I learned that people still buy brand new dot matrix printers. You know, those extremely noisy printers I thought we ditched as soon as it was possible to do so. Well, except for the nutcases who turn them into "musical instruments" and start a YouTube channel:

After doing some research, it turns out that, for bulk printing where output quality and "professional" appearance doesn't matter at all, dot matrix printers can be anywhere from 4 to 8 times cheaper than laser printers per printed page (the next cheapest technology) when amortized over the cost of maintenance of the lifetime of each type of printer. With dot matrix, you're not going to get the speed, accuracy, or the quietness of laser, but you'll supposedly save a boatload of money on toner.

Summary: An auction house in New York is suing an auction house in Dallas for copyright law violations regarding scraping the New York auction house's website listings including their listing photos and then SELLING those listings and photos in an aggregate database for profit.

As I'm the author of one of the most powerful and flexible web scraping toolkits (the Ultimate Web Scraping Toolkit), I have to reiterate the messaging found on the main documentation page: Don't use the toolkit for illegal purposes! If you are going to bulk scrape someone's website, you need to make sure you are free and clear legally for doing so and that you respect reasonable rate limits and the like. Reselling the data acquired with a scraping toolkit seems like an extremely questionable thing to do from a legal perspective.

Setting up your own Root Certificate Authority, aka Root CA, can be a difficult process. Web browsers and e-mail clients won't recognize your CA out-of-the-box, so most people opt to use public CA infrastructure. When security matters, using a public CA is the wrong solution. Privately owned and controlled CAs can be infinitely more secure than their public counterparts. However, most people who set up a private CA don't set up their CA infrastructure correctly. Here is what most private CAs look like:

Root CA cert -> Server cert

This is wrong because the server certificate has to be regenerated regularly (e.g. annually). If the root certificate is compromised, then it involves fairly significant effort to replace all of the certificates, including the root. What should be built is this:

Root CA cert -> Intermediate cert -> Server cert

In fact, this is the format that most public CAs use. The root CA cert is generated on a machine that isn't connected to …

I know I'm late for Halloween but we are still between holidays. Back in 2012, I needed a way to start Apache in the background. No one had built anything (that I liked), so I over-engineered a solution:

It takes the core CreateProcess() Windows API call, soups it up, and makes nearly every option directly available to command-line applications and batch files.

Today I put the finishing touches on a wild new addition. The newest feature adds TCP/IP socket support, thanks to this StackOverflow post.

What makes the new feature scary is that ANY IP address can be used, not just localhost addresses. Therefore, you can have it connect out to a server on the Internet or a LAN (or your IoT toaster?) and, in fine unencrypted fashion, route data to/from the stdin, stdout, and stderr of ANY process of your choice. Security? What's that?

The feature was added to work around bugs in various scripting languages where they end up…

Ever since e-ink came out, I do an annual pilgrimage into the world of e-ink and e-readers and come away disappointed. This year is not really any different. Well, okay, it's actually worse because various people out there who run e-reader blogs are now saying e-ink is a dying technology. You know it's bad when it comes to that. That's kind of sad because the industry came SO incredibly and painfully close to minimizing or even eliminating power hungry backlit displays. Had smartphones been delayed just one year (i.e. people had said "meh" to iOS as they should have), we would likely have color e-ink tech today that rivals LCD. That's rather frustrating. But now I need to digress and cover a few core problems with e-ink that has ultimately relegated it to the background.

One of the core problems of e-ink is the refresh rate. A screen refresh in e-ink land is generally a giant flash to black to white to various bits of images popping in and out. It…

Everyone loves receiving gifts. It's the thought that counts! Or something like that. However, most families these days tend to spread out across the country and so we end up shipping gifts to each other. But they are delivered in a boring brown box with a generic label on them where the return address is Amazon or another online store. And they might have ordered something else from Amazon or that online store too. And therefore they might open the gift before they are supposed to.

It turns out that there is a simple solution that is pretty cool. You can't get away from the boring brown box but you CAN do something about the shipping label on the boring brown box. Address labels typically look like:

First & Last NameAddress line 1Optional address line 2
City, State ZIP, Country

I've bolded the parts that same-country shipping companies actually pay attention to. Let's hack the address label to make it do something useful for us that won't annoy shi…

You've just upgraded to your first VPS or dedicated server and you think you've got all the software bits in place (LAMP stack and all that) OR you've moved from a hosting provider with easily configured hardware firewalls. Now you've read something somewhere that says you need 'iptables' rules with your new host. If you have any friends who manage Linux systems, you've also heard that "iptables is hard." In my experience, the only thing hard about iptables is that no one seems to publish decent rulesets and users are left to figure it all out on their own. It doesn't have to be that way!

Ubuntu/Debian: apt-get install iptables-persistent

RedHat/CentOS: chkconfig iptables on

That installs the iptables persistent package/enables storing iptables so that they load during the next boot.

The most interesting bug in PHP is the showstopper bug in the core of PHP you finally run into after a month of software development just as you are getting ready to ship a brand new product out the door. Specifically, PHP bug #72333, which is in all current versions of PHP. If you aren't familiar with reading C code, it can be extremely hard to follow along with that bug report especially since PHP streams behind-the-scenes are ugly beasts to try to wrap your head around (mine's still spinning and I wrote the bug report). In short, the problem is a combination of non-blocking mode with SSL sockets when calling SSL_write() with different pointers in 'ext\openssl\xp_ssl.c'.

The temporary patch in userland is to disable non-blocking mode when writing data - if you can - I'm not so sure I can/should. The correct solution is to fix PHP itself by altering how it interfaces with OpenSSL, which could be as simple as altering a couple of lines of code. I'd subm…

Let's talk about PHP. The scripting language, not the health insurance. PHP is, in my opinion, one of the greatest development tools ever created. It didn't start out that way, which is where most of its bad rap comes from, but it has transformed over the past decade into something worth using for any size project (and people do!). More specifically, I've personally found PHP to be an excellent prototyping and command-line scripting tool. I don't generally have to fire up Visual Studio to do complex things because I have access to a powerful cross-platform capable toolset at my fingertips. It's the perfect language for prototyping useful ideas without being forced into a box.

When writing a TCP server, the most difficult task at the beginning of the process is deciding what port number to use. The Transmission Control Protocol has a range of port numbers from 0 to 65535. The range of an unsigned short integer (2 bytes). In today's terms, that is a fairly small space of numbers and it is quite crowded out there. Fortunately, there are some rules you can follow:

Specifying port 0 will result in a random port being assigned by the OS. This is ideal only if you have some sort of auto-discovery mechanism for finding the port your users are interested in (e.g. connecting to a web server on port 80 and requesting the correct port number). Otherwise, you'll have to occupy an "open" port number.The first 1023 port numbers are reserved by some operating systems (e.g. Linux). Under those OSes, special permissions are required to run services on port numbers under 1024. Specifically, the process either has to have been started by the 'root…

At the end of last year, I decided to start collecting some statistics about my ever-changing software development task list. To do that, I wrote a script that ran once per day and recorded some interesting information from my task list manager (a flat text file) and the number of open tabs in Firefox. What follows are some interactive (oooooh, shiny!) charts and some analysis:

The number of tasks on my task list peaked twice this year at 78 tasks and dropped one time to 54 tasks. The number of tasks appears to be decreasing according to the trend line in the first chart. However, the second chart tells a slightly different story. Even though the number of tasks is on the decrease, the file size of the text file in which the tasks are stored is apparently on the increase. This tells me that the overall complexity of each individual task is slightly higher or I'm just slightly better at documenting details so I don't forget what the task entails (or some combination of b…