One of the goals in my new approach to my job is to speed up our Quality Control (QC) processes. We use an internal website for a lot of our QC checks, and that website requires manual selection of each job to be checked. It would help if we could feed a list of jobs to be QCed to a script that would automate the selection of each job.

There are a number of well-known tools for this kind of browser automation, many of them free. However, since our QC website works about as well in Internet Explorer as in any other browser, I think my best approach will be COM automation from PowerShell, since all the parts of that solution are built into Windows.
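As a first sketch of that approach, something like the following should work — the URL, file path, and element IDs here are placeholders I made up, not our actual QC site:

```powershell
# Sketch: drive Internet Explorer via its COM object from PowerShell.
# The URL, file path, and element IDs below are hypothetical.
$ie = New-Object -ComObject "InternetExplorer.Application"
$ie.Visible = $true
$ie.Navigate("http://qcsite.example.com/jobs")

# Wait for the page to finish loading (ReadyState 4 = complete)
while ($ie.Busy -or $ie.ReadyState -ne 4) {
    Start-Sleep -Milliseconds 250
}

# For each job in our list, find its checkbox on the page and click it
# (assumes each job's checkbox element ID matches the job name)
$jobs = Get-Content "C:\QC\jobs-to-check.txt"
foreach ($job in $jobs) {
    $checkbox = $ie.Document.getElementById($job)
    if ($checkbox) { $checkbox.Click() }
}
```

The real script will depend on how the QC site's HTML is actually structured, but the `InternetExplorer.Application` object and its `Document` property (a live DOM) are the standard building blocks.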

Accordingly, I’ve started searching the web for reference material.

The same day that I wrote my last post (Saturday, August 13), we discovered that one of our large web data servers was missing…everything.

This started a couple of weeks of time-consuming data restores. We were clocking overtime hours for the first time in months.

Now that the server is back to normal, we’re able to sit back and reassess processes here, which has led me to reassess my approach to scripting.

Over the past couple of months, I’ve been interested in testing PowerShell using Pester. After the past month, though, I think a better approach would be to use Ruby for applications that need testing (for example, scripts that delete data) and to restrict PowerShell to building GUIs (since Ruby has lagged behind with regard to standalone GUIs) and read-only applications (for example, identifying incomplete data processing jobs).

I’m still planning to use version control (Git) with PowerShell. I’m not CRAZY.

One add-on was particularly interesting to me, since I’m trying to bring what I know of proper software engineering practice to bear on the PowerShell scripting I do.

This add-on, ISE_Cew, is a package that includes both Pester and Git functions. It’s just the thing if you’re looking to add behavior-driven testing (Pester) and version control (Git) to your PowerShell workflow.

I’ve been diving back in to – well, dipping my toe in the chilly waters of – PowerShell for some scripting here at my Data Processing job.

Several years ago, I learned the hard way (i.e., after writing a couple hundred lines of Ruby script) that although much of our processing automation was written without unit tests, the same latitude does NOT apply to any automation that *I* want to write. Not if I want to put it into production, that is.

I resisted unit testing and TDD for some time (Why? Well, that’s a story for another time), but I finally got testing religion last year with some Python scripting.

I could continue with the Python, but I think PowerShell is a better fit for our environment here.

Most modern programming languages offer a choice of testing frameworks, but for PowerShell there’s only one that I know of – Pester.
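For anyone who hasn’t seen it, a Pester test file is just PowerShell. Here’s a minimal example of the Describe/It structure – the function and file name are hypothetical, and note that older Pester (3.x) uses `Should Be` where current versions use `Should -Be`:

```powershell
# Get-JobStatus.Tests.ps1 -- hypothetical function and tests, just to show the shape

function Get-JobStatus {
    param([string]$JobName)
    if ($JobName) { "Pending" } else { throw "Job name required" }
}

Describe "Get-JobStatus" {
    It "returns Pending for a new job" {
        Get-JobStatus -JobName "Job001" | Should Be "Pending"
    }

    It "throws when no job name is given" {
        { Get-JobStatus -JobName "" } | Should Throw
    }
}
```

Running `Invoke-Pester` in that directory picks up any `*.Tests.ps1` files automatically.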

Matt’s blog is subtitled “Tales from an Automation Engineer”, so his perspective on testing is a little different from that of the usual software-testing guru. In particular, he points out that when it comes to infrastructure (and Data Processing, IMO), the things that are mocked / “stubbed out” in most software development environments are exactly the things that we want to test:

But infrastructure code is different

Ok. So far I don’t think anything in this post varies with infrastructure code. As far as I am concerned, these are pretty universal rules to testing. However, infrastructure code IS different…

If I mock the infrastructure, what’s left?

So when writing more traditional style software projects (whatever the hell that is but I don’t know what else to call it), we often try to mock or stub out external “infrastructureish” systems. File systems, databases, network sockets – we have clever ways of faking these out and that’s a good thing. It allows us to focus on the code that actually needs testing.

However …if I mock away all of these layers, I may fall into the trap where I am not really testing my logic.

More integration tests

One way in which my testing habits have changed when dealing with infrastructure code is I am more willing to sacrifice unit tests for integration style tests…If I mock everything out I may just end up testing that I am calling the correct API endpoints with the expected parameters. This can be useful to some extent but can quickly start to smell like the tests just repeat the implementation.

Typically I like the testing pyramid approach of lots and lots of unit tests under a relatively thin layer of integration tests. I’ll fight to keep that structure but find that often the integration layer needs to be a bit thicker in the infrastructure domain. This may mean that coverage slips a bit at the unit level but some unit tests just don’t provide as much value and I’m gonna get more bang for my buck in integration tests.

Matt’s opinion accords with my intuition about my Data Processing environment. In the DP realm, the part of the script that can be tested without accessing the production environment (or at least a working model of the production environment) can be trivial. This is probably the main reason our existing production automation doesn’t have full testing coverage. (Well, that, and the fact that as far as I know there’s no testing framework for the automation software we use).

So I think my approach will be something like Matt’s – unit test where it’s useful and non-trivial, and more integration tests (a “thicker layer” as Matt says) to get full (or at least adequate) coverage.
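To make that “tests just repeat the implementation” smell concrete: in Pester you can Mock a destructive cmdlet so a unit test only verifies that the right call *would* be made. A hypothetical sketch (the function, paths, and file names are all made up):

```powershell
# Hypothetical function that deletes stale job data
function Remove-StaleJobs {
    param([string]$Path, [int]$Days)
    $cutoff = (Get-Date).AddDays(-$Days)
    Get-ChildItem $Path |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        ForEach-Object { Remove-Item $_.FullName }
}

Describe "Remove-StaleJobs" {
    # Fake the file system: one stale file, one fresh file
    Mock Get-ChildItem {
        @([pscustomobject]@{ FullName = "old.dat"; LastWriteTime = (Get-Date).AddDays(-30) },
          [pscustomobject]@{ FullName = "new.dat"; LastWriteTime = (Get-Date) })
    }
    Mock Remove-Item { }   # never touch the real disk

    It "deletes only the stale file" {
        Remove-StaleJobs -Path "X:\jobs" -Days 7
        Assert-MockCalled Remove-Item -Exactly 1 -ParameterFilter { $Path -eq "old.dat" }
    }
}
```

This test has some value – it pins down the date comparison – but it’s mostly asserting that the function calls `Remove-Item` the way the function calls `Remove-Item`. An integration test against a scratch directory of real files would tell me a lot more.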

I’m trying to develop some scripts to handle data on some of the web servers we push data to at work.

I’m using PowerShell, because I can be fairly sure it will be available on the local hosts that we connect to the web servers from.

There is a “PSFTP” module that wraps .NET calls in PowerShell commands, but it’s taken a few searches to find the best way to use it in our environment here.

The Technet article shows examples of using the PowerShell commands in the module, but the way the example sets up the initial FTP connection opens a dialog box (outside of PowerShell) for the user to enter the FTP password. This is ok for the example, but wouldn’t work for automated FTP applications.
Happily, I found a WindowsITPro article that shows how to use the PowerShell ConvertTo-SecureString cmdlet to create the FTP credential from a password stored in plaintext (or in a configuration file), rather than having to be logged in to the server to enter the password and allow the script to continue.

Using the ConvertTo-SecureString cmdlet, I’ve been able to create a connection and list the contents of the remote (FTP server) directory.
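Putting the pieces together, the setup looks roughly like this – the server name, user name, password, and remote path are placeholders, and the cmdlet names are PSFTP’s as documented (worth double-checking against your installed module version):

```powershell
Import-Module PSFTP

# Build the credential from a stored plaintext password (placeholder values)
$password   = ConvertTo-SecureString "NotARealPassword" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ("ftpuser", $password)

# Open the connection; -UsePassive may need to be dropped depending on the firewall
Set-FTPConnection -Credentials $credential -Server "ftp.example.com" `
                  -Session "MySession" -UsePassive

# Retrieve the session and list the remote directory
$session = Get-FTPConnection -Session "MySession"
Get-FTPChildItem -Session $session -Path "/incoming"
```

In a production script the plaintext password would come from a configuration file readable only by the service account, not a literal in the script.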

One thing I’ve had to look at with the Technet and WindowsITPro examples is the “-UsePassive” parameter on the Set-FTPConnection command. I’ve never had to be concerned with the difference between Active and Passive FTP modes before, but it appears that Passive mode has problems with the way our firewall is set up. I found a Stack Overflow post about the difference between Active and Passive FTP, but I will probably need to keep consulting it until I’ve really got a handle on the distinction.

I’m getting back into technical documentation at my job – in fact, I just had my work goals for the new year* approved –

1. Rework the SharePoint documentation site (that I originally created)

2. Collect existing documentation and write new documentation for one particular large customer

3. Create an e-learning User Guide to teach co-workers how to use the SharePoint site

… (goals 4-6 are team goals that were assigned to everyone in my department)

For goal #2, I’m looking into creating checklists that anyone in my department can use to fix problems with that particular large customer.

I’ve read Atul Gawande’s Checklist Manifesto – in fact, I got the audiobook version back when the book came out.

The major problem with Gawande’s book is that it’s long on WHY you should create and use checklists, but kind of skimpy on HOW to create them.

I’m going to review what material Gawande’s book has on creating checklists, but I’ve also started looking online for other resources, as well as adapting available materials for my particular situation here.

One of the first things I found was Project Check, which looks like it was set up about the same time that Gawande’s book came out and focuses on surgical and medical checklists. It also includes a Checklist for Checklists, a 3-column checklist designed by Gawande and Dan Boorman (the Boeing expert that Gawande worked with for the Checklist Manifesto). This checklist is available as a PDF download.

Finally, here’s something I think I’ll find VERY useful as I create checklists to store in SharePoint for my co-workers to use: (how to create a) checklist in Microsoft Excel, including conditional formatting to make the Go/No-Go result more obvious.

* Yes, the new year started 3 months ago, but our goals were only finalized as of March 31.

One of the answers on that post referenced Confluence, a full-sized (not personal) commercial wiki solution from Atlassian. I’ve gone to Python Meetups at Atlassian’s office here in Austin, so I thought it was worth a look. Confluence uses Java on the backend, which isn’t a technology I’m really familiar with – but our internal web hosting group here is, so they would probably be more comfortable supporting it than some alternatives. It’s just $10 (donated to charity) for up to 10 users if you host it on your own server (which would be de rigueur for our process documentation, since it includes login information for customer FTP sites).

I took a first look at a couple of other full-size wiki solutions:

Wagn (pronounced “wagon”) runs on Ruby on Rails. It has an interesting card-based interface that reminds me a little bit of HyperCard, which might be easier for my co-workers to wrap their minds around when I’m trying to convince them to write documentation for their own jobs. It’s open-source and free – and our web hosting group is also somewhat familiar with RoR, so it might be easier for them to support than some other alternatives, such as…

Moin Moin is a flat-file (not database-backed) wiki built using Python 2.x (not 3.x yet). It’s the wiki software of choice for some major open-source organizations, including Apache, Ubuntu, and FreeBSD, as well as Python.org itself. It’s also available in a “personal” (not web-hosted) version, which would be useful for demonstrating the capabilities on the way to “selling” the company on the idea of hosting our documentation in a wiki (instead of in files in random locations on our network file shares).

I was editing some slightly overexposed pictures set against a white background, and decided I wanted to do a quick adjustment on the model without affecting the color of the background. I’m a total Photoshop novice, so my first impulse was to grab the Quick Selection tool – but there were a number of areas that would have been fiddly. That’s when I thought, “There’s got to be a better way to do this!”

Adobe Help provided Select a color range in an image, which was exactly what I was looking for – and a bit more: a short video on selecting and correcting skin tones (plus steps to save a skin tone selection as a preset). I need to revisit this page soon to really get skin tone and color range selections into my brain.

My only regret is that I didn’t discover Photoshop skin tone selections before Halloween – I would have loved to Photoshop my head into a scene with Dorium Maldovar.

I’ve been studying Python for several years, off and on, but I’m only now getting traction on applying it at work.

One of the things I’m struggling with right now is how best to arrange the directory structure when using object-oriented design in which some classes inherit from others, given that I’m doing the development on my desktop machine but plan to move the project to a common network directory.
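The layout I’m experimenting with is something like this (all names hypothetical) – one package directory, with child classes importing their parents via package-relative imports, so the whole tree can be copied to the network share as a unit without touching sys.path:

```
dataproc/               # project root -- copy this whole tree to the network share
    __init__.py
    base_job.py         # class BaseJob
    ftp_job.py          # class FtpJob(BaseJob); uses "from .base_job import BaseJob"
    scripts/
        run_jobs.py     # entry point; "from dataproc.ftp_job import FtpJob"
```

No idea yet whether this survives contact with our network permissions, but it keeps the inheritance chain working no matter where the package directory lands.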

This unprecedented TASCHEN publication, authored by Jens Müller, brings together approximately 6,000 trademarks, focused on the period 1940–1980, to examine how modernist attitudes and imperatives gave birth to corporate identity.

Seeing the Slate link reminded me that I’ve had Logo Lounge sitting in an open tab in my browser for, well, months…

I’ll never be a great graphic designer (I’d rather code), but anyone who works with graphics (even if it’s only coding a web framework that “consumes” graphics) ought to seek inspiration from others’ work.