The latest in a series of "I didn't want to write a thing, but couldn't find another thing that already did exactly what I wanted, which is probably because I'm too picky, but whatever" projects, azure-blob-container-download (a.k.a. abcd) is a simple command-line tool to download all the blobs in an Azure storage container.
Here's how it's described in the README:

The motivation for this project was the same as with my previous post about getting an HTTPS certificate: I've migrated my website from a virtual machine to an Azure Web App.
And while it's easy to enable logging for a Web App and get hourly log files in the W3C Extended Log File Format, it wasn't obvious to me how to parse those logs offline to measure traffic, referrers, etc.
(Although that's not something I've bothered with up to now, it's an ability I'd like to have.)
What I wanted was a trustworthy, cross-platform tool to download all those log files to a local machine - but the options I investigated each seemed to be missing something.

So I wrote a simple Node.js CLI and gave it a few extra features to make my life easier.
The code is fairly compact and straightforward (and the dependencies minimal), so it's easy to audit.
The complete options for downloading and filtering are:

Azure Web Apps create a new log file every hour, so they add up quickly; abcd's date filtering options make it easy to perform incremental downloads.
The default directory structure (based on / separators) is collapsed during download, so all files end up in the same directory (named by container) and ordered by date.
The tool limits itself to one download at a time, so things proceed at a steady, moderate pace.
Once blobs have finished downloading, you're free to do with them as you please. :)
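For the curious, the core of the date filtering is easy to sketch. This assumes hourly logs whose blob names begin with yyyy/mm/dd segments (my simplification for illustration; abcd's actual parsing may differ):

```javascript
// Sketch (not abcd's actual code) of filtering blob names by date, assuming
// names like "2015/01/07/08/xyz.log" where the leading segments are the date.
function filterBlobsByDate(blobNames, startDate, endDate) {
  return blobNames.filter((name) => {
    // Interpret the leading yyyy/mm/dd segments as the blob's date
    const match = name.match(/^(\d{4})\/(\d{2})\/(\d{2})\//);
    if (!match) return false;
    const date = new Date(`${match[1]}-${match[2]}-${match[3]}`);
    return date >= startDate && date <= endDate;
  });
}

// Collapsing the "/"-separated structure so all files land in one directory
// and still sort by date
function flattenBlobName(name) {
  return name.replace(/\//g, "-");
}
```

Collapsing the separators this way is what makes the downloaded files sort chronologically in a single directory.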

Earlier this week, I posted the code for a tool I wrote many years ago:

In 2001, I wrote DHCPLite to unblock development scenarios between Windows and prototype hardware we were developing on.
I came up with the simplest DHCP implementation possible and took all the shortcuts I could - but it was enough to get the job done!
Since then, I've heard from other teams using DHCPLite for scenarios of their own.
And recently, I was asked by some IoT devs to share DHCPLite with that community.
So I dug up the code, cleaned it up a bit, got it compiling with the latest toolset, and am sharing the result here.

Trivia

Other than a couple of assumptions that changed in the last 15 years (ex: interface order), the original code worked pretty much as-is.

Now that the CRT has sensible string functions, I was able to replace my custom strncpyz helper with strncpy_s and _TRUNCATE.

Now that the IP Helper API is broadly available, I was able to statically link to it instead of loading dynamically via classes LibraryLoader and AnsiString.

Now that support for the STL is commonplace, I was able to replace the custom GrowableThingCollection implementation with std::vector.

I used the "one true brace style" (editor's note: hyperbole!) from Code Complete, but it never really caught on and the code now honors Visual Studio defaults.

It's been a while since I wrote native code and revisiting DHCPLite was a fond trip down memory lane.
I remain a firm believer in the benefits of a garbage-collected environment like JavaScript or C#/.NET for productivity and correctness - but there's also something very satisfying about writing native code and getting all the tricky little details right.

To be clear, both frameworks are perfectly good - but switching was a great opportunity to learn about React. :)

Conversion

The original architecture was pretty much what you'd expect: application logic lives in a JavaScript file and the user interface lives in an HTML file.
My goal when converting to React was to make as few changes to the logic as possible in order to minimize the risk of introducing behavioral bugs.
So I worked in stages:

Stage one migrated the interface from HTML to a JSX file.
This conversion took advantage of the variables' existing observability (via Knockout) to know when UI updates were needed.
Observability for application logic seems like a good companion to React's basic design (and efforts like MobX formalize this approach).
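To illustrate what I mean by observability, here's a minimal sketch modeled loosely on Knockout's ko.observable (not its actual implementation): a value that notifies subscribers when it changes, which a component can use to know when to re-render.

```javascript
// Minimal observable sketch: read with obs(), write with obs(newValue),
// and get notified of writes via obs.subscribe(callback).
function observable(initial) {
  let value = initial;
  const subscribers = [];
  const accessor = (...args) => {
    if (args.length === 0) return value;    // read
    value = args[0];                        // write
    subscribers.forEach((cb) => cb(value)); // notify interested components
    return value;
  };
  accessor.subscribe = (cb) => subscribers.push(cb);
  return accessor;
}
```

A component can subscribe to the observables it renders and trigger a re-render when any of them change - essentially what stage one of the migration relied on.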

With the bulk of the migration done, all that remained was to identify and fix the handful of bugs that got introduced along the way.

Details

While JSX isn't required to use React, it's a natural fit and I chose JSX so I could get the full React experience.
Putting JSX in the browser means using a transpiler to convert the embedded HTML to JavaScript.
Babel provides excellent support for this via the React preset and was easy to work with.
Because I was now running code through a transpiler, I also enabled the ES2015 Preset which supports newer features of the JavaScript language like let, const, and lambda expressions.
I only scratched the surface of ES2015, but it was nice to be able to do so for "free".

One thing I noticed as I migrated more and more code was that much of what I was writing was boilerplate to deal with the propagation of state to and from (observable) properties.
I captured this repetitive code in three helper methods and doing so significantly simplified components.
(Projects like ReactLink formalize this pattern within the React ecosystem.)
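The helpers looked something like the following sketch (names and signatures are mine for illustration, not the actual code): wire an observable property to the value/onChange pair a controlled input needs, so each component doesn't repeat the same state-propagation boilerplate.

```javascript
// Sketch of a linking helper: `observableProperty` is a Knockout-style
// observable (call with no arguments to read, with one argument to write).
function linkObservable(component, observableProperty) {
  return {
    value: observableProperty(),              // current value, for rendering
    onChange: (event) => {
      observableProperty(event.target.value); // push the edit back to the observable
      component.forceUpdate();                // and re-render with the new value
    }
  };
}
```

A component's render method can then spread the returned pair onto an input element instead of hand-writing both halves every time.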

Findings

Something I was curious about was how performance would differ after converting to React.
For the most part, things were fast enough before that there was no need to optimize.
Except for one scenario: filtering the list of items interactively.
So I'd already tuned the Knockout implementation for better performance by toggling the visibility (CSS display:none) of unfiltered items instead of removing and re-adding them to/from the DOM.

Implementing shouldComponentUpdate was a good start, but I still had the same basic problem: adding and removing hundreds of elements just wasn't snappy.
So I made the same visibility optimization, introducing another component to act as a thin wrapper around the existing one and deal exclusively with visibility.
After that, the overall performance of the filter scenario was improved to approximate parity.
(Actually, React was still a little slower for the 10,000 item case, but fared better in other areas, and I'm comfortable declaring performance roughly equivalent between the two implementations.)
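The wrapper boils down to something like this sketch (a reconstruction of the idea, not the actual component; in a real app it would extend React.Component, but render() returns a plain description here so the logic runs without React):

```javascript
// Thin wrapper that deals exclusively with visibility: it re-renders only
// when visibility flips, and hides via CSS instead of unmounting.
class VisibilityWrapper {
  constructor(props) {
    this.props = props;
  }
  shouldComponentUpdate(nextProps) {
    // Skip the expensive subtree entirely unless visibility changed
    return nextProps.visible !== this.props.visible;
  }
  render() {
    // display:none keeps the element in the DOM, avoiding add/remove churn
    return {
      style: { display: this.props.visible ? "" : "none" },
      child: this.props.child
    };
  }
}
```

Keeping filtered-out rows in the DOM (merely hidden) is what avoids the add/remove cost that made interactive filtering slow.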

Other considerations are complexity and size.
Two frameworks have been replaced by one, so that's a pretty clear win on the complexity side.
Size is a little murky, though.
The minified size of the React framework is a little smaller than the combined sizes of jQuery and Knockout.
However, the size of the new JSX file is notably larger than the templated HTML it replaces (recall that the code for logic stayed basically the same).
And compiling JSX tends to expand the size of the code.
But fortunately, Babel lets you minify scripts and that's enough to offset most of the growth.
In the end, the React version of PassWeb is slightly smaller than the jQuery/Knockout version - but not enough to be the sole reason to convert.

Conclusion

Now that the dust has settled, would I do it all over again? Definitely! :)

Although there weren't dramatic victories in performance or size, I like the modular approach React encourages and feel it may lead to simpler code.
I also like that React combines UI logic and presentation better and allowed me to completely gut the HTML file (which now contains only head and script tags).
I also see value in unifying an application's state into one place (formalized by libraries like Redux), though I deliberately didn't explore that here.
Most importantly, this was a good learning experience and I really enjoyed getting to know React.

TextAnalysisTool.NET is one of the first side projects I did at Microsoft, and one of the most popular.
(Click here for relevant blog posts by tag.)
Many people inside and outside the company have written me with questions, feature requests, or sometimes just to say "thank you".
It's always great to hear from users, and they've provided a long list of suggestions and ideas for ways to make TextAnalysisTool.NET better.

By virtue of changing teams and roles various times over the years, I don't find myself using TextAnalysisTool.NET as much as I once did.
My time and interests are spread more thinly, and I haven't been updating the tool as aggressively.
(Understatement of the year?)

Various coworkers have asked for access to the code, but nothing much came of that - until recently, when a small group showed up with the interest, expertise, and motivation to drive TextAnalysisTool.NET forward!
They inspired me to simplify the contribution process and they have been making a steady stream of enhancements for a while now.
It's time to take things to the next level, and today marks the first public update to TextAnalysisTool.NET in a long time!

That's where you'll find an overview, download link, release notes, and other resources.
The page is owned by the new TextAnalysisTool GitHub organization, so all of us are able to make changes and publish new releases.
There's also an issue tracker, so users can report bugs, comment on issues, update TODOs, make suggestions, etc.

The new 2015-01-07 release can be downloaded from there, and includes the following changes since the 2013-05-07 release:

2015-01-07 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added a tooltip to the loaded file indicator in the status bar
* Fixed a bug where setting a marker used in an active filter causes the
current selection of lines to be changed
2015-01-07 by David Anson (http://dlaa.me/)
----------
* Improve HTML representation of clipboard text when copying for more
consistent paste behavior
2015-01-01 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Fixed a bug where TAB characters are omitted in the display
* Fixed a bug where lines saved to file include an extra white space at the
start
2014-12-21 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Changed compilation to target .NET Framework 4.0
2014-12-11 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Redesigned the status bar indications to be consistent with Visual Studio and
added the number of currently selected lines
2014-12-04 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added the ability to append an existing filters file to the current filters
list
2014-12-01 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added recent file/filter menus for easy access to commonly-used files
* Added a new settings registry key to set the
maximum number of recent files or filter files allowed in the
corresponding file menus
* Fixed bug where pressing SPACE with no matching lines from filters
crashed the application
* Fixed a bug where copy-pasting lines from the application to Lync
resulted in one long line without carriage returns
2014-11-11 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added support for selection of background color in the filters
(different selection of colors than the foreground colors)
* The background color can be saved and loaded with the filters
* Filters from previous versions that lack a background color will have the
default background color
* Saving foreground color field in filters to 'foreColor' attribute.
Old 'color' attribute is still being loaded for backward compatibility
purposes.
* Changed control alignment in Find dialog and Filter dialog
2014-10-21 by Mike Morante (http://github.com/mike-mo)
----------
* Fix localization issue with the build string generation
2014-04-22 by Mike Morante (http://github.com/mike-mo)
----------
* Line metadata is now visually separate from line text contents
* Markers can be shown always/never/when in use to have more room for line text
and the chosen setting persists across sessions
* Added statusbar panel funnel icon to reflect the current status of the Show
Only Filtered Lines setting
2014-02-27 by Mike Morante (http://github.com/mike-mo)
----------
* Added zoom controls to quickly increase/decrease the font size
* Zoom level persists across sessions
* Added status bar panel to show current zoom level

These improvements were all possible thanks to the time and dedication of the new contributors (and organization members):

Please join me in thanking these generous souls for taking time out of their busy schedule to contribute to TextAnalysisTool.NET!
They've been a pleasure to work with, and a great source of ideas and suggestions.
I've been really pleased with their changes and hope you find the new TextAnalysisTool.NET more useful than ever!

Those tools work well, but I also wanted something with a GUI for a more natural fit with UI-centric workflows.
I prototyped a simple WPF app that worked okay, but it wasn't as ubiquitously available as I wanted.
Surprisingly often, I'd find myself on a machine without the latest version of the tool.
(Classic first world problem, I know...)
So I decided to go with a web app instead.

The key observation was that modern browsers already integrate with the host operating system's spell-checker via the spellcheck HTML attribute.
By leveraging that, my app would automatically get a comprehensive dictionary, support for multiple languages, native-code performance, and support for the user's custom dictionary.
#winning!
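The browser does all the hard work; the app just has to opt in. A minimal sketch of that opt-in (the lang hint is my assumption about suggesting a dictionary, and the stub-friendly shape is for illustration):

```javascript
// Enable the host operating system's spell-checker on an editable region.
// Runs in a browser against a real DOM element; the element only needs a
// setAttribute method, so the logic is testable with a stub.
function makeSpellCheckable(el) {
  el.setAttribute("contenteditable", "true"); // make the region editable
  el.setAttribute("spellcheck", "true");      // opt in to native spell-checking
  el.setAttribute("lang", "en");              // dictionary language hint (assumption)
  return el;
}
```

In a browser, `makeSpellCheckable(document.getElementById("editor"))` is enough to get squiggly underlines from the native spell-checker.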

Aside: The (very related) forceSpellCheck API isn't supported by any browser I've tried.
Fortunately, it's not needed for Firefox, its absence can be coded around for Chrome, and there's a simple manual workaround for Internet Explorer.
Click the "Help / About" link in the app for more information.

Inspired by web browsers' native support for spell-checking, I've created Spell✔ (a.k.a. SpellV), a simple app that makes it easy to spell-check with a browser.
Click the link to try it out now - it's offline-enabled, so you can use it again later even without a network connection!

To import content, Spell✔ supports pasting text from the clipboard, drag-dropping a file, or browsing the folder structure.
For a customized experience, you can switch among multiple views of the data, including:

As you might guess from the name, TextAnalysisTool.NET (introductory blog post, related links) was not the first version of the tool.
The original implementation was written in C, compiled for x86, slightly less capable, and named simply TextAnalysisTool.
I got an email asking for a download link recently, so I dug up a copy and am posting it for anyone who's interested.

The UI should be very familiar to TextAnalysisTool.NET users:

The behavior is mostly the same as well (though the different hot key for "add filter" trips me up pretty consistently).

Writing the original TextAnalysisTool was a lot of fun and contributed significantly to a library of C utility functions I used at the time called ToolBox.
It also provided an excellent conceptual foundation upon which to build TextAnalysisTool.NET in addition to good lessons about how to approach the problem space.
If I ever get around to writing a third version (TextAnalysisTool.WPF? TextAnalysisTool.Next?), it will take inspiration from both projects - and handle absurdly-large files.

While writing the grunt-check-pages task for Grunt.js, I wanted a way to test the complete lifecycle: to load the task in a test context, run it against various inputs, and validate the output.
It didn't seem practical to call into Grunt itself, so I looked around for a mock implementation of Grunt.
There were plenty of mocks for use with Grunt, but I didn't find anything that mocked the API itself.
So I wrote a very simple one and used it for testing.

That worked well, so I wanted to formalize my gruntMock implementation and post it as an npm package for others to use.
Along the way, I added a bunch of additional API support and pulled in domain-based exception handling for a clean, self-contained implementation.
As I hoped, updating grunt-check-pages made its tests simpler and more consistent.

Although gruntMock doesn't implement the complete Grunt API, it implements enough of it that I expect most tasks to be able to use it pretty easily.
If not, please let me know what's missing! :)

For more context, here's part of the introductory section of README.md:

gruntMock is a simple mock object that simulates the Grunt task runner for multi-tasks and can be easily integrated into a unit testing environment such as Nodeunit. gruntMock invokes tasks the same way Grunt does and exposes (almost) the same set of APIs. After providing input to a task, gruntMock runs and captures its output so tests can verify expected behavior. Task success and failure are unified, so it's easy to write positive and negative tests.

For a more in-depth example, have a look at the use of gruntMock by grunt-check-pages.
That shows off integration with other mocks (specifically nock, a nice HTTP server mock) as well as the testOutput helper function that's used to validate each test case's output without duplicating code.
It also demonstrates how gruntMock's unified handling of success and failure allows for clean, consistent testing of input validation, happy path, and failure scenarios.
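To give a flavor of the idea (this is not gruntMock's actual API - see its README for that), a bare-bones mock of Grunt's multi-task contract might look like:

```javascript
// Bare-bones sketch of mocking Grunt's multi-task API: capture the task body
// at registration, then invoke it with a Grunt-like `this` context and
// record what it logs so tests can assert on the output.
function createGruntMock(config) {
  const output = [];
  let taskBody;
  const mock = {
    registerMultiTask: (name, description, body) => { taskBody = body; },
    log: { ok: (message) => output.push(message) },
    run: () => {
      taskBody.call({
        // Merge task defaults with the test's configured options, like Grunt does
        options: (defaults) => Object.assign({}, defaults, config.options),
        files: config.files || []
      });
      return output; // tests assert against the captured output
    }
  };
  return mock;
}
```

A test then registers the task under test against the mock, calls run(), and verifies the captured output.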

As part of converting my blog to a custom Node.js app, I wrote a set of tests to validate its routes, structure, content, and behavior (using mocha/grunt-mocha-test).
Most of these tests are specific to my blog, but some are broadly applicable and I wanted to make them available to anyone who was interested.
So I created a Grunt plugin and published it to npm:

grunt-check-pages

An important aspect of creating web sites is to validate the structure and content of their pages. The checkPages task provides an easy way to integrate this testing into your normal Grunt workflow.

By providing a list of pages to scan, the task can:

Validate all external links point to live content (similar to the W3C Link Checker)

Link validation is fairly uncontroversial: you want to ensure each hyperlink on a page points to valid content.
grunt-check-pages supports the standard HTML link types (ex: <a href="..."/>, <img src="..."/>) and makes an HTTP HEAD request to each link to make sure it's valid.
(Because some web servers misbehave, the task also tries a GET request before reporting a link broken.)
There are options to limit checking to same-domain links, to disallow links that redirect, and to provide a set of known-broken links to ignore.
(FYI: Links in draft elements (ex: picture) are not supported for now.)
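The HEAD-then-GET fallback is simple to sketch (my reconstruction of the behavior described above, with the HTTP request injected as a function so the logic stands alone):

```javascript
// Sketch of link validation with a GET fallback: `request` is any function
// (method, url) -> HTTP status code (the real task makes network calls).
function isLinkLive(request, url) {
  const headStatus = request("HEAD", url);
  if (headStatus >= 200 && headStatus < 300) return true;
  // Some servers mishandle HEAD requests, so retry with GET before
  // declaring the link broken
  const getStatus = request("GET", url);
  return getStatus >= 200 && getStatus < 300;
}
```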

XHTML compliance might be a little controversial.
I'm not here to persuade you to love XHTML - but I do have some experience parsing HTML and can reasonably make a few claims:

HTML syntax errors are tricky for browsers to interpret and (historically) no two work the same way

Something I find useful (and outline above) is to define separate configurations for development and production.
My development configuration limits itself to links within the blog and ignores some that don't work when I'm self-hosting.
My production configuration tests everything across a broader set of pages.
This lets me iterate quickly during development while validating the live deployment more thoroughly.
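A split configuration might look something like this sketch (the option names and URLs are illustrative of the approach, not grunt-check-pages' exact schema):

```javascript
// Separate development and production targets for a checkPages-style task:
// development stays fast and local; production checks everything.
const checkPagesConfig = {
  development: {
    options: {
      pageUrls: ["http://localhost:8080/blog"],
      onlySameDomain: true,                        // skip external links while iterating
      linksToIgnore: ["http://localhost:8080/rss"] // known not to work when self-hosting
    }
  },
  production: {
    options: {
      pageUrls: ["http://example.com/", "http://example.com/blog"],
      checkLinks: true, // validate every link, including external ones
      checkXhtml: true  // validate page structure on the live deployment
    }
  }
};
```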

Footnote: grunt-check-pages is not a site crawler; it looks at exactly the set of pages you ask it to.
If you're looking for a crawler, you may be interested in something like grunt-link-checker (though I haven't used it myself).

I've used a password manager for many years because I feel that's the best way to maintain different (strong!) passwords for every account.
I chose Password Safe when it was one of the only options and stuck with it until a year or so ago.
Its one limitation was becoming more and more of an issue: it only runs on Windows and only on a PC.
I'm increasingly using Windows on other devices (ex: phone or tablet) or running other operating systems (ex: iOS or Linux), and found myself unable to access my passwords more and more frequently.

To address that problem, I decided to switch to a cloud-based password manager for access from any platform.
Surveying the landscape, there appeared to be many good options (some free, some paid), but I had a fundamental concern about trusting important personal data to someone else.
Recent, high-profile hacks of large corporations suggest that some companies don't try all that hard to protect their customers' data.
Unfortunately, the incentives aren't there to begin with, and the consequences (to the company) of a breach are fairly small.

Instead, I thought it would be interesting to write my own cloud-based password manager - because at least that way I'd know the author had my best interests at heart. :)
On the flip side, I introduce the risk of my own mistake or bug compromising the system.
But all software has bugs - so "Better the evil you know (or can manage) than the evil you don't".
The good news is that I've taken steps to try to make things more secure; the bad news is that it only takes one bug to throw everything out the window...

Hence the disclaimer:

I've tried to ensure PassWeb is safe and secure for normal use in low-risk environments, but do not trust me.
Before using PassWeb, you should evaluate it against your unique needs, priorities, threats, and comfort level.
If you find a problem or a weakness, please let me know so I can address it - but ultimately you use PassWeb as-is and at your own risk.

With that in mind, I'm open-sourcing PassWeb's code for others to use, play around with, or find bugs in.
In addition to being a good way of sharing knowledge and helping others, this will satisfy the requests for code that I've already gotten. :)

Some highlights:

PassWeb's client is built with HTML, CSS, and JavaScript and is an offline-enabled single-page application.

It runs on all the recent browsers I've tried and is mobile-friendly, so it works well on phones, too.

Entries have a title, the name/password for an account, a link to the login page, and a notes section for additional details (like answers to security questions).

PassWeb can generate strong, random passwords; use them as-is, tweak them, or ignore them and type your own.

The small server component runs on ASP.NET or Node.js (I provide both implementations and a set of unit tests).

Data is encrypted via AES/CBC and only ever decrypted on the client (the server never sees the user name or password).

The code is small, with few dependencies, and should be easy to audit.

So I dashed off another string extractor based on the implementation of JavaScriptStringExtractor:

HtmlStringExtractor
Runs on Node.js.
Dependencies via npm.
htmlparser2 for parsing, glob for globbing, and request for web access.
Wildcard matching includes subdirectories when ** is used.
Pass a URL to extract strings from the web.

HtmlStringExtractor works just like its predecessors, outputting all strings/attributes/comments to the console for redirection or processing.
But it has an additional power: the ability to access content directly from the internet via URL.
Because so much of the web is HTML, it seemed natural to support live-extraction - which in turn makes it easier to spell-check a web site and be sure you're including "hidden" text (like title and alt attributes) that copy+paste doesn't cover.
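To give a feel for what extraction involves, here's a greatly simplified sketch (the real tool uses htmlparser2 and is far more robust than these regular expressions, which only handle well-formed markup):

```javascript
// Simplified string extraction: collect "hidden" text from title/alt
// attributes, then strip comments and tags and collect the visible text.
function extractStrings(html) {
  const strings = [];
  // Pull the text carried by title/alt attributes (invisible to copy+paste)
  for (const match of html.matchAll(/\b(?:title|alt)\s*=\s*"([^"]*)"/g)) {
    if (match[1].trim()) strings.push(match[1].trim());
  }
  // Drop comments and tags, then collect the remaining text runs
  const text = html
    .replace(/<!--[\s\S]*?-->/g, " ")
    .replace(/<[^>]+>/g, " ");
  for (const piece of text.split(/\s{2,}|\n/)) {
    if (piece.trim()) strings.push(piece.trim());
  }
  return strings;
}
```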

Its HTML parser is very forgiving, so HtmlStringExtractor will happily work with HTML-like languages such as XML, ASPX, and PHP.
Of course, the utility of doing so decreases as the language gets further removed from HTML, but for many scenarios the results are quite acceptable.
In the specific case of XML, output should be entirely meaningful, filtering out all element and attribute metadata and leaving just the "real" data for review.

In keeping with the theme of "small and simple", I didn't add an option to exclude attributes by name - but you can imagine that filtering out things like id, src, and href would do a lot to reduce noise.
Who knows, maybe I'll support that in a future update. :)

For now, things are simple.
The StringExtractors GitHub repository has the complete code for all three extractors.

Enjoy!

Aside: As I wrote this post, I realized there's another "language" I use regularly: JSON.
Because of its simple structure, I don't think there's a need for JsonStringExtractor - but if you feel otherwise, please let me know!
(It'd be easy to create.)
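For what it's worth, that hypothetical JsonStringExtractor really would be short - walk the parsed object and collect every string value:

```javascript
// Collect all string values from a JSON document by recursively walking the
// parsed structure (keys could be included too, if desired).
function extractJsonStrings(json) {
  const strings = [];
  const walk = (value) => {
    if (typeof value === "string") {
      strings.push(value);
    } else if (value && typeof value === "object") {
      Object.values(value).forEach(walk); // handles arrays and objects alike
    }
  };
  walk(JSON.parse(json));
  return strings;
}
```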