NDIToolbox

If you haven’t updated NDIToolbox since last time, it’s worth doing it now. Here’s where we are today.

Better support for UTWin data files, including preliminary support for compressed waveforms. That last one’s still highly experimental but let me know if it works for you; I don’t have access to a lot of sample data files for testing.

Squashed bugs, including better handling of memory errors when running a plugin.

(Developers) A new report module which provides a quick-and-easy way of generating simple PDF reports.

Source code has already been updated; binaries will follow shortly. I’ll have more to say on the report module in a later post.

Development on TRI’s nondestructive evaluation data analysis software NDIToolbox has slowed of late as we’ve gotten closer to our goal for functionality and as we get ready to do an honest-to-goodness field test later this year on a QA line. Nevertheless I’m still plugging away at it whenever I get the chance, and today I’ve got the latest and greatest available with two new features: support for multiple datasets in Winspect data files and a new “batch mode.”

The batch mode feature lets you run an NDIToolbox plugin on a set of input files, optionally spawning multiple processes to speed things up. If you have a ton of data files and you’re doing the same number crunching over and over, just point NDIToolbox to the files and the plugin and let it do it for you. You don’t have to convert your data files to HDF5 before using batch mode; as long as the file formats are supported by NDIToolbox it’ll fetch the data and run the plugin automatically. More info on batch mode is available here from my mirror of the NDIToolbox docs. If you’re going to use batch mode’s multiprocessing, be sure to read up on the requirements (basically, don’t have really huge data files).

As usual, I’d recommend using the conventional Python version of NDIToolbox if you can. If you’re on Windows and don’t want to install Python (or you want to run from a thumb drive), the Downloads section of NDIToolbox’s Bitbucket page has a Windows installer and a compiled version available, no Python required.

If you’re writing a plugin there’s one additional step required to support the new batch mode. Since more than a few nondestructive testing system file formats like UTWin’s CSC or WinSpect’s SDT can have multiple datasets in a single file, batch mode will send your plugin a dict of all the datasets it finds in a given input file. So you’ll need a bit of code to see if you’ve been passed a single dataset (conventional user interface) or a container full of datasets (batch mode). There are a few ways to do this, but one of the most straightforward is to look for a “keys” attribute.
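A minimal sketch of that check is below. The function name apply_to_data and the func argument are hypothetical, made up for illustration, and are not part of the NDIToolbox plugin API; the only idea taken from the paragraph above is testing for a “keys” attribute.

```python
def apply_to_data(data, func):
    """Apply func to one dataset, or to every dataset in a container.

    Hypothetical helper for illustration; not part of NDIToolbox itself.
    """
    if hasattr(data, "keys"):
        # Batch mode: an associative container of datasets keyed by name
        return {name: func(data[name]) for name in data.keys()}
    # Conventional user interface: a single dataset
    return func(data)
```

Inside a plugin you would call something like this from wherever your number crunching happens, passing in whatever data the plugin was handed and the function that does the actual work.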

You could also just check to see if you were passed an actual dict, courtesy isinstance(). I’d recommend against doing that for now though – better to just assume it’s an associative container of some sort rather than hard-wiring an expectation of an actual dict.

I haven’t had the chance to do much NDIToolbox work in the past month or so while I’ve been working on another project in the lab – it did involve lasers and a chance to play with C++ after many years’ absence so I’m not complaining. I did just push out an update this week that might be of interest if you’ve been running into memory problems. Hopefully this version’s a little more thoughtful when it comes to releasing memory it no longer needs.

Also in this version, I’ve added preliminary support for ultrasonic gate functions in the MegaPlot presentation. The functionality’s always been there but I’ve had it disabled until now while I was working out how to apply gates to three-dimensional data; I’m not 100% satisfied with the implementation but thought I’d enable it and come back to it later.

Update Wed Dec 26 12:23:44 CST 2012: managed to sneak some more work in on the project before the end of the year. It’s not in the documentation yet, but I’ve added exporting slices of a data file. Handy if you’re only interested in a subset of a much larger data file.

Finally, if you’ve ever wanted to just see screenshots and read about all of this NDIToolbox stuff instead of having to download everything, I’ve put a mirror of the current documentation up on the site. Have a look at the Quick Start for a primer on what NDIToolbox does, and the Plugins page to find out about…plugins. Developers might also be interested in how to write plugins. Sample plugin code is available that demonstrates how to write a server-based plugin, and how to combine Python with Java or C++.

NDIToolbox has been able to generate B-scans from ultrasonic data for a while now, but you had to know how to take slices of data. I just added a switch in the Megaplot presentation that will now do it for you automatically. Here’s what I’m talking about:

For comparison, here’s the usual Megaplot presentation:

Just check/uncheck Plot Conventional B-scans in the Plot menu to switch back and forth.

Also available for testing is a new NDIToolbox installer for the Windows binary distribution. It has been tested under Windows 7 and 8 and is available from the NDIToolbox Downloads page. As always I recommend downloading the Python version rather than the precompiled binary since it’s so much easier to keep up to date, but it might come in handy if you don’t want to install a bunch of dependencies and just want to get started right away.

While profiling NDIToolbox’s memory usage, I sometimes see that it isn’t freeing memory when I think it could. After a bit of searching I’ve discovered that this can sometimes happen when you’re using wxPython because of the connection to wxWidgets (apparently the link to C++ complicates garbage collection).

Something I’ve been toying with that seems to work fairly well so far is forcing collection when a wxPython window is destroyed:
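A minimal sketch of the idea, assuming the handler is bound to wxPython’s EVT_WINDOW_DESTROY event. The function names are made up for illustration, and this is not necessarily how NDIToolbox wires it up.

```python
import gc


def collect_on_destroy(event):
    """Force a garbage collection pass when a window is destroyed."""
    unreachable = gc.collect()  # number of unreachable objects found
    event.Skip()  # let wxPython carry on with its normal destroy handling
    return unreachable


def bind_collect_on_destroy(window):
    """Attach the handler to a wx.Window (hypothetical helper)."""
    import wx  # imported here so the handler itself doesn't need wxPython

    window.Bind(wx.EVT_WINDOW_DESTROY, collect_on_destroy)
```

Calling bind_collect_on_destroy(frame) once after creating a frame is enough. Whether the forced collection actually frees anything depends on what is still holding references to your data.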

Another update to NDIToolbox today: I’ve just added the ability to import data from a couple of ultrasonic NDT systems. These imports are still a little flaky because a) we haven’t finalized the HDF5 format we’ll be using in NDIToolbox and b) proprietary binary file formats being what they are, but for what it’s worth I’ve used them on the data I could get my hands on from some immersion tank scans done here @ TRI World HQ and elsewhere and they will at least let you display your data, so it’s a start. Hopefully I’ll be able to improve their functionality and add a few more importers as the project goes on.

Other recent but decidedly less interesting changes:

Support for manual garbage collection – if you’re playing with large data and you get warnings about being out of memory, you can opt to clear some out to keep working. I’ll be implementing HDF5 slicing at some point so this is a temporary work-around.

Fixed a bug in plugins: plugin support folders can now contain Python modules.

All data retrieval functions now in a separate module (models/datio.py) so your code can use them directly.

As always, the source code is up on Bitbucket and a Windows binary is available as well. These changes will also find their way into NDIToolbox Labs – we’re still plugging away on integrating the Automated Data Analysis (ADA) Toolkit into Labs but making progress.

Today we forked NDIToolbox into a new project, NDIToolbox Labs. We’ll be using Labs for experimental features before they go into the main NDIToolbox repository. Read on for more background.

One of my main responsibilities at TRI is working with Subject Matter Experts (SMEs), basically gurus in one particular field or another. A big part of my job is in helping SMEs write code from scratch or port it from MATLAB, R, C/C++, etc. to Python. The SME works out an algorithm, I code it up in Python, rinse and repeat.

I started the Labs fork because in many cases the SMEs aren’t familiar with Python or unit testing, but I didn’t want to slow their efforts down by insisting on tests for inclusion in the main NDIToolbox repository. Labs will be the unstable branch of NDIToolbox – stuff might break but it’s where all the cool new features will be. Once the SME is more or less satisfied with how their code is working in Labs, I’ll add the requisite tests and whatnot and port it to NDIToolbox stable.

One of the first new additions will be “Automated Defect Analysis,” a suite of code designed to read data and automatically locate anomalies in the sensor data. Instead of having an inspector scroll through 100 miles of pipeline inspection data for example, you’d let ADA read the data on its own and let it present you with a report of where it thinks cracks and pits were found.

Update Fri Aug 31 13:18:28 CDT 2012: Computational Tools’ first alpha of the ADA Toolkit has been added to NDIToolbox Labs. Although it is functional and can run ADA Models (specialized NDIToolbox plugins), it’s still in the early stages of development. I’ve also put together a Windows binary if you’d like to check it out and don’t have Python installed. Be sure also to download the ADA Model ZIP from the same page to see ADA Toolkit put through its paces, or copy the URL to the clipboard and download/install from ADA Toolkit itself.

Next up will probably be some Probability Of Detection (POD) models that you’d use to simulate an inspection to find out if the inspection would actually be able to detect the anomalies you’re interested in finding. Going back to the pipeline inspection example, a POD model might tell you whether a lower resolution scan might suffice to find damage, saving you time and money in the inspection.

So to summarize: if you don’t need the latest and greatest, I’d recommend sticking with NDIToolbox stable. If you need the latest and greatest, try NDIToolbox Labs but expect bugs.

Courtesy PyInstaller, I’ve made an NDIToolbox binary distribution for Windows users who’d just as soon not install Python and friends just to try out one program. Download nditoolbox_windows_binary.zip, extract to a folder of your choice, and run nditoolbox.exe from the nditoolbox folder. Also included are a few sample data files to play with if you don’t have anything handy, taken from honest-to-goodness ultrasonic scans. The binary release is intended to be a demonstration of NDIToolbox and will probably lag behind development, so if you like NDIToolbox and plan on using it I’d recommend installing Python and taking it from there.

For any fellow PyInstaller users out there, if you have any problems getting h5py to work, just add hook-h5py.py to your hooks folder. It should already be in recent PyInstaller builds, but if you’re using an older release like I am (1.5.1) you can just copy it over manually.

In other NDIToolbox news, this release is the first to come with some early documentation I’ve been working on, including a quick-start guide and a few words about running and writing NDIToolbox plugins. The documentation is available from NDIToolbox’s Help menu, or under the docs folder in NDIToolbox’s folder.

Finally, we’ve more or less completed our project name change from “a7117” (the internal TRI project number) to “nditoolbox” so default folder locations, etc. have been updated. Your old settings and files aren’t deleted, but NDIToolbox is now looking for nditoolbox.cfg rather than a7117.cfg so if you’re happy with your current setup just rename a7117.cfg to nditoolbox.cfg and you’re set.

Now that the new “megaplot” for plotting three-dimensional nondestructive evaluation data is included in NDIToolbox, I thought I’d better throw a couple of pictures up to explain how NDIToolbox treats the data. If you have a three-dimensional NumPy array, the first index is the row, second is column, and the third index is…well, I’m not quite sure what you’d call it, but it’s the third dimension of the array. Picture your 3D array as a stack of two-dimensional arrays: the third index picks out which 2D array you’re looking at. So if your NumPy array was three 5×4 arrays for example:

(Apologies for the figures, I threw them together fairly quickly for this post.)

In NDIToolbox, three-dimensional data are considered to be x along columns, y along rows, and z along…whatever we’re calling the third index (depth?). A C-scan is a 2D array; in this initial release it’s simply the slice of the data at a single z index. In working with nondestructive testing data it’s fairly common to take a subset of the depth and apply a function (e.g. max, min, peak-to-peak) to return a 2D array for your C-scan. The data handling code behind NDIToolbox has this ability already; I just need to wire it up in the “megaplot.” When you open a “megaplot” on a 3D dataset, NDIToolbox will load the data from the HDF5 file and plot the C-scan at z=0. If you change the Slice Index spin control value, the plot will refresh with the C-scan at the new index value you’ve selected.

If you change the X Position or Y Position spin controls, or click on a position in the C-scan, NDIToolbox will plot the A and B scans for the given (x0, y0, z0) position. The vertical B-scan is a slice of the C-scan data given by (x=x0, y, z=z0). The horizontal B-scan is a slice of the C-scan data given by (x, y=y0, z=z0). The A-scan is a slice through the data given by (x=x0, y=y0, z). Going back to our original picture of a three-dimensional array, this is how the scans are found:
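In NumPy terms, those slices can be sketched roughly as follows. This assumes the (row=y, column=x, depth=z) layout described above; the function names and the toy 5×4×3 array are made up for illustration and are not NDIToolbox’s own code.

```python
import numpy as np

# Toy dataset: three 5x4 arrays, indexed as data[y, x, z]
data = np.arange(5 * 4 * 3).reshape(5, 4, 3)


def c_scan(data, z0):
    """2D slice of the data at depth index z0."""
    return data[:, :, z0]


def a_scan(data, x0, y0):
    """1D slice through the depth at position (x0, y0)."""
    return data[y0, x0, :]


def vertical_b_scan(data, x0, z0):
    """Slice of the C-scan at z0 along y, holding x = x0."""
    return data[:, x0, z0]


def horizontal_b_scan(data, y0, z0):
    """Slice of the C-scan at z0 along x, holding y = y0."""
    return data[y0, :, z0]
```

For the toy array, c_scan(data, 0) is a 5×4 array, a_scan(data, 1, 2) has three points (one per depth index), and the vertical and horizontal B-scans have five and four points respectively.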

While I was tinkering with plotting 3D NDE data, I put together an animation showing all the C-scans in the dataset I’ve been using in these posts. I’m not sure if there’s any analytic value in it but it was interesting to watch the ultrasonic wave as it interacted with the sample’s defects. If you’re new to ultrasonic NDE data it might help you visualize how defects at different depths appear and disappear depending on where your C-scan is from. Enjoy!

Just released this week is early support for slicing 3D data. Now when you attempt to plot an image plot of a 3D dataset, NDIToolbox will ask you what z index to use. Same goes for previewing 3D datasets as well.

What’s more interesting is what’s coming up: better support for a few common plot types in Nondestructive Evaluation (NDE). That last link gives a good explanation of the various types of scan but for those of us that didn’t start out in ultrasonic NDE here’s the abridged version.

A scans are waveform plots, voltage vs. time. In a 3D dataset an A scan is the data at a given position (x, y).

C scans are plots of 2D planes taken in the z plane of a 3D dataset. In the simplest case they’re just a slice of data at a given z (i.e. depth into structure) position, e.g. the voltage, time pairs for every point (x, y) in the plane z=200. In ultrasonics you’ll frequently see C scans created from some feature in the 2D array of A scans, e.g. maximum amplitude for all A scans between z=200 and z=250. If you’re conducting a raster scan of a part and recording voltages at regular (x, y) positions, you’re basically creating a C scan.
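As a rough sketch of that kind of feature-based C scan, assuming a NumPy array indexed as data[y, x, z]: the function name gated_c_scan and the random sample data are hypothetical, and the z range is treated as half-open (z_stop itself is excluded).

```python
import numpy as np


def gated_c_scan(data, z_start, z_stop, func=np.max):
    """Reduce each A scan between z_start and z_stop to a single value.

    func is any NumPy reduction that accepts an axis argument,
    e.g. np.max, np.min or np.ptp (peak-to-peak).
    """
    gate = data[:, :, z_start:z_stop]  # subset of the depth
    return func(gate, axis=2)          # one value per (y, x) position


# Made-up sample data: a 5x4 grid of positions, 300 points per A scan
rng = np.random.default_rng(0)
sample = rng.normal(size=(5, 4, 300))

max_amplitude = gated_c_scan(sample, 200, 250)
peak_to_peak = gated_c_scan(sample, 200, 250, func=np.ptp)
```

Both results are 2D arrays with one value per (x, y) position, so they can be plotted the same way as a single-slice C scan.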

B scans are slices through the C scan dataset (at least in our presentation). We’re using the cursor position for our B scans, generating the horizontal and vertical slices through the current C scan dataset.

NDIToolbox already handles A scan and C scan plots but this week I’m working on a four-panel plot that does A, B, and C scan plots of a 3D dataset. Here’s a screenshot of the new type of plot I’m adding:

This is an actual ultrasonic scan of an NDE standard (basically a block with a regular pattern of known flaws at various depths) we made in the lab; when you click on a position in the C scan it will retrieve the A and B scans for that position and update the other plots. You can browse through the 3D dataset by choosing your slice index at the top to update the C scan layer. In this early version you’re limited to viewing a single slice in Z, but the data handler underneath supports running functions against a subset of data in Z, for example computing the peak-to-peak value between Z=220 and Z=230 and returning that as your C scan data. I’m hoping to have that implemented in the near future.

If you’re not a fan of the four-panel plot, no problem. The aforementioned data handler is a simple Python utility class that when initialized with a 3D NumPy array will return A, B, and C scans on request. So you can slice, dice, and present your 3D NDT data as you see fit.

So far this new plot is fairly responsive; on my machine I’m able to comfortably handle 2 billion data points or around 500MB worth of data in memory without any lag. I mentioned it’s freely available, right? Go get your copy.

Update Mon May 14 14:29:23 CDT 2012: just pushed out the “megaplot” to the Bitbucket repo. There’s still more work to be done with plugins and exporting data subsets but the basic plot functionality is ready to go. Here’s another slice through the same ultrasonic data (Linux Mint 12), this time using a different colormap to better highlight the flaws in the test article.