At home, I occasionally need to print color posters and black and white flyers. I’ve found FedEx Office Online Printing Service to be very convenient for this (if you know exactly what you want… more on that later). After you upload your file in one of the supported formats (Word, PowerPoint, Excel, RTF, PostScript, PDF, Text, JPG), select a paper size, and set print options (color/black and white, copies, collation, paper stock, etc.), the web site gives you a preview of the final print output.

For my last print job, the file I wanted to print was in CorelDRAW format, so all I had to do was go into CorelDRAW and export it to PostScript format. The final print output looked perfect to me.

There is a "minor" problem with the service however: there are no prices to be found on the site anywhere. No, the printing is not free, sorry. You do eventually see the total price when you check out. The only reason I can think of for this strange "price hiding" practice is so that people can’t easily compare online prices vs walk-in prices. They obviously have complete pricing data in the system, because the site does give you a total at checkout. This lack of up-front pricing is a major hassle, especially if you are not sure which options you want (type of paper, etc). You can’t easily/quickly compare the different printing options (and there are tons of them). Changing your order and going through the checkout process just to see the price is too cumbersome.

One has to ask, what were they thinking??? I certainly hope this is not a trend among online stores. And don’t you hate it when you google something (such as "FedEx Kinko’s prices") and the first thing you find is other people also looking for the same info and not finding any :-).

Against popular wisdom, I decided to upgrade my bedroom home theater PC to Vista 64-bit a couple of days ago (just have to make full use of all of my precious 4GB of RAM). Everything is working surprisingly well so far, with the exception of sound! Whenever I play any audio, my speakers now produce all kinds of pops and crackles along with the normal audio stream. Urgg.

After a couple of days of googling, tweaking various sound settings, uninstalling/reinstalling drivers, etc. without success, I almost gave up on the thing. Then I decided to try just one more thing: changing the default format to “2 channel, 16 bit, 44100 Hz (CD Quality)” from the default “2 channel, 24 bit, 48000 Hz”. Just like magic, the pops and crackles are gone.

Nice surprise logging into Gmail just now: Themes! I tried out a few and have to say they are nice to look at. I almost forgot I have actual emails to read. I’m going to try out a cheery theme to offset the grim news from Wall Street.

There’s even a Terminal theme for UNIX shell diehards. It’s actually kind of cool… if only for a few minutes.

Recently I needed a way to find blank images among a large batch of images. I had tens of thousands of images to work with, so I came up with this C# function to tell me whether an image is blank.

The basic idea behind this function is that blank images will have highly uniform pixel values throughout the whole image. To measure the degree of uniformity (or variability), the function calculates the standard deviation of all pixel values. An image is determined to be blank if the standard deviation falls below a certain threshold.

Here’s the code. In order to compile, the project in which this code resides must have “Allow Unsafe Code” checked.
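To make the standard deviation idea concrete, here is a minimal sketch rather than the original function. It assumes the pixel values have already been copied into a byte array (for example, one grayscale byte per pixel). The class name BlankImageDetector, the method names, and the threshold parameter are all made up for illustration, and because it works on a plain managed array it does not need unsafe code.

```csharp
using System;

// Illustrative sketch only: names and the threshold parameter are assumptions.
public static class BlankImageDetector
{
    // Returns the population standard deviation of the given pixel values.
    public static double StandardDeviation(byte[] pixels)
    {
        if (pixels == null || pixels.Length == 0)
            throw new ArgumentException("At least one pixel value is required.", "pixels");

        double sum = 0;
        double sumOfSquares = 0;
        foreach (byte p in pixels)
        {
            sum += p;
            sumOfSquares += (double)p * p;
        }

        double mean = sum / pixels.Length;
        double variance = sumOfSquares / pixels.Length - mean * mean;

        // Guard against a tiny negative value caused by floating point rounding.
        return Math.Sqrt(Math.Max(variance, 0));
    }

    // An image counts as blank when its pixel values are nearly uniform.
    public static bool IsBlank(byte[] pixels, double threshold)
    {
        return StandardDeviation(pixels) < threshold;
    }
}
```

A solid white page gives a standard deviation of 0, while real content pushes the value well above that. The right threshold depends on how noisy the images are, so it is worth tuning on a handful of known blank and non-blank samples.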

Ever since Windows 2000, menu keyboard shortcut characters are not underlined by default. According to Microsoft, the underlined letters are hidden until you press the Alt key. Let’s try that… First, use the mouse to click on the Help menu in Visual Studio:

Now, press Alt to show the underlined letters, right? Poof, the menu is gone. Ok, that’s an easy one. I’m sure everyone has figured out that the Alt key must be pressed before you access the menu. But can anyone tell me this? How do I show underlined letters for right-click/context menus with the Alt key? Well, the short answer is you can’t! If you don’t believe me, try it yourself. I’ve tried Alt+right-click, Alt then right click, right click then Alt, etc. Nothing works.

The only thing I’ve found to work is the Application key (this is the key with the image of a mouse pointer on a menu, between Alt and Ctrl). Interestingly, the Application key will always show underlined letters regardless of the “hide underlined letters” settings. The keyboard combination Shift-F10 also brings up the context menu, however that keyboard shortcut does not show underlined letters.

You can forget about all of this nonsense and have Windows always show the underlined letters by changing a setting (instructions below are for Windows XP):

Open the Display Control Panel.

Click on the Appearance tab, then Effects…

Uncheck “Hide underlined letters for keyboard navigation until I press the Alt key”.

If you do any web scraping (also known as web data mining, extracting, harvesting), you are probably familiar with the main steps: navigate to page, retrieve HTML, parse HTML, extract desired elements, repeat. I’ve found the SgmlReader library to be very useful for this purpose. SgmlReader turns your HTML into XML. Once you have the XML, it’s fairly easy to use built-in classes such as XmlDocument, XmlTextReader, XPathNavigator to parse and extract the data you want.

Now to the labor intensive part: before your program can make sense of the XML, you have to manually analyze the HTML/XML first. Your program won’t know jack about how to extract that stock price until you tell it exactly where the stock price is, typically in the form of an XPath expression. My process of getting that XPath expression goes something like this:

Step 2b is where it gets very labor intensive and boring, especially for a big web page with many levels of nesting. Visual Studio 2005 XML Editor/Resharper have a couple of features that I find useful for this:

– With Resharper, you can press Ctrl-[ to go to the start of the current element, or if you are already at the start, go to the parent element.

Even with the above tools, it’s still a painful and error-prone exercise. Luckily for us, Firebug has the perfect feature for this: Copy XPath. To use it, open your HTML/XML document, open the Firebug pane (Tools/Firebug/Open Firebug), navigate to the desired element, right click on it and choose “Copy XPath”.

You should now have this XPath expression in the clipboard, ready to be pasted into your web scraper application: “/html/body/div[2]/table/tr/td[2]/table”.

A feature that I would love to have is the ability to generate an alternate XPath expression using “id” predicates, such as this: “//table[@id='searchResultTable']”. With web pages that are not under your control, you want to minimize the chance that changes on the pages impact your code. Absolute XPath expressions are vulnerable to any change on the page that alters the order and/or nesting of elements. On the other hand, XPath expressions using an “id” predicate are less likely to be impacted by layout changes because in HTML, element IDs are supposed to be unique. No matter where your element is on the page, if it has the same ID, you should still be able to get to it by looking up the ID. Hmm… this sounds like a good idea for a Visual Studio Add-in.
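Here is a small sketch of why the id-based form holds up better. It uses only XmlDocument from System.Xml on two made-up, well-formed snippets. The markup, the XPathDemo class and the FirstMatchText helper are illustrative assumptions, not taken from a real page.

```csharp
using System;
using System.Xml;

// Illustrative sketch: compares an absolute XPath with an id-based XPath.
public static class XPathDemo
{
    public const string AbsolutePath = "/html/body/div[2]/table/tr/td";
    public const string ByIdPath = "//table[@id='searchResultTable']/tr/td";

    // Hypothetical page as originally analyzed.
    public const string PageBefore =
        "<html><body><div>nav</div><div>" +
        "<table id='searchResultTable'><tr><td>42.10</td></tr></table>" +
        "</div></body></html>";

    // Same page after a layout change: a banner div was added at the top.
    public const string PageAfter =
        "<html><body><div>banner</div><div>nav</div><div>" +
        "<table id='searchResultTable'><tr><td>42.10</td></tr></table>" +
        "</div></body></html>";

    // Returns the text of the first node matching the XPath, or null if none.
    public static string FirstMatchText(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

Both expressions find the cell in PageBefore. In PageAfter, the absolute path lands on the wrong div and finds nothing, while the id-based path still returns the cell.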

If you are a subscriber to my blog, you may have noticed that I have not been posting any more “Finds of the Week” in the last 2 months. Well, I was a little busy with the month-long Euro 2008 tournament in June, plus a couple of new games (Crysis and Medieval Total War II). Finally the Olympics in August finished me off.

I am going to turn this series into a periodic (as in longer than weekly :-)) Interesting Finds series from now on.

Oh, if you want to know… Crysis is ok. Very good graphics and requires a hot rod box but gameplay is just ok. I am more into realistic squad-based shooters. Medieval 2 is very addictive.

.NET, C#

I can’t believe I didn’t know about ThreadStaticAttribute. While searching for more information on it, I ran across this interesting article on MSDN Magazine: Scope<T> and More (Stephen Toub) that talked about the use of ThreadStatic in System.Transactions.

Wouldn’t it be nice if we could do something like this in our applications?

// Wrap a file copy and a database insert in the same transaction
TxFileManager fileMgr = new TxFileManager();
using (TransactionScope scope1 = new TransactionScope())
{
    // Copy a file
    fileMgr.CopyFile(srcFileName, destFileName);
    // Insert a database record
    dbMgr.ExecuteNonQuery(insertSql);
    scope1.Complete();
}

With the rich support currently available for transactional programming, one may find it rather surprising that the most basic type of program operation, file manipulation (copy file, move file, delete file, write to file, etc.), is typically not transactional in today’s applications.

I am sure the main reason for this situation is lack of support for transactions in the underlying file systems. While Microsoft is bringing us Transactional NTFS (TxF) in Vista and Windows Server 2008, most corporate IT applications are still deployed to Windows 2003 or earlier. While I can’t wait to be able to use TxF, I have applications that have to be completed today!

While searching for a solution, I came across several articles describing the use of IEnlistmentNotification to implement your own resource manager and participate in a System.Transactions.Transaction. However, a complete working code example was nowhere to be found. Well, I guess it’s my turn to contribute. I hereby present to you: Chinh Do’s Transactional File Manager.

Here are my basic requirements for a Transactional File Manager:

Works with .NET 2.0’s System.Transactions.

Ability to wrap the following file operations in a transaction:

Creating a file.

Deleting a file.

Copying a file.

Moving a file.

Writing data to a file.

Appending data to a file.

Creating a directory.

Ability to take a snapshot of a file (and restore it to the snapshot state later if required). The snapshot feature allows the inclusion of 3rd-party file operations in your transaction.

Thread-safe.

IEnlistmentNotification and ThreadStatic Attribute

Implementing IEnlistmentNotification is harder than it looks… at least for me it was. It’s not enough to just store a list of file operations. Because transactions can be nested and started from different threads, when rolling back we have to make sure to only include the correct operations for the current Transaction. At first glance, it looks like we should be able to use the LocalIdentifier property (Transaction.TransactionInformation.LocalIdentifier) to identify the current transaction. However, further investigation reveals that Transaction.Current is not available in our various IEnlistmentNotification methods.

As it turned out, the little known but very cool ThreadStatic attribute fits the bill very well. Since the scope of a TransactionScope spans all operations on the same thread inside the TransactionScope block (excluding nested, new Transactions), ThreadStatic gives us an easy way to track that data.
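If ThreadStatic is new to you too, here is a minimal sketch of what it does. The class and method names are hypothetical and this is not code from TxFileManager. It only shows that a [ThreadStatic] field holds a separate value on each thread.

```csharp
using System;
using System.Threading;

// Illustrative sketch: a per-thread counter backed by a [ThreadStatic] field.
public static class ThreadStaticDemo
{
    // Each thread sees its own copy of this field, starting at the default value 0.
    [ThreadStatic]
    private static int _operationCount;

    // Increments and returns the counter of the calling thread.
    public static int RecordOperation()
    {
        return ++_operationCount;
    }

    // Records the given number of operations on a brand new thread and
    // returns the counter value that thread ended up with.
    public static int CountOnNewThread(int operations)
    {
        int result = 0;
        Thread worker = new Thread(() =>
        {
            for (int i = 0; i < operations; i++)
            {
                result = RecordOperation();
            }
        });
        worker.Start();
        worker.Join();
        return result;
    }
}
```

Operations recorded on the worker thread do not show up in the calling thread's counter, which is the property that makes the attribute handy for tracking per-thread transaction data.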

In the initial version of my Transactional File Manager class (TxFileManager), I made the mistake of trying to implement IEnlistmentNotification in the main TxFileManager class. I had all kinds of difficulty trying to sort out different transactions/threads. Once I started to split the IEnlistmentNotification implementation into its own nested class (TxParticipant), everything became much cleaner. In the main class, all I have to do is maintain a Dictionary<T, T> of TxEnlistment objects, which implement IEnlistmentNotification. Each TxEnlistment object is responsible for handling a separate Transaction. Once that was in place, everything else was pretty much a walk in the park.

IEnlistmentNotification.Commit

Since my Resource Manager always performs operations immediately, there is really nothing to commit, except to clean up temporary files:

IEnlistmentNotification.Rollback

Rolling back is a little bit more complicated. To ensure consistency, we must roll back operations in reverse order.

Another gotcha I ran into is that Rollback is often (if not all the time) called from a different thread from the Transaction thread. Any unhandled exception that occurs in Rollback will cause an AppDomain.CurrentDomain.UnhandledException. To “handle” an UnhandledException, you can either set IgnoreExceptionsInRollback = True or implement an UnhandledExceptionEventHandler.
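The commit and rollback behavior described above can be sketched roughly like this. It is a simplified illustration, not the actual TxFileManager code: UndoParticipant, AddOperation and AddCleanup are hypothetical names, undo steps are modeled as plain Action delegates, and the IgnoreExceptionsInRollback property here only mimics the setting mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical participant: operations run immediately and register an undo
// step; cleanup steps stand in for deleting backup and temporary files.
public sealed class UndoParticipant : IEnlistmentNotification
{
    private readonly List<Action> _undoSteps = new List<Action>();
    private readonly List<Action> _cleanupSteps = new List<Action>();

    // Assumed semantics: when true, failures in undo steps are swallowed.
    public bool IgnoreExceptionsInRollback { get; set; }

    public void AddOperation(Action undo) { _undoSteps.Add(undo); }
    public void AddCleanup(Action cleanup) { _cleanupSteps.Add(cleanup); }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        // The work has already been done, so simply vote to commit.
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        // Nothing to commit; just clean up.
        RunCleanup();
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        // Undo in reverse order so the most recent operation is reverted first.
        for (int i = _undoSteps.Count - 1; i >= 0; i--)
        {
            try
            {
                _undoSteps[i]();
            }
            catch (Exception)
            {
                // Rethrowing may surface as an unhandled exception on another thread.
                if (!IgnoreExceptionsInRollback) throw;
            }
        }
        RunCleanup();
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        enlistment.Done();
    }

    private void RunCleanup()
    {
        foreach (Action cleanup in _cleanupSteps) cleanup();
    }
}
```

To try it, enlist an instance with Transaction.Current.EnlistVolatile(participant, EnlistmentOptions.None) inside a TransactionScope. Leaving the scope without calling Complete() triggers Rollback, and calling Complete() leads to Prepare followed by Commit.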

Test Driven Development/Unit Testing

What does TDD have to do with this? It just happens that if you do Test Driven Development, Transactional File Manager can make testing classes that perform file operations much more convenient. In conjunction with a mocking framework such as Rhino Mocks, you can easily test the class functionality without having to read/write to actual files.
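As a rough illustration of that testing style, here is a sketch that uses a hand-written fake instead of a mocking framework. Everything in it is hypothetical: ITextFileWriter stands in for the single file manager member the example needs, and ReportExporter is a made-up class under test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the one file manager member this example needs.
public interface ITextFileWriter
{
    void WriteAllText(string path, string contents);
}

// Made-up class under test: it depends on the interface, not on System.IO.File.
public sealed class ReportExporter
{
    private readonly ITextFileWriter _files;

    public ReportExporter(ITextFileWriter files)
    {
        _files = files;
    }

    // Joins the lines and writes them to the given path through the interface.
    public void Export(string path, IEnumerable<string> lines)
    {
        _files.WriteAllText(path, string.Join(Environment.NewLine, lines));
    }
}

// Hand-written fake that records writes in memory instead of touching the disk.
public sealed class FakeFileWriter : ITextFileWriter
{
    public readonly Dictionary<string, string> Written = new Dictionary<string, string>();

    public void WriteAllText(string path, string contents)
    {
        Written[path] = contents;
    }
}
```

A unit test can pass a FakeFileWriter to ReportExporter, call Export, and then assert on the Written dictionary. A mocking framework removes the need to write the fake by hand, but the shape of the test stays the same.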

Shortcomings

Here are the known shortcomings of my Transactional File Manager:

Other processes and transactions can see pending changes. This effectively makes the Transaction Isolation Level “Read Uncommitted”. This is actually advantageous because it allows external code to participate in our transactions. Without the ability for external code to see “dirty data”, our Transactional File Manager would only be useful in the most narrow of scenarios.

There is a performance penalty due to the need to make backups of files involved in the transaction (this is common to all transaction managers). If your process involves working with very large files then using Transactional File Manager may not be practical. In general, transactions should be kept to small and manageable units of work anyway.

Only volatile enlistment supported. If the app crashes or is killed, your transaction will be stuck half-way (perhaps durable enlistment will be added in a future version.)

Example 1

// Completely unrealistic example showing how various file operations, including operations done
// by library/3rd party code, can participate in transactions.
IFileManager fileManager = new TxFileManager();
using (TransactionScope scope1 = new TransactionScope())
{
    fileManager.WriteAllText(inFileName, xml);
    // Snapshot allows any file operation to be part of our transaction.
    // All we need to know is the file name.
    XslCompiledTransform xsl = new XslCompiledTransform(true);
    xsl.Load(uri);
    // The statement below tells the TxFileManager to remember the state of this file.
    // So even though XslCompiledTransform has no knowledge of our TxFileManager, the file it creates (outFileName)
    // will still be restored to this state in the event of a rollback.
    fileManager.Snapshot(outFileName);
    xsl.Transform(inFileName, outFileName);
    // write to database 1
    myDb1.ExecuteNonQuery(sql1);
    // write to database 2. The transaction is promoted to a distributed transaction here.
    myDb2.ExecuteNonQuery(sql2);
    // let's delete some files
    foreach (string fileName in filesToDelete)
    {
        fileManager.Delete(fileName);
    }
    // Just for kicks, let's start a new transaction.
    // Note that we can still use the same fileManager instance. It knows how to sort things out correctly.
    using (TransactionScope scope2 = new TransactionScope(TransactionScopeOption.RequiresNew))
    {
        fileManager.MoveFile(anotherFile, anotherFileDest);
    }
    // move some files
    foreach (string fileName in filesToMove)
    {
        fileManager.Move(fileName, GetNewFileName(fileName));
    }
    // Finally, let's create a few temporary files...
    // disk space has to be used for something.
    // The nice thing about FileManager.GetTempFileName is that
    // the temp file will be cleaned up automatically for you when the TransactionScope completes.
    // No more worries about temp files that get left behind.
    for (int i = 0; i < 10; i++)
    {
        fileManager.WriteAllText(fileManager.GetTempFileName(), "testing 1 2");
    }
    scope1.Complete();
    // In the event an exception occurs, everything done here will be rolled back including the output xsl file.
}