Managing Low-Level Keyboard Hooks with the Windows API for VB .NET

I am amazed at the overwhelming and disproportionately high number of email responses I get about hooking the keyboard. Many people in a diverse group of industries have legitimate reasons for wanting to block certain key combinations. Last November, I wrote about low-level keyboard hooks for VB6 (see Managing Low-Level Keyboard Hooks with the Windows API, November 18, 2002 in codeguru.com's VB Today.) In response to queries from many of you, I have revised the keyboard hooks example for VB .NET.

The inimitable Robbie Powell read my earlier article on keyboard trapping and wanted to use the code. Mr. Powell was tasked with trapping specific key combinations for a testing application. The basic idea is that testees should not be distracted during a test. By eliminating the ability to open an application other than the test application, their attention was more ably focused. Considering the nature and importance of the candidate's future endeavors, a bit of tunnel vision during testing was warranted. Unfortunately, the code from the November article does not port directly from VB6 to VB.NET. Mr. Powell did a superlative job porting the code but something still didn't work quite right. Together we figured out the differences, which are provided here.

Permit me to rehash some of the material in the November article for those who did not have an opportunity to read that article. If you have read the November article and just need to fill in the blanks, I encourage you to skip ahead to the Implementing the Keyboard Delegate section and The Complete Code Listing section. For a complete presentation, continue.

Writing API Declarations

The .NET Framework has tidily wrapped up much of the Windows API in methods that are significantly easier to use. Occasionally, however, you still need to turn to the Windows API directly. Trapping keystrokes from one application for all applications is a pretty low-level operation, and it is one of those occasions. Consequently, you will need to declare API methods.

VB.NET developers can use the old-style Declare syntax to import methods from a DLL. You can also use the new .NET attributes for declaring API methods, specifically the DllImportAttribute. The Declare keyword is shorthand notation that causes the compiler to add and use the DllImportAttribute. Simply keep in mind that you will need to use the DllImportAttribute if you are programming in a .NET language other than VB.NET. For our purposes, we will use the convenience notation.
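As a minimal sketch of the two notations, here is the same user32 entry point imported both ways. UnhookViaAttribute is a hypothetical name chosen only so that both forms can sit side by side, and the IntPtr handle type is an assumption made so the declaration holds in 32-bit and 64-bit processes.

```vb
Imports System
Imports System.Runtime.InteropServices

Module DeclareVersusDllImport
    ' Convenience notation: the compiler emits the DllImportAttribute for us.
    Public Declare Function UnhookWindowsHookEx Lib "user32" _
        (ByVal hHook As IntPtr) As Integer

    ' Explicit notation: the same entry point imported with the attribute.
    ' The body stays empty because the runtime supplies the implementation.
    ' UnhookViaAttribute is a hypothetical name used for illustration.
    <DllImport("user32.dll", EntryPoint:="UnhookWindowsHookEx")> _
    Public Function UnhookViaAttribute(ByVal hHook As IntPtr) As Integer
    End Function
End Module
```

Either form can be called like an ordinary VB.NET function once it is declared.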

To trap and examine keys before other applications get them, we need to hook the keyboard (and ultimately release the hook), call the old keyboard handler, and interpret key combinations. To accomplish this feat, we need to import SetWindowsHookEx, UnhookWindowsHookEx, CallNextHookEx, and GetAsyncKeyState. Perhaps you will understand the rationale a bit better with some background information. So, before we look at the syntactic mechanics of a declaration statement, let's take a quick historical journey.

Understanding Low-Level Hooks

It seems like just a few brief years ago that you couldn't write anything interesting without writing interrupt handlers. In very low memory, Basic Input/Output System (BIOS) code is loaded. This code provides the basic capabilities that your PC needs. (Assuming you are using a DOS-based PC. I imagine a Mac has something analogous to the BIOS for PCs...) These basic services are called interrupt handlers, and they are referred to by number. For example, interrupt 5 is the print screen interrupt. Interrupt 0x10 (hexadecimal) provides direct video input and output, interrupt 0x19 will reboot your computer, and interrupts 0x9 and 0x16 manage keyboard input. Pretty powerful stuff, these interrupt handlers.

Just a few years ago, one would have to write a custom interrupt handler and redirect the BIOS code to the new handler to replace the basic services. For example, prior to Windows, if code attempted to read from the A: drive and no diskette was in the drive, an application would hang. However, if the code provided an interrupt 0x24 handler, the error could be caught and new behavior provided. Supplanting basic BIOS behavior with new behavior is exactly what popup programs and TSR (Terminate and Stay Resident) utilities did all the time. Problematically, working at this level is an all-or-nothing proposition. Make a mistake and the whole PC crashed. These very low-level capabilities can still be accessed. For example, write asm int 3 end in Delphi and the debugger will stop, because interrupt 3 is a low-level breakpoint. However, because replacing basic system services can result in unreliable PC behavior, operating system engineers were motivated to shield programmers from these mistakes.

To aid in productivity, we work at a higher level of abstraction. Instead of writing an interrupt handler for interrupts 0x9 and 0x16 to handle keyboard input directly, we simply write an event handler for the KeyDown (or some related) event. However, you can still interact with the operating system at a much lower level of abstraction than the VB KeyDown event. Simply keep in mind that the lower you go, the more responsibility you have. Back to the present.

To trap keystrokes before other applications get them, we have to interact with the operating system somewhere between the BIOS' interrupt handler and the high-level KeyDown event. To trap all keys, we are closer to the BIOS, perhaps, than the KeyDown event. Consequently, care must be exercised.

Declaring API Methods

The convenience syntax for declaring an API method is very similar to the notation used in VB6. We need to use the Declare keyword, match the signature of the API method, indicate the library that contains the API method, and optionally, indicate the visibility. For example, to import the SetWindowsHookEx API method, we might write:

Public—Defines the visibility as Public. (Any code can call this method.)

Declare—The keyword that indicates that we are implicitly importing a library method

Function—The library method returns a value

SetWindowsHookEx—The name we'll use in our code

Lib "user32"—Specifies the library that contains the method. (You can find the physical API DLL by searching for user32.dll on your PC.)

Alias "SetWindowsHookExA"—Indicates the real name of the method in the DLL
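Put together, the pieces described above might look like the following sketch. The parameter names idHook, lpfn, and hmod come from this article; dwThreadId is the name the Windows documentation uses. The IntPtr types are an assumption made so that handles and pointer-sized values also fit in a 64-bit process. KeyboardHookDelegate and KBDLLHOOKSTRUCT are discussed later and appear here only so that the fragment compiles on its own.

```vb
Imports System
Imports System.Runtime.InteropServices

Module HookDeclarationSketch
    ' Key information the operating system hands to the callback.
    <StructLayout(LayoutKind.Sequential)> _
    Public Structure KBDLLHOOKSTRUCT
        Public vkCode As Integer
        Public scanCode As Integer
        Public flags As Integer
        Public time As Integer
        Public dwExtraInfo As IntPtr
    End Structure

    ' The callback type named in the declaration below.
    Public Delegate Function KeyboardHookDelegate( _
        ByVal Code As Integer, ByVal wParam As IntPtr, _
        ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr

    ' Public / Declare / Function / name / Lib / Alias, then the signature.
    Public Declare Function SetWindowsHookEx Lib "user32" _
        Alias "SetWindowsHookExA" ( _
        ByVal idHook As Integer, _
        ByVal lpfn As KeyboardHookDelegate, _
        ByVal hmod As IntPtr, _
        ByVal dwThreadId As Integer) As IntPtr
End Module
```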

The rest of the declaration defines the signature of the DLL method. If you look closely at the declaration, you will notice something suspicious: KeyboardHookDelegate. Delegates didn't exist prior to .NET, yet the declaration clearly uses something called KeyboardHookDelegate.

The API method does not use a delegate. The API method actually defines the lpfn argument as a pointer to a function, which on 32-bit Windows is a 32-bit integer. The CLR does an excellent job matching the needs of the API, a pointer to a function, with an analogous .NET entity, a delegate. Delegates are classes that contain function pointers; however, a delegate is a class that is much more than just the address of a function. A function pointer can be represented as a pointer-sized integer, so it is clear that some fudging is done for us to permit a delegate to be passed where only an integer is needed. The net benefit is that we can use more convenient .NET types where previously less convenient raw data types would have been used. Additional declarations are shown in The Complete Code Listing.

Implementing the Keyboard Delegate

To hook the keyboard, we are inserting our method into the address space for the existing low-level handler. This is what we did with interrupt handlers, and we still perform the same basic operation at a moderately higher level of abstraction. As is true with interrupt handlers, we need to hang onto the old handler, and ensure we call it. If we don't call the old handler, we prevent someone else's code from running. This would be rude unless our intention is to prevent someone else's keyboard code from running.

The delegate signature has to play by the same rules as a plain vanilla function pointer. The delegate signature must match an expected signature. Delegates will be invoked with the anticipation and necessity of receiving specific arguments and a return value if one is expected. In our example, the operating system will be calling with two integers and a structure that contains key state information. The caller will be expecting a return value, too. We can name the delegate anything, but as mentioned, the signature must match. The signature of our callback method is defined next.

Delegate—Defines this method signature as a subclass of the System.Delegate type

Function—Indicates that the caller will expect a return value

KeyboardHookDelegate—Is the name of the delegate

Code—Is the name of the first argument, an Integer, that is passed by value

wParam—Is a by-value Integer that we don't need in the example but is commonly found in message methods

lParam—Very important to keyboard hooking; we need a pointer to the keyboard state information. This structure will tell us everything we need to know about the keys being pressed, released, and held. It is important to define this argument ByRef.

As Integer—Indicates that the caller will be expecting an Integer.

We will actually need a method that very closely matches the signature of the delegate. The only point at which we can deviate is the name of the actual arguments. The callback method can use different names for the arguments, but the order and type of the arguments and the method type—function or subroutine—must match exactly.
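The delegate described above, and a method that matches it under different argument names, might be sketched as follows. The article's description uses Integer for wParam and the return value, which fits 32-bit Windows; the sketch assumes IntPtr instead so the same signature also holds in a 64-bit process. PassThrough and Handler are hypothetical names used only for illustration.

```vb
Imports System
Imports System.Runtime.InteropServices

Module KeyboardDelegateSketch
    ' Key state information passed to the callback through lParam.
    <StructLayout(LayoutKind.Sequential)> _
    Public Structure KBDLLHOOKSTRUCT
        Public vkCode As Integer       ' virtual-key code of the key
        Public scanCode As Integer     ' hardware scan code
        Public flags As Integer        ' extended-key, Alt-down, key-up bits
        Public time As Integer         ' time stamp of the message
        Public dwExtraInfo As IntPtr   ' extra data attached to the message
    End Structure

    ' Two by-value arguments, one by-reference structure, one return value.
    Public Delegate Function KeyboardHookDelegate( _
        ByVal Code As Integer, ByVal wParam As IntPtr, _
        ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr

    ' Argument names differ; order, types, and method kind do not.
    ' This stand-in does nothing and is never installed as a hook.
    Public Function PassThrough(ByVal nCode As Integer, _
        ByVal message As IntPtr, ByRef keyInfo As KBDLLHOOKSTRUCT) As IntPtr
        Return IntPtr.Zero
    End Function

    ' This line compiles only because the two signatures line up.
    Public ReadOnly Handler As New KeyboardHookDelegate(AddressOf PassThrough)
End Module
```

Changing any argument type in PassThrough, or turning it into a Sub, makes the last declaration fail to compile, which is a quick way to confirm the match.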

Hooking the Keyboard

To hook the keyboard, we need to call the SetWindowsHookEx method. We will need a constant indicating what we want to hook, the idHook argument. We need a method that can be called back, the lpfn argument. We also need a handle to the application doing the hooking (our application), which is the hmod argument, and the id of the thread we want to hook.

When hooking the keyboard in .NET, this part of the revision from VB6 to VB.NET is the most problematic. To make it easier to follow, I have taken an important excerpt from the complete listing, listing 2. That excerpt is provided in listing 1.

Delegates are managed objects in .NET. This means that they are garbage collected. A problem occurs when we pass a delegate to the unmanaged code of the user32.dll API. The garbage collector cannot see that unmanaged code is still using the delegate object, and after a short interval (roughly 47 seconds in experiments) the delegate is garbage collected. Consequently, when the API method attempts to call back the method represented by the delegate, the call fails; we saw a null reference exception. To prevent the delegate from getting GC'd, keep a reference to it in a variable that lives as long as the hook does, such as a module-level variable. In the listing, that variable is also tagged with the System.Runtime.InteropServices.MarshalAsAttribute, passing the enumerated value UnmanagedType.FunctionPtr, so that the delegate is marshaled as a function pointer. It is the long-lived reference that keeps the delegate from being GC'd in an untimely fashion.

The first argument to SetWindowsHookEx is WH_KEYBOARD_LL. The second argument is the tagged delegate that contains the address of our local callback method. The third argument is the module handle (hMod, not a window handle) of the application doing the hooking, and passing 0 for the thread id means that we want to hook the keyboard for all threads.
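A minimal sketch of the hooking step described above follows. Several details are assumptions rather than the contents of the original listing: the IntPtr types, the use of GetModuleHandle with a null name to obtain the module handle, and a callback that simply passes every key along. The value 13 for WH_KEYBOARD_LL is the one defined in the Windows headers.

```vb
Imports System
Imports System.Runtime.InteropServices

Module KeyboardHookSketch
    <StructLayout(LayoutKind.Sequential)> _
    Public Structure KBDLLHOOKSTRUCT
        Public vkCode As Integer
        Public scanCode As Integer
        Public flags As Integer
        Public time As Integer
        Public dwExtraInfo As IntPtr
    End Structure

    Public Delegate Function KeyboardHookDelegate( _
        ByVal Code As Integer, ByVal wParam As IntPtr, _
        ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr

    Public Declare Function SetWindowsHookEx Lib "user32" _
        Alias "SetWindowsHookExA" (ByVal idHook As Integer, _
        ByVal lpfn As KeyboardHookDelegate, ByVal hmod As IntPtr, _
        ByVal dwThreadId As Integer) As IntPtr

    Public Declare Function CallNextHookEx Lib "user32" ( _
        ByVal hHook As IntPtr, ByVal nCode As Integer, _
        ByVal wParam As IntPtr, ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr

    ' Assumption: a null module name returns the handle of the running executable.
    Public Declare Function GetModuleHandle Lib "kernel32" _
        Alias "GetModuleHandleA" (ByVal lpModuleName As String) As IntPtr

    Public Const WH_KEYBOARD_LL As Integer = 13

    ' Return value of SetWindowsHookEx; zero means the hook is not installed.
    Public KeyboardHandle As IntPtr

    ' Module-level reference: keeps the delegate reachable while hooked.
    <MarshalAs(UnmanagedType.FunctionPtr)> _
    Private callback As KeyboardHookDelegate

    ' Passes every key to the next hook; filtering is a separate concern.
    Public Function KeyboardCallback(ByVal Code As Integer, _
        ByVal wParam As IntPtr, ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr
        Return CallNextHookEx(KeyboardHandle, Code, wParam, lParam)
    End Function

    Public Sub HookKeyboard()
        callback = New KeyboardHookDelegate(AddressOf KeyboardCallback)
        ' Thread id 0 asks for a hook that covers all threads.
        KeyboardHandle = SetWindowsHookEx( _
            WH_KEYBOARD_LL, callback, GetModuleHandle(Nothing), 0)
    End Sub
End Module
```

A low-level hook is called back on the thread that installed it, so HookKeyboard is best called from a thread that pumps messages, such as a Windows Forms UI thread. A zero KeyboardHandle after the call means the hook was not installed.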

For all of our efforts, if the delegate is not kept alive this way, the code fails miserably. You can read more about COM Interop in my new book Visual Basic .NET Power Coding from Addison-Wesley, available July, 2003.

Trapping Key Combinations

Determining if specific key combinations are being pressed requires some tricky gyrations. (Keep in mind that we are working at a pretty low level here.) This code remains pretty much unchanged from the November article. The basic idea is to read the current key press in KBDLLHOOKSTRUCT.vkCode. If you need to look for specific multi-key combinations, you may need to call GetAsyncKeyState to determine whether additional keys are being held. For example, we call GetAsyncKeyState(VK_CONTROL) in listing 2 to see whether the Ctrl key is being held down.
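The following sketch shows one way such a check can be arranged. The three combinations it blocks (Ctrl+Esc, Alt+Tab, and Alt+Esc) are assumptions chosen for illustration, as is the helper name IsBlocked; the virtual-key and flag constants are the values from the Windows headers. The decision is kept in a plain function so that it can be exercised without installing a hook.

```vb
Imports System
Imports System.Runtime.InteropServices

Module KeyCombinationSketch
    <StructLayout(LayoutKind.Sequential)> _
    Public Structure KBDLLHOOKSTRUCT
        Public vkCode As Integer
        Public scanCode As Integer
        Public flags As Integer
        Public time As Integer
        Public dwExtraInfo As IntPtr
    End Structure

    Public Declare Function CallNextHookEx Lib "user32" ( _
        ByVal hHook As IntPtr, ByVal nCode As Integer, _
        ByVal wParam As IntPtr, ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr

    Public Declare Function GetAsyncKeyState Lib "user32" _
        (ByVal vKey As Integer) As Short

    Public Const HC_ACTION As Integer = 0
    Public Const VK_TAB As Integer = &H9
    Public Const VK_CONTROL As Integer = &H11
    Public Const VK_ESCAPE As Integer = &H1B
    Public Const LLKHF_ALTDOWN As Integer = &H20

    ' Set by the code that calls SetWindowsHookEx.
    Public KeyboardHandle As IntPtr

    ' Pure decision logic; the combinations are illustrative assumptions.
    Public Function IsBlocked(ByVal vkCode As Integer, _
        ByVal flags As Integer, ByVal ctrlDown As Boolean) As Boolean
        Dim altDown As Boolean = (flags And LLKHF_ALTDOWN) <> 0
        If vkCode = VK_ESCAPE AndAlso ctrlDown Then Return True   ' Ctrl+Esc
        If vkCode = VK_TAB AndAlso altDown Then Return True       ' Alt+Tab
        If vkCode = VK_ESCAPE AndAlso altDown Then Return True    ' Alt+Esc
        Return False
    End Function

    Public Function KeyboardCallback(ByVal Code As Integer, _
        ByVal wParam As IntPtr, ByRef lParam As KBDLLHOOKSTRUCT) As IntPtr
        If Code = HC_ACTION Then
            ' A negative result means the high bit is set: the key is down.
            Dim ctrlDown As Boolean = GetAsyncKeyState(VK_CONTROL) < 0
            If IsBlocked(lParam.vkCode, lParam.flags, ctrlDown) Then
                Return New IntPtr(1)   ' nonzero: swallow the keystroke
            End If
        End If
        ' Everything else goes on to the next hook in the chain.
        Return CallNextHookEx(KeyboardHandle, Code, wParam, lParam)
    End Function
End Module
```

Returning a nonzero value without calling CallNextHookEx is what keeps the keystroke from reaching other applications.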

Unhooking the Keyboard

The return value from SetWindowsHookEx is stored. This is the handle of the hook we installed. We don't discard this value because, if we want to let some key combinations slip past our hook, we pass the return value of SetWindowsHookEx to CallNextHookEx to call the next hook in the chain. We also use this value to unhook the keyboard when we are finished holding onto the keyboard handler. Call UnhookWindowsHookEx, passing the return value from SetWindowsHookEx, to remove our hook and restore the original keyboard behavior.
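The release step might be sketched like this. It assumes the stored value is an IntPtr named KeyboardHandle and that zero means no hook is installed; the Hooked helper is a hypothetical convenience, not something the text above defines.

```vb
Imports System

Module KeyboardUnhookSketch
    Public Declare Function UnhookWindowsHookEx Lib "user32" _
        (ByVal hHook As IntPtr) As Integer

    ' Value saved from SetWindowsHookEx; zero means nothing is hooked.
    Public KeyboardHandle As IntPtr

    Public Function Hooked() As Boolean
        Return Not KeyboardHandle.Equals(IntPtr.Zero)
    End Function

    Public Sub UnhookKeyboard()
        If Hooked() Then
            ' Remove our hook, then forget the handle so a second call is harmless.
            UnhookWindowsHookEx(KeyboardHandle)
            KeyboardHandle = IntPtr.Zero
        End If
    End Sub
End Module
```

Clearing the handle after the call makes UnhookKeyboard safe to call more than once, for example from both a form's closing handler and a safety timer.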

The Complete Code Listing

Listing 2 presents the complete revised listing for VB.NET. Most of this code is more of the same kinds of code that we have discussed already, including some additional methods, declare statements, the KBDLLHOOKSTRUCT, and some useful constants. You can copy and paste the code in listing 2 directly into a module to experiment with it. Call HookKeyboard to begin intercepting the three defined key combinations and UnhookKeyboard to restore the old keyboard state.

Be aware that mistakes may completely lock up your keyboard and you may need to reboot. To prevent this kind of problem, I use the ThreadPool and a separate thread to release the keyboard after 10 or 15 seconds. This strategy has been invaluable while developing low-level code. You can learn more about multithreading here in past and future articles or by picking up a copy of my book, Visual Basic .NET Unleashed, from Sams.
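One way to arrange such a safety net is sketched below. ReleaseAfter and WaitThenRelease are hypothetical names, and the release action is passed in as a delegate so the sketch stays independent of the hook code; in practice it would be something like AddressOf UnhookKeyboard.

```vb
Imports System
Imports System.Threading

Module SafetyReleaseSketch
    ' Queues a thread-pool work item that waits, then runs the release action.
    Public Sub ReleaseAfter(ByVal seconds As Integer, ByVal release As ThreadStart)
        ThreadPool.QueueUserWorkItem(AddressOf WaitThenRelease, _
            New Object() {seconds, release})
    End Sub

    ' Runs on a thread-pool thread, so the caller is never blocked.
    Private Sub WaitThenRelease(ByVal state As Object)
        Dim args As Object() = DirectCast(state, Object())
        Thread.Sleep(CInt(args(0)) * 1000)
        DirectCast(args(1), ThreadStart).Invoke()
    End Sub
End Module
```

Calling ReleaseAfter(15, AddressOf UnhookKeyboard) right after hooking gives a 15-second window for experiments before the keyboard is handed back, whatever the callback does in the meantime.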

Summary

Run the sample code and you will see that the Windows API is alive and well in .NET. Thankfully, you will need to have very special needs indeed to resort to calling into the Windows API. This is a far cry from VB6, where almost anything useful required interaction with the Windows API.

Disclaimer: The VS IDE hooks the keyboard. You may need to run the sample code outside of the IDE for the keyboard hook API call to succeed.

One of the most important differences between VB6 and VB.NET is the notion of managed code. Code in VB.NET is managed. This means objects can be moved around in memory and garbage collected. Old Windows API methods do not represent managed code. As a result, you may get some quirky behavior when interacting between .NET and the Windows API. If you plan on writing a lot of code that interoperates with the Windows API or COM, I encourage you to pick up a good book on COM Interop and a good advanced book such as my Visual Basic .NET Power Coding from Addison-Wesley that explores these intricate nooks and crannies for you.

About the Author

Paul Kimmel is a freelance writer for Developer.com and CodeGuru.com. Look for his recent book, Visual Basic .NET Power Coding, from Addison-Wesley on Amazon.com. Paul Kimmel is available to help design and build your .NET solutions and can be contacted at pkimmel@softconcepts.com.

# # #

Comments

Nice

Posted by Schnickelfritz
on 06/27/2012 08:46am

Helped a lot! Thanks for the important tip in the disclaimer: I didn't read it at first - it doesn't work if you try and test it with the VB debug run!
Looking for a part 2 now that explains how to manipulate the key stroke ...

How excatly do I use this?

Posted by AITEE
on 05/07/2007 04:17pm

I understand it. But I put it into a module and tried calling HookKeyboard() Which should initiate it. right? but it says "Decleration expected" Can someone show me an example of how to initialize this. Or just tell me what I'm doing wrong?
Thanks

This Code Should be in a DLL

Posted by RoyK
on 03/11/2005 02:11pm

According to the MS KB, this code should be in a dll because it is called out of process. It may work, but it may not. Why it is any different than any other Call back is beyond me, but that's what the Boys from Redmond said.

Wow what a great post!!

Posted by IcyCode
on 10/10/2004 02:18am

In first case, thanks for the code Paul!!
I'm migrating it to C# (not so hard to migrate at all since the hard work is already done ;-) thanks again).
I was looking for some use of the MarshalAs attribute today (and today I first met it also :).
That's all.
***** five stars
