The InlineShapes object represents a collection of pictures, OLE objects, and ActiveX controls contained within a given Document or Range object.

The AddPicture() Method

To insert an image into a document or range, we call the AddPicture() method on the InlineShapes collection. This method accepts up to four parameters:

* FileName (required) - The full path and filename of the image to insert.

* LinkToFile (optional) - If true the inserted picture will be linked to the file from which it was created. You'll usually set this to false to make the picture an independent copy of the file. The default value is false.

* SaveWithDocument (optional) - Set to true to save the linked picture with the document. The default is false. This value will be ignored unless LinkToFile is set to true.

* Range (optional) - A Range object representing the position in which to insert the picture.

The AddPicture() method returns a reference to the newly created InlineShape object that is your picture.

So, to insert an image at the start of the range that we defined earlier, we could call the method on our range object...

Thursday, August 20, 2009

The working title is "Automating Windows Applications with Ruby". As the title implies, the focus is on automating applications, usually via Win32OLE/COM.

This is not an "Introduction to Ruby" book: I make the assumption that the reader already has a working knowledge of the language and now wants to put it to good use driving applications and processes on Windows. Some of the topics covered include:

Each chapter will include step-by-step tutorials, and many chapters will also include a reference section covering commonly-used classes, properties, and methods. Some chapters are extensive, while others may be only a few pages in length.

This isn't intended to be an "Everything You'd Ever Want To Know About..." resource, but it is my intention to provide you with a book that will serve as both a quick-start guide and a go-to reference for automating a wide variety of applications.

Wednesday, August 5, 2009

In a previous article, I discussed a method for inserting images into an Excel worksheet. It seems that the Worksheet.Pictures.Insert() method that I demonstrated in that article, though frequently used, is not actually officially documented in the Excel Object Model Reference. An astute reader has called my attention to this fact and, in reply, I hereby present the officially documented---and probably preferred---method for adding an image to a worksheet.

The Worksheet object's Shapes collection includes an AddPicture() method that creates a picture from an existing file and returns a Shape object that represents the new picture. The syntax is:

Sunday, August 2, 2009

Joe Schmoe from Kokomo has a scanned image of a 300-page contract. Joe wishes he could search this file for certain rates and terms, but it's an image, not a text file. OCR might be just what the doctor ordered.

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text. --Wikipedia

One such OCR solution that you may already have available to you is Microsoft Office Document Imaging (MODI), part of the Microsoft Office suite. Let's look at how you can use Ruby and the MODI API to automate the conversion of a scanned document into text.Installing MODI

MODI might not have been installed when you installed Microsoft Office, so your first step may be to install it from the Office install disks. If installed, you will probably find an icon for "Microsoft Office Document Imaging" located on your Windows Start/Programs menus under "Microsoft Office Tools". If it's not there, go to your Add/Remove Software control panel, select your Microsoft Office installation, and select the option to add features. Then follow the necessary steps, which may vary depending on your version of Windows and Office.

Accessing the MODI API

To begin with, we'll use the win32ole module to create a new instance of the MODI.Document object:

require 'win32ole'doc = WIN32OLE.new('MODI.Document')

Loading the Image

The next step is to call the document object's Create() method, passing it the name of the .TIF file to load:

doc.Create('C:\images\image1.tif')

NOTE: MODI only works with TIFF files. If your image is in another format (.JPG or .PNG, for example), you can use an image editor (such as Paint.NET or Photoshop) or code library (such as RMagick) to convert it to TIFF format.

Performing the OCR

The OCR() method performs the optical character recognition on the document. The mthod can be called without parameters...

LangId: An integer representing the language of the document. English = 9, French = 12, German = 7, Italian = 16, Spanish = 10. This value defaults to the user's regional settings.

OCROrientImage: This boolean value specifies whether the OCR engine attempts to determine the orientation (portrait versus landscape) of the page. The value defaults to true.

OCRStraightenImage: This boolean value specifies whether the OCR engine attempts to "deskew" the image to correct minor misalignments. The value defaults to true.

You may find that tweaking these parameters from their default values produces better results, depending on the individual image(s) you're working with.

Getting the Text

Naturally, you'll want to get your hands on the text produced by the OCR process. Each page of the Document is represented by an Image object. The Image object contains a Layout object; and that Layout object's Text property represents the text for that image/page. So the hierarchy looks like this:

Document =>Images =>Image =>Layout =>Text

To accrue the entire text, simply iterate over the Document.Images collection and grab the Layout.Text values. For example:

No OCR process can guarantee 100% accuracy, but I've found that MODI does a pretty good job recognizing text. Results will vary, of course, depending on the quality of the TIFF image. Note, however, that it cannot preserve formatting of tabular data. So while the text in a series of columns may be produced with a high degree of accuracy, that text will probably be produced with one value per line. So...

apple orange pear

...comes out as...

appleorangepear

Paragraphs of text have, in my experience, been produced with the proper line feeds. Play around with it and see if it meets your needs.

Sunday, July 12, 2009

A reader recently asked how to insert an image from their PC into a cell in an Excel worksheet. So if you have a couple of minutes, I'll demonstrate how to insert an image, specify the exact position, and resize the image.

UPDATE: It seems that the Worksheet.Pictures.Insert() method that I demonstrated below, though frequently used, is not actually officially documented in the Excel Object Model Reference. See this newer article for the officially documented method.

To insert a picture into a worksheet, call the Worksheet object's Pictures.Insert() method, passing it the filename of the image to insert:

pic = ws.Pictures.Insert('C:\Pictures\Image1.jpg')

The Pictures.Insert() method inserts a new Picture object into the Pictures collection and returns a reference to the new Picture object, which we've assigned to the variable pic.

Positioning the Image

OK, now that you've got the picture inserted into the worksheet, you probably want to specify its exact position, perhaps aligned with a certain range of cells. Let's start by defining that range of cells, from cell C3 to cell F5:

range = ws.Range('C3:F5')

We want the picture aligned with the top/left corner of our range, so we'll set our Picture object's Top and Left properties to be the same as our Range object's Top and Left properties.

pic.Top = range.Toppic.Left = range.Left

Resizing the Image

Now that we have the top/left position set, let's move on to specifying the width and height of the picture. The original image has an aspect ratio, which is the image's width divided by its height. An image that is 800x600 pixels could be said to have an aspect ratio of 4:3. When resizing an image, you may wish to maintain its original aspect ratio, to avoid making the image appear stretched in one direction or the other.

By default, a Picture object in Excel has its aspect ratio locked, so that when you change either the width or the height, the other dimension is automatically adjusted to preserve the aspect ratio.

You adjust the width and/or height of a Picture object by setting its---wait for it---Width and Height properties. With the aspect ratio locked, we can set the image to fill the width of the range...

pic.Width = range.Width

...or to fill the height of the range...

pic.Height = range.Height

...but we cannot successfully apply both settings without first unlocking the aspect ratio.

To unlock the Picture's aspect ratio, you set the Picture object's ShapeRange.LockAspectRatio property to false:

pic.ShapeRange.LockAspectRatio = false

Now that we have unlocked the aspect ratio, we can set the picture's width and height to match the range's width and height:

pic.Width = range.Widthpic.Height = range.Height

The Width and Height properties are just numerical values. So you could instead do something like this:

pic.Width = 400pic.Height = 300

And that, my friends, is how you insert, position, and resize an image in Excel.

Let me know if you have any questions or comments, and thanks for stopping by!

Thursday, July 9, 2009

Occasionally, when working with Excel, you may have a need to hide certain columns or rows in a worksheet. As an example, perhaps your worksheet lists revenue for each of 12 months, but your intended recipient only wants to see the columns showing year-to-date totals. Whatever the reason, you'd like to simply hide certain columns (or rows) rather than delete them completely. And since you're here, you'd probably prefer to do that with Ruby code...

Let's set the scene by assuming that your code is working with an open workbook and you want to hide columns in the currently selected worksheet:

To hide one or more columns, obtain a reference to the column range(s) and set the Hidden property to true:

ws.Columns('A').Hidden = truews.Columns('D:K').Hidden = true

Unhiding (displaying) columns works the same way; except, of course, you set the Hidden property to false:

ws.Columns('A').Hidden = falsews.Columns('D:K').Hidden = false

The following code ensures that all columns in a worksheet are visible:

ws.Columns.Hidden = false

Hiding Rows

Rows have a Hidden property, too, so you can do this to hide rows...

ws.Rows('3').Hidden = truews.Rows('7:11').Hidden = true

...and to unhide them:

ws.Rows('3').Hidden = falsews.Rows('7:11').Hidden = false

Hiding Worksheets

But what if you want to hide an entire worksheet? The Worksheet object doesn't have a Hidden property. Instead, you'll use the Visible property. So to hide the 'Great Amish Scientists' worksheet in the active workbook (wb), we'd use this code:

wb.Worksheets('Great Amish Scientists').Visible = false

And to unhide it, we'd toggle the Visible property back to true:

wb.Worksheets('Great Amish Scientists').Visible = true

The following code ensures that all worksheets in a workbook are visible:

wb.Worksheets.each do |ws| ws.Visible = trueend

And that, folks, is pretty much all there is to it. Let me know if you have any questions or comments, and thanks for stopping by!

Once we have our Font object defined to our satisfaction, we can then pass it to our control's set_font() method:

my_control.set_font(my_font)

Changing Colors

Text color is not an attribute of the Font object, but is rather the foreground color of the control. So, to change the color of the text, we call the control's set_foreground_colour() method and pass it a Colour object. We'll make it easy and use one of wxRuby's built-in Colour constants:

Tuesday, June 23, 2009

Microsoft Word is great for text documents. Microsoft Excel is great for tables of data. But, sometimes, you need to get your chocolate in your peanut butter. In other words, you may occasionally need to include a table in a Word document. Let's walk through how to do that.

Setting the Stage

To borrow an example from my upcoming book, let's say that you want to insert a table that contains a list of all movies starring Spencer Tracy and Katharine Hepburn, including the year each movie was released, the title, and the director. Your 2-dimensional films array might look like this:

First, let's move to the end of the document, where we'll add our new table. We do this by calling the Selection.EndKey() method, which moves the selection to the end of a word, line, or document. We'll pass this method a value of 6, which specifies that we want to move to the end of the document:

word.Selection.EndKey(6)

Adding a New Table

The Tables collection in Word's object model represents all the Table objects in a selection, range, or document. To add a new Table to a document, call the Document object's Tables.Add() method and pass it three parameters:

* The range object representing where the table is to be inserted* The number of rows for the new table* The number of columns for the new table

Now let's add a new table with one row and three columns (we'll add more rows later):

table = doc.Tables.Add(word.Selection.Range, 1, 3)

The Tables.Add() method returns a reference to the newly created table, which we have assigned to the cleverly named variable table.

Inserting Text into Table Cells

You can reference a single cell in a table by calling the Cell() method and passing it the row and column numbers (NOTE: the first row or column is represented by 1, not 0). Once you have your cell, you can set the text via its Range.Text property; so we add the header text as follows:

To add a row to your table, simply call the Table object's Rows.Add() method. Now that we've added the header text, let's iterate over our films array and add a new row to the table for each film and insert the text:

Friday, June 12, 2009

Someone recently asked how to get a count of the number of words and pages in a Microsoft Word document. This is done by calling the ComputeStatistics() method on a Range or Document object.

As an example (play along at home), let's imagine that you have a Word document open. Your first step is to use the win32ole library's connect() method to connect to the existing instance of Word:

require 'win32ole'word = WIN32OLE.connect('Word.Application')

You pass the ComputeStatistics() method an integer representing the type of statistic that you want to calculate. In other words, "What do you want to count?" So let's take a moment to define constants for those values:

When called on a Document object, the method accepts an optional second parameter, IncludeFootnotesAndEndnotes, a boolean which (obviously) specifies if the calculation should include footnotes and endnotes:

word_count = doc.ComputeStatistics(WdStatisticWords, true)

The IncludeFootnotesAndEndnotes parameter defaults to false.

Official details on the ComputeStatistics() method are available from MSDN here.

Thursday, June 4, 2009

Microsoft Word uses the Styles model to apply a set of pre-defined formatting to text. Styles can also serve a second purpose, to tag sections of the document as normal, title, headings and such. You can then, for example, create a Table of Contents in Word based on the text that is formatted with the Heading styles.

Naturally, you can do all this with code (otherwise, I wouldn't be wasting your time here). Over the next few minutes, we'll walk through the process of creating a new Style, setting its properties, and then applying that style to text.

The Style object represents a single built-in or user-defined Word style. The Styles collection contains all the Style objects within a document. To reference the Styles collection, simply call the Styles method on the document:

doc = word.ActiveDocumentstyles = doc.Styles

To create a new Style, we call the Add() method on the Styles object and pass it a hash defining the name and the type of the new Style. The following code creates a new Paragraph style named 'Code':

code_style = doc.Styles.Add({'Name' => 'Code', 'Type' => 1})

The Type property defines what StyleType your new Style is based on. Possible values are:

The Add() method returns a reference to the newly-created Style object. Now that you have your new Style object, you can customize it through various properties that you can set. As a starting point, you may want to base your new style on another style by setting the BaseStyle property:

Now that you've created your own Style, you might want to automatically apply it to some existing text. The following code iterates over each paragraph in the document (doc variable). For each paragraph that uses the 'Preformatted Text' style, the new 'Code' style is applied instead:

Sunday, May 24, 2009

I've previously mentioned Lars Christensen's One-Click Ruby Application Builder, a "compiler" for Ruby scripts that shows a lot of potential (and current value). There's not much documentation yet, but folks are installing it, using it, and providing feedback. It works right out of the box for many purposes, but here are a few things to keep in mind as you use it...

"Failed to create directory" Error

One user reported receiving the error "Failed to create directory" when running the compiled executable. As a possible workaround, try running ocra.rb from the directory where your script is located. For example, instead of running...

ocra.rb "C:\code\rubyscripts\application.rbw"

...navigate to the "C:\code\rubyscripts" directory, then run:

ocra.rb application.rbw

Note that while this resolved the problem on my machine, it didn't help the person who originally reported the problem.

Require RubyGems

When you compile your script (which uses one or more gems) with OCRA and then run the executable, you may receive a 'no such file to load' error. Try adding the line...

require 'rubygems'

...to the top of your script, above the other require statements. I've found that a script that runs pre-compiled without this statement may not execute once compiled.

Compile Without Running

I mentioned earlier that it would be nice to have an option to avoid fully running the script, similar to RubyScript2Exe's exit if RUBYSCRIPT2EXE.is_compiling? idiom. A tip of the hat goes to reader "BackOrder", who offered the following solution:

exit if Object.const_defined?(:Ocra)

Put all your require statements at the top of your code, followed by this line.

OCRA and Mechanize: "libxml2.dll not found"

There may be a problem compiling a script that requires the mechanize gem. Running the compiled executable resulted in a "libxml2.dll was not found" error on my machine.

Note that the above observations relate to version 1.0.2 (current as of this writing) of the ocra gem.

Next, we'll call the WIN32OLE_EVENT.new() method to create a new OLE event object. You pass this method an OLE object---our Workbook object---and the name of the event sink. In this instance, we want to use the WorkbookEvents sink:

ev = WIN32OLE_EVENT.new(wb, 'WorkbookEvents')

Once you have your event sink defined, you call its on_event() method to hook into a particular event and run a block of code when that event fires. In our scenario, we want to take action when the SheetSelectionChange event fires.

Thursday, May 14, 2009

I recently mentioned the fact that RubyScript2Exe 0.5.3 doesn't play well with recent versions of RubyGems. At the end of that post I mentioned that there are alternatives emerging for creating executables from your Ruby code, including Lars Christensen's OCRA, the One-Click Ruby Application Builder.

I've had a chance to take OCRA for a short test drive and it looks promising. Like Erik Veenstra's RubyScript2Exe, OCRA lets you "compile" your ruby code into an EXE file that you can distribute to others, without requiring that the users have Ruby installed on their PCs. To quote the OCRA page:

The executable is a self-extracting, self-running executable that contains the Ruby interpreter, your source code and any additionally needed ruby libraries or DLL.

OCRA can be installed via RubyGems: open a console window and type:

gem install ocra

It's also available from the RubyForge page. Version 1.0.2 is the latest as of this writing.

To create a non-console application, either use the --windows option or give your script the .rbw filename extension. Compiling a basic non-console script is as simple as opening a console window, navigating to the folder containing your script, and typing:

ocra.rb myscript.rbw

OCRA will then run your script to check for dependencies, gather the necessary files, and create your executable. An option to avoid fully running the script would be nice, similar to RubyScript2Exe's exit if RUBYSCRIPT2EXE.is_compiling? idiom.

OCRA uses LZMA compression, so the resulting executable is relatively small. A simple wxRuby app, for example, resulted in a 2.5 Mb executable. The same app compiled with RubyScript2Exe weighed in at over 9 Mb. Apps that do not require a GUI toolkit will be even smaller, perhaps 500k.

I haven't spent a lot of time with OCRA yet, but I think it shows great potential and I want to thank Lars Christensen for his efforts. If you have a need to distribute Ruby programs to non-technical users, you should check it out and pass your feedback along to Lars.

Sunday, May 3, 2009

If you use RubyScript2Exe, you should note that the current version (0.5.3) doesn't appear to play well with the new version (1.3.2) of Rubygems released a few weeks ago. Attempting to compile a script results in the error:

Perhaps Erik will update RubyScript2Exe to utilize the Gem.loaded_specs method, as suggested by Eric Hodel.

In the meantime, RubyGems 1.3.1 continues to work well for me with RubyScript2Exe. Be aware, however, that Daniel Berger points out in the above thread that version 1.3.2 addresses a couple issues related to the win32-file library.

There are alternatives emerging for creating executables from your Ruby code, including ocra and crate. I haven't had the opportunity to try either of these yet, but plan to in the near future. If you have experience with these or similar tools, feel free to share your comments here. Thanks!

Calling the ole_obj_help method on a WIN32OLE object returns a WIN32OLE_TYPE object. The WIN32OLE_TYPE object includes a name attribute. Calling this on an Excel application object...

xl = WIN32OLE.new( 'Excel.Application' )xl.ole_obj_help.name

...returns the string '_Application', while calling it on a Worksheet object...

xl.ActiveSheet.ole_obj_help.name

...returns the string '_Worksheet'.

Q2: How can I tell whether an object instance supports a particular method?

The code:

object.ole_methods.collect!{ |x| x.to_s }.include?( 'MethodName' )

The explanation:

As mentioned previously here, calling the ole_methods method on a WIN32OLE object returns an array of WIN32OLE objects that represent methods that can be called on that object. You can convert each object in the array to a string, using the collect! method. Then it's a simple matter to call the include? method to see if the resulting array of strings contains a certain value.

And so we can use this to find that the Excel application object includes a Quit method...

Wednesday, January 28, 2009

As a loyal reader mentions in the comments to my article about Formatting Worksheets, he wanted to apply different formatting to separate words in a single cell. Specifically, he had cells containing a surname, a comma, and then first name, such as:

Jeter, DerekMatsui, HidekiTeixeira, Mark

His goal was to format the last name in bold, while leaving the first name in normal weight font:

Jeter, DerekMatsui, HidekiTeixeira, Mark

(BTW, the sample names are mine. How many days until Opening Day? Or at least until pitchers and catchers report? But, alas, I digress...)

My suggested solution was to use the Characters object to get a reference to a specific subset of characters within the cell, rather than to all of the cell's text.

Call the Characters(start, length) method on a Range object (usually a cell) to return a reference to the specific characters. Note that the Characters object's index is 1-based.

Once you have your subset of characters you can apply Font formatting much like you would with a Cell object, setting the Font object's Name, Size, Bold, ColorIndex, and other properties.

So, assuming you have a worksheet object, ws, and you want to format the values in cells A1 through A3 as described above, try the following code:

Wednesday, January 21, 2009

The same goes for Windows: some Railers would say that you need to give up on Windows completely and simply go to Ubuntu or Mac OS X. I know for a fact that there are lots of people simply unable to do just that. And “give up their jobs” – as some kindly suggest – is not an option. Some of the people that are starting Rails in closed-minded companies are exactly the seeds those companies need to start to change their minds and every time we, Railers, tell them to just move to another job, we are killing a precious opportunity to introduce Rails to those companies.

Please keep this in mind: there are several circumstances that keep people locked to Windows. Specially if this person works for a closed minded company that won't allow him to use Ubuntu. The instant you tell this person "give up, Ruby is unable to be used under Windows", you lose the single person that could actually bootstrap Ruby adoption inside this company. People have to start somewhere and sometimes Windows is this way. So let's help them get started and one of three things will happen: 1) he will be super happy and will evangelize Ruby to his peers, leading to mass adoption on this company; 2) he will lose hope on his employer and will try to find another job; 3) he will simply give up and none of us lose nothing.

You should read Fabio's article and the comments at both links above; even if, like me, you don't use Rails as much as you use Ruby itself.

It's very easy (and very fashionable -- all the cool kids do it!) for the Mac OS and Linux crowds to take shots at the Windows developers. And I tip my hat to those that actually changed jobs just so that they could avoid working with Windows. Good for you.

But some of us actually like our jobs and our companies, despite having to work with -- Gasp! Cover the children's ears, Mother! -- Windows. Besides, as Marcos Ricardo commented on Fabio's site, "We can never throw way 90% of market share."