Two Office2007 lessons learned–one useful, one maybe not

My wife and I seem to be in the minority in (a) liking Vista, although we’re both using notebook computers with integrated graphics, and (b) MUCH preferring Office2007, and specifically Word2007, to their predecessors, even with the small unlearning curve.

(Actually, I suspect that tens of millions of people like Vista just fine, but PC Magazine, PC World, and of course our friends at Apple seem devoted to convincing us that none of us can stand it…oh, and that you have to have $600 graphics cards for it to run. BS, but what’cha gonna do? Incidentally, for anyone trying to tell me I really should switch: My notebook cost half what the lowest-end Mac notebook would cost. It has 3GB RAM, 250GB hard disk, dual-layer DVD burner and an Intel Core 2 Duo brain. I’m very much more on a budget in “semi-retirement.” You’re going to have a really hard time convincing me that I’m losing out.)

Anyway, as I was saying…

I’m making great progress on The Liblog Landscape 2007-2008: A Lateral Look. I may have a full draft just about ready to print out and review (after which there may be a lot more work ahead).

In the process, I learned two lessons that I found interesting and that may or may not be new to Word2007/Office2007:

Embedding an editable graph

I’m using a few graphs in this book–sixteen at this point, although that could change. (When I say “graphs” I mean graphs–scatter plots and line graphs, some with logarithmic scales. There are many, many tables.) In preparing a graph, I was usually either taking one column of my master spreadsheet and preparing a pivot graph (that is, taking the count of each unique value in the column, then making a graph of count vs. value–e.g., for number of posts in each liblog) or taking two columns and preparing line graphs of both values or scatter plots comparing one to the other.
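For readers who think in code rather than pivot tables, the "count of each unique value" step can be sketched in a few lines of Python. This is only an illustration of the idea, not how the book's graphs were made (those came from Excel); the function name and the sample post counts are made up for the example.

```python
from collections import Counter

def pivot_counts(values):
    """Return (value, count) pairs, sorted by value, for one column of data."""
    return sorted(Counter(values).items())

# Hypothetical column: number of posts for seven imaginary liblogs.
posts_per_liblog = [14, 25, 14, 3, 25, 14, 40]

# Each pair is one point on a count-vs-value graph.
for value, count in pivot_counts(posts_per_liblog):
    print(value, count)
```

Each (value, count) pair is one point on the resulting graph: the value goes on one axis, and the number of liblogs sharing that value goes on the other.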

When you’re doing graphs with as many as 600 data elements making up each series and you’re putting the results on a 6×9 book page–which really means a body width of 26 picas or 4.5 inches–you learn to make maximum use of the space available. That means doing some experimenting. (Sometimes, unlabeled axes described in text below the graph are extremely useful–they leave more horizontal space for the data!)

I’ve learned to pay attention to that little box that sits there after you do a Paste: It offers all sorts of useful choices. In this case, two of the choices will make the graph fully editable–one of them linking back to the Excel spreadsheet, the other embedding the spreadsheet in the Word document. The first raises cross-program security issues (and is how I managed to wind up with an empty graph at one point); the second works just fine.

Except. When I was somewhere in Chapter 3 or 4, with three or four graphs embedded and about 15,000 to 20,000 words, I noted that the document was up to 1.4 megabytes. Which seemed a trifle high, since a 20,000-word document would typically be around 120K-140K.

Thinking about it, I realized the problem: I was saving a copy of my huge master spreadsheet–huge by my standards, anyway, with 607 rows and between 19 and 30 columns on each of the first three sheets, and other pages for graphs–with each graph.

Easy solution: Create a new spreadsheet for each graph, copy the data columns to that spreadsheet, create the graph, save that as a little tiny Excel file–15 to 30K each, where the biggie is 385K–and use that spreadsheet as the basis for the copy-and-paste.

Actually, it’s also a good solution in that it keeps the data safely removed from the master spreadsheet so the effects of a bonehead move are minimized. Not that I ever make bonehead mistakes…

Net effect: The first half of the book, with 16 graphs and about 35,000 words, was 651KB–and the whole book (including the 607 liblog profiles), currently 265 pages long, is 885KB. (The PDF version is over 3MB, but that’s using Microsoft’s PDF driver, which doesn’t optimize nearly as well as Acrobat; I haven’t purchased a contemporary Acrobat yet.)

It’s not that Word can’t handle multi-megabyte documents. It can–but it does slow Opens and Saves down (and possibly some other activities), and it’s just not necessary.

The one that may rarely be useful…

This one really isn’t new to Word2007, except to the extent that people who didn’t use Styles before are more likely to use them now. I’ve always used Styles, so that’s not a biggie.

Anyway: A key set of changes was to get the library profiles section down to size–it was close to 200 pages on its own, I knew I needed to add a sentence to most profiles, and I want the whole book to be under 300 pages. I asked a question a few weeks ago and am grateful for the unanimous response–but one of the responses gave me another idea. John Dupuis said something about using smaller type, and I realized that the profiles really aren’t narrative and probably would work well with 10-point rather than 11-point type. For that matter, the tables–and there are literally hundreds of tables, many of them (unfortunately) split across pages–might work better with 10-point type, as it opens them up a little bit.

I’d added two new styles: “f10” (first paragraph but with 10-on-12 instead of 11-on-13 type) and “n10” (normal but with 10-on-12 instead of 11-on-13). I’d gone through and changed tables in the first 10 chapters from First to f10.

For the second half, though, it made sense to do global changes: The only styles (other than headings, subheadings and blognames) were First and Normal, and I wanted all of those to become f10 and n10 respectively.

There are two ways to do that. Unfortunately, I didn’t think of the second way until I’d tried the first. Well, “unfortunately” only cost five minutes, since I’d saved the file beforehand (and I have Word set to autosave every ten minutes anyway).

The first method, also known as “Doing it Wrong”: Select all instances of First or Normal in a 36,000-word document with literally thousands of short paragraphs (each line of a table is a paragraph), then click on the new style.

Word just ground away…actually, it managed to do it on Normal–but went unresponsive on First. I suspect it would eventually have given me a result, but I did a forced shutdown and restart.

The second method, also known as “Thinking”: Do a global replace, using the Style for the search and replace. That took maybe five seconds to handle a few thousand instances…
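For anyone who would rather see the idea than the dialog box: a replace keyed on style amounts to one pass over the paragraphs, swapping the style name wherever it matches. The Python below is a toy model only. It says nothing about what Word does internally, and the paragraph list, the function name, and the sample text are all invented for the illustration; only the First-to-f10 and Normal-to-n10 mapping comes from the post.

```python
# Style mapping described in the post: First -> f10, Normal -> n10.
STYLE_MAP = {"First": "f10", "Normal": "n10"}

def replace_styles(paragraphs, style_map):
    """Return (new_paragraphs, number_changed) after one pass over the document.

    Each paragraph is modeled as a (style_name, text) pair; styles that are
    not in style_map (headings, for example) are left alone.
    """
    changed = 0
    result = []
    for style, text in paragraphs:
        new_style = style_map.get(style, style)
        if new_style != style:
            changed += 1
        result.append((new_style, text))
    return result, changed

# Hypothetical four-paragraph "document".
doc = [
    ("Heading", "Profiles"),
    ("First", "Opening paragraph of a profile."),
    ("Normal", "Second paragraph."),
    ("Normal", "Third paragraph."),
]

new_doc, count = replace_styles(doc, STYLE_MAP)
print(count, [style for style, _ in new_doc])
```

The point of the model is just that the work grows with the number of paragraphs, once, rather than depending on holding thousands of scattered selections at the same time.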

I suppose the question is: How often would you be doing this sort of thing?

And here’s a tease on the content itself:

As I was updating the liblog profiles (with quintile numbers and a phrase describing those numbers), I thought about “typical liblogs”–that is, liblogs that fall into the third (middle) quintile on all of the metrics, or even just the four key 2008 metrics (posts, words per post, comments per post, figures per post).

Know what? I don’t think there are any. I don’t believe I spotted a single blog that is in Q3 on all four measures. I’ll check again, using the spreadsheet itself. If that’s true, it’s interesting (although not terribly meaningful): There may literally be no such thing as an average liblog, even if you define “average” pretty broadly. (What would it take? For a measured quarter–in this case, March-May 2008–14 to 25 posts averaging 217 to 288 words each with 0.71 to 1.30 comments per post and 0.31 to 0.52 figures/illustrations per post. The comment and figures quintiles are tricky, because they exclude cases where there just aren’t any.)

Update, later the same day: I checked the spreadsheet itself. There are not, in fact, any liblogs that fall into the middle quintile on all four categories. Closest, oddly enough, is ALA Marginalia, a relatively new blog–it’s in the middle quintile on three categories and just short of the third quintile for average figures per post. For an “average blog” it’s about as distinctive as you can get…
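The "is there a typical liblog?" check is easy to express in code. The sketch below uses the four middle-quintile ranges quoted above and tests each blog against them. It assumes the range endpoints are inclusive, which the post doesn't actually say, and the two sample blogs are invented, not rows from the real spreadsheet.

```python
# Middle-quintile (Q3) ranges quoted in the post for March-May 2008.
# Assumption: both endpoints of each range count as inside Q3.
Q3_RANGES = {
    "posts": (14, 25),
    "words_per_post": (217, 288),
    "comments_per_post": (0.71, 1.30),
    "figures_per_post": (0.31, 0.52),
}

def q3_metrics(blog):
    """Return the names of the metrics on which this blog falls in Q3."""
    return [name for name, (low, high) in Q3_RANGES.items()
            if low <= blog[name] <= high]

def is_typical(blog):
    """True only if the blog is in Q3 on all four metrics."""
    return len(q3_metrics(blog)) == len(Q3_RANGES)

# Hypothetical blogs, made up for the example.
blogs = [
    {"name": "Example A", "posts": 20, "words_per_post": 250,
     "comments_per_post": 1.0, "figures_per_post": 0.25},
    {"name": "Example B", "posts": 60, "words_per_post": 250,
     "comments_per_post": 3.2, "figures_per_post": 0.40},
]

for blog in blogs:
    print(blog["name"], q3_metrics(blog), is_typical(blog))
```

Example A lands in Q3 on three of the four measures and misses on figures per post, which is the same shape of near-miss described in the update.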
