Phil Factor

Phil Factor (real name withheld to protect the guilty) has 30 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

Author Updates

I was scanning the API of DacFx, the ‘engine’ of SSDT, and became interested in the facility it contains for automating SQL code reviews. DacFx allows you to parse the SQL code sufficiently to do static code analysis, to scan for heresies, deprecated code and code that doesn’t ‘conform to corporate policy’. Dave Ballantyne has used this feature to allow DacFx to detect SQL code smells.
I like ‘code smells’ when used by programmers to sharpen up their own code. I have, however, al

Three years ago, I listened to a keynote at a developer conference. The man from Microsoft beamed confidently at the vast auditorium and said “I can tell you confidently that in a year’s time, you will all be writing Metro applications for a huge marketplace”. We clapped, but as the keynote proceeded and we saw the details, few of us believed it. I glanced around the auditorium and, despite the semi-darkness, could see developers whispering and shaking their heads. It wasn’t going to happen.

A little while ago Phil got to thinking about his garden, and the myths and misinformation that forced inferior food down the gullets of children. This was an example of bad data, and Phil wants it gone. Below is a video (and transcript) of the keynote that Phil gave at SQL Saturday Exeter.
The Transcript
What has spinach got to do with database development? Generations of children were fed spinach in preference to more nutritious food, such as cardboard, through the persistence of ba

In the Windows environment, there seems little safer for application design than a rather staid single-tiered architecture making ODBC/JDBC calls to the RDBMS. I can say this with years of experience in developing applications ranging from the dull but worthy, to the esoteric. However, there is an interesting long-term cost to taking the easy route to delivering an application, particularly where the database server ends up evolving into a behemoth: a monster that is shared by a number of appl

The other day, I needed to convert a whole stack of XML files to YAML. Actually, I would have settled for a conversion to JSON, but for some reason, the built-in cmdlet wouldn’t do it. I was trying to figure out a way of doing the YAML conversion when I suddenly remembered I’d actually already published a way of doing it, in ‘Getting Data Into and Out of PowerShell Objects’. With relief, I got the routine out and tried it. It didn’t work, because I’d chosen to display an XML value as a

It is an exaggeration to say that I like stored procedures. They are an essential if somewhat dangerous part of the Sybase and SQL Server landscape, rather like a volcano, bog or swamp. If you use a stored procedure in the same way as a procedure in any other language, you soon end up in the jungle. Nothing quite works the way that your IT education would lead you to expect.
In the beginning was the SQL batch, merely a sequence of SQL queries and statements. It wasn’t recursive, and w

Relational databases aren’t really designed to deal easily with arbitrary sequence, though this is improving with the window functions. Strings and text are sequences. Lists are often sequenced. If you hear people describe an entity such as an invoice in terms of its ordinal sequence, such as ‘the first invoice’ or ‘the fourth invoice’, then you know that it is sequenced.
Let’s take a typical ‘sequence’ problem that comes up in science, and occasionally in commerce. We’ll use strings for our

Sometimes you need to know how similar words are, rather than whether they are identical. To get a general measure of similarity is tricky, impossible probably, because similarity is so strongly determined by culture. The Soundex algorithm can come up with some matches but insists that, for example, ‘voluptuousness’ and ‘velvet’ are similar. Family genealogists in Britain would never find a search algorithm that helped them to realise that some branches of the Theobald family spell the
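
The Soundex claim above is easy to check for yourself. What follows is a minimal sketch in Python of the commonly documented American Soundex rules, not the author's own routine; the function name `soundex` and the lookup table `CODES` are just names chosen for this illustration.

```python
# Illustrative sketch of standard American Soundex (names are hypothetical).
# Consonant groups map to the digits 1-6; vowels and 'y' carry no code.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the four-character Soundex code for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # 'h' and 'w' do not break a run of same-coded consonants
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]  # pad with zeros, then keep four characters


print(soundex("voluptuousness"), soundex("velvet"))
```

Both words reduce to the same code because Soundex keeps only the initial letter and the first three consonant codes, so everything after the ‘t’ in ‘voluptuousness’ is thrown away.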

There are eight fallacies about distributed computing, common misconceptions that were first identified at Sun Microsystems in the 1990s, but well-known even before then. With the passage of time, awareness of these fallacies may have faded amongst IT people, so I’d like to remind you of them. They are: The network is reliable. Latency is zero. Bandwidth is infinite. The network is secure. Topology doesn’t change. There is one administrator. Transport cost is zero. The network is ho

I’ve always wanted a SQL function that tells me the longest substring shared between two strings. As a present to myself, I’ve written one. I hope someone else finds it useful.
SQL isn’t particularly good at searching for strings within text. If you prepare things properly by creating inversion tables (inverted indexes), suffix trees or tries so as to allow it to do exact comparisons, it is very quick, but this isn’t usually possible because data changes so quickly. You can
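
To make the idea of a ‘longest shared substring’ concrete, here is a minimal sketch in Python using the textbook dynamic-programming approach. It is not the SQL function described above, and it makes no claim about how that function works; the name `longest_common_substring` is an assumption made for the illustration.

```python
def longest_common_substring(a, b):
    """Return the longest run of characters that appears in both a and b.

    Textbook dynamic programming: curr[j] holds the length of the common
    run that ends at a[i-1] and b[j-1]. Only two rows are kept in memory.
    If several runs tie for longest, the first one found in `a` is returned.
    """
    best_len = 0   # length of the best run seen so far
    best_end = 0   # index in `a` just past the end of that run
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("database", "debased"))
```

This compares every character of one string with every character of the other, so the cost grows with the product of the two lengths. That is fine for a pair of short strings but is exactly the sort of work that a prepared index avoids.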