I have read many patent prosecution histories (on PAIR), but have never seen one that references an open-source codebase as grounds for a rejection. Instead, examiners tend to rely heavily on earlier patent applications, even when those are not well suited to documenting a rejection. Does anyone (perhaps an actual examiner?) know whether USPTO examiners actually look at existing open-source codebases for relevant prior art?

4 Answers

In general, patents/applications are heavily favored, as you noticed. One contributing factor is that the search system (EAST) is the subject of a good amount of examiner training and is actually pretty good, since it has a lot of operators that let you really search down to the sentence level. Another factor is that patents have something of their own lexicon, since most of the time attorneys are the ones writing the applications.

That being said, the PTO has access to a lot of "non-patent literature" (NPL) databases, which Examiners do use at times, although not as much as they probably should. One main reason (I found) for not using NPL is that NPL doesn't neatly lay out entire systems to the extent needed. Previous patents tend to have a lot of language in them that helps hammer out the easy parts of claims when making a rejection, whereas most technical sources focus on the core functionality. This is especially true for software, since claims are written more in terms of structure than pure algorithms.

Finally, it's really hard to search codebases for general ideas, especially if the codebases are not well documented. It's very tough to map a plain-English statement to a block of code in a way that will convince the attorney/applicant that it's truly invalidating. An analogous case is a software patent claim that incorporates an equation, which is very difficult to search for in existing sources.
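To make that mismatch concrete, here is a toy sketch of my own (it is not how EAST or any real search tool works). A naive keyword search for a claim phrase hits a prose description but misses code that does the same thing. The claim phrase, both sample texts, and the `keyword_hits` helper are all hypothetical.

```python
import re


def keyword_hits(query, text):
    """Return the query words that occur as whole words in text (case-insensitive)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [w for w in query.lower().split() if w in words]


# Hypothetical claim language an examiner might search for.
query = "moving average filter"

# Hypothetical prose description, e.g. from a journal article or an earlier patent.
prose = "The signal is smoothed with a moving average filter over a window of n samples."

# Hypothetical source code implementing the same idea, with terse names and no comments.
code = """
def smooth(x, n):
    return [sum(x[i:i + n]) / n for i in range(len(x) - n + 1)]
"""

prose_hits = keyword_hits(query, prose)
code_hits = keyword_hits(query, code)

print("prose:", prose_hits)  # every query word is found in the prose
print("code: ", code_hits)   # none of the query words appear in the code
```

The code plainly computes a moving average, but nothing in it says so, which is roughly the problem with uncommented codebases.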

If you do a little searching, you'll definitely see many non-examiners complaining about the lack of NPL cited during prosecution, a complaint that does have merit. It basically comes down to practical terms. Examiners have a very limited amount of time to do their jobs (unfortunately, and not their fault), so the quickest way to get things done will always prevail.

Side note: if you look at file histories, check out "Examiner's search strategy and results" and "Search information including classification, databases and other search related notes" to see what the Examiner is doing in terms of searching. (I still can't believe I used to know what all those EAST search keywords did.)

First, the search tools that we examiners have are tuned for searching natural language, not source code, so it's far easier to find natural-language prior art than source code prior art.

And your question assumes that most patent examiners who handle software-related applications are proficient at reading source code. Most of us are not. And there are even fewer of us who can read multiple programming languages proficiently, let alone read every language that an open source software project might use. I know C(++), Python and R pretty well, but I commonly examine applications in scientific computing, and a lot of that software is written in FORTRAN. It's more difficult for me to read the FORTRAN source code than it is to find and read the journal article that the author published about the program.

Even if I am absolutely sure that a certain program has implemented a procedure that's being claimed, and even if I have access to the source code of that program, and even if I am able to establish a clear prior art date for that source code, and even if that source code is written in a programming language I am comfortable reading, I am still very unlikely to cite that source code as prior art. The people that we write for (attorneys and other patent examiners) rarely have experience reading source code, so it takes even longer to explain the code than to just cite a source that explains it in natural language; a better document is something like an API reference or software documentation.

Searching source code for a particular concept is extremely time-consuming, even for well-documented code. Citing it in a rejection is therefore correspondingly rare.

I've cited source code on a couple of occasions when I was fairly certain that the software in question did something in a particular way and digging through the source wasn't too hard. Even then, the source code was used to get a small detail in a dependent claim, and not the core inventive concept.

Also, it is more time-consuming to cite any code or NPL, since we have to have dates in order to use the content as prior art. Many times you cannot verify the date that the content was published, and therefore it cannot be used.