Introduction

During one of my .NET projects working with Adobe PDF files, I needed to retrieve the page count of a specific file. I did not need to manipulate the PDF at all, so buying a .NET component just for this task seemed excessive.

After a few hours of searching for an easy solution, I found that good old regular expressions might hold the answer.

Opening the PDF in Notepad, I noticed that for each page in the file there is a specific character sequence: "/Type /Page" (with or without the space between the two words, depending on the PDF version). So, all we need to do is count how many times this sequence occurs in the file. One caveat: the page tree node uses the very similar sequence "/Type /Pages", which must not be counted as a page.
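
As a rough illustration of the counting idea, here is a minimal sketch that works on text already held in memory. The class and method names (PdfPageMarkers, Count) are hypothetical, and the negative lookahead that skips "/Type /Pages" is my assumption about how to keep the page tree node out of the count.

```csharp
using System.Text.RegularExpressions;

public static class PdfPageMarkers
{
    // "/Type", optional whitespace, "/Page", not followed by "s".
    // The (?!s) lookahead is an assumption: it keeps "/Type /Pages"
    // (the page tree node) from being counted as a page.
    private static readonly Regex PageMarker =
        new Regex(@"/Type\s*/Page(?!s)");

    // Returns how many page markers occur in the given PDF text.
    public static int Count(string pdfText)
    {
        return PageMarker.Matches(pdfText).Count;
    }
}
```

For example, a text containing one "/Type /Pages" entry, one "/Type /Page" entry and one "/Type/Page" entry yields a count of 2.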

Getting It Done!

First, we need to open the PDF file using a FileStream and read the contents as a string using a StreamReader.
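
The step above might look roughly like the following sketch. It is not necessarily the original listing: the names PdfPageCounter, ReadPdfText and GetPageCount are hypothetical, and the ISO-8859-1 encoding is an assumption, chosen so that every byte of the file maps to exactly one character and binary sections do not disturb the text around them.

```csharp
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class PdfPageCounter
{
    // Opens the PDF read-only and returns its raw contents as a string.
    public static string ReadPdfText(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (StreamReader reader = new StreamReader(fs, Encoding.GetEncoding("iso-8859-1")))
        {
            return reader.ReadToEnd();
        }
    }

    // Counts "/Type /Page" markers in the file, skipping "/Type /Pages".
    // The (?!s) lookahead is an assumption, not taken from the article.
    public static int GetPageCount(string path)
    {
        string pdfText = ReadPdfText(path);
        return Regex.Matches(pdfText, @"/Type\s*/Page(?!s)").Count;
    }
}
```

Calling PdfPageCounter.GetPageCount with the path to a PDF file returns the number of page markers found. ReadToEnd loads the whole file into memory, which is fine for typical documents but may be costly for very large ones.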

About the Author

Vicente Angotti has been working as a software developer for more than 12 years and he is currently an IT project manager for a Federal U.S. Government Agency.
Some of his recent projects include TCP/IP communications, Asynchronous Threading, Image Manipulation and VLDB Design using .NET.

I think I finally found the issue.
If you delete pages from a PDF but do not pack it (Reduce PDF Size under the File menu), the PDF still has the deleted pages in its text. That's why the page count fails.
Now that I have found the problem, I will try to modify the algorithm to detect and ignore the deleted pages.
I'll post a review on this article when I have the solution.