Like many others, I read a lot of articles, reports and books that come as PDF files. Fairly often, I find myself falling back on a very primitive two-part note-taking system:

system 1. Print the PDF to paper, then jot down brief margin notes, underline, mark and so on during the first, quick read.
system 2. During a second, more thorough reading, type a summary, copy and paste key quotes, and add thoughts/comments and so on in Notepad or similar. Save the notes for "article123.pdf" in "article123.txt" next to the PDF.

This common (perhaps apart from the last detail on naming/saving) system has advantages (+) and drawbacks (-).

system 1: notes on PDF printout
+ quick for short notes
+ no computer needed
+ notes located right next to the relevant source (no separate quoting needed, context is clear)
- no programmatic searching, indexing, comparison, editing, tagging
- time-consuming for long notes
- physical archive needed
- costs related to paper (economic, environmental; lessened, not removed, by FinePrint)

system 2: notes in separate plain-text files
The inverse of many of the +/- above. Also:
- rich formatting is lost when quoting
- for long notes, less overview due to plain text. Manual sectioning (1, 1.2, 1.3 ...) takes time and is error-prone.
+ independent/portable between computers, operating systems, software and so on

My impression is that many of the applications discussed in this enormous thread replace system 2 with complex software that allows rich text formats, search, indexing, editing, tagging and so on. Such replacement is not my focus here, so I'll just say that I've tried many such programs and find many of them impressive, but have not yet settled on any single one. That is mainly because committing completely to one application and then later wanting to switch after all could come at a high cost (time, format conversion issues, re-learning and so on). So I keep trying out different software. In the meantime I also often fall back on system 2, complemented with separate rich text files when needed (usually via the OpenOffice apps).

Now, what I really want to discuss is software that replaces system 1. Specifically, software that emulates and improves on manual notes on/around a printed source text and also ideally connects/combines systems 1 & 2.

I know of three types of software that do some of that: complex pdf-viewers/editors (Acrobat), bibliography tools (EndNote), general information aggregators (OneNote). I don't have a lot of experience with any of these tools but my vague impression is that they are all still pretty crude in this regard. They may lack customizability, easy portability and/or cross-document searching, indexing and tagging of notes. Maybe the OneNote type of applications are most promising for now, especially combined with input tech like tablet PCs with pens and OCR for handwritten text. Such tech is of course needed if we at some point want to import all previous, handwritten notes.

Anyway, here are three features I think a future, ideal system of this kind should have:

A. co-location: notes should be inputable/viewable right next to the relevant source section. So underlining, adding margin notes and so on should be possible in or in the context of the pdf-viewer, not (only) in a separate document in a separate window. And even additional notes in a separate document should somehow be (hyper)linked to the source text parts that they concern.

C. systematicity: cross-document searching, indexing, tagging and so on of notes should be supported.

Combining A-C seems like a very hard task. B seems to require some standardized format for such notes, a format that different applications can follow or at least import/export from/to.

One way to tackle this design problem would be to find an area with similarities to on-the-page note taking and that already has working solutions with features A-C. An interesting such comparison that I thought of is plain-text .srt-subtitle files for video: http://en.wikipedia.org/wiki/SubRip . These contain a list of paired timestamps and dialog text snippets. Put "movie123.srt" in the same folder as the movie file "movie123.avi" the subtitle was made for, start the video and chances are your media player will recognize and display the subtitle automatically.
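
For readers who have not looked inside an .srt file: each entry is a counter line, a timing line of the form "00:00:01,000 --> 00:00:04,000", one or more text lines, and then a blank line. A minimal Python sketch of reading that structure follows; the function names are made up for illustration, and it ignores real-world wrinkles such as Windows line endings, byte-order marks and formatting tags.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert a timestamp to milliseconds from the start of the video."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples, one per subtitle entry."""
    cues = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # not a complete entry
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue  # second line is not a timing line; skip the block
        values = [int(v) for v in match.groups()]
        cues.append((to_ms(*values[:4]), to_ms(*values[4:]), "\n".join(lines[2:])))
    return cues
```

The point of the comparison is how little there is to it: a position (here a time range) plus a text snippet, kept in a plain-text file separate from the media file.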

Can a similar system be had for notes in PDF files? The .srt system relies on static video files, and many pdf files are static enough too, for example pdf versions of professional journal articles. Where .srt files tie content (dialog text) to points in time (timestamps), the pdf notes can instead be tied to page number, X/Y coordinates and other properties of the original pdf text.
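
To make the idea a bit more concrete, here is a minimal Python sketch of what such a free-standing note file could look like. Everything in it is an assumption for illustration only: the ".notes.json" suffix, the JSON layout, the field names and the use of page number plus X/Y position are invented here and are not an existing standard or any viewer's actual format.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Note:
    page: int   # 1-based page number in the pdf (assumed convention)
    x: float    # horizontal position on the page (units are an assumption)
    y: float    # vertical position on the page
    kind: str   # hypothetical note type, e.g. "underline" or "margin"
    text: str   # the note itself; may be empty for a plain underline


def sidecar_path(pdf_path):
    """Map "article123.pdf" to "article123.notes.json" in the same folder."""
    return Path(pdf_path).with_suffix(".notes.json")


def dumps_notes(notes):
    """Serialize notes to JSON text, ordered by page and position."""
    ordered = sorted(notes, key=lambda n: (n.page, n.y, n.x))
    return json.dumps([asdict(n) for n in ordered], indent=2)


def loads_notes(data):
    """Rebuild Note objects from JSON text produced by dumps_notes."""
    return [Note(**item) for item in json.loads(data)]
```

As with "movie123.srt" next to "movie123.avi", a viewer that understood this hypothetical format could look for the sidecar file next to the pdf it opens and overlay the notes if one is found.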

With such a system, sharing notes would be very easy if the same source pdf files are already available: just copy/paste or upload/download the note files. Specific sites could host large note archives. Notes could be tied to articles/books through filename and DOI ( http://www.doi.org/ ) or ISBN ( http://en.wikipedia....Standard_Book_Number ) numbers. For a very rough analogy on how such a site could work, see this German site http://www.cutlist.de/ that seems to share timestamps for when to start/stop cropping when removing commercials from PVR-recorded programs (my German is bad so I'm not completely sure though).
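
One small practical detail of tying note files to a DOI or ISBN is that a DOI contains a slash, so it cannot be used directly as a file name. A possible way around that, sketched in Python below, is to percent-encode the identifier. The naming scheme ("doi_" or "isbn_" prefix, ".notes.json" suffix) and the function names are assumptions made up for this example, and "10.1000/xyz123" is only a placeholder identifier.

```python
from urllib.parse import quote, unquote

SUFFIX = ".notes.json"  # hypothetical suffix for shared note files


def note_filename(identifier, scheme="doi"):
    """Build a file-system-safe note file name from a DOI or ISBN."""
    # DOIs are case-insensitive, so normalize to lower case first.
    cleaned = identifier.strip().lower()
    # safe="" also encodes "/", which a DOI always contains.
    return f"{scheme}_{quote(cleaned, safe='')}{SUFFIX}"


def identifier_from_filename(name):
    """Reverse note_filename: return (scheme, identifier)."""
    stem = name[: -len(SUFFIX)]
    scheme, _, encoded = stem.partition("_")
    return scheme, unquote(encoded)
```

A note archive could then store and look up files by these names regardless of what the local pdf file happens to be called.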

Also, multiple notes for one single pdf file could potentially be combined and switched on/off individually, like layers in Photoshop ( http://en.wikipedia....igital_image_editing) ). Imagine a pdf viewer window with two panes. The left displays the pdf and also has a row of tabs/checkboxes on top. Someone studying ancient literature has a free pdf version of the Iliad. She then adds and compares on-the-page notes on the Iliad from scholar 1, scholar 2 and so on by clicking on the different note layer tabs. It would be as easy as browsing through the various audio tracks on a DVD disc! The right pane contains more extensive and general notes. Some hyperlinking system connects these panes, so double clicking on a section in the right pane jumps the left pane to the related pdf section and vice versa.
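
The layer idea can be sketched in a few lines of Python. This is only an illustration under assumed names: each layer is a named list of notes (one per note file), the set of enabled layers stands in for the row of checkboxes, and the viewer asks which notes to draw on the current page.

```python
from collections import namedtuple

# Minimal stand-in for a note; a real format would also carry position data.
Note = namedtuple("Note", "page text")


def visible_notes(layers, enabled, page):
    """Return (layer_name, note) pairs for one page, from enabled layers only.

    layers  -- dict mapping a layer name to its list of notes
    enabled -- set of layer names currently switched on
    page    -- page number being displayed
    """
    result = []
    for name, notes in layers.items():
        if name in enabled:
            result.extend((name, note) for note in notes if note.page == page)
    return result
```

Switching a layer on or off then only changes the "enabled" set; the note files themselves and the pdf stay untouched.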

Also, users could in the pdf viewer customize the display style for one and the same basic note file (like what a .css can do for a .html). For example, switch underline display color from red to green.
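
A tiny Python sketch of that separation between note content and display style, in the spirit of css for html. The style table, its keys and the function name are all hypothetical; the only point is that the note file stores "underline" while the colour comes from a separate, user-editable mapping.

```python
# Viewer defaults, keyed by a hypothetical note type.
DEFAULT_STYLE = {
    "underline": {"color": "red", "width": 1},
    "margin": {"color": "black"},
}


def resolve_style(kind, user_style):
    """Merge the user's overrides for one note type over the defaults."""
    style = dict(DEFAULT_STYLE.get(kind, {}))  # copy so defaults stay intact
    style.update(user_style.get(kind, {}))
    return style
```

With this arrangement, switching underlines from red to green is a change to the user's style mapping only, and the same note file still displays with the defaults for anyone else.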

Another advantage of such a standardized system is that it would allow different software developers to continually compete and allow users to switch back and forth between various programs. The notes aren't locked into one specific piece of software.

Ok, I've written long enough for now. I'd love some feedback on the ideas sketched above. I know they're a bit utopian.

Also, I have a more specific question to all readers of this thread: Do you think that there already is some good software that does something close to A-C for pdf notes?

NoD5 - thank you for a very complete and thought provoking post. I'm going to have to digest it, read it again, and probably again after that before I'll be able to respond! I use Endnote and will try to address your posts WRT it when I have given it some thought.

Hi again, NoD5 - I've re-read (OK, re-skimmed) your post and don't think that Endnote is what you're after at all. However, PaperPort Professional *might* just do it, and it's much cheaper than Acrobat Pro. Only caveat is that, as far as I know, there is no trial. They do have a 30-day, no-questions-asked return policy, though, which I have now used twice without any trouble. I am always hesitant to recommend Scansoft/Nuance apps because of their support policy (it's paid support, even if you have to report a bug ALTHOUGH you do get one free incident report with each app, each year... not that that is much of a comfort), but I've been using three of their products for two years now and have had absolutely no problems with them. I have contacted both support and sales on a number of occasions and have never been charged.

Anyway, I just tried PaperPort with a pdf and added a note to it, which was recognised by Acrobat Reader 8 without issue, although you can't edit it further within Acrobat Reader. I also verified this by opening the file up on my wife's computer, which does not have PaperPort installed, and the note is still there. Note that if the pdf is protected PaperPort can't do anything with it (ie annotate or add notes to it). There are ways around this, too, of course, but they require third-party software to be installed.

Another point in PaperPort's favour, is that it will index and allow you to otherwise manipulate, annotate, and generate all sorts of files other than pdf's. It also comes with Scansoft's SET tools, which I have found to be among the best general image editing tools available for cleaning up images, particularly scanned images. They even work on non-protected pdfs and I use them to do things like remove artefacts from pdf's that I create from scans (or have accepted from other people) - you know, when you get a pdf generated from scanned images of xerox copies of the document? Works REALLY well.

My use of PaperPort is more like semi-regular - I got a great deal on it courtesy of already owning licences for both PDF Converter Pro and OmniPage Pro. I can't really tell you about its OCR capabilities because it defers to OmniPage Pro for that (the built-in OCR capability is a dumbed down version of OmniPage, I believe). ABBYY FineReader is a fine OCR app - if you've already got that you don't need anything else if your only interest is OCR.

Actually, one of the issues that I initially had with PaperPort, other than feeling a bit like "what's the point?", was that a lot of its features seem to duplicate those that I already had with PDF Converter and OmniPage. What it adds, though, if you already have these two apps, is indexing and annotating files and the SET tools. It also comes with Stellent viewers so it previews something like 300 file formats. Note, if you DON'T have either of the other two apps, it comes with some of their capabilities. For example, it will allow you to create and manipulate pdf files without PDF Converter installed (I think it includes PDF Creator, which is another Scansoft app). In effect, too, it operates as a dual pane File Manager, though I use DOpus 9 myself. Here's a screenshot of PaperPort opening up to give you an idea of what sorts of things it will allow you to do and the interface (note that I don't have the dual pane feature enabled in the screenshot). The pane to the left shows the contents of a folder (they're all nested folders containing pdfs):

Scan in blank forms and let PaperPort automatically detect the form fields so that you can easily fill in the online forms.

What programs come with PaperPort? In addition to the PaperPort program, your software purchase includes these additional programs and components:

FormTyper™ - Available on the Send To bar, FormTyper lets you fill in any type of form that you have scanned as a PaperPort Image (.max) file to the PaperPort desktop. FormTyper automatically recognizes the blank areas in which data can be entered and creates blank fields for the data.

FTP - Available on the Send To bar, the FTP program is included with PaperPort so that you can quickly copy files in PaperPort to and from an FTP site.

PageViewer - Located on the Start menu and available from within PaperPort, PageViewer is a standalone program that opens the Page View window separate from the PaperPort desktop.

ScanDirect™ - Located on the Start menu, ScanDirect displays a small control panel you can use to scan items directly to PaperPort or other programs on your computer without first running PaperPort.

Web Capture - Located on the PaperPort Tools menu, Web Capture lets you quickly capture web pages while you are viewing them online and place them on your PaperPort desktop as PDF files or as PaperPort Image files, depending on your preference.

Web Publisher - Available on the Send To bar, Web Publisher is a PaperPort program that helps you format your image items for viewing and publishing to a web site.

Index Manager - Available from the PaperPort Tools menu or the System tray. The Index Manager lets you create, modify and schedule indexing tasks for a given local or network folder.

ScanSoft PDF Create! - Available from the PaperPort Print dialog box, and also from Microsoft Word or Excel (2000 or XP) with the Print to PaperPort (PDF) menu item on the File menu. ScanSoft PDF Create! enables you to quickly convert text documents to PDF files.

The file organising features relate to the way PaperPort will monitor folders for incoming files (with extensions that you specify) and then move them into pre-designated folders. It will also allow you to set up profiles so that it will automatically sort scans and other files into pre-designated folders. I haven't used these features, so can't comment on them... Exploring them is on my "to do" list, I just haven't gotten around to it!

There was quite a discussion about Paper Port here - you'll note that I asked a lot of questions, bought Paper Port, RMA'd it, and have subsequently bought it again (I'm fickle - actually, it's because I found it to be a horrible resource hog but after ripping it out of my system and then reinstalling it 2 months later found it quite stable... I ran it for about a week before buying it the second time. I was running it illegally (hadn't gotten around to destroying the download from the first purchase) but considered it a trial period).

I really haven't even scratched the surface of what Paper Port is apparently capable of, either in my posts in this thread or in my playing around with it...

A picture is worth a thousand words - here's a three thousand word essay on File Organisation in Paper Port 11 Professional (should have thought of this before I posted the above, but the above may be useful, too so will let it stand):

Anyway, here are three features I think a future, ideal system of this kind should have:

A. co-location: notes should be inputable/viewable right next to the relevant source section. So underlining, adding margin notes and so on should be possible in or in the context of the pdf-viewer, not (only) in a separate document in a separate window. And even additional notes in a separate document should somehow be (hyper)linked to the source text parts that they concern.

To do that, I use the Word or OpenOffice "commenting" tool, and I have an older version of Acrobat which allows me to do pretty much the same thing with pdf files. Comments can be printed or not. I almost never print stuff -- unless I have to edit it for publication.

I find that the solutions mentioned above are pretty portable (between users, computers and even OSs: Windows, OSX, Linux). I have no trouble reading my comments or notes in Linux, for instance.

C. systematicity: cross-document searching, indexing, tagging and so on of notes should be supported.

I use my own tagging system (that I've been trying to perfect a bit in the last few weeks : I simply use small textual abbreviations that I insert in comments, notes, file names, and that I use to categorise tasks, projects in Outlook or notes in EverNote), I use 2 desktop search programs (Copernic & X1) + Farr (see mouser’s section, here, at DonationCoder), and there are other applications I use a lot -- like EverNote, jedit, etc. As for the crosslinking-hyperlinking, I do it pretty much manually for now.

The secret for “success” is consistency and rigour... And I must admit I've not always been able to be up to the task. That's why I've been bothering people around the forum to find better ways to automate parts of my tagging system. I've tried keepass, tag2find, Clipboard Help + Spell, etc., etc., and I'm now coming back to AHK. We'll see.

Combining A-C seems like a very hard task. B seems to require some standardized format for such notes, a format that different applications can follow or at least import/export from/to.

It might be that I don't understand you properly, but with X1, Copernic, Farr, the right tagging and file naming system, the right organization structure, the pretty standard "commenting" ability found in many programs, and a good note taking software (myBase, EverNote, etc.), a good OCR program and a fast scanner, it doesn't seem like a terribly difficult task. Not easy -- one has to really think about all the different possible evolutions of data, media, content, etc. -- but definitely feasible. I actually almost never use paper and my computer is pretty well organized. But I'm not saying there's no room for improvement!

One way to tackle this design problem would be to find an area with similarities to on-the-page note taking and that already has working solutions with features A-C. An interesting such comparison that I thought of is plain-text .srt-subtitle files for video: http://en.wikipedia.org/wiki/SubRip . These contain a list of paired timestamps and dialog text snippets. Put "movie123.srt" in the same folder as the movie file "movie123.avi" the subtitle was made for, start the video and chances are your media player will recognize and display the subtitle automatically.

Interesting. I'll have to check that : I'd like to “comment” video sequences...

Also, multiple notes for one single pdf file could potentially be combined and switched on/off individually, like layers in Photoshop ( http://en.wikipedia....igital_image_editing) ). Imagine a pdf viewer window with two panes. The left displays the pdf and also has a row of tabs/checkboxes on top. Someone studying ancient literature has a free pdf version of the Iliad. She then adds and compares on-the-page notes on the Iliad from scholar 1, scholar 2 and so on by clicking on the different note layer tabs. It would be as easy as browsing through the various audio tracks on a DVD disc! The right pane contains more extensive and general notes. Some hyperlinking system connects these panes, so double clicking on a section in the right pane jumps the left pane to the related pdf section and vice versa.

Also, users could in the pdf viewer customize the display style for one and the same basic note file (like what a .css can do for a .html). For example, switch underline display color from red to green.

I'm not sure if I really get what you're sketching, but can't you already share comments by different authors in word or OpenOffice -- or even in acrobat, for instance?

I think I can address some of what you requested using UltraRecall (http://www.kinook.com). Here is how...

UR has a feature that I have not seen in any other PIM system. It allows you to link (or store) any document on your system e.g. pdf, doc, xls, OL items, anything. You can actually store the doc within UR database and delete it from the OS. Once done, you can, then, decide to edit this document internally within UR (using its native RTF editor), integrated IE based browser, or externally using the doc's associated default program. Even if you edit it externally, you are still editing the internally stored document. The unique feature is when you save your edits, you can synchronize it and propagate your changes to the doc that is stored in OS folders. If you edit the doc on the OS, UR will see the changes and update its stored doc accordingly, so this is a 2 way sync. Some users keep all their files on the OS, link them into UR for organization, and sync. Some keep a copy inside UR, and a 2nd one on the OS, and keep them in sync. This way they have 2 copies of any given doc all the time.

This feature is great by itself. Furthermore, you can benefit from UR's system-generated keywords when you link or store a doc. You can also define your own as you wish. The level and complexity of the tagging system is all dictated by you. Searching is as-you-type, fast, can be saved, and is very customizable.

As for comments and notes, UR associates a note page with every doc or item. This note is specific to a single doc and can be displayed right next to the doc itself and resized.

Commenting on PDF docs was something I did all the time using AcrobatReader Pro 5. It did the job beautifully. You can add notes and even make changes to the actual text of the pdf. Armando pointed to the same thing. I also used a product called RepliGo made by cerience.com. Simply put, you can print any pdf document to an RG document via its virtual printer driver. You can read it in RG and highlight whatever you like. Each highlight is a comment where you can add your own. You can then view all comments in a doc as a summary. RG is optimized for small screens and I use it all the time on my Treo. It is the best solution I've seen. However, it has not been developed in ages and I'm not sure if the company is still interested in maintaining it.

wow - cnewtonne, that sounds COOL. UltraRecall is definitely worth a look... Though, I'm actually on a software "austerity kick" which means that I am uninstalling stuff I am not using like crazy and am NOT installing anything new (if I say it often enough and loud enough it's going to be so... I'm NOT installing anything new, I'm NOT installing anything new,...), but will keep it in mind for when the pendulum swings back toward software mania!

I know what you say and I've been echoing it all along myself. I think we should suggest to Mouser to invite a psychiatrist or a software psychologist to help us cope with this condition of Information Fatigue Syndrome (IFS). This is characterized by (I think I have all of them)...

- paralysis of analytical capacity (I feel it all the time)
- anxiety and self-doubt (no software is ever good enough)
- foolish decisions and flawed conclusions, as evidenced by spending money right and left on every PIM out there only to use it for a day or 2 and never again.

Well... I just tried indexing pdf's with Paper Port and had a flashback to one of the reasons for my RMA'ing it back in March: it's unbelievably slow. Archivarius/X1/Copernic and others are able to index pdf's very quickly, Paper Port uses OCR to convert each one to text PAGE BY PAGE. Fortunately, I'm not particularly interested in having Paper Port index my pdf's (I'm happy to let archivarius handle that) but thought I should post back here for NoD5 and Armando. It may be my machine - who knows? I will likely index my pdf's a folder at a time and for big ones (I have a few that are close to a 1000 pages each) might just do them individually as time permits, but it's not really urgent in my case.

EDIT: just to elaborate slightly, I cancelled the indexing of one of my pdf folders (which had 33 main subfolders) after almost 5 hours.

Sorry for posting and then not replying for a while. Great to see so many interesting suggestions. It's hard for me to post back on all the interesting stuff you post about but I'm eagerly reading every bit of it.

urlwolf (Armando, cnewtonne, ... ), ok, I'll be sure to try the latest Adobe Acrobat trial and see how the note features have evolved.

Urlwolf (and others): You say the notes can be searched. Does that include "cross-document" searching? Also, when using an external indexing search tool like google desktop search, are such notes in pdf files also indexed or only the original pdf text?

Darwin, yes, I'm also sceptical about using EndNote for pdf notes since it can't really do margin notes, underlining and so on, on (or overlaying) the pdf pages, right? It can connect separate note files and pdf files in its bibliography database. But maybe the best solution will be a tag team of Acrobat and some database/organizer tool, and so maybe EndNote could be that tag partner.

Great descriptions and screenshots of PaperPort Professional! Sounds like another contender as a pdf note application. And they do have a 15-day trial now it seems ( http://www.nuance.com/paperport/trial/ ) so I'll try that out too. Thanks for the heads up about slow indexing also.

I think the possibility of making notes for edit-protected pdf files was one of the things that got me thinking about a system with free-standing note files layered "on top" of the regular pdf files. For personal use it is easy to remove the protection for most pdf files and I have no qualms about doing so for the sake of entering notes or something like that. But sharing notes by passing around the unlocked files might be another thing... The prospect of just sending a small note file that works for anyone with the original, still protected pdf file is more appealing. Especially if we want, like I want, large, online and free archives of these kinds of margin notes, underlinings and so on. (One obvious problem with that vision, apart from it being a dream and no actual product, is that the document properties needed to position the underlining, margin notes and so on might not be available if the document is in protected mode.)

Armando, Interesting what you write about your manual tagging system. I've made some such attempts (not as systematic as what you describe though) but often feel discouraged by my own inconsistencies from too quickly typing the various tags. I think more gui guided/programmatic tagging is the way to go really so I'll check out some of those tools you mention.

Re: hyperlinks: if journal article pdf files consistently contain DOI numbers as some type of metadata (I don't know if they do, but it wouldn't surprise me), then maybe some rather simple script could be made to extract DOI + page number, make a string and paste it into any external note file. Later, selecting that string in the notes and running the script again searches for a DOI-matching document, opens it and jumps to the matching page.
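
A rough Python sketch of the string part of that idea. It assumes the DOI and page number are already known (reading them out of a pdf is left out, since it is unclear whether the metadata is reliably there), and the "doi:...#page=N" link format, the function names and the "library" lookup table are all invented for this example. Actually opening the pdf at that page would depend on the viewer and is not shown.

```python
import re

# Hypothetical link format: "doi:<doi>#page=<n>". A DOI always starts with "10.".
# Limitation of this sketch: a DOI that itself contains "#page=" would confuse it.
LINK_RE = re.compile(r"doi:(?P<doi>10\.\S+?)#page=(?P<page>\d+)")


def make_link(doi, page):
    """Build the string to paste into an external note file."""
    return f"doi:{doi}#page={page}"


def find_links(text):
    """Return (doi, page) pairs for every link found in a piece of note text."""
    return [(m.group("doi"), int(m.group("page"))) for m in LINK_RE.finditer(text)]


def resolve(link_text, library):
    """Look up the first link in link_text; return (pdf_path, page) or None.

    library -- dict mapping lower-cased DOIs to local pdf paths (an assumption;
               it would have to be built by scanning the local pdf collection)
    """
    links = find_links(link_text)
    if not links:
        return None
    doi, page = links[0]
    path = library.get(doi.lower())
    return (path, page) if path else None
```

Because the link is plain text, it survives in any note format (notepad, rich text, a PIM) and can be searched for like any other string.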

"with X1, Copernic, Farr, the right tagging and file naming system, the right organization structure, the pretty standard "commenting" ability found in many software, and a good note taking software (myBase, EverNote, etc.), a good OCR program and a fast scanner, it doesn't seem like a terribly difficult task."

Ok, maybe such a multiple-program approach is the way to go. Do apps like myBase, Evernote,... get all their notes indexed by such local search tools? Do the indexing tools also index tags for notes made with the note applications' internal tagging tools (in contrast to manual tags as plain text phrases on the notes page)?

cnewtonne, UltraRecall sounds very powerful, at least for notes on/in non-pdf documents. If I understand it correctly, RepliGo has highlighting only on separate copies of the original pdf and seems very geared toward mobile users. I'll put it on my "to try" list, though that is getting long right now so... :-) Are the copies RG makes pure image files or is the text still searchable and so on?

Yes I'm also sceptical about using EndNote for pdf notes since it can't really do margin notes, underlining and so on on (or overlaying) the pdf pages, right? It can connect separate note files and pdf files in its bibliography database. But maybe the best solution will be a tagteam of Acrobat and some database/organizer tool and so maybe EndNote could be that tag partner.

Endnote can't open the pdfs for viewing or editing - it relies on your default viewer to do this. Endnote does allow you to make notes and lists of keywords that are tied to the library record for the pdf, but as these are part of your Endnote library, they have nothing to do with the pdf. The only way to share your notes would be to share the Endnote library, but that relies totally on the other person/people having a recent version of Endnote installed (8 or higher) if you create your library in the latest version. The only other thing Endnote does is to give you the option of storing the pdf as part of your library. This isn't necessary as you can also link to the pdf's location on your hard drive from within the library record, but this means that the link will be broken if you open the library file from a different location. This is only an issue if you, like me, use Endnote to find references and like the convenience of being able to open the reference as soon as you locate it in Endnote.

A cheaper alternative to both PaperPort ($199 for the Pro version) and to Acrobat Pro ($400 or so) might be something like FoxIt Reader - it will allow you to annotate and highlight your pdfs and there are a couple of shareware upgrades that extend these capabilities. I don't have any experience using these features and don't have FoxIt installed at the moment so can't test this for you right now. Check out the "More PDF Tools" links in the lower left pane of the page I linked to...

As I started writing this, I *thought* Foxit software was charging about $20 for each add-on but I see now that they are $60 - $100... At those prices you might as well look at the serious contenders in the non-Acrobat PDF category - Scansoft PDF Converter Pro, FinePrint PDF Factory (scroll down for the additional Pro features), Jaws, Nitro, etc. The pricing point for all of these seems to be in the $50 to $100 range, depending on whether you go for the standard or pro version. Just as a final note, you can get eXpert PDF 3 for free with the option to get the upgrade to the new version at a reduced price. It looks like the free version will do what you want, though.