disables the majority of the plugins (which are
completely useless), so Adobe, should you really be compelled to use it, will start in a jiffy

PreciousItem

A free and quick alternative PDF reader instead of Adobe's monstrosity: Foxit PDF Reader, a free and QUICK reader
for PDF documents. You can view and print them with it.
It's small (less than 1 MB to download).
It doesn't need any lengthy installation, so you can run it as soon as you have downloaded it.
It finds text inside PDF documents and lets you select and copy it.
And it starts up immediately, so you don't need to wait ages for many useless DLLs to load or for an annoying "Welcome" screen to disappear.

And if you really want to create your own PDF files... PDFCreator easily creates PDFs from any Windows program. Use it like a printer in Word, StarCalc or any other Windows application.

Introduction

OK, admittedly: pdf files are a pain in the stomach: cumbersome,
difficult to grep, search, and automate for retrieval, awkward for cutting and pasting,
clogging up your computer with the Acrobat overload. But they also have some
positive aspects, of course, hence people still use them,
and you will find USEFUL pdf-files every now and then on
the web - see for instance the very important
altavista's

Evidently, once found, you may want to fiddle with them: use them, catalogue them, grep
them, whatever. But, alas,
pdf files are annoying: they can be write protected, password protected,
whatever. Searchers should, of course, know how to overcome these small annoyances.
Also, Carpathia pointed out a useful tool for a lot of multiformat conversions: http://wheel.compose.cs.cmu.edu:8001/cgi-bin/browse/objweb
Enjoy!

PDF-related essays

[son_font.htm]:
An essay on identifying and getting hold of fonts, by sonof, February 2003,
part of the [Essays],
of the [pdf],
and of the [Targets] sections.
With a script to extract fonts from pdf files

A Response to +ORC's Message Regarding reversing PDF, by Ragica, November 1997:
the biggest collection ever (in fact
the only collection ever so far!) of information regarding hacking
PDF, with links to relevant information
(Kevin Lair CGI-hacks, the GhostScript hack, many good starting points for the USER crack)

The Word document is a very lousy format -- it just takes up
too much
space. Now you can convert all your huge documents to lightweight pdfs.
Just follow these steps blindly:
1) Go to the Start menu -> Settings -> Printers and select 'Add Printers'.
2) Select 'Local Printer' when it prompts you for a local or network printer.
3) From the list of printers select any printer ending with 'PS';
this indicates that the printer has PostScript support. I generally choose
something like 'HP Color Laserjet 5/5M PS'.
4) Click Next and in the next screen select the port FILE:.
5) Click Next again and finish. (Say no to default printer.)
6) Download and install GhostView from http://www.cs.wisc.edu/~ghost/
7) Now open the Word document that you want to convert to PDF in
winword.
8) In winword select File->Print. As the printer name select the name of
the printer that you just added, and check the option 'Print to file'. Now
click OK.
9) In the 'Print to File' Save As dialog save the file to a folder as filename.prn.
10) Now launch GhostView.
11) Use File->Open to open filename.prn in GhostView.
12) Now use File->Print. The printer setup dialog is displayed.
13) Select 'device:' as pdfwrite, select 'resolution:' as 300, select
'Print to File' and click OK. When prompted, enter the output file name as
filename.pdf.
14) That's it! You can now view your old Word document as a PDF file in
Acrobat Reader.
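
Steps 10 to 13 drive Ghostscript through the GhostView dialogs. The same conversion can probably be scripted; the following is only a minimal sketch, assuming a Ghostscript console binary is on your PATH (it is called "gs" on Unix, while on Windows the name is usually something like gswin32c, so the gs_binary parameter is an assumption you will have to adjust). The function names are made up for this illustration.

```python
import subprocess
from pathlib import Path


def ghostscript_pdf_command(ps_file, pdf_file, resolution=300, gs_binary="gs"):
    """Build a Ghostscript command line that writes a PDF via the pdfwrite device."""
    return [
        gs_binary,
        "-dBATCH",                   # exit once the input has been processed
        "-dNOPAUSE",                 # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",         # the same device that step 13 selects
        f"-r{resolution}",           # resolution, 300 as in step 13
        f"-sOutputFile={pdf_file}",  # where the PDF is written
        str(ps_file),                # the PostScript file printed in step 9
    ]


def convert(ps_file, gs_binary="gs"):
    """Convert filename.prn to filename.pdf (needs Ghostscript installed)."""
    pdf_file = Path(ps_file).with_suffix(".pdf")
    command = ghostscript_pdf_command(ps_file, pdf_file, gs_binary=gs_binary)
    subprocess.run(command, check=True)
    return pdf_file
```

Calling convert("filename.prn") should then leave a filename.pdf next to the input file.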

There are other tools that may prove useful: swftools converts anything (also pdf files) to swf

Yet there are many other methods:
One of the simplest ways
to convert into pdf is to use the software OpenOffice which is available at
openoffice.org.

It is a great replacement for Micro$oft Office package. Use the text
editor in OpenOffice and simply click on "export as pdf". Within a few
seconds (depending on the size of your file), the PDF document is
created.

OpenOffice does a lot more:
+ It replaces the entire Micro$oft Office package for home users.
+ It dispenses with the need for using Word for composing documents.
+ It is only a little over 60 MB.
+ It is free.
+ It is open source.
+ And, again, you can create PDF files by the mere click of a button.

Another possibility is the Win2pdf package, available at http://www.daneprairie.com.
It also installs as a windoze print driver, so all you need to do is print from
any application to create a PDF file. Unregistered versions are free for non-commercial use.

Converting an Acrobat
PDF into ASCII

Thanks to Wolfgang Redtenbacher for the bulk of the following
advice about
converting an Acrobat PDF
file into ASCII text.

Several
solutions, like using Ariel (that can be easily cracked) and
sending an e-mail to "pdf2txt@adobe.com", have been
suggested.

What does not seem to be known widely, however, is the fact that
there exist freeware programs to convert PDF to TXT locally.

One solution is to download Acrobat 4.0 (ca. 6 MB) from
www.adobe.com, plus the accessibility plug-in (ca. 1.2 MB)
from "access.adobe.com". This plug-in permits you to load a
PDF file into Acrobat and save it as .TXT or .HTM.

An even better solution (regarding program size and
conversion
quality) is the program "pdftotext" which is part of the
XPDF-package (a freeware PDF viewer for several operating
systems).

Move these 2 files into a directory that is in your search
path (environment variable PATH= ...), enter the command
"pdftotext xyz.pdf", and within seconds you get an ASCII
text conversion result in the file "xyz.txt" ("xyz" has to be
replaced by the real file name, of course).
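
If you have a whole pile of PDFs to convert, the command above is easy to wrap in a script. A minimal sketch, assuming "pdftotext" is reachable through your PATH as described; the helper names are invented for this example, and the output file is named explicitly rather than relying on the default.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_file, binary="pdftotext"):
    """Return the command line and the path of the text file it will produce."""
    pdf_path = Path(pdf_file)
    txt_path = pdf_path.with_suffix(".txt")  # xyz.pdf -> xyz.txt
    return [binary, str(pdf_path), str(txt_path)], txt_path


def pdf_to_text(pdf_file):
    """Run pdftotext and return the extracted text (needs xpdf installed)."""
    command, txt_path = pdftotext_command(pdf_file)
    subprocess.run(command, check=True)
    return txt_path.read_text(errors="replace")
```

Once you have the text as a string you can search or grep it as you like, which is the whole point of the exercise.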

NOTE: While the Win32 version of "pdftotext.exe" is more
compact than the DOS version (which contains additional DOS
extender code), it does not work with the widespread DOS
version of "gzip.exe" as it needs gzip with long file name
support. Therefore make sure to use either both programs in
the DOS version or both in the Win32 version. (The DOS version
runs flawlessly on a Win32 platform - it is just a bigger
EXE-file.)

The modern art of HTML2PDF conversion

by Christian Wolfgang Hujer

Hello,
>> where can i find source code for converting html files
>> to pdf format, source code being in java.
Several tools exist for that purpose.
I searched in google for "HTML Java PDF Conversion".
I found iText, which is an open source PDF library in Java that already has
capabilities of converting HTML to PDF.
On
http://www.lowagie.com/iText/links.html
are links to several other PDF engines in Java.
iText is an open source project hosted on sourceforge.
But the "modern art of HTML2PDF conversion" is the
following:
1. Make sure the HTML files are valid XHTML (or at least well-formed XML).
2. Use a transformation stylesheet that transforms HTML to XSL Formatting
Objects using XSL Transformation. Apply that stylesheet using an XSLT
processor like xt, xalan, saxon...
3. Run a Formatting Objects engine (like FOP from James Tauber / Apache
Group) that converts the generated FO-Tree from step 2 to PDF.
Most XSLT processors and most Formatting Objects engines are written in Java,
including all mentioned products (xt, xalan, saxon, fop) and come with their
source code.
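
The poster asked for Java, but the three steps can be sketched from any scripting language. What follows is only an illustration in Python, under these assumptions: a "fop" launcher script is on the PATH, and "xhtml2fo.xsl" is a hypothetical stylesheet name (you have to write or find such a stylesheet yourself). Step 1 is checked with the standard XML parser; steps 2 and 3 are handed to FOP in one call, since its command line can apply a stylesheet before rendering.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Step 1: return True if the markup parses as XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def fop_command(xhtml_file, stylesheet, pdf_file, binary="fop"):
    """Steps 2 and 3: FOP applies the stylesheet, then renders the FO tree to PDF."""
    return [binary, "-xml", xhtml_file, "-xsl", stylesheet, "-pdf", pdf_file]


# An unclosed <br> is fine for old HTML but not for XML:
print(is_well_formed("<html><body><p>Hi<br/></p></body></html>"))  # True
print(is_well_formed("<html><body><p>Hi<br></p></body></html>"))   # False
```

The list returned by fop_command("page.xhtml", "xhtml2fo.xsl", "page.pdf") can be passed to subprocess.run once FOP is installed.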
The following points must be kept in mind:
- Knowledge
XSLT and XSL:FO are 1-4 new languages to learn (depends on the point of view
and your knowledge, my opinion is that it's four languages altogether: XML,
XPath, XSLT and XSL:FO, but these are quite easy languages)
- Development speed
XSLT Stylesheets and XSL Formatting are quite easy to learn and very quick to
develop.
It takes only a little time to write an XSLT stylesheet.
- Servlet Usage
XSLT and XSL:FO are usable as servlets.
- Performance
Java native library performance might be considerably better.
- XSLT and XSL:FO are highly configurable. You control nearly every pixel
(resp. point) of the resulting PDF.
But I've never used the non-XSL:FO way, so I can't say much about that.
Just my 2 cents.
Greetings
--
Christian Wolfgang Hujer

Enabling Print-Challenged PDF Files

I've seen a number of
queries recently about printing PDF files when the Document
Security doesn't allow printing, so I thought I'd pass this
along before I file it with my notes.

With Acrobat
Reader 4.0 all that is required is to enable the Print menu
item: enable it and you can print. There is no second check before
it goes to the code that actually calls the Print Common
Dialog! This is different from Acrobat Reader 3.x, where there
is a second check, which stupidly gives you a Message Box
saying "This Operation is Not Allowed" if you try clicking on
your newly enabled Print function. By saying essentially "You
shouldn't have been able to do this", it gives you, the reverser,
something to work with to bypass it, of course.

The check
for whether printing is allowed occurs as soon as you click on the
File drop-down menu. I was using the Building Win95 Apps PDF
by Kevin Goodman, which is available here and there.

The 2nd parameter of EnableMenuItem
specifies the menu item in question and will be either
the identifier of the menu item, if the uEnable
parameter carries the MF_BYCOMMAND flag, or the relative position of the
menu item, if it carries the MF_BYPOSITION flag. The MF_BYPOSITION
flag is normally used.

We can find out the position of
the Print menu item in the drop-down list with
the GetMenuItemID function, which retrieves the menu item identifier of
a menu item located at the specified position in a menu.

UINT GetMenuItemID(
    HMENU hMenu, // handle of menu
    int nPos     // position of menu item
);

For a
regular menu item, the Return Value is an identifier. For a
submenu, the Return Value is 0xFFFFFFFF. For a separator, the
Return Value is 0.

If you cycle through the GetMenuItemID
function by setting a breakpoint on the 2nd parameter (the 1st
parameter PUSHed in SoftIce), you see an interesting pattern
forming. The first item in the drop-down list is given
the position #0 (hex) and an identifier as a return value, the
second #1 and so on, including the separators.

The
following table can be made (and I apologize for the /PRE
formatting spacing):

So, without even going to this trouble we
can deduce what the BYPOSITION position value will be for that
2nd parameter of EnableMenuItem simply by counting the number
of menu items, including separators, in the drop-down
list.

OK, now what? We know that somewhere between the
time the identifier (1775) is allocated to the Print menu item
by GetMenuItemID, and the EnableMenuItem function is called,
there is a check to see if this file is actually supposed to be
printable.

So how about doing a TRACE between the two and
see what's going on?

We want the first break to be when the
2nd parameter of GetMenuItemID (the first parameter PUSHed) is
equal to 5 (the position number of Print in the drop-down
list).

The address of the 1st parameter on the function
call stack is given by (ESP+4), the 2nd by (ESP+8), so this
works:

BPX GetMenuItemID IF *(SS:ESP+8)==5

If we
set up a macro to display this address in the data window we can
verify it broke at the right time:

MACRO Position = "dd SS:ESP+8"

and the first BPX can become:

BPX GetMenuItemID IF *(SS:ESP+8)==5 DO "Position"

Break here, press
F11 and notice the menu identifier (1775) in EAX and the position
(5) in the first line of the data window.

Set up the
second breakpoint similarly:

BPX EnableMenuItem IF *(SS:ESP+8)==5 DO "Position"

(again, we are looking at the
2nd parameter on the stack)

Then set up the Trace. You
may want to increase the Trace Buffer size from the default of
8K.

TASK will give the Taskname Acrord32

BPRW Acrord32 T will set the trace

Press F5 and you will
break back into SoftIce after the code between the two function
calls has been executed. F11 to return to Acrord32. You might
temporarily toggle out the Register/Data/Code windows with
WR/WD/WC and maximize LINES before typing SHOW to display the
trace.

SHOW 1 will show the last command
executed

000001 0137:0054C117 FF159C295700 CALL [USER32!EnableMenuItem]

and you can use the arrow keys to
scroll up and down.

A full screen's worth of this trace
gives a nice screendump with the Icedump PAGEIN N
c:\filename.txt command.

Looking back through the trace
code, you quickly see a suspicious jump. You can patch this or
force EAX to 1 a few lines back by changing

SBB EAX, EAX
INC EAX (EAX=0)

into

XOR EAX, EAX
INC EAX (EAX=1)
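
For readers not fluent in x86, the arithmetic of the two instruction pairs can be modelled in a few lines. This is only a toy model of the register values, not Acrobat code, and it assumes the carry flag is set when the original SBB executes (which is what the EAX=0 annotation implies; the instruction that sets the flag is not shown here).

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def sbb_eax_eax(carry_flag):
    """SBB EAX, EAX computes EAX - EAX - CF: 0 if CF is clear, 0xFFFFFFFF if set."""
    return (0 - carry_flag) & MASK32


def xor_eax_eax():
    """XOR EAX, EAX always clears the register, whatever the flags were."""
    return 0


def inc(value):
    """INC adds one and wraps around at 32 bits."""
    return (value + 1) & MASK32


original_cf_set = inc(sbb_eax_eax(1))    # 0xFFFFFFFF + 1 wraps to 0
original_cf_clear = inc(sbb_eax_eax(0))  # 0 + 1 = 1
patched = inc(xor_eax_eax())             # always 1, the flag no longer matters
```

So the original pair turns the carry flag into a 0-or-1 answer, while the replacement pair gives 1 unconditionally.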

I won't give the actual
addresses to patch, as that would take the fun out
;-)

Please correct any mistakes I might have
made.

Cheers,
Kayaker

The use of pdf2txt@adobe.com

This is an extract from an email I wrote some time ago... common knowledge (see also
Wolfgang Redtenbacher's contribution), but hey! it works fine for me!

More and more documents are stored in Adobe's pdf format on the Web.

That may be fine for frill-formatting purposes, but it is quite annoying for the rest
of us, since pdf files are quite cumbersome for cut & paste and for search & grepping
purposes.
I have realized that many don't know that there's a nice (email) utility by
Adobe itself for those of you that prefer plain *.txt files (that can be searched,
cut, pasted or grepped ad libitum).

Simply send an email with your pdf files attached (i.e. use the "insert file" option)
to the following email address:
pdf2txt@adobe.com

You don't need to send either text or subjects.

After a couple of minutes: "Hey bingo!" you'll get your text
file emailed back to you (for free of course).
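
If you want to script this instead of clicking "insert file", the message is trivial to build. A minimal sketch with Python's standard email package; the sender address and file name are placeholders, and whether the Adobe address still answers is something you will have to test yourself.

```python
from email.message import EmailMessage


def build_pdf_mail(sender, pdf_name, pdf_bytes, recipient="pdf2txt@adobe.com"):
    """Build a message that carries one PDF attachment, with no subject and no body."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    # Neither text nor subject is needed, so only the attachment is added.
    message.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename=pdf_name,
    )
    return message
```

The resulting message can be handed to smtplib.SMTP(...).send_message() with whatever outgoing mail server you normally use.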

I have been following your site since early 1999, and
am not sure just how active you are, but, I thought I
would send a quick note as an addendum to the PDFFING
page.

Something that was not mentioned was just how simple
getting a non-protected PDF file from a protected PDF
file really is.

First I did some searching and found out that
non-printing was actually up to the PDF viewer. That
means the software is requested by the document to not
print the document, and the software honours the
request. What does this mean? It means that if you
have the source to a viewer, you can tell it to ignore
the request. Just do a search for open source PDF
viewers, and you find xpdf, just like on your PDFFING
page. You now have the source for PDF software.

To create your own PDF software that ignores the
"print disable" information, edit the pdftops.cc file.
Find this section of the file:

Comment out that block, compile, and presto, whammy,
you can convert the .PDF to .PS using the
pdftops.exe program.

What to do now? Well, all PostScript laser printers understand
.PS files, so you can just do a dump from the command
line to the printer, or you can run Adobe Distiller on
the file and create a new PDF, or you can run
Ghostscript on the file. Anything you want.

ps: To compile on a M$ platform you need either
MSVC (big bucks), or DJGPP (free as in speech). I
didn't try with BCC (free as in beer), but it would
probably work also. The ms_make.bat file that comes
with the source package worked flawlessly for me the
first time.

Gymnast 3.5 (build 149): Gymnast converts text files to Adobe Acrobat PDF format without the need for
any additional Adobe software.
The alternative to the full Acrobat suite (which is relatively
easy to find on the web: search for
KWW500R7150122-128 or for acroba5 .zip / .rar / .ace):
Gymnast supports hyperlinks, annotations, and automatic generation
of bookmarks from headings. You can even include links to Web sites
and other PDF files. Produce professional-looking
documents for the Web. Development stopped end 1999.
Now freeware: Use this key to register your full copy:
GYM03-35672-11110-33170

And just a few days ago an e-mail from a reader, Kelly Cook, sneaked
into my inbox, claiming to have found a way to trim the fat off
Adobe Reader 6 and improve its load time to more reasonable
levels. Apparently the reader spotted this info on a Mac-related
blog and decided to try it on the Windows version. I have tried it
myself and guess what, it worked.

In my tests, I decided to use the "lowest of the low-end" system:
my Thinkpad 380ed with a Pentium I-MMX class CPU. Before the
liposuction, Adobe Reader 6 took 41 seconds to load (without any
PDF file); after the fat-removal procedure, it took 20 seconds. On
high-end systems, however, the results are more dramatic: Kelly
claims Adobe Reader 6 took over 20 seconds to load on a 1.8 GHz
Pentium 4 system, and just under two seconds after the procedure.

So here are the dirty details

1) Install Adobe Reader 6 :)
2) From the Start->Run Windows menu, open the "x:\Program
Files\Adobe\Acrobat 6.0\Reader" folder, where x is the right drive
letter.
3) Find the plug_ins folder and rename it plug_ins_disabled.
4) Create a new folder named plug_ins.
5) Copy the following files from "plug_ins_disabled" to "plug_ins":
EWH32.api, printme.api, and search.api
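
The folder shuffle above takes a minute by hand, but if you have several machines to slim down, something like the following could do it. This is a minimal sketch, not a tested installer: trim_plugins is a name made up here, the Reader path must be supplied by you, and Reader should not be running while the folder is renamed.

```python
import shutil
from pathlib import Path

# The three plugins the procedure keeps.
KEEP = ("EWH32.api", "printme.api", "search.api")


def trim_plugins(reader_dir, keep=KEEP):
    """Rename plug_ins to plug_ins_disabled and copy back only the kept files."""
    reader_dir = Path(reader_dir)
    active = reader_dir / "plug_ins"
    disabled = reader_dir / "plug_ins_disabled"
    if disabled.exists():
        raise FileExistsError(f"{disabled} already exists; was this already done?")
    active.rename(disabled)  # keep the full set as a backup
    active.mkdir()           # fresh, empty plug_ins folder
    copied = []
    for name in keep:
        source = disabled / name
        if source.is_file():  # skip plugins this installation lacks
            shutil.copy2(source, active / name)
            copied.append(name)
    return copied
```

To undo it, delete the new plug_ins folder and rename plug_ins_disabled back. Adding names such as reflow.api to the keep tuple is the scripted version of "leaving some of the fat in".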

Of course this will limit the functionality to viewing
non-encrypted pdf files, but that's exactly what I want Acrobat
^B^B^B^B^B Adobe Reader for, 99.9% of the time. You might want to
experiment leaving some of the fat in, I mean, .API files, like
reflow.api and search5.api (if it's there), and see how it affects
functionality and load times.

With the files listed, you get half the load time on low-end
systems, and a 2-sec load time on high-end ones. Still, you might
prefer to use Acrobat Reader 4.05 on old systems, since it
loads in just seven seconds instead of 20.

Your mileage might vary. Liposuction is a dangerous clinical
procedure. Consult your doctor. All lawsuits and claims should go
not to me, but to our editor and our reader Kelly. ;)

Unexpected features in Acrobat 7
(This article was contributed by Joe 'Zonker' Brockmeier)
Linux users may have been pleased to find that Adobe has finally
made available a new version of its Acrobat Reader, with
accessibility features, a much slicker interface than Acrobat 5.x
and other spiffy new features. However, there are a few other
features that Linux users should be aware of.
A company called Remote Approach is promising to alert PDF
publishers as to the "reach and use of their materials." We were
curious to find out how Remote Approach was going to make good on
its promise, given that PDF has largely been seen as a one-way
medium. To find out, we created a test account and uploaded a PDF
to be "tagged" by Remote Approach, and then downloaded the
modified document to see whether Remote Approach could log our use
of the document.
Remote Approach's reporting did not work when we viewed the
document with Kpdf, Xpdf and Adobe Reader 5.0.10. It also failed
using Apple's "Preview" application on Mac OS X. The document was
still viewable with no apparent glitch in other PDF readers, but
the reporting function did not work. However, when we opened the
file using Adobe Acrobat Reader 7, Remote Approach started logging
views from our IP address. After doing a little research, we found
that Adobe's Reader was connecting to
http://www.remoteapproach.com/remoteapproach/logging.asp each time
we opened the document. The information is submitted over port 80
using HTTP, so it is unlikely that a home or office firewall
would, in a normal configuration, block the activity, unless the
firewall administrator is attempting to block Web browsing.
Apparently, Remote Approach's "tag" to our document included the
addition of JavaScript code causing Acrobat to report back to
their server; the information reported includes the fact that the
document had been read, our IP address, and which viewer it had
been read in. (Interestingly, Remote Approach does not seem to
recognize the Linux version of Acrobat Reader, as it left the
"User Agent" field blank in its reports.)
What many Linux users may not have realized, since Adobe did not
release an Acrobat Reader 6.x for Linux, is that Adobe has added
JavaScript support to PDF and the official Acrobat readers since
Acrobat 6.x. For those interested in the JavaScript support and
its abilities in Acrobat, see Adobe's scripting reference or
scripting guide. (Both are PDFs, of course.)
By default, Adobe Reader 7 turns on JavaScript, so the "tagged"
document is able to "phone home" without the user's awareness.
Turning off JavaScript disables the document's code, and prevents
Remote Approach (or any other entity) from tracking views of the
document. No doubt, Remote Approach is using features that would
normally be used to submit information from a PDF form.
The inclusion of JavaScript in Adobe Reader 7 for Linux no doubt
provides a number of welcome features for users, but it also
raises some privacy issues. The reader does not inform the user
that information is being submitted, so users are likely to be
oblivious to the fact that another party is aware of their PDF
reading habits. While a user may not find it objectionable to
notify the publisher, there are those of us who don't care to
allow publishers to snoop on activities taking place on our
personal computers.
Lucky for us, there are plenty of alternatives to Adobe's Reader.
Free PDF readers are unlikely to adopt features allowing the
reader to silently phone home in response to code stored within
the document itself. If you must use Acrobat, however, you may
want to have a look at the JavaScript settings first.
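
Before opening a document in Acrobat you can also take a rough look at whether it carries any JavaScript at all. The sketch below simply searches the raw bytes for the /JavaScript and /JS names that a PDF uses for script actions. It is a crude heuristic and an assumption-laden one: scripts tucked away in compressed object streams, or names written with hex escapes, will not be spotted, so a negative answer proves nothing.

```python
import re

# PDF name objects that introduce a JavaScript action or its script text.
JS_MARKERS = re.compile(rb"/(JavaScript|JS)\b")


def mentions_javascript(pdf_bytes):
    """Return True if the raw bytes contain a /JavaScript or /JS name."""
    return JS_MARKERS.search(pdf_bytes) is not None


def file_mentions_javascript(path):
    """Same check for a file on disk."""
    with open(path, "rb") as handle:
        return mentions_javascript(handle.read())
```

A hit is a good reason to switch JavaScript off in the reader's settings, or to open the file in one of the alternative readers, before looking at the document.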