acroformtool

NAME

acroformtool -- quickly change the values of PDF form fields including text fields, checkboxes and radio buttons.

SYNOPSIS

acroformtool -p pdf_file

acroformtool [ -p ] [ -o output_file ] [ -f data_file ] -c field1 -x value1 -c field2 -x value2 ... pdf_file

DESCRIPTION

The tool should work with PDF documents conforming to PDF versions 1.2, 1.3 and 1.4, which correspond to Acrobat 3.0, Acrobat 4.0 and Acrobat 5.0 respectively. With the help of zlib, it can handle compressed PDF documents, provided the document was compressed using the Deflate algorithm. Support for more compression methods may be added in the future.

PDF forms are somewhat similar to HTML forms in that a PDF form is a collection of fields where each field has a name and a value. There are other similarities as well, but they are irrelevant for our discussion. An important difference is that PDF form fields constitute a tree hierarchy where each field can have a parent, a grandparent, etc. This brings us to the discussion of terminal and non-terminal fields. A terminal field has its appearance defined (that is, the PDF application knows how to draw it) while a non-terminal field does not (it is not drawable by itself). A non-terminal field is ordinarily used to propagate various properties to terminal fields.

When using acroformtool, you should always look for terminal fields. The main characteristic of a terminal field is the absence of children. A difficulty may arise because a terminal field may not have its name defined. Knowing the name is important because it is one of the few things that distinguish one field from another. The name may be unspecified because it is defined in the parent of the field instead. To alleviate the problem of retrieving the name, the output of the -p option includes the parent identifier. If a terminal field has its name unspecified, you can use the parent identifiers to walk up the field tree, stopping as soon as you find a name. That name will also be the name of all of the descendants. If none of the ancestors has a name either, it is likely that the field was not given a name when the document was created. In that case, use Adobe Acrobat to give the field a name. You can of course also use Adobe Acrobat to find the name of a field.
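The walk up the field tree can be sketched in Python. This is only an illustration under stated assumptions: the dictionary layout, the identifiers "1", "5", "6", "7" and the resolve_name helper are hypothetical and do not reflect the actual format of the -p output, which this page does not specify.

```python
# Hypothetical in-memory copy of what -p reports: each field identifier maps
# to its (possibly missing) name and the identifier of its parent.
fields = {
    "1": {"name": None, "parent": None},       # the form itself
    "5": {"name": "payment", "parent": "1"},   # non-terminal field
    "6": {"name": None, "parent": "5"},        # terminal field without a name
    "7": {"name": "city", "parent": "1"},      # terminal field with a name
}


def resolve_name(field_id, fields):
    """Walk up the parent identifiers and return the first name found.

    Returns None when neither the field nor any ancestor has a name.
    """
    seen = set()
    current = field_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against a malformed, cyclic tree
        node = fields[current]
        if node["name"]:
            return node["name"]
        current = node["parent"]
    return None


print(resolve_name("6", fields))  # name inherited from field "5"
print(resolve_name("7", fields))  # the field's own name
```

A result of None corresponds to the case where the field was never given a name and Adobe Acrobat is needed to assign one.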

Besides listing field names and their parent identifiers, the -p option also outputs the type of each field, its current value and, if applicable, the names of the appearance states (see the section about checkboxes and radio buttons below). Currently, three field types are supported: text fields, checkboxes and radio buttons.

To change the value of any of those fields, you need to supply the name of the field and its new value. This can be done in two ways. First, you can supply an arbitrary number of -c/-x option pairs (one per field) on the command line, where -c gives the name of a field and -x gives its new value. For example, to change the value of a text field named "city" to "Moscow" and check a checkbox named "big_city", one would give the following command:

path/acroformtool -c city -x Moscow -c big_city -x Yes form.pdf

The example above assumes that you have a PDF file named "form.pdf", that the file contains a PDF form and that the form has the two fields mentioned. The result of the command is placed in "out.pdf" in the directory from which acroformtool was run. You can change the name of the output file with the -o option. Note that the word "Yes" refers to the name of the "on" state (see the section about checkboxes and radio buttons below) and may in theory be any other valid string. In practice, however, the value is almost always "Yes".
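When many fields are changed from a script, the -c/-x pairs can be assembled programmatically. The following Python sketch only builds the argument list and does not run the tool. The build_command helper is hypothetical, and "path/acroformtool" is the same placeholder path used in the example above.

```python
def build_command(changes, pdf_file, output_file=None, tool="path/acroformtool"):
    """Return the argument list for one acroformtool run.

    changes maps field names to new values; every entry becomes one
    "-c name -x value" pair, in insertion order.
    """
    args = [tool]
    if output_file is not None:
        args += ["-o", output_file]
    for name, value in changes.items():
        args += ["-c", name, "-x", value]
    args.append(pdf_file)
    return args


command = build_command({"city": "Moscow", "big_city": "Yes"}, "form.pdf")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from the Python standard library, which avoids shell quoting issues because no shell is involved.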

The other method of supplying the name/value information is through a data file, the name of which can be specified with the -f option. Each line in such a file consists of the field name and its value separated by one or more spaces. Lines starting with "#" are ignored. Using the same field names and values as in the previous example, one could create a file named "input" containing the following data:

# My input file
city Moscow
big_city Yes

You would then change the command line to:

path/acroformtool -f input form.pdf
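A data file in this layout can also be generated from a script. The Python sketch below is a minimal illustration, and the format_data_file helper is hypothetical. Because name and value are separated by spaces, it assumes that a field name containing whitespace or starting with "#" cannot be expressed in the file and rejects it. This page does not say how such names are handled.

```python
def format_data_file(changes, comment=None):
    """Render field name/value pairs, one "name value" pair per line."""
    lines = []
    if comment:
        lines.append("# " + comment)
    for name, value in changes.items():
        # Assumption: a name containing whitespace or starting with "#"
        # cannot be expressed in this format, so reject it early.
        if not name or name.startswith("#") or any(ch.isspace() for ch in name):
            raise ValueError("unsupported field name: %r" % name)
        lines.append(name + " " + value)
    return "\n".join(lines) + "\n"


text = format_data_file(
    {"city": "Moscow", "big_city": "Yes"}, comment="My input file"
)
print(text, end="")
```

Writing the returned text to a file named "input" reproduces the data file shown above.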

FIELD TYPES

The details on changing each particular field type are described below. An important detail you need to remember is that all field names are case-sensitive. The names of appearance states are also case-sensitive (this becomes important when working with checkboxes and radio buttons).

Text Fields. The value of a text field is just a string. When changing it, you should remember that "(" and ")" are treated specially by PDF readers. If present, these characters need to be either balanced or escaped. Apart from that, just provide the name of the text field in the -c option (use the -p option to obtain it) and the new string value in -x.
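The balanced-or-escaped rule can be checked before a value is handed to -x. The Python sketch below assumes that acroformtool copies the value into a PDF literal string unchanged, which this page does not state explicitly. It uses the backslash escapes "\(" and "\)" that PDF literal strings define. The helper names are hypothetical, and backslashes already present in the input are not handled.

```python
def is_balanced(value):
    """Return True if every "(" is closed by a later ")"."""
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its "("
                return False
    return depth == 0


def escape_parens(value):
    """Put a backslash in front of every parenthesis."""
    return value.replace("(", "\\(").replace(")", "\\)")


def prepare_text_value(value):
    """Leave balanced values alone and escape the parentheses otherwise."""
    return value if is_balanced(value) else escape_parens(value)


print(prepare_text_value("Moscow (Russia)"))  # balanced, left unchanged
print(prepare_text_value("Smile :-)"))        # unbalanced, gets escaped
```

When such a value is typed on a shell command line, it also needs shell quoting so that the shell does not interpret the parentheses or remove the backslashes.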

When looking for a text field in the output of the -p option, look for one of type "text" and having no children. If it has a name specified, good. If not, use the parent identifier and get the name from the parent. In many cases, however, the text fields are terminal fields having no parent.

Important: if you get "no Tj in the stream" when changing a text field, open the file in Acrobat, wipe out the old contents of the text field and put in any letter. Then save the file. Try to run the program again.

Checkboxes. First, find the name of the checkbox in question. This will be the argument of the -c option. To turn a checkbox on or off, you need to supply the name of the corresponding appearance state. As you might have guessed, there are two appearance states that a checkbox can have: one that would cause the checkbox to appear checked and one that would cause it to appear unchecked. The name of the "off" state for a checkbox is always "Off". Thus, if you want to turn a checkbox off, simply supply "Off" as the value of the -x option. The name of the "on" state varies and can be just about any string but almost universally it is "Yes". To make sure, use the -p option to get its value. Once you have it, supply it as the argument of the -x option.
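The choice of the -x value for a checkbox can be written down as a small Python sketch. The checkbox_value helper and the list of state names are hypothetical; the names would be copied by hand from the -p output. The comparison is case-sensitive, matching the rule for appearance state names.

```python
def checkbox_value(states, checked):
    """Pick the -x value for a checkbox from its appearance state names."""
    if not checked:
        return "Off"  # the "off" state of a checkbox is always named "Off"
    # Whatever is not "Off" is the "on" state; its name varies per document.
    on_states = [state for state in states if state != "Off"]
    if len(on_states) != 1:
        raise ValueError("expected exactly one 'on' state, got %r" % on_states)
    return on_states[0]


print(checkbox_value(["Off", "Yes"], checked=True))
print(checkbox_value(["Off", "Yes"], checked=False))
```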

When looking for a checkbox in the output of the -p option, look for fields of type "Checkbox" that have appearance states defined. Because of the hierarchical nature of PDF forms, it is possible and rather likely to have a non-terminal field of type "Checkbox" with no state defined. All that field might be good for is getting the name of the checkbox; the appearance states are defined in one of its descendants -- a terminal field.

Radio buttons. PDF radio buttons are similar to checkboxes in that they also have an "on" and an "off" appearance state. However, there are also several differences. First, according to the PDF specification, the name of the radio button "off" state is not restricted to "Off". Second, PDF radio buttons, just like their HTML counterparts, can form radio button groups. A radio button group is a collection of radio buttons that share the same name and of which only one can be selected at a time. PDF radio buttons are organized in a sub-tree where each of them has the same parent. The value of the parent is the name of the state of the selected radio button (child). Here again, one possible source of confusion is that the children may not have the name specified. In fact, with radio button groups this is the case most of the time. Fear not, just get the name from the parent of the group by using the parent identifier. Once that is done, use the same approach as with a checkbox: give the name of the field and the state name as the arguments of the -c and -x options respectively.

If a radio button is not part of any radio button group, it behaves just as a checkbox with the already mentioned exception of non-restricted name for its "off" state.

N.B. Although the name of the radio button "off" state is not restricted to "Off", it is "Off" most of the time. If neither state is "Off" (unlikely), you will need to try each state.
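The relationship between a radio button group and its children can be modelled in a few lines of Python. This is a sketch of the structure described above, not of how acroformtool stores it. The group name "payment", the state names "cash" and "card" and both helper functions are hypothetical.

```python
# Hypothetical model of a radio button group: the children carry no name of
# their own, only the names of their "on" and "off" appearance states.
group = {
    "name": "payment",
    "value": "Off",
    "children": [
        {"on": "cash", "off": "Off"},
        {"on": "card", "off": "Off"},
    ],
}


def select(group, state):
    """Set the parent's value to the "on" state of one child."""
    if state not in [child["on"] for child in group["children"]]:
        raise ValueError("no radio button has the state %r" % state)
    group["value"] = state


def selected_index(group):
    """Return the position of the selected child, or None if none is on."""
    for index, child in enumerate(group["children"]):
        if child["on"] == group["value"]:
            return index
    return None


select(group, "card")
print(group["name"], group["value"], selected_index(group))
```

In this model, selecting "card" corresponds to passing "-c payment -x card" on the command line: the name comes from the parent and the value is the state name of the child to be selected.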

OPTIONS

The options that the program recognizes are described below.

-o

Specify the output file. If not present, the output is set to "out.pdf" in the current directory.

-c

Specify the name of the field to change.

-x

Specify the new value of the field.

-f

Specify the name of the file containing field name/value pairs.

-p

Print information about the fields and their properties including the name, type, state names (if it is a checkbox or radio button) and parent id. Note that all fields have at least one parent -- the form itself. The identifier of the form is output first when you supply the -p option.

AUTHOR

This manual page was written by Sergei Gerasenko (<gerases@users.sourceforge.net>).

REPORTING BUGS

Report bugs to <gerases@users.sourceforge.net>.

COPYRIGHT

This is free software. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.