Staplr PDF Functions
R topics documented:
get_fields
idenfity_form_fields
remove_pages
rename_files
rotate_pages
rotate_pdf
select_pages
set_fields
split_from
split_pdf
staple_pdf
staplr
get_fields
Description
If the pdftk toolkit is available on the system, it is called to get the form fields from a PDF file.
See the reference for detailed usage of pdftk.
Usage
get_fields(
input_filepath = NULL,
convert_field_names = FALSE,
encoding_warning = TRUE
)
Arguments
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
convert_field_names
By default pdftk encodes certain characters of the field names as plain-text
UTF-8 codes, so field names in a non-Latin alphabet might be illegible.
Setting this to TRUE converts the UTF-8 codes into characters. However, this
process is not guaranteed to be perfect, as pdftk does not use escape characters to
differentiate between encoded text and regular text. If you have field names
that intentionally include components that look like encoded characters, this will
attempt to convert them as well. Use this option only when necessary. If TRUE, remember
to set it to TRUE when using set_fields as well.
encoding_warning
If field names include strings that look like plain-text UTF-8 codes, the function
issues a warning by default, suggesting setting convert_field_names to
TRUE. If encoding_warning is FALSE, these warnings are silenced.
Value
A list of fields, each with type, name and value components. To use it with set_fields, edit the value
element of the fields you want to modify. If a field is of type "button", its value will be a factor. In
this case the factor levels describe the possible values for the field. For example, for a checkbox the
typical level names would be "Off" and "Yes", corresponding to the unchecked and checked states
respectively.
Author(s)
Ogan Mancarci
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
See Also
set_fields
Examples
## Not run:
pdfFile = system.file('testForm.pdf',package = 'staplr')
fields = get_fields(pdfFile)
## End(Not run)
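The Value section describes editing the value element of each field. The following is a minimal sketch of that editing step on a hand-built list; the field names, types, and values are invented stand-ins for what get_fields might return, so it runs without pdftk.

```r
# Hypothetical stand-in for the list returned by get_fields(); the field
# names, types, and values below are invented for illustration.
fields <- list(
  TextField = list(type = "Text", name = "TextField", value = ""),
  CheckBox = list(
    type = "Button",
    name = "CheckBox",
    value = factor("Off", levels = c("Off", "Yes"))
  )
)

# Text field: overwrite the value element directly.
fields$TextField$value <- "Jane Doe"

# Button field: the factor levels list the allowed states.
allowed <- levels(fields$CheckBox$value)

# Assign a new state while keeping the original levels attached.
fields$CheckBox$value <- factor("Yes", levels = allowed)
```

Rebuilding the factor with the original levels keeps the set of allowed states attached to the field after the edit.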
idenfity_form_fields
Description
Helps identify text form fields by creating a file in which each field is filled with its name. Some PDF
editors also show field names when you mouse over the fields.
Usage
idenfity_form_fields(
input_filepath = NULL,
output_filepath = NULL,
overwrite = TRUE,
convert_field_names = FALSE,
encoding_warning = TRUE
)
Arguments
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.
convert_field_names
By default pdftk encodes certain characters of the field names as plain-text
UTF-8 codes, so field names in a non-Latin alphabet might be illegible.
Setting this to TRUE converts the UTF-8 codes into characters. However, this
process is not guaranteed to be perfect, as pdftk does not use escape characters to
differentiate between encoded text and regular text. If you have field names
that intentionally include components that look like encoded characters, this will
attempt to convert them as well. Use this option only when necessary. If TRUE, remember
to set it to TRUE when using set_fields as well.
encoding_warning
If field names include strings that look like plain-text UTF-8 codes, the function
issues a warning by default, suggesting setting convert_field_names to
TRUE. If encoding_warning is FALSE, these warnings are silenced.
Examples
## Not run:
pdfFile = system.file('testForm.pdf',package = 'staplr')
idenfity_form_fields(pdfFile, 'testOutput.pdf')
## End(Not run)
remove_pages
Description
If the pdftk toolkit is available on the system, it is called to remove the given pages from the
selected PDF file.
See the reference for detailed usage of pdftk.
Usage
remove_pages(
rmpages,
input_filepath = NULL,
output_filepath = NULL,
overwrite = TRUE
)
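Arguments
rmpages a vector of page numbers to be removed
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.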
Value
Author(s)
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
# This command prompts the user to select the file interactively.
# Remove pages 3 and 6 from the selected file.
remove_pages(rmpages = c(3,6))
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:3) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
output_file <- file.path(dir, paste('Full_pdf.pdf', sep = ""))
staple_pdf(input_directory = dir, output_filepath = output_file)
input_path <- file.path(dir, paste("Full_pdf.pdf", sep = ""))
output_path <- file.path(dir, paste("trimmed_pdf.pdf", sep = ""))
remove_pages(rmpages = 1, input_path, output_path)
}
## End(Not run)
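Removing pages amounts to keeping every page that is not listed in rmpages. A small base R sketch of that relationship follows; remaining_pages and total_pages are hypothetical names introduced only for illustration and are not part of staplr.

```r
# Hypothetical helper, not part of staplr: the pages that remain after
# removing rmpages from a document with total_pages pages.
remaining_pages <- function(rmpages, total_pages) {
  stopifnot(all(rmpages >= 1), all(rmpages <= total_pages))
  setdiff(seq_len(total_pages), rmpages)
}

# Removing pages 3 and 6 from an 8-page document keeps the other six pages.
kept <- remaining_pages(rmpages = c(3, 6), total_pages = 8)
```

Under this view, passing the kept pages to select_pages as selpages should describe the same set of pages as passing rmpages to remove_pages.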
rename_files
Description
Renames multiple files in a directory and writes the renamed files back to the same directory.
Usage
rename_files(input_directory = NULL, new_names)
Arguments
input_directory
the path of the input PDF files. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
new_names a vector of names for the output files.
Value
Author(s)
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
#if the directory contains 3 PDF files
rename_files(new_names = paste("file",1:3))
## End(Not run)
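For readers who want to see the idea without an interactive prompt, here is a base R sketch of renaming every PDF in a directory. It uses only list.files and file.rename and is not necessarily how rename_files is implemented; the file names are invented, and adding the ".pdf" extension to the new names is an assumption of this sketch.

```r
# Work in a fresh temporary directory so no real files are touched.
dir <- tempfile("rename_demo")
dir.create(dir)
file.create(file.path(dir, c("a.pdf", "b.pdf", "c.pdf")))

# Collect the existing PDF files (list.files returns them sorted by name).
old_paths <- list.files(dir, pattern = "\\.pdf$", full.names = TRUE)

# One new name per existing file; the extension is added explicitly here.
new_names <- paste0("file", 1:3, ".pdf")
stopifnot(length(old_paths) == length(new_names))

# Rename in place; file.rename returns TRUE for each successful rename.
renamed <- file.rename(old_paths, file.path(dir, new_names))
result <- sort(list.files(dir))
```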
rotate_pages
Description
If the pdftk toolkit is available on the system, it is called to rotate the given pages of the selected
PDF file.
See the reference for detailed usage of pdftk.
Usage
rotate_pages(
rotatepages,
page_rotation = c(0, 90, 180, 270),
input_filepath = NULL,
output_filepath = NULL,
overwrite = TRUE
)
Arguments
rotatepages a vector of page numbers to be rotated
page_rotation An integer value from the vector c(0, 90, 180, 270). Each option sets the page
orientation as follows: north: 0, east: 90, south: 180, west: 270. Note that the
orientation cannot be changed cumulatively (e.g. 90 (east) will always turn the
page so that the beginning of the page is on the right side).
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.
Value
This function returns a PDF document with the given pages rotated.
Author(s)
Priyanga Dilini Talagala
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
# This command prompts the user to select the file interactively.
# Rotate pages 3 and 6 to 90 degrees clockwise
rotate_pages(rotatepages = c(3,6), page_rotation = 90)
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:3) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
output_file <- file.path(dir, paste('Full_pdf.pdf', sep = ""))
staple_pdf(input_directory = dir, output_filepath = output_file)
input_path <- file.path(dir, paste("Full_pdf.pdf", sep = ""))
output_path <- file.path(dir, paste("Rotated_pgs_pdf.pdf", sep = ""))
rotate_pages(rotatepages = c(2,3), page_rotation = 90, input_path, output_path)
}
## End(Not run)
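The page_rotation argument sets an absolute orientation rather than adding to the current one. The sketch below mirrors the degree-to-orientation mapping from the Arguments section; orientation_of is a hypothetical helper written for illustration and is not a staplr function.

```r
# Hypothetical helper, not part of staplr: map a page_rotation value to the
# orientation name listed in the Arguments section.
orientation_of <- function(page_rotation) {
  names_by_degree <- c("0" = "north", "90" = "east", "180" = "south", "270" = "west")
  key <- as.character(page_rotation)
  if (!key %in% names(names_by_degree)) {
    stop("page_rotation must be one of 0, 90, 180, 270")
  }
  unname(names_by_degree[key])
}

# The setting is absolute: requesting 90 twice still means "east",
# it does not accumulate to 180 ("south").
first <- orientation_of(90)
second <- orientation_of(90)
```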
rotate_pdf
Description
If the pdftk toolkit is available on the system, it is called to rotate the entire PDF document.
See the reference for detailed usage of pdftk.
Usage
rotate_pdf(
page_rotation = c(0, 90, 180, 270),
input_filepath = NULL,
output_filepath = NULL,
overwrite = TRUE
)
Arguments
page_rotation An integer value from the vector c(0, 90, 180, 270). Each option sets the page
orientation as follows: north: 0, east: 90, south: 180, west: 270. Note that the
orientation cannot be changed cumulatively (e.g. 90 (east) will always turn the
page so that the beginning of the page is on the right side).
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.
Value
Author(s)
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
# This command prompts the user to select the file interactively.
# Rotate the entire PDF document to 90 degrees clockwise
rotate_pdf(page_rotation = 90)
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:3) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
output_file <- file.path(dir, paste('Full_pdf.pdf', sep = ""))
staple_pdf(input_directory = dir, output_filepath = output_file)
input_path <- file.path(dir, paste("Full_pdf.pdf", sep = ""))
output_path <- file.path(dir, paste("rotated_pdf.pdf", sep = ""))
rotate_pdf(page_rotation = 90, input_path, output_path)
}
## End(Not run)
select_pages
Description
If the toolkit Pdftk is available in the system, it will be called to combine the selected pages in a
new pdf file.
See the reference for detailed usage of pdftk.
Usage
select_pages(
selpages,
input_filepath = NULL,
output_filepath = NULL,
overwrite = TRUE
)
Arguments
selpages a vector of page numbers to be selected
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.
Value
This function returns a PDF document containing the selected pages.
Author(s)
Granville Matheson, Priyanga Dilini Talagala
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
# This command prompts the user to select the file interactively.
# Select page 3 and 6 from the selected file.
select_pages(selpages = c(3,6))
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:3) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
output_file <- file.path(dir, paste('Full_pdf.pdf', sep = ""))
staple_pdf(input_directory = dir, output_filepath = output_file)
input_path <- file.path(dir, paste("Full_pdf.pdf", sep = ""))
output_path <- file.path(dir, paste("trimmed_pdf.pdf", sep = ""))
select_pages(selpages = 1, input_path, output_path)
}
## End(Not run)
set_fields
Description
If the pdftk toolkit is available on the system, it is called to fill a PDF form with a given list of
fields. The list of fields can be obtained with the get_fields function.
See the reference for detailed usage of pdftk.
Usage
set_fields(
input_filepath = NULL,
output_filepath = NULL,
fields,
overwrite = TRUE,
convert_field_names = FALSE,
flatten = FALSE
)
Arguments
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
fields Fields returned from the get_fields function. To make changes in a PDF, edit the
value component of an element within this list.
Author(s)
Ogan Mancarci
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
See Also
get_fields
Examples
## Not run:
pdfFile = system.file('testForm.pdf',package = 'staplr')
fields = get_fields(pdfFile)
set_fields(pdfFile,'filledPdf.pdf',fields)
## End(Not run)
split_from Splits single input PDF document into parts from given points
Description
If the pdftk toolkit is available on the system, it is called to split a single input PDF document
into parts from the given points.
See the reference for detailed usage of pdftk.
Usage
split_from(
pg_num,
input_filepath = NULL,
output_directory = NULL,
prefix = "part",
overwrite = TRUE
)
Arguments
pg_num A vector of non-negative integers. Split the pdf document into parts from the
numbered pages.
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_directory
the path of the output directory
prefix A string for output filename prefix
overwrite If an output file already exists in output_directory, should it be overwritten.
Value
This function splits a single input PDF document into parts from the given pages.
Author(s)
Priyanga Dilini Talagala and Ogan Mancarci
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
# Split the pdf from page 10
split_from(pg_num=10)
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:4) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
staple_pdf(input_directory = dir, output_filepath = file.path(dir, 'Full_pdf.pdf'))
}
## End(Not run)
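Because pg_num is a vector, a document can be cut at several points at once. The sketch below computes the page range of each resulting part in base R. It assumes that a split point p ends one part at page p and starts the next at page p + 1; that convention, the helper name part_ranges, and the total_pages argument are all assumptions made for illustration and should be checked against the actual behaviour of split_from.

```r
# Hypothetical helper, not part of staplr: page ranges of each part when a
# document with total_pages pages is cut at the pages in pg_num.
# Assumption: a split point p ends one part at page p and the next part
# starts at page p + 1.
part_ranges <- function(pg_num, total_pages) {
  # Keep only cut points that actually divide the document.
  cuts <- sort(unique(pg_num[pg_num >= 1 & pg_num < total_pages]))
  starts <- c(1, cuts + 1)
  ends <- c(cuts, total_pages)
  data.frame(part = seq_along(starts), start = starts, end = ends)
}

# Cutting an 8-page document at pages 2 and 5 yields three parts.
ranges <- part_ranges(pg_num = c(2, 5), total_pages = 8)
```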
split_pdf
Description
If the pdftk toolkit is available on the system, it is called to split a single input PDF document
into individual pages.
See the reference for detailed usage of pdftk.
Usage
split_pdf(input_filepath = NULL, output_directory = NULL, prefix = "page_")
Arguments
input_filepath the path of the input PDF file. The default is set to NULL. If NULL, the user is
prompted to select the file interactively.
output_directory
the path of the output directory
prefix A string for output filename prefix
Value
this function splits a single input PDF document into individual pages
Author(s)
Priyanga Dilini Talagala and Ogan Mancarci
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
split_pdf()
## End(Not run)
staple_pdf
Description
If the toolkit Pdftk is available in the system, it will be called to merge the PDF files.
See the reference for detailed usage of pdftk.
Usage
staple_pdf(
input_directory = NULL,
input_files = NULL,
output_filepath = NULL,
overwrite = TRUE
)
Arguments
input_directory
the path of the input PDF files. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
input_files a vector of input PDF files. The default is set to NULL. If NULL and input_directory
is also NULL, the user is prompted to select a folder interactively.
output_filepath
the path of the output PDF file. The default is set to NULL. If NULL, the user is
prompted to select the folder interactively.
overwrite If a file exists in output_filepath, should it be overwritten.
Value
this function returns a combined PDF document
Author(s)
Priyanga Dilini Talagala and Daniel Padfield
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Examples
## Not run:
staple_pdf()
## End(Not run)
## Not run:
if (requireNamespace("lattice", quietly = TRUE)) {
dir <- tempdir()
for(i in 1:3) {
pdf(file.path(dir, paste("plot", i, ".pdf", sep = "")))
print(lattice::xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
output_file <- file.path(dir, paste('Full_pdf.pdf', sep = ""))
staple_pdf(input_directory = dir, output_filepath = output_file)
}
## End(Not run)
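To give a rough idea of what calling pdftk to merge files can look like, the sketch below builds (but does not run) a pdftk command line using its cat operation. The helper merge_command is hypothetical and is not claimed to match how staple_pdf constructs its call; the file names are placeholders.

```r
# Hypothetical helper, not staplr's internals: build, without running, a
# pdftk command line that concatenates input_files into output_filepath.
merge_command <- function(input_files, output_filepath) {
  stopifnot(length(input_files) >= 1, length(output_filepath) == 1)
  paste(
    "pdftk",
    paste(shQuote(input_files), collapse = " "),
    "cat output",
    shQuote(output_filepath)
  )
}

# Placeholder file names; the command is only assembled as a string.
cmd <- merge_command(c("plot1.pdf", "plot2.pdf"), "Full_pdf.pdf")
```

Quoting each path with shQuote keeps paths that contain spaces intact if the string is later passed to a shell.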
staplr
Description
This package provides functions to manipulate PDF files, such as merging multiple PDF files into one.
Author(s)
Priyanga Dilini Talagala, Ogan Mancarci and Daniel Padfield
References
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
See Also
The core functions in this package: staple_pdf, remove_pages, split_pdf, rename_files