PDF File Format - What Is A PDF

PDF File Format - What is a PDF file

Uploaded by

dare_numero5


What is a PDF?

PDF stands for the Portable Document Format, used to display documents in an electronic form
independent of the software, hardware or operating system they are viewed on. Originally
developed by Adobe® Systems as a universally compatible file format based on the PostScript
format, it has become an international de-facto standard for exchanging documents and
information.

In 2008, Adobe relinquished control of PDF development to the ISO (International Organization
for Standardization), making PDF an “open standard”. The specification for the current PDF
version (2.0) is documented in ISO 32000-2. ISO is also in charge of updating and developing
future versions.

PDF File Format - What is a PDF file?

Portable Document Format (PDF) is a type of document created by Adobe in the 1990s. The
purpose of this file format was to introduce a standard for representing documents and other
reference material in a format that is independent of application software, hardware and
operating system. A PDF file can contain text, images, hyperlinks, form fields, rich media,
digital signatures, attachments, metadata, geospatial features and 3D objects as part of the
source document.

In most cases, existing documents are converted to PDF rather than created as new PDFs from
scratch. That doesn’t mean there is no software for creating or manipulating PDF files.


PDF File Format - Brief History

A brief timeline of the PDF file format is as follows:

1993 - Adobe Systems made the PDF specifications available free of charge

2008 - PDF was released as an open standard on July 1, 2008 and was published by the
International Organization for Standardization as ISO 32000-1:2008.

2008 - Adobe published a Public Patent License to ISO 32000-1 granting royalty-free rights for
all patents owned by Adobe that are necessary to make, use, sell and distribute PDF-compliant
implementations.

The first version of PDF was designated PDF 1.0, and it later went through revisions up to PDF
1.7. PDF 1.7, which became ISO 32000-1, also includes some non-standardized proprietary
technologies such as Adobe XML Forms Architecture (XFA) and the JavaScript extension for
Acrobat. PDF 2.0, known as ISO 32000-2:2017, was published on July 28, 2017 and doesn’t
include any non-standardized technologies.

PDF File Format Specifications

A PDF file is a sequence of bytes that can be grouped into tokens according to syntax rules
defined by the PDF specification. One or more tokens are combined to form higher-level syntactic
entities, principally objects, which are the basic data values from which a PDF document is
constructed.

File Structure of PDF Files

PDF file contents are arranged in the following sequence inside the file.

|Header |Body |Cross-Reference Table |Trailer
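The four-part layout above can be illustrated with a minimal sketch that writes a tiny, one-page PDF byte by byte. The function name `build_minimal_pdf` and the choice of three objects (catalog, page tree, one empty page) are assumptions made for this illustration; real writers handle many more cases.

```python
def build_minimal_pdf() -> bytes:
    """Assemble header, body, cross-reference table and trailer in file order."""
    # Body: three indirect objects (catalog, page tree, one empty page).
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>",
    ]

    out = bytearray(b"%PDF-1.4\n")            # header
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))              # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_offset = len(out)                    # where the xref keyword starts
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"           # entry 0 heads the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset   # each entry is exactly 20 bytes

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)
```

Writing the result of `build_minimal_pdf()` to a file in binary mode should give a blank single-page document; the point of the sketch is only to show the order of the four sections and how the offsets tie them together.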

PDF File Header

Irrespective of the PDF version, a PDF file starts with a header containing a unique identifier
for PDF and the version of the format, such as %PDF-1.x where x ranges from 0 to 7, or %PDF-2.0
for PDF 2.0.
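As a small illustration, the version can be read from the first bytes of a file. This is a sketch only; the helper name `read_pdf_version` is hypothetical, and it assumes the header sits at the very start of the data.

```python
import re

# "%PDF-" followed by a one-digit major and one-digit minor version.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def read_pdf_version(data: bytes):
    """Return (major, minor) from the header, or None if no header is found."""
    match = HEADER_RE.match(data)  # match() anchors at the start of the data
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, `read_pdf_version(open("file.pdf", "rb").read(16))` would return a tuple such as `(1, 7)` for a PDF 1.7 file, where `file.pdf` is a placeholder path.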

File Body

The body of a PDF file consists of a sequence of indirect objects representing the contents of a
document. The objects, as described above, represent components of the document such as fonts,
pages and sampled images. Beginning with PDF 1.5, the body can also contain object streams,
each of which contains a sequence of indirect objects.

Cross-Reference Table

The cross-reference table contains information that permits random access to indirect objects
within the file so that the entire file need not be read to locate any particular object. The table
shall contain a one-line entry for each indirect object, specifying the byte offset of that object
within the body of the file. (Beginning with PDF 1.5, some or all of the cross-reference
information may alternatively be contained in cross-reference streams.)
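To make the one-line entry concrete: in a classic cross-reference table each entry is a 10-digit number, a 5-digit generation number and a keyword, `n` for an object in use or `f` for a free entry. The following sketch parses a single entry; the names `XrefEntry` and `parse_xref_entry` are assumptions for this example, and cross-reference streams are not handled.

```python
from typing import NamedTuple


class XrefEntry(NamedTuple):
    offset: int       # byte offset for "n" entries (next free object number for "f")
    generation: int
    in_use: bool


def parse_xref_entry(line: bytes) -> XrefEntry:
    """Parse one table entry such as b'0000000017 00000 n \\n'."""
    fields = line.split()
    if len(fields) != 3 or fields[2] not in (b"n", b"f"):
        raise ValueError("not a cross-reference entry: %r" % line)
    return XrefEntry(int(fields[0]), int(fields[1]), fields[2] == b"n")
```

A reader that wants object 5 would look up the fifth entry, seek to its `offset`, and start parsing there instead of scanning the whole file.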

File Trailer

The trailer of a PDF file enables a conforming reader to quickly find the cross-reference table
and certain special objects. Conforming readers should read a PDF file from its end. The last line
of the file shall contain only the end-of-file marker, %%EOF. The two preceding lines shall
contain, one per line and in order, the keyword startxref and the byte offset in the decoded stream
from the beginning of the file to the beginning of the xref keyword in the last cross-reference
section.
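Reading from the end can be sketched as follows. The function name `find_startxref` and the 1024-byte window are assumptions chosen for illustration; the sketch simply looks for the last startxref keyword near the end of the data and returns the number that follows it.

```python
def find_startxref(data: bytes, window: int = 1024) -> int:
    """Return the byte offset written after the last 'startxref' keyword."""
    tail = data[-window:]                      # only inspect the end of the file
    marker = tail.rfind(b"startxref")
    if marker == -1:
        raise ValueError("startxref keyword not found")
    fields = tail[marker + len(b"startxref"):].split()
    if not fields:
        raise ValueError("no offset after startxref")
    return int(fields[0])                      # fields[1] would be %%EOF
```

A reader would then seek to the returned offset, parse the cross-reference section found there, and read the trailer dictionary that follows it.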

PDF Objects

A PDF file includes objects of the following types:

• Boolean values - represent the logical values true and false
• Numbers - integer and real values
• Strings - contain characters within parentheses
• Names - start with a forward slash (/) character, e.g. /ASomewhatLongerName results in the name
ASomewhatLongerName
• Arrays - PDF supports one-dimensional arrays. Arrays of higher dimensions can be constructed
by using arrays as nested elements
• Dictionaries - collections of objects as key-value pairs. A dictionary can have zero entries.
• Streams - represent sequences of bytes, which can be of unlimited length
• Null object - represents a null value

There can also be other syntactic elements, such as comments, which are introduced with the %
sign and may contain 8-bit characters.
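One way to see how these object types look in a file is to serialize Python values into PDF syntax. The sketch below is illustrative only: the `Name` class and `to_pdf` function are hypothetical helpers, streams are left out because they are written together with an indirect object, and names containing special characters are not escaped.

```python
class Name(str):
    """Marks a Python string as a PDF name rather than a PDF string."""


def to_pdf(value) -> str:
    """Serialize a Python value into the matching PDF object syntax."""
    if value is None:
        return "null"
    if isinstance(value, bool):           # test before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, Name):           # test before str: Name is a str subclass
        return "/" + value
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        text = f"{value:.6f}".rstrip("0")  # PDF reals have no exponent form
        return text + "0" if text.endswith(".") else text
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + escaped + ")"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_pdf(item) for item in value) + "]"
    if isinstance(value, dict):
        pairs = "".join(f" {to_pdf(Name(k))} {to_pdf(v)}" for k, v in value.items())
        return "<<" + pairs + " >>"
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

For instance, `to_pdf({"Type": Name("Page"), "Count": 1})` gives `<< /Type /Page /Count 1 >>`, and nesting lists inside lists produces the multi-dimensional arrays mentioned above.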

Indirect Objects

Any object in a PDF file may be labelled as an indirect object. An indirect object is given a
unique object identifier by which other objects can refer to it. Cross-references to these
objects are maintained in an index table, marked with the xref keyword, which follows the main
body and gives the byte offset of each indirect object from the start of the file.
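In the file, an indirect object is written as `N G obj ... endobj` and referred to elsewhere as `N G R`, where N is the object number and G the generation number. The sketch below finds both with regular expressions; the function names are hypothetical, and a plain byte scan like this is a simplification, since it can be fooled by binary stream data that happens to contain the same patterns.

```python
import re

# Object number, generation number, then the keyword "obj" or the operator "R".
OBJ_RE = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")
REF_RE = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+R\b")


def index_objects(data: bytes) -> dict:
    """Map (object number, generation) to the byte offset of its definition."""
    return {
        (int(m.group(1)), int(m.group(2))): m.start()
        for m in OBJ_RE.finditer(data)
    }


def find_references(data: bytes) -> list:
    """List every (object number, generation) referenced with 'N G R'."""
    return [(int(m.group(1)), int(m.group(2))) for m in REF_RE.finditer(data)]
```

The offsets returned by `index_objects` are the same values a cross-reference table stores, which is why damaged tables can often be rebuilt by scanning the body.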

Linear and Non-Linear PDF Layouts

PDF layouts are categorized as linear and non-linear depending upon the target applications and
other factors.

Non-Linear - Non-linear PDF files use less disk space than linear PDF files. The pages of the
document are scattered across the PDF file, which is why non-linear files are slower to access
than linear files.

Linear PDF - Targeting online PDF viewers, linear PDF files are constructed in such a way that
they are written to disk in a linear fashion. Browser plugins therefore do not require the whole
document to load before displaying it.
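A linear (linearized) file announces itself with a dictionary containing a /Linearized key near the start of the file. The following check is a rough heuristic under that assumption, not a validation of the layout; the function name `looks_linearized` and the 1024-byte window are choices made for this sketch.

```python
def looks_linearized(data: bytes, window: int = 1024) -> bool:
    """Heuristic: a /Linearized key appears within the first bytes of the file."""
    return b"/Linearized" in data[:window]
```

A viewer that sees this marker can expect the first page's objects to follow early in the file, while a file without it has to be read via the cross-reference information at the end.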

Objects Overview

As mentioned, the PDF body is a collection of the objects described above. PDF is largely based
on PostScript, but without the control features of programming languages such as if and loop
commands. Commands issued by PostScript code to generate graphical content are collected and
tokenized, along with any files, graphics or fonts referred to by the document. All of this
content is accumulated into a single file, resulting in composed PostScript output.

Text

Text in PDF is represented by text elements which are actually displayed with glyphs from fonts.
A glyph is a graphical shape and is subject to all graphical manipulations, such as coordinate
transformation. Because of the importance of text in most page descriptions, PDF provides
higher-level facilities to describe, select, and render glyphs conveniently and efficiently.

Graphics

The graphics operators used in PDF content streams describe the appearance of pages that are to
be reproduced on a raster output device. The facilities are intended for both printer and display
applications. The graphics operators form six main groups:

• Graphics state operators manipulate the data structure called the graphics state, the global
framework within which the other graphics operators execute. The graphics state includes the
current transformation matrix (CTM), which maps user space coordinates used within a PDF
content stream into output device coordinates. It also includes the current colour, the current
clipping path, and many other parameters that are implicit operands of the painting operators.
• Path construction operators specify paths, which define shapes, line trajectories, and regions of
various sorts. They include operators for beginning a new path, adding line segments and curves
to it, and closing it.
• Path-painting operators fill a path with a colour, paint a stroke along it, or use it as a clipping
boundary.
• Other painting operators paint certain self-describing graphics objects. These include sampled
images, geometrically defined shadings, and entire content streams that in turn contain
sequences of graphics operators.
• Text operators select and show character glyphs from fonts (descriptions of typefaces for
representing text characters). Because PDF treats glyphs as general graphical shapes, many of
the text operators could be grouped with the graphics state or painting operators. However, the
data structures and mechanisms for dealing with glyph and font descriptions are sufficiently
specialized that they are treated as a separate group.
• Marked-content operators associate higher-level logical information with objects in the content
stream. This information does not affect the rendered appearance of the content; it is useful to
applications that use PDF for document interchange.
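Two short sketches may help tie the operator groups together. The first shows how a CTM written as six numbers [a b c d e f] maps a user space point to device space; the second is a small content stream, held in a Python bytes constant, that uses operators from several of the groups. The names `apply_ctm` and `CONTENT_STREAM` are hypothetical, and `/F1` is an assumed font resource name that a real page would have to define.

```python
def apply_ctm(ctm, x, y):
    """Map a user space point through a matrix given as (a, b, c, d, e, f)."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


# Scale by 2 in both directions, then translate by (10, 20).
scaled = apply_ctm((2, 0, 0, 2, 10, 20), 5, 5)   # (20, 30)

# A content stream is a sequence of operands followed by operators.
CONTENT_STREAM = b"""\
q                       % graphics state: save the current state
2 0 0 2 10 20 cm        % graphics state: concatenate a matrix onto the CTM
1 0 0 rg                % graphics state: set the fill colour to red
100 100 200 50 re       % path construction: append a rectangle
f                       % path painting: fill the path
BT                      % text: begin a text object
/F1 12 Tf               % text: select font resource F1 at size 12
110 120 Td              % text: move to the start of the line
(Hello) Tj              % text: show a string of glyphs
ET                      % text: end the text object
Q                       % graphics state: restore the saved state
"""
```

The `cm` line in the stream uses the same six numbers as the `apply_ctm` call, so every coordinate that follows it is scaled and shifted in the way the function computes.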
