
6 Important Factors When Choosing A PDF Library: Adam Pez

1) Choosing a PDF library requires considering several important factors to avoid hidden costs, such as restrictions on features and platforms, difficulties adding features internally, and poor user experience from slow or crashing performance. 2) Developers commonly underestimate the complexity of PDFs and the challenges of building custom PDF functionality without a full-featured commercial library. This can lead to delays, extra development costs, and an inability to meet user needs. 3) It is important to thoroughly test any library with the types of complex documents that will be used, especially at scale, to avoid performance and rendering issues that negatively impact users. Commercial libraries may have advantages in terms of support.


6 Important Factors when Choosing a PDF Library

ADAM PEZ
Overview

You plan to embed PDF functionality into an application. But before you dive into the project, you must decide: do you go with a more expensive commercial PDF SDK, or a lower-cost alternative such as an open-source library or an open-source wrapper?

There are non-trivial costs to switching later. Developers have to re-learn the new library, re-adjust the backend, customize the UI to match what users are accustomed to, as well as migrate documents, form data, annotations, and more.

According to market research conducted by Stax Inc., the average Net Promoter Score (NPS) for the top five PDF SDK vendors is 35%. And 70% of customers express interest in switching despite the high costs.

This dissatisfaction implies that picking the right PDF SDK is a lot harder than it seems. And to help you avoid the same mistakes as past implementations, we’ve written this article.

(We also recently surveyed 57 unique organizations that switched from PDF.js to a commercial SDK. Read our comprehensive guide to PDF.js to learn more.)

Restrictions on Features and Platforms

A first mistake organizations make when selecting a PDF library for the first time is to assume fixed feature requirements. But these are likely to evolve.

Users start to ask for more functionality as they grow dependent upon a PDF SDK. An organization will then have to consider saying no to user feature requests; building time-intensive and challenging customizations on top; or integrating additional libraries and thus adding more complexity, maintenance overhead, and risk.

Additionally, a library may work fine initially on the main platforms preferred by your users. But later, if you wish to expand, you may find that the library does not support the platforms you want, or that the APIs are inconsistent, with different classes and methods across platforms, so your engineers have to start from scratch.

To avoid this hidden cost, go with an SDK with a broad feature-set across multiple platforms, providing you the flexibility to grow down the road.

Maybe big companies can absorb the costs of maintaining three-to-four different relationships with different vendors, each with a different code base, different roadmaps, and different problems. I’m not saying it isn’t possible.
— Kalsefer Co-Founder and CEO, Avshi Segev

Unanticipated Difficulty Adding Features

Another mistake organizations make is selecting a basic library to save money, on the assumption that they can build anything needed on top.

But building in an unfamiliar domain can easily lead to unknown challenges, high expenses, and reduced speed to market. And PDF is unusually complex, as high-profile teams attest, including those of Slack, Dropbox, and LinkedIn.

Your devs are not necessarily PDF experts, and attempting challenging PDF features in-house involves a steep learning curve not subject to economies of scale. Throwing more devs into the equation will not shrink the ramp-up time for the first developer.

Additionally, custom features will have to be supported and maintained long-term, creating an additional ongoing opportunity cost: committed resources will be less available to work on other projects.

PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated a hundred different ways, all of which a renderer needs to handle gracefully.
— Developer, LinkedIn

PDFs are complex documents, structured into different layers of information, data, and objects, and containing different languages, images, and graphics.
— Developer, Slack

PDF is an incredibly complex file format: the specification is more than a thousand pages long, not including the extensions and supplements.
— Developer, Dropbox

Organizations that we’ve spoken to have found the most challenging features are those that require engaging PDF at a low level, where objects are defined in PDF byte code, with unique byte offsets for different objects, making it difficult for devs unfamiliar with PDF’s inner workings to parse and manage these objects correctly.

Challenging PDF functionality includes:

• Managing PDF annotations from multiple users (e.g., synchronization and versioning)
• PDF generation (creating PDFs from scratch or from other documents)
• Page manipulation (add, merge, or remove)
• Layers (via Optional Content Groups)
• Color management features (e.g., ink-color separations, overprint, etc.)

While it is certainly possible to build the above in-house, PDF features can consume a shocking amount of time. And you eventually may have to decide whether to continue, or whether to bite the bullet and abandon months or years of work for an alternative that can meet your requirements cost-effectively.

To avoid this type of hidden cost, you will want to carefully consider the capacity of your existing development team should you decide to build, maintain, and support custom PDF features in-house, as these features often prove time-intensive and challenging.
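To give a feel for the byte offsets mentioned above, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not a real parser. It assembles a tiny PDF with a classic cross-reference table, then reads the `startxref` pointer and the table back to locate each object. The function names `build_minimal_pdf` and `read_xref_offsets` are hypothetical. Real-world files add cross-reference streams, incremental updates, compression, and malformed input, none of which this sketch handles.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble a tiny one-page PDF, recording the byte offset of each object."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # where "N 0 obj" starts in the file
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def read_xref_offsets(pdf: bytes) -> dict:
    """Map object number to byte offset for a single-section classic xref table."""
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf[-1024:])
    if match is None:
        raise ValueError("no startxref marker found")
    pos = int(match.group(1))
    if not pdf.startswith(b"xref", pos):
        raise ValueError("startxref does not point at an xref table")
    lines = pdf[pos:].split(b"\n")
    first, count = (int(part) for part in lines[1].split())
    offsets = {}
    for i in range(count):
        offset, _generation, kind = lines[2 + i].split()
        if kind == b"n":  # "n" marks an in-use object, "f" a free one
            offsets[first + i] = int(offset)
    return offsets


if __name__ == "__main__":
    pdf = build_minimal_pdf()
    for number, offset in read_xref_offsets(pdf).items():
        # Every recorded offset must land exactly on the object header.
        assert pdf.startswith(b"%d 0 obj" % number, offset)
        print(f"object {number} starts at byte {offset}")
```

If a single byte is inserted or removed anywhere before the table, every later offset is wrong, which is one reason hand-rolled editing of PDF internals tends to be fragile.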

...you shouldn’t build anything that’s available off the shelf because it’s not a source of competitive advantage if everybody else can avail themselves of it. The only scenario where you should build is if it’s your core technology -- the core source of your competitive differentiation and competitive advantage.
— Mark Holst-Knudsen, President ThomasNet @
MIT’s 2014 CIO Symposium

Poor UX: Slow Performance, Crashing, and Inaccurate Rendering

Another source of hidden costs can be a poor user experience, especially as users start to upload more massive and complex documents that crash or freeze a lower-quality viewer. Construction Computer Software encountered these issues with a free PDF viewer add-on to its flagship estimation software.

As is often the case with a lower-quality library, PDFs render incorrectly. You then have to wait on the vendor to respond. But a reseller or a smaller company with many remote developers may have difficulty providing the same turnaround time and specialized support and service as a commercial SDK vendor. If they did not build the rendering engine themselves, they may not be able to fix the issue, or fixes may take a long time, because they have difficulty finding where in the code the problem originated. If you go with open-source, you may have to fix bugs yourself.

A lower-quality library also encounters performance and memory issues, such as frustratingly long wait times on large documents and complex documents that crash the viewer. This is often due to the absence of features such as PDF tiling, parallelization, and linearization that a more mature PDF SDK will incorporate.

Some solutions (e.g., image servers) perform excellently when tested on a small number of documents and users but then inflict unexpected hidden costs when scaled up. When hundreds or thousands of users later view, mark up, comment on, and otherwise interact with (i.e., scroll, pan, and zoom) documents, server resource and network data usage explodes. To maintain your desired UX, you have to pay higher fees or invest in more servers.

The following types of documents and tasks have much more demanding rendering requirements:

• CAD-based PDFs such as construction and engineering drawings with very large and complex designs.
• Reports, textbooks, and marketing material using advanced PDF graphics such as shadings, gradients, soft masks, and patterns.
• Geospatial maps with OCG layers that are switched off by default.
• Pre-press documents which require an SDK with advanced color management features to print colors accurately.
• High-speed accurate rendering (especially on native mobile apps and mobile browsers).
• Content extraction of tables, text, etc. with document structure (e.g., text read order or table arrangement) intact.

To prevent crashes, slowness, and rendering issues from disrupting your UX, test functionality with the types of documents your users will work with. Also test a server-based solution at the anticipated load and usage.

If you’re looking for a PDF reader for the first time, you better make sure it can read 100% of your PDF files. Because if your client-base starts relying on that PDF reader, exactly what happened to us, they still want the absolute best quality.
— Tony Cornwall, Construction Computer Software
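The scaling concern raised in this section for server-rendered viewers can be made concrete with a back-of-the-envelope estimate. The Python sketch below is an assumption-laden illustration: the `ViewerLoad` class, its field names, and every number in it are hypothetical and do not come from any vendor or measurement. It assumes each user interaction (scroll, pan, or zoom) causes a fixed number of image tiles to be rendered and sent.

```python
from dataclasses import dataclass


@dataclass
class ViewerLoad:
    """Hypothetical load model for a server that renders pages as image tiles."""

    concurrent_users: int
    interactions_per_minute: float  # scrolls, pans, and zooms per user
    tiles_per_interaction: int      # tiles re-rendered and sent per interaction
    tile_kilobytes: float           # average size of one rendered tile

    def tiles_per_second(self) -> float:
        per_user = self.interactions_per_minute / 60 * self.tiles_per_interaction
        return self.concurrent_users * per_user

    def megabits_per_second(self) -> float:
        # kilobytes -> kilobits (x8) -> megabits (/1000), decimal units
        return self.tiles_per_second() * self.tile_kilobytes * 8 / 1000


if __name__ == "__main__":
    # Illustrative figures only: a 10-user pilot versus a 2,000-user rollout.
    pilot = ViewerLoad(10, 6, 12, 50)
    rollout = ViewerLoad(2000, 6, 12, 50)
    for label, load in (("pilot", pilot), ("rollout", rollout)):
        print(f"{label}: {load.tiles_per_second():.0f} tiles/s, "
              f"{load.megabits_per_second():.1f} Mbit/s")
```

Under these made-up inputs the pilot needs 12 tiles per second (4.8 Mbit/s), while the rollout needs 2,400 tiles per second (960 Mbit/s). In this simple model load grows linearly with users, so a pilot that feels fast says little about production cost unless it is tested at the anticipated load.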

Low Adoption on a Complex UI

In 2018, AEC-software company PlanGrid partnered with FMI to survey nearly 600 construction leaders from around the world to discern why construction and engineering software succeed or fail. The findings report “Construction Disconnected” identified a complex UI and inadequate user training as two of the top five reasons for why technology fails.

Being able to slim down the interface and tailor feature-sets to specific user groups is proven to significantly cut down training costs and improve user adoption. (See our OEC Graphics success story to learn more.)

However, a closed-source UI will limit you in what you can customize, and you may not be able to fully fix the UX. (And by the time you’ve discovered this, it may be too late.) A closed-source UI will make it difficult to evaluate how deeply you can customize, optimize, and add new tools or annotation types to the UI. Therefore, your team may build out a proof of concept and make their plans for future expansion, only to have to scale back their ambitions or wait on the vendor to adjust the API. A black box UI will prove especially problematic if your UI team is very strict or if you have unique UI requirements (e.g., accessibility compliance requirements such as ADA/508).

To avoid this hidden cost, choose a vendor with an open-source UI or make certain your proof of concept won’t need to change.

Security Issues

When writing PDF features from scratch, developers may be tempted to take shortcuts to save time. But these shortcuts cause the solution to become obsolete quickly as devs run into the exact security issues a more mature toolkit makes a lot easier to solve.

One recent instance our solution engineers have noted is where developers use JavaScript-based submit buttons on forms rather than uploading and parsing data out of forms, which opens up the system to phishing and man-in-the-middle attacks. Someone could easily edit the button to have it send personal information to another server, and then maliciously re-circulate the form within your organization or send it to end users.

Vendor Lock-in

Lastly, consider how your data and documents will be stored. For example, annotations stored in a proprietary format, such as Brava! annotations and some JSON-based formats, will not be accessible to users who want to view their annotations with other tools such as Adobe Acrobat. Moreover, it will be challenging to migrate these annotations later if you wish to switch solutions.

A vendor who manages annotations in the ISO standard for annotations interchange, XFDF, for example, will eliminate this hidden cost.

The Bottom Line

The best way to avoid hidden costs associated with the wrong PDF library is to perform due diligence during your evaluation. To assist you in this process, we’ve written a blog with several considerations you can add to your PDF SDK evaluation checklist.

We hope this article was helpful! If you have any questions, don’t hesitate to contact us.
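As a rough illustration of why the XFDF format mentioned under Vendor Lock-in helps with portability, the Python sketch below reads annotations out of an XFDF document using only the standard library. The sample document, the reviewer names, and the `list_annotations` helper are invented for illustration. The element and attribute names reflect our understanding of XFDF and should be checked against the specification before being relied upon.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by XFDF documents (assumption: verify against the spec).
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

# Invented sample data: one highlight and one sticky note.
SAMPLE_XFDF = """\
<xfdf xmlns="http://ns.adobe.com/xfdf/">
  <annots>
    <highlight page="0" rect="100,700,300,720" color="#FFFF00"
               title="reviewer-a" name="a1"
               coords="100,720,300,720,100,700,300,700">
      <contents>Check this clause</contents>
    </highlight>
    <text page="1" rect="50,50,70,70" title="reviewer-b" name="a2">
      <contents>Approved</contents>
    </text>
  </annots>
</xfdf>
"""


def list_annotations(xfdf_text: str) -> list:
    """Return one dict per annotation: type, page index, author, and text."""
    root = ET.fromstring(xfdf_text)
    annots = root.find("x:annots", XFDF_NS)
    if annots is None:
        return []
    result = []
    for element in annots:
        # Tags look like "{namespace}highlight"; keep only the local name.
        kind = element.tag.split("}", 1)[-1]
        contents = element.find("x:contents", XFDF_NS)
        result.append({
            "type": kind,
            "page": int(element.get("page", "0")),
            "author": element.get("title"),
            "text": contents.text if contents is not None else None,
        })
    return result


if __name__ == "__main__":
    for annotation in list_annotations(SAMPLE_XFDF):
        print(annotation)
```

Because the annotations live in plain XML rather than a vendor-specific store, a short script like this can inventory them before a migration, which is much harder with a proprietary format.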
