One solution for protecting PHP source code

Conference Paper · April 2014


DOI: 10.15308/sinteza-2014-616-619


All content following this page was uploaded by Nenad Ristić on 10 June 2014.



SINTEZA 2014: Data security
Impact of Internet on Business Activities in Serbia and Worldwide
(Uticaj Interneta na poslovanje u Srbiji i svetu)

DOI: 10.15308/sinteza-2014-616-619

ONE SOLUTION FOR PROTECTING PHP SOURCE CODE

Nenad Ristić¹, Aleksandar Jevremović²

¹ Sinergija University, BIH
² Singidunum University, Serbia

Abstract:
Protecting PHP scripts from unwanted use, copying and modification is a major problem today. Existing solutions at the source code level mostly work as obfuscators; they are free but do not provide any serious level of protection. Solutions based on encoding opcode are more secure, but they are commercial and require a closed-source, proprietary extension for the PHP interpreter. Furthermore, encoded opcode is not compatible with future versions of the interpreter, which means buying the encoder again from its authors. Finally, if the source code of the extension is compromised, all scripts encoded by that solution are compromised too. In this paper we present a novel model for a free and open-source PHP script protection solution.

Key words:
PHP, interpreted languages, source code, protection, encryption.

INTRODUCTION

Within the last few years, PHP has established itself as the most pervasive web platform in the world, running on more than a third of all web servers. PHP's development has been not only quantitative but also qualitative. Every day more businesses rely on PHP to run business-critical applications. This creates new job opportunities and increases the demand for PHP developers. While the difficulty of getting started with PHP remains very low, the possibilities offered by PHP today allow developers to go far beyond simple HTML applications. The revised object model allows large-scale projects to be written more efficiently using standard object-oriented methodologies.

New XML support makes PHP the best language available for processing XML and, coupled with new SOAP support, an ideal platform for creating and using Web Services [1]. PHP is one of the most popular languages for Web development. By December 2013, PHP was being used by a remarkable 81.7% of sites according to W3Techs - World Wide Web Technology Surveys [2]. One of the really significant problems for PHP developers today is the lack of free, high-quality solutions for protecting the source code of PHP Web applications. “Protecting source code” usually means two things: 1) preventing the source code from being viewed or modified by others and 2) limiting execution of the protected application to a specific Internet domain or time period.

At this time, there are some solutions for protecting PHP source code, which generally belong to two key groups. The first group contains PHP source code obfuscators, which are usually free, work with source code and provide a very low level of protection. The second group contains PHP encoders, which work with PHP opcode and thus provide a higher level of protection, but are commercial and require a proprietary closed-source PHP extension in the production environment.

Even though PHP encoders provide a higher level of protection, there are two main problems with using them. The first problem is the limited lifetime of the encoded product, because the source code is converted to opcode by the current version of the PHP interpreter, and then the opcode is encoded. This means that the encoded solution becomes unusable with future versions of the PHP interpreter that significantly change how source code is transformed to opcode. Because of this, developers are forced to buy new versions of encoders and to recompile the source code whenever the PHP version on the Web server is upgraded. Frequent replacement of application files in a production environment is usually a painful task.

The second problem with using PHP encoders is the dependency on a “trusted third party”, the author of the encoder. This means that the whole security of the application depends on the company that develops the encoder. If the encoder or extension is compromised (its source code is revealed to the public), all solutions encoded with that encoder are compromised, too. Additionally, encoded PHP scripts are not protected from the encoder's authors.

In this paper we analyse the possibilities and issues involved in creating a solution for high-quality protection of PHP scripts. Standard cryptology models are used for this analysis. Based on the results of this analysis, we propose a novel model for a solution that provides solid protection of PHP source code at both the source code and opcode levels and is not based on a trusted third party.

CURRENT SOLUTIONS

Obfuscating source code

Obfuscation is a technique that transforms original source code into a far more complex, confusing and unreadable variant, while preserving code semantics [3]. This technique is used to prevent reverse engineering or make it less efficient, while providing the same functionality with equal or similar performance. Obfuscation is usually done by replacing the names of variables and user-defined functions with meaningless ones, by removing comments and formatting, and by encoding the source code with built-in or user-defined encoding functions. Source code obfuscation is very similar to source code optimization, with the difference that obfuscation tries to maximize obscurity while still minimizing execution time.

The conclusions of [4], [5] lead to the following statement of the code obfuscation problem. Given a set of obfuscating transformations $Ot = \{Ot_1, Ot_2, \ldots, Ot_n\}$ and a program $P$ consisting of source code objects (classes, methods, statements, etc.) $\{So_1, So_2, \ldots, So_k\}$, find a new program $P' = \{\ldots, So'_j = Ot_i(So_j), \ldots\}$ such that:
◆ $P'$ has the same functionality as $P$, that is, the transformations preserve semantics.
◆ The obscurity of $P'$ is maximized, that is, understanding and reverse engineering $P'$ is more time consuming than understanding $P$.
◆ The resilience of every obfuscating transformation $Ot_i(So_j)$ is maximized, that is, either it is too complex and difficult to construct an automated tool that undoes the obfuscating transformations, or applying such a tool is extremely time consuming.
◆ The stealth of each obfuscating transformation $Ot_i(So_j)$ is maximized, that is, the statistical properties of $So'_j$ are similar to those of $So_j$.
◆ The execution time delay of $P'$ caused by the obfuscating transformations is minimized.

Obfuscation can give good results when the developer wants to prevent software crackers from understanding parts of the code and then illegally including them in other Web applications. However, this technique shows poor results when used to restrict usage of the protected solution to a specific domain name or time period because, in this case, pirates do not need to understand a large portion of the code, but only to identify the place where the limitations are defined. Because opcode modified in this way is not compatible with the original interpreter, this technique is limited to use with source code only.

Encoding/encrypting

Both encoding and encryption are reversible data transformation techniques, but they differ in an essential way. Encoding uses a publicly known transformation algorithm whose parameters, if any, are also publicly known. In other words, anyone who knows which encoding algorithm was used can decode the encoded data. Encryption, on the other hand, is based on a secret parameter (key) used in the transformation procedure. The transformation algorithm itself is usually assumed to be publicly known, but it need not be.

Most major PHP encoders today actually act as encrypters, because they rely on some secret component that prevents the encoded source code or opcode from being decoded. This component is usually the algorithm itself (sometimes combined with an encryption parameter such as a project ID), which explains why the PHP interpreter extensions for these encoders are closed source. As noted earlier, that creates an unwanted dependency on the encoder provider.

Protecting source code vs protecting opcode

There are significant differences between protection at the source code level and protection at the opcode level. The main advantage of protection at the source code level is compatibility with future versions of the PHP interpreter, while its main disadvantage is that the original source code can be revealed if the protection is broken.

The main advantage of protection at the opcode level, on the other hand, is that the original source code cannot be revealed, while its main disadvantage is the limited lifetime of protected scripts. When opcode is protected, the source code is processed by the current version of the PHP interpreter before it is encoded/encrypted, and opcode is far less compatible with future versions of the PHP interpreter than source code is. This means that developers often need to re-encode the original source code whenever the hosting provider upgrades to the next major version of the PHP interpreter. Since encoded scripts are replaced in the production environment, it is natural to want to do this as rarely as possible. In addition, re-encoding source code for a new PHP version usually implies buying a new version of the encoder.

ANALYSIS WITH CRYPTOGRAPHY MODELS

From the cryptology aspect, source code protection can be seen as establishing a secure communication channel between the developer and the PHP interpreter. The instructions, in the form of source code, represent the secret message; they are encrypted and can only be decrypted by the final interpreter. Using the standard characters that represent the different roles in secret communication, Alice and Bob are the developer and the PHP interpreter, while Eve is everyone else, including the Web server administrator. As in all widely used systems, the encryption algorithm is considered to be publicly available.
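The channel model can be made concrete with a small sketch. It is only an illustration under stated assumptions: Python is used in place of PHP, the cipher is a toy keystream built from SHA-256 in counter mode (a real tool would use a vetted cipher), and the names `keystream` and `xor_cipher` are hypothetical. What it shows is the assumption of the model itself: the algorithm is public and only the key is secret.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (SHA-256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Alice (the developer) holds the script and the key.
script = b"<?php echo 'licensed to example.com'; ?>"
key = secrets.token_bytes(32)
ciphertext = xor_cipher(key, script)

# Bob (the interpreter) recovers the script only with the same key.
recovered = xor_cipher(key, ciphertext)

# Eve knows the algorithm but not the key, so her attempt yields noise.
eve_guess = xor_cipher(secrets.token_bytes(32), ciphertext)
```

The sketch says nothing about how Bob obtains `key`; that key management question is what the rest of the analysis is concerned with.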

Fig. 1. Cryptology model of secured communication channel

Following the presented model, the main question that arises is how to manage the key(s) used for the encryption/decryption procedures. This question leads to another one: is symmetric or asymmetric cryptography more appropriate in this case? Additionally, trust in Bob's integrity must be reconsidered.

Symmetric or asymmetric cryptography

With symmetric cryptography, the same key is used for encryption and for decryption. This raises the question of who generates the key (Alice or Bob) and how the key is distributed to the other party. If the PHP interpreter generates the key, that key can be stored locally, maybe even inside the interpreter's binaries. However, how can this key be securely distributed to the developer so that no one else, not even the system administrator, has access to it? The same problem, in an even more pronounced form, remains when the developer generates the key and needs to distribute it to the interpreter.

Fig. 2. Architecture using symmetric cryptography

With asymmetric cryptography, the need for a secure channel is eliminated. The developer can encrypt PHP source code with the PHP interpreter's public key, so the encrypted source code can only be decrypted with the PHP interpreter's private key, which is stored in a secure location. Additionally, the developer can digitally sign the source code with his own private key, so that the interpreter can be sure it comes from the developer. This is useful when limiting an application to work only with files developed by the same developer.

Fig. 3. Architecture using asymmetric cryptography

The lifetime of the protected solution in this case is limited by source code compatibility with future PHP interpreter versions, or by the lifetime of the digital certificate (which can be unlimited), whichever comes first. However, the problem of where the PHP interpreter's private key is stored, and how it is used, remains. A potential solution is to store the private key within the interpreter's binary, so that only a reverse engineering attack is possible. However, the behaviour of the interpreter is not guaranteed, because its source code can be changed and the modified interpreter used instead.

Main problem

As the previous examples show, it makes no difference whether symmetric or asymmetric encryption is used: the place where the key used for source code decryption is stored remains the main problem in protecting source code. Another part of this problem is that the PHP interpreter is open source, so it can be modified to expose the decrypted source code before executing it. This implies that we cannot trust Bob's integrity, which means that the PHP interpreter on the Web server must be considered as Eve, too.

Reverse engineering

Reverse engineering aims at obtaining high-level representations of software systems from known low-level artefacts, such as binaries, source code, execution traces or historical information. Reverse engineering methods and technologies play an important role in many software engineering tasks and quite often are the only way to gain an understanding of large and complex software systems [6].

From the cryptology aspect, the reverse engineering process is analogous to cryptanalysis: the pirate is trying to read or modify a message that is not intended to be seen or modified by him. An ideal solution to this problem would be one that cannot be reverse engineered even if reverse engineering is attempted at the CPU level.

The question that arises is how deep we need to go in order to provide another trusted party in the secret communication, that is, a component that will securely execute our programs in environments controlled and eavesdropped on by potential pirates. Is it possible to have such a component as open source, without relying on a secret possessed by a disputed “trusted third party”, the author of that component? And finally, even if a solution to this problem exists, will its price and complexity be appropriate for cheap shared hosting environments?

For the purpose of this paper, we set our goal to make protected PHP scripts as safe as if they were written in a compiled language (C, for example). This also means that protection from assembler-level reverse engineering is not included in the proposed solution.

PROPOSED SOLUTION

Based on the results and insights from the cryptology-based analysis, we propose a novel solution model that provides protection of PHP scripts at both the source code and opcode levels, and is not based on a trusted third party. The protection level of the proposed solution is equal to that of currently available commercial solutions based on closed-source components.
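The central idea of the model, encrypting compiled code with a key generated by the developer so that only a loader holding that key can run it, can be illustrated in miniature. The sketch below is not the authors' implementation and makes several assumptions: Python bytecode stands in for PHP opcode, a toy SHA-256 keystream stands in for a real cipher, and the function names `protect` and `run_protected` are hypothetical.

```python
import hashlib
import marshal
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def protect(source: str, key: bytes) -> bytes:
    """Developer side: compile the source to bytecode, then encrypt it."""
    code = compile(source, "<protected>", "exec")
    return xor_cipher(key, marshal.dumps(code))


def run_protected(blob: bytes, key: bytes) -> dict:
    """Loader side: decrypt the bytecode and execute it in a fresh namespace."""
    code = marshal.loads(xor_cipher(key, blob))
    namespace = {}
    exec(code, namespace)
    return namespace


key = secrets.token_bytes(32)            # freshly generated, known to the developer
blob = protect("answer = 6 * 7", key)    # what would be uploaded to the server
result = run_protected(blob, key)["answer"]
```

In this sketch the loader receives the key as an argument. In the model proposed here, the key would instead be built into a compiled and obfuscated extension. Like PHP opcode, Python's `marshal` format is tied to the interpreter version, so the encrypted blob inherits the limited lifetime discussed earlier.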

Fig. 4. Model of proposed solution

The architecture of the proposed solution is shown in Fig. 4. The two main components of the proposed solution are the PHP source code compiler/encrypter and an open-source extension for the original PHP interpreter. An additional component is a random key generator, for which any (pseudo)random generator can be used.

The PHP source code compiler works like a regular interpreter, converting source code to opcode, except that the result (opcode) is encoded/encrypted with a freshly generated key that is known only to the developer. The encoded opcode can be executed only by a PHP interpreter that knows the secret key. To increase the lifetime of encoded scripts, the encoder can also encode source code directly (with obfuscation if selected), without transforming it to opcode. However, the protection level in this case is significantly lower, because a potential pirate can capture the (obfuscated) source code as the output of the extension. So although working with source code instead of opcode is supported, it is not recommended, because the source code could be revealed by a PHP interpreter modified by the eavesdropper.

The other component of the proposed solution is an open-source (publicly available) extension for decoding the encoded PHP scripts described above. However, this extension is completely unusable without the key that was used for encoding the PHP scripts. That is why the extension is compiled to binary by the developer, and during that process the key is built into the resulting binary. To hide the location where the key is stored in the extension binary, the extension compiler randomly obfuscates the extension's source code before compiling, by adding random code snippets and false keys as variables that have no impact on the extension's behaviour. This means that two independent compilations of the extension, even with the same key, give completely different binaries.

The next step for the developer is to upload the encoded PHP scripts and the compiled extension to the shared Web server and to enable the extension when executing the protected scripts. A downside of the proposed solution is that the server administrator must allow users to load their own binary extensions.

CONCLUSION

In this paper we analysed the problem of protecting intellectual property in the form of source code written in interpreted languages. PHP, as the most prevalent interpreted language for Web development, is used as an example. Our main analysis is based on standard cryptology models, which are used for analysing existing solutions as well as for the search for an ideal theoretical model.

The essential problem in protecting PHP scripts, analysed as a cryptology model, is the lack of another trusted party in the secured communication. The PHP interpreter, in the role of the other party in the secured communication, is open-source software whose behaviour is publicly known and can be modified by potential eavesdroppers/pirates.

Source code obfuscation is identified as computationally secure protection, while breaking it (by a human) is analogous to cryptanalysis. However, the need for the source code to remain understandable by the interpreter eliminates obfuscation as a serious standalone protection.

Source code or opcode encryption, on the other hand, requires a trusted decryption party, at least in the form of a secured space where the key used for decryption is stored and used. This is not possible with a completely open-source PHP interpreter. Using closed-source components for decryption, as existing PHP encoders do, creates a security and commercial dependency on the encoder provider.

The model presented in this paper proposes a hybrid approach in which all components that provide script protection (or secure communication, from the cryptology aspect) are publicly available and open source. However, the decryption component is realized as an extension of the PHP interpreter and is obfuscated and compiled by the developer. The key used for encrypting the PHP scripts is integrated into that extension during the compilation procedure.

REFERENCES

[1] Gutmans, A., Bakken, S., & Rethans, D. (2004). PHP 5 Power Programming (Bruce Perens' Open Source Series). Prentice Hall PTR.
[2] http://w3techs.com/technologies/overview/programming_language/all (available on 03.01.2013)
[3] Cho, S., Chang, H., & Cho, Y. (2008). Implementation of an Obfuscation Tool for C/C++ Source Code Protection on the XScale Architecture. In Proceedings of the 6th IFIP WG 10.2 International Workshop on Software Technologies for Embedded and Ubiquitous Systems. ISBN: 978-3-540-87784-4, DOI: 10.1007/978-3-540-87785-1_36
[4] Collberg, C., Thomborson, C., & Low, D. (1998, January). Manufacturing cheap, resilient, and stealthy opaque constructs. In Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (pp. 184-196). ACM.
[5] Collberg, C., Thomborson, C., & Low, D. (1998, May). Breaking abstractions and unstructuring data structures. In Proceedings of the 1998 International Conference on Computer Languages (pp. 28-38). IEEE.
[6] Pinzger, M., & Antoniol, G. (2013). Guest editorial: reverse engineering. Empirical Software Engineering, 18(5), 857-858.
