12 Software Security
Read Chapter 11
Software Security
• Many vulnerabilities result from poor programming practices
– The Open Web Application Security Project (OWASP) Top Ten list of critical Web application
security flaws includes five software-related flaws:
• unvalidated input,
• cross-site scripting,
• buffer overflow,
• injection flaws,
• improper error handling.
• These flaws are a consequence of insufficient checking and validation of data and
error codes in programs
• Awareness of these issues is a critical initial step in writing more
secure program code
• Software error categories:
– Insecure interaction between components
– Risky resource management
– Porous defenses
CWE/SANS TOP 25 Most Dangerous Software Errors
(2022) (1/3)
• Software Error Category: Insecure Interaction between Components
2. Improper Neutralization of Input During Web Page Generation (“Cross-site
Scripting”)
3. Improper Neutralization of Special Elements used in SQL Command (“SQL
Injection”)
4. Improper Input Validation
6. Improper Neutralization of Special Elements used in OS Command (“OS Command
Injection”)
9. Cross-Site Request Forgery (CSRF)
10. Unrestricted Upload of File with Dangerous Type
12. Deserialization of Untrusted Data
17. Improper Neutralization of Special Elements used in a Command (“Command
Injection”)
21. Server-Side Request Forgery (SSRF)
24. Improper Restriction of XML External Entity Reference
25. Improper Control of Generation of Code (“Code Injection”)
CWE/SANS TOP 25 Most Dangerous Software Errors
(2022) (2/3)
• Software Error Category: Risky Resource Management
1. Out-of-bounds Write
5. Out-of-bounds Read
7. Use After Free
8. Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
11. NULL Pointer Dereference
13. Integer Overflow or Wraparound
19. Improper Restriction of Operations within the Bounds of a Memory Buffer
22. Concurrent Execution using Shared Resource with Improper Synchronization
('Race Condition')
23. Uncontrolled Resource Consumption
CWE/SANS TOP 25 Most Dangerous Software Errors
(2022) (3/3)
• Software Error Category: Porous Defenses
14. Improper Authentication
15. Use of Hard-coded Credentials
16. Missing Authorization
18. Missing Authentication for Critical Function
20. Incorrect Default Permissions
Reducing Software Vulnerabilities
• The NIST report NISTIR 8151 presents a range of
approaches to reduce the number of software
vulnerabilities
• It recommends:
– Stopping vulnerabilities before they occur by using improved
methods for specifying and building software
– Finding vulnerabilities before they can be exploited by using
better and more efficient testing techniques
– Reducing the impact of vulnerabilities by building more resilient
architectures
• The report emphasizes the need for software developers to address these known
areas of concern and provides guidance on how this is done.
Software Quality vs Security
• Software quality and reliability
– Concerned with the accidental failure of a program as a result of some
theoretically random, unanticipated input, system interaction, or use of
incorrect code
– Can be improved using structured design and testing to identify and eliminate
as many bugs as possible from a program
– Concern is not how many bugs, but how often they are triggered
• Software security is related, but the failures are deliberately induced
– Attacker chooses the probability distribution, specifically targeting bugs that
result in a failure that can be exploited by the attacker
– Triggered by inputs that differ dramatically from what is usually expected
– Unlikely to be identified by common testing approaches
Defensive Programming (1/2)
• Design and implement software so that it continues to function even
when under attack
• Requires attention to all aspects of program execution, the environment,
and the type of data it processes
• Software should be able to detect erroneous conditions resulting from an
attack
• Also called secure programming
• Key rule is to never assume anything, check all assumptions and
handle any possible error states
Abstract Model of a Program
• Concepts taught in most introductory programming courses. A program:
– reads input data from a variety of possible sources,
– processes that data according to some algorithm,
– then generates output, possibly to multiple different destinations,
– executes in the environment provided by some OS, using the machine instructions of
some specific processor type.
• While processing the data, the program will use system calls, and possibly other programs
available on the system.
• These may result in data being saved or modified on the system or cause some other side
effect as a result of the program execution.
• All of these aspects can interact with each other, often in complex ways
Defensive Programming (2/2)
• Programmers often make assumptions about 1) the type of inputs a
program will receive and 2) the environment it executes in
– Assumptions need to be validated by the program and all potential failures handled
gracefully and safely
• Requires a changed mindset compared with traditional programming practices
– Programmers have to understand how failures can occur and the steps needed to
reduce the chance of them occurring in their programs
• Conflicts with business pressures to keep development times as short
as possible to maximize market advantage
• Security and reliability: common design goals in most engineering
disciplines
– society not tolerant of bridge/plane failures
• Software development not as mature
– much higher failure levels tolerated
• Recent years have seen increasing efforts to improve secure software
development processes
• A number of software development and quality
standards exist
– their main focus is the general development lifecycle
– they increasingly identify security as a key goal
• Software Assurance Forum for Excellence in Code (SAFECode)
– Develop publications outlining industry best practices for software assurance
and providing practical advice for implementing proven methods for secure
software development
• Incorrect handling of program input is a very common failing
• Input is any source of data from outside the program whose value is not
explicitly known by the programmer when the code was written
– data read from keyboard, file, network
– also execution environment, configurable data
• Must identify all data sources
• Explicitly validate assumptions on size and type of values before use
Input Size & Buffer Overflow
• Programmers often make assumptions about the maximum expected
size of input
– e.g. that user input is only a line of text
– they size the buffer accordingly (e.g. 512 bytes) but fail to verify the input size
– the result is a buffer overflow
• Testing may not identify the vulnerability since it focuses on “normal,
expected” inputs
– Test inputs are unlikely to include large enough inputs to trigger the overflow
• Safe coding treats all input as dangerous
– hence must process it in a manner that does not expose the program to danger
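The size check described above can be sketched in a few lines. This is a minimal illustration in Python rather than C, so there is no raw buffer to overflow; the point is only the habit of enforcing the assumed maximum instead of trusting it. The names `MAX_LINE_BYTES` and `read_line_bounded` are hypothetical, and the 512-byte limit simply mirrors the buffer size mentioned above.

```python
import io

# Assumed limit, mirroring the 512-byte buffer in the example above.
MAX_LINE_BYTES = 512


def read_line_bounded(stream, limit=MAX_LINE_BYTES):
    """Read one line from a binary stream, rejecting lines longer than `limit` bytes.

    The limit counts the line terminator as part of the line.
    """
    # Ask for one byte more than the limit so an over-long line can be detected
    # without reading an unbounded amount of data.
    data = stream.readline(limit + 1)
    if len(data) > limit:
        raise ValueError(f"input line exceeds {limit} bytes")
    return data.rstrip(b"\r\n")


if __name__ == "__main__":
    print(read_line_bounded(io.BytesIO(b"alice\n")))
```

A caller would typically catch the `ValueError`, log it, and reject the request rather than truncating the input silently.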
Interpretation of Program Input
• Program input may be binary or text
– binary interpretation depends on encoding and is usually
application specific
• When processing binary data, the program assumes some interpretation of
the raw binary values as representing integers, floating-point numbers,
character strings, or some more complex structured data representation.
– also need to validate interpretation before use
• e.g. filename, URL, email address, identifier
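To make the point about binary interpretation concrete, the sketch below parses a made-up record layout and validates each assumption before using the values. The layout (a 16-bit record id, an 8-bit name length, then the name) and the function name `parse_record` are assumptions chosen for illustration, not a real protocol.

```python
import struct

# Hypothetical layout: 16-bit record id, 8-bit name length, network byte order.
HEADER = struct.Struct("!HB")


def parse_record(raw: bytes):
    """Interpret raw bytes as (record_id, name), validating each assumption first."""
    if len(raw) < HEADER.size:
        raise ValueError("record too short for header")
    record_id, name_len = HEADER.unpack_from(raw)
    body = raw[HEADER.size:]
    # The declared length is attacker-controlled, so compare it with what arrived.
    if len(body) != name_len:
        raise ValueError("declared name length does not match data")
    try:
        name = body.decode("ascii")
    except UnicodeDecodeError as exc:
        raise ValueError("name is not ASCII text") from exc
    # Validate the interpreted value, not just its encoding.
    if not name.isalnum():
        raise ValueError("name contains unexpected characters")
    return record_id, name


if __name__ == "__main__":
    print(parse_record(struct.pack("!HB", 7, 3) + b"bob"))
```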
Injection Attacks
• Injection attack refers to a wide variety of program flaws relating to
invalid handling of input data, which then influences program execution
– often when passed as a parameter to a helper program or other utility or
subsystem
– input data (deliberately) influences the flow of execution
• Most often occurs in scripting languages (JavaScript, Perl, PHP,
Python, sh, etc.)
– Such languages encourage reuse of other programs/modules and system
utilities where possible to save coding effort
– often seen in web CGI scripts
Unsafe Perl Script
• The finger command returns some basic details on
the specified UNIX user
• This is an example of command injection
• Attacker provides a value that includes shell
metacharacters, for example: xxx; echo
attack success; ls -l finger*
• Countermeasure: the defensive programmer should
validate input
– compare to a pattern that rejects invalid input
[Figures: unsafe Perl finger CGI script; finger form; expected and subverted finger CGI responses; safety extension to Perl finger CGI script]
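The original Perl listings are not reproduced here, so the following is a stand-in sketch in Python of the same countermeasure: match the input against a pattern before using it, and invoke the helper program without a shell. The user-name policy in `USERNAME_RE` and the path `/usr/bin/finger` are assumptions; adjust both for a real system.

```python
import re
import subprocess

# Assumed policy: 1-32 characters, lowercase letters, digits, "_" or "-",
# not starting with a digit or "-".
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_valid_username(value: str) -> bool:
    """Return True only if the whole value matches the allowed pattern."""
    return USERNAME_RE.fullmatch(value) is not None


def finger(user: str) -> str:
    """Run finger for a validated user name without involving a shell."""
    if not is_valid_username(user):
        raise ValueError("invalid user name")
    # An argument list (no shell) means metacharacters such as ";" are never
    # interpreted as command separators. The program path is an assumption.
    result = subprocess.run(
        ["/usr/bin/finger", user],
        capture_output=True, text=True, check=False, timeout=10,
    )
    return result.stdout


if __name__ == "__main__":
    print(is_valid_username("xxx; echo attack success; ls -l finger*"))
```

Either measure alone would stop the attack string shown above; using both means a mistake in one does not immediately become exploitable.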
SQL Injection
• Another widely exploited injection attack
• Occurs when user-supplied input is used to construct an SQL request to retrieve
information from a database
– similar to command injection
– SQL metacharacters are the concern
– must check and validate input for these
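Alongside validating input for metacharacters, a widely used complementary defense (not named in the slide itself) is the parameterized query, where user input is bound as data and never spliced into the SQL text. The sketch below uses Python's standard `sqlite3` module; the `users` table and the helper names are made up for the example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user(conn: sqlite3.Connection, name: str):
    """Look up a user; `name` is bound through the "?" placeholder as plain data."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    print(find_user(db, "alice"))
    # The classic injection string is treated as an (unmatched) literal name.
    print(find_user(db, "' OR '1'='1"))
```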
• A further variant is code injection
• Input includes code that is then executed by the attacked system
– Many buffer overflow attacks include a code injection component
• injected code is binary machine language for a specific computer system
– Another form is injection of scripting language code into remotely executed scripts
– see the PHP remote code injection vulnerability
• exploits the combination of a variable, global field variables, and remote include
– this type of attack is widely exploited
An XSS Example
• Guestbooks, wikis, blogs, etc.
• Where comment includes script code
– e.g. to collect cookie details of viewing users
• Need to validate data supplied
– including handling various possible encodings
• Attacks both input and output handling
• When this text is viewed, it displays a little text and then executes the JavaScript code.
This replaces the document contents with the information returned by the attacker’s cookie
script.
[Figures: plain XSS example; encoded XSS example]
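On the output-handling side, one common measure is to escape user-supplied text before placing it in a page, so that markup in a comment is displayed rather than executed. The sketch below uses `html.escape` from the Python standard library; `render_comment` is a hypothetical helper, and real applications would normally rely on a templating engine that escapes by default.

```python
import html


def render_comment(comment: str) -> str:
    """Return an HTML fragment in which the comment can only appear as text."""
    # quote=True also escapes " and ', which matters inside attribute values.
    return "<p>" + html.escape(comment, quote=True) + "</p>"


if __name__ == "__main__":
    print(render_comment("Thanks! <script>document.location='http://example.invalid/'</script>"))
```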
Validating Input Syntax
Alternate Encodings
• Text may be encoded in multiple ways
– due to the structured form of the data, e.g. HTML, or the use of large character sets
• Growing requirement to support users around the globe and to interact
with them using their own languages
• Unicode is used for internationalization
– originally used 16-bit values for characters; code points now extend to U+10FFFF
– UTF-8 encodes each character as a 1-4 byte sequence
– lenient decoders accept redundant (overlong) variants: e.g. “/” is 2F, but also C0 AF and E0 80 AF
• Some attacks attempt to supply an absolute pathname for a file to a
script that expects only a simple local filename
– ensure that the supplied filename does not start with “/” and does not contain any “../”
• Must canonicalize input before checking
– Transform input data into a single, standard, minimal representation
– Once this is done the input data can be compared with a single representation of
acceptable input values
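A small sketch of "canonicalize, then check" for the filename case above. It decodes the bytes with Python's strict UTF-8 decoder, which rejects overlong sequences such as C0 AF, normalizes the path, and only then applies the test for a simple local filename. The function name `safe_local_filename` and the exact acceptance policy are assumptions for illustration.

```python
import posixpath


def safe_local_filename(raw: bytes) -> str:
    """Canonicalize a supplied filename and accept it only if it is a simple local name."""
    # Strict decoding raises UnicodeDecodeError (a ValueError) on overlong or
    # otherwise malformed UTF-8, so alternate encodings of "/" cannot slip past.
    name = raw.decode("utf-8", errors="strict")
    if "\x00" in name:
        raise ValueError("embedded NUL character")
    # Collapse forms such as "./a", "a//b" and "a/../b" into one representation.
    canonical = posixpath.normpath(name)
    # After normalization a simple local filename contains no "/" and is not
    # "." or ".."; absolute paths and "../" escapes both fail this check.
    if "/" in canonical or canonical in (".", ".."):
        raise ValueError("not a simple local filename")
    return canonical


if __name__ == "__main__":
    print(safe_local_filename(b"./report.txt"))
```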
Validating Numeric Input
• Additional concern when input data represents numeric values
• Internally stored in fixed-size values
– e.g. 8, 16, 32, or 64-bit integers
– 32, 64, or 96-bit floating-point numbers, depending on the processor
– signed or unsigned
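As an illustration of why size and signedness matter, the sketch below validates a decimal string against the range of a signed 32-bit integer and shows how the same bit pattern reads when treated as unsigned. The helper names are hypothetical; Python integers are unbounded, so the range check here stands in for the limit a C or database field would impose.

```python
import re

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# Accept only an optional sign followed by ASCII digits.
_DECIMAL_RE = re.compile(r"[+-]?[0-9]{1,10}")


def parse_int32(text: str) -> int:
    """Parse a decimal string, rejecting anything outside the signed 32-bit range."""
    if _DECIMAL_RE.fullmatch(text) is None:
        raise ValueError("not a plain decimal integer")
    value = int(text, 10)
    if not INT32_MIN <= value <= INT32_MAX:
        raise ValueError("value does not fit in a signed 32-bit integer")
    return value


def as_uint32(value: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned (same bits, different meaning)."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(parse_int32("-1"), as_uint32(-1))
```

The second helper shows the classic hazard: a negative length that passes a signed comparison becomes a very large value once it is used as an unsigned size.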
Input Fuzzing
• Developed by Professor Barton Miller at the University of
Wisconsin Madison in 1989
• Powerful software testing method using a large range of randomly
generated inputs to a program
– Intent is to determine whether program/function correctly handles abnormal
inputs
– simple, free of assumptions, cheap
– assists with reliability as well as security
• Can also use templates to generate classes of known problem
inputs
– Disadvantage is that bugs triggered by other forms of input would be missed
– Combination of approaches is needed for reasonably comprehensive coverage
of the inputs
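A toy fuzzing harness, combining the two approaches above: a short list of known problem inputs followed by randomly generated byte strings. Everything here (`fuzz`, `KNOWN_PROBLEM_INPUTS`, the two sample parsers) is hypothetical and far simpler than a real fuzzer, which would also track coverage and minimize failing inputs.

```python
import random

# Template-style cases: inputs known to cause trouble in many programs.
KNOWN_PROBLEM_INPUTS = [b"", b"\x00", b"A" * 4096]


def fuzz(target, runs=1000, max_len=64, seed=0, expected=(ValueError,)):
    """Call `target` with problem inputs and random bytes; return unexpected failures."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    cases = list(KNOWN_PROBLEM_INPUTS)
    for _ in range(runs):
        length = rng.randrange(max_len + 1)
        cases.append(bytes(rng.randrange(256) for _ in range(length)))
    failures = []
    for data in cases:
        try:
            target(data)
        except expected:
            pass  # a clean rejection counts as correct handling
        except Exception as exc:
            failures.append((data, exc))
    return failures


def fragile_parser(data: bytes) -> int:
    """Buggy sample target: assumes at least one byte is present."""
    return data[0]


def robust_parser(data: bytes) -> int:
    """Fixed sample target: rejects empty input explicitly."""
    if not data:
        raise ValueError("empty input")
    return data[0]


if __name__ == "__main__":
    print(len(fuzz(fragile_parser)), len(fuzz(robust_parser)))
```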
Writing Safe Program Code
• The second component of the abstract program model is the
processing of input data by some algorithm to solve the
required problem
• High-level languages are compiled and linked into machine
code which is then directly executed by the target processor
– execution of a program involves the execution of machine
instructions
– instructions will manipulate data stored in various regions of
memory and registers
• Security issues:
– correct algorithm implementation
– correct machine instructions for algorithm
– valid manipulation of data
Correct Algorithm Implementation (1/2)
• Issue of good program development technique
– Algorithm may not correctly handle all cases or variants of the problem
– Consequence of deficiency is a bug in the resulting program that could be
exploited
• Example: the random number generator for Netscape session keys was supposed to be
unpredictable, but was not, which is a bug
• Example: initial sequence numbers used by many TCP/IP implementations are
too predictable
– The use of the sequence number as both an identifier
and an authenticator of packets, combined with the failure to make
it sufficiently unpredictable, enables the attack to
occur
• In a three-way handshake with a spoofed source address, the
response from the server will not be seen by the
attacker
– if the attacker can correctly guess the sequence number, they can
construct a suitable ACK packet and the connection will
be established
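Both examples above come down to values that should have been unpredictable. The sketch below contrasts a seeded general-purpose generator, which is fully reproducible by anyone who learns or guesses the seed, with Python's `secrets` module, which draws from the operating system's cryptographic random source. The function names are made up for the example and do not model the actual Netscape or TCP implementations.

```python
import random
import secrets


def predictable_sequence(seed: int, count: int):
    """Values from a seeded general-purpose PRNG: same seed, same output."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


def new_session_key(nbytes: int = 16) -> bytes:
    """Key material from the OS cryptographic random source."""
    return secrets.token_bytes(nbytes)


if __name__ == "__main__":
    # An attacker who guesses the seed reproduces the whole sequence.
    print(predictable_sequence(42, 3) == predictable_sequence(42, 3))
    print(new_session_key().hex())
```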
Correct Algorithm Implementation (2/2)
• Another variant is when the programmers deliberately include
additional code in a program to help test and debug it
– This code remains in production release and could inappropriately release
information
– May permit a user to bypass security checks and perform actions they would not
otherwise be allowed to perform
– Example: a vulnerability in the sendmail mail delivery program was exploited by the Morris
Internet Worm
• To support the DEBUG command, sendmail allowed the user to remotely query and
control the running program
• The worm used this feature to infect systems running versions of sendmail with this
vulnerability
• The problem was aggravated because the sendmail program ran with superuser privileges
Ensuring Machine Language Corresponds to
Algorithm
• Ensure machine instructions correctly implement high-level language code
– often ignored by programmers
• Assumption is that the compiler or interpreter generates or executes code that validly implements the
language statements
Correct Data Interpretation
• Data is stored as groups of bits, saved in bytes in computer memory
– bytes are grouped together into larger units such as words, longwords, etc.
– Some languages, such as C, allow a more liberal interpretation of data and permit
program code to explicitly change that interpretation
Correct Use of Memory
• Related to the issue of interpretation of data values is the allocation
and management of dynamic memory storage
• Many programs use dynamic memory allocation
– used to manipulate unknown amounts of data
– allocated when needed, released when done
• A memory leak occurs if memory is not correctly released
– Steady reduction in memory available on the heap to the point where it is
completely exhausted
– The program will crash once the available memory on the heap is exhausted
• By default, a process’s environment variables are a copy of the parent’s environment variables, but they can be
modified by the running process at any time
– Modifications will be passed to its children
• Security concern: they are another source of untrusted program input and hence need
to be validated
– Most commonly exploited by a local user attempting to gain increased privileges
• Goal is to subvert a program that grants superuser or administrator privileges
Example Vulnerable Scripts
• Using PATH or IFS environment variables
• Cause the script to execute the attacker’s program with the privileges granted to the script
• Almost impossible to prevent in some form
Which grep, which sed?
• The example script takes the identity of some user, strips any domain specification if included, and then retrieves
the mapping for that user to an IP address
• It calls two separate programs: sed and grep. The shell searches each directory named in the
PATH variable for these programs
• However, an attacker can simply redefine the PATH variable to include a directory they control
which contains a program called grep that can do whatever the attacker desires
• Solution: use absolute names for each program
• Alternative: the PATH variable could be reset to a
known default value by the script
• However, the IFS variable is used to separate the words that form a line of commands
• Defaults: space, tab, or newline character. However, it can be set to any sequence of characters
• Consider the effect of including the “=” character in this set
• Then the assignment of a new value to the PATH variable is interpreted as a command to execute the program
PATH with the list of directories as its argument (the attacker can change PATH to include a directory
with an attack program named PATH)
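The two remedies above (absolute program names and a PATH reset to a known value) can be sketched as follows. This is an illustration in Python, not a rewrite of the original shell script: the child process gets a freshly built environment instead of the inherited one, so attacker-set PATH, IFS or similar variables are never passed on. `SAFE_PATH`, the allow-list, and the location `/bin/grep` are assumptions.

```python
import os
import subprocess

SAFE_PATH = "/usr/bin:/bin"  # assumed known-good default


def sanitized_env(allowed=("LANG", "TZ"), source=None):
    """Build a minimal environment: fixed PATH plus allow-listed variables only."""
    source = os.environ if source is None else source
    env = {"PATH": SAFE_PATH}
    for name in allowed:
        if name in source:
            env[name] = source[name]
    return env


def grep_fixed(pattern: str, path: str) -> str:
    """Search `path` for a literal string using an absolute program name and no shell."""
    result = subprocess.run(
        ["/bin/grep", "-F", "-e", pattern, path],  # program location is an assumption
        env=sanitized_env(), capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    hostile = {"PATH": "/tmp/evil:/bin", "IFS": "=", "LANG": "C"}
    print(sanitized_env(source=hostile))
```

Because no shell is involved, IFS has no effect on how the argument list is split; dropping it from the environment is a second layer of protection for any shell the child might start later.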
Vulnerable Compiled Programs
• Programs that invoke other programs can be vulnerable to PATH variable
manipulation
– must reset to “safe” values
• Dynamically linked programs may be vulnerable to manipulation of
LD_LIBRARY_PATH
– used to locate suitable dynamic library
– must either statically link privileged programs or prevent use of this variable
Use of Least Privilege
• Exploitation of flaws may give the attacker greater privileges (privilege
escalation)
– Using higher levels of privilege may enable attacker to make changes to the
system, ensuring future use
• Hence run programs with least privilege needed to complete their
function
• Whenever a privileged program runs, care must be taken to determine
suitable user and group privileges required
– Decide whether to grant extra user or just group privileges
• the latter is preferred and safer, but may not be sufficient
• Ensure that a privileged program can modify only those files and
directories necessary
– common deficiency: many privileged programs have ownership of all associated
files and directories
– If the program is compromised, the attacker can then cause greater damage
– recheck these when the program is moved or upgraded
Root/Admin Programs
• Programs with root/administrator privileges are a major target of
attackers
– since they provide the highest levels of system access and control
– Such high privileges are needed to manage access to protected system
resources, e.g. network server ports
• Often the privilege is only needed at startup
– the program can then run as a normal user
• Example: HTTP, SSH, SMTP, DNS servers need to bind to a privileged service port
• Traditionally, these server programs executed with root privileges for the entire time