
UNIT-1

FUNDAMENTALS OF WEB APPLICATION SECURITY


The History of Software Security - Recognizing Web Application Security Threats - Web
Application Security - Authentication and Authorization - Secure Sockets Layer - Transport
Layer Security - Session Management - Input Validation

1. The History of Software Security


The Origins of Hacking

 In the past two decades, hackers have gained more publicity and notoriety than ever before.
As a result, it’s easy for anyone without the appropriate background to assume that hacking
is a concept closely tied to the internet and that most hackers emerged in the last 20 years.
 While the number of hackers worldwide has definitely exploded with the rise of the World
Wide Web, hackers have been around since the middle of the 20th century—potentially
even earlier depending on what you define as “hacking.” Many experts debate the decade
that marks the true origin of modern hackers, because a few isolated events in the early
1900s bore a strong resemblance to the hacking you see in today’s world.
 For example, there were specific isolated incidents that would likely qualify as hacking in
the 1910s and 1920s, most of which involved tampering with Morse code senders and
receivers, or interfering with the transmission of radio waves. However, while these events
did occur, they were not common, and it is difficult to pinpoint large-scale operations that
were disrupted as a result of the abuse of these technologies.
 It is also important to note that I am no historian. I am a security professional with a
background in finding solutions to deep architectural and code-level security issues in
enterprise software.

The Enigma Machine, Circa 1930

 The Enigma machine used electric-powered mechanical rotors to both encrypt and decrypt
text-based messages sent over radio waves (see Figure 1-1). The device had German
origins and would become an important technological development during the Second
World War.
 The device looked like a large square or rectangular mechanical typewriter. On each key
press, the rotors would move and record a seemingly random character that would then be
transmitted to all nearby Enigma machines.
However, these characters were not random, and were defined by the rotation of the rotor
and a number of configuration options that could be modified at any time on the device.
Any Enigma machine with a specific configuration could read or “decrypt” messages sent
from another machine with an identical configuration.
 This made the Enigma machine extremely valuable for sending crucial messages while
avoiding interception. While a sole inventor of the rotary encryption mechanism used by
the machine is hard to pinpoint, the technology was popularized by a two-man company
called Chiffriermaschinen AG based in Germany. In the 1920s, Chiffriermaschinen AG
traveled throughout Germany demonstrating the technology, which led to the German
military adopting it in 1928 to secure top-secret military messages in transit.
 The ability to avoid the interception of long-distance messages was a radical development
that had never before been possible. In the software world of today, the interception of
messages is still a popular technique that hackers try to employ, often called a man-in-the-
middle attack. Today’s software uses similar (but much more powerful) techniques to those
that the Enigma machine used a hundred years ago to protect against such attacks.

Automated Enigma Code Cracking, Circa 1940

 Alan Turing was an English mathematician who is best known for his development of a test
known today as the “Turing test.” The Turing test was developed to rate conversations
generated by machines based on the difficulty in differentiating those conversations from
the conversations of real human beings.
 This test is often considered to be one of the foundational philosophies in the field of
artificial intelligence (AI). While Alan Turing is best known for his work in AI, he was also
a pioneer in cryptography and automation. In fact, prior to and during World War II, Alan’s
research focus was primarily on cryptography rather than AI. Starting in September of
1938, Alan worked part time at the Government Code and Cypher School (GC&CS).
GC&CS was a research and intelligence agency funded by the British Army, located in
Bletchley Park, England.
 Alan’s research primarily focused on the analysis of Enigma machines. At Bletchley Park,
Alan researched Enigma machine cryptography alongside his then-mentor Dilly Knox, who
at the time was an experienced cryptographer. Much like the Polish mathematicians before
them, Alan and Dilly wanted to find a way to break the (now significantly more powerful)
encryption of the German Enigma machines. Due to their partnership with the Polish
Cipher Bureau, the two gained access to all of the research that Marian Rejewski’s team
had produced nearly a decade earlier. This meant they already had a deep understanding of
the machine.
 They understood the relationship between the rotors and wiring, and knew about the
relationship between the device configuration and the encryption that would be output
(Figure 1-2).

Introducing the “Bombe”

 A bombe was an electric-powered mechanical device that attempted to automatically
reverse engineer the position of mechanical rotors in an Enigma machine based on
mechanical analysis of messages sent from such machines (see Figure 1-3).
 The first bombes were built by the Polish, in an attempt to automate Marian’s work.
Unfortunately, these devices were designed to determine the configuration of Enigma
machines with very specific hardware. In particular, they were ineffective against machines
with more than three rotors.
 Because the Polish bombe could not scale against the development of more complex
Enigma machines, the Polish cryptographers eventually went back to using manual
methods for attempting to decipher German wartime messages.

Telephone “Phreaking,” Circa 1950

 After the rise of the Enigma machine in the 1930s and the cryptographic battle that
occurred between major world powers, the introduction of the telephone is the next major
event in our timeline. The telephone allowed everyday people to communicate with each
other over large distances, and at rapid speed.
 As telephone networks grew, they required automation in order to function at scale. In the
late 1950s, telecoms like AT&T began implementing new phones that could be
automatically routed to a destination number based on audio signals emitted from the
phone unit. Pressing a key on the phone pad emitted a specific audio frequency that was
transmitted over the line and interpreted by a machine in a switching center. A switching
machine translated these sounds into numbers and routed the call to the appropriate
receiver.

Anti-Phreaking Technology, Circa 1960

 In the 1960s, phones were equipped with a new technology known as dual-tone
multifrequency (DTMF) signaling. DTMF was an audio-based signaling language developed
by Bell Systems and patented under the more commonly known trademark, “Touch
Tones.” DTMF was intrinsically tied to the phone dial layout we know today that consists
of three columns and four rows of numbers.
 Each key on a DTMF phone emitted two very specific audio frequencies, versus a single
frequency like the original tone dialing systems. The following table shows the “Touch
Tones,” or sounds (in hertz), that older telephones made on keypress:

            1209 Hz   1336 Hz   1477 Hz
   697 Hz      1         2         3
   770 Hz      4         5         6
   852 Hz      7         8         9
   941 Hz      *         0         #

The development of DTMF was due largely to the fact that phreakers were taking
advantage of tone dialing systems because of how easy those systems were to reverse
engineer.
 Bell Systems believed that because the DTMF system used two very different tones at
the same time, it would be much more difficult for a malicious actor to take advantage of it.

The Origins of Computer Hacking, Circa 1980

 In 1976, Apple released the Apple 1 personal computer. This computer was not
configured out of the box and required the buyer to provide a number of components and
connect them to the motherboard. Only a few hundred of these devices were built and sold.
In 1982, Commodore International released its competitor device.
 This was the Commodore 64, a personal computer that was completely configured right
out of the box. It came with its own keyboard, could support audio, and could even be used
with multicolor displays. The Commodore 64 would go on to sell nearly 500,000 units per
month until the early 1990s.
 From this point forward, the sales trend for personal computers would continually increase
year over year for several decades to come. Computers soon became a common tool in
households as well as businesses, and took over common repetitive tasks, such as managing
finances, human resources, accounting, and sales. In 1983, Fred Cohen, an American
computer scientist, created the very first computer virus.
 The virus he wrote was capable of making copies of itself and was easily spread from one
personal computer to another via floppy disk. He was able to store the virus inside a
legitimate program, masking it from anyone who did not have source code access. Fred
Cohen later became known as a pioneer in software security, demonstrating that it is
almost impossible to algorithmically distinguish viruses from legitimate software.

The Rise of the World Wide Web, Circa 2000

 The World Wide Web (WWW) sprang up in the 1990s, but its popularity began to explode
at the end of the 1990s and in the early 2000s. In the 1990s, the web was almost
exclusively used as a way of sharing documents written in HTML. Websites did not pay
attention to user experience, and very few allowed the user to send any inputs back to the
server in order to modify the flow of the website.
 Figure 1-4 shows an Apple.com website from 1997 with purely informational data. The
early 2000s marked a new era for the internet because websites began to store user-
submitted data and modify the functionality of the site based on user input.
 This was a key development, later known as Web 2.0. Web 2.0 websites allowed users to
collaborate with each other by submitting their inputs over Hypertext Transfer Protocol
(HTTP) to a web server, which would then store the inputs and share them with fellow
users upon request.

Hackers in the Modern Era, Circa 2015+

 The point in discussing hacking in previous eras was to build a foundation from which we
can begin our journey in this book. From analyzing the development and cryptanalysis of
Enigma machines in the 1930s, we gained insight into the importance of security, and the
lengths that others will go to in order to break that security.
 In the 1940s, we saw an early use case for security automation. This particular case was
driven by the ongoing battle between attackers and defenders. In this case, the Enigma
machine technology had improved so much that it could no longer be reliably broken by
manual cryptanalysis techniques.
 Alan Turing turned to automation to beat the security improvements. The 1950s and 1960s
showed us that hackers and tinkerers have a lot in common. We also learned that
technology designed without considering users with malicious intent will lead to that
technology eventually being broken into. We must always consider the worst-case
scenario when designing technology to be deployed at scale and across a wide user base. In
the 1980s, the personal computer started to become popular.
 Around this time, we began to see the hackers we recognize today emerge. These hackers
took advantage of the powers that software enabled, camouflaging viruses inside of
legitimate applications, and using networks to spread their viruses rapidly. Finally, the
introduction and rapid adoption of the World Wide Web led to the development of Web
2.0, which changed the way we think about the internet.
 Instead of the internet being a medium for sharing documents, it became a medium for
sharing applications. As a result, new types of exploits emerged that take advantage of the
user rather than the network or server.
 Today’s browsers offer very robust isolation between websites with different origins,
following a security specification known as Same Origin Policy (SOP). This means that
website A cannot be accessed by website B even if both are open at once or one is
embedded as an iframe inside the other.
 Browsers also accept a new security configuration known as Content Security Policy
(CSP). CSP allows the developer of a website to specify various levels of security, such as
whether scripts should be able to execute inline (in the HTML).
 This allows web developers to further protect their applications against common threats.
HTTP, the main protocol for sending web traffic, has also improved from a security
perspective. HTTP has adopted protocols like SSL and TLS that enforce strict encryption
for any data traveling over the network. This makes man-in-the-middle attacks very
difficult to pull off successfully.

2. Recognizing Web Application Security Threats


Here are the ten common web application security threats we will cover in this article:

 SQL injection
 Cross-site scripting (XSS)
 Cross-site request forgery (CSRF)
 Insecure direct object references
 Remote code execution
 Insufficient logging and monitoring
 Insecure cryptographic storage
 Failure to restrict URL access
 Cross-origin resource sharing (CORS) misconfiguration
 Using components with known vulnerabilities

1. SQL Injection
 A SQL injection attack is executed when an attacker injects malicious SQL code into an
application's database queries through user input fields. These types of attacks can
accomplish many different things. The first of two common outcomes is that the attacker
gains unauthorized access to sensitive data stored in the database.
 Depending on what data the database is storing, the attacker could gain access to passwords,
financial information, and personal data. The second common outcome is the manipulation
or deletion of data. For instance, an attacker may be able to execute a DROP TABLE or
DROP DATABASE command.
You can mitigate this with the following steps:
Validate user input.
 Use output encoding, which involves converting special characters such as < and > into
their HTML entity equivalents, to prevent them from being interpreted as HTML code.
 Use prepared statements, parameterized queries, or stored procedures instead of dynamic
SQL whenever possible.
 Most languages and frameworks have recommended ways of handling form input. By
combining frontend and backend standards to prevent SQL injection from happening, your
application can increase its security against this type of threat.
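To make the prepared-statement advice concrete, here is a minimal sketch in Java using plain
JDBC. The users table and its columns are hypothetical; the point is that the user-supplied
value is bound with a ? placeholder rather than concatenated into the SQL string, so it can
never be interpreted as SQL:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLookup {
    // Hypothetical example: the "users" table and column names are assumed.
    public static String findEmail(Connection conn, String username)
            throws SQLException {
        String sql = "SELECT email FROM users WHERE username = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, username);  // bound as data, never parsed as SQL
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("email") : null;
            }
        }
    }
}

Even an input like ' OR '1'='1 is simply matched against the username column as a literal
string.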
2. Cross-Site Scripting (XSS)
 Cross-site scripting (XSS) attacks involve injecting malicious code or a malicious script
into a website. The website then executes the script, allowing the attacker to steal sensitive
user data, like session tokens and cookies, or perform other actions.
 There are two main types of XSS attacks: reflected and stored. Reflected XSS attacks
involve injecting malicious code into a website that is immediately executed. Stored XSS
attacks involve injecting malicious code into a website that is stored and executed at a later
time.
 If successful, a cross-site scripting attack can result in the theft of user session IDs, website
defacement, and redirection to malicious sites, thereby enabling phishing attacks.
You can mitigate this with the following steps:
Validate user input.
Use output encoding techniques.
Use auto-sanitization libraries such as OWASP AntiSamy.
Implement a content security policy.
Similar to the recommendation for SQL injection, using modern web frameworks generally tends
to steer developers towards secure coding practices to avoid XSS and similar attacks.
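As a rough illustration of output encoding, the sketch below converts HTML metacharacters
into entities before user data is written into a page. Production code should normally use a
vetted library such as the OWASP Java Encoder; this hand-rolled version only demonstrates
the idea:

public class HtmlEncode {
    // Replace characters that have meaning in HTML with entities so the
    // browser renders them as text instead of interpreting them as markup.
    static String encode(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "<script>" becomes "&lt;script&gt;": displayed, not executed.
        System.out.println(encode("<script>alert(1)</script>"));
    }
}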
3. Cross-Site Request Forgery (CSRF)
 Cross-site request forgery (CSRF) is a type of attack that involves tricking a victim into
performing an action on a website without their knowledge. This can be done by injecting a
malicious link or form into a website that the victim is already authenticated on.
 When the victim clicks the link or submits the form, the action is performed on their behalf,
potentially leading to data loss or unauthorized access.
You can mitigate this with the following steps:
Leverage CSRF protections already built into the framework you are using, if applicable.
 Use CSRF tokens. These are unique, randomized values associated with a user's session
and are included in forms and links to verify the authenticity of the request.
 Use SameSite cookies. These are cookies that are only sent with requests originating
from the same site that created the cookie. This can help prevent attackers from being
able to send requests on behalf of a victim, as a forged cross-site request would not
carry the victim's SameSite cookies.
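A minimal sketch of the CSRF-token approach, assuming the generated token is stored in the
user's server-side session and echoed back in each form (the class and method names are
illustrative):

import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class CsrfToken {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generate an unpredictable 256-bit token to store in the session and
    // embed as a hidden field in every state-changing form.
    static String generate() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // On submission, compare the form's token with the session's token
    // using a constant-time comparison.
    static boolean isValid(String sessionToken, String submittedToken) {
        if (sessionToken == null || submittedToken == null) return false;
        return MessageDigest.isEqual(
                sessionToken.getBytes(), submittedToken.getBytes());
    }
}

A forged cross-site request fails the check because the attacker's page has no way to read the
victim's token.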
4. Insecure Direct Object References (IDOR)
 Insecure Direct Object References, or IDOR, occur when an application exposes direct
object references, such as URLs or database keys, that allow attackers to access restricted
data by manipulating these references.
 For example, an application may allow users to access their account information by
entering their account number in a URL, such as "www.example.com/account/123". An
attacker could potentially access other users' account information by changing the account
number in the URL.
You can mitigate this with the following steps:
 Implement proper access controls and session management. This involves setting up
mechanisms to ensure that only authorized users have access to certain resources or data.
The OWASP cheat sheets on authorization and authentication can be helpful resources for
reviewing best practices in these areas.
 Validate user input. To help prevent attackers from manipulating direct object references to
access restricted data, ensure that user input is the correct type, length, and format.
 Avoid using predictable references. Instead, consider using globally unique identifiers
(GUIDs) to prevent attackers from guessing the direct object references they need to access
restricted data.
 As noted in the mitigation steps above, IDOR-based vulnerabilities don’t occur on their
own. These vulnerabilities must be coupled with other vulnerabilities to become an
effective attack vector.
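As a sketch of the mitigations above, the fragment below pairs an unguessable identifier (a
UUID instead of a sequential account number) with a server-side ownership check. The
Account and AccountDao types are hypothetical:

import java.util.UUID;

public class AccountAccess {
    record Account(UUID publicId, String ownerUsername) {}

    interface AccountDao {                  // illustrative data-access interface
        Account findByPublicId(UUID publicId);
    }

    // The random ID makes references unguessable, but the ownership check is
    // still the real access control: never rely on obscurity alone.
    static Account getAccount(AccountDao dao, UUID publicId, String currentUser) {
        Account account = dao.findByPublicId(publicId);
        if (account == null || !account.ownerUsername().equals(currentUser)) {
            throw new SecurityException("access denied");
        }
        return account;
    }
}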
5. Remote Code Execution (RCE)
 Remote Code Execution (RCE) attacks allow attackers to execute arbitrary code on a
server, potentially leading to full system compromise and unauthorized access to sensitive
data.
 RCE attacks can occur through a variety of means, such as exploiting vulnerabilities in
code libraries or injecting malicious code through user input fields.
 A successful RCE attack can have several consequences. These include Denial of Service
(DoS) attacks, exposure of sensitive data, illicit cryptocurrency mining, and execution of
malware. In some cases, a successful RCE attack can even give full control over the
compromised machine to the attacker.
You can mitigate this with the following steps:
Sanitize user input.
 Implement secure memory management. RCE attackers can potentially take advantage of
memory management flaws such as buffer overflows. Conducting regular security
vulnerability scans on your applications can help you identify buffer overflow and
memory-related vulnerabilities that an attacker could exploit.
 Always keep your operating system and your third-party software up to date to ensure that
you have the latest security patches.
 Limit the attacker's ability to move through a network by implementing network
segmentation, access management, and a zero-trust security strategy.
 RCE attacks have been a major source of breaches in the last few years, many leading to
worldwide security emergencies. One that many people will remember is the Log4j fiasco
of 2021, in which multiple RCE vulnerabilities were discovered in Log4j. These
vulnerabilities allowed attackers to exploit vulnerable applications to run cryptojacking
attacks and other malware on compromised servers.
6. Insufficient Logging and Monitoring
 Insufficient logging and monitoring refers to a lack of proper logging and monitoring
processes in place to detect and respond to security threats.
unnoticed and continue to compromise the system, potentially leading to data loss and
financial loss.
 It’s also important to be aware of what is being logged. If sensitive information, such as
credit card numbers or passwords, is being written to logs, attackers who gain access to the
logs could use this information maliciously. Fraudulent credit card charges or unauthorized
access to a system could be easily executed.
You can mitigate this with the following steps:
Enable logging for key events and actions in your application and monitor logs regularly.
 Use log analysis tools. These can help automate the process of reviewing logs and identify
potential security issues or anomalies more quickly and efficiently.
 Set up alerting systems to notify administrators of any potential security issues in real time,
allowing them to respond more quickly to potential threats.
Ensure that sensitive information is either not included in logs or is properly masked.
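As one small illustration of masking, a helper like the following (the pattern and names are
illustrative, not from any particular logging library) can scrub card-number-shaped values
before they reach a log:

public class LogMasking {
    // Mask anything that looks like a 13-16 digit card number, keeping
    // only the last four digits.
    static String maskCardNumbers(String message) {
        return message.replaceAll("\\b\\d{9,12}(\\d{4})\\b", "************$1");
    }

    public static void main(String[] args) {
        // Prints: payment failed for card ************1111
        System.out.println(maskCardNumbers(
                "payment failed for card 4111111111111111"));
    }
}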
7. Insecure Cryptographic Storage
 Insecure cryptographic storage refers to the improper handling of cryptographic keys, such
as storing them in plain text or using weak keys. This can allow attackers to gain access to
sensitive data through compromised cryptographic keys.
You can mitigate this with the following steps:
Use strong cryptographic algorithms, such as AES or RSA, to secure stored data.
 Implement key management best practices, such as regularly rotating keys and securely
storing them, to help prevent unauthorized access to encrypted data.
 Use secure storage solutions, such as hardware security modules or encrypted storage
devices, to help further protect encrypted data.
 One suggestion is also to audit the data that you need to store in an encrypted state. The
best way to protect data is to simply not store it at all. If sensitive data is being stored
without need, it may be best to forgo storing it, lessening the data that potential attackers
have access to.
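To make the first mitigation concrete, here is a minimal sketch of authenticated encryption
with AES-GCM using only the standard JDK APIs. In a real system the key would come from
a key-management service or hardware security module rather than being generated inline,
and key rotation would be handled outside this code:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class AesGcmExample {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);                       // 256-bit AES key
        SecretKey key = keyGen.generateKey();

        byte[] iv = new byte[12];               // 96-bit nonce, unique per message
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext =
                cipher.doFinal("sensitive data".getBytes(StandardCharsets.UTF_8));

        // The IV is stored alongside the ciphertext; it is not secret.
        byte[] stored = ByteBuffer.allocate(iv.length + ciphertext.length)
                .put(iv).put(ciphertext).array();
        System.out.println("stored " + stored.length + " bytes");
    }
}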
8. Failure to Restrict URL Access / Broken Access Control
 Failure to restrict URL access refers to a lack of proper access controls that allow
unauthorized users to access restricted pages and resources. This can allow attackers to
access sensitive data and potentially compromise the system.
 This security threat is closely related to the IDOR vulnerabilities we discussed
earlier. The core differentiating factor between the two is that IDOR tends to give the
attacker access to information in the database, while failure to restrict URL access gives
the attacker access to special functions and features that should not be available to any
typical user.
You can mitigate this with the following steps:
 Implement proper access controls by setting up authentication and authorization processes
to ensure that only authorized users have access to certain resources or functions.
 Use role-based authorization. The enforcement mechanism should deny all access by
default, requiring explicit grants to specific users and roles for access to every page.
Implement adequate authorization checks at each relevant stage of a user's interaction with
the web app.
 Many routing libraries and routing mechanisms built into modern web frameworks tend to
protect against this by default. By making sure that the application routing is set up
correctly, these types of vulnerabilities can be completely avoided.
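Because the code samples later in these notes use the Spark framework, here is a hedged
sketch of deny-by-default route protection in Spark. The paths, role names, and session
attributes are illustrative:

import static spark.Spark.*;

public class AdminRoutes {
    public static void main(String[] args) {
        // Deny by default: every /admin route requires an authenticated
        // session with the admin role before any handler runs.
        before("/admin/*", (request, response) -> {
            String role = request.session().attribute("role");
            if (role == null) {
                halt(401, "{\"error\": \"authentication required\"}");
            } else if (!role.equals("admin")) {
                halt(403, "{\"error\": \"forbidden\"}");
            }
        });

        get("/admin/reports", (request, response) -> "sensitive report data");
    }
}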
9. Cross-Origin Resource Sharing (CORS) Misconfiguration
 Cross-Origin Resource Sharing (CORS) is a security feature that allows a web server to
specify which domains are allowed to access its resources. However, if CORS is
misconfigured, it can allow attackers to access restricted resources from a different origin.
This could potentially expose data through services that can be used without authorization.
You can mitigate this with the following steps:
Properly configure CORS headers.
 Use CORS libraries that provide an easy-to-use interface for configuring CORS headers to
help you configure CORS properly.
 Many server-side frameworks and platforms can aid developers in properly configuring
CORS for their services. Developers should be aware of how CORS can be configured in
the framework of their choosing. One common reason for CORS security misconfiguration
is that when developers are creating applications locally they will set an entirely open
CORS policy for easier development. Ensuring that these policies do not get checked into
production code is crucial.
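As a sketch of an explicit origin allow list, again using Spark to match the notes' other
examples (the allowed origin is a placeholder):

import static spark.Spark.*;
import java.util.Set;

public class CorsConfig {
    // Only origins on this list may read responses cross-origin; a leftover
    // development wildcard ("*") is exactly the misconfiguration to avoid.
    private static final Set<String> ALLOWED_ORIGINS =
            Set.of("https://app.example.com");

    public static void main(String[] args) {
        before((request, response) -> {
            String origin = request.headers("Origin");
            if (origin != null && ALLOWED_ORIGINS.contains(origin)) {
                response.header("Access-Control-Allow-Origin", origin);
                response.header("Vary", "Origin");
            }
        });

        get("/data", (request, response) -> "{\"ok\": true}");
    }
}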
10. Using Components with Known Vulnerabilities
 Using components with known vulnerabilities refers to the use of outdated code libraries,
frameworks, or other components with known vulnerabilities.
 Many websites today are built using complex components, which can make it difficult for
development teams to understand their internal workings. This can create potential
vulnerabilities if a component contains known security issues that are not properly
addressed.
You can mitigate this with the following steps:
 Keep track of component versions. You can address any vulnerabilities by regularly
checking for updates and staying up to date with the latest versions of components.
 Use security scanners to help you identify known vulnerabilities in components and alert
developers to potential issues.
 Using tools like Dependabot can help keep your dependencies up to date automatically. By
using scanning tools and automation, keeping dependencies secure and up-to-date is easy to
do as part of the development process.

Authentication and Authorization


 Because we are storing credentials and offering a different user experience to guests and
registered users, we know we have both an authentication and an authorization system.
This means we must allow users to log in, as well as be able to differentiate among
different tiers of users when determining what actions those users are allowed to perform.
 Furthermore, because we are storing credentials and support a login flow, we know there
are going to be credentials sent over the network. These credentials must also be stored in
a database, otherwise the authentication flow will break down.
This means we have to consider the following risks:
• How do we handle data in transit?
• How do we handle the storage of credentials?
• How do we handle various authorization levels of users?

Secure Sockets Layer and Transport Layer Security


 One of the most important architectural decisions to tackle as a result of the risks we have
determined is how to handle data in transit. Data in transit is an important first-step
evaluation during architecture review because it will affect the flow of all data throughout
the web application.
 An initial data-in-transit requirement should be that all data sent over the network is
encrypted en route. This reduces the risk of a man-in-the-middle attack, which could steal
credentials from our users and make purchases on their behalf (since we are storing their
financial data). Secure Sockets Layer (SSL) and Transport Layer Security (TLS) are the
two major cryptographic protocols in use today for securing in-transit data from malicious
eyes in the middle of any network.
 SSL was designed by Netscape in the mid-1990s, and several versions of the protocol have
been released since then. TLS was defined by RFC 2246 in 1999, and offered upgraded
security in response to several architectural issues in SSL (see Figure 18-1 for an example).
TLS cannot interoperate with older versions of SSL due to the number of architectural
differences between the two.
 TLS offers the most rigorous security, while SSL has wider adoption but multiple
vulnerabilities that reduce its integrity as a cryptographic protocol.

 All major web browsers today will show a lock icon in the URL address bar when a
website’s communication is properly secured via SSL or TLS. The HTTP specification
offers “HTTPS” or “HTTP Secure,” a URI-scheme that requires TLS/SSL to be present
before allowing any data to be sent over the network. Browsers that support HTTPS will
display a warning to the end user if TLS/SSL connections are compromised when an
HTTPS request is made. For MegaMerch (the example e-commerce application whose
architecture we are reviewing), we would want to ensure that all data is encrypted via TLS
prior to being sent over the network.
 The way TLS is implemented is generally server specific, but every major web server
software package offers an easy integration to begin encrypting web traffic.
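For example, in the Spark framework used by the code listings later in these notes, a single
call enables TLS on the embedded Jetty server. The keystore path and password below are
placeholders:

import static spark.Spark.*;

public class SecureServer {
    public static void main(String[] args) {
        // secure() must be called before any routes are mapped; afterwards
        // the server only accepts TLS connections.
        secure("deploy/keystore.jks", "changeit", null, null);

        get("/hello", (request, response) -> "served over HTTPS");
    }
}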
Secure Credentials

 Password security requirements exist for a number of reasons, but unfortunately, most
developers don’t understand what makes a password hacker-safe. Creating a secure
password has less to do with length and the number of special characters, and everything
to do with the patterns that can be found in the password. In cryptography, this is known
as entropy—the amount of randomness and uncertainty. You want passwords with a lot of
entropy. Believe it or not, most passwords used on the web are not unique.
 When a hacker attempts to brute force logins to a web application, the easiest route is to
find a list of the top most common passwords and use that to perform a dictionary attack.
An advanced dictionary attack will also include combinations of common passwords,
common password structure, and common combinations of passwords. Beyond that,
classical brute forcing involves iterating through all possible combinations. As you can see,
it is not so much the length of the password that will protect you, but instead the lack of
observable patterns and avoidance of common words and phrases. Unfortunately, it is
difficult to convey this to users. Instead, we should make it difficult for a user to develop a
password that contains a number of well-known patterns by having certain requirements.

Hashing Credentials

 When storing sensitive credentials, we should never store them in plain text. Instead, we
should hash the password the first time we see it, prior to storing it.
 Hashing a password is not a difficult process, and the security benefits are massive.
Hashing algorithms differ from most encryption algorithms for a number of reasons. First
off, hashing algorithms are not reversible. This is a key point when dealing with
passwords. We don’t want even our own staff to be able to steal user passwords because
they might use those passwords elsewhere (a bad practice, but common), and we don’t
want that type of liability in the case of a rogue employee.
 Next, modern hashing algorithms are extremely efficient. Today’s hashing algorithms can
represent multiple-megabyte strings of characters in just 128 to 256 bits of data. This
means that when we do a password check, we will rehash the user’s password at login and
compare it to the hashed password in the database. Even if the user has a huge password,
we will be able to perform the lookup at high speed.
 Another key advantage of using a hash is that modern hashing algorithms have almost no
collision in practical application (either 0 collisions, or statistically approaching 0—
1/1,000,000,000+). This means you can mathematically determine that the probability that
two passwords will have identical hashes will be extraordinarily low.
 As a result, you do not need to worry about hackers “guessing” a password unless they
guess the exact password of another user. If a database is breached and data is stolen,
properly hashed passwords protect your users. The hacker will only have access to the
hashes, and it is very unlikely that even a single password in your database will be
reverse engineered.
 Let’s consider three cases where a hacker gets access to MegaMerch’s databases:

Case #1: Passwords stored in plain text
Result: All passwords compromised

Case #2: Passwords hashed with the MD5 algorithm
Result: The hacker can crack some of the passwords using rainbow tables (precomputed
tables of hash→password; weaker hashing algorithms are susceptible to these)

Case #3: Passwords hashed with BCrypt
Result: It is unlikely any passwords will be cracked

 As you can see, all passwords should be hashed. Furthermore, the algorithm used for
hashing should be evaluated based on its mathematical integrity and its scalability with
modern hardware. Algorithms should be SLOW on modern hardware when hashing, hence
reducing the number of guesses per second a hacker can make.
 When cracking passwords, slow hashing algorithms are essential because the hacker will be
automating the password-to-hash process. Once the hacker finds a hash identical to a stored
one (ignoring potential collisions), the password has been effectively breached. Extremely
slow hashing algorithms like BCrypt can take years or more to crack a single password on
modern hardware.
 Modern web applications should consider the following hashing algorithms for securing the
integrity of their users’ credentials.

BCrypt

 BCrypt is a hashing function that derives its name from two developments: the “B” comes
from Blowfish Cipher, a symmetric-key block cipher developed in 1993 by Bruce
Schneier, designed as a general purpose and open source encryption algorithm. “Crypt” is
the name of the default hashing function that shipped with Unix OSs.
 The Crypt hashing function was written with early Unix hardware in mind, which meant
that at the time hardware could not hash enough passwords per second to reverse engineer a
hashed password using the Crypt function. At the time of its development, Crypt could
hash fewer than 10 passwords per second.
 With modern hardware, the Crypt function can be used to hash tens of thousands of
passwords per second. This makes breaking a Crypt-hashed password an easy operation for
any current-era hacker.

PBKDF2

 As an alternative to BCrypt, the PBKDF2 hashing algorithm can also be used to secure
passwords. PBKDF2 is based on a concept known as key stretching. Key stretching
algorithms will rapidly generate a hash on the first attempt, but each additional attempt will
become slower and slower.
 As a result, PBKDF2 makes brute forcing a computationally expensive process. PBKDF2
was not originally designed for hashing passwords, but should be sufficient for hashing
passwords when BCrypt-like algorithms are not available. PBKDF2 takes a configuration
option that represents the minimum number of iterations in order to generate a hash.
 This minimum should always be set to the highest number of iterations your hardware can
handle. You never know what type of hardware a hacker might have access to, so by
setting the minimum iterations for a hash to your hardware’s maximum value, you are
eliminating potential iterations on faster hardware and eliminating any attempts on slower
hardware.
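A sketch using the PBKDF2 implementation that ships with the JDK; the iteration count here
is a placeholder to be raised to the maximum your hardware can tolerate at login time:

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.SecureRandom;

public class Pbkdf2Example {
    // Derive a 256-bit hash from the password, a per-user random salt,
    // and a configurable iteration count.
    static byte[] hash(char[] password, byte[] salt, int iterations)
            throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[16];             // unique random salt per user
        new SecureRandom().nextBytes(salt);
        byte[] derived = hash("example-password".toCharArray(), salt, 310_000);
        System.out.println("derived a " + derived.length * 8 + "-bit hash");
    }
}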
 In our evaluation of MegaMerch, we have decided to hash our passwords using BCrypt and
will only compare password hashes.

2FA

 In addition to requiring secure, hashed passwords that are encrypted in transit, we also
should consider offering 2FA to our users who want to ensure their account integrity is not
compromised. Figure 18-2 shows Google Authenticator, one of the most common 2FA
applications for Android and iOS.
 It is compatible with many websites and has an open API for integrating into your
application. 2FA is a fantastic security feature that operates very effectively based on a
very simple principle: most 2FA systems require a user to enter a password into their
browser, in addition to entering a one-time code generated by a mobile application or sent
via SMS text message. More advanced 2FA protocols make use of a physical hardware
token, usually a USB drive that generates a unique one-time-use token when plugged into a
user’s computer.

 Generally speaking, the physical tokens are more applicable to the employees of a business
than to its users. Distributing and managing physical tokens for an ecommerce platform
would be a painful experience for everyone involved.
 Phone app/SMS-based 2FA might not be as secure as a dedicated 2FA USB token, but it is
still an order of magnitude safer than using the application without 2FA. In the absence of
any vulnerabilities in the 2FA app or messaging protocol, 2FA eliminates remote logins to
your web application that were not initiated by the owner of the account.
 The only way to compromise a 2FA account is to gain access to both the account password
and the physical device containing the 2FA codes (usually a phone). During our
architecture review with MegaMerch, we strongly suggest offering 2FA to users who wish
to improve the security of their MegaMerch accounts.
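Most app-based 2FA codes, including Google Authenticator's, are time-based one-time
passwords (TOTP, RFC 6238). The sketch below shows the core computation under the
default parameters (HMAC-SHA1, 30-second step, 6 digits); a real deployment should use a
vetted library rather than hand-rolled crypto:

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.time.Instant;

public class Totp {
    // Compute the current 6-digit code from the shared secret that was
    // provisioned to the user's authenticator app.
    static String code(byte[] sharedSecret) throws Exception {
        long counter = Instant.now().getEpochSecond() / 30;   // 30s time step
        byte[] msg = ByteBuffer.allocate(8).putLong(counter).array();

        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
        byte[] h = mac.doFinal(msg);

        int offset = h[h.length - 1] & 0x0f;                  // dynamic truncation
        int bin = ((h[offset] & 0x7f) << 24) | ((h[offset + 1] & 0xff) << 16)
                | ((h[offset + 2] & 0xff) << 8) | (h[offset + 3] & 0xff);
        return String.format("%06d", bin % 1_000_000);
    }
}

Because both sides derive the code from the shared secret and the current time, the server can
verify a submitted code without any network round trip to the phone.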
Input Validation
 Security flaws often occur when an attacker can submit inputs that violate your
assumptions about how the code should operate. For example, you might assume that an
input can never be more than a certain size. If you’re using a language like C or C++ that
lacks memory safety, then failing to check this assumption can lead to a serious class of
attacks known as buffer overflow attacks.
 Even in a memory-safe language, failing to check that the inputs to an API match the
developer’s assumptions can result in unwanted behavior.
 DEFINITION A buffer overflow or buffer overrun occurs when an attacker can supply
input that exceeds the size of the memory region allocated to hold that input. If the
program, or the language runtime, fails to check this case then the attacker may be able to
overwrite adjacent memory.
 A buffer overflow might seem harmless enough; it just corrupts some memory, so maybe
we get an invalid value in a variable, right? However, the memory that is overwritten may
not always be simple data and, in some cases, that memory may be interpreted as code,
resulting in a remote code execution vulnerability.
 Such vulnerabilities are extremely serious, as the attacker can usually then run code in your
process with the full permissions of your legitimate code.
 DEFINITION Remote code execution (RCE) occurs when an attacker can inject code into a
remotely running API and cause it to execute. This can allow the attacker to perform
actions that would not normally be allowed.
 In the Natter API code, the input to the API call is presented as structured JSON. As Java is
a memory-safe language, you don’t need to worry too much about buffer overflow attacks.
You’re also using a well-tested and mature JSON library to parse the input, which
eliminates a lot of problems that can occur. You should always use well established formats
and libraries for processing all input to your API where possible.
 JSON is much better than the complex XML formats it replaced, but there are still often
significant differences in how different libraries parse the same JSON. Although the API is
using a safe JSON parser, it’s still trusting the input in other regards.
 For example, it doesn’t check whether the supplied username is less than the 30-character
maximum configured in the database schema. What happens if you pass in a longer
username?

$ curl -d '{"name":"test", "owner":"a really long username

➥ that is more than 30 characters long"}'

➥ http://localhost:4567/spaces -i

HTTP/1.1 500 Server Error

Date: Fri, 01 Feb 2019 13:28:22 GMT

Content-Type: application/json
Transfer-Encoding: chunked

Server: Jetty(9.4.8.v20171121)

{"error":"internal server error"}

 If you look in the server logs, you see that the database constraint caught the problem:

Value too long for column "OWNER VARCHAR(30) NOT NULL"

But you shouldn’t rely on the database to catch all errors. A database is a valuable asset that
your API should be protecting from invalid requests. Sending requests to the database that
contain basic errors just ties up resources that you would rather use processing genuine
requests.

 Furthermore, there may be additional constraints that are harder to express in a database
schema.

PRINCIPLE: Always define acceptable inputs rather than unacceptable ones when
validating untrusted input. An allow list describes exactly which inputs are considered
valid and rejects anything else. A blocklist (or deny list), on the other hand, tries to
describe which inputs are invalid and accepts anything else.

 Blocklists can lead to security flaws if you fail to anticipate every possible malicious input.
Where the range of inputs may be large and complex, such as Unicode text, consider listing
general classes of acceptable inputs like “decimal digit” rather than individual input values.
 Open the SpaceController.java file in your editor and find the createSpace method again.
After each variable is extracted from the input JSON, you will add some basic validation.
First, you’ll ensure that the spaceName is shorter than 255 characters, and then you’ll
validate the owner username matches the following regular expression:

[a-zA-Z][a-zA-Z0-9]{1,29}

That is, an uppercase or lowercase letter followed by between 1 and 29 letters or digits. This is a
safe basic alphabet for usernames, but you may need to be more flexible if you need to support
international usernames or email addresses as usernames.

Listing 2.8 Validating inputs

public String createSpace(Request request, Response response)
        throws SQLException {

    var json = new JSONObject(request.body());

    var spaceName = json.getString("name");
    if (spaceName.length() > 255) {
        throw new IllegalArgumentException("space name too long");
    }

    var owner = json.getString("owner");
    if (!owner.matches("[a-zA-Z][a-zA-Z0-9]{1,29}")) {
        throw new IllegalArgumentException("invalid username: " + owner);
    }

    ..

 Regular expressions are a useful tool for input validation, because they can succinctly
express complex constraints on the input. In this case, the regular expression ensures that
the username consists only of alphanumeric characters, doesn’t start with a number, and is
between 2 and 30 characters in length.
 Although powerful, regular expressions can themselves be a source of attack. Some regular
expression implementations can be made to consume large amounts of CPU time when
processing certain inputs, a problem known as a regular expression denial of service
(ReDoS) attack.

Listing 2.9 Handling exceptions

import org.dalesbred.result.EmptyResultException;
import org.json.JSONException;
import spark.*;

public class Main {

    public static void main(String... args) throws Exception {
        ..
        exception(IllegalArgumentException.class, Main::badRequest);
        exception(JSONException.class, Main::badRequest);
        exception(EmptyResultException.class,
            (e, request, response) -> response.status(404));
    }

    private static void badRequest(Exception ex,
            Request request, Response response) {
        response.status(400);
        response.body("{\"error\": \"" + ex + "\"}");
    }
    ..
}
Now the user gets an appropriate error if they supply invalid input:
$ curl -d '{"name":"test", "owner":"a really long username
➥ that is more than 30 characters long"}'
➥ http://localhost:4567/spaces -i
HTTP/1.1 400 Bad Request
Date: Fri, 01 Feb 2019 15:21:16 GMT
Content-Type: text/html;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.8.v20171121)
{"error": "java.lang.IllegalArgumentException: invalid username: a really
long username that is more than 30 characters long"}

Session Management
 Session management manages sessions between the web application and its users.
The communication between a web browser and a website is usually done over HTTP
or HTTPS. When a user visits a website, a session is made containing multiple
requests and responses over HTTP.
 According to section 5 of RFC 2616, HTTP is a stateless protocol: each request and
response is independent of other web processes. Session management capabilities
linked to authentication, access control, and authorization are commonly available in
web applications.

 Modern web applications require maintaining multiple sessions of different users over
a time frame to handle numerous requests. Regarding security, session management
relates to securing and managing multiple users’ sessions and their requests. In
most cases, a session is initiated when a user supplies an authentication factor such as
a password.
 A web application makes use of a session after a user has supplied the authentication
key or password. Based on the authentication, the user is then provisioned to access
specific resources on the application.

Session ID
 A session ID, also known as a session token, is a unique identifier assigned by a
website’s server to a specific user for the duration of the user’s visit to the website. This
session ID is stored in the form of a cookie, form field, or URL.
 Each time a user opens a web browser and visits a website, a session ID is generated.
The session ID remains the same for some time. If a user closes the browser and
reopens it to visit the site, a new session ID is created.
 The session ID or token binds the user’s authenticated identity to their HTTP
traffic. The web application then applies the appropriate access controls and permissions.

Session Cookies
 A session cookie contains data stored in a temporary memory area and deleted
after the session is finished or the web browser is closed. This cookie stores data that
the user has entered and tracks the user’s movements within the website.
 This kind of cookie is created without a set expiration date, unlike a persistent cookie,
which has an expiration date attached to it. Since a session cookie is temporary, it
doesn’t gather data from the user’s PC or identify the user.

Attacks related to Sessions


 When authentication and session management are not properly secured and
configured, adversaries can steal passwords or session IDs to access users’
accounts and spoof their identities.
 If session IDs are compromised, adversaries can impersonate other users on the
network, system, or application. This kind of attack is known as session hijacking,
where the attacker can brute force, predict, or otherwise expose session tokens.

Session fixation is another type of attack that enables attackers to hijack a user’s valid session
ID.
 According to Acunetix, “The attack explores a limitation in the way the web
application manages the session ID, more specifically the vulnerable web application.
When authenticating a user, it doesn’t assign a new session ID, making it possible to
use an existent session ID.
 The attack consists of inducing a user to authenticate himself with a known session ID
and then hijacking the user-validated session with the knowledge of the used session
ID. The attacker has to provide a legitimate web application session ID and try to
make the victim’s browser use it.”
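The standard defense, sketched here with the Servlet API (3.1 and later), is to issue a fresh
session ID the moment a user authenticates, so any session ID an attacker planted
beforehand becomes worthless:

import javax.servlet.http.HttpServletRequest;

public class LoginHandler {
    // Called only after the submitted credentials have been verified.
    void onSuccessfulLogin(HttpServletRequest request, String username) {
        request.changeSessionId();          // discard the pre-login session ID
        request.getSession().setAttribute("user", username);
    }
}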

Best Practices for Implementing Session Management

 Having many points of attack related to a web session, i.e. a large attack surface, can
compromise web applications and sessions in many different ways. Below are some of
the best practices for implementing session management. Implementing these
practices will reduce the attack surface and minimize the risk and damage caused by
vulnerabilities and attacks resulting from improper session management.

 Setting secure HTTP flags on cookies

 Avoid sending sensitive traffic over unencrypted channels, i.e. HTTP. Set the
Secure flag, which ensures that cookies are transmitted only over encrypted protocols
such as HTTPS. The HttpOnly flag should also be set on session cookies to prevent
session hijacking through client-side JavaScript execution.
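A sketch of setting both flags with the Servlet API (the cookie name is illustrative):

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public class SessionCookieSetup {
    void addSessionCookie(HttpServletResponse response, String sessionId) {
        Cookie cookie = new Cookie("SESSIONID", sessionId);
        cookie.setSecure(true);     // only ever sent over HTTPS
        cookie.setHttpOnly(true);   // invisible to document.cookie in JavaScript
        cookie.setPath("/");
        response.addCookie(cookie);
    }
}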

 Generation of new session cookies

 New session token generation should be ensured at every step of the authentication
and interaction process, i.e. when a user visits an application or website and when the
user gets authenticated. In addition, the existing session should be terminated when a
user exits the application. Cookies should have an expiration time; this way, if an
account is inactive for a long time, the session will expire.

 Session cookies configuration

 Session tokens should not be easily guessable; they should be long, unique, and
unpredictable. This decreases the chances of an attacker successfully brute forcing
the session token. The expiration time of persistent cookies should be no longer than
30 minutes, so that attacks such as session fixation can be prevented.
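A sketch of generating such a token from a cryptographically secure random source:

import java.security.SecureRandom;
import java.util.Base64;

public class SessionTokenGenerator {
    private static final SecureRandom RANDOM = new SecureRandom();

    // 128 bits of secure randomness make the token infeasible to guess or
    // brute force; never derive session IDs from usernames, timestamps,
    // or counters.
    static String newToken() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}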

Session Management Best practices according to OWASP

The following are some of the best practices as per the OWASP

 Use a trusted server for creating session identifiers.


 Efficient algorithms should be used by the session management controls to ensure the
random generation of session identifiers.
 Ensure that the logging out functionality terminates the associated connection/session
entirely.
 Ensure that the session inactivity timeout is as short as possible; it is recommended
that the timeout be no more than a few hours.
 Generate a new session identifier when a user re-authenticates or opens a new
browser session.
 Implement periodic termination of sessions, especially for applications that provide
critical services.
 Appropriate access controls should be implemented to protect all server-side session
data from unauthorized access by other users.
