CWE - CWE-20 - Improper Input Validation (4.14)
Description
The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are
required to process the data safely and correctly.
Extended Description
Input validation is a frequently used technique for checking potentially dangerous inputs in order to ensure that the inputs are
safe for processing within the code, or when communicating with other components. When software does not validate input
properly, an attacker is able to craft the input in a form that is not expected by the rest of the application. Parts of the
system then receive unintended input, which may result in altered control flow, arbitrary control of a resource, or arbitrary
code execution.
Input validation is not the only technique for processing input, however. Other techniques attempt to transform potentially
dangerous input into something safe, such as filtering (CWE-790), which attempts to remove dangerous inputs, or
encoding/escaping (CWE-116), which attempts to ensure that the input is not misinterpreted when it is included in output to
another component. Other techniques exist as well (see CWE-138 for more examples).
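The difference between validation, filtering, and encoding/escaping can be sketched in a few lines of Python. This is only an illustration: the username rule, the function names, and the choice of HTML as the output context are assumptions made for the example and are not part of the CWE entry.

```python
import html
import re

# Hypothetical rule for the example: 3-20 ASCII letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(value: str) -> str:
    """Validation: accept or reject the input, never modify it."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("username must be 3-20 letters, digits, or underscores")
    return value


def filter_username(value: str) -> str:
    """Filtering: remove every character outside the allowed set."""
    return re.sub(r"[^A-Za-z0-9_]", "", value)


def escape_for_html(value: str) -> str:
    """Encoding/escaping: keep the content, but make it inert in HTML output."""
    return html.escape(value, quote=True)
```

Validation leaves the data untouched and reports a failure, filtering silently changes the data, and escaping preserves the data while adapting it to one specific output context.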
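Input validation can be applied to raw data (strings, numbers, parameters, file contents, etc.) and to metadata (information
about the raw data, such as headers or size).
Data can be simple or structured. Structured data can consist of many nested layers, each combining metadata and raw data
with other simple or structured data.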
Many properties of raw data or metadata may need to be validated upon entry into the code, such as:
- specified quantities such as size, length, frequency, price, rate, number of operations, time, etc.
- implied or derived quantities, such as the actual size of a file instead of a specified size
- indexes, offsets, or positions into more complex data structures
- symbolic keys or other elements into hash tables, associative arrays, etc.
- well-formedness, i.e. syntactic correctness - compliance with expected syntax
- lexical token correctness - compliance with rules for what is treated as a token
- specified or derived type - the actual type of the input (or what the input appears to be)
- consistency - between individual data elements, between raw data and metadata, between references, etc.
- conformance to domain-specific rules, e.g. business logic
- equivalence - ensuring that equivalent inputs are treated the same
- authenticity, ownership, or other attestations about the input, e.g. a cryptographic signature to prove the source of the data
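Several of the properties listed above can be checked together at the point where data enters the code. The following Python sketch validates a specified type, a specified quantity, an index, and syntactic correctness for a hypothetical order request. The field names, the `MAX_QUANTITY` limit, and the `validate_order` function are assumptions invented for illustration.

```python
from datetime import date

MAX_QUANTITY = 1000  # assumed domain-specific limit, not taken from the CWE entry


def validate_order(raw: dict, catalog: list[str]) -> dict:
    """Check several input properties before the order is processed."""
    # Specified type: bool is a subclass of int in Python, so reject it explicitly.
    quantity = raw.get("quantity")
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValueError("quantity must be an integer")
    # Specified quantity: must fall inside the allowed range.
    if not 1 <= quantity <= MAX_QUANTITY:
        raise ValueError("quantity out of range")

    # Index: a negative value would silently wrap around in a Python list.
    index = raw.get("item_index")
    if isinstance(index, bool) or not isinstance(index, int):
        raise ValueError("item_index must be an integer")
    if not 0 <= index < len(catalog):
        raise ValueError("item_index out of range")

    # Syntactic correctness: the date must parse as an ISO 8601 calendar date.
    try:
        ship_date = date.fromisoformat(raw.get("ship_date"))
    except (TypeError, ValueError):
        raise ValueError("ship_date must be an ISO date such as 2024-05-01") from None

    return {"item": catalog[index], "quantity": quantity, "ship_date": ship_date}
```

The function returns a new, typed structure, so later code works with validated values rather than the raw request.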
Implied or derived properties of data must often be calculated or inferred by the code itself. Errors in deriving properties may
be considered a contributing factor to improper input validation.
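As an illustration of a derived property, the sketch below parses a hypothetical record format in which a 4-byte big-endian length header precedes the payload. The code derives the actual payload size itself and compares it with the specified size. The format, the `MAX_PAYLOAD` limit, and the `parse_record` name are assumptions made for the example.

```python
import struct

MAX_PAYLOAD = 1024  # assumed upper bound for this example


def parse_record(buf: bytes) -> bytes:
    """Parse one record: 4-byte big-endian length header, then the payload."""
    if len(buf) < 4:
        raise ValueError("truncated header")
    (declared,) = struct.unpack(">I", buf[:4])

    # Specified quantity: reject sizes beyond what the code is prepared to handle.
    if declared > MAX_PAYLOAD:
        raise ValueError("declared length exceeds limit")

    # Derived quantity: compute the actual payload size instead of trusting the header.
    actual = len(buf) - 4
    if declared != actual:
        raise ValueError("declared length does not match actual payload size")

    return buf[4:]
```

An error in computing `actual` (for example, forgetting to subtract the header size) would make the consistency check itself unreliable, which is the kind of derivation error the paragraph above describes.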
Note that "input validation" has very different meanings to different people, or within different classification schemes. Caution
must be used when referencing this CWE entry or mapping to it. For example, some weaknesses involve inadvertently giving an
attacker control over an input that they should not be able to provide at all; this is sometimes referred to as an input
validation problem.
Finally, it is important to emphasize that the distinctions between input validation and output escaping are often blurred, and
developers must be careful to understand the difference, including how input validation is not always sufficient to prevent
vulnerabilities, especially when less stringent data types must be supported, such as free-form text. Consider a SQL injection
scenario in which a person's last name is inserted into a query. The name "O'Reilly" would likely pass the validation step since it
is a common last name in the English language. However, this valid name cannot be inserted verbatim into the query string
because it contains the apostrophe character ('), which would need to be escaped or otherwise transformed. In this case,
removing the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong
name would be recorded.
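The "O'Reilly" scenario can be sketched with Python's standard `sqlite3` module. Note that the example uses parameter binding (`?` placeholders) rather than manual escaping, which is one common way to keep the value from being interpreted as SQL. The table name, the name pattern, and both function names are assumptions made for illustration.

```python
import re
import sqlite3

# Assumed validation rule: letters, apostrophes, spaces, and hyphens only.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z' \-]{0,63}")


def validate_last_name(name: str) -> str:
    """Validation accepts "O'Reilly" unchanged because it is a legitimate name."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("invalid last name")
    return name


def store_and_fetch(name: str) -> str:
    """Store a validated name and read it back using bound parameters."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (last_name TEXT NOT NULL)")
        # The placeholder passes the value separately from the SQL text,
        # so the apostrophe is stored as data and never parsed as syntax.
        conn.execute(
            "INSERT INTO people (last_name) VALUES (?)",
            (validate_last_name(name),),
        )
        row = conn.execute(
            "SELECT last_name FROM people WHERE last_name = ?", (name,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

The name passes validation and is recorded exactly as entered. Validation alone did not make the query safe; the binding step did.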
Relationships
Relevant to the view "Research Concepts" (CWE-1000)
Nature      ID    Name
ChildOf     707   Improper Neutralization
ParentOf    179   Incorrect Behavior Order: Early Validation
ParentOf    622   Improper Validation of Function Hook Arguments
ParentOf    1173  Improper Use of Validation Framework
ParentOf    1284  Improper Validation of Specified Quantity in Input
ParentOf    1285  Improper Validation of Specified Index, Position, or Offset in Input
ParentOf    1286  Improper Validation of Syntactic Correctness of Input
ParentOf    1287  Improper Validation of Specified Type of Input
ParentOf    1288  Improper Validation of Consistency within Input
ParentOf    1289  Improper Validation of Unsafe Equivalence in Input
PeerOf      345   Insufficient Verification of Data Authenticity
CanPrecede  22    Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
CanPrecede  41    Improper Resolution of Path Equivalence
CanPrecede  74    Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')
CanPrecede  119   Improper Restriction of Operations within the Bounds of a Memory Buffer
CanPrecede  770   Allocation of Resources Without Limits or Throttling
Relevant to the view "Weaknesses for Simplified Mapping of Published Vulnerabilities" (CWE-1003)
Relevant to the view "Architectural Concepts" (CWE-1008)
Relevant to the view "Seven Pernicious Kingdoms" (CWE-700)
Vulnerability Mapping Notes
Usage: DISCOURAGED (this CWE ID should not be used to map to real-world vulnerabilities)
Rationale:
CWE-20 is commonly misused in low-information vulnerability reports when lower-level CWEs could be used instead, or
when more details about the vulnerability are available [REF-1287]. It is not useful for trend analysis. It is also a level-1
Class (i.e., a child of a Pillar).
Comments:
Consider lower-level children such as Improper Use of Validation Framework (CWE-1173) or improper validation involving
specific types or properties of input such as Specified Quantity (CWE-1284); Specified Index, Position, or Offset
(CWE-1285); Syntactic Correctness (CWE-1286); Specified Type (CWE-1287); Consistency within Input (CWE-1288); or Unsafe
Equivalence (CWE-1289).
Notes
Relationship
CWE-116 and CWE-20 have a close association because, depending on the nature of the structured message, proper
input validation can indirectly prevent special characters from changing the meaning of a structured message. For
example, by validating that a numeric ID field contains only the characters 0-9, the programmer effectively
prevents injection attacks through that field.
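A minimal Python sketch of the numeric ID check follows. The 10-digit length limit and the `parse_numeric_id` name are assumptions for the example. The pattern spells out `[0-9]` because `\d` and `str.isdigit()` also accept non-ASCII digits when applied to Python strings.

```python
import re

# ASCII digits only; the 1-10 digit length bound is an assumed limit.
ID_RE = re.compile(r"[0-9]{1,10}")


def parse_numeric_id(value: str) -> int:
    """Accept the field only if it consists entirely of the characters 0-9."""
    if not ID_RE.fullmatch(value):
        raise ValueError("id must be 1-10 ASCII digits")
    return int(value)
```

Because quotes, spaces, and operators cannot pass the check, a value such as `1 OR 1=1` is rejected before it can reach a query or command.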
Terminology
The "input validation" term is extremely common, but it is used in many different ways. In some cases its usage can
obscure the real underlying weakness or otherwise hide chaining and composite relationships.
Some people use "input validation" as a general term that covers many different neutralization techniques for ensuring
that input is appropriate, such as filtering, canonicalization, and escaping. Others use the term in a narrower
sense to mean simply "checking if an input conforms to expectations without changing it." CWE uses this
narrower interpretation.
Content History
Submissions
Submission Date   Submitter               Organization
2006-07-19        7 Pernicious Kingdoms
(CWE Draft 3, 2006-07-19)