Path Traversal by Portswigger Academy
#Path Traversal
Path traversal, also known as directory traversal, is a vulnerability that allows an attacker to read arbitrary files on the server running an application.
###What This Might Include:
- >Application code and data
- >Login credentials
- >Sensitive operating system files
In some cases, an attacker may even write to arbitrary files, allowing them to modify application data or behavior and ultimately take full control of the server.
##Reading Arbitrary Files
Imagine an application that embeds an image on its homepage. The HTML for the page might look like this:

```html
<img src="/img?filename=image1.png">
```

Here, the `img` URL accepts a `filename` parameter and fetches and returns the specified file. Typically, image files are stored under `/var/www/images`. To return the requested file, the application appends the user-provided filename to this base directory and uses the filesystem API to read its contents.
If the application does not implement defenses against path traversal attacks, attackers can exploit this behavior to retrieve sensitive files from the server.
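The vulnerable pattern can be sketched in Java as follows. This is a minimal illustration, not the code of any real application: the class name `ImageHandler` and the method `buildPath` are hypothetical, and only the `/var/www/images` base directory and the `filename` parameter come from the text. The method only builds the path string; a real handler would then pass it to a file-reading API.

```java
final class ImageHandler {
    // Base directory from the example above.
    static final String BASE_DIRECTORY = "/var/www/images";

    // Hypothetical vulnerable helper: concatenates the raw request
    // parameter onto the base directory with no validation at all.
    static String buildPath(String filename) {
        return BASE_DIRECTORY + "/" + filename;
    }

    public static void main(String[] args) {
        // Intended use: prints /var/www/images/image1.png
        System.out.println(buildPath("image1.png"));
        // Attacker input: prints /var/www/images/../../../etc/passwd
        System.out.println(buildPath("../../../etc/passwd"));
    }
}
```

Because the attacker's `../` segments are kept in the final string, the operating system is free to walk out of the image directory when the file is opened.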
For example, an attacker could request the `/etc/passwd` file (which contains information about users registered on the server) using the following:
https://insecure-website.com/img?filename=../../../etc/passwd
###How It Works:
- >The `..` sequence represents the parent directory in Linux.
- >By chaining `../` three times, the path resolves to the root directory (`/`), and the application then accesses `/etc/passwd`.
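The resolution described above can be checked with a short Java sketch. It assumes a Unix-like system (on Windows the separators would differ), and the class name `TraversalDemo` and method `resolve` are hypothetical. `Path.normalize()` collapses `..` segments lexically, without reading the disk, which roughly mirrors what the operating system does when it opens the file.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class TraversalDemo {
    // Joins the base directory and the requested name, then lexically
    // collapses "." and ".." segments without touching the filesystem.
    static String resolve(String baseDirectory, String filename) {
        Path combined = Paths.get(baseDirectory, filename);
        return combined.normalize().toString();
    }

    public static void main(String[] args) {
        // Each "../" climbs one level: images -> www -> var -> root.
        // On a Unix-like system this prints /etc/passwd
        System.out.println(resolve("/var/www/images", "../../../etc/passwd"));
    }
}
```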
##Common Obstacles to Exploiting Path Traversal Vulnerabilities
Many applications attempt to defend against path traversal by filtering or blocking directory traversal sequences. However, these defenses can often be bypassed.
###Bypass Techniques:
- >Using Absolute Paths
  Directly reference a file using its absolute path, without any traversal sequences.
  Example: `GET /image?filename=/etc/passwd HTTP/2`
- >Nested Traversal Sequences
  Use patterns like `....//` or `....\/`, which simplify to `../` when the inner sequence is stripped.
  Example: `GET /image?filename=....//....//....//etc/passwd HTTP/2`
- >URL Encoding or Double URL Encoding
  If the application strips `../`, encode the slash as `%2F` (URL encoding) or `%252F` (double URL encoding), so that the sequence arrives as `..%2F` or `..%252F`.
  Example: `GET /image?filename=..%252F..%252F..%252Fetc%252Fpasswd HTTP/2`
- >Required Base Folder
  If the application requires filenames to start with a base folder, append traversal sequences after the folder.
  Example: `GET /image?filename=/var/www/images/../../../etc/passwd HTTP/2`
- >Required File Extension
  Use a null byte (`%00`) to terminate the file path before the required extension.
  Example: `GET /image?filename=../../../etc/passwd%00.png HTTP/2`
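To see why simple filters fail, the sketch below models two of the bypasses in Java. The filter `stripOnce` and the class `FilterBypassDemo` are hypothetical stand-ins for a naive defense; they are not taken from any real framework. The double-encoding case assumes the request is URL-decoded once before the filter runs and once more afterwards, which is one common way this flaw arises.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class FilterBypassDemo {
    // Hypothetical naive defense: removes every "../" in a single pass
    // and never re-checks the result.
    static String stripOnce(String input) {
        return input.replace("../", "");
    }

    // One round of URL decoding, as a server or framework might apply.
    static String decode(String input) {
        return URLDecoder.decode(input, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Nested sequences: removing the inner "../" from each "....//"
        // leaves a fresh "../" behind. Prints ../../../etc/passwd
        System.out.println(stripOnce("....//....//....//etc/passwd"));

        // Double encoding: after the first decode the value contains
        // "%2F", not "/", so the filter finds nothing to strip. The
        // second decode restores the slashes. Prints ../../../etc/passwd
        String onceDecoded = decode("..%252F..%252F..%252Fetc%252Fpasswd");
        System.out.println(decode(stripOnce(onceDecoded)));
    }
}
```

In both cases the filter runs exactly as written, yet the value that reaches the filesystem API still contains working traversal sequences.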
##How to Prevent a Path Traversal Attack
###Best Practices:
- >Avoid Passing User-Supplied Input to Filesystem APIs
  Rewrite application functions to avoid processing filesystem paths directly.
- >Validate User Input
  - >Use a whitelist of permitted values for user input.
  - >If not feasible, ensure the input contains only safe characters (e.g., alphanumeric only).
- >Canonicalize and Verify Paths
  - >Append the input to the base directory.
  - >Use a platform filesystem API to canonicalize the resulting path.
  - >Verify that the canonicalized path starts with the expected base directory.
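Example (Java):

```java
// BASE_DIRECTORY is assumed to be a canonical path that ends with a
// separator, so that sibling directories sharing the same prefix
// (for example /var/www/images-old) do not pass the check.
File file = new File(BASE_DIRECTORY, userInput);
if (file.getCanonicalPath().startsWith(BASE_DIRECTORY)) {
    // process file
}
```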
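The validation and canonicalization steps can also be combined in one helper. The following is a sketch under stated assumptions rather than a definitive implementation: the class `SafeImageResolver`, the method `resolve`, and the `SAFE_NAME` pattern are hypothetical, and the base directory is the `/var/www/images` path used earlier. It uses `java.nio.file.Path`, whose `startsWith` compares whole path components instead of raw string prefixes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

final class SafeImageResolver {
    // Assumed base directory, taken from the earlier example.
    static final Path BASE_DIRECTORY =
            Paths.get("/var/www/images").toAbsolutePath().normalize();

    // Hypothetical safe-character rule: a name made of letters, digits,
    // underscores, or hyphens, one dot, and a short alphanumeric extension.
    static final Pattern SAFE_NAME =
            Pattern.compile("[A-Za-z0-9_-]+\\.[A-Za-z0-9]{1,5}");

    // Returns the resolved path, or empty if the input is rejected.
    static Optional<Path> resolve(String userInput) {
        // Step 1: validate the raw input against the safe-character rule.
        if (userInput == null || !SAFE_NAME.matcher(userInput).matches()) {
            return Optional.empty();
        }
        // Step 2: append to the base directory and collapse "." and "..".
        Path candidate = BASE_DIRECTORY.resolve(userInput).normalize();
        // Step 3: verify the result is still inside the base directory.
        if (!candidate.startsWith(BASE_DIRECTORY)) {
            return Optional.empty();
        }
        return Optional.of(candidate);
    }
}
```

With this sketch, `resolve("image1.png")` yields a path under the base directory, while `resolve("../../../etc/passwd")` and `resolve("/etc/passwd")` are both rejected at the validation step. Note that `normalize()` is purely lexical and does not follow symbolic links; for files that already exist, `Path.toRealPath()` may be a better fit for the canonicalization step.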
By implementing these strategies, you can significantly reduce the risk of path traversal vulnerabilities in your application.