According to the Java API [API 2006] for class `java.io.File`:
A pathname, whether abstract or in string form, may be either absolute or relative. An absolute pathname is complete in that no other information is required to locate the file that it denotes. A relative pathname, in contrast, must be interpreted in terms of information taken from some other pathname.
Absolute or relative path names may contain file links such as symbolic (soft) links, hard links, shortcuts, shadows, aliases, and junctions. These file links must be fully resolved before any file validation operations are performed. For example, the final target of a symbolic link called `trace` might be the path name `/home/system/trace`. Path names may also contain special file names that make validation difficult:
- "." refers to the directory itself.
- Inside a directory, the special file name ".." refers to the directory's parent directory.
In addition to these specific issues, a wide variety of operating system–specific and file system–specific naming conventions make validation difficult.
Canonicalizing file names makes it easier to validate a path name. More than one path name can refer to a single directory or file. Further, the textual representation of a path name may yield little or no information regarding the directory or file to which it refers. Consequently, all path names must be fully resolved or canonicalized before validation.
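To make the point about multiple names concrete, the following minimal sketch spells one file four different ways and canonicalizes each spelling. The class name `CanonicalEquivalenceDemo`, the method `canonicalForms()`, and the scratch directory layout are assumptions made for illustration only; they are not part of this guideline.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.LinkedHashSet;
import java.util.Set;

final class CanonicalEquivalenceDemo {

    // Returns the distinct canonical forms of several spellings of one file.
    static Set<String> canonicalForms() {
        try {
            // Hypothetical scratch layout: <tmp>/img/java/file1.txt
            File base = Files.createTempDirectory("canon-demo").toFile();
            File javaDir = new File(base, "img/java");
            if (!javaDir.mkdirs()) {
                throw new IOException("could not create " + javaDir);
            }
            Files.write(new File(javaDir, "file1.txt").toPath(), new byte[0]);

            String root = base.getPath();
            String[] spellings = {
                root + "/img/java/file1.txt",
                root + "/img/./java/file1.txt",
                root + "/img/java/../java/file1.txt",
                root + "/img/../img/java/./file1.txt",
            };

            // A set keeps only distinct canonical path names.
            Set<String> canonical = new LinkedHashSet<>();
            for (String spelling : spellings) {
                canonical.add(new File(spelling).getCanonicalPath());
            }
            return canonical;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // All four spellings are expected to collapse to one canonical name.
        System.out.println(canonicalForms());
    }
}
```

Because every spelling maps to the same canonical string, a validation routine only has to reason about one name per file.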
Validation may be necessary, for example, when attempting to restrict user access to files within a particular directory or to otherwise make security decisions based on a file name or path name. An attacker can frequently circumvent these restrictions by exploiting a directory traversal or path equivalence vulnerability. A directory traversal vulnerability allows an I/O operation to escape a specified operating directory. A path equivalence vulnerability occurs when an attacker provides a different but equivalent name for a resource to bypass security checks.
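A directory traversal can be illustrated with a small sketch that compares a purely textual check against a check made after the special names are collapsed. The class and method names are hypothetical. The sketch uses `Path.normalize()`, which works on the text of the path only and does not consult the file system, so it does not resolve links and is not a substitute for full canonicalization.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class TraversalCheckDemo {

    // Naive check: looks only at the text of the unresolved path name.
    // It is trivially true because the prefix was just prepended.
    static boolean naiveCheck(String userInput) {
        return ("/img/" + userInput).startsWith("/img/");
    }

    // Lexical check: collapse "." and ".." first, then compare name elements.
    static boolean normalizedCheck(String userInput) {
        Path base = Paths.get("/img");
        Path requested = base.resolve(userInput).normalize();
        return requested.startsWith(base);
    }

    public static void main(String[] args) {
        String attack = "../etc/passwd";
        System.out.println(naiveCheck(attack));                // true: the check is bypassed
        System.out.println(normalizedCheck(attack));           // false: the path escapes /img
        System.out.println(normalizedCheck("java/file1.txt")); // true: the path stays inside /img
    }
}
```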
Canonicalization contains an inherent race window between the time the program obtains the canonical path name and the time it opens the file. While the canonical path name is being validated, the file system may have been modified and the canonical path name may no longer reference the original valid file. Fortunately, this race condition can be easily mitigated. The canonical path name can be used to determine whether the referenced file name is in a secure directory (see FIO00-J. Do not operate on files in shared directories for more information). If the referenced file is in a secure directory, then, by definition, an attacker cannot tamper with it and cannot exploit the race condition.
This recommendation is a specific instance of IDS01-J. Normalize strings before validating them.
Noncompliant Code Example
This noncompliant code example allows the user to specify the path of an image file to open. By prepending `/img/` to the user-supplied path, this code attempts to enforce a policy that only files in this directory should be opened. The program also uses the `isInSecureDir()` method defined in FIO00-J. Do not operate on files in shared directories.
However, the user can still specify a file outside the intended directory by entering an argument that contains `../` sequences. An attacker can also create a link in the `/img` directory that refers to a directory or file outside of that directory. The path name of the link might appear to reside in the `/img` directory and consequently pass validation, but the operation will actually be performed on the final target of the link, which can reside outside the intended directory.
```java
File file = new File("/img/" + args[0]);
if (!isInSecureDir(file)) {
  throw new IllegalArgumentException();
}
FileOutputStream fis = new FileOutputStream(file);
// ...
```
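The link scenario described above can be reproduced with a short sketch. It assumes a file system and account that permit creating symbolic links, and it uses a temporary directory as a stand-in for `/img`. The names `LinkEscapeDemo`, `checks()`, and `secret.txt` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class LinkEscapeDemo {

    // Returns { result of the textual check, result after resolving links }.
    static boolean[] checks() {
        try {
            // Scratch layout: <root>/img stands in for /img,
            // <root>/secret.txt is a file outside of it.
            Path root = Files.createTempDirectory("link-demo").toRealPath();
            Path img = Files.createDirectory(root.resolve("img"));
            Path secret = Files.createFile(root.resolve("secret.txt"));

            // The link lives inside img but refers to a file outside of it.
            Path link = Files.createSymbolicLink(img.resolve("trace"), secret);

            boolean textual = link.normalize().startsWith(img);   // name only
            boolean resolved = link.toRealPath().startsWith(img); // final target
            return new boolean[] { textual, resolved };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = checks();
        System.out.println("textual check passes:  " + result[0]);
        System.out.println("resolved check passes: " + result[1]);
    }
}
```

The textual check accepts the link because its name is inside the directory, while the resolved path shows that the operation would land outside it.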
Noncompliant Code Example (`getCanonicalPath()`)
This noncompliant code example attempts to mitigate the issue by using the `File.getCanonicalPath()` method, introduced in Java 2, which fully resolves the argument and constructs a canonicalized path. Special file names such as dot dot (`..`) are also removed so that the input is reduced to a canonicalized form before validation is carried out. An attacker cannot use `../` sequences to break out of the specified directory when the canonicalized path is validated. For example, the path `/img/../etc/passwd` resolves to `/etc/passwd`. The `getCanonicalPath()` method throws a security exception when used in applets because it reveals too much information about the host machine. The `getCanonicalFile()` method behaves like `getCanonicalPath()` but returns a new `File` object instead of a `String`.
Unfortunately, the canonicalization is performed after the validation, which renders the validation ineffective.
```java
File file = new File("/img/" + args[0]);
if (!isInSecureDir(file)) {
  throw new IllegalArgumentException();
}
String canonicalPath = file.getCanonicalPath();
FileOutputStream fis = new FileOutputStream(canonicalPath);
// ...
```
Compliant Solution (`getCanonicalPath()`)
This compliant solution obtains the file name from the untrusted user input, canonicalizes it, and then validates it against a list of benign path names. It operates on the specified file only when validation succeeds, that is, only if the file is one of the two valid files `file1.txt` or `file2.txt` in `/img/java`.
```java
File file = new File("/img/" + args[0]);
if (!isInSecureDir(file)) {
  throw new IllegalArgumentException();
}
String canonicalPath = file.getCanonicalPath();
if (!canonicalPath.equals("/img/java/file1.txt") &&
    !canonicalPath.equals("/img/java/file2.txt")) {
  // Invalid file; handle error
  throw new IllegalArgumentException();
}

FileInputStream fis = new FileInputStream(file);
```
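Where the policy is "any file under a given directory" rather than a fixed list of names, the same canonicalize-then-validate order might be sketched as follows. The helper `resolveInside()` and the class name are hypothetical and are not part of this guideline. The sketch compares paths with `Path.startsWith()`, which matches whole name elements, so a sibling directory whose name merely begins with the same characters is not accepted.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CanonicalContainment {

    // Hypothetical helper: canonicalize first, then validate containment.
    static File resolveInside(File baseDir, String userInput) {
        try {
            File base = baseDir.getCanonicalFile();
            File candidate = new File(base, userInput).getCanonicalFile();
            // Element-wise comparison of the two canonical paths.
            if (!candidate.toPath().startsWith(base.toPath())) {
                throw new IllegalArgumentException("path escapes base directory");
            }
            return candidate;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The system temp directory stands in for the intended directory.
        File base = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(resolveInside(base, "java/file1.txt"));
        try {
            resolveInside(base, "../outside.txt");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This check is still subject to the race window between canonicalization and opening the file, so it relies on the base directory being a secure directory in the sense of FIO00-J.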
Compliant Solution (Security Manager)
A comprehensive way to handle this issue is to grant the application the permissions to operate only on files present within the intended directory, which is the `/img` directory in this example. This compliant solution specifies the absolute path of the program in its security policy file and grants `java.io.FilePermission` with target `/img/java/*` and the `read` action.

This solution requires that the `/img` directory is a secure directory, as described in FIO00-J. Do not operate on files in shared directories.

```
// All files in /img/java can be read
grant codeBase "file:/home/programpath/" {
  permission java.io.FilePermission "/img/java/*", "read";
};
```
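The exact form of the permission target matters. As a rough way to check what a given target covers, the following sketch asks `FilePermission.implies()` directly, which works without a security manager installed. The class name `FilePermissionTargets` and the helper `covers()` are hypothetical.

```java
import java.io.FilePermission;

final class FilePermissionTargets {

    // True when a "read" permission with the given target covers
    // reading the given file.
    static boolean covers(String target, String file) {
        return new FilePermission(target, "read")
                .implies(new FilePermission(file, "read"));
    }

    public static void main(String[] args) {
        String file = "/img/java/file1.txt";
        // A plain path names only that one path.
        System.out.println(covers("/img/java", file));
        // A trailing "/*" covers the files directly inside the directory.
        System.out.println(covers("/img/java/*", file));
        // A trailing "/-" covers the directory tree recursively.
        System.out.println(covers("/img/java/-", file));
    }
}
```

A target without a trailing `/*` or `/-` is not expected to cover files inside the directory, so the policy file should use one of the wildcard forms when the intent is to allow reading the directory's contents.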
Risk Assessment
Using path names from untrusted sources without first canonicalizing them and then validating them can result in directory traversal and path equivalence vulnerabilities.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
FIO16-J | Medium | Unlikely | Medium | P4 | L3 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
The Checker Framework | 2.1.3 | Tainting Checker | Trust and security errors (see Chapter 8) |
Coverity | 7.5 | BAD_EQ | Implemented |
Fortify | 1.0 | Path_Manipulation | Implemented |
Parasoft Jtest | 2024.1 | CERT.FIO16.CDBV | Canonicalize data before validation |
Related Vulnerabilities
CVE-2005-0789 describes a directory traversal vulnerability in LimeWire 3.9.6 through 4.6.0 that allows remote attackers to read arbitrary files via a `..` (dot dot) in a magnet request.
CVE-2008-5518 describes multiple directory traversal vulnerabilities in the web administration console in Apache Geronimo Application Server 2.1 through 2.1.3 on Windows that allow remote attackers to upload files to arbitrary directories.
Related Guidelines
- FIO02-C. Canonicalize path names originating from tainted sources
- VOID FIO02-CPP. Canonicalize path names originating from untrusted sources
- Path Traversal [EWR]
- CWE-171, Cleansing, Canonicalization, and Comparison Errors
Android Implementation Details
This rule is applicable in principle to Android. Please refer to the Android-specific instance of this rule: DRD08-J. Always canonicalize a URL received by a content provider.
31 Comments
Robert Seacord (Manager)
The idea of canonicalizing path names may have some inherent flaws and may need to be abandoned. (One of) the problems is that there is an inherent race condition between the time you create the canonical name, perform the validation, and open the file during which time the canonical path name may have been modified and may no longer be referencing a valid file.
Dhruv Mohindra
EDIT: This guideline is broken. To be fixed...
I lack a good resource but I suspect wrapped method calls might partly eliminate the race condition:
Though the validation cannot be performed without the race unless the class is designed for it. Of course, the best thing to do is to use the security manager to prevent the sort of attacks you are validating for.
David Svoboda MGR
The 2nd CS looks like it will work on any file, and only do special stuff if the file is /img/java/file[12].txt. Not sure what was intended, but I would guess the 2nd CS is supposed to abort if the file is anything but /img/java/file[12].txt.
Robert Seacord (Manager)
I would like to reverse the order of the two examples. The first example is a bit of a disappointment because it ends with:
Needless to say, it would be preferable if the NCE showed an actual problem and not a theoretical one.
Robert Seacord (Manager)
I'm thinking of moving this to (back to) FIO because it is a specialization of another IDS rule dealing specifically with file names.
David Svoboda
I suspect we will at some future point need the notion of canonicalization to apply to something else besides filenames. (It could probably be applied to URLs.) So I would rather this rule stay in IDS. I don't think this rule overlaps with any other IDS rule.
David Svoboda
One comment...the isInSecureDir() method requires Java 7.
Robert Seacord
I'm reading this again 3 years later and I still think this should be in FIO. The fact that it references the isInSecureDir() method defined in FIO00-J. Do not operate on files in shared directories is a good indication of this. It doesn't really matter if you want to canonical something else. That rule may also go in a section specific to doing that sort of thing. I'm going to move.
Yozo TODA
there is a phrase "validation without canonicalization" in the explanation above the third NCE.
it sounds meaningless in this context for me, so I changed this phrase to "canonicalization without validation".
Robert Seacord
I reverted this change.
The problem of "validation without canonicalization" is that the pathname might contain symbolic links, etc. making it difficult if not impossible to tell, for example, what directory the pathname is referring to.
Yozo TODA
I know, I know, but I think the phrase "validation without canonicalization" should be for the second (and the first) NCE.
the third NCE did canonicalize the path but not validate it.
so, I bet the more meaningful phrase here is "canonicalization without validation..." (-:
David Svoboda
I agree. Changed the text to "canonicalization w/o validation".
Masaki Kubo
I think 3rd CS code needs more work. The code doesn't reflect what its explanation means. Also both of the if statements could evaluate true and I cannot exactly understand what's the intention of the code just by reading it.
David Svoboda
The code is good, but the explanation needed a bit of work to back it up...hopefully it's better now.
Masaki Kubo
Thanks David! The explanation is clearer now.
But... because the inside of if blocks is just "//do something" and the second if condition is "!canonicalPath.equals" which is different from the first if condition, the code still doesn't make much sense to me, maybe I'm not getting the point...
for example, it would make sense if the code reads something like:
David Svoboda
You're right, I cleaned up the code.
Masaki Kubo
Thanks David, I confirmed your fix.
Masaki Kubo
The following sentence seems a bit strange to me:
1 is canonicalization but 2 and 3 are not. what is "the validation" in step 2? validation between unresolved path and canonicalized path?(not explicitly written here) Or is it just trying to explain symlink attack? I don't get what it wants to convey although I could sort of guess.
David Svoboda
The race condition is between (1) and (3) above. I've rewritten the paragraph; hopefully it is clearer now.
Masaki Kubo
Thanks David!
I've rewritten your paragraph. How about this?
suggested replacement text:
David Svoboda
Hm, the beginning of the race window can be rather confusing. The window ends once the file is opened, but when exactly does it begin? We have always assumed that the canonicalization process verifies the existence of the file; in this case, the race window begins with canonicalization. (If a path name is never canonicalized, the race window can go back further, all the way back to whenever the path name is supplied. For instance, if a user types in a pathname, then the race window goes back further than when the program actually gets the pathname, because it goes through OS code and maybe GUI code too.) So it's possible that a pathname has already been tampered with before your code even gets access to it! This is ultimately not a solvable problem.
So the paragraph needs to make clear that the race window starts with canonicalization (when canonicalization is actually done).
Masaki Kubo
Okay, so in the first sentence:
"you" is not a programmer but some path canonicalization API such as getCanonicalPath(). I initially understood this block of text in the context of a validation with canonicalization by a programmer, not the internal process of path canonicalization itself. I think that's why the first sentence bothered me.
Thanks for the clarification!
David Svoboda
You're welcome. I took all references of 'you' out of the paragraph for clarification.
A Bishop
This rule has two compliant solutions for canonical path and for security manager. I'm not sure what difference is trying to be highlighted between the two solutions. Can they be merged?
David Svoboda
You can merge the solutions, but then they would be redundant.
For the problem the code samples are trying to solve (only allow the program to open files that live in a specific directory), both getCanonicalPath() and the SecurityManager are adequate solutions. The getCanonicalPath() function is useful if you want to do other tests on the filename based on its string. For instance, is the file really a .jpg or .exe? IIRC The Security Manager doesn't help you limit files by type. Also, the Security Manager limits where you can open files and can be unwieldy...if you want your image files in /image and your text files in /home/dave, then canonicalization will be an easier solution than constantly tweaking the security manager.
A Bishop
I was meaning can the two compliant solutions to do with security manager be merged, and can the two compliant solutions to do with getCanonicalPath be merged?
David Svoboda
Yes, they were kinda redundant. I've dropped the first NCCE + CS's.
G. Ann Campbell
Is / should this be different from IDS02-J. Canonicalize path names before validating them?
David Svoboda
No, since IDS02-J is merely a pointer to this guideline.
MT
Couple questions from my side:
In the first compliant solution, there is a check that the directory is safe, followed by a check that the file is one of the listed files. Correct me if I'm wrong, but I think the second check makes the first one redundant.
If I remember correctly, `getCanonicalPath` evaluates the path; would that make the check `canonicalPath.startsWith(secureLocation)` secure?
David Svoboda
The getCanonicalPath() will make the string checks that happen in the second check work properly. Without getCanonicalPath(), the path may indeed be one of the images, but obfuscated by a './' or '../' substring in the path.
Using canonicalPath.startsWith(secureLocation) would also be a valid way of making sure that a file lives in secureLocation, or a subdirectory of secureLocation.