AWS S3 Naming Conventions

Updated on December 10th, 2022

When confirming how files are named in S3 buckets ahead of migration with our workflows (proxy creation, etc.), names should be as easy to work with as possible. The safe characters listed below are usable in S3 file names; however, some of them should also be removed from folder and subfolder names. Folder naming should always be considered in addition to the file name convention.

If file or folder names do not match the conventions below, they need to be cleaned up. A CSV list of files and folders that fail to meet these requirements can be produced (this workflow is currently being optimised), but doing so requires list object access to the bucket being checked.

Safe Characters

The following character sets are generally safe for use in key names:

  • Alphanumeric characters [0-9a-zA-Z]

  • Special characters !, -, _, ., *, ', (, and )

Although these characters are safe for file naming in S3, the name is also used as an identifying key for ingest, so characters that are invalid in Windows file names (the asterisk, for example) cannot be used either.

As such, special characters should really be limited to:

  • Special characters !, -, _, ., and '
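The restricted character set above can be checked mechanically. The following is a minimal Python sketch, not part of any existing workflow: the function name `is_safe_key` is hypothetical, and it assumes that the forward slash is used only as the folder delimiter and that a single trailing slash marks a folder key.

```python
import re

# Restricted set from the list above: alphanumerics plus ! - _ . '
# Assumption: "/" is only ever the folder delimiter, so each segment is checked separately.
_SAFE_SEGMENT = re.compile(r"[0-9A-Za-z!\-_.']+")


def is_safe_key(key: str) -> bool:
    """Return True if every folder/file segment uses only the restricted characters."""
    # Assumption: trailing slashes only mark a folder key, so they are ignored.
    segments = key.rstrip("/").split("/")
    # An empty segment (empty key, or "a//b") fails the match and is reported as unsafe.
    return all(_SAFE_SEGMENT.fullmatch(segment) for segment in segments)


if __name__ == "__main__":
    for candidate in ["projects/2022/clip_001.mov", "projects/my clip.mov", "a*b.mov"]:
        print(candidate, is_safe_key(candidate))
```

This sketch applies the same character set to folder names and file names. If folder names are restricted further, a second pattern would be needed for the folder segments.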

Characters That Might Require Special Handling

The following characters in a key name (file name) might require additional code handling and will likely need to be URL encoded or referenced as hex. Some of them are non-printable characters that your browser might not handle, which also requires special handling:

Ampersand ("&")

Dollar ("$")

ASCII character ranges 00–1F hex (0–31 decimal) and 7F (127 decimal)

'At' symbol ("@")

Equals ("=")

Semicolon (";")

Colon (":")

Plus ("+")

Space – Significant sequences of spaces may be lost in some uses (especially multiple spaces)

Comma (",")

Question mark ("?")

 

Because these characters will very likely need to be URL encoded, for example when using the SQS queue to ingest, it is advisable to avoid them wherever possible. Some of them are often rewritten in transit: a space, for instance, will be replaced by SQS with a + character. Workflows can sanitize this, but it is much better to avoid the problem altogether. If these characters cannot be avoided, workflows must be correctly configured to convert them to their URL-encoded form, and potentially back again, depending on whether you are downloading a file from S3, checking that a file exists on S3, or uploading to S3.
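As an illustration of the encode and decode round trip described above, here is a small Python sketch using only the standard library. The helper names `decode_event_key` and `encode_key` are hypothetical, and the sketch assumes the key arrives in form-style URL encoding, where a space is written as + and other reserved characters as %XX.

```python
from urllib.parse import quote, unquote_plus


def decode_event_key(encoded_key: str) -> str:
    """Turn a URL-encoded key (spaces as '+', others as %XX) back into the real key."""
    # unquote_plus converts '+' to a space first, then decodes %XX escapes,
    # so a literal plus sign encoded as %2B survives as '+'.
    return unquote_plus(encoded_key)


def encode_key(key: str) -> str:
    """Percent-encode a key for use in a URL, leaving the '/' folder delimiter intact."""
    return quote(key, safe="/")


if __name__ == "__main__":
    print(decode_event_key("my+clip%26final.mov"))
    print(encode_key("my clip&final.mov"))
```

Which direction is needed depends on the step: a key received in a message would typically be decoded before it is used to download or check the object, while a key placed into a URL would be encoded.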

Characters to Avoid

Avoid the following characters in a key name, because they require significant special handling to behave consistently across all applications.

Backslash ("\")

Left curly brace ("{")

Non-printable ASCII characters (128–255 decimal characters)

Caret ("^")

Right curly brace ("}")

Percent character ("%")

Grave accent / back tick ("`")

Right square bracket ("]")

Quotation marks

'Greater Than' symbol (">")

Left square bracket ("[")

Tilde ("~")

'Less Than' symbol ("<")

'Pound' character ("#")

Vertical bar / pipe ("|")

These characters cause issues with on-disk file naming (i.e., Windows naming), with AWS S3 buckets, and with URL encoding consistency. They cannot be used at all.
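The two character lists in this article can be turned into a simple report of offending keys. The Python sketch below is illustrative only and is not the CSV workflow mentioned earlier: `classify_key` and `build_report` are hypothetical names, listing the keys from the bucket is not shown, "quotation marks" is assumed to mean the double quote, and the 128–255 range is assumed to map to the code points with those numbers.

```python
import csv
import io

# "Characters That Might Require Special Handling": & $ @ = ; : + space , ?
# plus control characters 0x00-0x1F and 0x7F.
SPECIAL_HANDLING = set("&$@=;:+ ,?") | {chr(c) for c in range(0x00, 0x20)} | {"\x7f"}

# "Characters to Avoid". Assumptions: quotation marks means the double quote,
# and 128-255 is taken as the code points with those numbers.
AVOID = set("\\{}^%`[]\"<>~#|") | {chr(c) for c in range(128, 256)}


def classify_key(key: str) -> str:
    """Return 'avoid', 'special handling', or 'not listed' for a key name."""
    if any(ch in AVOID for ch in key):
        return "avoid"  # the stricter category wins when both apply
    if any(ch in SPECIAL_HANDLING for ch in key):
        return "special handling"
    return "not listed"  # in neither list; does not by itself mean the name is approved


def build_report(keys) -> str:
    """Return CSV text listing only the keys that fall into one of the two lists."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "status"])
    for key in keys:
        status = classify_key(key)
        if status != "not listed":
            writer.writerow([key, status])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_report(["clip_001.mov", "my clip.mov", "take#2.mov"]))
```

A key reported as "not listed" may still contain characters outside the restricted safe set, so this check complements the safe-character convention and does not replace it.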

Reference: AWS Object Key
