Extract the file extension from a filename with regex
Sounds simple. The tricky bit is what counts as "the extension" for files like archive.tar.gz.
Last extension only
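function getExt(filename) {
  // The leading [^\\\/] requires a non-separator character before the dot,
  // so dotfiles like ".hidden" are not treated as having an extension.
  const m = filename.match(/[^\\\/]\.([^.\\\/]+)$/);
  return m ? m[1] : "";
}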
getExt("document.pdf"); // "pdf"
getExt("archive.tar.gz"); // "gz"
getExt("no-extension"); // ""
getExt(".hidden"); // "" (no separate extension)
The character class [^.\\\/]+ excludes dots and both path separators, so the match cannot span a directory boundary. Without it, a path like /home/user.name/file would yield name/file as the extension.
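To see the difference, here is a small comparison. Both function names are hypothetical and exist only for this illustration: naiveExt excludes dots only, while safeExt uses the separator-aware character class.

```javascript
// Hypothetical naive variant: excludes dots but not path separators.
function naiveExt(p) {
  const m = p.match(/\.([^.]+)$/);
  return m ? m[1] : "";
}

// Separator-aware variant: the match cannot cross "/" or "\".
function safeExt(p) {
  const m = p.match(/\.([^.\\\/]+)$/);
  return m ? m[1] : "";
}

console.log(naiveExt("/home/user.name/file"));     // "name/file"
console.log(safeExt("/home/user.name/file"));      // ""
console.log(safeExt("/home/user.name/file.txt"));  // "txt"
```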
Multi-part extensions (tar.gz, tar.bz2)
function getCompoundExt(filename) {
  const m = filename.match(/(\.[^.\\\/]+(\.[^.\\\/]+)?)$/);
  return m ? m[1] : "";
}
getCompoundExt("archive.tar.gz"); // ".tar.gz"
getCompoundExt("backup.sql.bz2"); // ".sql.bz2"
getCompoundExt("file.txt"); // ".txt"
This grabs the last two dot-separated segments when both exist, so app.config.json yields .config.json. Restrict the pattern if you only want compound extensions for known cases like .tar.gz.
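One way to restrict it is a whitelist inside the pattern. This is a minimal sketch, assuming tar.gz, tar.bz2 and tar.xz are the only compound extensions you care about. The names KNOWN_EXT and getKnownCompoundExt are made up for this example.

```javascript
// Assumed whitelist: only tar.gz, tar.bz2 and tar.xz count as compound.
// Anything else falls through to the "last extension only" alternative.
const KNOWN_EXT = /\.(?:tar\.(?:gz|bz2|xz)|[^.\\\/]+)$/i;

function getKnownCompoundExt(filename) {
  const m = filename.match(KNOWN_EXT);
  return m ? m[0] : "";
}

console.log(getKnownCompoundExt("archive.tar.gz"));  // ".tar.gz"
console.log(getKnownCompoundExt("app.config.json")); // ".json"
console.log(getKnownCompoundExt("file.txt"));        // ".txt"
console.log(getKnownCompoundExt("notes.tar"));       // ".tar"
```

Extend the inner alternation as new compound formats show up in your data.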
Just the name without extension
function stripExt(filename) {
  // Capture the character before the dot so dotfiles like ".bashrc" are left intact.
  return filename.replace(/([^\\\/])\.[^.\\\/]+$/, "$1");
}
stripExt("document.pdf"); // "document"
stripExt("archive.tar.gz"); // "archive.tar"
Python
import os
name, ext = os.path.splitext("document.pdf")
# name = "document", ext = ".pdf"
Don't use regex for this in Python: os.path.splitext already handles the edge cases for you. A hidden file like ".bashrc" comes back with an empty extension, and dots in directory names are ignored.
Use language built-ins when available
Most languages have a built-in or widely used library function for this:
- Python: os.path.splitext
- Node.js: path.extname
- Go: filepath.Ext
- Java: FilenameUtils.getExtension (Apache Commons IO, a third-party library)
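For Node.js, a quick sketch of the built-in path module on the same inputs as the regex examples. Note that path.extname keeps the leading dot in its result.

```javascript
const path = require("path");

console.log(path.extname("document.pdf"));   // ".pdf"
console.log(path.extname("archive.tar.gz")); // ".gz"
console.log(path.extname("no-extension"));   // ""
console.log(path.extname(".hidden"));        // ""

// path.parse returns the base name and the extension in one call.
const { name, ext } = path.parse("/home/user.name/report.csv");
console.log(name, ext); // report .csv
```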
Regex works when you're processing strings outside a filesystem context, like log lines or CSV rows where the filename is part of a longer string.
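Inside a longer string the $ anchor no longer helps, so the pattern needs its own idea of where a filename ends. Here is a rough sketch under two assumptions: a filename ends at whitespace, a quote, a comma, or the end of the string, and an extension starts with a letter. The sample log line and CSV row are made up, and findExts is a hypothetical helper.

```javascript
// Assumption: a filename ends at whitespace, a quote, a comma, or end of string.
// The extension must start with a letter so "10.0.0.1" is not picked up.
const EXT_IN_TEXT = /\.([A-Za-z][A-Za-z0-9]*)(?=[\s"',]|$)/g;

function findExts(text) {
  // matchAll yields every match; keep only the captured extension.
  return Array.from(text.matchAll(EXT_IN_TEXT), (m) => m[1]);
}

// Made-up inputs, for illustration only.
console.log(findExts("GET /static/app.min.js 200 from 10.0.0.1")); // [ 'js' ]
console.log(findExts('42,"report.final.pdf",1024'));               // [ 'pdf' ]
```

This will still produce false positives on things like domain names (example.com), so tighten the delimiters or add an extension whitelist to fit your data.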