Validate URL syntax with JavaScript

Something that I initially thought would be simple turned into hours of googling for solutions. The problem? I wanted to use JavaScript to check if a URL uses valid syntax.

What I ended up doing was using js-uri to parse the URLs I want to check, then running a couple of regular expressions against the URL parts it returns.

For my use case I’m not particularly interested in checking that the domain exists or verifying that the URL is live. I just want to check that the path, query, and fragment parts contain only the characters that RFC 3986 allows.
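To make those character rules concrete, here is a rough sketch that builds the two patterns from the named rules in the RFC 3986 grammar (unreserved, sub-delims, pct-encoded, pchar). The variable names are my own and only mirror the RFC's rule names; the patterns are meant to accept the same strings as the ones in the script below.

```javascript
// Building blocks from the RFC 3986 grammar; variable names are illustrative.
var unreserved = "-a-z0-9._~";    // ALPHA / DIGIT / "-" / "." / "_" / "~"
var subDelims = "!$&'()*+,;=";    // sub-delims
var pctEncoded = "%[0-9a-f]{2}";  // "%" HEXDIG HEXDIG
// pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
var pcharClass = unreserved + subDelims + ":@";

// A path is made of pchar characters and "/" separators.
var pathPattern = new RegExp("^(?:[" + pcharClass + "/]|" + pctEncoded + ")*$", "i");
// Queries and fragments additionally allow "?".
var queryFragmentPattern = new RegExp("^(?:[" + pcharClass + "/?]|" + pctEncoded + ")*$", "i");

console.log(pathPattern.test("/a/b%20c"));           // true
console.log(pathPattern.test("/a b"));               // false: raw space
console.log(queryFragmentPattern.test("x=1&y=/a?b")); // true
```

Spelling the rules out this way makes it easier to see that the only difference between the two patterns is the extra "?" allowed in queries and fragments.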

Here’s the script I use for that:

// Returns true if the path, query, and fragment of the given URI string
// contain only characters allowed by RFC 3986. Requires js-uri (URI).
var isValidURI = function(uri) {
	var parsed = new URI(uri),
		path = parsed.getPath(),
		query = parsed.getQuery(),
		fragment = parsed.getFragment(),
		// Allowed in URL paths: unreserved characters, sub-delimiters,
		// ":", "@", "/", and percent-encoded octets
		pathRE = /^([-a-z0-9._~:@!$&'()*+,;=\/]|%[0-9a-f]{2})*$/i,
		// Querystrings and URL fragments allow the same characters plus "?"
		qfRE = /^([-a-z0-9._~:@!$&'()*+,;=?\/]|%[0-9a-f]{2})*$/i;

	// Parts that are missing or empty are skipped
	if (path && !pathRE.test(path)) {
		return false;
	}
	if (query && !qfRE.test(query)) {
		return false;
	}
	if (fragment && !qfRE.test(fragment)) {
		return false;
	}
	return true;
};

This may not be bulletproof, since parsing URLs is tricky, and it does not look at the scheme or authority at all. But as far as I can tell it works with the test cases I’ve thrown at it.
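To try the check without loading js-uri, here is a minimal sketch that swaps the library for the generic component-splitting regular expression from RFC 3986, Appendix B. The names splitURI and isValidURISyntax and the sample URLs are hypothetical and not part of js-uri or the script above, and js-uri may behave differently in edge cases.

```javascript
// Stand-in for js-uri: the component-splitting regexp from RFC 3986, Appendix B.
var splitRE = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: returns the path, query, and fragment of a URI string.
function splitURI(uri) {
	var m = splitRE.exec(uri);
	return { path: m[5], query: m[7], fragment: m[9] };
}

// Same character rules as the script above.
var pathRE = /^([-a-z0-9._~:@!$&'()*+,;=\/]|%[0-9a-f]{2})*$/i;
var qfRE = /^([-a-z0-9._~:@!$&'()*+,;=?\/]|%[0-9a-f]{2})*$/i;

// Hypothetical variant of isValidURI that does not depend on js-uri.
function isValidURISyntax(uri) {
	var parts = splitURI(uri);
	if (parts.path && !pathRE.test(parts.path)) { return false; }
	if (parts.query && !qfRE.test(parts.query)) { return false; }
	if (parts.fragment && !qfRE.test(parts.fragment)) { return false; }
	return true;
}

console.log(isValidURISyntax("http://example.com/a/b?x=1#top")); // true
console.log(isValidURISyntax("http://example.com/a%20b"));       // true
console.log(isValidURISyntax("http://example.com/a b"));         // false: raw space in path
console.log(isValidURISyntax("http://example.com/?q=<tag>"));    // false: "<" in query
console.log(isValidURISyntax("http://example.com/%zz"));         // false: bad percent-encoding
console.log(isValidURISyntax("http://example.com/#a#b"));        // false: "#" inside fragment
```

Like the original, this only checks which characters appear in each part; it says nothing about whether the scheme or host is sensible.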

If you spot any errors in this method, or happen to know of a better way of verifying URL syntax with JavaScript, I’m all ears.

Posted on May 3, 2011 in JavaScript