Extracting Domain Name from URL with Perl
I often need to extract the domain name from a URL in the software I write, so I thought I would share my method. It is written in Perl.
The method consists of three consecutive regular-expression substitutions.
It discards any leading www. from the domain name, but it does not discard other subdomains (blog.example.com, for instance, stays blog.example.com).
First, remove the http:// or https:// and any leading www. from the front of the URL:
$url =~ s!^https?://(?:www\.)?!!i;
The above uses "!" as the delimiter for the substitution operator instead of "/" to avoid having to escape the embedded "/" characters. The "i" at the end makes the match case-insensitive.
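For comparison, here is a small sketch of the same substitution written with both delimiter styles. The sample URL and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample URL, used only for illustration.
my $with_bang  = 'HTTP://WWW.Example.com/page.html';
my $with_slash = $with_bang;

# "!" delimiters: the slashes in "://" need no escaping.
$with_bang =~ s!^https?://(?:www\.)?!!i;

# "/" delimiters: every embedded slash must be escaped.
$with_slash =~ s/^https?:\/\/(?:www\.)?//i;

print "$with_bang\n";   # Example.com/page.html
print "$with_slash\n";  # Example.com/page.html
```

Both forms produce the same result. Note that the "i" flag only affects matching; the remaining text keeps its original letter case.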
Then, strip off everything from the first "/" to the end of the URL (doing nothing if there is no "/"):
$url =~ s!/.*!!;
Last, in case the URL was of the form http://example.com?stuff, http://example.com#stuff, or http://example.com:80/whatever, also strip off everything from the first "?", "#", or ":" (a query string, fragment, or port number), if present:
$url =~ s/[\?\#\:].*//;
The value of $url is now the domain name by itself.
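Put together, the three steps might be wrapped in a small subroutine like the sketch below. The name extract_domain and the sample URLs are assumptions for illustration, not part of the method as described above.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the three substitutions described above.
sub extract_domain {
    my ($url) = @_;
    $url =~ s!^https?://(?:www\.)?!!i;  # drop http:// or https:// and a leading "www."
    $url =~ s!/.*!!;                    # drop everything from the first "/"
    $url =~ s/[\?\#\:].*//;             # drop query string, fragment, or port
    return $url;
}

# Sample URLs (made up for illustration).
for my $url (
    'https://www.example.com/some/page.html',
    'http://example.com?stuff',
    'http://example.com:80/whatever',
    'http://blog.example.com#top',
) {
    print extract_domain($url), "\n";
}
```

The first three sample URLs print example.com, and the last prints blog.example.com, since only a leading www. is discarded. The result keeps the letter case of the input, so passing the return value through lc may be useful if a normalized form is needed.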
Will Bontrager