I was writing unit tests and needed to cause this function to kick out an error and return FALSE in order to test a specific execution path. If anyone else needs to force a failure, the following inputs will work:
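
A minimal sketch of such inputs, for anyone who wants something to paste into a test. These three strings all lack a host after the "//", which as far as I can tell is what makes parse_url() give up; the variable names are only for illustration.

```php
<?php
// URLs with a scheme and "//" but no host are rejected by parse_url().
$malformed = [
    'http:///example.com', // path starts right after "//", so the host is empty
    'http://:80',          // a port but no host
    'http://user@:80',     // user info and a port, but no host
];

// Collect the result for each input; every entry is expected to be false.
$results = array_map('parse_url', $malformed);

var_dump($results);
```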

@ solenoid: Your code was very helpful, but it fails when the current URL has no query string (it appends '&' instead of '?' before the query). Below is a fixed version that catches this edge case and corrects it.
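
The separator check described here can be sketched roughly as follows. This is only an illustration of the idea, not solenoid's function; append_query is a hypothetical name, and it assumes the URL has no #fragment.

```php
<?php
// Hypothetical helper: append a query string to a URL, picking the right separator.
function append_query(string $url, string $query): string
{
    // No '?' yet means no query string, so start one; otherwise continue it with '&'.
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . $query;
}

echo append_query('http://example.com/page', 'b=2'), "\n";     // http://example.com/page?b=2
echo append_query('http://example.com/page?a=1', 'b=2'), "\n"; // http://example.com/page?a=1&b=2
```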

Here's a simple class I made that makes use of parse_url. I needed a way for a page to retain GET parameters but also edit or add to them. I also had some pages that needed the same GET parameters, so I added a way to change the path as well.

I have coded a function which converts relative URL to absolute URL for a project of mine. Considering I could not find it elsewhere, I figured I would post it here.

The following function takes two parameters: the first is the URL you want to convert from relative to absolute, and the second is a sample of the absolute URL.

Currently it does not resolve '../' in the URL, only because I do not need it. Most webservers will resolve this for you. If you want it to resolve the '../' in the path, it just takes minor modifications.
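
For anyone who does need it, the '../' handling can be sketched like this. It is a simplified take on the idea behind RFC 3986 section 5.2.4, not the modification to the function above; resolve_dot_segments is a hypothetical name, it assumes an absolute path, and a trailing '..' or '.' loses its final slash.

```php
<?php
// Hypothetical helper: collapse "." and ".." segments in an absolute path.
function resolve_dot_segments(string $path): string
{
    $output = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            // Go up one level, but never remove the empty first element
            // that stands for the leading "/".
            if (count($output) > 1) {
                array_pop($output);
            }
        } elseif ($segment !== '.') {
            $output[] = $segment;
        }
    }
    return implode('/', $output);
}

echo resolve_dot_segments('/a/b/../c'), "\n";       // /a/c
echo resolve_dot_segments('/a/./b/../../c'), "\n";  // /c
```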

parse_url doesn't work if the protocol isn't specified. Omitting it seems to be standard practice: even YouTube leaves out the protocol name when it generates embed code, which looks like "//youtube.com/etc".

So, to avoid bugs, always check whether the provided URL has a protocol, and if it doesn't (it starts with two slashes), add the "http:" prefix.
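
A minimal sketch of that check. add_default_scheme is a hypothetical name, and "http" as the default is just the prefix suggested above.

```php
<?php
// Hypothetical helper: give URLs that start with "//" a default scheme.
function add_default_scheme(string $url, string $scheme = 'http'): string
{
    if (strncmp($url, '//', 2) === 0) {
        return $scheme . ':' . $url;
    }
    return $url; // already has a scheme, or is not protocol-relative
}

$parts = parse_url(add_default_scheme('//youtube.com/etc'));
// $parts['scheme'] is "http", $parts['host'] is "youtube.com", $parts['path'] is "/etc"
print_r($parts);
```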

I've realized that even though UTF-8 characters are not allowed in URLs, I have to work with a lot of URLs that contain them, and parse_url() will break on those.

Based largely on the work of "mallluhuct at gmail dot com", I added parse_url()-compatible "named values", which make the array values a lot easier to work with (instead of just numbers). I also implemented detection of port and username/password, plus a back-reference to better detect URLs like this: //en.wikipedia.com
Although this is technically an invalid URL, it is used extensively on sites like Wikipedia in the href of anchor tags, where it is valid in browsers (one of the types of URL you have to support when crawling pages). It will be accurately detected as the host name instead of "path" as in all other examples.

I will submit my complete function (instead of just the RegExp), which is an almost "drop-in" replacement for parse_url(). It returns a cleaned-up array (or false) with values compatible with parse_url(). I could have told preg_match() not to store the unused extra values, but that would complicate the RegExp and make it more difficult to read, understand and extend. The key to detecting UTF-8 characters is the use of the "u" modifier in the preg_match() pattern.
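
To show what the "u" modifier changes, here is a small demonstration. It is not part of the function described above; the host name is made up, and the "ü" is written as escaped bytes so the example does not depend on the file's encoding.

```php
<?php
// "b\xC3\xBCcher" is "bücher" in UTF-8; the "ü" occupies two bytes.
$host = "b\xC3\xBCcher.example";

preg_match('/^(.)(.)/', $host, $byteMode);  // without "u": "." matches a single byte
preg_match('/^(.)(.)/u', $host, $utf8Mode); // with "u": "." matches a whole character

var_dump(strlen($byteMode[2])); // int(1): only the first byte of "ü"
var_dump(strlen($utf8Mode[2])); // int(2): the complete two-byte "ü"
```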

Here's a function which implements resolving a relative URL according to RFC 2396 section 5.2. No doubt there are more efficient implementations, but this one tries to remain close to the standard for clarity. It relies on a function called "unparse_url" to implement section 7, left as an exercise for the reader (or you can substitute the "glue_url" function posted earlier).
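
For readers who would rather not do the exercise, one possible "unparse_url" is sketched below. It simply glues a parse_url()-style array back together and is an assumption about what such a helper could look like, not the version the function above was written against.

```php
<?php
// Sketch: rebuild a URL string from the array returned by parse_url().
function unparse_url(array $parts): string
{
    $scheme = isset($parts['scheme']) ? $parts['scheme'] . ':' : '';

    // user[:pass]@ is only meaningful when there is a host.
    $userinfo = $parts['user'] ?? '';
    if (isset($parts['pass'])) {
        $userinfo .= ':' . $parts['pass'];
    }
    if ($userinfo !== '') {
        $userinfo .= '@';
    }

    $port      = isset($parts['port']) ? ':' . $parts['port'] : '';
    $authority = isset($parts['host']) ? '//' . $userinfo . $parts['host'] . $port : '';

    $path     = $parts['path'] ?? '';
    $query    = isset($parts['query']) ? '?' . $parts['query'] : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $scheme . $authority . $path . $query . $fragment;
}

$url = 'http://user:pw@example.com:8080/a/b?x=1#top';
var_dump(unparse_url(parse_url($url)) === $url); // bool(true)
```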

Note that if you pass this function a url without a scheme (www.php.net, as opposed to http://www.php.net), the function will incorrectly parse the results. In my test case it returned the domain under the ['path'] element and nothing in the ['host'] element.
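
The behaviour is easy to reproduce with a short check like this one (the variable names are only for illustration):

```php
<?php
$withoutScheme = parse_url('www.php.net');
$withScheme    = parse_url('http://www.php.net');

var_dump(isset($withoutScheme['host'])); // bool(false)
var_dump($withoutScheme['path']);        // string(11) "www.php.net"
var_dump($withScheme['host']);           // string(11) "www.php.net"
```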

There was one thing missing in the function posted by "to1ne at hotmail dot com" when I tried it: the domain and subdomain couldn't contain a dash "-". So I added it to the regexp, and the function now looks like this:

Here's a method to get the REAL name of a domain. It returns just the domain name, not the rest. First check that the input is not an IP, then return the name:

<?php
function esip($ip_addr)
{
    // First, match the overall format of the IP address
    if (preg_match("/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/", $ip_addr)) {
        // Now separate the integer values
        $parts = explode(".", $ip_addr);
        // Each part must be in the range 0-255
        foreach ($parts as $ip_parts) {
            if (intval($ip_parts) > 255 || intval($ip_parts) < 0) {
                return FALSE; // a number is outside the range 0-255
            }
        }
        return TRUE;
    }
    return FALSE; // the format of the IP address doesn't match
}
?>

Thanks to xellisx for his parse_query function. I used it in one of my projects and it works well. But it has an error. I fixed the error and improved it a little bit. Here is my version of it:

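<?php
// Originally written by xellisx
function parse_query($var)
{
    /**
     * Use this function to parse out the query array element from
     * the output of parse_url().
     */
    $var = parse_url($var, PHP_URL_QUERY);
    $var = html_entity_decode($var);
    $var = explode('&', $var);
    $arr = array();

    // Assumed completion: the listing is cut off after this point, so the
    // loop below is a guess at how the key=value pairs were collected.
    foreach ($var as $val) {
        $x = explode('=', $val, 2);
        $arr[$x[0]] = isset($x[1]) ? $x[1] : '';
    }
    return $arr;
}
?>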

In the first line there was parse_query($val); I changed it to $var. Before this fix the function returned a null array.

I have added the parse_url line, so the function now looks only at the query part, not the whole URL. This is useful if something like the following is done:
<?php
$my_GET = parse_query($_SERVER['REQUEST_URI']);
?>

        // CALCULATES THE SECOND LEVEL DOMAIN POSITION IN THE ARRAY ONCE THE POSITION OF THE TOP LEVEL DOMAIN IS IDENTIFIED
        $l2 = $l1 - 1;
    } else {
        // INCREMENTS THE COUNTER FOR THE TOP LEVEL DOMAIN POSITION IF NO MATCH IS FOUND
        $l1++;
    }
}

// RETURN THE SECOND LEVEL DOMAIN AND THE TOP LEVEL DOMAIN IN THE FORMAT LIKE "SOMEDOMAIN.COM"
echo $tldArray[$l2] . '.' . $tldArray[$l1];
}

My function catches the URL typed into the browser by the user and does the same thing as parse_url, but better, I think. I don't like parse_url because it says nothing about elements that it doesn't find in the URL. My function returns an empty string for them instead.

out[0] = full url
out[1] = scheme or '' if no scheme was found
out[2] = username or '' if no auth username was found
out[3] = password or '' if no auth password was found
out[4] = domain name or '' if no domain name was found
out[5] = port number or '' if no port number was found
out[6] = path or '' if no path was found
out[7] = query or '' if no query was found
out[8] = fragment or '' if no fragment was found

But everything should be tested against the two examples provided by RFC3986,

"foo://username:password@example.com:8042/over/there/index.dtb;type=animal?name=ferret#nose"
and
"urn:example:animal:ferret:nose"

Here the native function parse_url() performs admirably on that "urn:" example. Mine fails to pick out the path ("example:animal:ferret:nose") and the laulibrius/theoriginalmarksimpson function can't decipher anything there. On the "foo:" example, both my function and parse_url() get it right, while the other examples on this page don't.
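
For reference, this is roughly how the native function can be checked against the two RFC 3986 examples. It only exercises parse_url() itself, not the other functions on this page, and the variable names are mine.

```php
<?php
$foo = parse_url('foo://username:password@example.com:8042/over/there/index.dtb;type=animal?name=ferret#nose');
$urn = parse_url('urn:example:animal:ferret:nose');

// "foo:" example: every component is picked out.
// scheme "foo", user "username", pass "password", host "example.com", port 8042,
// path "/over/there/index.dtb;type=animal", query "name=ferret", fragment "nose"
print_r($foo);

// "urn:" example: only a scheme ("urn") and a path ("example:animal:ferret:nose").
print_r($urn);
```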