Please note this is a Request For Comments topic, not a discussion. It is not meant to serve as a discussion for whether Pretty URLs are useful or not, rather to give my suggested implementation and receive comments and suggestions on this specific implementation.

The idea here would be to covertly stuff the ID (and some identifier to show what sort of ID it is) into the URL. This ensures that collisions are impossible and that no lookups are required to check for similar URL slugs.

When you submit a topic, the title is cleaned and "-t{ID}" is tacked on to the end to produce the final slug, which is stored in the database. When you visit a page, the lookup goes only by the topic ID; the rest of the text is simply dummy text. The page will check whether the URL slug is correct and issue a 301 redirect if it does not match the one stored in the database. This will allow topic title changes to happen seamlessly, and the 301 will tell search engines to update their links.
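The check-and-redirect step described above could be sketched roughly as follows. This is illustrative Python rather than phpBB code; `STORED_SLUGS`, `TOPIC_ID_RE`, and `resolve_topic` are hypothetical names, and the dictionary stands in for the database column holding the slug.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in for the database: topic ID -> slug stored at submit time.
STORED_SLUGS = {2133523: "phpbb-at-oscon-july-26-28-t2133523"}

# The topic ID is the run of digits after a trailing "t" (preceded by "-" or nothing).
TOPIC_ID_RE = re.compile(r"(?:^|-)t(\d+)$")


def resolve_topic(url_slug: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, slug to redirect to or None) for a requested slug."""
    match = TOPIC_ID_RE.search(url_slug)
    if match is None:
        return 404, None  # no topic ID in the URL at all
    stored = STORED_SLUGS.get(int(match.group(1)))
    if stored is None:
        return 404, None  # ID is well-formed but no such topic exists
    if url_slug != stored:
        return 301, stored  # dummy text is stale or missing: redirect
    return 200, None  # slug matches the stored one: serve the page
```

Only the ID takes part in the lookup, so a renamed topic or a bare "t2133523" both end up at the stored slug through a single 301.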

An example of a redirect:

User clicks an old url: http://www.phpbb.com/community/viewtopic.php?f=14&t=2133523

.htaccess (which is not aware of anything on the DB side) will redirect here: http://www.phpbb.com/community/f14/t2133523/

User lands on the page, which detects that the slugs do not match. It will then redirect them here: http://www.phpbb.com/community/announcements-f14/phpbb-at-oscon-july-26-28-t2133523/
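The first hop in that example needs no database access, because the short form is built purely from the query string. In the proposal this is the job of .htaccess rewrite rules; the Python below only illustrates the mapping, and `legacy_to_short_path` is a hypothetical name.

```python
from urllib.parse import parse_qs, urlsplit


def legacy_to_short_path(url: str) -> str:
    """Map .../viewtopic.php?f=<forum>&t=<topic> to .../f<forum>/t<topic>/."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # int() rejects anything that is not a plain number.
    forum_id = int(query["f"][0])
    topic_id = int(query["t"][0])
    # Drop the script name and keep the board's base directory.
    base = parts.path.rsplit("/", 1)[0]
    return f"{base}/f{forum_id}/t{topic_id}/"
```

For the old URL above this yields `/community/f14/t2133523/`, which the page itself then redirects to the full slug.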

naderman wrote: Can you explain more precisely how the cleaning of the topic title would work? What are your thoughts on handling Unicode? Are there compatibility problems with webservers and Unicode in paths?

Should it be u<i> or m<i> for user/member? We have a member list, but typically refer to users.

Cleaning the topic title would basically strip all odd characters and punctuation, replace whitespace with a single hyphen ( - ), and possibly run a UTF-8 strtolower() on the slug as well. That should give us a nice pretty URL. Unicode can and should be preserved, though we may have to tweak the text parsing engine a little because of the issue below. I have not checked specifically whether anything other than Apache supports Unicode paths, but here is a simple example implementation on my test server:

Currently, it looks like phpBB is not interested in parsing these as URLs.
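The cleaning step described above (strip punctuation, collapse whitespace into a single hyphen, lowercase, append the ID) might look something like this. It is a minimal Python sketch under those assumptions, not the actual implementation; `clean_title` is a hypothetical name, and the choice to fall back to the bare "t{ID}" when nothing survives cleaning is mine.

```python
import re


def clean_title(title: str, topic_id: int) -> str:
    """Build a slug of the form "<cleaned-title>-t<ID>" from a topic title."""
    # str.lower() is Unicode-aware, so non-ASCII letters are lowercased too.
    text = title.lower()
    # Drop everything except word characters, whitespace, and hyphens.
    # For str patterns, \w also matches non-ASCII letters and digits.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse runs of whitespace, underscores, and hyphens into one hyphen.
    text = re.sub(r"[\s_-]+", "-", text).strip("-")
    # Assumption: if nothing is left, use the bare ID marker as the slug.
    return f"{text}-t{topic_id}" if text else f"t{topic_id}"
```

With the title "phpBB at OSCON, July 26-28!" and ID 2133523 this produces the slug used in the redirect example, and a title such as "Über Größe?" keeps its non-ASCII letters.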

I chose "m" for member simply because it is accessed via "memberlist.php". "u" would not be used for anything otherwise, so it could easily be changed if it makes more sense to continue to refer to members as users.

Such a feature would have to be optional and disabled by default. For one, because of the listed drawbacks (especially leaking information), but it probably also requires the webserver to cooperate.

bantu wrote: Such a feature would have to be optional and disabled by default. For one, because of the listed drawbacks (especially leaking information), but it probably also requires the webserver to cooperate.

A way around that is to have a click-through page (for all external links) that just acts as an intermediary between the page the link is on and the link's destination. It would cause the referrer to be something like domain.com/click.php?url=http://google.com.
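As a rough illustration of that idea, external links could be rewritten when a post is rendered. The sketch below is Python with hypothetical names (`BOARD_HOST`, `click_through`); it assumes the board lives on domain.com as in the example and URL-encodes the destination, which the example URL above leaves unencoded.

```python
from urllib.parse import urlencode, urlsplit

BOARD_HOST = "domain.com"  # hypothetical board host, taken from the example


def click_through(link: str) -> str:
    """Route external links through click.php; leave internal links alone."""
    host = urlsplit(link).hostname
    if host is None or host == BOARD_HOST:
        return link  # relative or same-host link: nothing to hide
    # urlencode() escapes the destination so it survives as one parameter.
    return f"http://{BOARD_HOST}/click.php?" + urlencode({"url": link})
```

The destination site then sees click.php as the referrer instead of the pretty URL of the topic.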

That doesn't protect against people posting the links directly. On one of my boards the team has a public and a private forum, and team members will post links to the private section in the public forum.

I'd also like to see this made optional, but with all three formats available.

bantu wrote: Such a feature would have to be optional and disabled by default. For one, because of the listed drawbacks (especially leaking information), but it probably also requires the webserver to cooperate.

A way around that is to have a click-through page (for all external links) that just acts as an intermediary between the page the link is on and the link's destination. It would cause the referrer to be something like domain.com/click.php?url=http://google.com.

But then you can no longer copy links directly, which is a trivial thing to do right now.