New submitter matteocorti writes "I work at a medium-sized university and we are considering reducing the number of domains used for email addresses (now around 350): the goal is to have all the 30K personal addresses in a single domain. This will increase the clashes for the local part of the address for people with the same first and last name (1.6%). We are considering several options: one of them is to use 'username@domain.tld' and the other is to use 'first.last@domain.tld'. The first case will avoid any conflict in the addresses (usernames are unique) but the second is fancier. Which approach does your organization use? How are name conflicts (homonyms) solved? Manually or automatically (e.g., by adding a number)?"

Choosing to name yourself something that doesn't use modern characters (in both cases) is your own fault.

1 line of UTF-8 characters for "name" should cover everybody who matters. Trying to divide things up into first/last or force any other convention upon names is asking for trouble. (Although it's hilarious how many people's 3rd party form auto-fill software will enter just their first name into the "name" box when purchasing on my website for example...)

On the first point: Someone may be named using archaic Chinese characters in their native language, but if they're studying in, say, Germany, or in the United States, they're required to choose a Latin form of their name, which is what will be used for legal purposes. If they're studying in Russia, they must render it in the Cyrillic alphabet, and in Greece, in the Greek alphabet. If you're in one of those legal contexts, you can assume all employees and students have a name conforming to the local legal requirements. I have students from many countries in my classes, but they all use names written in Latin characters when signing up for courses or turning in homework.

On the second: The artist legally named Prince Rogers Nelson never changed his name. He's just used a variety of stage names.

I wish the designer of my company's setup had read that. I called an analyst from India who moved here Fnu for about a year before someone finally told me that was an acronym for "First name unknown" and her real name was her "Last" name.

People need to stop assuming everyone has a legal First and Last name.

Everyone has a name, which people pronounce out loud. English uses characters and combinations of characters to represent sounds. Thus, everyone has a Name which can be translated into English. In our society, people are assumed to have a first and last name, if you only have one name then the other can be assumed to be blank, empty, NULL, etc. but it is easily compensated for in any society which can grasp the concept of "zero" or "nothing". It's a trivial task to program for, if you can't handle an empty

Everyone has a name, which people pronounce out loud. English uses characters and combinations of characters to represent sounds. Thus, everyone has a Name which can be translated into English.

If this last statement has an accuracy requirement, then it is demonstrably false. Many (most? all?) languages do not have characters representing every sound that a human can make. For example, there is no letter or combination of letters in English that represents the sound of the guttural (I don't know the accurate linguistic term) letters Het and Haf. Conversely, Hebrew has no letter for the sound of the English combinations ch and th, though there is a letter for sh. You can get close enough for most purposes, such as using h or ch for those Hebrew letters, but if you pronounce them as if they were English, you'll be pronouncing the name incorrectly.

"People have exactly N names, for any value of N. People’s names fit within a certain defined amount of space."

So how many people have an uncertain number of names at any given time? Is your name involved in some quantum uncertainty fluctuation? And I do not believe that some people have infinite names. That is obviously untrue.

And I do not believe that some people have infinite names. That is obviously untrue.

I believe it is the old Welsh tradition (or maybe a similar variant) to have a single given name, followed by a listing of your patrilineal genealogy as a series of surnames. So, in this tradition, it is technically possible to have a name of unlimited length. However, the longest proven genealogy in modern times is 85 generations, which does put a maximum realistic size on this type of naming system.

My daughter was born in another country (Australia), her last name is my name and her mother's name with a hyphen in between.

The consulate of my country (Belgium) did not accept double names, so they only put my name on her passport.

When my daughter and her mother returned to my country a couple of months before I did, the local community (Schaerbeek) had a conflict with the ministry of foreign affairs and they were doing a boycott action: they unlawfully did not recognize any foreign birth certificates, so

I'll fess up and admit I've never actually put too much thought into human names.

Unlike Joel Spolsky, though, Patrick McKenzie doesn't actually try to point folks in any direction to the truth. What is the robust way to handle names that is simultaneously not in violation of all those misconceptions?

Most of those misconceptions don't matter, though.

Somewhere, you have a legal identifier (on a passport, driver's license, etc.) that can be called your "name". That identifier can be input with a single Unicode text box that has a reasonably insane length limit (say 512 characters). Even if someone has a name that won't fit in 512 characters, it's highly unlikely that you will have a collision because of truncation. Even if you do, it doesn't matter, because you shouldn't use the name as a unique record

In a university setting, some kind of western name assumption is typically already made: students and employees are in a database with family names and given names listed, and all sorts of communication is already generated from that (e.g. paychecks).

If the person is a United States resident, at least, they have something filled in in the "surname" and "given name" sections of their birth certificate (if born in the US), naturalization certificate, green card, or visa document. That might not be true in all western countries, but I know it's true in Denmark as well: to work or study legally in the country you need to register with the Citizen Register and list something in those boxes. Then the university will just use whatever your state registration sa

I would much prefer fullname@xyz.tld over full.name@xyz.tld. It just looks cleaner and is less confusing when spelling it out to people. You expect emails to format to string@string.string. Throwing in any additional symbols, especially one that's already used elsewhere, throws people off even if there's no technical reason not to.

For simplicity, I'd say go with username@domain.com. That way there is standardization across email and other systems... which also confuses people less. Our email system (Novell

Let me throw out an issue – maybe you have thought about it and can give me some clue.

Western nomenclature is given_name family_name. Eastern is flipped. Having a standard convention helps decode who you are talking to. If Kim is the given name, then they're probably female. If Kim is the last name then, well, 50/50 chance.

But at least when I pick up the phone I can choose to be formal (using the family name) or informal (using the given name).

You are not losing data since you have no idea that the data in the email is formatted family name last. I see this getting flipped all the time because the HR person is ignorant of the naming conventions.

In a course I once taught, I had two students of middle eastern descent, who were not related to each other, yet the first 47 letters of their names were the same. After the 48th and 49th letters, which were different, they again matched for another 10 letters, at which point one name ended, and the other continued. Many email programs will stop looking at the "full name" being assigned after a certain number of letters has been reached, and frankly, expecting someone to type that much just to send someon

Things don't have to be either-or. The email system can route both userid@domain.tld and First.Last@domain.tld (with First.M.Last for conflicts, and shortened forms for very long names if desired, or omitted at the user's discretion) to the proper users. There's no reason to restrict each user to one and only one address. I think most Western non-geeks would prefer First.Last where possible, and forcing people to remember some jumble of userid letters seems like a system designed for the ease of the impl
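To make the "route both forms to one mailbox" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `User` record, the `build_alias_table` and `resolve` helpers, and the rule that the first user in the list wins the plain First.Last alias. A real mail system would do this in its own alias maps, and real names need more normalisation than a lowercase comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    userid: str        # unique login name, assumed to exist already
    first: str
    last: str
    middle: str = ""   # may be empty


def build_alias_table(users):
    """Map every accepted local part (lowercased) to exactly one userid."""
    table = {}
    # The unique userids are registered first so an alias can never shadow them.
    for u in users:
        table[u.userid.lower()] = u.userid
    # Each user then gets the first free alias: First.Last, else First.M.Last.
    for u in users:
        candidates = [f"{u.first}.{u.last}"]
        if u.middle:
            candidates.append(f"{u.first}.{u.middle[0]}.{u.last}")
        for candidate in candidates:
            key = candidate.lower()
            if key not in table:
                table[key] = u.userid
                break
    return table


def resolve(table, address):
    """Return the userid that should receive mail for address, or None."""
    local, _, _domain = address.partition("@")
    return table.get(local.lower())
```

With two John Does in the list, `john.doe@domain.tld` reaches the first one, `john.m.doe@domain.tld` reaches the second, and both remain reachable through their userids. A user with no free alias simply keeps the userid address.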

In a name outside Western Europe and the Americas, the "chars that are not valid" may outnumber the characters that are valid in an e-mail address and accepted by most MTAs and MUAs. In such a case, what "known way" is the best practice?

This is a bad idea. If people know that the email address is "First.Last@domain.tld" they will just type it in without thinking. They will say I want to email John Doe and type John.Doe@domain.tld into the email address and fire off without bothering to check if there are multiple John Does.

The company I work for has an employee in HR that shares my first and last name. We have separate middle initials but because I was first I get First.Last and she gets First.M.Last. I often get rather awkward emails

Or you could, you know, conventionally assume the conventions of where your company is based, and treat special cases as special cases.

The key problem with this idea is the word "Automatically" in the title. Special cases are called "errors" in this scenario. And whether you plan to have a solution for them or just need code handling to catch and throw a meaningful, helpful exception when you encounter them, you need to try and predict what they will be. Humans are great at defining unforeseen exceptions. Software isn't.

University usernames aren't typically anonymous anyway. They're often pretty trivially generated from real names, e.g. bgates, and in any case you can usually go to university.edu/~username/ to look the person up.

The problem is, within a large organisation that will presumably be using directory and calendar services, you can end up making name lookup harder than it should be and/or confusing. In nearly every big company that I've worked with, 'jon.doe@xx.yyy' always ended up getting mail, and invited to meetings, that were intended for 'jon.doe1@xx.yyy'. (Outlook, Lotus Notes et al are all great at 'helping' you complete the 'to:' fields in this way)

My post was intended to be more of a joke about how Google, Yahoo, etcetera 'randomize' a common name. At some places I've worked, the first person there gets first.last, and if someone else comes along with the same name, they add MI. That's probably a better method than appending random characters, and can help a bit with the 'helpful' auto complete.

Here's a solution to this problem: If there is more than one John Doe, you change them _all_ to john.doe followed by a random but unique three digit number. john.doe itself is redirected and automatically gives a reply containing the list of correct john.doe email addresses plus some information that makes them identifiable.

So if I wanted to email John Doe in accounting, I'll get an email back telling me the CEO is john.doe386, there is john.doe196 in accounting, and the janitor john.doe412.
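A rough Python sketch of that scheme, for anyone who wants to try it. The function names, the `(first, last, description)` tuples and the wording of the reply are made up for illustration; the only parts taken from the comment are the random three-digit number for every colliding name and the bare address answering with a list.

```python
import random
from collections import defaultdict


def assign_addresses(people, rng=None):
    """people is a list of (first, last, description) tuples.

    Returns (addresses, directory): addresses maps each deliverable local
    part to a description, directory maps each ambiguous bare name to the
    list of (local part, description) pairs the auto-reply should show.
    """
    rng = rng or random.Random()
    groups = defaultdict(list)
    for first, last, description in people:
        groups[f"{first}.{last}".lower()].append(description)

    addresses, directory = {}, {}
    for base, descriptions in groups.items():
        if len(descriptions) == 1:
            addresses[base] = descriptions[0]   # unique name keeps the plain form
            continue
        # Every holder of a shared name gets a distinct random 3-digit number.
        numbers = rng.sample(range(100, 1000), len(descriptions))
        directory[base] = []
        for number, description in zip(numbers, descriptions):
            local = f"{base}{number}"
            addresses[local] = description
            directory[base].append((local, description))
    return addresses, directory


def auto_reply(directory, base):
    """Text sent back when someone writes to an ambiguous bare address."""
    lines = [f"{local}: {description}" for local, description in directory[base]]
    return "Several people share this name:\n" + "\n".join(lines)
```

Note that the bare `john.doe` address is deliberately absent from `addresses`, so nothing is ever delivered to the wrong John Doe; the sender has to pick from the list.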

Another solution is to add the abbreviated department, as in john.doe.ft@company.com, or, to reduce the collision risk even more, add the birthdate (month and day only), as in john.doe.0229@company.com. And nobody will forget your birthday anymore!

I would avoid punctuation as people will get it wrong and not realize the intended person did not get it. Worse, I have an account with a provider that ignores punctuation even though you can put it in your email address so first.last and first last both go to me. I had an idiot admin insist he had the correct email even though I told him I was getting emails with private information from him. He refused to verify the addy and suggested I change mine. I declined and said since I notified him of the privacy

If usernames won't give conflicts, then use them. And for the people who want fancier emails, you can add aliases of the form firstname.lastname as long as there are no duplicates.

One company I know did this. The username was derived from your real name, but because they knew there would be conflicts, they let you pick what you wanted. You could pick a first initial-last name if it was available, else first-name-last-name, first-name.last-name, or a few other combinations. You could choose any one of them (they ran a collision check

My university takes the unique usernames approach ( abc123@mail.domain.tld ), but also creates aliases for everyone ( generally in the form first.last@domain.tld , but the user actually can choose whatever they want, if there's a collision). Seems to work well enough.

username@domain.tld is the actual email address, with an automatic alias of firstname.lastname@domain.tld, and (if the user requests it) an additional alias of nickname@domain.tld. I have only refused one request for an alias - I decided it was stretching the bounds of "business appropriate" a bit too far.

My old company used first initial, middle initial, and the first 5 letters of your last name. Collisions were handled with numbers, so there were some usernames that were tdharry19@company.tld. It's the same idea as passwords, maximize your entropy to avoid collisions.
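For the curious, that rule fits in a few lines of Python. This is only a sketch: `make_username` and the `taken` set are hypothetical, and starting the collision counter at 1 is my assumption, since the comment only says collisions were handled with numbers.

```python
def make_username(first, middle, last, taken):
    """First initial + middle initial + first five letters of the last name.

    taken is the set of usernames already handed out; the new name is
    added to it. On a collision a counter is appended (1, 2, 3, ...).
    """
    base = (first[:1] + middle[:1] + last[:5]).lower()
    candidate, n = base, 0
    while candidate in taken:
        n += 1
        candidate = f"{base}{n}"
    taken.add(candidate)
    return candidate
```

Called three times for Tom Dick Harry against the same `taken` set, this yields `tdharry`, `tdharry1` and `tdharry2`. An empty middle name just drops the second letter.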

A lot of places these days have added something: usernames and e-mail addresses are no longer identical. Makes it a tiny bit harder to get usernames for your network. So your username is tdharry19, but your e-mail address is Tom.Dick.Harry@com

This should prevent any name clashes and still move all the emails to one domain and even preserve the similar format the users already have. New users may not even need their own .subdomain after the email name, but you'll be adding them as you go forward and can check for clashes when they are added and maybe just add a .subdomain to them, or numbers to the end.

What happens when their subdomain changes because they change jobs or departments? This effectively re-instates one of the reasons to get away from 350 different domain/subdomain combinations in the first place, as the OP is doing.

Based on my experience, I expect 99% of your students and a non-trivial percentage of your faculty will just forward their university email account to their personal Gmail account. They won't much care what their university address is (okay, faculty WILL still care and express their opinions, even though they won't be using it).

The staff will be the only group that actually uses your email offerings with any sort of consistency.

As others have pointed out, any assumption you make about names is probably wrong for somebody. Some simple examples: I am on the system as 'samuel' but I am known as 'sam'. I have colleagues who are known by their middle name or by their anglicised name.

It sounds like you already have globally unique usernames, so that would be a good starting point. You could then offer people an alias, suggesting fullname, first.last or first.initial.last, but allowing reasonable alternatives.

You work at a university and you are sorting out the email system? Well, wave bye bye to your job soon, because one day the suits will say "Hey, let's move to Microsoft's Live.EDU" and then the problem is somebody else's. [Or Google mail for organisations, of course]. Either way, the suits will wonder why university IT are doing mundane things like setting up email addresses when that can be outsourced. Cheaper.

Keep in mind that as a university you are going to have a much larger turnover than a standard organisation, so their strategies may not be suitable for you.
I would suggest that using any combination of First Name and Last Name will give you a pretty large amount of collisions, either with current users, or with past users. Collisions with past users may not seem like a huge problem until you get a ton of new users asking you why their accounts filled up with donkey porn spam on the first day.
Of cours

This is the first question you should ask. Once upon a time I worked for a department that managed its own email, and hence had its own domain. Someone had the bright idea of consolidating to just use the central email solution in the interest of saving time/money, in spite of the fact that managing mail took very little time and very little money. Transitioning everyone took a lot more time than managing the original process, shoehorned people into arbitrarily small mail quotas (hint: do not tell people who cost $100+/hour that they need to manage their email to fit in an amount of disk that you can buy for a dollar), made them less efficient and less happy as they had to switch from mail clients they knew well and were happy with to unfamiliar ones they didn't like.

In the end, we spent more time and money making everyone less happy and less efficient than if we'd just left it alone.

As far as simply avoiding clashes, consider that this is one of the benefits of there being a hierarchy in DNS. You can have bob.smith@finance.domain.com, bob.smith@engineering.domain.com, bob.smith@sales.domain.com, etc. Is there an actual requirement for everyone to be @domain.com, or is someone just empire building?

... and don't even use names. Issue them a number or nonsense sequence of characters like most big companies do. Your collision % is probably based on current students, right? Remember the current student body changes by 25% every year. Name collision will grow over time until common names ten years from now need to have a nonsense sequence anyway.

I work in schools. I often have to generate the systems to make usernames, passwords or email addresses and the like. Sometimes several dozens of times over in a variety of formats and allowable restraints (I do HATE software / services that can't just let me enter whatever the hell I like, how long I like, and with spaces if I like, and handle it like any other string - passwords, I accept, but anywhere else is just another way to waste my time going back and forth).

Eliminating an identifiable first name prevents random creeps stalking the female employees. (Yes, it can be a problem, both internally and externally.)

Our company eliminated first names and went with first initial, middle initial, last name with no separator: John C. Doe becomes jcdoe@domain.com. For duplicates, the longest-term employee is assigned jcdoe, the next is jcdoe1... etc. Over 10,000 employees and only 7 conflicts that I know of and 3 of them are rcsmith. One is R.C. senior, one is R.C. junior a

You're giving away one-half of the user's login credentials. Second problem is first.lastname@blah.edu could still be subject to collision and eventually is giving away information about the user making phishing campaigns much more effective.

The best solution I've ever seen is a place I worked for which had around 1500 employees. They used the first name of the person and first letter of last name "Jims" or "bobb" and suffixed with a 3 digit number "jims112" "bobb113" (they never used the 0 as it would get

Use firstname.lastname999@domain.tld, where 999 is a random 3-digit number (retry in case of collision; inappropriate "funny" numbers are also left out).
This applies even to non-colliding names: the first one to be registered gets the digits too.
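A minimal Python sketch of the random-suffix rule, under stated assumptions: the `BLOCKED` set is a placeholder list of numbers someone might want to leave out, and `random_suffix_address`, the `taken` set and the retry limit are all invented for this example.

```python
import random

# Hypothetical examples of "funny" numbers to leave out; adjust to taste.
BLOCKED = {"666", "420"}


def random_suffix_address(first, last, taken, rng=None, max_tries=1000):
    """Return first.lastNNN with a random 3-digit suffix not yet in taken.

    Every address gets digits, including the first holder of a name.
    """
    rng = rng or random.Random()
    base = f"{first}.{last}".lower()
    for _ in range(max_tries):
        suffix = f"{rng.randrange(1000):03d}"   # 000 through 999
        if suffix in BLOCKED:
            continue
        local = base + suffix
        if local not in taken:                  # retry on collision
            taken.add(local)
            return local
    raise RuntimeError(f"no free suffix found for {base}")
```

Because even the first John Doe gets digits, nobody owns the bare `john.doe`, which avoids the "first one wins" misdelivery problem mentioned elsewhere in the thread.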

At the university I previously studied at, they went through pretty much the same process when they decided that individual departments would no longer be permitted to have their own email domains. They set up a system to allow people transferring to the University-wide domain to specify their own name, with the limitation that it had to include at least one character from your first and last names (along with various other requirements). So if your name was John Smith, you could choose whether you wanted J

Oops, didn't realize university.edu was actually in use - trust a business school to buy up such a generic domain name. My earlier comment doesn't (as far as I know) actually apply to the actual university.edu.

Still the same? Increment the middle initial. The first person with the same name as someone else got an "x", the second person got a "y", the third got a "z", and I don't think we ever needed to exceed that. If necessary, we would have just continued through the alphabet, starting back at "a".

The biggest single problem we had with names and email addresses was employees who were legally empowered to use a different identity when dealing with the public. Anything that the public might see (their name or signature on a document, their email address, etc.) was a pseudonym, yet we had to use their legal names for internal purposes. Undercovers are a pain but I assume the OP won't be dealing with that. :-)

There is a problem with the middle initial, if you have a branch in a country where middle initials are not very common, or in this case, if the university has many students from countries where most people don't have a middle initial. For instance, in my family, most people don't have a second given name and thus no middle initial at all, and my father's name has two front initials before his given name.

But I've seen a kind of "artificial" middle initial, where the first John Smith gets the email address john.smith@organisation.tld, the second becomes john.a.smith@organisation.tld, the third one john.b.smith@organisation.tld, etc.
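The artificial initial is easy to automate. A small Python sketch, assuming a hypothetical `taken` set of local parts already in use (the function name is made up too):

```python
import string


def artificial_initial_address(first, last, taken):
    """first.last for the first holder, then first.a.last, first.b.last, ..."""
    first, last = first.lower(), last.lower()
    candidates = [f"{first}.{last}"]
    candidates += [f"{first}.{letter}.{last}" for letter in string.ascii_lowercase]
    for local in candidates:
        if local not in taken:
            taken.add(local)
            return local
    raise RuntimeError("more than 27 people share this name")
```

The letter carries no meaning, so it works for people without a middle name, but it can also clash with someone whose real middle initial happens to be "a".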

My early big-systems computing life was with the e-mail system at Dartmouth that went to real names in the 80's. There were twenty thousand-ish users and there definitely were a few name collisions with the First.M.Last standard.

There were two solutions. First was a user-editable nickname field. Just a space separated list that could be used to add to matching rules.

So, I had a proper e-mail left part of 'William.P.McGonigle' but my nickname field consisted of 'bill wpm skynet photographer sigep' to help other people find me. Only the real address was guaranteed unique but for phone conversations I could tell people wpm@ (it was unique at the time). People could get me at my machine name that way, look me up in the directory, address me as bill.mcgonigle, etc. (it would combine all dot separated parts with nicknames and department names to find matches).
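To illustrate the kind of matching described above, here is a guess at the rule in Python. This is not the actual Dartmouth implementation; the dictionary keys and the "every dot-separated part must match a name part, nickname or department word" rule are my assumptions.

```python
def tokens_for(entry):
    """All lowercase words that may be used to address this directory entry."""
    tokens = set(entry["address"].lower().split("."))
    tokens.update(entry.get("nicknames", "").lower().split())
    tokens.update(entry.get("department", "").lower().split())
    return tokens


def lookup(directory, local_part):
    """Return the canonical addresses whose tokens cover every part of local_part."""
    wanted = local_part.lower().split(".")
    return [
        entry["address"]
        for entry in directory
        if all(part in tokens_for(entry) for part in wanted)
    ]
```

Mail would only be delivered when `lookup` returns exactly one address; zero or several matches would bounce with the candidate list, which is why only the full canonical address is guaranteed to work.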

So, if there were 20,000 people happily using this system, there were four people who it didn't work for, and those were people with the exact same name as somebody who was already on campus. The usual choice was to adopt a different middle initial, use a full middle name, or to accept the nickname as the real first name.

Now, there was always a contingent of people (I won't say aspy nerds because that would be rude) who insisted that those were WRONG and that the addressing scheme had to work exactly the same way for everybody. They probably advocated bmcgo654@ for my e-mail address. But what they missed was that the utility of the system that was in use was so high that it greatly outvalued having a 'perfect' system that had very low utility.

If we lived in a world where every e-mail user could easily query the other institution's LDAP and not run the risk of spam, then that might be fine. But we don't, so easy-to-use addresses make the computers easier to use.

I don't want a middle initial. It would be completely useless if it weren't for filling out forms designed by stupid data collectors.
My father's three names are those of his grandfather, his father and his own given name. Reversing the order of the names to fit into a form is pointless, dropping one is pointless too, and accepting his grandfather's name as his own for the sake of some silly database is too.
It gets worse if you have people whose names don't follow the "a name consists of exactly one word" r

I'm sure there's a protocol for three people joining where the second and third share a middle initial, but I haven't seen it come up and we're not a small place.

I have and the conflict is solved by reverting back to first_name.last_name format with a number appended at the end for the second (and later) full name with same middle initial or for all names without a middle initial.

Yep. I know a good deal of larger public agencies that share their domain across tens of thousands of employees that are going this route (employee ID or badge number for public safety). Student IDs are no longer social security numbers, so there is very little from a security perspective to worry about.

Email addresses are intended to be public, and an organization handing them out to its users typically doesn't want them to be anonymous. And by its nature, as soon as an address is used to send mail it loses its anonymity.

"Hey there, I'm Gary Wilson. I'd like to get more information about this petition you're circulating, but I'm running late to class... can you email me more info?"
"Sure, Gary. Thanks for your interest. What's your email address? Gary.Wilson@myuniversity.edu?"
"No, it's generated using a salted hashing algorithm, it's actually 8msMWlk09$1)_23@myuniversity.edu"
"uh...... yeah, why don't I just give you my card, you can contact me later."

I actually think this is a pretty good solution. If you used a unique incremented number each time someone with the same initials joined then something like that would work fine. 3 digits would allow for 999 employees with the exact same initials and gives everyone a name of 5-6 chars (assuming you limit to three initials).
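Something like the following Python sketch would do it. The class name, the zero-padding and the choice to start counting at 001 are my assumptions; the comment only fixes the three-digit counter and the limit of three initials.

```python
from collections import defaultdict


class InitialsAllocator:
    """Hands out names like jcd001, jcd002, ... per set of initials."""

    def __init__(self):
        self._last_used = defaultdict(int)   # initials -> highest number issued

    def allocate(self, *names):
        # Up to three initials, taken from the non-empty name parts in order.
        initials = "".join(n[0] for n in names if n)[:3].lower()
        number = self._last_used[initials] + 1
        if number > 999:
            raise RuntimeError(f"ran out of numbers for {initials}")
        self._last_used[initials] = number
        return f"{initials}{number:03d}"
```

With two or three initials and three digits, every name comes out at 5 or 6 characters, as the comment says. Because the counter only goes up, a number is never reissued after someone leaves, which sidesteps the problem of new users inheriting an old user's mail.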