Hugh Pickens writes "In a move reminiscent of George Carlin's Seven Words You Can Never Say on TV, the Pakistan Telecommunication Authority has handed down a ban on about 1,600 terms and phrases it has deemed obscene and told carriers they have seven days to block the words on their networks, or face legal action. 'The filtering is not good for the system and may degrade the quality of network services — plus it would be a great inconvenience to our subscribers if their SMS was not delivered due to the wrong choice of words,' says an official at one of the telecoms. The list includes such words and phrases as 'idiot,' 'monkey crotch,' 'athlete's foot,' 'damn,' 'deeper,' 'four twenty,' 'fornicate,' 'looser,' and 'go to hell,' among others. There are also various double entendres included in the ban such as 'beat your meat' or 'flogging the dolphin.' Mohammad Younis, a spokesman for the PTA, says the ban is 'the result of numerous meetings and consultations with stakeholders' after consumers complained of receiving offensive text messages. 'Nobody would like this happening to their young boy or girl.'"

Animal lovers will appreciate the banning of the words "Cockfight," and "Pussy Cat." Rich people will get behind "Deposit," "Penthouse," and "Showtime." Reporters will love "Hostage," "Kill," "Murder," "Suicide," "Sniper," and presumably "Stupid." Construction workers seem to get the best with the banning of "Deeper," "Back Door," "Laid," "Banging," "Dome," "Harder," "Hole," "Joint," "Period," "Slant," "Screw," and "Budweiser." Everyone else will get behind the banning of such horrible words as "Creamy," "Jugs," and "K Mart." And pretty much all feminine hygiene is, by definition, unhygienic.

Strangely, they banned both root words and modifiers of root words... like calling out ass AND ass clown, ass banger, etc. It's like they don't know how filtering, or words, work. Also, they banned the phrase "XXX" which is, itself, a censor word to represent something else.

Being a Pakistani who knows all the BS the current government has been doing (or not doing) for the past 4 years, this is insane. They failed at everything else: there's daily power loadshedding/blackouts, 2-3 days a week of CNG (gas) blackouts, loads of corruption. And then they come out with strange moves like this out of nowhere to divert people's attention.
This was really uncalled for. The only thing that every teenager and college student texts almost once a day is prank/hate messages about the current corrupt president Zardari, I wouldn't be surprised if there was 'Zardari' listed somewhere in those words.

Seemingly a random and disassociated list. Obvious purpose: to make it more palatable when more politically aligned words are added to the list, and to silence private communication between individuals associated with the political opposition. After texts comes voice.

Because it's Pakistan I can't tell if they meant loser or if they were serious...

As soon as I saw 'looser' on the list, I thought that they can't be all that evil. We do need a concerted effort to eliminate this example of stupidity. It seems rare these days to find anyone spelling "loser" correctly. Ignorant loosers!

But seriously, this list seems a bit dubious to me. Why would a country so paranoid about having bad things said about the Prophet Mohammed only include Jesus Christ on the list as a blasphemy? Why would a country that was once a part of the British Empire (and as such, still has English as one of its official languages) have one word on the list with the British spelling "arse", and 71 words with the American spelling "ass"? Why would there be no attempt to include the deliberate misspellings, abbreviations and contractions that are typical of texters?

It could be that they simply sourced a list of words from elsewhere, but it seems strange that they would not tailor it to their own country's requirements.

Why would a country so paranoid about having bad things said about the Prophet Mohammed only include Jesus Christ on the list as a blasphemy

Ummm...because if they censored the words "Prophet" and/or "Mohammed", that would be censoring a pillar of the Muslim faith? How would the righteous and moral doublegood citizens of Pakistan discuss the most important person in their lives? Or is this a test....?

Jesus is a prophet but not THE Prophet in Islam, so that's OK to make sure the infidels don't get to sell that silly concept outside of their strange cult.

More importantly though, this is actually a good thing. Why? Because we can look to Urdu - the national language of Pakistan - becoming the source of an entirely new and titillating orgy of euphemisms and slang that will defeat this list and that can never adapt effectively to counter it. The authorities have unwittingly introduced chaos and creativity into the very evolution of their national language. In less than a year, I make a gentlemen's bet that there will be their equivalent of the Number 1 Top 40 hit by their equivalent of Justin Bieber or Duffy belting out lyrics about "big tracts of land" and "brown roses with small petals" that will have the older generation pleased at the agricultural bent of the song.....and the young'uns practically creaming themselves in laughter.

And as we all know from the early attempts of the MAFIAA to curtail the sharing of music by disallowing the names of certain songs to be part of a file being shared, people will invent creative ways around it. 1600 or 16000 variants, people will find some that will slip through and the info will get shared quickly.

The weird thing is, we don't text in English! We "txt" a bizarre 1337 Roman Urdu, with lavish sprinkling of punjabi curse words.

Awesome language, that Punjabi, it has both, some of the very best poetry *and* curses.

So yeah, our dear president, we will still continue to send crude messages about you, good luck stopping us. (I could swear one in ten messages is something disparaging about the president, given my inbox)

This is, of course, if the list *is* real. First of all, it's unlikely PTA would have revealed it, and secondly, I don't think they would dare censor something like Jesus Christ. All the churches would be in uproar, and the Supreme court would rip them a new one.

(*BTW*, just to give an idea, Pakistan has one of the highest rates for text messaging in the world. We have six companies offering extremely competitive sms packages, and we don't have the incoming charges bullshit that you have there, so good luck filtering those tens of millions of messages sent every day.)

They are doing it the wrong way. For total security, you don't ban threats proactively, instead you whitelist only the safe stuff. E.g. woodworking terminology is safe, unless you need to polish some rods, and so is cooking, unless you need to choke a snake or two for that aphrodisiac soup.

Even the loosers win when they apply the tried and tested best security practices.

"We spent several weeks building a UI that used pop-downs to construct sentences, and only had completely harmless words – the standard parts of grammar and safe nouns like cars, animals, and objects in the world."

"We thought it was the perfect solution, until we set our first 14-year old boy down in front of it. Within minutes he’d created the following sentence: I want to stick my long-necked Giraffe up your fluffy white bunny."

In my original attempt to post I wrote "It's for the children!." in all caps in order to communicate the absurdity to those on Slashdot who don't always think things through and might actually take me seriously. I received the following error: Filter error: Don't use so many caps. It's like YELLING.

All of those things happen in Pakistan, but each one is illegal except for one*.

- Age of consent for marriage is 18 for males & 16 for females under Muslim Family Laws Ordinance 1961. Underage marriages are illegal.
- Throwing acid in a person's face was explicitly criminalized in May of this year and can get you life in prison. Under older Muslim law, the victim had the right to return the favor and have acid dribbled in the eyes of her attacker.
- Burqa wearing is optional and largely AFAIK common mostly in areas that border Afghanistan. Stoning a woman to death for not wearing a burqa is murder.

* The one legal bit you implied was forcing your wife to have sex. Pakistani law requires that the victim not be legally married to the perpetrator in its definition of rape, just like in many US states up until North Carolina was the last to close the loophole in 1993. Many states still don't protect a woman if she's incapacitated and unable to refuse her husband.

That's nice. But the AOC isn't enforced outside of major cities. Pakistan is quickly sliding into an imperialist islamist shithole. Meaning that sharia is the law of the day, and a wife or woman who isn't subservient is disrespectful of her man, and in turn god. And where the whole sharia law thing has kicked into full gear, there's no such thing as rape. Unless you can find 8 male witnesses. And of course you can't rape your wife, she has to submit.

The Burqa is also becoming a 'norm' throughout the country as the government tries to appease the hardliners. Keep up with the times AC.

After all, don't people realize the horrible things that can happen when someone gets offended?

I found this documentary [youtube.com] about the terrible consequences of being offended. It recounts the gruesome details of people who have been offended, went to sleep, and woke up the next morning with leprosy.

It's good that Pakistan is stopping these atrocities before they get out of hand.

You had to say it, didn't you? You had to say that one little four-letter word. You couldn't just say "call hell" or "eval hell" or "do hell while true" or even "gosub hell". No, you had to put yourself right there beyond the bounds of civilised discourse and say The Word.

It's actually a plot to boost telecom profits by selling more voice and data services. The new and increasingly creative (and disturbing) euphemisms will proliferate at such speed that it will soon be impossible to have even the remotest confidence that any given SMS message (even if checked for lewdness before sending) will not end up being blacklisted and dropped before delivery...

Some of the banned words are amusing for various reasons. Some have fairly obvious explicit meanings, others do not. Some examples of messages that will be banned after this goes into effect:

"I am putting a new roof on my house and the stringer length is 18 feet."
"Did you see the new wuutang clan movie on netflix?"
"When using distance measuring equipment in aircraft, it measures the slant length between the VOR and the aircraft."
"When approaching to land, you should retard the throttle abeam the intended landing point."
"I want to go land at Bremerton Airport, ICAO identifier PWT."
"When running long distances, you should be careful of joint pain in your knees."
"Calculus is often considered to be a harder class than algebra."
"Juggalo fatso got jesus" * (All words in this one are banned)

Wow. This is good stuff. I often wonder what is going on in these people's heads when they come up with lists like this. They are not sane as we know it.

Assuming their SMS system handles tens of thousands of texts per second, each of which needs to be tested against this user-definable dictionary of 1600 words, is it even possible for the platform to keep up? Are there sophisticated search / pattern matching algorithms for testing a message against 1600 substrings? I can think of a very naive way to do this, but I'm sure it would not scale.

I looked it up, and folks in the US send 80 billion SMSes per month. That works out to about 30k SMSes/sec on average across the entire United States. Now, I realize that certain times of day are more likely to have SMSes than others, so let's say, to a first order, the peak rate of SMSes is 100k/sec. Now divide that among all the cell towers, understanding that some will be busier than others.

Let's say a given cell tower has to process 100 SMSes a second, each at the full 160 character limit. That's 16kB/s. Let's say each word takes 1000 cycles per character to test for, which should be on the high side since it assumes you can't use, say, a trie to take advantage of common word roots, or use pattern matching accelerators (which are quite common in this space [google.com]). 16kB/s * 1000 * 1600 = 25.6Gcyc/sec. That sounds like a lot, but it isn't.
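For anyone who wants to poke at the numbers above, here is a minimal Python sketch of the same back-of-envelope estimate. The function names and the 30-day month are my own assumptions for illustration; the inputs (80 billion SMSes per month, 100 SMSes/sec per tower, 160 characters, 1000 cycles per character per word, 1600 words) are the ones quoted in the comment.

```python
# Back-of-envelope estimate only; all names here are hypothetical helpers.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def average_sms_rate(sms_per_month):
    """Average messages per second over a 30-day month."""
    return sms_per_month / SECONDS_PER_MONTH


def filter_cycles_per_second(sms_per_sec, chars_per_sms,
                             cycles_per_char_per_word, banned_words):
    """Cycles/sec if every character is tested against every banned word."""
    return sms_per_sec * chars_per_sms * cycles_per_char_per_word * banned_words


us_rate = average_sms_rate(80_000_000_000)
tower_load = filter_cycles_per_second(100, 160, 1000, 1600)

print(f"average US rate: {us_rate:,.0f} SMS/sec")
print(f"per-tower filtering load: {tower_load / 1e9:.1f} Gcyc/sec")
```

Changing any one input (say, a smarter matcher that costs far fewer cycles per character) shows how much headroom the estimate leaves.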

A single board in one of these cellular base stations has literally dozens of processor chips, most with multiple cores, running in the GHz range. And that's just one board. My employer sells a chip in this space which crunches away 10Gcyc/sec across all of its 8 processors, and our customers put dozens of these on each board.

On GSM networks, SMSes are control channel messages. They go via a low bandwidth side channel that is nowhere near as compute-intensive as the main voice channel. If you're provisioned to handle a certain number of phone calls, you're more than adequately provisioned to handle SMSes and the corresponding filtering, as long as you do the filtering at the base station.

BTW, I realize this is Pakistan that we're talking about, not the US. I just used the US numbers to get an initial order of magnitude to get in the ball park for the number of SMSes/sec a given cell tower might see, on the presumption that a cell tower in the US has a similar amount of work to do per subscriber as a cell tower in Pakistan.

It would be trivially solvable using a trie [wikipedia.org]. 10000 messages is only 1.6 million characters even if every message has the maximum length. Even your typical smart phone processor could probably manage that throughput.
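A rough sketch of what the trie idea could look like in Python, assuming plain case-insensitive substring matching (the real filter's matching rules are not known, and `build_trie` / `find_banned` are names made up for this example). The word list is walked once to build nested dicts, then each message position is followed down the trie for at most the length of the longest banned phrase.

```python
END = None  # marker key; never collides with a one-character string key


def build_trie(words):
    """Build a nested-dict trie from the banned phrases (lowercased)."""
    root = {}
    for word in words:
        node = root
        for ch in word.lower():
            node = node.setdefault(ch, {})
        node[END] = word.lower()  # remember which phrase ends here
    return root


def find_banned(message, trie):
    """Return (start_index, phrase) for every banned phrase in the message."""
    text = message.lower()
    hits = []
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            node = node.get(ch)
            if node is None:      # no banned phrase continues this way
                break
            if END in node:       # a complete banned phrase ends here
                hits.append((start, node[END]))
    return hits


trie = build_trie(["ass", "ass clown", "damn"])
print(find_banned("Damn, that ass clown", trie))
```

Note that shared prefixes such as "ass" and "ass clown" share one path, and that substring matching like this will also flag innocent words that merely contain a banned one.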

With a maximum character length of 140 characters, 1600 strings to match, and assuming 8 character long strings, it would take 140*8*1600=1,792,000 character matches per message if you do it naively. That is only a millisecond on modern GHz processors, but when processing large numbers of messages using embedded processors, that is probably a few more cycles than you want to spend on each message. You can do better by using Knuth-Morris-Pratt or Boyer-Moore. Since we can pre-process the strings to be matched, this means it takes only 140*1600*k=224,000*k (for some k determined by the algorithm). This is better, but not by much.
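To make the naive bound concrete, here is a small Python sketch of the brute-force scan that also counts character comparisons. The function name and the adversarial test message are assumptions for illustration; the point is only that the count stays under the message length times word length times word count figure given above.

```python
def naive_scan(message, banned):
    """Brute force: try every banned word at every offset.

    Returns the hits and the number of character comparisons performed.
    """
    comparisons = 0
    hits = []
    for word in banned:
        for start in range(len(message) - len(word) + 1):
            matched = True
            for offset, ch in enumerate(word):
                comparisons += 1
                if message[start + offset] != ch:
                    matched = False
                    break
            if matched:
                hits.append((start, word))
    return hits, comparisons


# Near-worst case: every attempt only fails on the last character.
message = "a" * 140
banned = ["a" * 7 + "b"] * 1600
hits, comparisons = naive_scan(message, banned)
print(hits, comparisons, "<=", 140 * 8 * 1600)
```

Ordinary text fails on the first character most of the time, so typical counts are far below the bound.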

Notice that the dominant factor is the 1600 strings to be matched. If you really care about performance, then you want to get rid of that factor. Simplest way is to build a finite-state automaton. If it is encoded as an NFA, the performance won't be much better than before, but if you encode it as a DFA, then each message can be processed in only 140 table lookups. The downside of this is the size of the lookup tables. In the worst case, expect them to take terabytes of space depending on the particular 1600 strings being matched.

There are algorithms like Rabin-Karp and Aho-Corasick that might take less space while still taking only ~140 character operations. The practical answer is to try DFA, RK, and AC to see which, if any, don't require too much preprocessing space, and then use one of those. The space requirements will depend on the particular text involved, but there are good odds that the tables for DFA will be small, and even better odds that the tables for RK and AC will be small.
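Of the options mentioned, Aho-Corasick is probably the easiest to sketch. Below is a minimal textbook-style version in Python, assuming exact, case-sensitive substring matching; `build_automaton` and `search` are names invented for this example, not anything a carrier is known to run. It builds a trie of the banned phrases, adds failure links breadth-first, and then scans each message in a single left-to-right pass.

```python
from collections import deque


def build_automaton(words):
    """Build goto/fail/output tables for a set of fixed strings."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # words recognised on reaching each state
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit the longest proper suffix that is also a trie prefix.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(message, automaton):
    """Return (start_index, word) for every occurrence of every word."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for index, ch in enumerate(message):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((index - len(word) + 1, word))
    return hits


automaton = build_automaton(["ass", "ass clown", "damn", "hell"])
print(search("damn ass clown go to hell", automaton))
```

For a list of plain fixed strings the automaton has at most one state per character in the list plus the root, so 1600 short phrases give a table of modest size. The blow-up people worry about applies to automata built from general patterns rather than literal strings.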

Searching and sorting are two of the most well studied algorithmic problems in computer science. If you ever find yourself wondering how to do them efficiently, there is a good chance that very smart people have already figured out how to do it.

Here in Iran messages are censored but nobody knows for which words. It's not even consistent: when there's going to be a protest event or news the filtering increases. Normally it filters fewer words. People guess these words.
The worst happens for advertisers and advertising companies that send bulk SMS and later find out that nothing has delivered!

...The worst happens for advertisers and advertising companies that send bulk SMS and later find out that nothing has delivered!

If that really is the worst that happens, I applaud the Iranian government for their discovery of successful anti-spam measures. (However, I suspect that their censorship policy has worse effects than blocking advertising campaigns)

A very entertaining and instructive list, I didn't realise my English vocabulary had so many holes:)

Weird that Jesus Christ is on the list; after Mohammed, he is Islam's second most important prophet.
Difficult when you are a plumber and need to order 1/2" NPT nipples...
A Pakistani Big Bitch must be huge, she's 2x on the list:)

Now Pakistan is a country of many languages so I wonder if there are, besides this English one, comparable lists for the others?

I thought that about Jesus, but then I remembered being taught that Christ is a title, not a name. Jesus is permitted in the Islamic context, but not in the Christian. So, is it weird that Jesus Christ is on the list? Meh, not really. Oppressive? Fuck, yeah!

Of course the most pointless thing with language bans and censorship of this kind is that it's exactly why we -have- so many double entendres and such. Every time a culture, religion, politician, parent, teacher, whomever, tells someone that saying something is offensive, the best they usually manage is the creation of some other way of stating the same thing. Even if that involves making up new words. Beyond that, the very children who everyone is usually trying to protect with language bans like this, are the absolute masters at creating new words to circumvent such things.

What you do is to give people a lot of freedom somewhere while you take it away in other places.

And when people aren't well-behaved citizens then you pepper spray them [youtube.com].

Freedom is what you think you have, reality is that you may be free to express what you think in the western world but as soon as you act upon your opinions to get others to listen then you are a danger to the order.