Robocallers stand out in a troll through Chinese cell phone records

How to tell the robots from the humans even if you can't hear the conversation?

The availability of electronic records of communications, from the use of cellphones to chats in online games, has given social scientists new options for studying how humans interact. Communication patterns, friendship networks, and the spread of ideas have all become accessible to large-scale analysis. Now, researchers have combed the records of 5.9 million Chinese cellphone users, trying to figure out the normal pattern of calls they make. And in the process, they've identified a few abnormal patterns, ones that probably aren't made by humans at all.

The researchers involved in the work, four of whom hail from Shanghai's East China University of Science and Technology, obtained 108 days' worth of call data from an unspecified Chinese carrier. They used it to identify the 100,000 most active callers, since these should call often enough to provide a decent picture of the statistics. Although the records could be analyzed in a number of different ways, the researchers chose to focus on the interval between calls: how long, in general, does one wait before making a second phone call?
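To make the "interval between calls" concrete, here is a minimal sketch of how it could be computed from raw records. The record layout (a caller ID plus a timestamp in seconds) and the function name `inter_call_intervals` are assumptions for illustration; the paper's actual data format isn't described here.

```python
from collections import defaultdict


def inter_call_intervals(records):
    """Return {caller_id: [gap_in_seconds, ...]} for consecutive outgoing calls.

    `records` is an iterable of (caller_id, timestamp_seconds) pairs.
    This layout is hypothetical, not the carrier's real schema.
    """
    times_by_caller = defaultdict(list)
    for caller_id, timestamp in records:
        times_by_caller[caller_id].append(timestamp)

    intervals = {}
    for caller_id, times in times_by_caller.items():
        times.sort()  # records may arrive out of order
        # Gap between each call and the next one by the same caller.
        intervals[caller_id] = [later - earlier
                                for earlier, later in zip(times, times[1:])]
    return intervals


if __name__ == "__main__":
    sample = [("a", 100), ("a", 160), ("b", 5), ("a", 130)]
    print(inter_call_intervals(sample))
```

A caller with only one call ends up with an empty list, which is one reason to restrict the analysis to the most active accounts.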

You might expect that this value would show a classic Poisson distribution, with a bell-shaped curve centered on some reasonable value. But, in fact, the typical time between calls overall showed a power law distribution, the classic spread that shows a peak towards one end of the graph, followed by a "long tail" of gradual decrease.

However, when the researchers started diving into the data, something strange became apparent: the power law distribution was dominated by the accounts that made the most phone calls (and this is already the most active subset of users in the records). So, the researchers went through and categorized every single individual account. Some of them displayed power law distributions, but the majority (over 73,000) were a better fit to something called a Weibull distribution—think of a bell curve with a long tail on one end. Only about 3,500 showed a power law pattern of spacing in between calls.
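One plausible way to decide which curve is "a better fit" for a single account is to fit both by maximum likelihood and compare log-likelihoods. The sketch below does that under stated assumptions: all intervals are strictly positive, the power law starts at the smallest observed interval, and the Weibull shape is found by simple bisection. The function names are hypothetical, and this is not necessarily the procedure the authors used.

```python
import math


def fit_power_law(xs):
    """MLE for a continuous power law p(x) ~ x**(-alpha) on x >= min(xs).

    Returns (alpha, log_likelihood). Assumes all xs > 0 and not all equal.
    """
    xmin = min(xs)
    n = len(xs)
    log_ratio_sum = sum(math.log(x / xmin) for x in xs)
    alpha = 1.0 + n / log_ratio_sum
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * log_ratio_sum
    return alpha, loglik


def fit_weibull(xs, k_lo=0.01, k_hi=50.0, steps=100):
    """MLE for a Weibull distribution; returns (shape, scale, log_likelihood).

    The shape k solves a one-dimensional score equation, found by bisection
    on the assumed bracket [k_lo, k_hi]. Assumes all xs > 0.
    """
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    max_log = max(logs)

    def score(k):
        # Weights x**k, rescaled by max(x)**k to avoid overflow.
        weights = [math.exp(k * (l - max_log)) for l in logs]
        weighted = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / k - mean_log

    for _ in range(steps):
        mid = 0.5 * (k_lo + k_hi)
        if score(mid) > 0:
            k_hi = mid
        else:
            k_lo = mid
    k = 0.5 * (k_lo + k_hi)

    # Scale satisfies scale**k = mean(x**k); computed in log space.
    mean_pow = sum(math.exp(k * (l - max_log)) for l in logs) / n
    log_scale = max_log + math.log(mean_pow) / k
    loglik = (n * math.log(k) - n * k * log_scale + (k - 1.0) * sum(logs)
              - sum(math.exp(k * (l - log_scale)) for l in logs))
    return k, math.exp(log_scale), loglik


def better_fit(xs):
    """Label an account by whichever model has the higher log-likelihood."""
    _, ll_power = fit_power_law(xs)
    _, _, ll_weibull = fit_weibull(xs)
    return "power law" if ll_power > ll_weibull else "Weibull"
```

Comparing raw log-likelihoods is a rough criterion; a more careful analysis would also test whether either model is an acceptable fit at all.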

The authors then compared these two groups based on a variety of statistics: the percentage of the total calls that were outgoing, the count of different phone numbers they dialed, and how diverse the recipients of their calls were. The result of this analysis is that the power law group had the most "anomalous and extreme calling patterns," according to the authors. And, in most cases, these are potential signs of trouble.

Some accounts showed a high frequency of outgoing calls, but only to a limited number of (or only one) target phone numbers. The authors inferred that these are "robot-based users." Another cluster of accounts had a high frequency of outbound calls, but had an inordinate number of targets, and called them all with equal frequency. The authors suspect that these are sales accounts, or represent instances of phone-based fraud.
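The kinds of per-account statistics described above can be sketched in a few lines. Everything specific here is an assumption for illustration: the (caller, callee) record layout, the use of Shannon entropy as the measure of recipient diversity, the function names, and all thresholds in `rough_label`. The paper's actual definitions and cutoffs are not given in this article.

```python
import math
from collections import Counter


def account_features(calls, account):
    """Summarize one account from a list of (caller, callee) pairs."""
    outgoing = [callee for caller, callee in calls if caller == account]
    incoming = sum(1 for caller, callee in calls if callee == account)
    total = len(outgoing) + incoming
    counts = Counter(outgoing)
    n_out = len(outgoing)
    # Shannon entropy (in bits) of the distribution over called numbers:
    # 0 for a single target, log2(m) for m targets called equally often.
    entropy = sum((c / n_out) * math.log2(n_out / c) for c in counts.values())
    return {
        "outgoing_fraction": n_out / total if total else 0.0,
        "distinct_targets": len(counts),
        "target_entropy_bits": entropy,
    }


def rough_label(features, min_outgoing=0.9, few_targets=3, many_targets=100):
    """Very rough heuristic labels; all thresholds are hypothetical."""
    if features["outgoing_fraction"] < min_outgoing:
        return "ordinary"
    targets = features["distinct_targets"]
    if targets <= few_targets:
        return "robot-like"
    # "Called them all with equal frequency" means near-maximal entropy.
    if (targets >= many_targets
            and features["target_entropy_bits"] >= 0.95 * math.log2(targets)):
        return "sales-or-fraud-like"
    return "unclear"
```

As the article goes on to note, a label like this can only be a starting point for investigation, since legitimate services such as appointment reminders can produce the same pattern.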

The authors suggest that their work provides "information valuable to both academics and practitioners, especially mobile telecom providers." But they then go on to ignore the network providers and focus on academics. For them, the key message is that cell phone users are a diverse population, and shouldn't be modeled as if they all follow patterns that fall on a simple power law curve. In fact, even among the users that showed a power law distribution, the value of the exponent that described the curve varied a great deal.

Could an actual cell provider use this information? Clearly, a scammer like the guy who tried to gain access to one of our writers' computers will show one of the patterns seen here: lots of calls, almost all outgoing, and spread among a wide variety of contact numbers. Even if the phone company felt no ethical obligation to block the practice, it might still see it as a drain on its resources (provided the scammers have an unlimited calling plan). At the same time, there will be numbers that show the same pattern, but for legitimate reasons—automated appointment reminders from medical practices spring to mind.

So, sadly, although patterns like this could be a useful starting point for investigations, and could definitely serve as evidence if a scammer gets caught, they're not going to be especially useful in creating an automated system that could shut down scammers and spammers.

For cases like appointment reminders, it would be fairly easy for the providers to use this information to check on the numbers that are calling in a spam-like manner and determine whether they are legitimate. It's not as if they don't have all of the details to hand (so to speak).

I suspect that, following this research, lots of companies are going to follow up with research of their own.

Hopefully this will lead to telemarketer-spam-filters being available to phone users fairly soon. Dare I hope?

Being available from who? Not the carriers - they see telemarketers as a source of revenue if properly managed, not as a nuisance to their customers. They don't really give a shit about their customers at all, other than if they don't pay their bill on time.

My phone/internet provider already provides such a thing - turned on by default.

I do wonder why fairly tech-savvy readers and authors of a tech site don't invest in a call blocker for their landline phone. (There are also apps that do much the same thing to protect your cell phone.)

I use such a device and robocallers never get through. All withheld numbers are bounced. The phone never rings unless it's a person who's on my list or a real human who pressed the correct key to get through and hasn't withheld their number. I can prescreen unknown callers without ever having to speak to them, and if they aren't legit, I dump them to a terse recorded message that warns them not to call again, additionally blocking their number from ever being able to get through again.

I appreciate that such a device shouldn't be necessary, but we don't live in utopia. Computers require antivirus software; the doors of homes and businesses require locks. No one deserves to be harassed by robocallers, viruses or thieves. Unfortunately, not protecting yourself invites it.

It could be that the callers who follow a power law curve are those who call mostly from a certain time onwards. Here are three examples. First, say someone has a work-only phone that they leave at work. When they arrive at work, after checking emails, their calendar, and missed calls, the first thing they would want to do is get all the resulting calls out of the way. This would cause a lot of calls that are close together, which produces the high peak near the origin; throughout the rest of the day they make calls that are more spread out, which produces the long tail. Another group like this is people with a non-work phone who wait until call rates are low in the evening. A third is someone who has a lot of friends or family overseas in a totally different time zone and waits until they are up before calling.

Another, completely different explanation is that this pattern might be a human adaptation: once you start making a huge number of calls, maybe you start doing them in batches.

Finally, for it to be robocalls sounds very odd, as they said the group is very diverse. Like all science, their explanation should be verified by independent means to eliminate other explanations that fit the data.

@sonolumi - Works great on landlines - a little more difficult with cell phones tho. BUT in either case - Caller ID is your friend.

Quote:

At the same time, there will be numbers that show the same pattern, but for legitimate reasons—automated appointment reminders from medical practices spring to mind.

What's funny here is that somehow the service providers seem to be allowed to avoid being "bad guys" when they do it. I get crap from ATT all the time - even tho I've opted out of their marketing. I get automated voice calls, I get text messages, and I even have an "inbox" inside my online account that fills up with crap.

I find something interesting tho - that this is coming from China, at a time when they are cracking down on and restricting more and more the free flow of information. What's the paradigm here?

I'm not sure how we can read into this without - reading into it.

Just seems odd.

The same study coming from almost anywhere else on the planet might be better received.

I get incessant calls from "card services"; they use rotating numbers. I have done searches and discovered they are an illegal operation and use internet-based telephony. They hang up when you ask to be put on the do-not-call list. Any help here would be greatly appreciated. Sonolumi mentioned call-blocking apps; would that work in this case? Any names of apps?

Interesting stuff. My (USA) experience has been that robodialers call me in waves. I'll get no calls for a month or so, and then get them every few hours for a few days. My guess is they call all phone numbers sequentially in an area, or something similar, and it takes them a while to get back to any individual area. When I get hit with robocalls, they tend to be in bunches even if the subject matter (credit, fundraising, etc.) is different, which seems to suggest the same firms run a bunch of scams for different clients, and efficiently blanket a certain area before moving on. The gap between blitzes may also be explained by the robocallers having to abandon one outbound dialing service and find another one.

The way to stop robocalls is to hold the outbound dialers responsible by law as the "exit point" where robocalls get into the phone system. That would stop robocalls instantly. But it would also destroy the profits of the entire chain, so there's about as much chance of this happening as there is of stopping junk mail.

The researchers, four of whom hailed from Shanghai's East China University of Science and Technology, involved in the work obtained 108 days worth of call data from an unspecified Chinese carrier.

Bit of a tangent, but 108 is a rather significant number in eastern religions, particularly (but not limited to) Buddhism. I found it interesting that a Chinese study used 108 days as a sample size, though it's likely mostly whimsy (or entirely random, though I sorta doubt it).

Yeah, that's who we have too (DSL only, not fiber, though). But I maintain that the nationwide carriers still don't give a shit about their customers. They're a necessary evil that stands between the carriers and the customers' bank accounts.

Reading the paper I see that the to and from phone numbers have been hidden from the researchers (by "encrypting" them). (That might be worth mentioning in the article!)

These researchers appear to have done something that I don't see often enough. They have broken the results down into different populations which has given them better insight. It seems that, had they not done that, an undifferentiated analysis of the whole data set would have been misleading. (Could there be further ways to break the data into different populations?)

Much marketing research appears to fall into the trap that these guys avoided, which explains why some conclusions of such research are simply wrong.

(I'm concerned at what the state might be doing with research like this. Not just China but maybe all countries have tax funded watchers!)