I set up mitmproxy to re-route all of the app's traffic for analysis. The video shows how device information, usage time and the list of watched videos are sent to Appsflyer and Facebook.
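A setup like this boils down to intercepting flows and filtering them by hostname. As a rough sketch of that filtering step (the domain list is my own illustrative assumption, not an inventory of TikTok's endpoints):

```python
# Sketch: classify intercepted hostnames as tracker endpoints.
# The domain list below is an assumption for illustration only.
TRACKER_DOMAINS = {
    "appsflyer.com",
    "graph.facebook.com",
    "analytics.google.com",
}

def is_tracker(host: str) -> bool:
    """Return True if host is a known tracker domain or one of its subdomains."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in TRACKER_DOMAINS)

# In a real mitmproxy addon (started with: mitmproxy -s addon.py), a check
# like this would run inside the request hook to log the matching flows.
```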

It is hard to believe that this is covered by „legitimate interest“ and transparency: even the search terms I entered are forwarded to Facebook:

The transfers to the two companies clearly conflict with the GDPR:

Facebook cannot comply with article 14 for this data, e.g. informing users about their right to have the information deleted.

The data transfer to Appsflyer also lacks transparency, as it is unknown to which of its more than 4,500 partners the data might be passed on down the line. Bytedance’s answer to this: „We won’t show you the contracts.“ Did they even read article 26 of the GDPR?

Most importantly, fundamental rights are being violated, since Personally Identifiable Information (PII) is transferred to a server under the control of a company residing in an insecure, non-European country. The location of the server is irrelevant; what matters is the location of the company deciding about the data, according to Malte Engeler. Bytedance is headquartered in Beijing, China.

I also checked the website itself, which is important since all videos shared via messengers or social media are viewed there. Any shortened URL for a video (like vm.tiktok.com/9uTpDV) resolves to a URL containing the installation ID. This lets TikTok see who shared which video.
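The mechanism can be sketched as a lookup table plus a query parameter. The table contents, the parameter name and the long URL below are hypothetical stand-ins for illustration; only the short code comes from the example above:

```python
# Hypothetical sketch: each share creates a fresh short code that maps to a
# long URL carrying the sharer's installation ID. All values are illustrative.
from urllib.parse import urlparse, parse_qs

SHORTLINK_TABLE = {
    # short code -> long URL (one entry per share activity, not per video)
    "9uTpDV": "https://m.tiktok.com/v/6789.html?u_code=abc123&lang=en",
}

def resolve(code: str) -> str:
    """Server-side lookup, answered as an HTTP redirect in practice."""
    return SHORTLINK_TABLE[code]

def sharer_id(code: str) -> str:
    """Extract the installation ID embedded in the resolved long URL."""
    query = parse_qs(urlparse(resolve(code)).query)
    return query["u_code"][0]
```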

They also track who is watching the video. Besides conventional trackers (Google Analytics), the highly controversial method of device fingerprinting is used to assign a unique hash value to a cookie variable named s_v_webid. This is achieved by combining hardware and browser characteristics that are unique in combination.
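The general idea behind such a fingerprint hash can be sketched as follows; the attribute set and hash function here are my assumptions, since the real implementation sits inside TikTok's obfuscated script:

```python
# Sketch of a fingerprint hash: stable hardware/browser characteristics
# are concatenated deterministically and hashed. Attribute names and the
# hash function are illustrative assumptions.
import hashlib

def fingerprint(attrs: dict) -> str:
    # Sort keys so the same device always yields the same hash.
    material = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(material.encode()).hexdigest()[:16]

device = {
    "user_agent": "Mozilla/5.0 ...",
    "screen": "1170x2532x3",
    "timezone": "Europe/Berlin",
    "languages": "de-DE,en",
    "canvas_hash": "c0ffee42",  # placeholder for a canvas fingerprint result
}
```

A single differing attribute changes the whole hash, which is what makes the combined value useful as a per-device identifier.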

One of them: canvas fingerprinting. An image is drawn in the background using vector graphics commands, then rasterized to a PNG image and read back. The resulting data is fairly unique across devices, since it depends on various settings and features of the hardware used.
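In the browser this uses an HTML canvas and `toDataURL()`, which only exist there; the hashing step can be sketched in Python with placeholder bytes standing in for real rasterizer output:

```python
# The rasterized PNG bytes differ subtly between devices (fonts,
# antialiasing, graphics stack). Hashing them yields a stable per-device
# value. The byte strings below are placeholders for real canvas output.
import hashlib

def canvas_fingerprint(png_bytes: bytes) -> str:
    return hashlib.md5(png_bytes).hexdigest()

device_a = b"\x89PNG...rendered-on-device-a"  # placeholder rasterizer output
device_b = b"\x89PNG...rendered-on-device-b"  # slightly different pixels

# Tiny pixel differences produce completely different hashes.
```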

Audio fingerprinting is also used to identify visitors. This doesn’t mean that the microphone or speakers of the device are used. Instead, a sound is generated internally and its bitstream is recorded. This, too, produces different results depending on the device. This is what it sounds like:
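In the browser this is typically done with an `OfflineAudioContext`, which renders audio without any speaker involved. Below is a Python stand-in for the render-and-hash principle (deterministic here; on real devices, differences in the audio stack make the rendered bitstream, and hence the hash, differ):

```python
# Audio fingerprinting without microphone or speakers: a tone is rendered
# offline and the resulting sample bitstream is hashed. This Python version
# is a deterministic stand-in for a browser's OfflineAudioContext pipeline.
import hashlib
import math
import struct

def render_tone(sample_rate: int = 44100, freq: float = 10000.0, n: int = 5000) -> bytes:
    """Render n samples of a sine tone as a little-endian float32 bitstream."""
    samples = (math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n))
    return b"".join(struct.pack("<f", s) for s in samples)

def audio_fingerprint(bitstream: bytes) -> str:
    return hashlib.sha1(bitstream).hexdigest()
```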

Bytedance states that these fingerprinting techniques are used to identify malicious browser behaviour. That is hard to believe, as the website still works as expected even when the corresponding script is blocked. Furthermore, Akamai’s own server-side fingerprinting technology is in use as well (a completely different story waiting to be investigated).

There are several other issues, like Google Analytics being used without anonymizing the IP address. And to top it off, free software is used without proper license attribution: Zepto.js by Thomas Fuchs, MurmurHash by Austin Appleby and FingerprintJS by Valentin Vasilyev, just to name a few. How low can you go?

So is it a good idea for the German news programme Tagesschau to foster TikTok’s ecosystem by publishing its news clips there, paid for by Germany’s citizens through an obligatory nationwide broadcasting fee?

TikTok channel operators may also fall under joint controllership with TikTok, as the ECJ has ruled for Facebook fan pages. As a consequence, a TikTok channel could be locked down if privacy rights are infringed. Heiko Neuhoff, the DPO of the public broadcaster NDR, told me he is about to decide whether this applies to the Tagesschau channel.

My comment

TikTok is breaching the law in several ways while exploiting the data of its mainly teenage users. This should be addressed immediately, in a swift and rigorous manner; the required legislation is in place. Don’t let them get away with breaking society, just as ten years of Facebook did. Journalists should find a better place to publish their vertical video clips.

(I first published this on Twitter/Mastodon and later transferred it, with minor corrections, to this blog post. Special thanks to multimedial.de for corrections regarding my English.)

35 Comments

I have another experience with TikTok. My son (ten years old) has a phone. Not a smartphone, a phone like a Nokia 3310. He has access to a computer with the Windows Family program, and I get a weekly report with all websites visited, etc.

One day he called me: Dad, I got a text message, I don’t understand it.
It was from TikTok, asking him to create an account. I told him to delete the message.
But how did they get his phone number?
I made the mistake of asking TikTok to delete his personal data. I should have first asked them to provide the data and consents, to see how they collected the phone number.
The DPO answered that it was my son who made a request to connect. That’s impossible.
I think it was probably collected from a friend’s contact list…

I agree. Apart from a couple of adjectives used instead of adverbs (something native English speakers do as well), the post is very well written. I couldn’t do that well in German (which is why I’m commenting in English, even though I do speak German) 😉
And kudos for graciously accepting help, Mathias!

Bro,
Did you report this to the proper EU authority? If not, you should do so immediately. Send it to, or tag on Twitter, anyone connected to privacy bodies such as the GDPR enforcement authorities. Action needs to be taken.

As a journalist, I follow the journalism ethics rule of not handing any investigated material over to an authority. That way, the people I speak to don’t have to worry that material will be passed on, and I stay independent. The only exception is if I am affected as a private person. But of course I encourage every reader to do so. You can contact your local data protection authority (they pass it on) or the French one, which is directly responsible.

If you open the TikTok app in Japan, the content is all about high school or middle school girls dancing in skimpy dresses or school uniforms. They obviously encourage them to do so. All the practices they get wrong on privacy are one thing, but how they monetize the content in the first place is entirely gross as well.

Hello,
Great article! You mentioned that „Any shortened URL for a video (like vm.tiktok.com/9uTpDV) gets resolved to an URL containing the installation ID. Thereby, TikTok is able to check who shared which video.“ However, hashing is a one-way process, i.e. from the shortened URL one cannot recover the input URL and installation ID, can one? If so, how would TikTok get the installation ID from the shortened URL? One way to do this would be to maintain a hash of every device ID with every video on the planet. The memory requirement would be so large that their servers would crash. So I’m not sure how you can claim that „TikTok is able to check who shared which video.“ Can you please share your thoughts?

Hi, thanks for the praise. The shortened URL has to be resolved, otherwise TikTok couldn’t deliver the correct video belonging to it. As you correctly observed, this is done server side via a database table, not by decrypting in the browser. So there definitely exists a database on the server that can map every shortened URL (which is built exclusively for each share activity by a user, not for each video!) to its long form in a split second. No magic here; YouTube also has to resolve the video ID strings in its URLs server side to every video file ever uploaded. After that, normal web server log technology can be used to track the ID in the long-form URL, which now appears as a second request in the log files (after the redirect). As we know from Facebook and its Hadoop log processing, the industry doesn’t always run tracking analysis over all views ever recorded, but rather over a relevant time span. They probably don’t process it at all until an advertising interest (or another interest?) arises, and then they run a confined log query. But these details are just my guess; I have insight into neither TikTok’s database nor today’s server-side database technology.


