Clickstream tracking of users of the Tor browser – A research paper
The growing significance of web analytics that we have witnessed over the past few years has been accompanied by an enormous growth in the number of web users concerned about preserving their online anonymity. The Tor browser is often considered the best anonymous browsing tool available, as evidenced by the more than 2.5 million people who use it daily. Even though most of Tor’s terms and options are rather difficult to understand, the vast majority of Tor users believe that the Tor browser offers them more anonymity protection than it is actually capable of providing.
A recently published paper showed that the Tor browser can provide very little privacy protection when used with its default settings. To achieve near-total anonymity, users of the Tor browser must therefore exercise extra care. Let’s take a look at some of the ways presented in this paper that can be used to track the clickstream of Tor users.
Clickstream tracking via timing and traffic correlation:
Tor users can be vulnerable to deanonymization through end-to-end timing attacks. An adversary who monitors the network traffic sent to the initial relay node, as well as the traffic sent to the final relay node, can use statistical analysis to identify which streams belong to the same circuit. Consequently, Tor does not technically provide total anonymity for its users. The adversary can sniff the user’s IP address as well as the destination IP of the observed traffic, and can then track the user’s clickstream via correlation attacks. Interestingly, the adversary need not control the entry and exit nodes of a Tor circuit to correlate the traffic streams passing through them; being able to observe the traffic is enough.
Sometimes, tracking the clickstream of a user does not require any complex statistical analysis. For example, a student at Harvard University was caught sending fake bomb threats to get out of an exam! The student sent the emails via the Tor browser using Guerrilla Mail, an email address provider. Guerrilla Mail adds the IP address of the sender to every message sent, which helped identify the Tor exit node the user had used.
Clickstream tracking via traffic correlation attacks is fairly easy to conduct, especially when the anonymity set (the number of users running the Tor client) is small. In other words, when only a small number of clients within a given local network are using Tor, deanonymizing them is a relatively simple task. More sophisticated forms of the attack require more complex statistical analysis of traffic and timing. Recent experimental studies have revealed that such techniques can help track the clickstream of a large percentage of Tor browser users and visitors of Tor hidden services.
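To make the idea of correlation more concrete, here is a minimal sketch, not the method used in the paper. It assumes the adversary has packet timestamps for one flow seen at the entry side and for several candidate flows seen at the exit side, counts packets per time bin, and picks the candidate whose pattern has the highest Pearson correlation with the entry flow. The function names, the bin width, and the flow labels are all hypothetical.

```python
from math import sqrt

def bin_counts(timestamps, bin_width, n_bins):
    """Count packets per time bin; timestamps are seconds from capture start."""
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def best_match(entry_ts, exit_flows, bin_width=1.0, n_bins=10):
    """Return the label of the exit-side flow that correlates best, plus all scores."""
    entry = bin_counts(entry_ts, bin_width, n_bins)
    scores = {
        label: pearson(entry, bin_counts(ts, bin_width, n_bins))
        for label, ts in exit_flows.items()
    }
    return max(scores, key=scores.get), scores

# Toy data (made up): flow_a is the entry flow delayed by 50 ms, flow_b is unrelated.
entry_side = [0.1, 0.2, 0.3, 3.1, 3.2, 7.5]
exit_side = {
    "flow_a": [0.15, 0.25, 0.35, 3.15, 3.25, 7.55],
    "flow_b": [1.0, 2.0, 4.0, 5.0, 6.0, 9.0],
}
label, scores = best_match(entry_side, exit_side)
print(label, scores)
```

Real attacks have to cope with jitter, padding, and many concurrent flows, so this should be read only as an illustration of why observing both ends of a circuit is enough.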
Deanonymization and tracking clickstream via practical side channel attacks (Torben):
This unique form of deanonymization attack is named Torben. The technique takes an approach that is more reliable than timing and traffic correlation attacks, as it is much less intrusive. The attack relies on the interplay of two properties: first, web pages loaded via the Tor browser can easily be manipulated to load scripts from untrusted origins; second, even though Tor encrypts the loaded content, a low-latency anonymization circuit is ineffective at hiding the size of request-response pairs. The attack was first described by a group of researchers from the University of Göttingen, Germany, who exploited this interplay to create a side channel in the Tor communication circuit. The side channel transmits short markers of web pages, which expose the pages a client visited using the Tor browser. In an experimental evaluation involving 60,000 web pages, the attack tracked the clickstream of Tor users by detecting web page markers with 91% accuracy.
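The size-based side channel can be illustrated with a toy encoding. This is only a sketch of the general idea and not the encoding or the detector used by the Torben researchers: it assumes a page marker is a short bit string, that an injected script triggers one response per 2-bit symbol, and that an observer can estimate each response size to within a few hundred bytes despite encryption. `SIZE_STEP`, `encode_marker`, and `decode_marker` are hypothetical names.

```python
SIZE_STEP = 2000  # hypothetical: bytes separating two adjacent symbol levels

def encode_marker(marker_bits, bits_per_symbol=2):
    """Map a bit string to response sizes in bytes, one size per symbol.

    Assumes len(marker_bits) is a multiple of bits_per_symbol.
    """
    sizes = []
    for i in range(0, len(marker_bits), bits_per_symbol):
        symbol = int(marker_bits[i:i + bits_per_symbol], 2)
        sizes.append((symbol + 1) * SIZE_STEP)
    return sizes

def decode_marker(observed_sizes, bits_per_symbol=2):
    """Recover the bit string from noisy size observations by quantizing them."""
    max_symbol = 2 ** bits_per_symbol - 1
    bits = []
    for size in observed_sizes:
        symbol = round(size / SIZE_STEP) - 1
        symbol = min(max(symbol, 0), max_symbol)  # clamp to the valid range
        bits.append(format(symbol, f"0{bits_per_symbol}b"))
    return "".join(bits)

marker = "10110001"
sent = encode_marker(marker)          # sizes chosen by the injected script
observed = [6130, 7890, 2200, 4310]   # made-up sizes as seen on the wire
print(sent, decode_marker(observed) == marker)
```

The point of the sketch is that encryption hides content but not magnitude, so coarse size levels survive the trip through the circuit.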
Failure of operational security:
It is easy to track users by monitoring their patterns of behavior. This is relatively simple to accomplish for users who neglect to use a bridge to connect to the Tor network. The method involves following the browsing behavior of users linked to the same aliases on multiple forums, social networks, etc. This is how the identity of the mastermind behind Silk Road, Ross Ulbricht, was revealed. Ulbricht made the big mistake of reusing aliases such as “Dread Pirate Roberts” (DPR) and “frosty” on multiple forums and on the Silk Road marketplace itself.
Recent experiments have shown that 10 web addresses are all that might be needed to identify who the Tor clickstream belongs to. The clickstream is identified by matching account aliases and other online data belonging to the clickstream to publicly available data. The stream can be accurate to the point that it reflects everything a user has been doing, minute by minute.
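As a rough illustration of this kind of matching (a sketch under stated assumptions, not the procedure from the experiments), suppose the observer holds the set of web addresses in an anonymous clickstream and, for each public profile, the set of addresses that profile has shared openly. Ranking the profiles by overlap is then a few lines of code. The function name and the example data are hypothetical.

```python
def rank_profiles(clickstream_urls, public_profiles):
    """Rank public profiles by how many clickstream URLs they have also shared."""
    visited = set(clickstream_urls)
    ranked = [
        (name, len(visited & set(shared)))
        for name, shared in public_profiles.items()
    ]
    # Highest overlap first; ties are broken alphabetically by profile name.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked

# Made-up example data.
clickstream = ["a.example/1", "b.example/2", "c.example/3"]
profiles = {
    "alice": ["a.example/1", "c.example/3"],
    "bob": ["b.example/2"],
    "carol": [],
}
print(rank_profiles(clickstream, profiles))
```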
Clickstream tracking via modified exit/DoS node:
This form of deanonymization attack uses five components: a modified exit node, a modified DoS node, a lightweight DoS web server, client-side JavaScript (JS) for measuring latency, and an instrumentation client that receives the data. The attack is carried out as follows:
– The JS ping code is injected by the exit node into the HTML response.
– As the user browses as per usual, the JS will continue to “phone home.”
– While the attacker keeps measuring, a DoS attack strains the possible initial hop(s).
– If no significant variance is detected, another node is selected from the candidate nodes and the attack sequence restarts.
– Once a sufficient change is detected in the measurements, the entry node is identified, which deanonymizes the user and aids in tracking their clickstream.
This attack method helps identify the whole path of the connection through the Tor network. The attack uses bandwidth multiplication, which makes it possible for low-bandwidth connections to DoS high-bandwidth connections.
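The candidate loop in the steps above can be sketched as follows. This is a simplified model, not the attack tooling itself: `measure` is a hypothetical callback standing in for the latency samples that the injected JS reports back, taken either normally or while the DoS node is straining a given candidate relay. The threshold and the relay names are assumptions as well.

```python
from statistics import mean, stdev

def latency_shift(baseline, under_load):
    """Shift in mean latency under load, in units of the baseline standard deviation."""
    spread = stdev(baseline)
    if spread == 0:
        spread = 1e-9  # avoid division by zero on perfectly flat baselines
    return (mean(under_load) - mean(baseline)) / spread

def find_entry_node(candidates, measure, threshold=3.0):
    """Try candidates one by one until stressing one of them visibly changes latency."""
    for node in candidates:
        baseline = measure(node, stressed=False)
        under_load = measure(node, stressed=True)
        if latency_shift(baseline, under_load) >= threshold:
            return node  # sufficient change: this relay is on the victim's path
    return None  # no candidate produced a significant change

# Stand-in for real measurements: only "relay_c" carries the victim's circuit.
def fake_measure(node, stressed):
    samples = [100.0, 102.0, 98.0, 101.0, 99.0]  # latencies in ms (made up)
    if stressed and node == "relay_c":
        return [s + 50.0 for s in samples]
    return samples

print(find_entry_node(["relay_a", "relay_b", "relay_c"], fake_measure))
```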
Clickstream tracking via BGP:
Experimental studies have shown that Tor is vulnerable to Autonomous Systems (ASes) that can relay Tor traffic, thanks to their effective eavesdropping capabilities. A malicious AS, or a group of colluding ASes, that sits between a Tor user and the entry relay node, and between the exit relay node and the destination, can conduct timing analysis to deanonymize Tor users. AS-level adversaries are very powerful for several reasons. Firstly, routine BGP routing changes can alter the number of ASes that can effectively track the clickstream of Tor users. Secondly, ASes can manipulate BGP announcements to place themselves on the paths entering and exiting the relay nodes of Tor circuits. Thirdly, an AS can perform timing analysis even if it can monitor traffic in only a single direction between the entry node and the exit node. It has been shown that asymmetric routing increases the effectiveness of ASes in tracking the clickstream of a Tor user.
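A small sketch can show why asymmetric routing matters here. It assumes the AS-level paths are already known (in practice they would have to be inferred from BGP data) and simply checks which ASes appear on both the client side and the destination side of a circuit. The function name is hypothetical, and the AS numbers are taken from the range reserved for documentation.

```python
def common_ases(client_side_paths, dest_side_paths):
    """Return the ASes that appear on at least one path on each side of the circuit."""
    client_side = set().union(*client_side_paths)
    dest_side = set().union(*dest_side_paths)
    return client_side & dest_side

# Made-up AS paths.
client_to_entry = [64496, 64500, 64501]
entry_to_client = [64501, 64502, 64496]   # the reverse path differs (asymmetric routing)
exit_to_dest = [64503, 64504, 64505]
dest_to_exit = [64505, 64502, 64503]

# Forward paths only: no single AS sees both ends.
print(common_ases([client_to_entry], [exit_to_dest]))

# Counting both directions: AS 64502 now sits on both sides.
print(common_ases([client_to_entry, entry_to_client], [exit_to_dest, dest_to_exit]))
```

Because an AS that sees only one direction at each end can still run timing analysis, every extra reverse path adds to the set of potential observers.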
Final thoughts:
The paper presented multiple means of tracking the clickstream of Tor users. It is worth mentioning that the biggest weakness that can boost the success of deanonymization attacks is the user. Users should be aware of techniques that can increase their privacy on Tor, such as using a bridge, disabling JS, and avoiding the Windows OS, among others. The Tor Project continuously offers users detailed guidelines and tutorials to help them maximize their privacy and protect their online anonymity.