With the ever-increasing focus on online privacy, it's important to know what data is being sent, to whom, and by which programs. Here, I show how free Web traffic tools make it easy to find out.
In our always-connected world, it's easy to forget about the constant invisible Web requests that our machines make throughout the day. In an unscientific test, I let my machine sit idle for just a minute, and in that time my Web traffic tool logged more than 100 Web requests from apps like Facebook, Skype, Pinterest, Google Docs and Microsoft OneDrive. These are just some of the sources that constantly send and receive data over the Web without your knowledge.
There are several traffic analysis tools available for download. Wireshark has been around for a long time and is a solid cross-platform tool. However, because I'm a Windows developer and a fan of Telerik tools, I prefer the free tool Fiddler. It offers HTTPS traffic analysis, performance testing and ways to add custom features via extensions.
You can download and install the Web traffic tool Fiddler (note that you do not have to provide your email if you don't want to). Once installed, run Fiddler and you will see a display showing every request sent by your machine to the Web. Even when doing nothing, you'll be surprised how much traffic a typical machine can generate.
Fiddler is meant to be a Web debugging tool for developers, but it provides a lot of information for the non-developer as well. With this Web traffic tool, the casual user can see each request that is sent along with the data and the response from the remote server without having to dig very far.
To read the data, just scan the following default columns.
The Result column shows the HTTP status code, which indicates whether the request was successful. The most common results are:
- 200 -- Success. Request was sent and a response was received successfully.
- 400 -- Bad request. This happens when the target server receives the request but doesn't understand the details and so cannot process it.
- 404 -- Page not found. This can happen if the target API has been moved, or was updated without retaining backward compatibility.
- 500 -- Internal server error. Something went critically wrong on the server side and the error was not captured by the service provider.
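For readers who want to label status codes outside of Fiddler, here is a minimal Python sketch. It relies only on the standard library's `http.HTTPStatus`; the function name `describe_status` and the family labels are my own illustrative choices, not anything Fiddler provides.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short label for an HTTP status code (hypothetical helper)."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not one the standard library knows about.
        phrase = "Unknown"

    if 100 <= code < 200:
        family = "informational"
    elif 200 <= code < 300:
        family = "success"
    elif 300 <= code < 400:
        family = "redirect"
    elif 400 <= code < 500:
        family = "client error"
    elif 500 <= code < 600:
        family = "server error"
    else:
        family = "other"
    return f"{code} {phrase} ({family})"


for code in (200, 400, 404, 500):
    print(describe_status(code))
```

The first digit is the quick tell: 2xx means the request worked, 4xx points at the request, and 5xx points at the server.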
The Protocol column shows either HTTP or HTTPS. HTTPS means SSL/TLS is in use, so your traffic is encrypted before it is sent over the wire. An entry marked HTTP is not necessarily unencrypted, however. (See the discussion of "Host" and "Tunnel To," below.)
The Host column shows either the root URL that is being accessed or "Tunnel To." You may notice that much of your most sensitive traffic travels through these tunnels, such as Microsoft OneDrive and Google data requests. It may appear that this traffic is unencrypted because the protocol listed is HTTP, but don't be concerned. "Tunnel To" means a connection request was issued to set up ongoing traffic to the destination site, usually on port 443, the standard HTTPS port. That request asks for a dedicated tunnel to the destination, and the SSL handshake then takes place inside it. Once the tunnel is established, all traffic traveling through it is encrypted before it leaves your machine.
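To make the tunnel idea more concrete: when a client talks through a proxy such as Fiddler, a tunnel is normally requested with the standard HTTP CONNECT method. The Python sketch below only builds the text of such a request and sends nothing. The function name `build_connect_request` is hypothetical, and a real client would usually add further headers.

```python
def build_connect_request(host: str, port: int = 443) -> str:
    """Build the text of an HTTP CONNECT request (illustrative only)."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"  # ask the proxy to open a tunnel
        f"Host: {target}\r\n"
        "\r\n"                            # blank line ends the request headers
    )


# docs.live.net:443 is the OneDrive endpoint mentioned in this article.
print(build_connect_request("docs.live.net"))
```

Note that the request names only the destination host and port. The page contents and any data you send follow later, inside the encrypted tunnel.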
The URL column shows the specific page or endpoint being requested (or the root URL if using a tunnel). Many back-end services use URLs that may not look familiar. For example, docs.live.net:443 is actually Microsoft OneDrive.
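When the list of URLs gets long, it can help to tally which hosts show up most often. The following Python sketch assumes you have copied a handful of URLs out of the session list by hand. The helper names and the sample entries (apart from docs.live.net:443) are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Extract the host name from a full URL or a bare host:port entry."""
    if "://" not in url:
        # Tunnel entries look like "host:443"; prefix "//" so the
        # parser treats the text as a network location, not a scheme.
        url = "//" + url
    return urlsplit(url).hostname or ""


def count_hosts(urls):
    """Count how many entries point at each host."""
    return Counter(host_of(u) for u in urls)


# Hypothetical sample; only docs.live.net:443 comes from the article.
sample = [
    "docs.live.net:443",
    "https://docs.live.net/some/path",
    "http://example.com/index.html",
]
for host, hits in count_hosts(sample).most_common():
    print(host, hits)
```

Grouping by host makes it easier to spot the handful of services responsible for most of the background chatter.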
Check the Process column to see what application made the given request. If you see the process Explorer, this is Windows Explorer (not Internet Explorer). Windows Explorer is basically Windows itself, and this traffic is typically attributable to Windows refreshing the data in live tiles.
The Caching column reflects client-side cache management, which a website can specify in the HTTP headers it sends with a Web page. These settings indicate whether the page should be cached on the client side (not the server side). If the Caching column is blank, the client side is allowed to cache the page to improve the speed of display the next time you browse there.
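Caching rules usually travel in the standard `Cache-Control` response header. As a rough illustration, the Python sketch below splits such a header into its directives and applies one simplified rule: the client may store the response unless `no-store` is present. Both function names are hypothetical, and real browsers apply more rules than this (for example, `no-cache` permits storing but requires revalidation).

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into a {directive: value} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. "private") map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def client_may_cache(header: str) -> bool:
    """Simplified check: storing is allowed unless no-store is set."""
    return "no-store" not in parse_cache_control(header)


for value in ("", "max-age=3600, private", "no-cache, no-store"):
    print(repr(value), "->", client_may_cache(value))
```

An empty header value passes the check, which matches the blank-column case described above.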
Looking at a sample of data pulled from my machine (Figure 1), you can tell quite a bit about what I'm doing.
At the top of the screen in Figure 1, you see a tunnel to docs.live.net:443. This is the connection to Microsoft OneDrive from Microsoft Word, where I'm writing this post. The Vortex.dat.microsoft.com entry is diagnostic information sent back to Microsoft for its Customer Experience Improvement Program. For the other entries, the Process column on the right shows which application made each request.
The four entries by Explorer are not from Internet Explorer -- they are from Windows Explorer (as noted earlier), the local file explorer that is tightly integrated into Windows itself. These are queries made by Windows to keep the live tiles in my Windows 8 installation updated with live data.
Highlight any line in the grid to see details on that particular request. There are many things you can explore, but the most interesting is the Inspectors tab on the right side of the screen. Choose the Inspectors tab, then the WebView tab on the bottom, and you can see the details of the information that has been transmitted or received. For example, Figure 2 shows the details from the Sports Live Tile update.
Using Fiddler, you can view all the Web traffic that your local machine is generating, and also see what data those programs are sending back to the mother ship.