BaloneyGeek's Place

Operator! Give me the number for 911!

My Little Facemash Moment

Of the many movies that were released in 2010, one in particular stuck with me. It was about a socially inept college student, who failed miserably at love - very possibly because of his social ineptitude - and then decided to compensate for it by doing something from his dorm room that would stick it to the establishment and gain him some notoriety. The movie was called The Social Network.

At that time, I was socially inept, and I had also failed miserably at love, although I now know it wasn't down to social ineptitude. And being the immature child I was, I also wanted to compensate for my romantic failures by causing a little revolution that would make me famous. I just didn't know how to.

It took another 3 years. It was in late 2013, just as I was being battered by the harsh Rajasthan winter as I was about to finish my first semester at university, that I found out I could no longer read BBC News on the campus network. I've always been a rebel, not caring much for authority, and definitely not caring for authority that blocks me from doing things I want to. The time for my little Facemash moment had come.

Content Control

For a network administrator, internet pornography is a nightmare to handle. The average network administrator couldn't care less about the societal norms surrounding pornography; it's the sheer volume of traffic involved, and the sketchiness of the websites and their potential to infect entire networks of computers with insidious malware, that keep them awake at night.

The internet is for porn. This is a fact. Depending on whom you ask, anywhere from 4% to 30% of the internet's websites are related to pornography.

Now 30% might seem like a small number, until you consider the nature of the traffic. While Wikipedia or Google is mostly text, and Facebook is a mixture of text, images and video, most pornography is video. And it's not just video; these days it's high-definition video, which means one person watching porn on the internet can easily blow through gigabytes of data in minutes.

So depending on whom you ask, an average of 17% of the world's websites are for porn, but a whopping 75% of the traffic volume is pornographic video.

In fact, lurk around enough on the internet and you'll find that some pornography websites are at the forefront of content delivery network technology. Running a streaming video website takes so much data transfer capacity and so much high-speed bandwidth - especially a popular one, because let's face it, people watch porn - that pornography websites are actually a driving force in the development of distributed content delivery networks.

YouTube is the world's single most popular video streaming website. PornHub is the second. The next non-porn video streaming website - Vimeo - comes far down the list. YouTube operates at Google scale, with networks of servers in almost every country in the world to deliver content to viewers as efficiently as possible. Given PornHub's popularity, you would guesstimate that they have at least two-thirds the server capacity of YouTube.

And India loves to watch porn. A couple of years ago, Indian Railways started a pilot project to equip large stations in India with free WiFi in collaboration with Google. In return, Google got to collect data on what the users of the free WiFi service were looking at. In the city of Patna, the capital of the state of Bihar, we got news headlines like this: Patna Is The Top User Of Google's Free Wi-Fi At Railway Stations, Mostly For Porn: Report.

My university started with the best of intentions. It happens to be connected directly to the National Internet Backbone through a 10 Gigabit network link, and in the first few years of the university's existence, the campus WiFi was unfiltered. Of course, fewer than 100 students and faculty managed to saturate that link every single night and bring the network down to a crawl. And then there was a campus-wide computer virus outbreak that was presumed to have come from a careless porn viewer.

So the network administrators decided to block internet pornography. And this is where things started going wrong.

They started by blocking pornographic websites, and then blocked torrents. The speeds improved, but they were still not the blazing fast speeds that we should have been getting. So they decided to block access to more categories of websites. Gaming went away. For a little while, so did YouTube, although it was brought back because of "the availability of educational content". And then they blocked news.

The rationale behind this was, in my opinion, absolute insanity. We had televisions in the common rooms, and their stance was - if you want to be updated on the world's current affairs, watch the news on the telly, or come down to the library for a newspaper. Apparently, the few videos embedded into news websites were too much for the network to handle.

I tried to submit requests to the IT department to get BBC and Reuters unblocked. It didn't work. My alternatives were to go to the university administration, or do something about it myself.

In hindsight, I could have gone to the administration. They're nice people, and completely reasonable. But the IT department had pissed me off, and I no longer had much of a high opinion of them. I really wanted to "stick it to the man." And so I did.

DNS Blocking

The university used (and still uses) DNS to block websites. DNS, or the Domain Name System, translates domain names to IP addresses.

You see, every single website on the internet has a numeric address, called an IP address, or Internet Protocol Address. But these addresses can be as large as 12-digit numbers (or, nowadays, 32-digit hexadecimal numbers - numbers using the numerals 0-9 and the letters a-f), and they're mighty hard to remember. What would you rather type into your browser - www.google.com, or 167.182.123.89?

So our university ran its own DNS server, which would, for unblocked websites, translate the domain name to the real IP address, but for blocked websites, it would translate the domain name to an IP address that pointed to some other website that just said "The website you are trying to access is blocked on our network".
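To make the idea concrete, here's a minimal sketch of that lookup logic in Python. It is not the university's actual setup: the domain names, the IP addresses and the `resolve` function are all made up for illustration, and a real DNS server would answer network queries rather than look things up in a dictionary.

```python
# Illustrative only: every name and address below is hypothetical.
BLOCK_PAGE_IP = "10.0.0.1"  # assumed address of the "this site is blocked" page

BLOCKED_DOMAINS = {"news.example.com"}

# Stand-in for the real answers a DNS server would normally return.
REAL_RECORDS = {
    "news.example.com": "203.0.113.10",
    "www.example.org": "203.0.113.20",
}


def resolve(domain: str) -> str:
    """Return the block page's IP for blocked names, the real IP otherwise."""
    name = domain.lower().rstrip(".")  # DNS names are case-insensitive
    if name in BLOCKED_DOMAINS:
        return BLOCK_PAGE_IP
    return REAL_RECORDS[name]
```

Ask this toy resolver for `news.example.com` and it hands back the block page's address instead of the real one, which is all a DNS-based block really is.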

This was effective, but for a university which taught courses in computer engineering, exceedingly easy to break - just tell your laptop or computer to use a different DNS server, not the one provided by the university. Google runs one such public DNS service, and we just pointed all our laptops en masse to Google's DNS servers, completely unblocking everything.

It took a while for IT to figure out what was going on, but they retaliated by making sure DNS traffic never left the university's premises. We were thus limited to using DNS servers located inside the campus network, which meant the university's own servers that blocked websites.

Proxying DNS

This put a stop to all but the most enterprising of the students. Most resorted to using VPNs - Virtual Private Networks, a technique to route all internet traffic via a "private" network, bouncing it off other servers outside the campus network before releasing it to the internet. Unfortunately, VPNs are either free or fast. You can't have both.

I decided to poke around their network to see how they were actually blocking DNS traffic from exiting the campus.

This is where things get technical.

There are usually two protocols - a set of rules that computers follow to communicate with each other - that are widely used on the Internet. They're called TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

Then there's the concept of ports. You see, one physical computer may run multiple services. It might serve websites, and simultaneously run a database that you can connect to from the outside world. If you want to connect to www.google.com, how do you say whether you want to connect to the website running on the server, or the database?

Every computer therefore has 65,536 "ports", numbered from 0 to 65,535. A particular service - say the website, or the database - "listens" on a particular port. When we connect to the website, we have to specify its IP address or domain name, and the port we want to connect to.

The IANA - Internet Assigned Numbers Authority - assigns some "well-known ports". They say that websites must listen on port 80, and websites that are secured and encrypted must listen on port 443. That's why, when you type in www.google.com into a web browser, you don't specify port 80 or port 443, because the browser assumes that you want to see the website and automatically connects to port 80. If the website was, by any chance, listening on port 1234, a nonstandard port, you'd have to write the address like this: www.google.com:1234.
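Here's a small sketch of that default-port behaviour, using Python's standard `urllib.parse` module. The `effective_port` helper and the `DEFAULT_PORTS` table are names invented for this example; a real browser does a lot more than this.

```python
from urllib.parse import urlsplit

# Well-known ports for plain and encrypted web traffic.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit ":1234" in the address always wins.
        return parts.port
    # Otherwise fall back to the well-known port for the scheme.
    return DEFAULT_PORTS[parts.scheme]
```

So `effective_port("http://www.google.com/")` gives 80, while `effective_port("http://www.google.com:1234/")` gives 1234.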

Here's how the university was blocking DNS access - the well known port for DNS is port 53, and the university created a rule in their network firewall that said if any computer inside the network wants to connect to any computer outside the network on port 53, block that connection.

Simple, right?

Well, this is where things start getting fun. Just because the IANA says that DNS servers have to listen on port 53 doesn't mean DNS doesn't work if it listens on port 1234. It just means we have to explicitly specify port 1234 when we point our operating system to a particular DNS server.

I ran some tests on our network. It turned out, we were only allowed to connect to servers outside the network using TCP on ports 80 and 443. We were theoretically only allowed to browse websites.

And guess what, there are quite a few DNS servers on the internet that listen on port 443. We could just use one of those, right?

Almost.

There's another angle to the story - the protocol (TCP, or UDP). Websites can only be browsed using TCP, but DNS traffic can use both TCP and UDP. And since UDP is faster for small amounts of data (typical of DNS requests), DNS defaults to using UDP for traffic.

So DNS servers listen on port 53, expecting to hold conversations with the client using the UDP rules. And now you can probably guess what the problem was - since websites only work using TCP, our network administrators set up the firewall so that only TCP traffic went out on ports 80 and 443. Simply pointing my system to DNS servers listening on port 443 wouldn't work, since the system would try to make a DNS request using UDP, and fail.
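The firewall behaviour I observed can be modelled in a few lines of Python. This is only my reading of the rules from the outside, not the university's actual firewall configuration, and `ALLOWED_OUTBOUND` and `is_allowed` are hypothetical names.

```python
# Assumed model of the outbound rules: only TCP to ports 80 and 443 gets out.
ALLOWED_OUTBOUND = {("tcp", 80), ("tcp", 443)}


def is_allowed(protocol: str, port: int) -> bool:
    """Would a connection from inside the campus to this port be let out?"""
    return (protocol.lower(), port) in ALLOWED_OUTBOUND
```

Under this model, `is_allowed("udp", 53)` is false (ordinary DNS), `is_allowed("udp", 443)` is also false (DNS on a web port, but still UDP), and only something like `is_allowed("tcp", 443)` gets through.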

So I came up with a little idea. What if I wrote a tiny program that ran on my own computer, which listened on port 53 for UDP traffic, and forwarded whatever it received to the real DNS servers outside the network using TCP over port 443? It would then wait for the reply, receive it via the TCP connection, and relay it back to the program requesting the translation (such as Google Chrome) using UDP again.

And that's what I did, and it worked perfectly.
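For the curious, here's a rough sketch of what such a relay can look like in Python. This is not the actual Dennis code. It assumes the upstream server speaks standard DNS-over-TCP, where every message is prefixed with a two-byte length, and the upstream address `192.0.2.53` is a placeholder, not a real DNS server. All the function names are made up for this example.

```python
import socket
import struct

# Placeholder values: replace with a real DNS server that listens on TCP 443.
UPSTREAM = ("192.0.2.53", 443)
LISTEN = ("127.0.0.1", 53)  # binding to port 53 usually needs admin rights


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as DNS-over-TCP expects."""
    return struct.pack("!H", len(message)) + message


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a TCP socket, or fail if it closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("upstream closed the connection early")
        buf += chunk
    return buf


def forward_over_tcp(query: bytes, upstream=UPSTREAM, timeout=5.0) -> bytes:
    """Send one DNS query upstream over TCP and return the raw reply."""
    with socket.create_connection(upstream, timeout=timeout) as conn:
        conn.sendall(frame(query))
        (length,) = struct.unpack("!H", recv_exact(conn, 2))
        return recv_exact(conn, length)


def serve(listen=LISTEN, upstream=UPSTREAM) -> None:
    """Accept UDP queries locally and relay each one upstream over TCP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.bind(listen)
        while True:
            query, client = udp.recvfrom(4096)
            try:
                reply = forward_over_tcp(query, upstream)
            except OSError:
                continue  # drop the query; the client will retry on its own
            udp.sendto(reply, client)
```

Calling `serve()` starts the loop; you then point the operating system's DNS setting at 127.0.0.1. The sketch handles one query at a time and opens a fresh TCP connection for each, which is slow but keeps the idea easy to see.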

Dennis

Because I was feeling so cocky, I decided to put up the code on GitHub for everyone to see - if they could find it. I called it Dennis, a phonetic play on DNS, named after the character Dennis the Menace, because it was also supposed to be a menace to the university's IT department.

I used it for three and a half years. The university's IT department never noticed I was accessing websites that were supposed to be blocked, even though nothing was encrypted. I let a friend of mine, whom I trusted to be responsible with this kind of power, use it. No one noticed him downloading games from Steam once in a while.

After I graduated, I finally let everyone in the university (including the administration) know what I did. After I already had my degree.

At a university that boasts that it produces the engineers of tomorrow, and at a university that inculcates an entrepreneurial spirit right in the course curriculum, you can't reasonably expect that absolutely no one will manage to innovate around a real problem that they face every day. This was a sign that the university worked. In spirit, at least.

If you want to find out which school I attended and what I studied there, I invite you to stalk me on the internet. That information is not hard to find. But if you're in India, trying to choose a university to attend, and planning to study computer engineering, take a look at the one I went to. It's a brilliant little place, and you might like it. For what it's worth, I had the flexibility to design my own study path, and as a result of this path I moved to Germany during my last semester of college, where I still live and work.

Dennis is available here: https://gitlab.com/BaloneyGeek/dennis