With Halloween right around the corner, I thought I’d share how I am building my Halloween costume. Ever play the electronic board game “Operation” - the one where you remove body parts from the naked guy but try to avoid touching the sides, or else his nose lights up and an annoying buzzer goes off? Well…
I built a life-size wearable version of the game. I outfitted a clown nose with a red LED and a buzzer, and built a 555 timer IC circuit that is tripped by any one of four reed sensors located in “body parts” I constructed out of felt. My friend Lyndsey Benson, who mentioned the idea, had a friend who did a costume based on “Operation,” but this costume takes it to the next level by making it actually work!
To trip the buzzer and light up the nose, I attached a ceramic magnet to some kitchen tongs, and you simply wave the tongs in front of a body part. Originally I tried to make fabric switches that you would press to activate the buzzer, but that failed because I couldn’t find conductive thread, and silver wire just wasn’t cutting it.

The one-liners with this costume are endless… Here are some suggestions:
- Want to play?
- Please, touch me.
- Can you fix my broken heart?
- Many, many more that are much worse

DipZoom is “an approach to provide focused, on-demand Internet measurements.” By creating a Peer-to-Peer network of hosts around the world, clients can issue requests to specific hosts to perform various network measurements including curl, ping, traceroute, and dig.
For my purposes, I wanted to be able to download a single webpage from many different hosts located around the world to see if a problem detected on one host would appear differently or not at all at another host. I created a command line client to access the DipZoom network so that I could automate such tests in the framework I am helping to develop.
If you are interested in using this Java command line client, you may download dzcommand here. You must go to DipZoom to download the necessary support files to run this client, as described in the readme.txt file, and you will need a suitable JDK and JRE to compile and run the code.
I needed a list of the websites of all the Fortune 500 companies from 2007 for my master’s project. Unfortunately, Fortune wanted to charge me hundreds of dollars for some fancy Excel spreadsheet with much more information than I really needed. I suspect there are other people out there who might find this list useful, so I’ll share how I made it (in case you want more than the URLs). However, you can skip all the following and just download The 2007 Fortune 1000 Website List. Also, here is a zip file of the 2007 Fortune 1000 HTML files found at money.cnn.com.
Using wget, you can download each of the 1000 snapshot pages. The URLs seem to be slightly random, but they all fall between 1.html and 5000.html. Thankfully, wget does not save 404 error pages. So, we download all the links into an empty directory:
- for i in `seq 1 5000`; do wget http://money.cnn.com/magazines/fortune/fortune500/2007/snapshots/$i.html ; done
Ah, that’s nice - we somehow have some extras. After some analysis, this will fix the problem:
- for i in `fgrep xxxxx *|awk '{ print $1 }'|awk -F ":" '{ print $1 }'|sort|uniq`;do rm $i;done
Woohoo! 1000 html files - perfect!
- cat *.html|grep Website|grep headersmtext|awk '{print $4}'|awk -F "\"" '{ print $2 }'|sort|uniq > output.txt
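To see why the two awk steps pull out the URL, here is a small sketch that runs the same pipeline on one made-up line. The sample HTML is an assumption about what the snapshot markup roughly looked like, not a copy of the real page, and `extract_websites` is a hypothetical helper name.

```shell
# Hypothetical helper wrapping the pipeline above so it can read from stdin.
extract_websites() {
    grep Website | grep headersmtext \
        | awk '{ print $4 }' \
        | awk -F '"' '{ print $2 }' \
        | sort | uniq
}

# Made-up sample line (assumed markup). Split on whitespace, the 4th field is
#   href="http://www.example.com">www.example.com</a></td>
# and splitting that field on double quotes leaves the URL as the 2nd piece.
printf '%s\n' \
    '<td class="headersmtext"><b>Website:</b> <a href="http://www.example.com">www.example.com</a></td>' \
    | extract_websites
```

If the real markup puts the link in a different whitespace-separated field, the `$4` would need to change to match.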
Oddly, you’ll end up with only 996 unique URLs, because the following four are duplicates (which is correct):
- http://www.cvscaremark.com
- http://www.fcx.com
- http://www.integrysgroup.com
- http://www.oshkoshtruckcorporation.com
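One way to find repeated entries like these is to sort the full, not yet de-duplicated list and ask `uniq` for only the repeated lines with `-d`. This is a minimal sketch; `list_duplicates` is a hypothetical helper name and the sample input is made up.

```shell
# Hypothetical helper: prints each line that occurs more than once on stdin.
# uniq only compares adjacent lines, so the input has to be sorted first.
list_duplicates() {
    sort | uniq -d
}

# Made-up sample input: one URL appears twice, one appears once.
printf '%s\n' \
    'http://www.fcx.com' \
    'http://www.example.com' \
    'http://www.fcx.com' \
    | list_duplicates
```

On the real data you would feed it the extraction pipeline's output before the final `sort|uniq` step.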
I had trouble compiling packit today. It is a network auditing tool that allows one to define (spoof) nearly all TCP, UDP, ICMP, IP, ARP, RARP, and Ethernet header options, which makes it useful for testing firewalls and intrusion detection/prevention systems, port scanning, simulating network traffic, and general TCP/IP auditing. I’m running OpenSUSE 10.1 and the compilation failed on a #include <net/bpf.h>.
I found a solution to this problem on Jeff Terrell’s site, but here it is in a nutshell. Assuming you have libpcap already installed, all you should have to do is:
- cp /usr/include/pcap-bpf.h /usr/include/net/bpf.h
Alternatively, edit the file containing the failing #include and point it at /usr/include/pcap-bpf.h instead.
My master’s project involves testing a large number of webservers, so I obviously needed a large list of them - at least 100,000. I got such a list by applying as a researcher with http://www.ircache.net. They offer (free for researchers, pay for others) trace files from their public Squid proxy servers situated in the US. What this means is that when people use their free proxy servers, they anonymously log all the URLs people visit. In fact, the anonymization process is fairly interesting in that they MD5-hash all the POST/GET variables and assign a random but consistent IP address to the client…but enough about that and back to the task at hand.
Once you get an account for the trace files (read the FAQ and email them), you can download all the trace files. They will recommend you use the command line Linux “ftp” client, but I’d recommend the command line Linux “ncftp” program instead. This will allow you to download an entire directory at once rather than each file one-by-one.
So:
- mkdir traces;cd traces
- ncftp -u youruser -p yourpassword ftp.ircache.net
- get Traces/*
- quit
- gunzip *.gz
And here is the voodoo magic command I wrote that turns all those trace files into a single list of unique, ordered, web urls (with no POST/GET data):
- cat *|grep 200 |grep http://|awk '{ print $7 }'|awk -F/ '{ print $1 "//" $3 }'|sort|uniq > urls.txt
The urls.txt file that is generated (after a LONG time) contains a single URL, such as http://www.google.com, per line. In total, this gave me about 200,000 unique URLs. IRCache.net only keeps one week’s worth of data on their server at a time, so by downloading new traces each day over a longer period of time, you could acquire an even larger number of URLs fairly quickly.
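One thing to be aware of: `grep 200` matches "200" anywhere on the line, including timestamps and byte counts, so some non-200 responses can slip through. Here is a sketch of a stricter variant that checks the status field directly. It assumes the traces use the native Squid access.log layout, where field 4 looks like `TCP_MISS/200` and field 7 is the URL; `extract_sites` is a hypothetical helper name and the sample lines are made up.

```shell
# Hypothetical helper: reads Squid-style log lines on stdin and prints one
# scheme://host per line, sorted and de-duplicated.
extract_sites() {
    awk '$4 ~ /\/200$/ && $7 ~ /^http:\/\// {
        # Splitting the URL on "/" gives parts[1]="http:", parts[2]="", parts[3]=host
        split($7, parts, "/")
        print parts[1] "//" parts[3]
    }' | sort -u
}

# Made-up sample lines (assumed layout). The third line is a 404 whose byte
# count happens to be 200, so a plain "grep 200" would have kept it.
printf '%s\n' \
    '1190000000.123 45 10.0.0.1 TCP_MISS/200 1234 GET http://www.google.com/search?q=abc - DIRECT/www.google.com text/html' \
    '1190000001.200 12 10.0.0.2 TCP_HIT/200 2000 GET http://www.google.com/ - NONE/- text/html' \
    '1190000002.300 30 10.0.0.3 TCP_MISS/404 200 GET http://example.org/missing - DIRECT/example.org text/html' \
    | extract_sites
```

On the real traces this would be used as `cat * | extract_sites > urls.txt`. Doing the filtering in a single awk pass also saves a few trips through the data, which may help a little with the long run time.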