

Tower defense games: everyone’s played them in some form; they’re pretty ubiquitous in this touchscreen world.  At lunch today, I hit upon a new twist that I don’t think has been tackled before: the WiFi needs of users at a large organization (in this case, a school).

Now imagine that they want to kill you because they can’t get on Facebook to post another twelve duckface selfies.

The idea is that you, the player, are the sysadmin on a school campus.  You have a budget for network infrastructure, and you need to keep the WiFi running and able to meet the demands of the users as they bring in more and more devices.  You can get status reports from various parts of your user base, much like the cabinet reports in SimCity 2000, and your life bar would tick down as more and more users streamed into the always-open door of your office.  Your campus would start small but eventually expand as you gained more users (students).  The construction of buildings would, of course, affect the signal of your access points, and you would also have to make sure that you had enough switchports and network drops available to connect both APs and wired users.

The best part?  It teaches while you play!  Balancing user needs against a budget, and battling all the issues that can come from a mix of WiFi infrastructure and user devices, is a useful skill.  Will you force an SSID to go entirely to 5GHz at the risk of crippling a cart full of ancient laptops that only have 2.4GHz radios?  What do you do when you start piloting a one-to-one program?
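To make that core loop concrete, here’s a rough Python sketch of what a single turn of the simulation might look like.  Everything in it (the Campus and AccessPoint classes, buy_ap, tick, the numbers) is hypothetical, my own first guess at the balance-users-against-budget mechanic rather than anything that exists in the repo yet.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AccessPoint:
    capacity: int   # how many devices it can serve comfortably
    band: str       # "2.4GHz" or "5GHz"


@dataclass
class Campus:
    budget: int
    life: int = 100
    users: int = 0
    aps: list = field(default_factory=list)

    def buy_ap(self, cost: int, capacity: int, band: str) -> bool:
        """Spend part of the budget on a new access point, if we can afford it."""
        if cost > self.budget:
            return False
        self.budget -= cost
        self.aps.append(AccessPoint(capacity, band))
        return True

    def tick(self) -> None:
        """One turn: more devices show up, and unmet demand drains the life bar."""
        self.users += random.randint(1, 5)
        capacity = sum(ap.capacity for ap in self.aps)
        unmet = max(0, self.users - capacity)
        self.life -= unmet   # angry users streaming into the office


campus = Campus(budget=5000)
campus.buy_ap(cost=800, capacity=30, band="5GHz")
for turn in range(10):
    campus.tick()
    print(f"turn {turn}: users={campus.users}, life={campus.life}, budget={campus.budget}")
```

The real game would obviously need signal propagation, building construction, and wired drops on top of this, but even a crude loop like that captures the basic tension: devices show up faster than a fixed budget can cover, and unmet demand eats your life bar.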

So my call to you, fair readers, is to help me make this game a reality.  I’m not a game designer, but I am a sysadmin, so I am familiar with a lot of scenarios, and, with my training as a writer and my general nature, I have ideas and opinions, which I would be happy to share with (read: force upon) you if you decide to jump on board.

I also have a GitHub repo for the project.  I hope that you can help me make it happen.


Yesterday, I got a first-hand demonstration of how a simple, well-meaning act of tidying up can have far-reaching consequences for a network.

Our campus uses Cisco IP phones both for regular communication and for emergency paging.  As such, every classroom is equipped with an IP phone, and each of these phones has a built-in switch port, so that rooms with only one active network drop can still have a computer (or, more often, a networked printer) wired in.  If you work in such an environment, I hope that this short story will serve as a cautionary tale about what happens when you don’t clean up.

I was working at my desk yesterday afternoon, already having more than enough to do, since the start of school is only a few days away and everybody wants a piece of me all at once.  While I was reading through some log files, a bit of motion at the bottom of my vision caught my attention: the screen on my phone had gone from its normal display to one that just said “Registering” at the bottom left, with a little spinning wheel.  Well, thought I, it’s just a blip in the system; it wouldn’t be the first time my phone cut out for a second.  So I reset my phone.  Then I looked up and saw that my co-workers’ phones were doing the same thing.  Must just be something with our switch, I thought.  So I connected to the switch over a terminal session and checked the status of the VLANs.  Finding them all present and accounted for, I took the next logical step and reset the switch.  A couple of minutes later, the switch was back up and running, but our phones were still out.

Logging in to the voice box, I couldn’t see anything out of the ordinary, and the closest phone I could find outside of my office was fully operational.  Soon, I began getting reports that the phones, the WiFi, and even the wired internet were down, or at least very slow, elsewhere on campus, though from my desk I was still able to get out to the internet with every device available to me.  The reports, though, weren’t all-encompassing.  The middle school, right across a courtyard from my office, still had phones, as did the art studios next door, but the upper school was down, and the foreign language building was almost completely disconnected from the rest of the network; the few times I could get a ping through, the latency ranged from 666 (seriously) to 1200-ish milliseconds.

I reset the switches I could reach in the worst-affected areas.  I reset the core switch.  I reset the voice box.  Nothing changed.  I checked the IP routes on the firewall: nothing out of the ordinary.  Finally, in desperation, my boss and I started unplugging buildings, pulling fiber out of the uplink ports on their switches and then waiting to see if anything changed.  Taking out the foreign language building, the most crippled of the bunch, seemed like the best starting point, but it was fruitless.  Then we unplugged the main upper school building, and everything went back to normal elsewhere on campus.  Plug the upper school back in and, boom, the phones died again; unplug it, and a minute later everything was all happy internet and telephony.

We walked through the building, looking for anything out of the ordinary, but our initial inspection turned up nothing, so, with tape and a marker in hand, I started unplugging cables from the switch, one by one, labeling them as I went.  After disconnecting everything on the first module of the main switch, along with the secondary PoE switch that served most of the classroom phones, I plugged in the uplink cable.  The network stayed up.  One by one, I plugged cables back into the first module, but everything stayed up.  Then I plugged the phone switch back in, and down the network went again.

After another session of unplugging and labeling cables, I plugged the now-empty voice switch back in, hoping for the best.  The network stayed up.  Then I plugged the first of the cables back into the switch.  Down the network went.  Unplug.  Back up.  Following that cable back to the patch panel, we eventually found the problem, one I had missed on my initial sweep of the rooms: two cables hanging out of a phone, both plugged into ports in the wall.  For whatever reason, both ports on that wall plate had been live, and that second cable, plugged in out of some sense of orderliness, had created the loop that flooded the network with broadcast traffic and brought down more than half of campus.
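If you’ve never watched a loop do its work, here’s a toy Python sketch of the mechanism.  The topology and names are made up (a stand-in voice switch, two wall ports, and “the rest of campus”), and it deliberately ignores everything a real switch could do to save you, like spanning tree or storm control; the point is just that flooding plus a physical loop means one broadcast never stops coming back.

```python
from collections import defaultdict

# Made-up topology: an uplink to the rest of campus, the upper school voice
# switch, and the two wall ports that the phone's extra cable tied together.
links = defaultdict(set)


def connect(a, b):
    links[a].add(b)
    links[b].add(a)


connect("rest-of-campus", "voice-switch")
connect("voice-switch", "wall-port-A")
connect("voice-switch", "wall-port-B")
connect("wall-port-A", "wall-port-B")   # the "tidy" second cable

# One broadcast enters from the uplink.  Each hop, every copy is flooded out
# of all ports except the one it arrived on, which is exactly what switches
# do with broadcasts when nothing is blocking the loop.
frames = [("voice-switch", "rest-of-campus")]
for cycle in range(1, 13):
    next_frames = []
    for node, came_from in frames:
        for neighbor in links[node] - {came_from}:
            next_frames.append((neighbor, node))
    frames = next_frames
    hits = sum(1 for node, _ in frames if node == "rest-of-campus")
    print(f"cycle {cycle:2}: {len(frames)} copies in flight, "
          f"{hits} headed back toward the rest of campus")
```

Comment out the wall-port-A to wall-port-B link and the broadcast dies out after a couple of hops; leave it in and copies keep washing back over the uplink for as long as the loop exists, which is more or less what over half of campus was feeling.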

Take away whatever lesson you want from this story, but after working for almost four hours to find one little loop, I will think twice about hotting up two adjacent ports if they aren’t both going to be connected immediately and (semi)permanently to some device, especially if one of them is going to a phone.