The Waikato Internet Traffic Storage archives contain nearly 200GB of traces taken from a number of locations, in a number of formats.
Spent a little time reviewing my old YouTube paper in preparation for discussing it in 513.
Tracked down and fixed a few outstanding bugs in my new and improved anomaly_ts. The main problem was with my algorithm for keeping a running update of the median -- a rather obscure bug was triggered whenever a new value fell between the two values I was averaging to calculate the median, and it was causing all sorts of problems.
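To illustrate the case that tripped me up, here is a minimal Python sketch of a running median. It is not the anomaly_ts code: the class name `RunningMedian`, the optional `window` parameter and the sorted-list approach are all assumptions made for the example. The interesting case is an even number of samples, where the median is the average of the two middle values and a new value can land between them.

```python
import bisect
from collections import deque


class RunningMedian:
    """Hypothetical running median over an optional sliding window."""

    def __init__(self, window=None):
        self.window = window      # max samples kept; None keeps everything
        self._order = deque()     # samples in arrival order
        self._sorted = []         # the same samples, kept sorted

    def add(self, value):
        """Insert a value, evict the oldest if needed, return the new median."""
        self._order.append(value)
        bisect.insort(self._sorted, value)
        if self.window is not None and len(self._order) > self.window:
            oldest = self._order.popleft()
            # bisect_left finds one occurrence of the evicted value
            del self._sorted[bisect.bisect_left(self._sorted, oldest)]
        return self.median()

    def median(self):
        n = len(self._sorted)
        if n == 0:
            raise ValueError("no values seen yet")
        mid = n // 2
        if n % 2:
            return self._sorted[mid]
        # even count: average the two middle values
        return (self._sorted[mid - 1] + self._sorted[mid]) / 2


rm = RunningMedian()
rm.add(10)
rm.add(20)          # median is now the average of 10 and 20
between = rm.add(15)  # new value falls between the two averaged values
```

Recomputing the median from the sorted list on every insert avoids carrying any cached "middle" state, which is exactly the sort of state that goes stale when a value lands between the two middle elements.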
Added an API to ampy for querying the event database. This will hopefully allow us to add little event markers on our time series graphs. Also integrated my code for querying data for Munin time series into ampy.
Churned out a revised version of my L7 filter paper for the IEEE Workshop on Network Measurements. I have repositioned the paper as an evaluation of open-source payload-based traffic classifiers rather than a critique of L7 filter. I also spent a fair chunk of time replacing my nice pass-fail system for representing results with the exact accuracy numbers because apparently reviewers found the former confusing.
Tried to continue my work in tidying up and releasing various trace sets, but ran into some problems with my rsyncs being flooded out over the faculty network. This was quite a nuisance so we need to be more careful in future about how we move traces around (despite it not really being our fault!).
Managed to get a decent little algorithm going for quickly detecting a change between a noisy and constant time series. Seems to work fairly well with the examples I have so far.
Decided to completely refactor the existing anomaly_ts code as it was getting a little unkempt, especially if we hope to have students working on it. For instance, there were several implementations of a buffer containing the recent history for a time series spread across the various detector modules. Most of the detectors that we had implemented were not being used and were creating a lot of confusion. Our main source file also had a lot of branching based on the metric being used by a time series, e.g. latency, bytes, users.
It took the whole week, but I managed to produce a fresh implementation that was clean, tidy and did not have extraneous code. All of the old detectors were placed in an archive directory in case we need them later. Each time series metric is now implemented as a separate class, so there is a lot less branching in the main source. There is also now a single HistoryBuffer implementation that can be used by any detector, including future detectors.
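As a rough picture of the new layout, here is a small Python sketch. Only the name `HistoryBuffer` comes from the real code; its methods, the `Metric` classes, the `MeanShiftDetector` and the threshold rule are all hypothetical, and the units in `LatencyMetric` are an assumption. The point is just that metric-specific behaviour lives in a class per metric and every detector shares one buffer implementation.

```python
from collections import deque
from statistics import mean


class HistoryBuffer:
    """Fixed-size buffer of recent values, shared by all detectors."""

    def __init__(self, maxlen):
        self._values = deque(maxlen=maxlen)

    def append(self, value):
        self._values.append(value)   # oldest value is dropped when full

    def is_full(self):
        return len(self._values) == self._values.maxlen

    def values(self):
        return list(self._values)


class Metric:
    """Base class: one subclass per metric replaces branching in main."""
    name = "generic"

    def normalise(self, raw):
        return float(raw)


class LatencyMetric(Metric):
    name = "latency"

    def normalise(self, raw):
        return raw / 1000.0          # assumed: microseconds in, milliseconds out


class BytesMetric(Metric):
    name = "bytes"


class MeanShiftDetector:
    """Toy detector: flags a value well above the mean of recent history."""

    def __init__(self, metric, history_size=5, threshold=2.0):
        self.metric = metric
        self.history = HistoryBuffer(history_size)
        self.threshold = threshold

    def observe(self, raw):
        value = self.metric.normalise(raw)
        event = False
        if self.history.is_full():
            baseline = mean(self.history.values())
            event = baseline > 0 and value > self.threshold * baseline
        self.history.append(value)
        return event


detector = MeanShiftDetector(LatencyMetric(), history_size=3, threshold=2.0)
flags = [detector.observe(v) for v in (1000, 1000, 1000, 5000)]
```

A new detector written this way only needs a `HistoryBuffer` and a `Metric` instance, so it does not have to know which kind of time series it is looking at.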
Released the ISP DSL I traces on WITS -- we are now sharing (anonymised) residential DSL traces for the first time, which will no doubt prove to be very popular.
Finished up the 513 marking (eventually!) and released the marks to the students.
Released a new version of libtrace -- 3.0.17.
Started working on releasing some new public trace sets. Waikato 8 is now available on WITS and the DSL traffic from our 2009 ISP traces will hopefully soon follow. In the process, I found a couple of little glitches in traceanon that I was able to fix before the libtrace release.
Decided that our anomaly detection code does not handle time series that switch from constant to noisy and back again particularly well. A classic example is latency to Google: during working hours it is noisy, but it is constant at other times. We detect the switch, but only after a long time. I would like to detect this change sooner and report it as an event (although not necessarily alert on it). I've started looking into an alternative method of detecting the change in time series style based on a pair of sliding windows: one for the last hour, one for the 12 hours before that. It is working better, but is currently a bit too sensitive to the effect of an individual outlier.
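The two-window idea can be sketched roughly as follows in Python. This is not the detector itself: the class name `StyleChangeDetector`, the window sizes (given in samples rather than hours), the `noisy_threshold` parameter and the use of the median absolute deviation (MAD) as the spread measure are all assumptions for the example. MAD is just one possible choice that is less affected by a single outlier than a variance-based measure would be.

```python
from collections import deque
from statistics import median


def mad(values):
    """Median absolute deviation: a spread measure robust to lone outliers."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


class StyleChangeDetector:
    """Hypothetical detector comparing a recent window with the one before it."""

    def __init__(self, recent_size, previous_size, noisy_threshold):
        self.recent = deque(maxlen=recent_size)      # e.g. the last hour
        self.previous = deque(maxlen=previous_size)  # e.g. the 12 hours before
        self.noisy_threshold = noisy_threshold

    def _style(self, window):
        return "noisy" if mad(window) > self.noisy_threshold else "constant"

    def add(self, value):
        """Add a sample; return (old_style, new_style) on a change, else None."""
        if len(self.recent) == self.recent.maxlen:
            # the oldest recent sample ages into the previous window
            self.previous.append(self.recent[0])
        self.recent.append(value)
        if (len(self.recent) < self.recent.maxlen
                or len(self.previous) < self.previous.maxlen):
            return None                              # not enough history yet
        old, new = self._style(self.previous), self._style(self.recent)
        return (old, new) if old != new else None


det = StyleChangeDetector(recent_size=4, previous_size=8, noisy_threshold=1.0)
quiet = [det.add(v) for v in [50] * 12 + [60]]  # constant, then one outlier
change = det.add(40)                            # a second deviating sample
```

In this toy run the single outlier does not flip the recent window to noisy, but a second deviating sample does, so the change is reported as an event as soon as the recent window disagrees with the longer one.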
Spent a day messing around with the event detection software, mainly seeing how Brendon's detectors work with the existing AMP data. The new "is it constant" calculation seems to be working reasonably well, but there are still a lot of issues with some of the detectors. Need to spend a bit of uninterrupted time with it to really see how it all works.
Had a quick look at the latest ISP traces with libprotoident to see if there are any obvious missing protocols I can add to the library. Added one new protocol (Minecraft) and tweaked a few existing protocols.
Spent the rest of the week at NZNOG, catching up on the state of the Internets. Most of the talks were pretty interesting and it was good to meet up with a few familiar faces.
Finally found and fixed the bug that was causing the occasional trace file to be truncated when written to disk. Having done that, I released libtrace 3.0.13 on Monday.
Worked with Nevil to get a test capture up and running on his capture box in Auckland. After a couple of false starts, we managed to successfully capture a day's worth of trace without issues.
Set up a Fedora machine for testing libtrace prior to subsequent releases, as it has become apparent that testing on just Debian and Ubuntu is insufficient. Will hopefully replace it with a virtual machine once the new emulation network is up and running.
Started working on a possible presentation for NZNOG, mostly about libprotoident again.
Spent a little bit of time reading over my extended NAT sessions paper, making a few edits here and there.
Began preparing for a new round of captures at both Auckland and our ISP. Added a feature to wdcap at Nevil's request where the amount of payload to capture can be specified in the config file (rather than being fixed at four bytes). In the process, found and fixed a libtrace bug which was causing wdcap to capture four more bytes of payload than were requested.
Pushed towards a new libtrace release. First finished adding support for OSPFv2, based on Simon's code. This was a bit harder than expected, as OSPF is a rather complicated protocol and I wanted to try and get the API right first time around. There were a few little traps in the spec that Simon's original code didn't deal with very well, so I had to work around those as well. It's not a perfect implementation but seems to deal with the sample OSPF packets I have pretty well.
Started the 2012 ISP capture on Friday; it seems to be going well so far.
Met with Steffen Wendzel on Friday and talked about our various projects. He was pretty impressed with libtrace and BSOD, while I expect his experience in cyber security and covert channels could be useful for us one day.