I found two Luckie papers on alias resolution. The one on MERLIN seems a bit limited as it relies on multicast mrinfo. The one on MIDAR looks promising as it appears to be an improved version of Ally, which uses monotonic IP ID detection to identify interfaces belonging to the same router. I need an estimate of what proportion of aliases are found.
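As a rough illustration of the shared-counter idea behind monotonic IP ID detection (not MIDAR's or Ally's actual algorithm), the sketch below merges timestamped IP ID samples from two candidate interfaces and checks that they form one increasing sequence modulo 2^16. The function name, the sample format and the `max_step` threshold are all assumptions made for the example.

```python
def ipid_monotonic(samples, max_step=1000):
    """Return True if the merged samples look like one shared IP ID counter.

    samples: list of (timestamp, ipid) tuples gathered from both candidate
    interfaces (hypothetical format). max_step is an assumed upper bound on
    how far the counter may advance between consecutive samples.
    """
    ordered = sorted(samples)  # order by timestamp
    for (_, prev), (_, cur) in zip(ordered, ordered[1:]):
        # The IP ID field is 16 bits wide, so allow for wrap-around.
        step = (cur - prev) % 65536
        if step == 0 or step > max_step:
            return False
    return True


# Interleaved probes to two addresses: A at t=0,2,4 and B at t=1,3.
same_router = [(0, 100), (2, 104), (4, 109), (1, 102), (3, 106)]
different_routers = [(0, 100), (2, 104), (1, 40000)]
```

With these inputs, `ipid_monotonic(same_router)` holds and `ipid_monotonic(different_routers)` does not. A real tool would also need to estimate counter velocity and handle routers that use random or constant IP IDs.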
A scamper run was carried out using the same initial port numbers for old < 99% and new 99% MDA. However, existing data showed that when running new MDA, scamper fails to probe nodes which were found to be load balancers by the old < 99% mode. It is unclear at this stage what might cause this to happen. Some debugging analysis will be necessary to understand this.
Analyses carried out at different times were compared: in the first, the varying port bits ran in their original incomplete bit variation mode; in the second, random initial port values were used as well as bit flipping. The initial mode gave a larger number of null returns. This is an unexpected result. The latter mode gave a larger number of load balancers, as might be expected; however, it may be necessary to carry out this comparison in the same scamper run.
Further work was carried out on churn analysis to add a more complete range of result categories.
Not a lot of progress this week. Finally worked through and realised what was causing the problem with my write-metadata instruction -- the execution order wasn't what I expected, so packets would be returned to the controller with information about a flow before the write-metadata instruction was executed, which makes it appear as though the instruction hasn't been executed. Waiting on a response about how the specification expects things to work.
Brad and I brought the HP OpenFlow switch back out and I had another crack at getting RouteFlow running on it. Made more progress this time, partially due to improvements in the latest RouteFlow code, and partially because I have a bit more familiarity with Open vSwitch, which was the point of failure last time we tried. Didn't quite get it all up and running, but the basic test environment + GUI seemed to work pretty nicely.
With a bit of tweaking, my smoother modelling process is now producing results just as good as, if not better than, what I was getting with the old wavelet-based system. There are still quite a few false positives, which is annoying, but these are almost all situations where there is a traffic spike but I judge it to be too small to qualify as a genuine event.
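One possible way to suppress spikes that are too small to matter is to require a minimum absolute deviation on top of the relative one. The sketch below is only an illustration: the entry does not say which smoother is used, so an exponentially weighted moving average stands in for it, and `alpha`, `rel_threshold` and `min_abs` are hypothetical parameters.

```python
def detect_spikes(series, alpha=0.3, rel_threshold=0.5, min_abs=10.0):
    """Return the indices of samples that spike above a smoothed baseline.

    An EWMA is used as a stand-in smoother (an assumption). A sample is
    reported only if it exceeds the baseline by more than rel_threshold
    (as a fraction of the baseline) AND by at least min_abs in absolute
    terms, so small spikes on quiet links are ignored.
    """
    events = []
    if not series:
        return events
    smooth = float(series[0])
    for i, value in enumerate(series[1:], start=1):
        deviation = value - smooth
        if deviation > rel_threshold * max(smooth, 1e-9) and deviation >= min_abs:
            events.append(i)
            # Do not fold the spike into the baseline.
        else:
            smooth = alpha * value + (1 - alpha) * smooth
    return events


busy_link = [100, 100, 100, 300, 100, 100]   # large spike at index 3
quiet_link = [2, 2, 2, 6, 2]                 # tripled, but tiny in absolute terms
```

Here `detect_spikes(busy_link)` reports index 3, while `detect_spikes(quiet_link)` reports nothing because the spike never reaches `min_abs`.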
At this point, we need to stop playing with anomaly detection and start thinking about combining everything into a rough but functional final product.
Spent some time helping Meenakshee get set up and helped out with her 591 proposal. Also worked out a revision plan for the IMC paper and sent it off to our shepherd.
Haven't had a lot of time this week, but managed to make some decent headway into persistence. The SQLite API is quite straightforward (if a little verbose) despite the lack of examples in the documentation. I've written a function to serialize all the headers at once, which I'll use for sending them in one chunk and for storing them in the database (using a relational table seems like asking for bugs, though I can always change that). I'm halfway through implementing the SQLite access abstraction to store and retrieve frames (which I should easily be able to finish on Monday), then I can just write frames to the database before sending messages. I'll need to put some careful thought and refactoring into connections going down, multiple connections and when to re-send frames. I'm going to need to carefully manage my time in order to have time for at least some useful testing before the conference.
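A minimal sketch of that storage approach, written with Python's standard `sqlite3` module rather than the API the project itself uses. The table layout, the function names and the use of JSON for the serialized headers are assumptions made for the example; the point is only that the headers go into a single column as one chunk instead of a separate relational table.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the frame store; schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS frames ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "headers BLOB NOT NULL, "
        "body BLOB NOT NULL)"
    )
    db.commit()
    return db


def serialize_headers(headers):
    """Serialize all headers at once into one chunk of bytes."""
    return json.dumps(headers, sort_keys=True).encode("utf-8")


def store_frame(db, headers, body):
    """Persist a frame before it is sent; returns its row id."""
    cur = db.execute(
        "INSERT INTO frames (headers, body) VALUES (?, ?)",
        (serialize_headers(headers), body),
    )
    db.commit()
    return cur.lastrowid


def load_frame(db, frame_id):
    """Fetch a stored frame, or None if the id is unknown."""
    row = db.execute(
        "SELECT headers, body FROM frames WHERE id = ?", (frame_id,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0].decode("utf-8")), bytes(row[1])


db = open_store()
frame_id = store_frame(db, {"destination": "/queue/a"}, b"hello")
```

Calling `load_frame(db, frame_id)` returns the headers and body that were stored, which is what a re-send after a dropped connection would need.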
Spent around an hour or so working on the blurb description last week, and more time than I would have liked working on the proposal this week. Shane was very helpful, commenting on stuff I could add to the proposal and whatnot. That included going through a proposed implementation plan, which makes things much clearer now.
He was also kind enough to show me around and I finally got around to sorting out lab access to the WAND hardware lab. Will try to spend some time in the labs every day -- chances are I'll be more productive in a "serious" environment.
For the next 2-3 weeks, the plan is to play around with Libprotoident and have a look at the examples, source code, etc but assignments are cropping up too, so not sure how well I'll be able to stick to the proposed schedule.
Decided there wasn't much point creating a hacked version of link aggregation, so moved on to VLANs instead. Did a bit of reading and planning about that.
Bit counts were calculated for the current scamper run with the wider bit variation. The initial port value was not varied from trace to trace in original scamper and this is still the case. A random number is now used in the same range and a repeat scamper run is being carried out.
Debugging was carried out on the new confidence levels of scamper and it appears to be behaving as expected, apart from the results of the old vs new analysis. To help verify the correctness of the analysis, packet counting was carried out to confirm the type of trace being analysed, and this did confirm that the correct traces were being analysed as having old and new confidence settings.
I then moved on to finding cases where the old <99% method found successors that the new 99% didn't find. There do seem to be some anomalies associated with AS 174.
Further preparation of my talk was carried out.
Saw a patch come through this week for other instruction support, which made me do a bit of a re-think on the way I've implemented the metadata instruction so far. Although this patchset is not particularly near ready, it may be useful for my patch to work against it. It seems that for supporting instructions as command-line input, I may end up running the same set of checks multiple times to verify correctness of the metadata instruction after it's ported to internal structures.
This isn't exactly elegant, but the way in which actions are included in various packet types in OpenFlow means it's the only way to proceed. I tinkered with this against the new patchset, but didn't make much progress. Hoping to post this once more for review in the next week.
Started looking at using topology data to generate more datapoints to help
group events on. Hopefully should be able to group events between sites
that share common paths (at this stage I'm planning on starting with the
AS path) as well as those that share sources and targets. As part of this
added an event detector to alert on major path changes between sites and
realised that there appears to be a bug in the AMP code to determine
common paths. Spent some time trying to track it down and it looks to be
due to counting the sample time period incorrectly, which I'm now trying
Figured out the cause of the AMP data interface module crashing on newer
PHP/Apache. An incorrectly sized variable was being used in the C portion
to receive data from the PHP portion and along the way it was clobbering
something it shouldn't have. I'm sure the compiler warned about this last
time, but not in this case.
Over the last couple of weeks the schematics have been finalised and the board layout completed, ready for manufacture. We're now just waiting on an assembly quote before proceeding.