Brendon Jones's blog
Tidied up a few bugs in the new Python AMP data API that were causing it
to break when the randomly generated test values fell within a certain
range. Also fixed a couple of small bugs in the option parsing of the
new ICMP test after testing all combinations of arguments.
Started to investigate approaches to data reporting, beginning with the
work Anthony did on messaging systems. I've got a few ideas as to how I
want to proceed with this and I've started implementing some test code to
see if the systems will do the job properly and easily.
Finished working on the new DNS test, up to the point of reporting data
(which will have to be one of the next things I tackle). Moved the
functions shared between this and the ICMP test out into a library so that
all tests can use them (mostly functions for sending and receiving
packets).
Reworked the way that test destinations work to allow measured to resolve
addresses if desired. Destination names that are not in the nametable will
be resolved every time they are used, with a selectable number of the
resulting addresses added to the destination list. This should help with
testing to places that pull tricks with DNS, without requiring individual
tests to deal with it.
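A rough Python sketch of that resolution step, assuming a hypothetical resolve_destination() helper and address cap (measured itself is C, so this is only an illustration of the idea):

```python
# Sketch of resolving a destination name into a capped number of
# addresses each time it is used; resolve_destination and max_addresses
# are invented names, not measured's actual API.
import socket

def resolve_destination(name, max_addresses=1):
    """Resolve a name, keeping at most max_addresses unique addresses."""
    addresses = []
    for family, _, _, _, sockaddr in socket.getaddrinfo(name, None):
        address = sockaddr[0]
        if address not in addresses:
            addresses.append(address)
        if len(addresses) >= max_addresses:
            break
    return addresses
```

Re-resolving on every use means round-robin DNS and geolocated CDNs get exercised much as a normal client would see them.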
Spent some time looking at the new amp web interface, writing tickets and
checking what various libraries are capable of to help plan our approach
for the next while.
Spent some time with Brad looking into git workflows and approaches that
would work for the small group we have working on the new AMP web
interface. The project trac is slowly gaining useful wiki pages and
tickets documenting our approach. Expanded the test Python AMP API to
return more extensive data and behave more like the real one will.
After a good discussion with Shane worked out the best approach to get
test destinations from measured into the tests themselves while still
making it easy to run all the tests as standalone binaries. The best way
seems to be calling the test functions directly inside the forked process
so that destinations can be passed in directly in memory (in much the same
way as the original threaded model worked). A small wrapper around the
tests that provides a main function and command line parsing of the
destination list also makes each test a standalone binary using a majority
of the same code.
Updated the ICMP test to use the destinations properly. Started porting
the old DNS test to the new architecture too, which is doing a good job of
pointing out functionality that is common between all tests.
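The split between scheduler-invoked tests and standalone binaries might look something like this Python sketch (the real tests are C shared objects; all names here are invented for illustration):

```python
# Sketch of the shape described above: the test is a plain function that
# takes an in-memory destination list; the scheduler forks and calls it
# directly, while a small wrapper makes the same function a standalone
# program. All names are illustrative.
import os
import sys

def run_icmp_test(destinations):
    """Stand-in for a real test entry point."""
    for dest in destinations:
        print("testing", dest)
    return 0

def run_from_scheduler(destinations):
    """How measured might invoke a test: fork, then call it directly."""
    pid = os.fork()
    if pid == 0:
        os._exit(run_icmp_test(destinations))    # child runs the test
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

def standalone_main(argv):
    """Wrapper main: take the destination list from the command line."""
    return run_icmp_test(argv[1:])
```

The standalone binary is then just a main() that calls standalone_main(sys.argv), so both paths share all of the actual test code.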
Made various updates to the KAREN weathermap, including adding the new
inter-island link from Avalon to Christchurch.
Wrote a first attempt at a logger for AMP, mostly based around syslog but
still similar in style to the original one. Messages are logged to the
most sensible place depending on how the process is being run (stdout, a
log file, etc), and I'm hoping to use syslog to write the logs in the
common case.
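A minimal Python sketch of the idea, with an invented create_logger() and mode flags (the real AMP logger is C code built around syslog):

```python
# Pick a log destination based on how the process is running; the
# function name and arguments are illustrative assumptions.
import logging
import logging.handlers
import sys

def create_logger(daemonised=False, logfile=None):
    logger = logging.getLogger("amp")
    logger.setLevel(logging.INFO)
    if daemonised:
        # running in the background: hand messages to syslog
        handler = logging.handlers.SysLogHandler(address="/dev/log")
    elif logfile:
        handler = logging.FileHandler(logfile)
    else:
        # running in the foreground: stdout is the sensible place
        handler = logging.StreamHandler(sys.stdout)
    logger.addHandler(handler)
    return logger
```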
Implemented most of the basic ICMP test in the new measured framework.
This showed up a few places needing work that the really simple skeleton
tests hadn't exposed. Still need to decide on the most appropriate
ways to get destinations into the tests and how to report the data.
Started putting together a simple python API for the AMP data to make it
easier for the web interface to access. It's based fairly closely on
various database libraries, returning a result object that can be iterated
on and including a bit of metadata about the result. The data returned is
currently only placeholder data.
Also spent some time helping out with students, planning future work and
cleaning out G.1.01 with Brad.
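The result-object style described above might look roughly like this; the class and field names are placeholders, and the rows are the same sort of placeholder data the API currently returns:

```python
# Sketch of the result object idea: iterable like a database cursor,
# with a little metadata attached. All names are illustrative.
class AmpResult:
    def __init__(self, rows, source, test):
        self.source = source        # where the measurements came from
        self.test = test            # which test produced them
        self.count = len(rows)
        self._rows = rows

    def __iter__(self):
        return iter(self._rows)

def fetch(source, test):
    """Return placeholder data, as the real query layer isn't done yet."""
    rows = [{"time": 1000 + i, "rtt": 20 + i} for i in range(3)]
    return AmpResult(rows, source, test)
```

The web interface can then loop over a result exactly as it would a database cursor, without caring where the data came from.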
First week with the new summer students this week, so spent some time
helping get them settled in. Wrote up some AMP documentation as part of
this and got started with Brad on laying out the wiki etc for future work.
Added a nametable to the new measured that operates pretty much the same
as the existing one, to avoid using DNS. I'm thinking that it might also
be handy to have a way to resolve names to addresses, to make it
easier to test to sites that pull tricks with DNS (load balancing, CDN
geolocation, etc). This would remove the need for tests to deal with that
themselves and reduce duplicated code. With the addition of the nametable
it also now tracks test destinations and merges tests that allow multiple
destinations if they have the same schedule and parameters.
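The merging step could be sketched like this in Python; the key fields and dict layout are illustrative assumptions, not measured's real structures:

```python
# Tests sharing a name, schedule and parameters have their destination
# lists combined into a single entry; field names are invented.
def merge_tests(tests):
    merged = {}
    for test in tests:
        key = (test["name"], test["schedule"], tuple(test["params"]))
        merged.setdefault(key, []).extend(test["destinations"])
    return merged
```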
Spent a while chasing down a timing issue that was causing tests to be run
slightly early and then rescheduled shortly after for the correct time.
Using libwandevent as the sole source of all time information fixed this.
Continued to work on measured this week. Wrote another skeleton test for
measured as an example of how tests with a custom callback function work.
Added a handler for SIGHUP that will reload all the currently registered
tests and the schedule.
Registered tests now run the appropriate command they registered rather
than being hard coded, and parameters from the schedule file are no longer
ignored. Extra parameters determined at runtime by the callback function
are also used now.
Started to work on having measured read and parse the nametable, to allow
tests to run without requiring DNS. This will likely work in a pretty
similar way to how the existing one works.
Spent some more time working on measured. Tests will now be forked and
run (currently just running touch or ping to check it works), with a timer
scheduled to kill any that run too long. Successful tests remove the timer
once they complete - catching the SIGCHLD from the test lets me do all the
necessary cleanup in one place.
Tested it briefly on an emulation machine with 1000 tests scheduled
simultaneously every 20 seconds, which turned up a few small bugs in the
signal handling. After fixing them it all seems to run well, as long
as the watchdog timeout for hung tests is not too short (there isn't
always enough cpu time to go around). Everything works fine with slightly
fewer tasks or a slightly longer timeout.
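A Python sketch of the fork/watchdog/SIGCHLD handling described above; the real code uses libwandevent timers rather than threads, and all names here are invented for illustration:

```python
# Fork a test, arm a timer that kills it if it runs too long, and cancel
# the timer when SIGCHLD says the test finished on its own.
import os
import signal
import threading

watchdogs = {}   # pid -> watchdog timer for that test

def start_test(args, timeout):
    # block SIGCHLD so the reaper can't race the bookkeeping below
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    pid = os.fork()
    if pid == 0:
        os.execvp(args[0], args)        # child: run the actual test
    timer = threading.Timer(timeout, os.kill, [pid, signal.SIGKILL])
    timer.start()
    watchdogs[pid] = timer
    signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    return pid

def reap_children(signum, frame):
    """SIGCHLD handler: collect finished tests, cancelling their timers."""
    while True:
        try:
            pid, _ = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break                       # no children left at all
        if pid == 0:
            break                       # remaining children still running
        timer = watchdogs.pop(pid, None)
        if timer is not None:
            timer.cancel()

signal.signal(signal.SIGCHLD, reap_children)
```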
Had another discussion with Shane about how we should structure tests and
started fleshing out a skeleton/example test. Basing it on a similar
structure to how Maji loads its various decoders etc, with lots of shared
objects that register various properties of the test when they are loaded.
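The registration idea might be sketched like this; in the real system each test is a C shared object whose init code calls a registration function, so everything here is an illustrative stand-in:

```python
# Each test describes itself when loaded; names and properties invented.
registered_tests = {}

def register_test(properties):
    """Called by each test as it is loaded."""
    registered_tests[properties["name"]] = properties

# a skeleton test registering the sort of properties a test might have
register_test({
    "name": "skeleton",
    "max_targets": 1,
    "run": lambda destinations: 0,
})
```

The scheduler then only deals with the registered properties and entry points, so adding a test means dropping in a new object rather than touching the core.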
Spent some more time reading bits of honours reports before they were
handed in.
Updated the addressing of the KAREN AMP machines so they would continue to
work with recent network changes. In the process of doing so, discovered
that CFEngine would no longer update certain sites and spent quite a while
trying to debug it. It was failing to authenticate server keys properly,
which was fixed by forcing it to refetch the (exactly the same, identical)
key. Not impressed that it is acting flaky over something like this.
Started work on a new implementation of AMP using some of the ideas we've
been talking about. Currently I'm working on a reimplementation of
measured using libwandevent. At this stage it can read the old format of
schedule file, creating a timer event for each entry that runs a dummy
function when the time arrives and then reschedules itself.
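A toy version of a self-rescheduling timer event, using Python's sched module in place of libwandevent (the interval and callback are invented). Computing each next run from the scheduled time, rather than from "now", keeps the schedule from drifting:

```python
# A dummy "test" that records when it was scheduled to fire, then
# reschedules itself at scheduled_at + interval.
import sched
import time

runs = []

def run_test(scheduler, scheduled_at, interval, remaining):
    runs.append(scheduled_at)
    if remaining > 1:
        next_at = scheduled_at + interval
        scheduler.enterabs(next_at, 1, run_test,
                           (scheduler, next_at, interval, remaining - 1))

scheduler = sched.scheduler(time.time, time.sleep)
start = time.time()
scheduler.enterabs(start, 1, run_test, (scheduler, start, 0.01, 3))
scheduler.run()
```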
Spent most of this week reading over 520 reports.
Fixed the way the plateau detector works to prevent many spurious events
being triggered if multiple locations had events happen around the same
time. Tidied up some of the database accesses when storing and fetching
events to prevent injection attacks.
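The injection fix boils down to using bound parameters instead of building query strings by hand; a sqlite3 sketch with an invented schema (the real system's database and driver will differ):

```python
# Bound parameters let the driver quote values safely, so a malicious
# "source" string can't alter the query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (source TEXT, severity INTEGER)")
conn.execute("INSERT INTO events VALUES ('waikato', 3)")

def fetch_events(conn, source):
    # the ? placeholder is filled in by the driver, never by string
    # formatting in our code
    return conn.execute(
        "SELECT source, severity FROM events WHERE source = ?",
        (source,)).fetchall()
```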
Continued working with Nathan to get smokeping data successfully into the
event detection system. I generated some random data to fill the
historical buffers and then continued to run it over live data, which
generated a small number of plausible looking events. I'm now looking into
the scalability and resource usage of this as it seems a little higher
than it should be. Also polished the dashboard graphs slightly, changing
them to use more sensible axes and better resolution data.
Spent some time with Richard, Tony and Shane thinking about the future
direction of AMP. We've got some good ideas and have a whiteboard full of
initial planning for the work that needs to be done.
Read draft introductions to a number of 520 reports and gave some
hopefully useful feedback. Everyone seems to be on the right track so far,
looking forward to reading more.