My dumbest python moment ever….

In cleaning up tmk’s cache related code for a fresh commit, I wrote an expand2str() method that encapsulates the issue of dealing with expand() returning a list of expansions, when a properly converted string is what’s wanted anyway.

Suddenly I noticed this backtrace:


Traceback (most recent call last):
  File "tmk.py", line 929, in <module>
    process_recipe(recipe)
  File "tmk.py", line 742, in process_recipe
    recipe.parse()
  File "tmk.py", line 276, in parse
    if not self.eval_pproc_directive(p):
AttributeError: Recipe instance has no attribute 'eval_pproc_directive'

Which is totally ridiculous, because eval_pproc_directive and parse are both methods of the same class. While the former is defined after the latter, by the time the instance (self) exists, the class is fully defined. Making a short test case proved that some one-in-a-billion act hadn’t changed Python’s rules about this stuff.

In poking around to see what change introduced during this commit may have popped the magic cork, I noticed that removing the reference caused the same type of error, successively on each method defined after the point where it was invoked.

Then I saw it!

I had accidentally indented the expand2str() method one level short, thereby making it a function rather than a class method, and likewise making the methods that followed it nested functions inside expand2str().
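A minimal reconstruction of the mistake (the surrounding code is illustrative, not tmk’s actual source) shows exactly why the traceback looks so impossible:

```python
class Recipe:
    def parse(self):
        # parse() looks up its sibling method on the instance at call time
        return self.eval_pproc_directive(None)

# The typo: one indent level short. expand2str() is now a module-level
# function, and eval_pproc_directive() below ends up nested inside it,
# invisible to the Recipe class.
def expand2str(self, value):
    def eval_pproc_directive(self, p):
        return True
    return str(value)

try:
    Recipe().parse()
except AttributeError as exc:
    print("AttributeError:", exc)
```

Since the class body ends at the dedent, everything after it silently stops being part of Recipe, and the lookup only fails at call time.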

Sometimes Python really irks the typoist in me!

Today at work, I was thinking over my next agenda with tmk, namely implementing the checksum based method of checking whether or not targets are up to date, and recipe caching.

Implementing the checksums is actually pretty easy; the hardest part is just adding a command line option and a method for changing the checksum algorithm to be used.
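Something along these lines is all that’s really needed (the function name and defaults are my own sketch, not tmk’s code): hashlib.new() takes the algorithm name as a string, so swapping algorithms is just a matter of passing through whatever the command line option says.

```python
import hashlib

# Sketch of a selectable-checksum helper. The algorithm name can come
# straight from a command line option, since hashlib.new() accepts any
# algorithm the underlying build supports ("md5", "sha1", "sha256", ...).
def file_checksum(path, algorithm="sha1", blocksize=65536):
    h = hashlib.new(algorithm)
    with open(path, "rb") as fp:
        # read in fixed-size blocks so huge files don't blow up memory
        for block in iter(lambda: fp.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()
```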

Doing the cache, on the other hand, requires a bit more ‘thought’ in the solving. Because tmk currently does variable expansions on rules, it’s impossible to correctly cache any rule or variable assignment involving an environment variable. The principal reason for the variable expansions was to help ensure the uniqueness of a rule. So obviously, the proper thing to do is to delay the variable expansions until they are needed, thereby making the data store completely cacheable.

The obstacle, of course, is that rule definitions then lose the chance to be a way in which rules become unique, which rules out the possibility of variable assignments occurring after the top level: that encourages me to push the spec in the direction I had originally planned. Another option would be to just leave it as is, and note the issue around environment variables in the manual. A third possible solution is making the processing step more aware of line numbers.

Since I’d rather have it that way, I’m opting for the first: delaying the variable expansions so that the entire result of the parsing phase can be cached.
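The shape of the idea, as a rough sketch (the class and the variable syntax here are illustrative; tmk uses $(name) while string.Template uses $name): the parse phase stores raw text, and expansion, including environment variables, only happens at processing time, so the cached parse result stays environment-independent.

```python
import os
import string

# Illustrative sketch of delayed expansion, not tmk's real data store.
class Rule:
    def __init__(self, raw_lhs, raw_rhs):
        self.raw_lhs = raw_lhs          # stored exactly as written
        self.raw_rhs = raw_rhs          # so the parse result is cacheable

    def expand_lhs(self, variables):
        # late binding: the environment is consulted only now,
        # never at parse/cache time
        scope = dict(os.environ)
        scope.update(variables)         # recipe variables win over env
        return string.Template(self.raw_lhs).safe_substitute(scope)
```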

I also feel like a chicken with its head cut off atm :-S

+1 for design

I’ve just fixed the bug about tmk reporting the wrong filename when an included file contains an error. The thing that makes me smile is that, because I designed the recipe parsing and processing code reasonably well, making that work correctly was trivial. At worst, I expected I would have to modify the Recipe class to incorporate a separate data stack into the parser, but not even that was necessary. Most changes were (as hoped) just refining the data structures used for storage.

tmk is pretty simple; it does two passes over a recipe:

  • First it parses the recipe into an internal data store; most serious (e.g. syntax) errors are reported here. A minimal level of evaluation is done, namely some expansions, or you couldn’t use variable substitutions when defining a rule.
  • Secondly, it walks through the data store at processing time (i.e. doing the magic), conducts final expansions when needed, and carries out its mission in life.
One reason I chose the syntactic style that I did for tmk is that it’s both visually straightforward and easy to code around. I like simple but effective, when it works.
My next (and real) task will likely be making the rules relational to one another, exempli gratia to topologically enqueue rules in dependency order. Right now tmk is limited to sequentially executing the rules. That’s actually good *enough*, but I’d rather have that tidbit taken care of by tmk than have to address it in the recipe construction.
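Since Python 3.9 the standard library even does the topological legwork; a sketch with an illustrative dependency map (the structure is mine, not tmk’s internal store):

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Each target maps to the set of things it depends on.
deps = {
    "app":    {"main.o", "util.o"},
    "main.o": {"main.c"},
    "util.o": {"util.c"},
    "main.c": set(),
    "util.c": set(),
}

# static_order() yields every node after all of its dependencies,
# i.e. a valid build order for the rules.
order = list(TopologicalSorter(deps).static_order())
```

TopologicalSorter also raises CycleError on circular dependencies, which is exactly the diagnostic a build tool wants for free.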

I must admit, however, that adapting isnewer() by way of its cmpfunc parameter, to cope with using file checksums instead of modification times, will be fun to implement :-D.
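A hedged sketch of how that might slot together; the signatures here are my guesses at the shape of the thing, not tmk’s actual isnewer(). cmpfunc answers “is target up to date with respect to source?”, so mtimes and checksums become interchangeable policies:

```python
import hashlib
import os

def isnewer(target, source, cmpfunc=None):
    # default to the classic make behaviour when no policy is given
    return (cmpfunc or mtime_cmp)(target, source)

def mtime_cmp(target, source):
    # modification-time policy: target is current if it's at least
    # as new as its source
    return os.path.getmtime(target) >= os.path.getmtime(source)

# digests recorded the last time each source was successfully consumed
recorded = {}

def checksum_cmp(target, source):
    # checksum policy: target stays current while the source's digest
    # still matches the one on record
    with open(source, "rb") as fp:
        return recorded.get(source) == hashlib.sha1(fp.read()).hexdigest()
```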

I’m going to be dead tired before noon even approaches, but I’m smiling now! The focus of my day has been on getting tmk up to snuff enough that I can use it as a general purpose solution to my problem: cursing at the present ‘generation’ of such tools.

I worked on getting a base set of magic bound variables and teaching tmk that certain rules may be skipped, if a set of pre-conditions hold about the files involved. Getting that done was easy enough. The checking code is now more robust, properly handling an arbitrary set of input/output names, in as much as is humanly possible ;). I’m smiling because I spent most of the night being annoyed every which way up, on top of an already splitting headache. I ended up having to quit coding for a bit, and just hit RvS for a couple hours.

About two hours of sleep, and still plenty of hours until sunrise, I woke up, got back to the codin’, and now it’s done!

So far, tmk has been put together in a rather short amount of time, even if it’s been on my dancing list for a few months; finally it’s almost beta quality. The only show stopper that’s come up in testing is that it fails to handle nonexistent tmk variables correctly, but that’s a one LOC fix. An outstanding issue is that tmk variables with whitespace in them are improperly expanded, e.g. $(foo bar) expands to bar) even when a variable named ‘foo bar’ exists. That, however, is because of how the tokenization feeds the parsed data into the variable expansion system. For the sake of simplicity, I planned long ago to make the specs dictate such variable names as invalid whether or not tmk actually accepts them, so it’s a lesser issue. The include processor directive also reports the wrong (i.e. parent) filename but the correct line number (i.e. from the included file), yet that bug can be fixed in a few minutes; so I haven’t bothered yet.

Two features remain to be done: making rules relational (by dependency) rather than executing them in sequence, and giving tmk the option of using checksums rather than modification times for minimising rebuilds. Which also forms part of the leg work for implementing a cache, hehe.

Most of what needs doing is some light polish and adding more builtin directives. I’ve been thinking about making tmk understand a simple plugin system that would allow it to load reasonably trusted bits of Python code into part of the program, thus allowing new directives to be added at will, as well as replaced. I’ll worry about that later though.
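The plugin idea could be as small as this (the register() hook is an assumed convention, not anything tmk actually defines): load a Python file by path and let it install or replace entries in a directive table.

```python
import importlib.util

def load_plugin(path, directives):
    # load an arbitrary python file as a module; the plugin is trusted,
    # so its top-level code runs with full access
    spec = importlib.util.spec_from_file_location("tmk_plugin", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # assumed convention: plugins expose register(directives) and add
    # or replace directive handlers in the table we pass in
    module.register(directives)
    return module
```

Replacing a builtin directive then costs nothing extra: the plugin just assigns over the existing table entry.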

A bit of fun with tmk

Yesterday I set up tmk to understand how to include files, by using a simple pre-processing directive, keeping the syntax for tmk fairly simple:

variable = value

rule lhs -> rule rhs:
        command directive! word arguments

Variable expansions and file name globbing are widely used as part of the design, while keeping the recipe parsing code fairly trivial. I designed it this way to be familiar to Make users (although lhs/rhs are reversed), fairly obvious at a glance, and easy to implement. Now tmk also understands the concept of a (pre-)processing directive. Since directive! is used within rules, the processing directives use the inverse, i.e. !directive. I believe that makes a better distinction than using a different symbol, especially since you cannot use a command directive everywhere a processing directive is permitted. Right now there is only one directive, !include, which slurps up its argument list as the name of a recipe file to include within the current one, before continuing of course. Some form of conditional will probably be added later on, even though the rule syntax gives a way to define certain looping behaviour.
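The !include handling reduces to something like this toy sketch (function name and parser details are mine; tmk’s real parser tracks filenames and line numbers too): lines starting with ! are handled at parse time, and !include splices the named recipe’s lines in before continuing.

```python
import shlex

def read_recipe(path, seen=None):
    seen = set() if seen is None else seen
    if path in seen:                    # guard against include cycles
        raise ValueError("recursive include: %s" % path)
    seen.add(path)
    lines = []
    with open(path) as fp:
        for line in fp:
            line = line.rstrip("\n")
            if line.startswith("!include"):
                # splice each included recipe in, in place
                for name in shlex.split(line)[1:]:
                    lines.extend(read_recipe(name, seen))
            else:
                lines.append(line)
    return lines
```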

Today’s goal is basically to implement the concept that if the outputs (lhs) are more up to date than the inputs (rhs), the rule can be skipped. I don’t expect that to take too long, since most of it is just screwing with the rule expansions in order to properly stat the files. If I can get it done expeditiously, I’ll also have time to set up a few late binding `special` variables that will be useful, in proper Make fashion lol.
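The skip test itself is the easy half; something like the following sketch (the function name is mine, not tmk’s). The key details are that a missing output forces a rebuild, and that the comparison is oldest-output against newest-input:

```python
import os

def can_skip(outputs, inputs):
    # a rule can be skipped only if every output is at least as new
    # as every input
    try:
        oldest_out = min(os.path.getmtime(p) for p in outputs)
    except OSError:
        return False                    # an output doesn't exist yet
    newest_in = max(os.path.getmtime(p) for p in inputs)
    return oldest_out >= newest_in
```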

At which point, tmk will be effectively as good as a generic make implementation, and I can work on the parts that need to follow on: namely the parallel support and portability aspects. The latter, of course, is the main reason why CMake, SCons, and most every other such tool I’ve tried has been rejected as more trouble than it’s worth.

An idiosyncrasy no one else gets

Whenever someone asks me how I am, I often phrase ‘and how are you?’ as ‘&you ?’, which is something usually lost on everyone. In the C programming language, the ampersand is the address-of operator, used to create a reference of sorts, and is integral to utilising pointers. So literally ‘address-of you ?’ makes a very explicit reference while remaining a syntactically correct substitution of ‘&’ for ‘and’, in English anyway.

If anyone finds that odd, just try not to think about how Lisp and Perl have impacted my brain over the years lol.

Visual C++ 2010 Express, new levels of weirdness…

As usual there is a redist.txt file in the Visual Studio root that explains which Microsoft-supplied files you’re allowed to redistribute with your applications. In MSVC 9.0, the express edition only came with release and (non-redistributable) debug assemblies of the C, C++, and managed code runtime libraries.

In looking at the files installed by the latest and strangest version yet, I see that there is no vcredist folder… instead there is a vccrt folder, containing a jumble of C and C++ that appears to be source code for some type of MS C/C++ runtime libraries :-/.

+1 simple relationals

On my way to the head, I was thinking about ways to improve the robustness of a program that I’ve been tinkering with on the side. Simply put, it defines an ordered set of roughly hierarchical data that is integral to later processing based on certain groupings of it. The set is such a collection of information that recalling it later would best be done through an associative container, wherein the keys may be any unique attribute of the data set being processed, rather than having to be any given accessor.

The obvious solution to bundling the data is to create an abstract record representing the keys that need querying, i.e. each object’s attributes are expected to correspond to a unique instance formed by the data set. In thinking about how such a thing might be implemented without losing the speed of a hashed table lookup, the first thought to come to mind was, of course, the most simple and straightforward idea. If the implementation language were C, it would be trivial to throw together a set of structures for representing items in the data set, and wrap them in an opaque record that binds together a group of hash tables or self-balancing BSTs for each key type we want, which would then look up a pointer to the individual records through a structure tuned for minimal memory usage. Second to come to mind was a rather interesting tree structure to minimise the cost of retrieving any given node. At which point, of course, I remembered that this particular implementation case dealt with a certain language that traded such memory consciousness for a garbage collector.

On my way back to my work station, I thunk to myself, “Well hell, I just defined a relational data structure!”, and with the idea of running an SQLite database in process memory now floating through my mind, sure enough the API that I had envisioned was little more than a relational algebra tuned to the problem domain being worked in.

The implementation language might lack an equivalent to C’s memory management, and gives no guarantee that the amount of copying and GC work involved wouldn’t grow exponentially with the data set size… but it does have a binding to the SQLite database, which is fairly handy, hehehe. So the obvious question is which way handles things more efficiently: relying on the language implementation to avoid unnecessary copying of memory, or going through the overhead of a lightweight SQL database.
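The in-process variant really is that cheap to try; a concrete sketch (schema and names are purely illustrative): an in-memory SQLite table with a UNIQUE index per key attribute gives lookup-by-any-unique-attribute essentially for free.

```python
import sqlite3

# ":memory:" keeps the whole database in process memory -- no file I/O
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE records (
                   id    INTEGER PRIMARY KEY,
                   name  TEXT UNIQUE,
                   tag   TEXT UNIQUE)""")
con.execute("INSERT INTO records (name, tag) VALUES (?, ?)",
            ("alpha", "a1"))

# the same row is reachable through either unique attribute,
# each backed by its own index
row_by_name = con.execute(
    "SELECT id FROM records WHERE name = ?", ("alpha",)).fetchone()
row_by_tag = con.execute(
    "SELECT id FROM records WHERE tag = ?", ("a1",)).fetchone()
```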

Sometimes I love how these kinds of things are so simple to work out in the course of a short trip down the hall, lol.

Today was a refreshing change: I spent most of it glued to my desktop, three command prompts open, with pidgin tabs, chrome tabs, and mplayer occasionally filling in the rest of the space :-D.

I basically rewrote the recipe parser for tmk, bringing it up to the specification. Generally I’ll avoid such leg work, except when it’s a simple grammar or no suitable tools are available; and tmk is very simple. Command directives and much of the expansion system were also implemented today, making tmk almost complete enough to compile a project. What needs doing is proper macro expansions (tmk vars don’t work yet, but the rest of the expression syntax does), so that `magic` variables akin to Make’s automatics can be created. Once that is done, creating the desired directives is a fairly trivial process; hooking them up to language/tool independent and developer serviceable backends being a piece of cake.
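The gist of those magic variables, sketched with names and structures of my own invention (tmk’s actual spelling for its automatics may differ): each magic name binds to a callable that is only evaluated when a rule is processed, much like Make’s $@, $^, and $<.

```python
def magic_scope(rule):
    # late-binding scope: nothing is computed until a lambda is called
    return {
        "out": lambda: " ".join(rule["lhs"]),   # like Make's $@
        "in":  lambda: " ".join(rule["rhs"]),   # like Make's $^
        "in1": lambda: rule["rhs"][0],          # like Make's $<
    }

def expand_magic(text, scope):
    # naive expansion of $(name) references against the magic scope
    for name, fn in scope.items():
        text = text.replace("$(%s)" % name, fn())
    return text
```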

While testing the directives code, I needed to come up with a temporary error message, one that had to be a bit absurd: it was intended for testing the domain in which the error applies, rather than being an already iron clad reason to use any given message. To that end, I came up with one that would really stick out among the other diagnostics: “{file name}:{line number}: the bad year blimp has landed”. This, as some might instantly see (;), is a slight homage to Mel Brooks’ Spaceballs, more specifically to the scene, “Uh oh, here comes the bad year blimp!”, in which Lone Starr calls for the switch to “Secret hyper jets”. And of course, I had to load that film into MPlayer and my DVD drive while I wrote code xD.

When developing something, I typically do a large amount of in-flight testing to verify the code’s behaviour. My efforts today have been no different, with testing time perhaps making up more than 60% of the time I spent working on tmk. It’s not unheard of to find XXX-marked comments denoting details about some edge case that hasn’t been covered, and a note to kick me in the head if that theoretical hotspot ever occurs; such things usually happen when I’m extremely pressed for think-time, due to (ofc) being interrupted every umpteen times. At which point, priority is usually on implementation speed over perfection. Today, I was lucky enough to be able to work largely uninterrupted: a rare pleasure.

For me, programming is a very relaxing effort. It’s one where I can absorb myself into the art and craft, designing programs bit by bit, constantly improving them with each iteration, until my aims have been achieved, or I’ve passed out. Despite being a very exhausting task to keep working at, I rarely find it to be an intellectually arduous one, so much as a test of endurance: to stay in the zone for long stretches and deal with the effort required for the, eh, shall we say more paltry and menial aspects of coding a decent program. There are some parts of programming that really do tax the brain; those are the fun bits to work on solving ;). On the other side of the coin, of course, a lot of things are more straightforward to sort out. Many aspects of programming overlap both the engineering and the every day groan of getting it done. Both are needed in any non-trivial application that you expect to see more than a couple hours of use.

That being said, being able to look at things from several vastly different directions does help a lot. Because of the amount of work needed, it seriously helps if you can stay in the zone for a good 6-8 hours or more per sitting; but becoming tunnel visioned on the creative and problem solving parts is generally a bad thing. One must slip gently through the mind of no mind, and know how to leave your box behind. There can be an insane amount of stuff to deal with in some programs, and I find that the level of such increases both with the complexity of the problem and the pitfalls of the language utilised. For example, C, Lisp, Perl, and Python can each express many problems quite well; yet each excels at expressing certain ideas more naturally, or with less effort, than the others. I actually found myself missing Lisp for a good hour today lol.

Days where I can just sit and focus on getting stuff done (in peace) are terribly rare here. When I get into work mode, just get out of my way, or I’ll be quickly annoyed. I don’t like it when people waste my stack space.

Tweaking my nose at the old API

In fooling around with the Windows API, I’ve just had an enjoyable moment of guffawing. As a quick test of the joystick stuff in winmm, I hooked up MM_JOY1MOVE to MessageBox() and ran the program under the debugger. It resulted in an endless stream of MessageBox() calls, hanging the Windows task bar, which took at least 25-30 seconds to recover after the program had finally overflowed the stack, been examined, and been terminated manually.

I almost died laughing lol.