Between sleeping on it and having a job that, frankly, leaves my mind free: I’ve come up with an idea for the tpsh problem.

The tokenizer is beautiful; the only problem is it does the expansions through a table of callbacks. Now, if I was to modify that so it instead attaches meta-data to each token, those expansions could be delayed until later without major redesign. It would be necessary to refactor the lexer in order to handle the change in data structure. The code generator can then be tasked with completing any expansions during generation, by expanding lexemes into corresponding code as it goes. Afterwards, redo the code generator into something more elegant and voila… we have what it should’ve been in the first place!
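Roughly what I have in mind, as a minimal sketch; the token layout and the expand_*/emit_code stubs below are hypothetical stand-ins, not tpsh’s actual internals:

# instead of calling an expansion callback immediately, the tokenizer
# just records *which* expansion a token still needs
my @tokens = (
    { text => 'echo',  expand => undef  },   # plain word: nothing deferred
    { text => '$HOME', expand => 'var'  },   # variable expansion, deferred
    { text => '*.txt', expand => 'glob' },   # glob expansion, deferred
);

# later, the code generator completes the expansions as it emits code
for my $tok (@tokens) {
    my $lexeme = $tok->{text};
    if (defined $tok->{expand}) {
        $lexeme = expand_var($lexeme)  if $tok->{expand} eq 'var';
        $lexeme = expand_glob($lexeme) if $tok->{expand} eq 'glob';
    }
    emit_code($lexeme);   # hand the finished lexeme to the generator
}

# trivial stand-ins, just to make the sketch self-contained
sub expand_var  { my ($w) = @_; $w =~ s/\$(\w+)/$ENV{$1}/g; return $w }
sub expand_glob { my ($w) = @_; return join ' ', glob $w }
sub emit_code   { print "emit: @_\n" }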

A good night’s sleep always helps the code[r]

A small shoot-yourself-in-the-foot coder’s moment.

Being bored and lacking further ops I could get done before bed, I picked up on tpsh again. Looking for a quick challenge, I noted that the git repo was still on the ‘codegen’ branch: basically, a branch to test the idea of generating the execution code on the fly per command sequence.

As a quickie of interest, I picked up the generation phase for the for-loop, and then I hit a roadblock: my shell expands variables, globs, and aliases early, during tokenization. The reason being, the input field separator ($IFS) and quoting rules determine how this shell splits text into ‘words’ or tokens for the execution mechanisms; you could say the shell is “on the way there”, and thus deals with expansion as it comes. Currently tpsh handles environment variables by fooling around with the program’s own idea of the referenced environment variables (%ENV), without any distinction between exported and unexported variables. My intention has been to switch to a more controllable interface for shell variables at a later date, since it is kind of a low-yield concern at this stage of development.
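For contrast, the kind of “more controllable interface” I have in mind would keep shell variables in their own table and copy them into %ENV only on export. A rough sketch (%shell_vars and these helper names are hypothetical, not tpsh’s actual code):

# hypothetical: shell variables live apart from the process environment
my %shell_vars;

sub set_var {
    my ($name, $value) = @_;
    $shell_vars{$name} = $value;        # unexported: visible to the shell only
}

sub export_var {
    my ($name) = @_;
    $ENV{$name} = $shell_vars{$name};   # exported: visible to child processes
}

sub get_var {
    my ($name) = @_;
    # shell variables shadow the inherited environment
    return exists $shell_vars{$name} ? $shell_vars{$name} : $ENV{$name};
}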

I see several choices:

a/ redesign how things work (obviously this is the whole story, lol), saving the issue until later when other components have matured to match.

b/ leave expanding environment variables (etc.) until the last minute; I don’t like this idea.

c/ have some way of retaining things that cannot be confirmed until later, with an indicator to strip out or expand the remainders at the last minute (this gives me visions of ugly code)

d/ incorporate the code generator closer into the process, so that things get expanded only after they have been confirmed, but generated as soon as possible (a more multiple-pass focused design comes to mind).

Option a is basically worry-later, and see if the other things that need doing either fix or exacerbate the problem (a double-edged flaming sword). Option b is possible, but would take a lot of reworking, and IMHO would result in an ugly LA (lexical analysis) phase and become prone to introducing bugs into the final results. Option c sounds simple enough at first glance, but I do not see a method that I’m willing to live with. As for d, I worry about how easily it could confuse readers, and what damage a slip-up in it could do to the results.

For now, I intend not to worry about this minor issue until after variable handling matures, because I really love how expand_quotes() works; it is the best part of the whole program, IMHO. Needless to say, tpsh has had poor handling of shell/environment variables throughout its development, since growing that code can wait longer than the other parts.

Not to mention the fact that tpsh has mostly been developed under sleep deprivation in the first place…. lol

edit

In the time it took to submit the entry, type ‘shutdown -p now’, put away the computer, and take a quick leak, I came up with another solution: give the code that expands variables an understanding of how variables are defined, rather than only how they are referenced. Not only would checking whether a referenced variable was just defined in the same set of input work with ‘for X’ and ‘for X in …’ like constructs, it could also be used to implement the ‘VAR=… … command …’ syntax at a later stage 🙂

The way expand_quotes() invokes the other expand_* procedures would need adjustment before the syntax of prefixing commands with variable settings could work, yet implementing the for-loop this way would be trivial: anything that would cause the statement to get broken into an unusable token set before defining said variable and attempting variable expansion on it would also be a usage that gets around for’s keyword status!
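Something along these lines, as a rough sketch (%defined_here and these routine names are hypothetical stand-ins for tpsh’s actual expansion code):

# remember variables defined within the current input set
my %defined_here;

sub note_definition {
    my ($name) = @_;
    $defined_here{$name} = 1;
}

# the variable expander consults that record before expanding eagerly
sub expand_variable {
    my ($name) = @_;
    if ($defined_here{$name}) {
        return "\$$name";   # defined in this input set: leave it for run time
    }
    return defined $ENV{$name} ? $ENV{$name} : '';   # expand now, as today
}

With that record in place, tokenizing ‘for X in a b c; do echo $X; done’ would call note_definition('X') before ever attempting to expand $X.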

Problem solved with less fuss, maybe? It is amazing what you can think of while taking a piss!

reStructuredText, interesting…

Hmm, reStructuredText / rst is looking very interesting the closer I look at it; maybe the next best thing to Plain Old Documentation / POD.

I could see this stuff possibly displacing a major chunk of my use for LaTeX and troff. Since I have little need for typesetting mathematics or other complex stuff (think troff p*-processors), it could be quite valuable! Of course, I would have to mate it with the m4 macro processor (or roll my own, or just abuse and filter the C Pre-Processor hehe) to get exactly what I would want, but hey, it would work pretty nicely!
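For anyone unfamiliar with it, rst markup is about as lightweight as it gets; a quick taste:

Section Title
=============

A plain paragraph with *emphasis* and ``literal text``.

- bullets look like bullets
- which is rather the point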

It would have one chief advantage over LaTeX: not even the people I have to share documents with on occasion could foul it up toooo badly… lol.

Oh give me a project ….

Oh, give me a project, where the perls roam,
Where the join() and the split() play,
Where seldom is heard a discouraging error,
And the Makefiles are not cloudy all day.

Project, project on the web
Where the map() and the grep() play
Where seldom is heard a discouraging warning
And the lusers are not cloudy all day

How often at night when the heavens are bright
With the light from the glittering stars
Have I stood there amazed and asked as I gazed
If their glory exceeds that of ours

Project, project on the web
Where the push() and the pop() play
Where seldom is heard a discouraging error
And the Makefiles are not cloudy all day

Where the code is so pure, the expressions so free
The operators so balmy and light
That I would not exchange my project on the web
For all of the jobs so bright

Project, project on the web
Where the reverse() and the sort() play
Where seldom is heard a discouraging error
And the warnings are not cloudy all day

Oh, I love those wild regex’rs in this dear land of ours
The lusers, I love to hear scream
And I love the white $scalars and the @list flocks
That deref on the mountaintops green

Project, project on the web
Where the substr() and the eval{} play
Where seldom is heard a discouraging die $!
And the warnings are not cloudy all day

— missing my favorite language

GCC spitting error: stray ‘1’ in program (etc)

Ok, so I am wondering why the bloody heck I’m getting messages like main.o:x:y: error: stray ‘1’ in program and related errors * near infinity whilst knitting object files into an executable.

Note to self: never allow your Makefile to say g++ -x c++ foo.o … -o program !!!!

I don’t know what is worse….

(EDIT: actually this reminds me, a friend said it sounded like an encoding error; and in retrospect it looks like GCC was interpreting main.o as a C++ file because of the -x flag lol)
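For the record, that flag is positional: -x applies to every input file after it until the next -x, object files included. A sketch of the offending line and its fix (file names here are placeholders):

# wrong: -x c++ tells GCC to treat every following file as C++ source,
# so the object files get parsed as (binary) C++ code
g++ -x c++ main.o foo.o -o program

# right: drop the flag when linking, or reset it with -x none
g++ main.o foo.o -o program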

Icon, the programming language?

http://en.wikipedia.org/wiki/Icon_(programming_language)

Geeze, it looks like a nifty language!

http://en.wikipedia.org/wiki/Unicon_programming_language suggests something exceptionally fun to work with as well…

Delightfully enjoyed this article

http://freshmeat.net/articles/stop-the-autoconf-insanity-why-we-need-a-new-build-system

I can also sympathize with the fictitious Joe and Jane in the examples; I have no love for the GNU build system / autotools. I’ve also had to wade through ugly auto* files and outdated documentation on occasion ;). The only part of autotools I do like is GNU Make: because it is the most portable make implementation available, short of limiting things to a subset of the standard syntax.

I don’t quite understand the author’s comments about m4, because it is a pretty simple tool. Heh, I still remember watching The One on TV one night, and interspersing it with learning the m4 macro processor. IMHO m4 is an incredibly useful tool: being a fairly generic (yet expressive) macro processor lends itself to virtually any task that can benefit from pre- or post-processing. Although to be fair, for what most (smart) people would use m4 to do, I typically (ab)use the C Pre-Processor (cpp) into doing for me (^_^)/. The main reason I avoid using m4 is that I can never seem to count on a *consistent* set of behaviours wherever m4 can be found/ported. The last time I required m4 for a project, I ended up in a “F it, I’ll make a C||C++ compiler a dependency” like situation, because the platform’s m4 would not behave IAW the norm. It is a shame really; GNU M4 adds some nifty features (and even more shamefully, it was a ported/tweaked version of GNU M4 that was the problem child in the aforementioned situation lol)
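For anyone who has not touched it, this really is about all there is to the core of m4:

define(`GREET', `Hello, $1!')dnl
GREET(`world')

Feed that to m4 and out comes Hello, world!; most of the rest is quoting discipline and the built-in macros layered on top.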

When it comes to building things of my own, I usually create a Makefile; the exception being Qt-based stuff, in which case I generate makefiles with qmake ;). I’ve also considered implementing a Perl script that automagically does the right thing (or should I say, infers the right thing) through a quick bit of build rules written in XML; but why do that, when there is a tool like ant? I personally like makefiles and GNU Make; then again, I’ll put up with virtually any make with $() and documented inference rules… hehe
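By “$() and documented inference rules” I mean nothing fancier than this kind of portable Makefile (a sketch; the file names are made up):

CC     = cc
CFLAGS = -O2
OBJS   = main.o util.o

program: $(OBJS)
	$(CC) $(CFLAGS) $(OBJS) -o program

# the classic suffix (inference) rule: how to build any .o from its .c
.c.o:
	$(CC) $(CFLAGS) -c $<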

SCons is something that has been increasingly interesting to me, but unfortunately time constraints mean writing custom makefiles is a more economical use of time than learning a new tool like SCons :-/. Likewise, the main reason I have never adopted the Boost libraries is no time to fiddle with their build tool, which also interests me.

Random thoughts on working with XML

It’s no real secret that I love XML but truly hate working with XML parsers in general (^_^).

Xerces-C++ and libxml++ are not too bad, but I have never met a parser that I love. The main reason I chose Xerces was the painlessness of compiling and linking against the library; I really do not want to go through the bother of setting up libxml++ in MSVC. Especially when taking a look at the pkg-config output on my workstation:

FreeBSD$  pkg-config libxml++-2.6 --cflags --libs                      18:07
-I/usr/local/include/libxml++-2.6 -I/usr/local/include/libxml++-2.6/include
-I/usr/local/include/libxml2 -I/usr/local/include -I/usr/local/include/glibmm-2.4
-I/usr/local/lib/glibmm-2.4/include -I/usr/local/include/sigc++-2.0
-I/usr/local/lib/sigc++-2.0/include -I/usr/local/include/glib-2.0
-I/usr/local/lib/glib-2.0/include -L/usr/local/lib -lxml++-2.6 -lxml2 -lglibmm-2.4
-lgobject-2.0 -lsigc-2.0 -lglib-2.0

SAX, DOM, or whatever else, the parser style doesn’t really matter to me that much, as long as it gets the job *done*. Although obviously, I am more familiar with DOMs (thank you JavaScript). I tend to use XML for storing structured data without having to resort to a binary file/database, or a curmudgeon of files within a zip archive. So operations tend to be very straightforward, using a couple of glue functions.

Personally, my idea of fun XML parsing is to take data like this as input:

<rootnode>
  <child1 attr="val">string of text</child1>
  <child1>
    <child2>another string of text</child2>
  </child1>
</rootnode>

and in turn receive a nested data structure like this as output:

# example in Perl
my $structure = {
    node       => 'rootnode',
    attributes => undef,
    data       => [
        {
            node       => 'child1',
            attributes => { attr => 'val' },
            data       => 'string of text'
        },
        {
            node       => 'child1',
            attributes => undef,
            data       => [
                {
                    node       => 'child2',
                    attributes => undef,
                    data       => 'another string of text'
                }
            ]
        }
    ]
};

Probably because that is how my brain sees the preceding XML xD.

Not to mention it makes writing something like a pretty printer easy as pi:

# for some reason, writing this subroutine was very relaxing...
sub pp_xml {
    my $xhr    = shift;
    my $depth  = shift;
    my $indent = sub { "\t" x shift };
    my $name   = $xhr->{node} or warn "XML node has no data!\n";

    # build the opening tag, attributes and all
    my $tag = $name;
    if ($xhr->{attributes}) {
        while (my ($attr, $val) = each %{$xhr->{attributes}}) {
            $tag .= " " . $attr . "='" . $val . "'";
        }
    }
    print $indent->($depth), '<', $tag, '>', "\n";

    # recurse into child nodes, or print the text content
    $xhr = $xhr->{data};
    if (ref $xhr eq 'ARRAY') {
        pp_xml($_, $depth + 1) foreach @$xhr;
    } else {
        print $indent->($depth + 1), $xhr, "\n";
    }
    print $indent->($depth), '</', $name, '>', "\n";
}

pp_xml($structure, 0);

Making it accept a callback indent function as a 3rd argument is left as an exercise for others who are equally in need of R&R 8=).

Terry@dixie$ perl -Mstrict /tmp/xml.pl -Mwarnings                         21:57
<rootnode>
	<child1 attr='val'>
		string of text
	</child1>
	<child1>
		<child2>
			another string of text
		</child2>
	</child1>
</rootnode>

quick note to self

once tpsh’s implementation of shell script is more mature: transition the Windows machine to running a shell script on init, rather than the Startup system used by Windows NT, and compare performance.

Why commercial EULAs are stupid.

1. INSTALLATION AND USE RIGHTS.

a. Installation and Use. You may install and use any number of copies of the software on your devices.

….
4. BACKUP COPY. You may make one backup copy of the software. You may use it only to reinstall the software.

— from Microsoft DirectX SDK (March 2009) EULA

Maybe it is because I don’t deal in legalese daily, but I am still laughing 😀