Wednesday, June 24, 2009

A small shoot-yourself-in-the-foot coder's moment.

Being bored and lacking further ops I can get done before bed, I picked up on tpsh again. Looking for a quick challenge, I noted that the git repo was still on the 'codegen' branch. Basically, it's a branch to test the idea of generating the execution code on the fly per command sequence.

As a quickie of interest, I picked up the generation phase for the for-loop. Then I hit a roadblock: my shell expands variables, globs, and aliases early, during tokenization. The reason is that the input field separator ($IFS) and quoting rules determine how this shell splits text into 'words' or tokens for the execution mechanisms; you could say expansion is "on the way there", so tpsh deals with it as it comes. That means a reference to the loop variable gets expanded before the for-loop has had a chance to define it. Currently tpsh handles environment variables by fooling around with the program's own idea of the referenced environment variables (%ENV), without any distinction between exported and unexported variables. My intention has been to use a more controllable interface for shell variables at a later date, since it is kind of a low-yield concern at this stage of development.
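To make the roadblock concrete, here is a minimal sketch in Python. It is not tpsh's code: `expand_and_split`, its parameters, and the plain dict standing in for %ENV are all made up for illustration, and it ignores quoting, globs, and aliases entirely. It only shows what happens when `$NAME` references are expanded before the words are split on the IFS characters.

```python
import os
import re

# Hypothetical pattern for a simple $NAME reference (no ${...} forms).
VAR_REF = re.compile(r"\$(\w+)")

def expand_and_split(line, env=None, ifs=" \t\n"):
    """Expand $NAME references first, then split into words on IFS characters."""
    env = os.environ if env is None else env
    # Unset variables expand to the empty string, as in a typical shell.
    expanded = VAR_REF.sub(lambda m: env.get(m.group(1), ""), line)

    words, current = [], ""
    for ch in expanded:
        if ch in ifs:
            if current:
                words.append(current)
                current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return words

# The loop body's $X is gone before 'for' ever gets to define X.
print(expand_and_split("for X in a b; do echo $X; done", env={}))
```

With an empty environment, the `$X` in the loop body has already been expanded to nothing by the time any for-loop code generation could look at the words.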

I see several choices:

a/ redesign how things work (obviously this is the whole story, lol), saving the issue until later when other components have matured to match.

b/ leave expanding environment variables (etc) until the last minute; I don't like this idea.

c/ have some way of retaining things that cannot be confirmed until later, with an indicator to strip out or expand the remainders at the last minute (this gives me visions of ugly code).

d/ incorporate the code generator closer into the process, so that things get expanded only after they have been confirmed, but generated as soon as possible (a more multiple pass focused design comes to mind).

a is basically a worry-later, and see if other things that need doing either fix or exacerbate the problem (this is a two-edged flaming sword). b is possible but would take a lot of reworking crap, and IMHO would result in an ugly LA phase and be prone to introducing bugs into the final results. c sounds simple enough at first glance, but I do not see a method that I'm willing to live with. I worry about how easily d could confuse readers, and what damage a slip-up in it could do to the results.

For now, I intend to not worry about the minor issue until after variable handling matures, because I really love how expand_quotes() works; that is the best part of the whole program IMHO. Needless to say, tpsh has poor handling of shell/environment variables and has had it throughout its development, since growing that code can wait longer than the other parts.

Not to mention the fact that tpsh has mostly been developed under sleep deprivation in the first place.... lol


In the time it took to submit the entry, type 'shutdown -p now', put away the computer, and take a quick leak, I came up with another solution. Give the code that expands variables an understanding of how they are defined, rather than only how they are referenced. Not only would checking whether a referenced variable was just defined in the same set of input work with 'for X' and 'for X in ...' like constructs, it could also be used to implement the 'VAR=... ... command ...' syntax at a later stage :-)
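A rough sketch of that idea, again in Python rather than anything resembling tpsh's real expand_* procedures. The names `defined_in_input` and `expand_known` are hypothetical, the input is assumed to be already split into words, and only the 'for X' case is covered (the 'VAR=... command' prefix is left out). The point is just that the expander first asks which variables this input defines for itself, and leaves references to those alone for a later stage.

```python
import re

# Hypothetical pattern for a simple $NAME reference.
VAR_REF = re.compile(r"\$([A-Za-z_]\w*)")

def defined_in_input(words):
    """Names this input defines itself; only 'for NAME ...' is recognised here."""
    if len(words) >= 2 and words[0] == "for":
        return {words[1]}
    return set()

def expand_known(words, env):
    """Expand $NAME from env, except names defined by the input itself."""
    deferred = defined_in_input(words)

    def replace(match):
        name = match.group(1)
        if name in deferred:
            return match.group(0)  # keep "$NAME" as-is for later expansion
        return env.get(name, "")   # unset variables expand to nothing

    return [VAR_REF.sub(replace, word) for word in words]

words = ["for", "X", "in", "$LIST", ";", "do", "echo", "$X", ";", "done"]
print(expand_known(words, {"LIST": "a"}))
```

Here `$LIST` is expanded right away while `$X` survives untouched, which is the behaviour the for-loop generator would need.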

The way expand_quotes() invokes the other expand_* procedures would need adjustment before the syntax of prefixing commands with variable settings could work, yet implementing the for-loop in it would be trivial, since anything that would cause the statement to get broken into an unusable token set before defining said variable and attempting variable expansion on it would also be a usage that gets around for's keyword status!

Problem solved with less fuss, maybe? It is amazing what you can think of while taking a piss!
