An epiphany!

3. a sudden, intuitive perception of or insight into the reality or essential meaning of something, usually initiated by some simple, homely, or commonplace occurrence or experience.

I actually thanked my mother for dragging me out on a shopping expedition today, because I figured out something a ways “down” my todo list: implementing control flow in tpsh. I guess you can take the programmer away from the code, but you can’t take the code away from the programmer ^_^ ^_^ ^_^.

Earlier I recorded that the goal was for tpsh to use a queue of commands to execute rather than going by lines, the idea being:

`cmd1; cmd2; cmd3`

would parse into a list like ‘(cmd1, cmd2, cmd3)’ which tpsh would then walk, eval, and execute in sequence (i.e. first in, first out). Then it occurred to me: if the shell is going that route for lexical reasons, why not implement the shell’s control flow operators as keywords that manipulate that queue directly? So that conceptually, a shell snippet like:

cmd0
if [ expr ]; then
    cmd1
    cmd2
else
    cmd3
    cmd4
fi
cmd5

would be inserted into the queue in a suitable manner and passed to the callback, so if the queue looked something like this:

( cmd0, if [ expr ], cmd1, cmd2, else, cmd3, cmd4, fi, cmd5 )

the callback would receive the relevant portion of the queue (i.e. queue elements 1 through 7), evaluate the test, and return the appropriate portion to be spliced into the queue, in this case:

( cmd0, cmd1, cmd2, cmd5 ) or ( cmd0, cmd3, cmd4, cmd5 )

depending on whether `[ expr ]` evaluated as true or false.
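Conceptually the if-handler would be a callback that gets handed its slice of the queue and hands back only the branch that should run, which the main loop then splices back in. Here is a minimal sketch of that idea; run_test() is an assumed helper that evaluates the `[ expr ]` test, and none of these names are tpsh’s actual ones:

# hypothetical sketch: given the if..fi slice of the queue, return only the
# branch that should execute, so the caller can splice it back into the queue
sub handle_if {
    my @slice = @_;     # e.g. ('if [ expr ]', 'cmd1', 'cmd2', 'else', 'cmd3', 'cmd4', 'fi')
    my ($test) = $slice[0] =~ /^if\s+(.*)/;

    # locate the else/fi keywords so we know where each branch lives
    my ($else, $fi);
    for my $i (1 .. $#slice) {
        $else = $i if !defined $else and $slice[$i] eq 'else';
        $fi   = $i if $slice[$i] eq 'fi';
    }

    my $end = defined $else ? $else : $fi;
    return @slice[1 .. $end - 1] if run_test($test);              # then-branch
    return defined $else ? @slice[$else + 1 .. $fi - 1] : ();     # else-branch, if any
}

# the main loop would then do something like:
#   splice @queue, $if_idx, $fi_idx - $if_idx + 1,
#          handle_if(@queue[$if_idx .. $fi_idx]);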

Damn, this so makes me want to straighten up the necessary parts of tpsh… now if only I didn’t have to work tomorrow from morning to whenever I drop over undead, or pass out lol. These kinds of things have always fascinated me about programming, but life has never offered much chance to study compilers, virtual machines, and interpreters… let alone create something like tpsh, which requires implementing a scripting language :. The one good thing: despite the ups and downs of sh script’s expressiveness, it’s really an easy language for a human to understand.

hmm, there was something I wanted to add a `set -o option` for, but I can’t remember what the heck it was lol.

man it’s been a slow day… I’ve only made about 15~16 commits on tpsh, one of which was merging branches lol. The main focus of today’s tinkering has been minor bugfixes, mostly related to tweaking the completion function.

Previously my completer took a quick approach, always doing completions for built-in commands, stored macros, known programs, and filenames. Now it has some notion of *what* it should complete:

$ b^I -> assume it can be anything
$ echo b^I -> assume it’s a file (formerly a full eval() was done)
$ builtin l^I -> complete built-in commands
$ alias x^I -> complete to names in %Macros.
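In other words, the completer now peeks at the first word on the line before deciding what to offer. A rough sketch of that dispatch (the complete_*() helpers are made-up stand-ins, not tpsh’s actual subs):

# rough sketch of context-sensitive completion dispatch; the complete_*()
# helpers are hypothetical stand-ins for the real completion lists
sub do_completion {
    my ($line, $word) = @_;
    my ($first) = $line =~ /^\s*(\S+)/;

    # still on the first word: it could be anything
    return complete_anything($word)
        unless defined $first and $first ne $word;

    return complete_builtins($word) if $first eq 'builtin';
    return complete_macros($word)   if $first eq 'alias' or $first eq 'unalias';
    return complete_filenames($word);   # default: assume a filename argument
}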

When I get the time, I want to make the completion function have a more hot-plug-in nature to it: a pre-completion hook, completion hook, and post-completion hook; the pre getting to modify the thing before completion, the completion hook being able to totally replace do_completion() aside from the pre/post hooks, and the post-hook having a chance to modify the completions being returned. The current working idea is a TPSH_WHAT_HOOK environment variable being set to A. Perl code to generate a CODE ref from, or B. an external program to delegate to (e.g. TPSH_COMPLETION_HOOK="compl.rb" would pass compl.rb the necessary data, and expect the program to generate a newline-delimited list on its stdout for tpsh to parse). Odds are there will be more variables to tune it with later, as well as the available API being documented in the manual page.

My main reason for desiring this style of dynamically configurable completion is so I could do things like setting up my tpshrc to “ignore” any ‘lost+found’ entries in filename completions; i.e. /usr/local/l^I expands to /usr/local/local, not /usr/local/lo – completion list for local / lost+found. I know zsh (which I use a lot) is noted for its configurable completion and spelling correction, but honestly I’ve never customized it beyond the lines zsh’s setup program added to my rc, let alone RTFM’d about it lol.
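To make the hook idea a bit more concrete, here is roughly how the lookup might go. This is a speculative sketch; the ($line, $word) calling convention and everything else in it are assumptions, not settled tpsh behaviour:

# speculative sketch: resolve TPSH_COMPLETION_HOOK into a CODE ref, either by
# eval'ing the value as Perl or by wrapping an external program that prints a
# newline-delimited completion list on stdout
sub resolve_completion_hook {
    my $spec = $ENV{TPSH_COMPLETION_HOOK} or return;

    if ($spec =~ /^\s*sub\s*\{/) {              # looks like Perl source
        my $code = eval $spec;
        return $code if ref $code eq 'CODE';
        warn "bad completion hook: $@";
        return;
    }

    # otherwise treat it as an external program to delegate to
    return sub {
        my ($line, $word) = @_;                 # assumed hook arguments
        open my $fh, '-|', $spec, $line, $word or return;
        chomp(my @completions = <$fh>);
        close $fh;
        return @completions;
    };
}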

Making it easy to add command-sensitive completions through the hooks would be nice too, e.g. `man 2 foo^I` and have a hook spot the `man 2` and go about completing foo to manual pages in section 2, hehe. (the only zsh feature I abuse, actually lol)
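Something like this is what I have in mind, purely hypothetical and riffing on the same assumed ($line, $word) interface as the sketch above:

# hypothetical hook: spot `man <section> <word>` and complete against that
# manual section instead of letting the default completer guess a filename
sub man_section_hook {
    my ($line, $word) = @_;
    return unless $line =~ /^man\s+(\d)\s+\S*$/;
    my $sec = $1;

    my @pages;
    for my $dir (grep { -d } "/usr/share/man/man$sec", "/usr/local/man/man$sec") {
        opendir my $dh, $dir or next;
        for my $page (grep { /^\Q$word\E/ && !/^\./ } readdir $dh) {
            $page =~ s/\.\Q$sec\E.*$//;         # strip .2, .2.gz, and friends
            push @pages, $page;
        }
        closedir $dh;
    }
    return @pages;                              # empty list: fall back to the default
}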

Really what needs doing atm in tpsh is cleaning up the lex/eval subroutines and merging them, along with a few related changes, into resolve_cmd(). Basically the principal adjustment needed is to go from line-based handling of commands to (properly) breaking them lexically for storage in an execution queue. Generally though I like how stuff’s coming along, except for a few odds and ends here and there.

At one point, tpsh relied on a list of “this command marks something to skip macro expansion on”, a quick solution until expand_aliases() could be rewritten. One night I found myself working on a scratch file with commands to modify said list (add, rm, and show the non-expanding commands list via expr or array slice/index). That’s when I threw my hands up in disgust, and decided tomorrow’s job would be fixing expand_aliases (note to self: s/aliases/macro/). Imagine if you tried to type ‘unalias ls’ and the shell expanded ls to its aliased value, that kinda stuff. Kinda cool to be able to configure that, but kinda wrong to need a linear lookup on each pass through the loop (ugh!).

Adding built-in commands to manipulate said list rather than fixing the problem was just not something I was willing to do… because to my eyes, it would be like a pox on my honour as a programmer: to gamma-weld such a cheap trick into place with built-in commands to configure and abuse it rather than fix it. Thus the macro expansion system got re-implemented, much more correctly this time (didn’t have time to sort out positional params and whatnot).

I am also VERY glad that tpsh is being developed with git, rather than CVS! (I’ve also spent part of tonight learning more about git’s plumbing.)

Been working over SSH from SAL1600 today, since my room’s flooded out with clothes racks :. This crappy wireless ain’t helping either; just to keep things stable I’ve had to drop from AES encryption to Blowfish and add compression… either the signal utterly blows these days, or PuTTY must leak a lot of resources, methinks.

I cloned master in /srv/git/Projects/tpsh to /tmp/tpsh to do a little work, when surprise surprise… OpenBSD’s perl barfed at Getopt::Long::GetOptionsFromArray. A cursory inspection of installed Perl modules around the network & change logs showed my worst fear was right: it’s an experimental function added 2 years ago. After inserting a band-aid, and doing quite a bit of testing to avoid regressions… I also fixed a few other bruises and found a few things Perl 5.8.8 / use warnings didn’t like very much.
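For the curious, the band-aid boils down to something like this (a sketch of the approach, not the literal commit): when GetOptionsFromArray isn’t there, fall back to plain GetOptions() on a localized @ARGV.

use Getopt::Long ();

# sketch of the compatibility band-aid: GetOptionsFromArray only exists in
# newer Getopt::Long releases, so older installs get the localized-@ARGV path
sub getopt_compat {
    my ($args, %spec) = @_;
    if (defined &Getopt::Long::GetOptionsFromArray) {
        return Getopt::Long::GetOptionsFromArray($args, %spec);
    }
    local @ARGV = @$args;
    my $ok = Getopt::Long::GetOptions(%spec);
    @$args = @ARGV;                 # hand back whatever wasn't consumed
    return $ok;
}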

The birth of the “porting” branch, lol.

After taking care of pushing that out to the shared repos, I checked out master and implemented the eval built-in, so before I start coding tonight on my laptop I’ll need to update the master branch there from the one on the server, and merge in the changes from the porting branch.

Currently I’ve got three versions of Perl installed: 5.8.9 on the FreeBSD laptop (the authoritative git repository), 5.10.0 (ActivePerl) / 5.8.8 (from msys-git) on Windows XP, and 5.8.8 on the OpenBSD machine (home of the shared repository / insurance policy).

if this infernal wireless goes out once more…. I might just start working from my laptop on the couch, and the hell with this P.O.S. At least then, I wouldn’t have to SSH to get work done ^_^.

commit b841dc4954c24d0abea43daf407b6bf70e1c450b
Author: Terry ******* ****** <***********@****.***>
Date: Wed Mar 25 07:15:31 2009 +0000

massively improved alias expansions

aliases now expand recursively until resolved or aborted. A circular
alias like x = y; y = z; z = x; will resolve x to x when it hits z. The
expansion of aliases should behave more or less as desired, but without
positional parameter support

tpsh:
$ alias x='y -opts P'
$ alias y='z -flags PP'
$ alias z='echo z PPP'
$ x one two three
z PPP -flags PP -opts P one two three
zsh:
$ alias x='y -opts P'
$ alias y='z -flags PP'
$ alias z='echo z PPP'
$ x one two three four
z PPP -flags PP -opts P one two three four

At the moment the shell local()’s %ENV before each expansion, and will
likely set 0..$#, $#, $@, and $* accordingly someday; but currently does
not use %ENV for anything. In order to allow macros to change the
environment, we can’t just local() %ENV to implement positional params
for macros, but it will be a suitable stop-gap until done fully.

note: alias, macro, and function all revolve around %Macros and
&expand_aliases for macro expansion.

the next big chore is improving the code that invokes expand_aliases(), lol.
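For illustration, the heart of a “recurse until resolved or aborted” expansion might look something like this (a sketch only, not the actual expand_aliases() code):

our %Macros;    # name => replacement text, as noted in the commit above

# sketch: expand the command word recursively, bailing out when we see a name
# we've already expanded (the circular x -> y -> z -> x case from the commit)
sub expand_macro {
    my ($word, %seen) = @_;
    return $word unless exists $Macros{$word};
    return $word if $seen{$word}++;
    my ($head, @rest) = split ' ', $Macros{$word};
    return join ' ', expand_macro($head, %seen), @rest;
}

# with x='y -opts P', y='z -flags PP', z='echo z PPP', expand_macro('x')
# yields 'echo z PPP -flags PP -opts P', matching the tpsh/zsh output above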

another (big) section of the manual written, ENV processing, and an initial implementation of the history built-in, among a few other things, are done.

One interesting thing: usually sh only allows a single file in $ENV, and some versions of sh don’t even understand it, period! In tpsh, as an interesting extension, $ENV is treated like $PATH, insofar as ENV=/etc/tpshrc:/usr/local/etc/tpshrc:~/.tpshrc would cause tpsh to source an rc file in /etc, /usr/local/etc, and then the user’s home directory.

(Because of old OSes using drive letter:path, like C:\Windows, under such an OS tpsh uses ‘;’ instead of ‘:’ to separate things like $PATH)
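The gist of the extension is just splitting $ENV the same way $PATH gets split; a rough sketch, where source_file() is an assumed stand-in for tpsh’s rc-sourcing code:

# sketch: treat $ENV as a path of rc files, using ';' on drive-letter systems
my $sep = $^O =~ /MSWin32|dos/i ? ';' : ':';
for my $rc (split /\Q$sep\E/, defined $ENV{ENV} ? $ENV{ENV} : '') {
    $rc =~ s{^~}{$ENV{HOME}} if defined $ENV{HOME};     # crude ~ expansion
    source_file($rc) if -f $rc;                         # assumed helper
}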

casual fun with the Perl profiler

call: dprofpp -ap shift.pl

There are three sets of data, each meant to represent a small, medium, or large set. Each set is a list of words 5, 25, and 100 words long respectively (realistically the elements would average within 2 to 5 words inclusive). For simplicity N is 10.

There are 2 functions, xx and yy, representing different ways of solving the same problem: pretty-printing the last N items of a given data set. In xx(), the set is reversed and then $#set -= N’d to clip all but the last N items, then reversed again to put it back into proper order. In yy() we avoid any reversals and just shift off the front of the list one at a time, until we reach N items left in the set. If the set contains fewer than N elements, no adjustment need be done.
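For reference, the two approaches boil down to roughly this (a sketch of the test subs, not the shift.pl source; @_ is the data set and $N the cutoff):

our $N = 10;    # as in the first round of tests

# xx: reverse, clip down to the first $N of the reversed list, reverse back
sub xx {
    my @set = @_;
    if (@set > $N) {
        @set = reverse @set;
        $#set = $N - 1;
        @set = reverse @set;
    }
    print "@set\n";
}

# yy: no reversals, just shift off the front until $N elements remain
sub yy {
    my @set = @_;
    shift @set while @set > $N;
    print "@set\n";
}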

Each function is called 3 times per iteration, once with each data set, over 3000 iterations (that’s 9000 calls to each function, or 3000 times with each set). The test was then executed 10 times.

The thing’s so simple, it’s not important how long it takes to finish, but I’m interested in how big the difference is between for (expr1; expr2; expr3) { shift @list } and $#list -= expr, and how much those two reversals hurt.

Every time, yy() ran faster by at least a half second. Then I ran tests with xx() doing one reversal, then no reversals, and yy() still beat it.

Now out of more curiosity, let’s see how larger data sets work. Each element now contains 3 words instead of 1, and N is now 43; with the data sets being 5, 25, 100, 250, 500, and 1000 elements long.

A new function, zz(), which is xx() without the reversals, is also executed during the tests. After running the tests for a short duration, it seems that the $#set -= N’ing is a bit faster, more so than the cost of the reversals.
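i.e. zz() is roughly this, give or take (a guess at the shape, not the actual test code):

# zz: the xx() approach minus the two reversals; clip the front directly
sub zz {
    my @set = @_;
    splice @set, 0, @set - $N if @set > $N;     # drop all but the last $N
    print "@set\n";
}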

here’s the new rundown:

$ do_test() {
> local T=10
> while [ $T -gt 0 ]; do
> N=$1 dprofpp -ap shift.pl | tail -n 7 >> dp
> T=$(($T - 1))
> done
> }
$ for NUM in `builtin echo "3\n10\n43\n51\n227\n"`; do do_test $NUM; done

The above (z)sh code will execute the test on shift.pl 10 times each with an N of 3, 10, 43, 51, and then 227; appending the report (3 subs = (4+3) lines) to the file ‘dp’ for post-CPU-meltdown review; otherwise we would have to take a look at all the I/O the tests generate before the report is printed by dprofpp.

Yes, I’m too damn lazy to use command history, let alone retype the commands each time; why else would they have invented functions and loops 😛

About 15 minutes and 17 degrees Celsius later, some of the arithmetic involved finally caught up with my throbbing head.

Recap:

  • 5 tests, each test has a different value of N
  • 10 runs of each test, meaning 50 runs
  • each run examines the data sets 3000 times, for 150,000 examinations
  • each examination calls 3 functions once with each of 6 data sets, 18 function calls per examination.
  • The six data sets consist of lists of 5, 25, 100, 250, 500, and 1,000 elements; each element is 3 words long. So like 1,880 elements across the data sets, and 5,640,000 elements processed per examination

So that is what, maybe 2,700,000 function calls to &xx, &yy, and &zz, without counting the calls within those functions… and 846,000,000,000 list elements processed overall? After a little estimation based on the original data set/run time, I stopped counting after the estimated execution time passed 8 hours * X on this old laptop. Hmm, how’s that old saying go, curiosity fried the geek’s cooling fan? lol.

I’m beginning to understand why some people’s workstations have like 32~64 GB of ECC RAM and Symmetric Multi-Processing (SMP) configurations to drool $! for, haha!

How the code comes out, when you’re ready to pass out

# message for commit 041a7343eb452b827dbd97a0c82c8538597f86f6:
#
# read built-in command implemented


sub read_bin {

    my ($prmpt, $time);
    my $line = "";
    my @argv = @_;
    my %opts = ( 'p=s' => \$prmpt,
                 't=s' => \$time,
                 'e'   => sub { "no-op" },
               );

    do_getopt(\@argv, %opts);
    unless (@argv) {
        warn "I can't read into thin air!";
        return 0;
    }

    if ($prmpt and -t *STDIN) {
        print $prmpt;
    }

    eval {
        # remove custom die for ease of error check/report
        local %SIG;
        $SIG{__DIE__} = sub { die @_ };
        $SIG{ALRM} = sub { die "timed-out\n" };
        if ($time) {
            # an s, m, or h suffix causes sleep for sec, min, or hour
            #
            if ($time =~ /^(\d*)([smh])/) {
                if ($2 eq 's') {
                    $time = $1;
                } elsif ($2 eq 'm') {
                    $time = $1 * 60;
                } elsif ($2 eq 'h') {
                    $time = $1 * 3600;
                } else {
                    warn "internal error on ", __LINE__;
                    # NOTREACHED
                }
            }
            alarm $time;
        }
        chomp($line = <STDIN>);
        alarm 0 if $time;
    };
    if ($@) {
        warn $@ unless $@ eq "timed-out\n";
        # on time out, init the vars to empty strings
        @ENV{@argv} = ('') x scalar @argv;
        return 0;
    } else {
        # set each var to the words
        #
        # XXX because ifsplit has no notion of a &split 'LIMIT'
        # if we used ifsplit here instead of a manual split,
        #     read x y
        #     foo bar ham
        # would set $y to 'bar' instead of 'bar ham'
        #
        my $ifs = defined $ENV{IFS} ? qr/[$ENV{IFS}]/ : qr/\s/;
        @ENV{@argv} = grep !/^$/, split $ifs, $line, scalar(@argv);
        return 1;
    }
}

from the manual page updated in commit b1317d7e6e7f91b6c3a2650f44cd4f425e381d42 with message:

read built-in command documented

and blockquoted here using the pod2html output

read [-p prompt] [-t timeout] [variable …]

Read a line from standard input, split by fields, and assign each field to the indicated variables. If the number of variables is less than the number of fields, the remaining fields will be stored ‘as is’ in the last variable. If there are more variables than fields, the excess variables will be undefined. A prompt may be printed before reading input, by using the -p option. The -t option may be used to specify a timeout in which to abort the operation, should the user take their sweet time about pressing CR. The timeout value can take an optional s, m, or h suffix to denote seconds, minutes, or hours. If no suffix is given, s will be assumed.

It’s not the greatest… but hey, I ain’t had any sleep since this morning’s roll out… lol. It’ll do fine for an initial implementation, until I’ve actually got a functioning brain to deal with it 😛

Now if I didn’t have to be awake in ~4 and a half hours, I could get the . / source commands implemented, so handling tpsh_profile / .tpshrc files on startup could be done… and have more of the manual page written out :. Hmm, that reminds me: another big thing I need to work on is the shell special variables ($0…$9…, $*, $@, $#, $?) and the rest of the ${parameter expansion} syntax.

If I could get anything done during the day here, and actually sleep at night… instead of coding odd hours, now that would just be heavenly lol.

$ git log | sed 's/my name/<snip>/g' | sed 's//<sed@removed.it>/g' | vim -

commit 4835c4091e59af282ee98952568f0e9ac91c8d09
Author: Terry <snip> <sed@removed.it>
Date: Sat Mar 21 09:27:41 2009 +0000

do_getopt() made generic, `set [options]` now works.

The do_getopt() function now takes an array ref and hash to pass onto
Getopt::Long::GetOptionsFromArray, and twists it in place.

set_bin() now calls do_getopt() with its argument vector and a global
hash; startup code now calls do_getopt() with @ARGV and that same hash.
Because of that, `set -x`, `set -o xtrace`, and such work but `set +o
xtrace` and such can’t be used to turn the option off yet lol. (+o atm
is just an alias for -o.)

commit 90bf68e8b5f0aed867ae4a9fd2eca2c8d99dc1fd
Author: Terry <snip> <sed@removed.it>
Date: Sat Mar 21 08:16:39 2009 +0000

tpsh -o longname now works

furthermore, as an extension, ‘-o longname1,longname2’ also works but is
not documented yet.

commit dcffa41bb57d70c82f502b08ea8c76ebeb559b1c
Author: Terry <snip> <sed@removed.it>
Date: Sat Mar 21 07:49:28 2009 +0000

which built-in command added and documented

see Built-in Commands and CAVEATS & BUGS in tpsh.1.pod for details.

It is really time to make do_getopt() a generic wrapper around
Getopt::Long; the current functionality in do_getopt() needs to be moved
into set_bin() anyway…

Thanks Wiz… lol

After a few other commits, I’ve finally buckled down and made do_getopt() a simple helper: previously it did parsing for @ARGV via the Getopt::Long module and twiddled %Options accordingly; now it takes an array ref and a hash to do the job. The wonderful thing? Now both tpsh [options] and the built-in `set [options]` command work with half a line of duplicated code 🙂
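Conceptually the new helper is about this much code (a sketch of the shape, not the literal tpsh source; %spec here stands in for whatever option spec the caller hands in):

use Getopt::Long qw(GetOptionsFromArray);

# sketch: a generic do_getopt() that parses options out of whatever array ref
# the caller provides, using whatever spec it provides, twiddling both in place
sub do_getopt {
    my ($argv, %spec) = @_;
    GetOptionsFromArray($argv, %spec)
        or warn "do_getopt: bad options\n";
}

# startup code and set_bin() can then share it, e.g.:
#   do_getopt(\@ARGV, %spec);   # tpsh [options]
#   do_getopt(\@argv, %spec);   # set [options]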

In prepping my code for commit, I wanted to take one last look at do_getopt(), so I went for a quick jump over with the vi/vim :tag command. Wiz IM’d me, so he ended up getting the :ta command instead of vim haha! After alt+tabbing back to the editor, I tried something different just for the heck of it, :ta do_ and guess what!? Vim’s ex mode completion works on tags as well…. this is so going to spoil me lol. ex/vi lacked command completion, history, and editing, but vim added them. One of these days, I need a refresher on the improved tag commands; my muscle memory is pretty vi compatible until we get into tab completion, :split windows, and multiple tabs lol.

So thanks to Wiz, I’ve just found a lovely Vi IMproved feature that I never dreamed existed xD