Projects/tpsh – Page 2 – Captain's Log Supplemental

recent shell efforts

April 5, 2009 (updated October 8, 2023) by Terry Poulin

Well, the foundation has been laid down for the next phase of tpsh development. Branch ‘parserlexer’ is basically set up to deal with the changes in tpsh_parse, tpsh_lex, and the switch from singular command line execution to enqueued command execution.

The new quote handling is actually quite a bit better, if buggy for now. The sh_eval() functions mostly become dead weight; coupled with the behaviour changes in tpsh_{parse,lex}, the source, ., and eval built-ins (and anything relying on them) are temporarily broken, as is tpsh -c ‘cmdstr’, until things are further integrated. Pipes also no longer work, since the command resolution doesn’t know how to deal with the command queue yet lol. Fixing 1 subroutine should fix most breakages.

The idea is more or less that a command like

$EDITOR -o f1 f2 f3; cat f1 f2 f3 | sort -args | sed 's/x/y/g' > /tmp/q

becomes this:

     ( ['vim', '-o', 'f1', 'f2', 'f3'],
       ['cat', 'f1', 'f2', 'f3', '|'],
       ['sort', '-args', '|'],
       ['sed', 's/x/y/g', '>', '/tmp/q'] )

and the trailing ‘|’ symbols would be used to indicate that the current element should be joined with the next (in a non-technical sense, that is) until the end of the pipes is reached, recreating the pipeline (insofar as what happens).

The line is parsed into tokens, then analyzed and formed into a more interesting set of elements like the above array of array references; where the array refs are the argument vector (argv) of the commands to be passed onto pexec() or other suitable function. Previously the line was just parsed and dropped onto resolve_cmd() to figure out if it’s a pipe based, i/o redirection based, built-in, or external command; based on the scalar line or argument vector resulting from expansions.
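A minimal sketch of that grouping step, assuming the line has already been split into tokens. The name build_queue is made up for illustration and is not tpsh's real parser; redirections are simply left inside the argv, as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not tpsh's real parser): group a flat token list into
# a queue of argument vectors. ';' ends a command; '|' also ends a command,
# but is kept as the trailing element to mark "joined with the next".
sub build_queue {
    my @tokens = @_;
    my (@queue, @argv);
    for my $tok (@tokens) {
        if ($tok eq ';') {
            push @queue, [@argv] if @argv;
            @argv = ();
        } elsif ($tok eq '|') {
            push @queue, [@argv, '|'];
            @argv = ();
        } else {
            push @argv, $tok;
        }
    }
    push @queue, [@argv] if @argv;    # last command has no terminator
    return @queue;
}

my @demo_queue = build_queue(
    qw(vim -o f1 f2 f3 ; cat f1 f2 f3 | sort -args | sed s/x/y/g > /tmp/q)
);
print join(' ', @$_), "\n" for @demo_queue;
```

Each element of the result is an array ref that could be handed to something like pexec(); a real lexer would also have to honour quoting, which this sketch ignores.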

the master branch remains the stable line for now until this topic branch is finished with.

April 3, 2009 (updated October 8, 2023) by Terry Poulin

It’s been a pretty good day so far. Got stuck getting up early for a shopping trip, but hey at least I got some doughnuts out of the deal lol. Oy, I’m going to end up doing press ups more often 8=). Two bags of powdered doughnuts, my absolute favorite xD.

Ducked into Proving Grounds #1, and joined Spawn, Ez, Hostile, and a few pubs for some games. Man, it’s been insane today in the servers. Those “I don’t give a crap if I hit anything, it’s time to empty the magazine” kind of moments – like a bad zombie flick, swarms of tangos out for blood. After a bit of a break to work on tpsh, I ended up in Proving Grounds #3 with Duke, and a couple others joined: but still groups of 3-5 tangos haunting the halls. Well actually that’s not too bad, in RvS it was more like 4-7 tangos at a time… hehe. One odd thing, this time out in SWAT 4, I got stuck in the heavy plates. Normally I hate body armour that restricts movement, especially in games like SWAT where heavy armour slows you down, and doesn’t stop a potato gun, let alone bullets. I think the suspects must’ve gotten scared: last man standing, and feeling like a land battleship, but surviving without injury ;-).

tpsh gained the sh derived `tpsh -c “commands”` behaviour today. Command completion and history features have evolved quite nicely. Really what needs working on is the shell’s lexical analysis. I figure for setting it up as my login shell, I’ll compile a small C program that sets up PERL_RL to load a suitable Term::ReadLine backend before exec’ing tpsh.

All in all, not a bad day; but not very fulfilling either :-/

Code monkey go to bed…

March 27, 2009 (updated October 8, 2023) by Terry Poulin

It’s been a rather slow day, but somewhat productive (only about 10 or 11 commits :@). My family’s made sure that I’ve had a throbbing headache most of the day… rat fuckers! But at least there is some work to show for it. tpsh now has a concept of $PATHEXT based on cmd.exe’s %PathExt% variable.

Windows is ruled by file extensions, while UNIX couldn’t care less about them; so really one of the few good things about Microsoft’s cmd.exe is you can tell it what file extensions should be “understood”, i.e. so you can type ‘notepad’ instead of ‘notepad.exe’. Since tpsh is modeled after the standard Unix sh, it’s mostly oblivious to file extensions: it cares about names. However it is virtually _impossible_ to use Windows from the CLI level without implementing something like PATHEXT or typing yourself into a nightmare (winxp/cmd.exe is actually good at making you do that, compared to a unix/sh).

For better compatibility with a Windows environment, tpsh now implements its (my) own concept of the feature, complete with a ‘pathext’ option (default on) to toggle the functionality. The main reason tpsh does this is so I can type ‘gvim’ when I mean ‘C:\Path\to\gvim.bat’; the fact that I’m used to typing ‘vim’ is beside the point lol. (I rarely use gvim off win32 b/c of diffs between nt and unix cli)
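For the curious, here is a rough sketch of what a PATHEXT style lookup can look like. The function names, the argument order, and the choice to try the bare name before any extension are all assumptions made for illustration; this is not lifted from tpsh.

```perl
use strict;
use warnings;

# Hypothetical sketch: list the file names a PATHEXT-aware lookup would try
# for a command, in order. Assumption: the bare name is tried first, then
# the name with each extension appended.
sub pathext_candidates {
    my ($cmd, $pathext, $sep) = @_;
    my @exts = grep { length } split /\Q$sep\E/, $pathext;
    return ($cmd, map { $cmd . $_ } @exts);
}

# Return the first candidate that exists as a plain file in one of the
# directories in @$dirs, or nothing if there is no match.
sub find_command {
    my ($cmd, $dirs, $pathext, $sep) = @_;
    for my $dir (@$dirs) {
        for my $name (pathext_candidates($cmd, $pathext, $sep)) {
            my $path = "$dir/$name";
            return $path if -f $path;
        }
    }
    return;
}

print join(' ', pathext_candidates('gvim', '.com;.exe;.bat', ';')), "\n";
# prints: gvim gvim.com gvim.exe gvim.bat
```

Turning the ‘pathext’ option off would then amount to passing an empty extension list, so only the bare name is tried.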

About the only time I use file extensions is when forced (Win32), when ideal (.zip, .tar, etc), when saving text with CRLF for new line indicators (notepad friendly .txt), multimedia files (.png, .ogg, etc), or when deploying crap to a Windows machine lol. So it’s not an important thing for me; just a time saver.

Another thing I sorted out is what I call the “hash separator” for environment variables. UNIX shells by convention separate values like $PATH with a ‘:’, e.g. ‘/bin:/usr/bin’, but DOS and related bastards use ‘;’ for things like %PATH%, e.g. ‘C:\Windows;C:\Windows\System32’. Because many older operating systems use a ‘letter:path’ style for paths anyway, there is no universally portable default setting. Since virtually all operations involving variables like PATH, CDPATH, and ENV (an extension used in tpsh, which I’ve never seen another shell use) involve a hash table, I call the mark the ‘hash-sep’ for short. One of today’s changes was exposing the hash-sep directly to the user.

The default hash-sep is ‘;’ under MSWin32, DOS, and OS/2 and ‘:’ otherwise; I’ve no clue what the hell VMS uses and don’t have access to it, so no worries yet ;-). Whenever the user changes the HASHSEP environment variable, the next time an operation that relies on it runs (basically hsplit(), short for hash-sep split()), the shell compiles it down to a suitable regular expression to save time on future commands.

setenv HASHSEP '/'
rehash

The rehash will cause ‘/’ to be compiled via qr// to speed up later splits; and due to the change of hashsep, causes the internal cache of $PATH to switch from the default ‘:’ or ‘;’ to ‘/’, which is probably not something anyone wants to do, but being able to fiddle with it can be useful for scripting reasons ^_^. Not to mention the fact that it makes concatenating things like PATH/CDPATH settings more portable when the environment requires something different.

Another bit of today’s work was setting it up so that the history built-in now displays the correct line numbers. I want to set up builtins to save/load data from HISTFILE, so that command history is not lost between sessions – I’ve always had a bit of an itch about how most shells deal with it; wonder what tpsh might do hehe. Not sure about $LINENO yet, I never really use it in scripting or interactive usage that much: will probably take the Single UNIX Specification into consideration on $LINENO.

If I didn’t have to do so much crap for tomorrow, I could probably have half the manual done tonight and still get to bed before 0600 local. But no…. work early, work long, to be driven nuts after work, and probably end up PTFO instead of coding the night away.
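A rough sketch of the compile-once idea, assuming a lazily rebuilt cache. The body below is a guess at the general shape of hsplit(), not tpsh's actual code, and the two cache variables are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical sketch of a hash-sep split: the separator is compiled with
# qr// once and only recompiled when HASHSEP has changed since the last call.
my ($cached_sep, $cached_re);

sub hsplit {
    my ($value) = @_;
    my $default = ($^O =~ /^(?:MSWin32|dos|os2)$/) ? ';' : ':';
    my $sep = (defined $ENV{HASHSEP} && length $ENV{HASHSEP})
            ? $ENV{HASHSEP}
            : $default;
    if (!defined $cached_sep or $cached_sep ne $sep) {
        $cached_sep = $sep;
        $cached_re  = qr/\Q$sep\E/;    # \Q..\E keeps the separator literal
    }
    return split $cached_re, $value;
}

{
    local $ENV{HASHSEP} = ';';
    print join(' | ', hsplit('C:\\Windows;C:\\Windows\\System32')), "\n";
}
```

Quoting the separator matters here: a HASHSEP of ‘|’ or ‘.’ would otherwise be treated as regex syntax and split in surprising places.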

*sigh*.

my idea of a great vacation: a Ferrari, laptops, solar panels to power them, hot date, and a tropical island with a sunny beach 😉

An epiphany!

March 26, 2009 (updated October 8, 2023) by Terry Poulin

3. a sudden, intuitive perception of or insight into the reality or essential meaning of something, usually initiated by some simple, homely, or commonplace occurrence or experience.

I actually thanked my mother for dragging me out on a shopping expedition today, because I figured out something a ways “down” my todo list: implementing control flow in tpsh. I guess you can take the programmer away from the code, but you can’t take the code away from the programmer ^_^ ^_^ ^_^.

Earlier I recorded that the goal was for tpsh to use a queue of commands to execute rather than going by lines, the idea being:

`cmd1; cmd2; cmd3`

would parse into a list like ‘(cmd1, cmd2, cmd3)’ which tpsh would then walk, eval, and execute the result in sequence (i.e. first in, first out). Then it occurred to me, if the shell will go that route for lexical reasons: why not implement the shell’s control flow operators as keywords that manipulate that queue directly? So that conceptually, a shell snippet like:

cmd0
if [ expr ]; then
  cmd1
  cmd2
else
  cmd3
  cmd4
fi
cmd5

would be inserted into the queue in a suitable manner and passed to the call back, so if the queue looked something like this:

( cmd0, if [ expr ], cmd1, cmd2, else, cmd3, cmd4, fi, cmd5 )

the call back would receive that relevant portion of the queue, (i.e. queue₁ through queue₇), evaluate the test and return the appropriate portion to be spliced into the queue, in this case:

( cmd0, cmd1, cmd2, cmd5 ) or ( cmd0, cmd3, cmd4, cmd5 )

depending on whether `[ expr ]` evaluated as true or false.
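A minimal sketch of that splice, assuming one flat if/else/fi run with no nesting, and with the test handed in as a code ref standing in for evaluating `[ expr ]`. The name splice_if and its calling convention are invented here.

```perl
use strict;
use warnings;

# Hypothetical sketch: resolve one flat (non-nested) if/else/fi run inside a
# command queue. $test is a callback standing in for evaluating "[ expr ]".
sub splice_if {
    my ($test, @queue) = @_;
    my ($if) = grep { $queue[$_] =~ /^if\b/ } 0 .. $#queue;
    return @queue unless defined $if;
    my ($fi) = grep { $_ > $if && $queue[$_] eq 'fi' } 0 .. $#queue;
    die "if without fi\n" unless defined $fi;
    my ($else) = grep { $_ > $if && $_ < $fi && $queue[$_] eq 'else' }
                 0 .. $#queue;

    # The "then" branch runs up to 'else' when there is one, else up to 'fi'.
    my $then_end = defined $else ? $else : $fi;
    my @branch = $test->($queue[$if])
        ? @queue[$if + 1 .. $then_end - 1]
        : (defined $else ? @queue[$else + 1 .. $fi - 1] : ());

    # Replace everything from 'if' through 'fi' with the chosen branch.
    splice @queue, $if, $fi - $if + 1, @branch;
    return @queue;
}

my @flow = ('cmd0', 'if [ expr ]', 'cmd1', 'cmd2', 'else',
            'cmd3', 'cmd4', 'fi', 'cmd5');
print join(', ', splice_if(sub { 1 }, @flow)), "\n";   # cmd0, cmd1, cmd2, cmd5
print join(', ', splice_if(sub { 0 }, @flow)), "\n";   # cmd0, cmd3, cmd4, cmd5
```

Nested conditionals would need the keyword matching to count depth instead of grabbing the first ‘fi’, which is left out to keep the sketch short.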

Damn, this so makes me want to straighten up the necessary parts of tpsh… now if only I didn’t have to work tomorrow from morning to whenever I drop over undead, or pass out lol. These kinds of things have always fascinated me about programming, but life has never offered much chance to study compilers, virtual machines, and interpreters… let alone create something like tpsh, that requires implementing a scripting language :\. The one good thing, despite the ups/downs and expressiveness of sh script, it’s really an easy language for a human to understand.

March 26, 2009 (updated October 8, 2023) by Terry Poulin

hmm, there was something I wanted to add a `set -o option` for, but I can’t remember what the heck it was lol.

March 26, 2009 (updated October 8, 2023) by Terry Poulin

man it’s been a slow day… I’ve only made about 15~16 commits on tpsh, one of which was merging branches lol. The main focuses of today’s tinkering has been minor bugfixes (mostly related to and tweaking the completion function.

Previously my completer would just take a quick approach, always doing completions for built in commands, stored macros, known programs, and filenames. Now it has some notion of *what* it should complete.

$ b^I -> assume it can be anything
$ echo b^I -> assume it’s a file (formerly a full eval() was done)
$ builtin l^I -> complete built-in commands
$ alias x^I -> complete to names in %Macros.
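The decision above boils down to looking at the command word before picking a completion source. A tiny sketch of that dispatch, with made-up category names and a made-up function name (tpsh's own completer is not shown here):

```perl
use strict;
use warnings;

# Hypothetical sketch: decide what kind of thing to complete from the text
# typed so far. The returned category names are illustrative only.
sub completion_kind {
    my ($line) = @_;
    return 'any' if $line =~ /^\s*\S*$/;    # still typing the command word
    my ($cmd) = $line =~ /^\s*(\S+)/;
    return 'builtin' if $cmd eq 'builtin';  # complete built-in commands
    return 'macro'   if $cmd eq 'alias';    # complete names in %Macros
    return 'file';                          # everything else: filenames
}

printf "%-12s -> %s\n", $_, completion_kind($_)
    for 'b', 'echo b', 'builtin l', 'alias x';
```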

When I get the time, I want to make the completion function have a more hot-plug-in nature to it. A pre-completion hook, completion hook, and post-completion hook; the pre getting to modify the thing before completion, the completion hook being able to totally replace do_completion() aside from the pre/post hooks; and the post-hook having a chance to modify the completions being returned. The current working idea is: a TPSH_WHAT_HOOK environment variable being set to A. perl code to generate a CODE ref from, or B. an external program to delegate to. (e.g. TPSH_COMPLETION_HOOK=“compl.rb” would pass compl.rb the necessary data, and expect the program to generate a \n delimited list on its stdout for tpsh to parse). Odds are there will be more variables to tune it with later, as well as the API avail. being documented in the manual page.

My main reason for desiring this style of dynamically configurable completion is so I could do things like setting up my tpshrc to “ignore” any ‘lost+found’ entries in filename completions; i.e. /usr/local/l^I expands to /usr/local/local, not to /usr/local/lo plus a completion list for local / lost+found. I know zsh (which I use a lot) is noted for its configurable completion and spelling correction, but honestly I’ve never customized it beyond the lines zsh’s setup program added to my rc, let alone RTFM’d about it lol.

Making it easy to add command sensitive completions through the hooks would be nice too, e.g. `man 2 foo^I` and have a hook spot the man 2 and go about completing foo to manual pages in section 2, hehe. (the only zsh feature I abuse, actually lol)
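To make the hook idea a little more concrete, here is a sketch of the three hook points wired together with plain code refs. Everything in it (run_completion, the hash keys, the stand-in default completer) is hypothetical, and the external-program variant is not covered.

```perl
use strict;
use warnings;

# Hypothetical sketch of the three hook points; the hash keys and the
# run_completion name are made up for illustration.
sub run_completion {
    my ($text, $default, %hook) = @_;
    $text = $hook{pre}->($text) if $hook{pre};        # may rewrite the input
    my $completer = $hook{completion} || $default;    # may replace the default
    my @matches = $completer->($text);
    @matches = $hook{post}->(@matches) if $hook{post}; # may filter the results
    return @matches;
}

# Stand-in for do_completion(): pretend /usr/local holds these two entries.
my $default_completer = sub {
    my ($text) = @_;
    return grep { index($_, $text) == 0 }
           qw(/usr/local/local /usr/local/lost+found);
};

my @kept = run_completion('/usr/local/l', $default_completer,
    post => sub { grep { !/lost\+found$/ } @_ });
print "@kept\n";    # prints: /usr/local/local
```

With only one match left after the post hook, the completer can expand straight to it, which is the ‘lost+found’ behaviour described above.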

Really what needs doing atm in tpsh, is cleaning up the lex/eval subroutines and merging them and a few related changes with resolve_cmd(). Basically the principal adjustment needed is to go from line-based handling of commands, to (properly) breaking them lexically for storage into an execution queue. Generally though I like how stuff’s coming along, except for a few odds and ends here and there.

At one point, tpsh relied on a list of “this command marks something to skip macro expansion on”, a quick solution until expand_aliases() could be rewritten. One night I found myself working on a scratch file with commands to modify said list (add, rm, and show the non expanding commands list via expr or array slice/index). That’s when I threw my hands up in disgust, and decided tomorrow’s job would be fixing expand_aliases (note to self: s/aliases/macro/). Imagine if you tried to type ‘unalias ls’ and the shell expanded ls to its aliased value, that kinda stuff. Kinda cool to be able to configure that, but kinda wrong to need a linear look up on each pass through the loop (ugh!).

To add built-in commands to manipulate said list rather than fix the problem was just NOT something I was willing to do… because to my eyes, it would be like a pox on my honour as a programmer: to gamma-weld such a cheap trick into place with built in commands to configure and abuse it rather than fix it. Thus the macro expansion system got re-implemented, much more correctly this time. (didn’t have time to sort positional params and what nots).

I am also VERY glad that tpsh is being developed with git, rather than cvs! (I’ve also spent part of tonight learning more about git’s plumbing)

March 25, 2009 (updated October 8, 2023) by Terry Poulin

Been working over SSH from SAL1600 today, since my room’s flooded out with clothes racks :\. This crappy wireless ain’t helping either, just to keep things stable I’ve had to drop from AES encryption to Blowfish and add compression… either the signal utterly blows these days, or PuTTY must leak a lot of resources me thinks.

I cloned master in /srv/git/Projects/tpsh to /tmp/tpsh to do a little work, when surprise surprise… OpenBSD’s perl barfed at Getopt::Long::GetOptionsFromArray. A cursory inspection of installed Perl modules around the network & change logs showed my worst fear was right: it’s an experimental function added 2 years ago. After inserting a band-aid, and doing quite a bit of testing to avoid regressions… I also fixed a few other bruises and found a few things Perl 5.8.8 / use warnings didn’t like very much.
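One common shape for that kind of band-aid, sketched under the assumption that the fallback may temporarily borrow @ARGV. The wrapper name getopt_from_array is invented here; the actual fix that went into the porting branch is not shown in this post.

```perl
use strict;
use warnings;
use Getopt::Long ();

# Hypothetical band-aid: use GetOptionsFromArray when the installed
# Getopt::Long provides it, otherwise fall back to GetOptions on a
# localized @ARGV and copy the leftover arguments back.
sub getopt_from_array {
    my ($argv, @spec) = @_;
    if (Getopt::Long->can('GetOptionsFromArray')) {
        return Getopt::Long::GetOptionsFromArray($argv, @spec);
    }
    local @ARGV = @$argv;
    my $ok = Getopt::Long::GetOptions(@spec);
    @$argv = @ARGV;    # whatever was not consumed as an option
    return $ok;
}

my @args = ('-p', 'name? ', 'var1', 'var2');
my $prompt;
getopt_from_array(\@args, 'p=s' => \$prompt);
print "prompt=[$prompt] rest=[@args]\n";
```

Either path leaves the non-option words in the caller's array, so code like a read built-in can treat what remains as its variable names.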

The birth of the “porting” branch, lol.

After taking care of pushing that out to the shared-repos, I checked out master and implemented the eval built in, so before I start coding tonight on my laptop I’ll need to update the master branch there from the one on the server, and merge in the changes from the porting branch.

Currently I’ve got three versions of Perl installed, 5.8.9 on the FreeBSD laptop (the authoritative git repository), 5.10.0 (activeperl) / 5.8.8 (from msys-git) on Windows XP, and 5.8.8 on the OpenBSD machine (home of the shared repository / insurance policy).

If this infernal wireless goes out once more…. I might just start working from my laptop on the couch, and the hell with this P.O.S. At least then, I wouldn’t have to SSH to get work done ^_^.

March 25, 2009 (updated October 8, 2023) by Terry Poulin

commit b841dc4954c24d0abea43daf407b6bf70e1c450b
Author: Terry ******* ****** <***********@****.***>
Date: Wed Mar 25 07:15:31 2009 +0000

massively improved alias expansions

aliases now expand recursively until resolved or aborted. A circular
alias like x = y; y = z; z = x; will resolve x to x when it hits z. The
expansion of aliases should behave more or less as desired, but without
positional parameter support

tpsh:
$ alias x='y -opts P'
$ alias y='z -flags PP'
$ alias z='echo z PPP'
$ x one two three
z PPP -flags PP -opts P one two three
zsh:
$ alias x='y -opts P'
$ alias y='z -flags PP'
$ alias z='echo z PPP'
$ x one two three four
z PPP -flags PP -opts P one two three four

At the moment the shell local()’s %ENV before each expansion, and will
likely set 0..$#, $#, $@, and $* accordingly someday; but currently does
not use %ENV for anything. In order to allow macros to change the
environment, we can’t just local() %ENV to implement positional params
for macros, but it will be a suitable stop-gap until done fully.

note: alias, macro, and function all revolve around %Macros and
&expand_aliases for macro expansion.

the next big chore is improving the code that invokes expand_aliases(), lol.
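The behaviour described in that commit message can be approximated in a few lines. This is a sketch only: expand_alias and its plain-hash argument are stand-ins for &expand_aliases and %Macros, and only the command word is expanded.

```perl
use strict;
use warnings;

# Hypothetical sketch of recursive alias expansion with a cycle guard; only
# the command word (the first word on the line) is ever expanded.
sub expand_alias {
    my ($line, %alias) = @_;
    my %seen;
    while (1) {
        my ($word, $rest) = $line =~ /^(\S+)(.*)$/s or last;
        last unless exists $alias{$word};
        last if $seen{$word}++;        # back at a name already expanded
        $line = $alias{$word} . $rest;
    }
    return $line;
}

my %demo_alias = (
    x => 'y -opts P',
    y => 'z -flags PP',
    z => 'echo z PPP',
);
print expand_alias('x one two three', %demo_alias), "\n";
# prints: echo z PPP -flags PP -opts P one two three

# A circular chain stops once it comes back around to x.
print expand_alias('x', x => 'y', y => 'z', z => 'x'), "\n";   # prints: x
```

Tracking names in a %seen hash keeps the cycle check to a single hash lookup per expansion, instead of a linear scan of a list.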

March 23, 2009 (updated October 8, 2023) by Terry Poulin

another (big) section of the manual written, ENV processing, and an initial implementation for the history built-in among a few other things is done.

One interesting thing, usually sh only allows a single file in $ENV, and some versions of sh don’t even understand it period! In tpsh, as an interesting extension, $ENV is treated like $PATH, in so far as ENV=/etc/tpshrc:/usr/local/etc/tpshrc:~/.tpshrc would cause tpsh to source an rc file in /etc, /usr/local/etc, and then the user’s home directory.

(Because of old OSes using drive letter:path like C:\Windows, under such OS tpsh uses ‘;’ instead of ‘:’ to separate things like $PATH)

How the code comes out, when you’re ready to pass out

March 22, 2009 by Terry Poulin

# message for commit 041a7343eb452b827dbd97a0c82c8538597f86f6:
#
#   read built-in command implemented


sub read_bin {

    my ($prmpt, $time);
    my $line = "";
    my @argv = @_;
    my %opts = ( 'p=s' => \$prmpt,
                 't=s' => \$time,
                 'e'   => sub { "no-op" },
               );

    do_getopt(@argv, %opts);
    unless (@argv) {
        warn "I can't read into thin air!";
        return 0;
    }

    if ($prmpt and -t *STDIN) {
        print $prmpt;
    }

    eval {
        # remove custom die for ease of error check/report
        local %SIG;
        $SIG{__DIE__} = sub { die @_ };
        $SIG{ALRM}    = sub { die "timed-out\n" };
        if ($time) {
            # an s, m, or h suffix causes sleep for sec, min, or hour
            #
            if ($time =~ /^(\d*)([smh])/) {
                if ($2 eq 's') {
                    $time = $1;
                } elsif ($2 eq 'm') {
                    $time = $1 * 60;
                } elsif ($2 eq 'h') {
                    $time = $1 * 3600;
                } else {
                    warn "internal error on ", __LINE__;
                    # NOTREACHED
                }
            }
            alarm $time;
        }
        chomp($line = <STDIN>);
        alarm 0 if $time;
    };
    if ($@) {
        warn $@ unless $@ eq "timed-out\n";
        # on time out, init the vars to empty strings
        @ENV{@argv} = ('') x scalar @argv;
        return 0;
    } else {
        # set each var to the words
        #
        # XXX because ifsplit has no notion of a &split 'LIMIT'
        #     if we used ifsplit here instead of a manual split,
        #       read x y
        #       foo bar ham
        #     would set $y to 'bar' instead of 'bar ham'
        #
        my $ifs = defined $ENV{IFS} ? qr/[$ENV{IFS}]/ : qr/\s/;
        @ENV{@argv} = grep !/^$/, split $ifs, $line, scalar(@argv);
        return 1;
    }
}

from the manual page updated in commit b1317d7e6e7f91b6c3a2650f44cd4f425e381d42 with message:

read built-in command documented

and blockquoted here using the pod2html output

read [-p prompt] [-t timeout] variable […]

Read a line from standard input, split by fields, and assign each field to the indicated variables. If the number of variables is less than the number of fields, the remaining fields will be stored ‘as is’ in the last variable. If there are more variables than fields, the excess variables will be undefined. A prompt may be printed before reading input, by using the -p option. The -t option may be used to specify a timeout in which to abort the operation, should the user take their sweet time about pressing CR. The timeout value can take an optional s, m, or h suffix to denote seconds, minutes, or hours. If no suffix is given, s will be assumed.

It’s not the greatest… but hey, I ain’t had any sleep since this morning’s roll out… lol. It’ll do fine for an initial implementation, until I’ve actually got a functioning brain to deal with it 😛