August 2010 – Page 2 – Captain's Log Supplemental

August 20, 2010 by Terry Poulin

I think I’m more tired now than when I fell asleep, but at least Zombieland is on :-/.

The STAMAN Project: Phase III, of tasks and storage formats

August 20, 2010 (updated October 8, 2023) by Terry Poulin

At least for me, there are only two pieces to STAMAN that are not trivial to work out before writing the code: choosing the storage format and the implementation language. Both also happen to be areas where experience strongly augments one’s intuition, more so than for the rest of the app.

In the design outline, I noted that YAML would work quite nicely, yet an exposition of the outline suggests that something closer to SQL could better serve the application’s design. The reasons behind it should be fairly obvious if you’ve ever worked with textual data before.

During Phase I, I concentrated on the data involved with task management. It’s not hard to implement an SQL schema capable of representing that. Even better, most dialects offer useful features for handling times/dates. Virtually every programming language has a way of interfacing with such an SQL database, either through native bindings or by calling out to scriptable client programs. SQLite, MySQL, and PostgreSQL in fact provide both means; I’m not familiar with MSSQL. So that’s a big set of pluses all the way around. We even get a reusable DSL to help without having to write it!
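To give a feel for how little schema the Phase I data needs, here is a minimal sketch using Python’s bundled sqlite3 module. The table name, column names, and helper functions are assumptions made up for illustration, not a settled STAMAN design; the date arithmetic leans on SQLite’s built-in date() function.

```python
import sqlite3

# Hypothetical schema: every name here is an illustrative assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id            INTEGER PRIMARY KEY,
    short_name    TEXT NOT NULL,
    notes         TEXT,
    due_date      TEXT,               -- ISO 8601 date, e.g. '2010-08-25'
    estimate_mins INTEGER,
    priority      INTEGER DEFAULT 0
);
"""


def open_store(path=":memory:"):
    """Open (or create) the task database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_task(conn, short_name, due_date=None, estimate_mins=None, priority=0):
    """Insert one task and return its row id."""
    cur = conn.execute(
        "INSERT INTO tasks (short_name, due_date, estimate_mins, priority) "
        "VALUES (?, ?, ?, ?)",
        (short_name, due_date, estimate_mins, priority),
    )
    conn.commit()
    return cur.lastrowid


def due_within(conn, today, days):
    """Names of tasks due between `today` and `days` days later, soonest first."""
    rows = conn.execute(
        "SELECT short_name FROM tasks "
        "WHERE due_date BETWEEN date(?) AND date(?, ?) "
        "ORDER BY due_date",
        (today, today, f"+{days} days"),
    )
    return [name for (name,) in rows]
```

Tasks without a due date simply drop out of the query, because a NULL never satisfies BETWEEN.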

The problem, however, becomes one of migration paths: what happens if you need to change the data structures, perhaps heavily? That means a lot more work whenever restructuring is needed, and it’s, IMHO, less scriptable than a little perl golf: sufficiently so that I’m not going to screw with it. Insert shameless plug for Ruby on Rails here ;).
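For a sense of what that extra work looks like, here is a small sketch of one common approach: an ordered list of migration steps, tracked with SQLite’s user_version pragma. The steps themselves are hypothetical, and this is only one way to handle schema changes, not something the STAMAN design commits to.

```python
import sqlite3

# Hypothetical migration steps; each entry upgrades the schema by one version.
MIGRATIONS = [
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, short_name TEXT NOT NULL)",
    "ALTER TABLE tasks ADD COLUMN due_date TEXT",
    "ALTER TABLE tasks ADD COLUMN progress INTEGER DEFAULT 0",
]


def migrate(conn):
    """Apply any migrations newer than the version stored in the database."""
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    for step in MIGRATIONS[version:]:
        conn.execute(step)
        version += 1
        # PRAGMA does not accept bound parameters; version is a trusted int.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return version
```

Adding a column is the easy case. Heavier restructuring (splitting tables, changing types) means writing data-copying steps by hand, which is exactly the cost described above.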

In a commercial environment, i.e. one oriented on making money off the program, XML would be more likely than any other textual format, but it’s not very convenient for me. I also hate XML parsing with a passion. It is, however, sufficient for getting the job done, even if it does, ahem, jack the amount of internal documentation you need to write (or later wish you had) several notches higher than it need be.

Someone might think of a simple Comma Separated Values (CSV) format, but CSV is anything but simple. Don’t believe me? Just think about data that may contain commas. That being said, the only good thing I can say about CSV, from a programming perspective, is that CPAN rocks. Unless you’re munging address books or spreadsheet data around and need an LCD (lowest common denominator), it is best to avoid CSV, period.
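The comma problem is easy to show. The sketch below uses Python’s standard csv module as a stand-in for what a CPAN module would do in Perl; the sample record is made up.

```python
import csv
import io

# A made-up record whose fields contain a comma and double quotes.
record = ["Buy milk, eggs", 'Call "Bob"', "2010-08-20"]

# Naive approach: join on commas, then split on commas.
naive_line = ",".join(record)
naive_fields = naive_line.split(",")

# Proper approach: let a real CSV library handle quoting and escaping.
buffer = io.StringIO()
csv.writer(buffer).writerow(record)
quoted_line = buffer.getvalue()
parsed_fields = next(csv.reader(io.StringIO(quoted_line)))
```

The naive round trip turns three fields into four, while the library version quotes the awkward fields and gets the original record back.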

The best bet is structured text, but of a kind sufficiently able to represent the data set and still be easily edited by hand. What is really needed is a dedicated format: enter YAML. It’s basically a hierarchical way of recording data as sequences of elements and key/value mappings. Works excellently.
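To show the shape of the format, here is a deliberately tiny emitter that writes a flat task mapping as YAML. It is only a sketch: the to_yaml helper and the field names are hypothetical, it assumes plain scalars that need no quoting, and anything real should use a proper YAML library instead.

```python
def to_yaml(task):
    """Emit a flat mapping as YAML text; list values become block sequences.

    Assumes keys and values are plain scalars that need no quoting.
    """
    lines = []
    for key, value in task.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical task record, loosely based on the Phase I outline.
task = {
    "name": "renew passport",
    "priority": 2,
    "due": "2010-12-01",
    "urls": ["http://example.com/forms", "http://example.com/fees"],
}
```

The output is a handful of key: value lines plus an indented list, which is exactly the kind of thing that stays readable in notepad.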

The SQL solution relinquishes fine control over the operations, whereas the YAML method is assured to slurp up memory in proportion to the input. It’s a lot more like DOM oriented XML, only the translation between the code and textual representation is a hell of a lot more natural. When working with program generated output, it also doesn’t need to be fed through a pretty printer to be comprehensible, which can’t be said of XML without more pain for someone.

Pro YAML:

Easily edited by hand (notepad) and many unix tools.
So simple you can skip reading the spec⁰
If you have to write your own parser, make it YAML and save grey hairs.
It’s easy to serialize/marshal data around, as easy as it gets without eval().
More likely to benefit from compression.

Pro SQL:

Less imperative-style code to be written.
The hardest processing code is already in the database engine.
Can focus on querying data, not parsing it.
Languages/frameworks are more likely to ship SQLite bindings than a YAML parser.

Con’ YAML:

It really is as simple as it looks.
You have to write your own list/dictionary handling code.
Scales less.

Con’ SQL:

You have to learn basic SQL.
Not the most fun in some languages (C, C++, Java, and C#).
Can’t really get at the data, short of a database client.

Note that I haven’t said anything about separating the data store from the client application: using an SQL server is just as viable as storing YAML files on a network drive. It really is that simple.

My personal view? SQL’s virtues likely outweigh YAML’s here, unless you’re going to be designing by exploration. I’m not in this case, and I am also competent enough not to shoot myself in the foot. If I was smart, I would make the application-wide interface to the data store more abstract than writing SQL queries all over the place like an asshole. Yes, I can be that smart. Don’t tell your neighbours.
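A rough sketch of what such an abstract interface could look like follows. The TaskStore, SqliteTaskStore, and MemoryTaskStore names and their two methods are invented for illustration; the point is only that the SQL lives in one class and the rest of the program never sees it.

```python
import sqlite3
from abc import ABC, abstractmethod


class TaskStore(ABC):
    """Hypothetical application-wide interface; callers see only this."""

    @abstractmethod
    def add(self, short_name, priority=0): ...

    @abstractmethod
    def by_priority(self): ...


class SqliteTaskStore(TaskStore):
    """Keeps every SQL statement in one place."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS tasks "
            "(id INTEGER PRIMARY KEY, short_name TEXT NOT NULL, priority INTEGER)"
        )

    def add(self, short_name, priority=0):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO tasks (short_name, priority) VALUES (?, ?)",
                (short_name, priority),
            )

    def by_priority(self):
        rows = self._conn.execute(
            "SELECT short_name FROM tasks ORDER BY priority DESC, id"
        )
        return [name for (name,) in rows]


class MemoryTaskStore(TaskStore):
    """Stand-in backend, e.g. for a YAML file loaded into memory."""

    def __init__(self):
        self._tasks = []

    def add(self, short_name, priority=0):
        self._tasks.append((short_name, priority))

    def by_priority(self):
        # sorted() is stable, so equal priorities keep insertion order.
        ordered = sorted(self._tasks, key=lambda t: -t[1])
        return [name for name, _ in ordered]
```

Swapping the storage format then means writing one new class rather than hunting down queries across the code base.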

0: I read the YAML specification the first time I used it for a project, which was for a rake-based build system. How else could I expect to hand write my build specs in YAML? :-).

Rules of Family Survival

August 16, 2010 by Terry Poulin

Learn to become solid and expressionless as stone.
Don’t take open and seething hatred filled tirades personally; wish you had better ear plugs instead!
Remember you’re not a slave at beck and call.
Automatically disconnect yourself from being guilted over things you’re not responsible for.
Sometimes, you just need to duck…

Is it any surprise that most days are fucking miserable here?

August 16, 2010 by Terry Poulin

Without a doubt, my mother is one of the worst creatures I know on earth, that hasn’t filled an empty head with an associated MBA.

August 16, 2010 by Terry Poulin

Somehow, I’m really not sure what is worse: the curse of experience or a gringo’s rush.

Concept: tried Quassel IRC, didn’t like it – good software but not my bag. Switched to ircII – love the interface, don’t want to screw with hacking it. IRC clients are simple creatures but tend to be crap. While I could live with (or suitably script) ircII to my heart’s content, I also want a more Windows-usable client.

Problem: when it comes to programming languages and what I want (something very ircII like, yet rather lisp like in a way), I can see all the pluses and minuses of any given implementation. If I was a nub, I would just pick a language, rush into it, and try to dig myself out.

Knowing so much can sometimes be a real let down o/.

For those that don’t know it, ircII is a very old school IRC client, even by the best CLI-whorish standards.

The typical IRC client is arranged as a text display area, for the current channel; a line edit for your messages; modern ones include a panel to list names in channel and some “Tab” like interface for marking the channels you’re chatting in. Text mode IRC clients work this way too.

ircII, on the other hand, routes everything into a central display area and places a line edit under a “Status line”. Rather than jockeying between tabs to see what’s up in other channels, which is very wasteful even when using keystrokes, in ircII you simply use a command to change your current channel. Exempli gratia:

Typical:

Click #chan1 tab
Read what’s going on
Reply if desired
Change back to #chan0

Becomes:

See what’s going on both in #chan0 and #chan1
Use /j #chan1 to make your subsequent messages go to #chan1 instead of #chan0 until the next /j[oin] command.

It’s just more convenient for me than the ‘modern’ user interface. I like efficiency.
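The routing idea described above can be modelled in a few lines. This is a toy sketch with no networking, and the Session class is a made-up name; it is not how ircII itself is implemented, and a real /j also joins the channel on the server.

```python
class Session:
    """Toy model of the ircII-style layout: one display, one current target."""

    def __init__(self, channel):
        self.current = channel
        self.display = []  # single central display area for every channel
        self.sent = []     # (channel, text) pairs that would go to the server

    def incoming(self, channel, nick, text):
        # Everything lands in the same scrollback, labelled with its channel.
        self.display.append(f"<{nick}:{channel}> {text}")

    def input_line(self, line):
        if line.startswith("/j "):
            # Change where subsequent messages go.
            self.current = line.split(maxsplit=1)[1]
        else:
            self.sent.append((self.current, line))
```

Reading both channels needs no switching at all; only replying somewhere else costs a single command.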

In terms of implementing something like this portably (unix/win), the issue is simply line editing. That’s not a subject I enjoy. Having worked on a unix shell, I know it’s a bitch of a subject. Colour support is another, but minor one. Cmd.exe doesn’t understand what a DEC does.

I also want something dynamically reprogrammable on the fly, basically access to a REPL. O.K. so lisp spoils you. This makes dynamic languages more convenient, which is also its own can of worms.

That’s the fact of Programming, it’s all a Kobayashi Maru problem: you’ve just got to deal with it.

August 16, 2010 by Terry Poulin

One obvious down side to eating everything in sight and coding the night away: it’s 0400 and I’m not even drowsy yet… :-/

Reflections on C#

August 15, 2010 by Terry Poulin

Lately, I’ve been trying to use C#. No sense in learning a language and never using it, ever, lol. Over the years, I have generally skipped getting into C# – too much like Java for my tastes. Some months ago I picked up the lang’ as just a way of passing time. Found it interesting to note that C# was also about 3-4 times more complex than Java, syntactically. By contrast, most of the complexity in Java comes from APIs or hoops you have to jump through to do xyz.

In putting my C# knowledge into practice, I’ve found that most of my linguistic gripes against learning it have been solved in .NET 3.0 / 3.5, and making portable code that works under Windows and Unix is just as easy as expected: in fact I test everything against the compilers from Microsoft and Mono. I’ve not had any troubles, and I am using, like, last year’s Mono version. Although, I must admit that I think of Mono’s setup as the “Novell implementation” and .NET as Microsoft’s >_>. The portability of C# is every bit as good as Java and dynamic languages. In fact, if it wasn’t for the Mobile version (Java ME), I would say C# is more portable than Java these days.

C# already has features that are expected in Java 7 and C++0x, but that everyone will be damned if they get to use any time soon. To top it off, given the blasted prevalence of Windows machines, just about everyone will have a liveable version of the .NET runtime that you can program to in a pinch. Between what gets installed by actually using the computer and what ships with newer Windows versions, just about all of them will have a modern version. Plus several popular unix applications (and parts of the Gnome software stack) are written in C#, so the same goes for many Linux distributions. Alas, the same can’t be said of getting various C/C++ libraries compiled….

Compared to Java, C# is a mixture of what Java should have evolved into as a business language, and a bit of C++ style. C# also goes to lengths to make some things more explicit, in a way that can only be viewed as Java or COBOL inspired. I’ll try not to think about which. I think of professional Java and C# programming as our generation’s version of Common Business Oriented Language without the associated stigma.

The concept of “C++” style in C# however, is something of a moot point when we talk about Java too. Here’s a short comparison to explain:

// C++ class
class Name : public OtherClass, public SomeInterface, public OtherInterface { /* contents */ };


// Java class
public class Name extends OtherClass implements SomeInterface, OtherInterface { /* contents */ }

// C# class
public class Name : OtherClass, SomeInterface, OtherInterface { /* contents */ }

It should be noted in the above example that C++ trades the ease of control over class visibility for fine grained control over inheritance. AFAIK, Java/C# have no concept of private / protected / virtual {x} inheritance. Likewise, C++ supports multiple inheritance, while Java and C# allow only a single base class (plus any number of interfaces). This all leads to a more verbose class syntax in C++.

Now this one, is where you know how Java is run 😉

// C++ foreach, STL based
std::for_each(seq.begin(), seq.end(), func);

// C++ foreach, common technique
for (ClassName::iterator it = seq.begin(); it != seq.end(); ++it) /* contents */

// C++ foreach, to be added in the *future* standard (see below for disclaimer)
for (auto elem : seq) /* contents */

// Java foreach, < 5.0
for (Iterator it = seq.iterator(); it.hasNext();) /* contents */

// Java foreach, >= 5.0
for (ElementType elem : seq) /* contents */

// C# foreach
foreach (var elem in seq) /* contents */

As you noticed, there are three different examples for C++. The first uses the for_each algorithm and leads to rather simple code; the second is the most common way; the third is being added in C++0x and I haven’t bothered to read the details of it, since the version of GCC here doesn’t support it.

C++ again gives very fine grained control here: the for_each algorithm and iterator methods are extremely useful once you learn how C++ really works. If you don’t, then please don’t program seriously in C++! The C++0x syntax is +/- adding a foreach keyword, exactly what you would expect a foreach statement to look like, if C++ had one. Some things like Boost / Qt add a foreach macro that is mostly the same, but with a comma.

Java enhanced the for statement back in 2004, when Java 5 added a foreach like construct. Java hasn’t changed much since then. When you compare the keyword happy syntax of Java to the punctuation happy syntax of C++, it becomes clear that Java’s developers had decided doing it C++ style was worth more than adding any new keywords, like foreach and in. Guess they didn’t think to steal Perl’s foreach statement for ideas on how to naturally side step it.

C# on the other hand, uses the kind of foreach statement that a Java programmer would have ‘expected’, one that actually blends in with the language rather than sticking out like a haemorrhoid. I might take a moment to note, that javac can be so damn slow compared to C++/C# compilers, that the lack of type inference in Java is probably a good thing!

In terms of syntax, Java is like C among its OO peers: it’s about as small and minimalist a syntax as you can get without being useless. I wish I could say the same about Java in general. Some interesting parts of C# include properties and the out and ref keywords.

Here’s a comparison of properties in Java and C#:

class Java {

    private PropType prop;

    public PropType getProp() {
        return this.prop;
    }

    public void setProp(PropType prop) {
        this.prop = prop;
    }

    public void sample() {
        PropType old = getProp();
        setProp(new PropType());
    }
}

class CSharp {

    public PropType prop { get; set; }

    public void sample() {
        PropType old = prop;
        prop = new PropType();
    }
}

C# has a very sexy way of doing getter/setter like methods for properties. Personally I prefer the more C++ like style of just having a public field, unless you need to hook it (with a getter) or use a private setter. I like how C# can make it look like a field, when it’s actually a getter/setter method like construct. That means you don’t have to remember which fields are accessed directly and which need member function calls when writing code. Java convention is getter/setter bloat; C# convention is to use properties for anything non-private. I hope C# 5.0 or 6.0 will replace { get; set; } with just a property keyword alongside the access modifier.

C++ is just as lame as Java in doing getter/setter methods, except you can (ab)use the preprocessor for creating such basic accessors as the above, as well as any similar methods you need but don’t want to copy/paste+edit around. Java and C# always make you write your own, unless they are the basic kind. Tricks involving Java annotations and subclassing can kiss my hairy ass. It’s also worth noting that some Java projects can use an insane amount of getter/setter code. Come on guys. Using an external tool is not the right solution.

When we compare the ages of these languages (C++ => 27 years old; Java => 15 years old; C# => 9 years old), it becomes obvious that C# is the only one that doesn’t suck at the concept of “Properties” and getter/setters in general. Perl made love constructs that respect the programmer’s time more than the compiler writer’s: you should too.

To anyone who wants to dare note that Java IDEs can often auto-generate getter/setters for you, and dares to call that better than language level support, I can only say this: you’re a fucking moron. Go look up an Abraham Lincoln quote about silence. Now if someone wants to be constructive and create another Java example equal to the C# example in the above listing, I’ll be happy to add it. Rules: it must be shorter than the existing Java example and use no subclassing, no beans, and no external programs or libraries. Be sure to note what Java version it requires. Cheers.

The ref and out keywords in C# are actually kind of oddities if you come from another mainstream language. In C it is not uncommon to pass a variable (perhaps more correctly a block of memory, if you think about it) to a function, and have the function modify the variable’s value instead of returning it.

    /* Common if not enjoyable idiom in C */
    if (!do_something("some", "params", pData)) {
        /* handle failure */
    }
    /* use pData */

In this case, pData is a pointer to some data type, likely a structure, to be filled out by the do_something function. The point is, it’s intended as a mutable parameter. In C/C++, it’s trivial to do this for any data type because of how pointers work. Java passes by value just like C and C++ do: you can modify non-primitive types because a reference is used, not the ‘actual’ value, making it more like a reference than a value type, in CS speak. C# does the same thing.

    // Java pass by value
    public void test() {
        int x=10;
        Java r = new Java();

        r.setProp(PropType.OldValue);
        mutate(x, r);
        // x = 10; r.prop = PropType.NewValue
    }

    public void mutate(int x, Java r) {
        x = 20;
        r.setProp(PropType.NewValue);
    }

Now a little fun with the self documenting ref keyword in C#:

    public void test() {
        int x = 10;
        var r = new CSharp();

        r.prop = PropType.OldValue;
        mutate(ref x, r);
        // x = 20; r.prop = PropType.NewValue
    }

    public void mutate(ref int x, CSharp r) {
        x = 20;
        r.prop = PropType.NewValue;
    }

The out/ref keywords are similar; the difference has to do with assignment (a ref argument must be assigned before the call, while an out argument must be assigned by the callee before it returns); RTFM. The important thing is that it is a compiler error to pass the data without the ref/out keywords at the call site. I’m enough of both a Python and a C++ programmer to enjoy that. This explicitness helps catch a few typos, and helps document that it’s meant to be passed by reference, not value. That’s good because a lot of programmers suck at documentation and some also blow at naming parameters. I think the contractual post/pre conditions in Spec# are a good thing too: they take writing the handlers away from the programmer, and spare the programmer from rewriting the flibbin’ things in prose somewhere in the documentation. Not to mention the classic “Oops” just waiting to happen in less DbC oriented syntaxes. Hate to say it, but the ref/out keywords’ presence in vanilla C# is likely due to Win32 API documentation conventions o/.

Where C# really rocks is in the CLI. Java has something good going for it: over the past 15 years the Java Virtual Machine (JVM) has been heavily tuned for performance. Mono and Hotspot also present quite an interesting set of options (that .NET lacks afaik). I assume that Microsoft’s implementation has been battle tested for performance as well.

The thing of that is, the JVM was originally designed to run Java: first and foremost, at the end of the day, that is what it had to do. The Common Language stuff on the other hand was intended to run several different languages. Admittedly, languages native to the CLI tend to be similar, but so are most languages in general. The interoperability between CLI languages is wonderful, and at least in native .NET languages tends to be “Natural” enough. By contrast, things crammed into JVM bytecode tend to become rather ugly IMHO, when it comes to interfacing with Java. I’m not sure if that’s due to the JVM or the language implementations I’ve seen; the changes coming in Java 7 make me guess it’s the former. The CLI is likely the next best thing to making a group of languages compile down to native code (for performance) and share some form of common ABI. Fat chance that will ever happen again. I’m sure I want to ponder about VMS, but the whole CLI thing tends to work quite nicely in practice. The performance cost is worth it for the reduction in headaches.

I’m sure that in terms of performance Java mops the floor with Mono in some areas, because of how much hacking has gone into making it a cash cow. That the C# compilers seem to run rings around the de facto standard Java compiler is what really catches my interest performance wise. Using the mono 2.4.4 and Java 1.6.0_18 compilers on my very modest system, mcs processes a minimal program about 30% faster than javac. In real operation it tends to kick ass. When you consider that each compiler is also implemented in the target language, Java really gets blown away. O.K. maybe I care more about compile times than many people; it’s the virtue of using an older machine :-P. Combine that with how many slow, buggy monstrosities have been written in Java, and I’ll salute C# first. Another plus is less “Our tools demand you do it THIS WAY” than what Sun threw at everyone. Piss on javac and company.

What has hurt C# in my opinion is the Microsoft connection. The thing with Novell doesn’t help either. That Java is not exactly an insanely popular language among hackers, so much as enterprises, is another. The things that have hurt Java: being so closed and being academia’s choice for stupefying students.

What’s the upside to using Java over C#? Easier access to Java libraries, (J2ME) mobile phones, and more finger exercise from all that needless typing! Beyond that it’s essentially a win-win in favour of C#.

The STAMAN Project: Phase II, version control

August 14, 2010 (updated October 8, 2023) by Terry Poulin

First things first: I created the project on my choice of hosting site, then prepped the local repository:


terry@dixie$ cd ~/proj;git init STAMAN; cd STAMAN; touch README
Initialized empty Git repository in /home/terry/proj/STAMAN/.git/
terry@dixie$ ls
README
terry@dixie$ git add README
terry@dixie$ git commit -m 'first commit'
[master (root-commit) f012d8e] first commit
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 README
terry@dixie$ git remote add origin path spec to the repo
terry@dixie$ git push origin master
Counting objects: 3, done.
Writing objects: 100% (3/3), 222 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To path spec to the repo
 * [new branch]      master -> master
terry@dixie$

In essence, create a new git repo in ~/proj/STAMAN, add a blank readme file and push it to the mirror. Simple.

If you’re not using version control, you’re brain damaged. It’s that simple.

I recommend Git and Mercurial, because I consider CVS, Subversion, and increasingly Bazaar as well, to be flawed models of doing version control. Git is what I use the most, so obviously I’m using it here :-P.

Much too tired to go into the sanity that is using version control, and using it rightly. Learn how to Google.

The STAMAN Project: Phase I, brain storming an outline

August 14, 2010 (updated October 8, 2023) by Terry Poulin

The first thing I did was sit down and spend about 5-10 minutes on brainstorming; here’s what I came up with on Monday:

terry@dixie$ cat ~/Dropbox/todo-structure.outline
todo's contain
        task short name
        task notes (raw data)
        location
        assoicated url's
        due date
        time estimate
        associated contacts?
        reminder preferences
        list
        project (tag)
        priority

Shebang! YAML would work WELL
terry@dixie$

I like to dive in when designing a program, try to get a good big picture understanding of it, and try to identify the lower level issues that might crop up. The latter gets easier with experience, particularly with your tools rather than the art/science. After writing that file, I took a few more minutes to focus on the implications of its contents.

The purpose of a task management program is obviously to manage tasks. It is fairly obvious that it is a fairly data centric program, so a good place to start is by thinking about what the data is. In this case, I took a couple of minutes to think about what represents a task: what data it reflects. The short name is provided as a convenience for listing tasks.

We can’t know for sure what sort of tasks will have to be managed, so what data will need to be attached should be kept abstract: it could be anything from a simple cat > notefile like stream of text to an uploaded doc or photo. The important thing is not shooting off a foot by making it restrictive. Since *I* am the principal user, I know the content will be quite variable. Excessively so, the more I utilise it.

Tagging a task with data like a location, associated URLs and contact info would likely be a good thing. You can easily imagine that going somewhere, talking to someone, or referencing a file off the web are all things that might go hand in hand with reviewing and completing a task.

Another frequent issue is keeping track of when xyz needs to get done, how often it needs to be done, how long it’s expected to take, and being able to set per-task preferences about the “Nag me about it” problem. Come to think of it, a way to note the task’s progress is a good idea too. Changes like these are one of the main reasons I want something custom, rather than continuing with my beloved RTM – more control over the tasking details.
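Pulling the outline and the points above together, a task record might be sketched like this. The Task class, its field names, and the two helper methods are guesses made for illustration (reminder preferences are left out entirely); nothing here is a committed design.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class Task:
    """One to-do item; field names are guesses drawn from the outline."""
    short_name: str
    notes: str = ""                       # raw data, deliberately unrestricted
    location: Optional[str] = None
    urls: list = field(default_factory=list)
    contacts: list = field(default_factory=list)
    due: Optional[date] = None
    estimate: Optional[timedelta] = None
    repeat_every: Optional[timedelta] = None
    progress: int = 0                     # percent complete
    list_name: str = "inbox"
    project: Optional[str] = None
    priority: int = 0

    def is_overdue(self, today):
        """True if the task is unfinished and its due date has passed."""
        return self.due is not None and self.progress < 100 and self.due < today

    def next_due(self):
        """Due date of the next occurrence, or None if the task doesn't repeat."""
        if self.due is None or self.repeat_every is None:
            return None
        return self.due + self.repeat_every
```

Keeping notes as an unrestricted string (or later, an attachment reference) is one way to avoid the restrictive-format trap mentioned earlier.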

Keeping a flexible outline of the project helps you identify spots to grow and/or change it ‘in flight’, just like that realisation about progress tracking. Of course that assumes you will have time to think about the project, not just write its code like a drone.

Next up to the plate is the issue of organising tasks. I’ve got so much shit piled into RTM that I have to periodically triage my task lists, almost like sorting them into a Trove. Notions of lists, priorities, and “Projects” are useful in order to more easily create ad-hoc hierarchical lists based on such criteria. This is somewhat analogous to what’s possible using SQL SELECT statements and JOIN clauses. Database normalisation can actually be a good thing to learn about.
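As a concrete illustration of that analogy, the sketch below builds a project-to-tasks hierarchy with a SELECT and a JOIN, using Python’s bundled sqlite3 module. The two-table layout and the sample rows are hypothetical, chosen only to show the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalised layout: projects live in their own table.
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tasks (
        id         INTEGER PRIMARY KEY,
        short_name TEXT NOT NULL,
        priority   INTEGER DEFAULT 0,
        project_id INTEGER REFERENCES projects(id)
    );
    INSERT INTO projects (id, name) VALUES (1, 'STAMAN'), (2, 'House');
    INSERT INTO tasks (short_name, priority, project_id) VALUES
        ('pick storage format', 3, 1),
        ('fix gutter',          2, 2),
        ('write schema',        1, 1);
""")


def tasks_by_project(conn):
    """Ad-hoc hierarchy: project name -> task names, highest priority first."""
    rows = conn.execute("""
        SELECT p.name, t.short_name
        FROM tasks AS t
        JOIN projects AS p ON p.id = t.project_id
        ORDER BY p.name, t.priority DESC
    """)
    grouped = {}
    for project, task in rows:
        grouped.setdefault(project, []).append(task)
    return grouped
```

Changing the ORDER BY or adding a WHERE clause gives a different ad-hoc list without touching the stored data.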

SQL is not a general purpose programming language; rather, it targets the narrower domain of querying and manipulating rows and tables in a database. Although less needed around non web applications, knowing SQL is worth it, much like the concept of relational algebra in general. I have mentioned Structured Query Language here because it’s a useful train of thought to explore. Take some time and ponder the possible formats, and what the code to manipulate them might look like.

A serious portion of programming is about solving problems; that’s what we use our languages for. If changing the rules makes solving the problem easier, that’s what we do. Knowing about various tidbits like declarative languages is a valuable tool, if you remember to program more like Captain Kirk instead of a dry text book. Don’t bend yourself to the language…. bend the language to your problem, or find a tool or architecture that can help fill the gap.

Data storage formats are potentially a lengthy issue, so I’ll go into that later.

August 14, 2010 (updated October 8, 2023) by Terry Poulin

With how often people have solicited my advice/opinions on programming matters and CS ed, I’ve been thinking about exposing the craft behind a project, and posting it as a journal entry. Well, the way my mother monkey wrenches things, I rarely have the time, brain, and inclination all at once to focus on detailing that much.

So instead, it’s probably best that I either decide never to go into the details of creating a program, or just stream it through more haphazardly across my journal. I’ll take a crack at the latter, since I would like to work on the program.

One of the things on my “To roll own someday” list is replacing my Remember the Milk (RTM) with a local solution. The perks being that I can make custom changes without having to get hired by RTM o/, as well as integrate it with my workflow more naturally. It’s also a good series of mental exercises.

Since I’m not good at naming things, I’ll just call it STAMAN: Spidey01’s TAsk MANager. Which uniquely isn’t far off from Stamina, exactly the rate limiting factor for getting shit done. Especially under my living conditions.