Special thanks to Carpet Smoker of Daemon Forums.
Programming
I guess an interview with the creator of C++, is worth a showing of the slashdot effect, especially when it involves C++0x?
What I want to know is when the flubber every major compiler will support C++0x tolerably, and whether it will happen before 2020 rolls around lol.
*yawn* it’s a lazy wazy afternoon, and I feel somewhere between taking a siesta and getting into some code, before I go back to work tomorrow lol.
Spent some time screwing with my phone, GPS, updated maps, blah, blah. Personally I think that a touch screen is better than a mouse, worse than a keyboard; although interestingly swype seems to be handy with a terminal emulator! Maybe on a much larger display, an on screen keyboard might cut it for general typing needs. Also put in a bit of time seeing what level of integration I can get between Android and my `usual` work flow. So/so.
What I really would like? Is a CRUD interface to ‘everything’, that I could use from my unix shell, a gui, or perhaps, even my phone. Preferably I would like something Mail like in terms of interface. Something that could roll news feeds, e-mail, task management, calendaring, facebook, etc all into one thing. Like a big data funnel. On my weekends off, I’ve been grokking around for libraries that would help interface with the services I use, like my calendar: something increasingly important ;).
On the other side is the issue of what language is ideal for the mixture, and how many weekends it would take to get something ‘useful’. Honestly, I like dynamic languages for the features. JavaScript and Lisp will never be pretty to look at but they really rock. By contrast, more static languages make it easier to enforce type checking ahead of time but lack sexy features. I rather prefer it when the compiler can tell me, “Oops” before I execute a program, but I’d rather not have to code my way around the language’s artificial limits either. Hmm, can’t have everything I guess.
So, to be perfectly honest, I have no idea what I’m doing for the next few hours lol.
*groan* today is time for the drop test: dropping things on MSVC’s foot and hoping I don’t feel like drop kicking it ^_^.
My life would probably be a lot easier if I just rigged a cross compiler and gave up on using a platform’s more ‘native’ compilers, i.e. Visual C++ on Windows. It’s about as much a native compiler as the platform has, and I still consider it insulting that a C compiler isn’t provided as part of a Windows install lol.
Post dinner notes
One value that this comment has served, along with my compiler, is to teach me that default template arguments on a *function* template are actually a C++0x feature, and apparently not part of C++03 as I have assumed for ages.
template <typename Num=int> Num square(Num n) { return n * n; }
int main() { return square(5); }
// g++ 4.5.1 reports:
//
// error: default template arguments may not be used in function
// templates without -std=c++0x or -std=gnu++0x
Somehow, I always figured if you could do it on template classes it would be legal on template functions as well. Guess that’s what I get for never reading the ISO C++ 2003 standard, and it having been a number of years since I last RTFM so to speak o/. About 95% of the time I use templates, it’s at the class level anyway, and most queries I have on C++ are easily solved by reading a header file. To be honest, while the drafts for C99 make for a very nice read, I’ve always worried that digging into the C++ standards might melt my brain or put me soundly to sleep. Frankly, C standards are helpful alongside SUS, but the standards for C++, well, all I can say is that I’ll be happy when 0x becomes as omnipresent as C89 capable C compilers. I’m just not sure I will live that long though :-(.
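For what it’s worth, here is a minimal sketch of one way to get a similar effect while staying inside C++03. It leans on the fact that class templates may carry default template arguments, and that a function template usually deduces its type from the call anyway. The helper name Square is made up for illustration; it is not from any library.

```cpp
// C++03-friendly sketch. Square is a hypothetical helper name.
// Class templates are allowed default template arguments in C++03,
// so the default (int) lives on the struct instead of the function.
template <typename Num = int>
struct Square {
    static Num apply(Num n) { return n * n; }
};

// The function template needs no default: Num is deduced from the
// argument at the call site, e.g. square(5) deduces Num = int.
template <typename Num>
Num square(Num n)
{
    return Square<Num>::apply(n);
}
```

So square(5) works as before, and Square<>::apply(4) picks up the int default when there is nothing to deduce from.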
What I’ve done is implement Engine::Call() as a sequence of template functions that use the vaguely normal method overloading semantics to cope with the arguments through Push(). Nothing else has been able to maintain the desired level of type safety. Basically, if Call gets 0 arguments it uses the current class state, with successive versions of Call(funcname, …) doing the obvious manipulations to generate the correct state and then run it. I would rather have a recursive variadic template, but life sucks and C++0x isn’t everywhere just yet. Having to support late model GCC 3.x is a real possibility for Cassius to bear fruit without thinning a few hedges.
Push() is just done as a set of virtual member functions for logistical reasons. Since each Engine instance is tagged with its language and implementation details (as an insurance policy), the same effect could be achieved through a set of template specializations and a little static_cast magic to forward it to the correct subclass without any inheritance tricks. At the price of a more stringent contract with subclasses, that would also allow for getting rid of other virtual members in the base class. I’m not sure what instructions a C++ compiler is likely to generate for a static_cast from parent to child class compared to what C++ virtual methods and casting things in C land will produce, but I couldn’t care less about the cost of using virtual Push()’s versus templated Push()’s here: Call() is the one that matters at the template level. Why virtuals are used for Push(), in point of fact, is to avoid a build time dependency against the backends (to be loaded at runtime), which obviously would be a circular dependency. So templated Call(), virtual Push() works more conveniently than a purely template driven solution.
Being sublimely lazy, rather than write out a set of similar template functions for each version of Call() that’s needed to handle these successive amounts of arguments, I just wrote a Lua script to do it for me: then shoved that into the project’s build scripts and adjusted the headers to include that auto generated file as needed. One of my favourite Boost libraries actually does something similar for handling a variable number of arguments for a functor type class, but I’ll be damned if I’m writing them by hand!
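To give a feel for the shape of the thing, here is a rough sketch of what such a family of overloads might look like. This is not the actual Cassius code: the Engine below is a hypothetical stand-in that merely records what was pushed into a log vector, and the "func:"/"call" entries are invented so the behaviour can be observed. Only the one and two argument versions are shown; a generator script would stamp out as many more as needed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an engine: virtual Push() overloads that a
// backend could override, plus templated Call() overloads that forward
// each argument to whichever Push() the compiler selects.
class Engine {
  public:
    std::vector<std::string> log;  // illustration only: records each step

    virtual ~Engine() {}

    virtual void Push(int v)         { Record("int:", v); }
    virtual void Push(double v)      { Record("double:", v); }
    virtual void Push(const char *v) { Record("str:", v); }

    // Zero arguments: run with whatever state has been pushed so far.
    void Call() { log.push_back("call"); }

    void Call(const char *func)
    {
        log.push_back(std::string("func:") + func);
        Call();
    }

    // The kind of overloads a script could generate, one per arity.
    template <typename A1>
    void Call(const char *func, A1 a1)
    {
        log.push_back(std::string("func:") + func);
        Push(a1);
        Call();
    }

    template <typename A1, typename A2>
    void Call(const char *func, A1 a1, A2 a2)
    {
        log.push_back(std::string("func:") + func);
        Push(a1);
        Push(a2);
        Call();
    }

  private:
    template <typename T>
    void Record(const char *tag, T value)
    {
        std::ostringstream out;
        out << tag << value;
        log.push_back(out.str());
    }
};
```

With this, e.Call("f", "header message", 2) pushes a string and then an int, each through the overload matching its static type, which is the type safety that a format string cannot offer.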
Lately I’ve been relying on a mixture of rake (Ruby) and premake (Lua) based build systems, so it is rather nice to be able to do such things: and not kick myself later (make, vcbuild).
Currently Lua and Python are in the works, because Lua’s API is different from what I’m used to, and Python is well, not fun to embed but simple enough. Conceptually there is no major difference between a parameter list and a stack, it’s just a sequence of data at heart, and Python functions basically use sequence objects.
What would be awesome, if the calls are defined in terms of stack manipulation, is to create a template method called Push that uses template specialisation in order to wrangle plain old data types and callables to the right scripting language types, so we would have something like this:
e->Run(SourceString("function f(a, b) print(a); return b * b end"));
e->Push(/* instance of some type representing f() */);
e->Push("header message");
e->Push(2);
e->Call();
and rely on the compiler to Get It Right in figuring out the relevant overloads, e.g. Push
That’s a piece of cake thanks to C++ allowing the abuse of inheritance and casting:
/* An example */
#include <iostream>

template<class Impl> class Base {
  public:
    template <typename T> void Push(T arg)
    {
        static_cast<Impl *>(this)->Push(arg);
    }
};

// implementation; public inheritance so the static_cast in Base is legal
class Impl : public Base<Impl>
{
  public:
    template <typename T> void Push(T arg);
};

template <> void Impl::Push (int arg)
{
    std::cout << arg << std::endl;
}

template <> void Impl::Push (const char *arg)
{
    std::cout << arg << std::endl;
}

// subclass that just overrides one method
class X : public Impl {
  public:
    template <typename T> void Push(T arg);
};

template <> void X::Push (int arg)
{
    std::cout << arg * arg << std::endl;
}

// the string version simply forwards to the parent implementation
template <> void X::Push (const char *arg)
{
    static_cast<Impl *>(this)->Push(arg);
}

int main()
{
    {
        Impl test;
        test.Push(2); // 2
        test.Push("string");
    }
    {
        X test;
        test.Push(2); // 4
        test.Push("string");
    }
    return 0;
}
Because the parent class is templated against the derived class, it’s possible to get jiggy with it at compile time. Namely enough is known by the parent about the child to invoke the correct method. Where it becomes somewhat annoying is when you want to continue with subclasses, like X in the above example.
A year or two ago I learned that people call this the “Curiously Recurring Template Pattern (CRTP)”. Being lazy, I just think of it as the poor man’s way of doing something similar to what “virtual template <…> …” would logically imply, if only the effing compiler was that smart. For what I need, just mating instance method overloading with virtual method calls is good enough.
Now all that is trivial, the real gripe however is how do you properly make a “Convenience” method, let’s say one we can do like e->Call(/* variable number of parameters */); and have it do the appropriate magic for us based on the type.
Well, sadly we can’t so easily. To use a va_list, there has to be a way to access the type of the argument. Normally this is done the same way that printf() and scanf() work in C, taking a format string saying what type to cast each parameter to. Python’s embedding API actually does this to convert from C data types to Python objects. Someday I need to open up a C library and look at how va_arg() actually works; I’ve always assumed it’s some sort of hack around a block of memory and type casting. It’s trivial to implement that kind of thing, and I have already done it for testing purposes (rather than templated Push()’s), but using a format string to describe the specifiers breaks down on type safety, whereas at least with the template thing, the compiler can help some.
We can’t rely on Push() overloads to do the right thing because va_arg() is needed to access the arguments if Call() takes a variable number of arguments in the C++ 2003 compliant way. Obviously the easiest solution is to find a smarter way of doing va_arg(a_va_list, a_type). Life would be a piece of cake with templates that can take a variable number of parameters, right? Well, there are few vendors out there who seem to know how to do that <_>.
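A small sketch of the format string approach makes the trade-off concrete. Everything here is invented for illustration: push_formatted() and its one-letter specifiers ('i' for int, 'd' for double, 's' for a C string) are assumptions, not anyone’s real API, and the function just returns a description of what it would have pushed.

```cpp
#include <cassert>
#include <cstdarg>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: walks a format string and pulls one argument per
// specifier off the va_list. 'i' = int, 'd' = double, 's' = const char *.
std::vector<std::string> push_formatted(const char *fmt, ...)
{
    assert(fmt != 0);

    std::vector<std::string> pushed;
    va_list ap;
    va_start(ap, fmt);

    for (const char *p = fmt; *p != '\0'; ++p) {
        std::ostringstream out;
        switch (*p) {
        case 'i': out << "int:" << va_arg(ap, int); break;
        case 'd': out << "double:" << va_arg(ap, double); break;
        case 's': out << "str:" << va_arg(ap, const char *); break;
        default:  out << "unknown:" << *p; break;  // no argument consumed
        }
        pushed.push_back(out.str());
    }

    va_end(ap);
    return pushed;
}
```

Calling push_formatted("si", "header message", 2) behaves, but swap the specifiers to "is" with the same arguments and the compiler stays silent while va_arg() reads the wrong types, which is undefined behaviour. That is the type safety hole being complained about.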
So how the fuck do you do a smart va_arg() like behaviour? The only thing I can think of at the moment is to make them all the same templated type, so it’s known how to cast them; then try and work some sort of char_traits<> like magic to figure out which Push() is appropriate but binding the necessary info creates more of the same. That and looking up in the C++ standard how many template parameters (if any) compilers are required to allow, and generating every possible permutation of arguments using a script to make the necessary template code before calling the compiler.
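For comparison, here is roughly what the wished-for version could look like on a compiler that does support C++0x variadic templates. It is a sketch only: the Engine is a hypothetical stand-in that logs what it was given, not the real class, and it needs something like -std=c++0x or newer to compile.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical engine, C++0x/C++11 only: Call() peels off one argument at
// a time, hands it to the matching Push() overload, then recurses on the
// rest until the zero argument Call() ends the chain.
class Engine {
  public:
    std::vector<std::string> log;  // illustration only

    void Push(int v)
    {
        std::ostringstream out;
        out << "int:" << v;
        log.push_back(out.str());
    }

    void Push(const char *v) { log.push_back(std::string("str:") + v); }

    // Base case: nothing left to push, so run with the current state.
    void Call() { log.push_back("call"); }

    // Recursive case: the overload for Push() is resolved per argument,
    // at compile time, so a type with no Push() is a compile error.
    template <typename First, typename... Rest>
    void Call(First first, Rest... rest)
    {
        Push(first);
        Call(rest...);
    }
};
```

Here e.Call("header message", 2) expands into Push(const char *), then Push(int), then the plain Call(), with no format string and no va_arg() in sight.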
Either way, I’m just taking a break for a few minutes to enjoy how peaceful the quiet has been for the last twenty minutes lol.
Despite being interrupted almost every five to fifteen minutes, I managed to get the backends for embedding Lua and Python sorted. Today I would like to start getting into making it useful for something besides evaluating scripts.
Cassius needs to allow two things in order to be useful to me: invoking the embedded languages procedures from C++, and a way to export code to the embedded language. My interest, is whether or not it’s actually possible to accomplish that using fairly standard C++. I’m kind of hoping, to see just how far that can be pushed.
From experience, I’ve learned that you can expect something vaguely C89 compliant anywhere in the world but expecting C++ compilers to agree on all things template related, can be like asking a goldfish to walk on air – a bad idea! That’s why I rarely do more with templates than I have to. With how much compilers have changed in the last four years, I reckon it’s time.
Out of everything traditional C++ offers, most of it is just substandard compared to newer languages. A lot of the code I’ve read over the years, I would hardly even count as C++ so much as C with classes, but people have developed reasons I guess. IMHO how C++ can interface with C code is a killer feature, but one that could be just as readily solved by adapting a C compiler to generate JNI glue code or some shit like that. Throwing on inheritance based OOP isn’t that killer in my books, when you look at languages like Ruby and Python. The real killer feature of C++ is what you can do with templates. While supporting simple generics is part of it, that could be done in C by (abusing) the preprocessor and adjusting your Makefile. It’s the opportunities to get creative at compile time that make it worthwhile; someday I really should see if any good books have been written on TMP in C++.
The way I look at it, macros make Common Lisp stand out from its younger peers, while C++ templates make you drool, or curse compilers more frequently lol. Leveraging languages is why more than one programming language should exist.
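The classic toy example of getting creative at compile time, in plain C++03, is computing a factorial with nothing but template instantiation. The names below are the usual textbook ones rather than anything from a library, and the typedef trick is just one old-school way to prove the value exists before the program ever runs.

```cpp
// Compile-time factorial: each instantiation multiplies N by the value of
// the next one down, and the specialization for 0 stops the recursion.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// C++03 style static check: the array size is negative (a compile error)
// unless the compiler itself worked out that 5! is 120.
typedef char five_factorial_is_120[Factorial<5>::value == 120 ? 1 : -1];
```

The preprocessor can fake generics, but it cannot do this kind of recursion with types and constants, which is where templates start to earn their keep.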
Relaxation time
One thing spending almost my entire life around computers has taught me, is that rarely is anything impossible, so much as it may just be a pain in the ass to get done. For R&R, my interest is in exploring whether or not a wrapper around scripting languages can really work without heavy introspection or SWIG style code generation.
Principally, embedding a scripting language amounts to initializing it, feeding it with code, and stitching together an interface between it and the desired parts of your C/C++ code. In my experience most time is spent on writing code to bridge C with the script language. It’s kind of like an adaptor for calling conventions, but in C rather than native code.
The question that interests me is whether or not C++’s standard issue functors and binders are good enough that it could be done without having to cuddle up too much to a specific scripting engine’s embedding API, for each one being embedded. In most dynamic languages the manipulations needed are pretty trivial, but C++ is rather more traditional. Because of that curiosity, I’ve had an idea on my mind for ages, which I dub “Cassius”. The idea is to have an interface that knows how to embed several scripting languages, and use that to interface with the scripting languages in a way more agnostic to which scripting language you’re using.
I thought of the name as a reference both to the Roman name and to Cassius Clay, better known as Muhammad Ali, because after adding support for embedding a few scripting languages, it might very well knock me out ^_^. The part that I’m not sure of is whether or not it is technically possible to do it in straight C++, or if the APIs would require more than is possible without using something like SWIG.
Sometimes for a change, it’s fun to ride an idea to the end of a tunnel without trying to speculate what’s waiting there.
Random codeness
Been contemplating a few things that are, arguably, the programmer’s equivalent to several mortal sins. One of these involves standardising my world around a given language setup. Yes, choosing the best tools for the job rather than the same tools can sometimes be troublesome.
The languages I’m considering are C++ and C#. Python would be a good candidate except that I’ve written waay too many lines of Python over the years lol. Behind the C++ factor is, simply put, that Richard Gabriel was correct when he said “The good news is that in 1995 we will have a good operating system and programming language; the bad news is that they will be Unix and C++”. Frankly programming in C++ is a bitch. It’s not so much the language, which has plenty of warts, as it is that building projects causes headaches. Most of which is a mixture of complexity and the inability of people to manage that complexity before shipping it. The other factor is C#. I’ve come to rather like C#, because it takes the best part of languages like Java, i.e. using bytecode rather than native code, but unlike Java 6, C# 3 and up is actually a modern fucking language. Java can kiss my rebel dick. It’s retarded.
C++ gives painless (as possible) support to C code, while adding some goodies: automatic ctor/dtor invocation, formal namespace semantics, semi-generic data structures, and often disabled or unused support for exceptions and runtime type information. There are also a lot of libraries written in C++ that are less than easy to use in other languages; the fact that many are often less than easy to use in C++ is beside the point, obviously ^_^.
C# is more convenient than C++, because of a more modern syntax (Java can really fuck a duck for all the modernness of it), and because it has the ultimate in language killers, which C++ lacks: a big fucking library. Whereas C++ provides stream based I/O, container based data structures, and not much else beyond your system’s C library, C# has a large cross-language class library, which essentially throws networking, XML, and basic graphics on top of that, plus a much more portable interface to system stuff, like the Process class and the file system code.
That is the big killer: libraries that are easily incorporated. C++ lacks that. In fact, about the closest you can get is throwing in Boost, POCO, Qt, or Wx. Plus a few other odds and ends.
C# is a much more pleasant language to work in and takes the pain out of compiling projects, because it really can’t get much harder than which defines to set and which files to compile into what. Life would be fucking great if C++ could say the same, even on a single platform group. Unlike Java, it’s also possible to build C# projects promptly and trivially combine code with many other languages.
C++ however has a much wider range of libraries readily available without needing glue code, if one can stand the bitch and a half of making them work o/.
The other day I was thinking about a young semi-student programmer that I know, and thought about presenting him a small set of “Teeth cutting” exercises. Small tasks that would serve a double purpose: help me evaluate his present aptitude for development tasks, and try to prepare him a wee bit for what his future education is likely to throw out. Unlike what seems to be the most common norm in college environments, I can also gently push in more, ahem, practical directions than what most students I’ve met have learned. I still have yet to find out if the number of stupid programmers on earth is due to the schooling or the students. Alas, that’s drifting off topic.
When I stopped thinking about the whole teeth cutting thing, I had done so because no ideas of what to use as a starting exercise had come to mind. Today while chatting, one did: a bare bones version of the UNIX tail program.
(06:30:27 PM) Spidey01: A first exercise:
language: your choice
description: implement a program called ‘tail’ that displays the last N lines of a file, where N is supplied by the user. It need not be a GUI, but can be if you wish.
goals:
A/ Minimise the scope your variables are accessible from.
B/ Describe the procedure (algorithm) you came up with for finding the last N lines in the file.
C/ Think and discuss, is there a way to improve on your algorithm?
Tail is complex enough that some C implementations are horrendously overcomplicated, yet simple enough that it is easily completed without a gruelling mental challenge, especially if the -n option is the only one you care about. Goal A was chosen because it’s a very common foul-up among programmers, young and old alike.
I wrote a more complex program than that in C years ago as a learning process, which was more or less a fusion of the unix cat, head, and tail programs. Since the student in question was using Visual Basic .NET (oi), I opted to use C# so as to keep things at least in the same runtime family. Here is a listing of the example code I wrote; the display here was done by feeding it into gvim and using :TOhtml to get syntax highlighted HTML to post here, then clipping a few things, hehe. The gvim theme is github.
/**
 * comments having // style, are notes to young readers.
 *
 * CAVEATS:
 *   line numbers are represented by int, and thus have a size limit imposed by
 *   the 32-bit integer representation of the CLR. Whether the user's computer
 *   will run out of memory before that is irrelevant.
 *
 *   If there are fewer lines read than requested by the user, all lines are
 *   displayed without error message. I chose this because the error message
 *   would be more annoying than useful.
 */

using System;
using System.IO;
using System.Collections.Generic;

class Tail {
    enum ExitCode { // overkill
        Success=0,
        Failure=1,
        NotFound=127,
    }

    static void Main(string[] args) {
        if (args.Length != 2) {
            usage();
        }

        using (var s = new StreamReader(args[1])) {
            try {
                var n = Convert.ToInt32(args[0]);
                foreach (var line in tail(n, s)) {
                    Console.WriteLine(line);
                }
            } catch (FormatException) {
                die(ExitCode.Failure, args[0] + " is not a usable line number");
            } catch (OverflowException) {
                die(ExitCode.Failure, args[0] + " too big a number!");
            }
        }
    }

    static void usage() {
        Console.WriteLine("usage: tail.exe number file");
        Console.WriteLine("number = number of lines to display from "
                          + "end of file");
        Console.WriteLine("file = file to read from tail");
        Environment.Exit((int)ExitCode.Success);
    }

    // Instead of doing the display work itself, returns a sequence of lines
    // to be displayed. This means this function could be easily used to fill
    // in a textbox in a GUI.
    //
    // It could also take a delegate object to do the display work, thus
    // improving runtime performance but that would be less flexible. In this
    // particular program's case, just doing Console.WriteLine() itself would
    // be OK. See the foreach loop over tail() up in Main() for reference.
    //
    // This method also sucks up memory like a filthy whore because it stores
    // the whole file in memory as an IList<T>. That's fine for a quick and
    // dirty prototype. In real life, this should use a string[] array of length
    // 'n' and only store that many lines. That way it could handle files 5
    // billion lines long just as efficiently as files 5 lines long.
    //
    // I chose not to make that change in this example, in order to make the
    // code as simple to read as possible.
    //
    // Incremental development + code review = good idea.
    //
    static IEnumerable<string> tail(int n, TextReader s) {
        string line;
        var list = new List<string>();

        try {
            while ((line = s.ReadLine()) != null) {
                list.Add(line);
            }
        } catch (OutOfMemoryException) {
            die(ExitCode.Failure, "out of memory");
        } catch (IOException) {
            die(ExitCode.Failure, "error reading from file");
        }

        if (n > list.Count) { // a smart bounds check!
            n = list.Count;
        }

        // implications of a GetRange() using a shallow copy rather than a
        // deep copy, are left as an exercise to the reader.
        return list.GetRange(list.Count - n, n);
    }

    static void die(ExitCode e, string message) {
        Console.Error.WriteLine("tail.exe: " + message);
        Environment.Exit((int)e);
    }
}
It was about this time that I decided that implementing a simple method like Perl’s die() or BSD’s err() would be convenient. Thus I implemented die() and replaced the repetitive error code. Functions are almost like a reusable template in that way. Then I decided that ExitCode was a better name for the enumeration than ErrorCodes, since it was being used more generally as an exit status (code) than an error report; unlike Microsoft I do not consider Success to be an error code ;). That was a simple global search and replace, or :%s/ErrorCodes/ExitCode/g in vim. Followed by a quick write (save) and recompile to test. Job done.
While I was at it, I also had an intentional bug encoded into the exception handlers for Convert: originally the n variable was in a higher scope than the Convert (the using block instead of the try block). The error message for handling FormatException used n.ToString() and the one for OverflowException used args[0]. The bug here was a subtle food for thought: one displays the result of the conversion, which might not match what the user supplied, thus confusing the user. The other displayed what the user entered, which might not be what the program had tried to use. That also pushes an interesting thought on your stack: since the same data is used by both die()’s, why do we have to write and maintain it twice? Alas, I realised the n variable was in too wide a scope and thus made that mind-play a moot point (by removing n from the scope of the catch statements). If you recall: using minimal scope for variables was actually the intent of the exercise, not error handling and code reuse.
Next I focused on implementing tail(). At first it was simple: just take a number and a StreamReader, and do a little loop over reading lines, for a quick test. When I checked the documentation on MSDN, I noticed that StreamReader is an implementation of TextReader rather than its base class. I always find that weird, but that’s outside the scope of this journal entry. Thus I made the using statement in Main() create a StreamReader and pass it to tail(), now taking a TextReader. Originally it also had a void return type, and simply printed out its data. I did that to make testing easier. The comments above make a sufficient explanation of why IEnumerable
The heart of it, of course, is just feeding lines from a file into a generic List of strings. Since the exceptional possibilities are more straightforward, I wrote the catch blocks first. After that it is merely a question of extracting the correct lines from the tail end of the list. That’s a simple one to one (1:1) abstraction of how you might do it manually. I believe simple is the best way to make a prototype. Since the student in question was joking about how his implementation would likely crash if the line numbers were out of whack from what’s really in the file, I was sure to include a simple check. If the # of lines requested is greater than what really is there, just scale down. Voila. The comments at the top of the listing above show why there is no error message displayed.
Extracting the items was a bit more of a question. My first implementation was a simple C-style for loop over the list using Console.WriteLine(). In the conversion to returning the data to be displayed, the tail() call in Main() became the above foreach loop. I added the comment about GetRange() more so as food for thought (from a code reuse and optimization perspective). The math needed to extract the correct range of lines is trivial.
I then took a few moments to look things over, doing a sort of code review. A few things were rearranged for clarity. I also introduced a bug, breaking the specification goals. If you look close enough at tail(), you will see that the variable line is only used inside the try block, yet it is declared at method scope. The #1 goal of the exercise was to avoid such things, hehe. I also thought about adjusting things to use an n sized cache of lines, rather than slurping the entire file into memory, but decided against it. To keep the code easier to read, since the target-reader knows neither C# nor a lot of programming stuff, I just left comments noting the pros and cons of the matter.
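For the curious, the n sized cache idea is small enough to sketch. This one is in C++ rather than C#, and uses a std::deque as a sliding window instead of the string[] array mentioned in the listing’s comments, so treat it as one possible take rather than the change I actually had in mind. The name tail_of_text is just a made-up convenience wrapper for trying it on an in-memory string.

```cpp
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Keeps at most n lines in memory: each new line is appended, and once the
// window grows past n the oldest line is dropped from the front.
std::deque<std::string> tail(std::size_t n, std::istream &in)
{
    std::deque<std::string> window;
    std::string line;

    while (std::getline(in, line)) {
        window.push_back(line);
        if (window.size() > n) {
            window.pop_front();
        }
    }
    return window;
}

// Hypothetical convenience wrapper for feeding it text directly.
std::deque<std::string> tail_of_text(std::size_t n, const std::string &text)
{
    std::istringstream in(text);
    return tail(n, in);
}
```

Memory use now depends on n and the length of the lines, not on the size of the file, and asking for more lines than exist simply returns everything, same as the bounds check in the C# version.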
Some people might find the method naming style odd for C#, but it’s one that I’ve come to like, thanks to Go and C#. I.e. publicly exposed functions get NamesLikeThis and those that ain’t, get namesLikeThis. Although personally I prefer C style names_like_this, aesthetically speaking.
The test file I used during the run was this:
line one
line two
line three
line four
line five
and most tests were done using various adjustments on:
terry@dixie$ gmcs tail.cs && mono tail.exe 2 test.txt
After sending the files over, I also whipped up a Visual Studio solution in MonoDevelop, and then realised that I had left a rather unprofessional bug. If the filename in args[1] didn’t exist, the program would crash. That was easily fixed on the fly.
Overall the program took about an hour and a half to write. For such a simple program, that’s actually kind of a scar on my pride lol. But hey, I’ve barely written any code this month and I had to look up most of the system library calls in MSDN as I went along, I also tried to make it more polished than your typical example code. Which usually smells.
I can also think of a few ways to incrementally adapt that first exercise into several other exercises. That might be useful.