Category Archives: c#

Language design

When talking to people about the benefits of clojure often people point out that most modern languages have evolved to “support” the functional paradigm with lambda functions. The argument is that one can stay in the familiar safe environment of imperative programming and use the functional constructs when they fit the problem better. That is a very valid and good strategy but I’ll show in the following that it brings a whole lot of accidental complexity with it. Some of this is specific to the way .NET is implemented and some are a clash of the functional paradigm with the imperiative.

Exhibit 1:

My favorite example is what has been known in the office as the lambda bug. Coming from a background of having been coding in C++ for a couple of years, we switched to python in the last semesters of University. The lambda bug manifests itself when one combines two classical constructs of two different programming models: for-loop iteration and closures:

foreach (var i in list)
save_callback_function_for_later_use(x => System.Console.WriteLine(i));

Given that list contains the numbers 1,2,3,4,5 guess what the code will print when all the callback functions are called?

Exhibit 2:

In C# events seems to have been programmed to a pre-functional model and never updated when they added functional constructs. The interface for an event is += for adding events and -= for removing. This is all fine for point-and-click GUI programming. Meaning it works good on something like event += somefunc; and not event += delegate { use_the_power_of_closures_Luke ) 🙁

To add injury to insult, events doesn’t even support something like clear().

Exhibit 3:

In clojure all the seq library is lazy. Thus once one has figured that out (I must say it took a little while for me), everything behaves as would be expected. In C# some things are lazy (linq) while others are not. Imagine list contains 1,2,3. Then the following works:

var changed = list.ConvertAll(x => x * 2);
list.Clear();
list.Add(something)
list.AddRange(rest);

But the following doesn’t work because it behaves lazy:

var rest = observablelist.where(x => x != 1);
observablelist.Clear();
observablelist.Add(something)
observablelist.AddRange(rest);

Try guessing what the outcome of running the code above will be. I’ll confess that it was different from what I expected it to be.

Again it’s mixing two styles of programming, functional lazy code with imperative mutable objects.

Finally, my argument is that a language with clean design principles, even with relatively steep learning curve, far outweights the complexity of industry standard languages in the long run.

The problem with static typing in programming languages

I remember reading in Paul Grahams excellent hackers and painters book, about how he liked the language he worked in to be dynamically typed as it provided him with more flexibility. The first time I read it I was still very much entrenched in the bondage and discipline language C++, and as so I found the statement wrong, but at the same time interesting. There had to be some deeper insight into this. A couple of years later, which would be the present day it just dawned on me that the biggest problem with statically typed languages are not the typing, you can get used to that. Its rather the fact that it becomes very hard to refactor the code afterwards. And worse, it shifts the burden of figuring out the right data structures and the right types to the beginning of the coding process instead of afterwards. Thus going directly against the good programming practice of writing for clarity first and only rewriting code if it has been deemed inefficient by a profiler.

Python is dynamically typed and after spending a year programming (this was about two years ago) in it I still found the fact that the types where missing to be quite disturbing. Especially the fact that you could have a branch in your program that had a small spelling error and only failing after you had run the program for several hours maybe. Or worse shipped it off to the customer. I realize now that this was mostly contributed to the fact that I was still coding Python as if I was coding c++. Something that the last two programming books (common lisp and programming erlang) I have been reading really have put to light.

There are tricks you can use in statically typed languages that can make the program easier to refactor later on. The var symbol introduced in C# 3.0, using typedefs in C++ and actually the whole standard library in C++ has had this covered quite nicely with the use of iterators. Still I think that maybe something like the duck types in boo brings about some interesting mix of the two styles. My very limited experiences with the boo so far has not been enough to determine if they the implicit type system does more harm that good. I sometimes find myself cursing over the fact that the system can’t detect the types for me, and at other times are happy that the system has found some trivial errors for me for free.

Improving latency while keeping your sanity

We had this problem in Nemo 0.2.0 that it would use quite a lot of cpu while it was indexing. I kind of knew this could happen but hoped that a transaction optimization to the database could improve the performance enough to make this problem irrelevant. Sadly it did not. The indexing code works by running an event loop in a single thread and then having two queues, one for tasks that needs to be completed fast and one for background tasks. Each of the queues are basically just a list of function pointers that is popped when a new task need to run.

The nice property of the indexing code is that it would put each task into simple function that one could read from start to end. The same property that is usually associated with programming using threads compared to programming using events where the code has to be broken into several parts. I feared that I had to go the event way and break the indexing up into smaller functions and do a lot of tedious book keeping to make sure the code could be resumed.

After thinking about the problem for a little while it dawned on me that this is the a perfect case for yielding. Yielding allows you to suspend the execution of a function and return immediately which is perfect for this problem since the runtime will automatically do all the book keeping for you. Another really nice thing is that it turns out that converting the code was extremely simple.

Instead of storing simple function pointer in the queue, we just store an object of type IEnumerable<bool> which will wrap the function and make sure we can call it multiple times. When the Enumerable is called it will keep returning true as long as there’s still work to do in the function wrapped. So the function could just return false and it would work the exact same way as it did before. So one can think of the IEnumerable as a sort of higher order function.

So we just need to wrap the normal functions with the following code:

private IEnumerator TurnEnumerableIntoEnumerator(IEnumerable enumerable)
{
	IEnumerator < bool > t = enumerable.GetEnumerator();

	while (t.MoveNext()) {
		yield return t.Current;
	}

	// make compiler happy
	yield break;
}

And make the function wrapped return IEnumerable instead of void. Then the function can at any point in the function do a simple yield return true; to signal that the function can be resumed to complete its task. So this way it is also very dynamic since a function can return if it has done X amount of work, or if X amount of time has passed or if it is signalled by some other code that a low latency task has been added to the queue. It really adds a lot of flexibility while still keeping the nice structure of a simple function. The complete code is in nemo 0.2.1 under metadata/MetadataStore.cs.

Iimplement brain damage

Had this annoying bug today where something that seemed perfectly resonable just didn’t work. After much investigation it appears that once again brain damage from Java has managed to over into C#. The problem is illustrated with the following code (You can ignore the Tuple for now):

public struct Tuple < TFirst,TSecond >
{
    public TFirst first;
    public TSecond second;

    public Tuple(TFirst first, TSecond second)
   {
	this.first = first;
	this.second = second;
   }
}

[...]

Tuple < int,string > t = new Tuple < int,string >(1, "1");
Tuple < int,string > t2 = new Tuple < int,string >(2, "2");

List < tuple < int,string > > lt = new List < tuple < int,string > >();
lt.Add(t);
lt.Add(t2);

List < tuple < int,string > > lt2 = new List < tuple < int,string > >();
lt2.Add(t);
lt2.Add(t2);

System.Console.WriteLine("eq {0}, == {1}", lt.Equals(lt2), lt == lt2);

Which gives the following result:

eq False, == False

Ok that was strange, in Python and C++ one doesn’t have to use Equal or anything similar and furthermore == gives the correct result since it compares elements memberwise instead of just checking the reference. I recalled that in Java one has to use Equal on strings, since == just compares references. So I googled around and found an explaination in point in the following link at 6.7. Note all the special cases. The best part is the following paragraph: “The implementation of Equals() in System.Object (the one you’ll inherit by default if you write a class) compares identity, i.e. it’s the same as operator==”. Apparently List doesn’t to that, despite their efforts to help, great… So we’ll have to do that ourselves:

public static IEnumerable < tuple < T1,T2 > > zip < T1,T2 >
(IEnumerable < T1 > l1, IEnumerable < T2 > l2)
{
   IEnumerator < T1 > i1 = l1.GetEnumerator();
   IEnumerator < T2 > i2 = l2.GetEnumerator();

   while (i1.MoveNext() && i2.MoveNext())
       yield return new Tuple < T1,T2 > (i1.Current, i2.Current);
}

public static bool sorted_lists_equal < T > (List < T > l1, List < T > l2)
where T:IEquatable < T >
{
   if (l1.Count != l2.Count)
       return false;
   foreach (Tuple < T, T > t in zip < T,T >(l1, l2)) {
        if (!t.first.Equals(t.second))
              return false;
         }
   return true;
}

IEquatable is an interface what basically says that it will compare by value. But even though our Tuple implementation is a struct and thus is a ValueType it doesn’t implement this interface (the Equals method). It instead automatically defines the == operator to work as one expects since it’s a ValueType. So we have to change Tuple:

public struct Tuple < TFirst,TSecond >
: IEquatable < Tuple < TFirst,TSecond > >
{
   public TFirst first;
   public TSecond second;

   public Tuple(TFirst first, TSecond second)
   {
	this.first = first;
	this.second = second;
   }

   public bool Equals(Tuple < TFirst,TSecond > other)
   {
      return first.Equals(other.first) && second.Equals(other.second);
   }

    public static bool operator==(Tuple < TFirst,TSecond > lhs,
    Tuple < TFirst,TSecond > rhs)
    {
	return lhs.Equals(rhs);
    }

    public static bool operator!=(Tuple < TFirst,TSecond > lhs,
    Tuple < TFirst,TSecond > rhs)
    {
	return !(lhs == rhs);
    }
}

The last two functions was added because now that we’re implementing the IEquatable interface the compiler doesn’t seem to want to implement == and !=.

So instead of the following Python code:

a = [(1,"1"), (2,"2")]
b = [(1,"1"), (2,"2")]
a == b

We have to do the big mess above :-/

Finished reading Practical Common Lisp

Finally got through the mighty Practical Common Lisp tome. The style of the book is written in a nice mix of theory and practice (with relevant and good examples). My friend Lau asked me why on earth I would want to read a book on Lisp? A fair question since Lisp is really old, actually measured in computer time it might even be called ancient. But I had two main motivations for reading the book, to become a better programmer and secondly to better understand new language features introduced in languages like Boo and C#. Just look at the new LINQ features in C# 3.0 and specifically this video.

I wholeheartedly recommend this book to anyone interesting in a good programming book.

C# from a C++ developer perspective

Been wanting to get this out of my system for a while and since I can’t sleep I might as well do something productive.

During my time in iola I’ve begun writing code in C#. I’ve written quite a lot of C++ in the past and still does, so it was quite interesting for me to see what the next iteration of C would be like when I started out with this. This next section is written in a very practical sense, since at the end of the day, that is really what matters.

The nice stuff:

  • Fast compiler. I really think this is big. The slow compile time in C++ is contributed by several things, templates are really big here. And since they need to live in header files changing them can easily trigger a cascade effect of recompilation that is extremely detrimental to the whole coding experience. There are hacks around this but they are not really optimal solutions. When I did almost all my coding in C++ I didn’t think of this as a big problem, but once you’ve been seduced by the dark side (Take that Siebel ;-)) you really start to notice it a lot. And the fast compile time of C# makes it much more bare able to write in a statically typed language again.
  • Portable library with a lot of useful stuff like threading, network communication, File System abstraction, serialization etc.
  • No more dreaded header files. While this may seems strange at first (since header files are a nice way of describing the interface of the classes with comments that don’t get lost in the smoke of implementations), it’s actually really nice not having the headache of keeping
    the header files in sync with the implementation, making it much easier to move code around. Another big thing here is the whole template thing.
  • Delegates (or function pointers if you will) are extremely powerful and having them built into the language makes them very easy to use.
  • Support for yield.
  • Being able to use the power of .net (Newer in a million years thought utter such words) to mix and match languages. It’s actually quite trivial to write a piece of your program in Boo and have it work with the rest of the code. This is in particularly useful for writing unit tests.
  • Perhaps the biggest improvement, although this varies greatly from ones perspective, is that the whole language raises the bar. In the sense that a lot of the BAD programming practice that has been associated with C++, mostly caused by people simply using the language in the wrong way (char pointers, not taking care of memory management, generally pointers comes to mind here) has been stripped away from the language. A really good example of this is to compare the abomination that is pre .net C++ windows code and .net C# code.
  • C# is constantly improving and is doing so quite rapidly. Compared to the extremely slow pace of C++ (still hoping for a new standard in this decade!) it’s really refreshing to see a lot of good new ideas being pushed out to people.

The bad stuff:

  • Two thumbs up to the designer of C# for stepping up and actually implementing generics in C#. That being said, they are just a shadow of templates in C++ and the whole I-Implement-an-Interface crap is just plain dumb. Templates in C++ compared to generics in C# feels much more like macros in lisp, which if one thinks about it is quite the irony.
  • C# is clearly very proud of their garbage collection and thus sees no problem in demanding that everything must be created using the dreaded new syntax. While writing something like: List<string> l = new List<string>(); is not that big of a problem in something like monodevelop or visual studio. But it’s a big pain in the ass if you’re using the one and true editor. Another thing related to the GC is that needed to invent new syntax (using) to support the good programming practice of RAII instead of just doing it by creating objects on the stack and letting them fall out of scope and thus become deconstructed automatically.
  • C# falls into the trap of thinking that everything must be an object and that object orientation is the true path to righteousness. Namespaces was invented for a reason you know?
  • The built in libraries and especially the documentation is sometimes really really bad. Like methods returning results in different threads than the caller and not writing anything in the documentation about this.

Overall the experience so far has been quite positive and as far as I can see it will only continue to improve. That being said, there are still use cases where C++ is much more optimal and combined with a lot of legacy code C++ will continue to be used for a foreseeable future.

The power of lisp part 2

As mentioned earlier, lisp is incredible at abstracting away small everyday stuff you write over and over. A good example is this way one has to define a local variable in order to make variables sucked into a delegates from the surroundings static in C#. As mentioned briefly in this blog post, one needs to change the loop to read like the following code for it to print 0 to 9 instead of 10 times 9:

foreach (int i in numbers) {

        int j = i;

        l.Add(delegate() { System.Console.WriteLine(j); };
}

In lisp one would see that this is a pattern and simply abstract it away like this. As one can see, the macro static-loop is more or less exactly the same as the pattern which luckily made it quite easy to write.

(defmacro static-loop ((i container) &body body)
 (with-gensyms (j)
  `(loop for ,j in ,container do
    (let ((,i ,j))
     ,@body))))

The only small trick is the with-gensyms macro which prevents leaking the temporary variable j into the scope of where it will be substituted into. So in that regard the macro is even nicer than rolling your own temp variable hack 🙂

For reference the python example works the same way, but allows one to use the lambda function to redefine variables, although one must note that the static-loop solution is more general in that it will make the loop variable local for the whole scope and not just for one lambda function. Funny thing is that I had originally written it like this, which actually worked, although that means that the closure binding is not only by reference but by name.

I tried writing the macro in boo but I failed. Apparently their macro support is still quite new and there’s close to no examples to draw inspiration from. If anyone could help me out with the solution please mail it to me or even better add it as a comment.