I wrote that in about 3 hours thanks to the wonderful support in MMS for writing scripts in Python.
I remember reading in Paul Graham’s excellent book Hackers & Painters about how he liked the language he worked in to be dynamically typed, as it gave him more flexibility. The first time I read it I was still very much entrenched in the bondage-and-discipline language C++, so I found the statement wrong but at the same time interesting. There had to be some deeper insight behind it. A couple of years later, which would be the present day, it dawned on me that the biggest problem with statically typed languages is not the typing itself; you can get used to that. It’s rather that it becomes very hard to refactor the code afterwards. And worse, it shifts the burden of figuring out the right data structures and the right types to the beginning of the coding process instead of the end. This goes directly against the good programming practice of writing for clarity first and only rewriting code once a profiler has shown it to be inefficient.
Python is dynamically typed, and after spending a year programming in it (this was about two years ago) I still found the missing types quite disturbing. Especially the fact that a branch in your program could contain a small spelling error and only fail after the program had been running for several hours. Or worse, after you had shipped it off to the customer. I realize now that this was mostly attributable to the fact that I was still coding Python as if I were coding C++, something that the last two programming books I have been reading (Common Lisp and Programming Erlang) have really brought to light.
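To make the spelling-error worry concrete, here is a tiny made-up example (the function and the threshold are hypothetical, not taken from any real program). The misspelled name sits in a branch that is rarely taken, so Python only complains when that branch finally runs:

```python
def report(count):
    """Hypothetical helper: format a count, with a rarely taken branch."""
    if count > 1000:
        # Typo: `coutn` is never defined. Python does not look the name up
        # until this line is actually executed.
        return "large: %d" % coutn
    return "small: %d" % count


print(report(1))  # the common path works fine

try:
    report(5000)  # the rare path fails only now, at run time
except NameError as error:
    print("failed late:", error)
```

A compiler for a statically typed language would reject the typo before the program ever ran, which is exactly the trade-off discussed here.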
There are tricks you can use in statically typed languages to make a program easier to refactor later on: the var keyword introduced in C# 3.0, typedefs in C++, and in fact the whole C++ standard library has this covered quite nicely through its use of iterators. Still, I think that something like the duck types in Boo offers an interesting mix of the two styles. My very limited experience with Boo so far has not been enough to determine whether the implicit type system does more harm than good. I sometimes find myself cursing that the system can’t detect the types for me, and at other times I’m happy that it has found some trivial errors for me for free.
We had trouble on the Nemo blog because some stinky casino spam overlord had found our nice little blog. Luckily the spam started off slow, so I had a chance to try different solutions to the problem. I went through updating WordPress, updating Peter’s Custom Anti-Spam, installing Akismet and trying reCAPTCHA before I found the perfect solution: WP Hashcash. None of the other techniques caught 100% of the spam. The WordPress plugin uses the clever hashcash technique (first proposed to fight email spam), implemented in JavaScript, to force a poster to pay a small amount of CPU time for each post. Supposedly, though, the biggest effect comes just from requiring the client to be able to run JavaScript in order to post 😉
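For the curious, the proof-of-work idea behind hashcash can be sketched in a few lines of Python. This is only an illustration of the principle, not the code WP Hashcash runs: the function names, the challenge string and the use of SHA-256 are my own assumptions (the original hashcash proposal uses SHA-1 and its own stamp format).

```python
import hashlib
from itertools import count


def _digest_value(challenge: str, nonce: int) -> int:
    """Hash "challenge:nonce" with SHA-256 and return it as a big integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def mint(challenge: str, bits: int = 12) -> int:
    """Client side: brute-force the first nonce whose hash starts with `bits` zero bits."""
    target = 1 << (256 - bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: str, nonce: int, bits: int = 12) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest_value(challenge, nonce) < (1 << (256 - bits))


nonce = mint("comment-42", bits=12)  # "comment-42" is a made-up challenge
print(nonce, verify("comment-42", nonce, bits=12))
```

The asymmetry is the point: the poster needs a few thousand hashes on average to find a valid nonce, while the blog verifies it with one.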
We had a problem in Nemo 0.2.0 where it would use quite a lot of CPU while indexing. I kind of knew this could happen, but hoped that a transaction optimization in the database would improve performance enough to make the problem irrelevant. Sadly it did not. The indexing code works by running an event loop in a single thread with two queues, one for tasks that need to be completed fast and one for background tasks. Each queue is basically just a list of function pointers, one of which is popped whenever a new task needs to run.
The nice property of the indexing code is that each task lives in a simple function that one can read from start to end. This is the same property usually associated with programming using threads, as opposed to programming using events, where the code has to be broken into several parts. I feared that I would have to go the event way: break the indexing up into smaller functions and do a lot of tedious bookkeeping to make sure the code could be resumed.
After thinking about the problem for a little while it dawned on me that this is a perfect case for yielding. Yielding allows you to suspend the execution of a function and return immediately, which is perfect for this problem since the runtime automatically does all the bookkeeping for you. Another really nice thing is that converting the code turned out to be extremely simple.
Instead of storing a simple function pointer in the queue, we store an object of type IEnumerable<bool>, which wraps the function and makes sure we can call it multiple times. Each time the enumerable is advanced it returns true as long as there is still work to do in the wrapped function. A function that has not been split up can simply return false, and it works exactly the same way as before. So one can think of the IEnumerable as a sort of higher-order function.
So we just need to wrap the normal functions with the following code:
    private IEnumerator<bool> TurnEnumerableIntoEnumerator(IEnumerable<bool> enumerable)
    {
        IEnumerator<bool> t = enumerable.GetEnumerator();
        while (t.MoveNext()) {
            yield return t.Current;
        }
        // make compiler happy
        yield break;
    }
And make the wrapped function return IEnumerable<bool>.
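For readers who find the idea easier to follow in Python, here is a rough sketch of the same scheme using generators. It is not the Nemo code: the task names, the log list and the run function are all made up for illustration. Each task yields True while it still has work left, and the loop always prefers the fast queue over the background queue.

```python
from collections import deque


def index_files(paths, log):
    """Hypothetical background task: index one path per step, then yield."""
    for path in paths:
        log.append(f"indexed {path}")
        yield True  # there may be more work; hand control back to the loop
    # Falling off the end finishes the generator, like MoveNext() returning false.


def urgent(log):
    """Hypothetical fast task that finishes in a single step."""
    log.append("urgent task")
    yield False  # no more work to do


def run(fast, background):
    """Single-threaded loop: run one step of one task at a time."""
    while fast or background:
        queue = fast if fast else background
        task = queue.popleft()
        more_work = next(task, False)  # an exhausted generator counts as done
        if more_work:
            queue.append(task)  # resume later, right where it left off


log = []
run(deque([urgent(log)]), deque([index_files(["a.txt", "b.txt"], log)]))
print(log)
```

The indexing function still reads from top to bottom, and the generator machinery remembers where to resume, which is the bookkeeping I was afraid of having to write by hand.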
Nemo 0.2.1 released
Did the release of Nemo 0.2.1 today. The release is the result of a large amount of massaging of the Nemo code to use less CPU and memory (a garbage collector suddenly makes these interlinked in a way one is not usually familiar with when coming from a language like C++). Then some profiling and massaging of GTK#, which is sadly still not 100% ready for prime time. And finally some poking at the Mono code, for which I’m not ready with a patch yet (tomorrow it will be), but I’ve improved the performance of the inotify backend of the FileSystemWatcher by quite a bit, especially when watching a large number of directories.
mms 1.1.0 rc2 released
I’m pleased to announce a new release candidate of mms 1.1.0. The biggest improvements are a weather plugin, vsync in OpenGL and better lyrics support, besides of course a large number of bug fixes. The number of changes since rc1 has been quite high, on average 2.71 commits per day, thanks to all the wonderful contributions.
Iimplement brain damage
Had this annoying bug today where something that seemed perfectly reasonable just didn’t work. After much investigation it appears that once again brain damage from Java has managed to carry over into C#. The problem is illustrated by the following code (you can ignore the Tuple for now):
    public struct Tuple<TFirst, TSecond>
    {
        public TFirst first;
        public TSecond second;

        public Tuple(TFirst first, TSecond second)
        {
            this.first = first;
            this.second = second;
        }
    }

    [...]

    Tuple<int, string> t = new Tuple<int, string>(1, "1");
    Tuple<int, string> t2 = new Tuple<int, string>(2, "2");

    List<Tuple<int, string>> lt = new List<Tuple<int, string>>();
    lt.Add(t);
    lt.Add(t2);

    List<Tuple<int, string>> lt2 = new List<Tuple<int, string>>();
    lt2.Add(t);
    lt2.Add(t2);

    System.Console.WriteLine("eq {0}, == {1}", lt.Equals(lt2), lt == lt2);
Which gives the following result:
eq False, == False
Ok, that was strange. In Python and C++ one doesn’t have to use Equals or anything similar, and furthermore == gives the correct result since it compares the elements memberwise instead of just checking the reference. I recalled that in Java one has to use equals on strings, since == just compares references. So I googled around and found an explanation at point 6.7 of the following link. Note all the special cases. The best part is the following paragraph: “The implementation of Equals() in System.Object (the one you’ll inherit by default if you write a class) compares identity, i.e. it’s the same as operator==”. Apparently List
    public static IEnumerable<Tuple<T1, T2>> zip<T1, T2>(IEnumerable<T1> l1, IEnumerable<T2> l2)
    {
        IEnumerator<T1> i1 = l1.GetEnumerator();
        IEnumerator<T2> i2 = l2.GetEnumerator();
        while (i1.MoveNext() && i2.MoveNext())
            yield return new Tuple<T1, T2>(i1.Current, i2.Current);
    }

    public static bool sorted_lists_equal<T>(List<T> l1, List<T> l2) where T : IEquatable<T>
    {
        if (l1.Count != l2.Count)
            return false;
        foreach (Tuple<T, T> t in zip<T, T>(l1, l2)) {
            if (!t.first.Equals(t.second))
                return false;
        }
        return true;
    }
IEquatable
    public struct Tuple<TFirst, TSecond> : IEquatable<Tuple<TFirst, TSecond>>
    {
        public TFirst first;
        public TSecond second;

        public Tuple(TFirst first, TSecond second)
        {
            this.first = first;
            this.second = second;
        }

        public bool Equals(Tuple<TFirst, TSecond> other)
        {
            return first.Equals(other.first) && second.Equals(other.second);
        }

        public static bool operator ==(Tuple<TFirst, TSecond> lhs, Tuple<TFirst, TSecond> rhs)
        {
            return lhs.Equals(rhs);
        }

        public static bool operator !=(Tuple<TFirst, TSecond> lhs, Tuple<TFirst, TSecond> rhs)
        {
            return !(lhs == rhs);
        }
    }
The last two functions were added because, now that we’re implementing the IEquatable interface, the compiler doesn’t seem to want to implement == and != for us.
So instead of the following Python code:
    a = [(1, "1"), (2, "2")]
    b = [(1, "1"), (2, "2")]
    a == b
We have to do the big mess above :-/
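To spell out the difference between comparing values and comparing references, here is a small Python sketch. The Plain class is a made-up example; it defines no __eq__, so == falls back to identity, which is roughly the behaviour the quoted paragraph describes for System.Object.

```python
a = [(1, "1"), (2, "2")]
b = [(1, "1"), (2, "2")]


class Plain:
    """Hypothetical class without __eq__: == compares identity by default."""

    def __init__(self, value):
        self.value = value


values_equal = a == b               # lists and tuples compare element by element
same_object = a is b                # identity: two separate list objects
plain_equal = Plain(1) == Plain(1)  # no __eq__, so this is an identity check

print(values_equal, same_object, plain_equal)
```

So Python has the same identity-based default for user-defined classes; the difference is that its built-in containers define value equality out of the box.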
Long time since I’ve blogged about Nemo. Since the initial release a little over a month ago we have had two major releases, the first one a bugfix release to fix some of the defects reported and the 0.2 release which adds support for beagle through Xesam and lots of other nice enhancements like pagination on the search results popup and better indexing performance. The software is still fresh but I consider this to be the first release that I would feel comfortable recommending to a stranger 🙂
I used to view the IMDb voting system as one of the greatest examples of why the wisdom of crowds is often much more in touch with reality than a panel of experts but lately some “b-movies” have been getting excellent initial scores. First it was Snakes on a Plane that started above 8.0 but now is at 6.5, and then we have AvP2 which apparently started at above 7.5 but now is below 6. The following comment on IMDb more or less summed up my feelings:
This movie really sucked. In fact, after using the IMDb for some years, I just now registered to be able to vote and comment on how bad this movie sucks.
Just before I went to see this, I looked it up here and was positively surprised to find a 7.6 rating with >800 votes. But what I saw at the theater makes me think that all the money the studio saved on writers and cast was spend on > 400 people who rated this thing 10.
Are movie studios actively abusing systems like IMDb to try to lure people into the theaters? And if they are, what will IMDb do about it to avoid becoming irrelevant?