I’ve written about profiling in general before. With multithreading, though, there are a few more factors that affect performance, such as lock contention and cache invalidation. This is where mutrace (a mutex profiler) comes into the picture. It has very low overhead compared to other solutions (I noticed no delay at all) and gives quite nice results. I tested it on My Media System (C++) and Nemo (C#) and got interesting results for both. I knew more or less what the results for MMS would be, but the results for Nemo were a bit surprising: the lock contention was significantly higher than for MMS, even though I didn’t really use a multithreaded design for Nemo. So I guess it comes from the Mono platform itself. Sadly, it seems that Mono programs aren’t compiled with -rdynamic, so the results one gets are quite hard to decipher.
Category: profiling
I remember reading in Paul Graham’s excellent book Hackers & Painters about how he liked the language he worked in to be dynamically typed, as it gave him more flexibility. The first time I read it I was still very much entrenched in the bondage and discipline language C++, so I found the statement wrong, but at the same time interesting. There had to be some deeper insight behind it. A couple of years later, which would be the present day, it dawned on me that the biggest problem with statically typed languages is not the typing; you can get used to that. It’s rather that it becomes very hard to refactor the code afterwards. And worse, it shifts the burden of figuring out the right data structures and the right types to the beginning of the coding process instead of afterwards. This goes directly against the good programming practice of writing for clarity first and only rewriting code once a profiler has shown it to be inefficient.
Python is dynamically typed, and after spending a year programming in it (this was about two years ago) I still found the missing types quite disturbing. I was especially bothered that a branch in your program could contain a small spelling error and only fail after the program had been running for several hours, or worse, after you had shipped it off to the customer. I realize now that this was mostly because I was still coding Python as if I were coding C++, something that the last two programming books I have been reading (Common Lisp and Programming Erlang) have really brought to light.
There are tricks you can use in statically typed languages to make the program easier to refactor later on: the var keyword introduced in C# 3.0, typedefs in C++, and in fact the whole C++ standard library, which has this covered quite nicely through its use of iterators. Still, I think something like the duck types in Boo might offer an interesting mix of the two styles. My very limited experience with Boo so far has not been enough to determine whether the implicit type system does more harm than good. I sometimes find myself cursing the fact that the system can’t detect the types for me, and at other times I am happy that it has found some trivial errors for me for free.
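As a small sketch of the typedef and iterator trick in C++: TitleList and total_length are hypothetical names chosen for this example, not code from MMS. The point is that the container type is named in exactly one place.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alias: changing the container (for example to
// std::list<std::string>) only means editing this one line, as long as the
// code below sticks to the iterator interface.
typedef std::vector<std::string> TitleList;

// Sums the lengths of all titles without depending on the concrete container.
std::size_t total_length(const TitleList& titles) {
    std::size_t total = 0;
    for (TitleList::const_iterator it = titles.begin(); it != titles.end(); ++it) {
        total += it->size();
    }
    return total;
}
```

Had the loop used titles[i] instead, switching to a container without random access would have meant touching every such loop.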
Sometimes nice tools just drop out of the sky, and PowerTOP is the latest such program. As of Linux 2.6.21 the kernel has become tickless, which means that instead of waking up at fixed intervals it simply sleeps until it next has something to do. This clearly has the potential to save quite a lot of power, but only if the kernel really can sleep longer than it otherwise would have.
Intel has created a tool, PowerTOP, to show which programs cause the kernel to wake up. And thanks to the power of free software, they have also been able to fix quite a few widely used programs. So tonight I decided to run MMS and see how it performed. I installed the new kernel and the tool on my laptop and was ready to test. Initially the kernel was waking up 60 to 80 times a second, but after starting MMS it quickly rose to 1000! Something was definitely wrong. After a long debug session I finally found the cause of this massive spike in kernel wakeups: SDL. For some reason SDL was initialized using SDL_Init(SDL_INIT_VIDEO|SDL_INIT_TIMER). This small SDL_INIT_TIMER flag caused SDL to start a thread that just slept for 10ms, woke up, and then slept for 10ms again, over and over…
Luckily the fix was easy, and now MMS causes no noticeable extra wakeups 🙂
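The general lesson is that a thread which sleeps for a fixed interval and polls will wake the kernel whether or not there is anything to do. As a rough sketch of the alternative, here is a worker that blocks on a condition variable and only wakes when work arrives. EventWorker and its methods are hypothetical names. It does not use SDL and is not the actual MMS fix, just an illustration in standard C++11.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker, for illustration only: it sleeps until work is posted
// instead of waking up every 10ms to check.
class EventWorker {
public:
    EventWorker()
        : pending_(0), handled_(0), stop_(false),
          thread_(&EventWorker::run, this) {}

    ~EventWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    // Queue one unit of work and wake the worker.
    void post() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++pending_;
        }
        wake_.notify_one();
    }

    // Block until everything posted so far has been handled.
    void drain() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

    int handled() {
        std::lock_guard<std::mutex> lock(mutex_);
        return handled_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // No timeout here: while idle, the thread causes no periodic wakeups.
            wake_.wait(lock, [this] { return stop_ || pending_ > 0; });
            if (pending_ == 0) return;  // stop requested and nothing left to do
            --pending_;
            ++handled_;
            if (pending_ == 0) idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;  // signalled when work arrives or on shutdown
    std::condition_variable idle_;  // signalled when the queue is empty again
    int pending_;
    int handled_;
    bool stop_;
    std::thread thread_;  // declared last so it starts after the other members exist
};
```

With this shape the number of wakeups follows the number of events, so an idle program lets a tickless kernel sleep.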
Inspired by Hubert Figuiere’s profiling work on AbiWord, I installed KCachegrind to try to profile mmsv2. The two posts on his blog explain very well how to find and fix bottlenecks in an application. What I thought was really cool about KCachegrind, compared to other profiling solutions I have tried, was how well it handled threads and C++. Guess being a KDE project helps a lot in this area 🙂