Improving latency while keeping your sanity

Nemo 0.2.0 had a problem: it used quite a lot of CPU while it was indexing. I kind of knew this could happen, but hoped that a transaction optimization to the database would improve performance enough to make the problem irrelevant. Sadly it did not. The indexing code works by running an event loop in a single thread with two queues, one for tasks that need to be completed fast and one for background tasks. Each queue is basically just a list of function pointers, and one is popped whenever a new task needs to run.
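To make the starting point concrete, here is a minimal sketch of that kind of two-queue loop. It is not Nemo's actual code: the class SimpleScheduler and its methods AddFast, AddBackground, RunOne and Demo are names I made up for illustration. It shows why a long background task hurts latency: once a task starts, nothing else runs until it returns.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical run-to-completion scheduler; names are illustrative, not Nemo's.
public class SimpleScheduler
{
    private readonly Queue<Action> fast = new Queue<Action>();
    private readonly Queue<Action> background = new Queue<Action>();

    public void AddFast(Action task) { fast.Enqueue(task); }
    public void AddBackground(Action task) { background.Enqueue(task); }

    // Runs one whole task, preferring the fast queue.
    // Returns false when both queues are empty.
    public bool RunOne()
    {
        if (fast.Count > 0) { fast.Dequeue()(); return true; }
        if (background.Count > 0) { background.Dequeue()(); return true; }
        return false;
    }

    // An urgent task arrives while the background task is running,
    // but it has to wait until the background task has finished.
    public static List<string> Demo()
    {
        var log = new List<string>();
        var s = new SimpleScheduler();
        s.AddBackground(() => {
            log.Add("index: start");
            s.AddFast(() => log.Add("search"));
            log.Add("index: end");
        });
        while (s.RunOne()) { }
        return log;   // index: start, index: end, search
    }
}
```

The "search" entry only shows up after "index: end", even though it was queued as a fast task in the middle of the indexing.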

The nice property of the indexing code was that each task lived in a simple function that one could read from start to end. This is the same property usually associated with programming using threads, as opposed to programming using events, where the code has to be broken into several parts. I feared that I would have to go the event way: break the indexing up into smaller functions and do a lot of tedious bookkeeping to make sure the code could be resumed.

After thinking about the problem for a little while, it dawned on me that this is a perfect case for yielding. Yielding allows you to suspend the execution of a function and return immediately, which is perfect for this problem since the compiler will automatically do all the bookkeeping for you. Another really nice thing is that converting the code turned out to be extremely simple.
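If you have not used yield before, a tiny sketch may help. The names YieldDemo, CountToThree and Run are made up for this example. The function is written as one plain loop, yet the caller gets control back after every step, and the loop counter survives in between.

```csharp
using System.Collections.Generic;

// Hypothetical example of suspending and resuming a function with yield.
public static class YieldDemo
{
    // Reads like an ordinary loop, but pauses at every "yield return".
    public static IEnumerable<bool> CountToThree(List<string> log)
    {
        for (int i = 1; i <= 3; i++) {
            log.Add("step " + i);
            yield return true;   // suspend here; the next MoveNext() resumes here
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        IEnumerator<bool> task = CountToThree(log).GetEnumerator();

        task.MoveNext();             // runs the function up to its first yield
        log.Add("caller");           // the caller can do something else in between
        while (task.MoveNext()) { }  // resume until the function is finished

        return log;   // step 1, caller, step 2, step 3
    }
}
```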

Instead of storing a simple function pointer in the queue, we store an object of type IEnumerable<bool>, which wraps the function and makes sure we can call it multiple times. Each time the enumerable is advanced, it returns true as long as there’s still work to do in the wrapped function. A function that has nothing to split up can just return false, and it works exactly the same way as it did before. So one can think of the IEnumerable as a sort of higher-order function.

So we just need to wrap the normal functions with the following code:

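// requires: using System.Collections.Generic;
private IEnumerator<bool> TurnEnumerableIntoEnumerator(IEnumerable<bool> enumerable)
{
	// get the enumerator once so the wrapped function keeps its state between calls
	IEnumerator<bool> t = enumerable.GetEnumerator();

	// pass on every value the wrapped function yields
	while (t.MoveNext()) {
		yield return t.Current;
	}

	// make compiler happy
	yield break;
}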

Then make the wrapped function return IEnumerable<bool> instead of void. At any point, the function can do a simple yield return true; to signal that it should be resumed later to complete its task. This is also very dynamic, since a function can yield once it has done X amount of work, once X amount of time has passed, or when some other code signals that a low latency task has been added to the queue. It really adds a lot of flexibility while still keeping the nice structure of a simple function. The complete code is in Nemo 0.2.1 under metadata/MetadataStore.cs.
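Putting the pieces together, the following is a rough sketch of how such a loop could look. It is not the code from MetadataStore.cs: YieldingScheduler, IndexingDemo, IndexFiles, Search and the batch parameter are all made up for illustration, and I assume a task that yields true goes back to the end of its queue. Here the indexing task yields after a fixed amount of work; yielding after a certain amount of time would work the same way.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scheduler that runs tasks one slice at a time.
public class YieldingScheduler
{
    private readonly Queue<IEnumerator<bool>> fast = new Queue<IEnumerator<bool>>();
    private readonly Queue<IEnumerator<bool>> background = new Queue<IEnumerator<bool>>();

    public void AddFast(IEnumerable<bool> task) { fast.Enqueue(task.GetEnumerator()); }
    public void AddBackground(IEnumerable<bool> task) { background.Enqueue(task.GetEnumerator()); }

    // Runs one slice of one task, preferring the fast queue.
    // Returns false once both queues are empty.
    public bool RunOne()
    {
        Queue<IEnumerator<bool>> queue = fast.Count > 0 ? fast : background;
        if (queue.Count == 0) return false;

        IEnumerator<bool> task = queue.Dequeue();
        // true means "more work to do": put the task back so it is resumed later.
        if (task.MoveNext() && task.Current) queue.Enqueue(task);
        else task.Dispose();
        return true;
    }
}

public static class IndexingDemo
{
    // Hypothetical indexing task that pauses after every `batch` files.
    public static IEnumerable<bool> IndexFiles(int files, int batch, List<string> log)
    {
        for (int i = 1; i <= files; i++) {
            log.Add("indexed " + i);
            if (i % batch == 0 && i < files) yield return true;
        }
    }

    // A short task: yields false right away, so it behaves like a plain function.
    public static IEnumerable<bool> Search(List<string> log)
    {
        log.Add("search");
        yield return false;
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var s = new YieldingScheduler();
        s.AddBackground(IndexFiles(4, 2, log));
        s.RunOne();               // indexes files 1 and 2, then pauses
        s.AddFast(Search(log));   // an urgent task arrives in the meantime
        while (s.RunOne()) { }
        return log;   // indexed 1, indexed 2, search, indexed 3, indexed 4
    }
}
```

The search now runs between the two indexing batches instead of waiting for the whole index to finish, and IndexFiles is still a single function that reads from start to end.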