Monday, October 18, 2004

optimization by 10,000 commits

this morning while brushing my teeth i remembered an interview i read with some of the PostgreSQL developers around the time of one of the 7.x releases. the interviewer noted how much faster PostgreSQL was compared with an earlier 7.x version, and especially compared with the 6.x series, and asked what they had done to achieve these results. a developer responded that no single change had caused this speed boost. rather, they had made a 1% improvement here, and a .5% improvement there, etc. eventually these little optimizations piled up across the code base and their combined weight created a noticeable improvement. a lack of obvious optimization targets led to a holistic approach of "improve whatever you can, even if it seems trivial right now."

coming up on KDE 3.4 and KDE 4.0 after that, i think we're in a similar situation to the PostgreSQL developers above. Qt4 may give us some nice speed boosts and lower our memory consumption in KDE4 "for free" (though the impact of Qt4 is yet to be seen and measured), but we have a large codebase in kdelibs and kdebase that forms the "base KDE platform" that we have complete control over. there may not be any single stretch of code that is needlessly responsible for 5% of execution time, but i bet we can find 50 places that are each responsible for 0.1% of execution time that can be optimized.

a nice thing about this approach is that it doesn't require much distraction from other development efforts. improve what's near, and we'll see a cumulative improvement.

(p.s. i'm not suggesting radical, code-uglifying micro-optimizations be applied everywhere, as maintainability and reliability are equally important. but our code base isn't so tight that those are our only optimization possibilities.)

5 comments:

Anonymous said...

I fully agree, computers get faster and faster, but my beloved KDE 3.3 doesn't "feel" any faster than old 2.2.

Anonymous said...

I strongly agree with you, that's a pretty good idea. I really think performance is important. Even more: apparent performance (to the user) is often more important.

Something I would really love is for kde startup to be fast by default. Compare it, for example, with Gnome startup time. It's really 3x slower or so, even though gnome used to be as slow as kde to load, or even slower.

To improve that there are a lot of things that even I, who am not a kde programmer, can think of: first, try to redesign the startup so that it doesn't need to load so much crap. Modularize things. Show the desktop fast (only once it has finished loading), because right now you can see every icon being drawn and moved to its position, mostly one by one. Then make the startup script smart: don't look for netscape plugins if the dirs have not been updated, etc. Redesign KDE startup so that it can be largely preloaded by an init level, etc. Finally you could get KDE loaded in a few seconds.

Cheers,
Edulix.

Anonymous said...

Hi, I want to share an experience I had optimizing the kde code.
Over the last two days I looked through the kde 3.3.1 code (kdelibs and kdebase specifically), found every it++ inside a loop and changed it to ++it. I was surprised by the result: a huge gain in speed. Really, really huge. I can open the /usr/lib directory in 2 or 3 seconds with konqueror. Before, this operation took about 10 seconds.
I only changed the it++'s. I expect that if we apply the other optimizations suggested by icefox (const QString, caching the result of length() in loops, and others) the gain in speed will be more than impressive. And only with simple things that don't add new bugs.
These changes should be in kde 3.3.2 too. Don't look at kde head only.

Anonymous said...

How is ++i any faster than i++ ?

Aaron J. Seigo said...

++i is often faster than i++ on complex types as it can preclude the creation (and subsequent destruction) of a temporary object. i++ has to return the value from before the increment, so it needs a copy of the old state; ++i just returns the object itself. for things like integers and other basic types it makes little difference in the grand scheme of things, but when you're dealing with complex types the compiler often has to resort to creating a temporary object for safety's sake. creating/destroying temporary iterator objects each time through a loop that iterates over a large collection will often result in a noticeable slowdown.
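
to make the temporary visible, here's a minimal sketch in C++. everything in it is made up for illustration: CountingIter, copyCount() and the two copiesWith... functions are hypothetical names, not Qt or KDE classes. the iterator simply counts how often its copy constructor runs, assuming a compiler that doesn't optimize the copy away (it can't here, because the copy has a visible side effect).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Global counter of iterator copies (hypothetical helper for this sketch).
inline std::size_t& copyCount() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical iterator over ints that records every copy made of it.
struct CountingIter {
    const int* pos;

    explicit CountingIter(const int* p) : pos(p) {}
    CountingIter(const CountingIter& other) : pos(other.pos) { ++copyCount(); }

    // Pre-increment: advance and return *this by reference, no copy.
    CountingIter& operator++() {
        ++pos;
        return *this;
    }

    // Post-increment: must hand back the old state, so it copies first.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++pos;
        return old;
    }

    int operator*() const { return *pos; }
    bool operator!=(const CountingIter& o) const { return pos != o.pos; }
};

// Walk the vector with ++it and report how many iterator copies happened.
std::size_t copiesWithPreIncrement(const std::vector<int>& v) {
    copyCount() = 0;
    CountingIter end(v.data() + v.size());
    long sum = 0;
    for (CountingIter it(v.data()); it != end; ++it) sum += *it;
    (void)sum;
    return copyCount();
}

// Same walk with it++; each step creates at least one temporary copy.
std::size_t copiesWithPostIncrement(const std::vector<int>& v) {
    copyCount() = 0;
    CountingIter end(v.data() + v.size());
    long sum = 0;
    for (CountingIter it(v.data()); it != end; it++) sum += *it;
    (void)sum;
    return copyCount();
}
```

calling copiesWithPreIncrement() on a vector should report no copies, while copiesWithPostIncrement() should report at least one copy per element. for a pointer-sized iterator like this one the copy is cheap; the cost only starts to matter when the iterator carries more state or the copy constructor does real work.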

this was discussed at length on the kde-optimize list, btw.