I really like Windows Live Photo Gallery (WLPG) and I depend on its photo-tagging (5-stars, people, and descriptive tags) to organize my photos. I love the fact that I can effortlessly find all 5-star photos of my daughters unicycling together with just a few clicks.
However, WLPG has a frustrating number of performance problems. It likes to read and write to the disk, and it often does it very poorly. One problem WLPG has is that it uses a SQL Server Compact database to store a copy of all of the metadata, and it accesses this database inefficiently. Some queries require doing thousands of reads from the database file (not enough indexes perhaps?) and these random 4-KB reads work poorly, especially on a slow laptop drive. When you first launch WLPG it is not unusual for it to take a minute or so to become responsive, and after that it may still hang when you start browsing. Tracing with ETW makes the cause of the hangs quite obvious, as this disk I/O summary table shows:
During this particular process startup WLPG spent 28.8 s reading from pictures.pd6, 6.7 s reading from FaceExemplars.ed1, and 0.5 s reading from FaceThumbs.fd1. And yet, it only read about 23 MB from these three files, which should not take 36 s. The reason is hinted at by the expanded details for FaceThumbs.fd1 which show lots of 4 KB reads. In fact, all of the reads for all three files are 4 KB reads, and they are not sequential. That’s 5,013 separate reads to just pictures.pd6! The poor disk head is bouncing all over the disk and is spending very little time reading data. If the files were read sequentially then it would just take a few seconds. Because they are read randomly it takes an excruciatingly long time, and there will be more delays later on because some of the data has still not been read!
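As a rough sanity check, the pictures.pd6 figures quoted above can be turned into a per-read cost. This is only back-of-the-envelope arithmetic on the numbers from the trace, assuming every read was exactly 4,096 bytes and taking 1 MB as 2^20 bytes:

```python
# Figures quoted above for pictures.pd6 during one WLPG startup.
reads = 5013         # separate disk reads
read_size = 4096     # bytes per read (assumed: exactly 4 KB each)
elapsed_s = 28.8     # seconds spent reading the file

total_mb = reads * read_size / (1024.0 * 1024.0)
ms_per_read = elapsed_s / reads * 1000.0
throughput_mb_s = total_mb / elapsed_s

print("Data read: %.1f MB" % total_mb)                        # 19.6 MB
print("Time per read: %.1f ms" % ms_per_read)                 # 5.7 ms
print("Effective throughput: %.2f MB/s" % throughput_mb_s)    # 0.68 MB/s
```

Several milliseconds per 4 KB read, and well under 1 MB/s overall, is consistent with the time being dominated by moving the disk head rather than by transferring data.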
The disk I/O graph shows only I/O that went to the disk; it filters out I/O that was handled by the system cache. The file I/O graph shows every read/write from every process, and it’s interesting to analyze the reads from Pictures.pd6. I copied the file offset of each file read, in time-order, to Excel. Then I calculated the seek distance (just subtract the previous offset). I filtered out the approximately 5,000 redundant reads (two in a row from the same offset) and found 30,044 separate 4,096 byte reads, with an average seek distance between them of 17 MB. In other words, the average seek distance was about a quarter the length of the file.
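The spreadsheet steps above can be sketched in a few lines of Python. This is an illustration rather than the exact calculation I did in Excel: the function name is made up, the sample offsets are invented, and it assumes the seek distance is the absolute difference between consecutive offsets.

```python
def seek_stats(offsets):
    """Return (read_count, mean_seek_distance) for time-ordered file offsets.

    Consecutive reads from the same offset are treated as redundant and
    dropped before the distances are computed.
    """
    deduped = []
    for off in offsets:
        if not deduped or off != deduped[-1]:
            deduped.append(off)
    # Seek distance: how far each read starts from the previous one.
    distances = [abs(b - a) for a, b in zip(deduped, deduped[1:])]
    mean = float(sum(distances)) / len(distances) if distances else 0.0
    return len(deduped), mean

# Invented sample offsets, in bytes, in the order the reads were issued.
sample = [0, 0, 8192, 4096, 4096, 40960]
count, mean_seek = seek_stats(sample)
print("%d reads, average seek distance %.0f bytes" % (count, mean_seek))
```

For a purely sequential scan of 4 KB reads the average seek distance would be 4,096 bytes, so an average in the megabytes is a strong sign of random access.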
This is particularly frustrating because the database file is not that large. I’ve got 47,000 photos and over 37,000 people tags but my WLPG database is just 70 MB. Even my lousy laptop drive can read that entire file in just a couple of seconds. All you have to do is ask it to read the entire file sequentially, instead of messing around reading 4 KB here and 4 KB there.
I finally got around to fixing this problem. I wrote a short Python program that reads in the “Pictures.pd6” file, and the three other WLPG data files for good measure. If I run this before launching WLPG, or anytime WLPG is being slow, then it will load these files into the Windows file cache, and suddenly WLPG will stop hanging.
Here’s the code:
"""This pre-caches valuable files so that WLPG will run better."""
import glob
import os

appdata = os.environ["localappdata"]
files = glob.glob(os.path.join(appdata, r"Microsoft\Windows Live Photo Gallery\*"))
print("Found %d files" % len(files))
for file in files:
    try:
        # Reading the whole file pulls it into the Windows file cache.
        data = open(file, "rb").read()
        print("Read %s, %d bytes" % (file, len(data)))
    except IOError:
        # Ignore errors from, for instance, directories
        print("Ignoring %s" % file)
If you prefer, you can use a two-line version that just reads the crucial file:
import os
open(os.path.join(os.environ["localappdata"], r"Microsoft\Windows Live Photo Gallery\Pictures.pd6"), "rb").read()
Ideally this script would be run every minute or so while WLPG is running in order to keep the database file in the cache. The cost of doing so would be very low, especially if the file is still in the cache. Unfortunately the file is locked, so this doesn’t work.
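If you would rather not hold the whole file in memory while pre-caching it, a variant can read it sequentially in chunks and throw the data away. This is a sketch, not something I have measured: the warm_cache name and the 1 MB chunk size are arbitrary choices of mine, and it reports a file that cannot be opened (missing or locked) instead of crashing.

```python
import os

def warm_cache(path, chunk_size=1024 * 1024):
    """Read path sequentially in chunks, discarding the data; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total

if __name__ == "__main__":
    db = os.path.join(os.environ.get("localappdata", ""),
                      r"Microsoft\Windows Live Photo Gallery\Pictures.pd6")
    try:
        print("Read %d bytes" % warm_cache(db))
    except (IOError, OSError) as e:
        # The file is missing or could not be opened (for instance, locked).
        print("Could not read %s: %s" % (db, e))
```

The reads are still sequential, which is the part that matters for a slow drive; only the peak memory use changes.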
This doesn’t fix all of the WLPG performance problems. WLPG still does tens of MB of I/O when you zoom in on a photo. WLPG still takes too long to update its database, and it busy waits. WLPG doesn’t make use of my eight CPU cores to decode a few photos ahead, so it’s common to have to wait for it when stepping through photos. But, this does neatly solve one annoying problem.
All of the WLPG performance problems could easily be fixed. The database files could be precached, the excessive disk I/O when zooming on images could be avoided, and multi-core image decoding could easily let WLPG stay ahead of the user. None of these would be difficult. All of these solutions consume additional memory, but memory is cheap these days and I would happily spend 500 MB in order to make WLPG ten times more responsive.
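To illustrate what I mean by decoding ahead, here is a minimal sketch of the idea. It is not how WLPG works: the DecodeAhead class is hypothetical, the decode function is whatever image decoder you pass in, and the look-ahead depth and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

class DecodeAhead:
    """Keeps decodes of the next few items in flight while one is viewed."""

    def __init__(self, items, decode, ahead=3, workers=4):
        self.items = list(items)
        self.decode = decode      # hypothetical decoder: item -> decoded image
        self.ahead = ahead
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = {}         # index -> Future holding the decoded item

    def _schedule(self, index):
        if 0 <= index < len(self.items) and index not in self.pending:
            self.pending[index] = self.pool.submit(self.decode, self.items[index])

    def get(self, index):
        """Return the decoded item at index and queue the ones after it."""
        for i in range(index, index + self.ahead + 1):
            self._schedule(i)
        # Drop results well behind the viewer so memory use stays bounded.
        for old in [i for i in self.pending if i < index - 1]:
            del self.pending[old]
        return self.pending[index].result()
```

Stepping forward with get(n), then get(n + 1), usually finds the next photo already decoded, at the cost of keeping a handful of decoded images in memory. Threads only help if the decoder does its heavy lifting outside the Python interpreter lock; otherwise a process pool would be the closer analogue.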
My daughter refuses to use WLPG because it is too slow. I don’t blame her. But I wish she could use it.
Once again ETW has shown its value in allowing the cause of a performance problem to be diagnosed, and a workaround to be created.
This article was updated August 24 and August 28 2012 with more xperf details, and October 2015 with UIforETW links.