Benchmarking the storage of images as files or as BLOBs.
Anyone who has done any significant web development has no doubt resorted to the use of a relational database. I am no exception! And so I was very interested when today I read an article about whether images should be stored on the file system as files, or whether they should be stored within a database.
In this article, performance and database size were listed as being cons. But for each, it was suggested that the impact of storing an image in a database rather than in a file on the file system was quite minimal. This is intriguing, as storing an image in a relational database does have many benefits, one of the main ones being the ability to easily relate it to relevant metadata.
However, I would like to see the results of further testing and benchmarks. I am most interested in seeing whether or not the performance issues are still virtually nonexistent when simulating a production environment. I wish I had more time available to perform these benchmarks myself. Specifically, I would like to see what impact different levels of concurrency and image size would have on the performance of the PostgreSQL and SQLite relational database systems.
January 10th, 2007 at 3:16 am
Your blog entry and comments motivated me to run a few time trials. See what your thoughts are: http://blog.jasonbherald.com/index.php?/archives/25-Mysql-BLOB-vs-Filesystem-Timetrials.html
January 11th, 2007 at 4:56 am
[…] Recently I read an article that talked of storing images in databases versus on the file system. In an article I wrote, I expressed an interest in seeing more data regarding the performance differences between the two methods. Well, the author of the original article has done some more testing, and written about it. […]
February 12th, 2007 at 5:49 pm
I have recently been pondering the very same question. I think it really does make sense to store images in a DB if they have a dynamic nature. For example, the user of a CMS needs to change thumbnails and image galleries easily without having to modify the code of the page.
After seeing the benchmarks on Jason Herald’s blog, I feel that using a DB to serve images on the fly may be an acceptable solution after all.
Two questions remain for me though:
1) What happens when we increase the load on the DB server so that one page loads twenty different images from the DB, versus loading the same twenty images as standard files in a directory? I’d like to see the benchmarks then, not only for each image load but for the entire page load time itself.
2) How does this affect the performance of the database/server for other operations?
For instance, if you have a large 5MB image being sent to one user’s browser, does that slow down other regular (non-blob) events in the DB that may need to occur? Does it affect other aspects of the server’s performance (e.g. Apache or PHP)?
I’ll probably find most of this out on my own eventually when I attempt to implement a project using DB blobs for dynamic images, but it’d still be good to see hard numbers in some way. I’ll see if I can come up with something that tests out my theory.
Dustin Weber
http://www.dustinweber.com