Three different operations were performed in the tests:
- Insert/POST operation to create a new object
- Update/PUT operation to update an object
- Query/GET to search for objects by an (indexed) field/property
In the PHP/MySQL tests, all operations were handled by a very simple PHP script. The script first ran a quick security-check query against a small table (actually empty for the tests) to emulate the security capabilities of Persevere and CouchDB (although CouchDB’s security capabilities are limited and probably often require additional logic). It then executed the main query against MySQL: an INSERT, UPDATE, or SELECT. All created objects/records had four properties/fields on all three systems. In MySQL, two properties were indexed: the primary key and the property used in the query requests. In CouchDB, a simple view indexed a single property, and this view served the requests that queried by index. Both the Persevere and CouchDB tests used the systems’ standard HTTP interfaces for creating, updating, and querying.
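The request flow of the PHP script can be sketched roughly as follows. This is an illustration only: it uses Python with an in-memory SQLite database as a stand-in for PHP and MySQL, and the table names (acl, objects), the column names, and the handle function are hypothetical rather than taken from the benchmark files.

```python
import sqlite3


def open_db():
    """Create the two tables used by the sketch (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Small security table; left empty, as in the tests.
        CREATE TABLE acl (username TEXT, allowed INTEGER);
        -- Four fields per record; the primary key is indexed implicitly.
        CREATE TABLE objects (
            id INTEGER PRIMARY KEY,
            name TEXT,
            category TEXT,
            value INTEGER
        );
        -- Second index: the field used by the query requests.
        CREATE INDEX idx_objects_category ON objects (category);
        """
    )
    return conn


def handle(conn, method, params):
    """Run the security-check query, then the main query for the request."""
    # Step 1: quick security-check query (result ignored in this sketch).
    conn.execute(
        "SELECT allowed FROM acl WHERE username = ?",
        (params.get("username", ""),),
    ).fetchall()

    # Step 2: the main query, chosen by request type.
    if method == "POST":
        cur = conn.execute(
            "INSERT INTO objects (name, category, value) VALUES (?, ?, ?)",
            (params["name"], params["category"], params["value"]),
        )
        conn.commit()
        return cur.lastrowid
    if method == "PUT":
        cur = conn.execute(
            "UPDATE objects SET value = ? WHERE id = ?",
            (params["value"], params["id"]),
        )
        conn.commit()
        return cur.rowcount
    if method == "GET":
        return conn.execute(
            "SELECT id, name, category, value FROM objects WHERE category = ?",
            (params["category"],),
        ).fetchall()
    raise ValueError(f"unsupported method: {method}")
```

Every request therefore costs two queries on the PHP/MySQL side: the security check and the main statement.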
Tests were carried out by an HTTP client running on 10 threads, concurrently issuing a sequence of 200 of each type of request. The “full test” performed create, update, and query requests. The “write test” performed create and update requests, and the “read test” performed only query requests. The files used to perform the benchmarks are available here.
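A minimal sketch of such a test driver is shown below. It is not the benchmark code itself: the send callback is a hypothetical placeholder for whatever issues the actual HTTP request, and the sketch assumes that each of the 10 threads issues 200 requests of each type, which is one possible reading of the setup described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# The three test mixes described above.
FULL_TEST = ("create", "update", "query")
WRITE_TEST = ("create", "update")
READ_TEST = ("query",)


def worker(operations, send, per_type):
    """Issue per_type requests of each operation in sequence; return the count."""
    count = 0
    for op in operations:
        for i in range(per_type):
            send(op, i)  # hypothetical callback that performs one request
            count += 1
    return count


def run_test(operations, send, threads=10, per_type=200):
    """Run the workers concurrently and return (total requests, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(worker, operations, send, per_type)
            for _ in range(threads)
        ]
        total = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    return total, elapsed
```

Dividing the total by the elapsed time gives a requests-per-second figure for a given mix, for example run_test(WRITE_TEST, send).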
Direct Data-Bound Object Representation
Shared Cache of Objects with Copy-on-Write
Not only are in-memory objects shared between the application level and the database level, but Persevere also uses a cache of objects shared between threads to ensure that any given record/object exists in memory at most once. Traditional application frameworks process separate HTTP requests concurrently, and each request has its own result set and its own copy of the data. This can lead to significant duplication of data in memory. With Persevere, objects are always reused if they are still available in memory.
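The sharing idea can be illustrated with a small sketch. This is not Persevere's implementation (Persevere is not written in Python); the Record and SharedObjectCache classes and the load callback are hypothetical, and the sketch covers only the reuse of live objects across threads, not copy-on-write.

```python
import threading
import weakref


class Record:
    """Hypothetical in-memory object backed by a database record."""

    def __init__(self, record_id, data):
        self.record_id = record_id
        self.data = data


class SharedObjectCache:
    """Hands every thread the same instance for a given record id."""

    def __init__(self, load):
        self._load = load  # hypothetical callback: record id -> Record
        self._lock = threading.Lock()
        # Weak values: an entry disappears once nothing else references
        # the object, so only objects still in use stay cached.
        self._objects = weakref.WeakValueDictionary()

    def get(self, record_id):
        with self._lock:
            obj = self._objects.get(record_id)
            if obj is None:
                # Not in memory: load it once and share it from now on.
                obj = self._load(record_id)
                self._objects[record_id] = obj
            return obj
```

Two requests that ask for the same record receive the same object, so the record is held in memory once rather than once per request.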
Append-based Database Storage
Adaptive On-Demand Concurrent Indexing
Batched Writes in Integrity Mode
One of the most expensive operations that a database can perform is a forced synchronous disk write. These operations are necessary for high-integrity commit mode, in which the commit does not return until the database is certain that the data has actually been written to disk, fulfilling the durability component of ACID compliance. Each such write can take around 10 ms. To improve the performance of high-integrity commits, Persevere detects when multiple writes are taking place concurrently and batches them together into a single synchronous disk write. When a number of concurrent write requests are being sent to Persevere, this can significantly reduce the number of synchronous writes that must take place and greatly improve performance.
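The batching idea (often called group commit) can be sketched as follows. This is a simplified illustration under stated assumptions, not Persevere's actual code: the BatchedLog class, its method names, and the sync_count counter are hypothetical. Each commit call returns only after its data has been forced to disk, but writers that arrive while a sync is in progress are flushed together by the next thread that takes over.

```python
import os
import threading


class BatchedLog:
    """Append-only log where concurrent commits share one fsync."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._pending_lock = threading.Lock()  # protects _pending
        self._flush_lock = threading.Lock()    # one flusher at a time
        self._pending = []                     # (payload, done event) pairs
        self.sync_count = 0                    # number of fsync calls made

    def commit(self, payload):
        """Block until payload is durably written."""
        done = threading.Event()
        with self._pending_lock:
            self._pending.append((payload, done))
        with self._flush_lock:
            # If an earlier flusher already wrote our payload, skip.
            if not done.is_set():
                with self._pending_lock:
                    batch, self._pending = self._pending, []
                for data, _ in batch:
                    self._file.write(data)
                self._file.flush()
                os.fsync(self._file.fileno())  # one sync for the whole batch
                self.sync_count += 1
                for _, event in batch:
                    event.set()

    def close(self):
        self._file.close()
```

With a single writer this performs one sync per commit. Under concurrent load, writes that queue up behind an in-progress sync are written with one fsync call, so sync_count can be well below the number of commits.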
O(log n) time, but queries of the form [?prop1.prop2='something'] will only execute in O(n) time. Future versions will provide fast O(log n) execution for a much broader range of queries. A later release will also provide true ACID compliance (the current version does not fulfill the atomicity constraint). Finally, replication/clustering services will be added in the future as well, for distributing the Persevere workload across multiple servers.
As Alex Payne pointed out, the economy may be ending the era of disregarding system performance and efficiency with the excuse of buying more servers. More servers cost more money, and architectures like Persevere that can efficiently handle large numbers of users and heavy traffic with minimal hardware resources equate to real money saved.
Update: Jan Lehnardt pointed out that CouchDB is now at version 0.9.0 and that OS X is not the optimal platform for CouchDB, so the latest version of CouchDB can presumably improve upon the CouchDB performance shown in these tests. Hopefully we can progress towards better benchmarking tools for this new breed of databases.