From the website,
The main features of H2 are:
- Very fast, free for everybody, source code is included
- Written [in] Java; can be compiled with GCJ (Linux)
- Embedded, Server and Cluster modes
- JDBC and (partial) ODBC API; Web Client application

Java DB, Sun's distribution of Apache Derby, can also be run in embedded mode.
As of JDK 6, it is being shipped with the JDK for developer use!
It has high-grade transaction capabilities and very good performance and security features.
More work is being done on performance, security and more as we speak.
Java DB 10.2.2.0 has recently been released, and you can find out more about Java DB here!
Posted by M. Mortazavi on January 06, 2007 at 11:23 AM EST #
Posted by David Coldrick on January 06, 2007 at 04:17 PM EST #
Posted by 84.202.209.63 on January 06, 2007 at 09:07 PM EST #
Posted by 201.22.134.233 on January 08, 2007 at 09:33 PM EST #
Posted by David Coldrick on January 09, 2007 at 05:24 AM EST #
Durability is an important aspect of a database system; I would even say that it is fundamental. Derby is fully ACID compliant whereas H2 is not: H2 does *not* guarantee that all committed transactions will survive a power failure or an application crash. I don't think that losing committed transactions is a viable option when it comes to a database system.
The downloadable size is not really a big problem these days. You can even compress derby.jar (~2MB) down to ~600k with Java 5+ Pack200, and both the Java plug-in and Java Web Start support Pack200-compressed archives / JARs downloaded on the client.
Finally, an Open Source product such as Apache Derby has a strong community of many developers, including some from IBM and Sun, contributing source code changes (i.e. fixes / enhancements) back to the product.
It is good to have diversity and it is also good to know what you're dealing with when it comes to database systems ;-)
Posted by Francois Orsini on January 09, 2007 at 05:41 AM EST #
Of course performance is relative. There are situations where a tractor is faster than a car. It would be good to have a test case where Derby is faster, tell me if you have one.
Currently H2 doesn't support row locks. But I don't think that most embedded applications need it. Anyway, the trend is towards multi version concurrency control (MVCC), and that's the next big thing that will be implemented in H2 (however this will take some time).
Durability: The default isolation level of Derby is read committed, right? As far as I understand it, for fully-ACID compliant the isolation level should be serializable (see Isolation in Wikipedia). I'm not sure if supporting 'full ACID' compliance by default would make sense. I have implemented and run a durability test with various databases and the file system (a simple power-off test using two computers), and things don't look good. The problem is, even if the database tries to flush to disk for each commit, the operating system and/or hard disk does not always do that. For details see ACID. If you really want to enforce flushing to disk, you need to wait at least 0.1 seconds per transaction, and even Derby doesn't do that by default. That means, even Derby does *not* guarantee that all committed transactions will survive a power failure or an application crash. If you have other results using common hardware / default settings and this test, or if you find a way that is faster, please tell me! Hopefully the next generation hard drives (with integrated flash memory) will be better... But if you need 'no single point of failure' then you need clustering / mirroring anyway. H2 supports clustering, Derby does not.
Download size: I think David was talking about size of product download (16 MB for Derby versus 3 MB for H2) not about the jar file size (2.2 MB for Derby versus 1 MB for H2). By the way, the H2 jar also contains the Console web application and web server, and other tools. And debugging info (line numbers) is switched on in H2, and switched off in Derby. But I agree the jar file size is not the most important factor.
Community: Yes, Derby has more developers (4, according to Ohloh, not sure if this is correct), but that doesn't necessarily mean a better product. Development of Derby started in 1996 or earlier, while H2 started in 2004 (it is now one year online). H2 is a very young product, and currently doesn't have professional support from a bigger company. This will be available in the future when there is demand. You could also say Derby has a liability (big, old, slow code base). Anyway, H2 also has quite a big community, given how young it is. But of course Derby has the advantage of the Apache name ('branding'), but this doesn't mean it's better (there are many failed Apache projects).
But only time can tell which database is more successful.
Posted by Thomas Mueller on January 09, 2007 at 07:58 PM EST #
Ok trying again... the complete comment is here.
Posted by Thomas Mueller on January 09, 2007 at 11:54 PM EST #
Posted by David Coldrick on January 10, 2007 at 05:49 AM EST #
> It would be good to have a test case where Derby is faster, tell me if you have one.
>
I gave a specific database server stress test scenario when I mentioned TPC-B (this is just one particular case). A tractor does not compete in an F1 race, and that is where your analogy is flawed, because Derby actually performs more than decently in that context. As I said, embedded applications are one particular facet of today's applications, but they do not represent all of the applications out there. Results are relative to the particular tests one defines and runs (yours in this case) and do not show how a database performs in other contexts (embedded or not).
> Currently H2 doesn't support row locks. But I don't think that most embedded applications need it.
> Anyway, the trend is towards multi version concurrency control (MVCC), and that's the next big
> thing that will be implemented in H2 (however this will take some time).
>
Good to hear this. Row-level locking has been implemented in Derby since the first incarnation of Cloudscape in '96. MVCC is good but not for applications which are doing intense updates and writes. The reason is pretty obvious when compared with a lock-based concurrency scheme, and that is why "some" database(s) support both approaches.
> Durability: The default isolation level of Derby is read committed, right?
> As far as I understand it, for fully-ACID compliant the isolation level should
> be serializable (see Isolation in Wikipedia). I'm not sure if supporting 'full ACID'
> compliance by default would make sense. I have implemented and run a durability test
> with various databases and the file system (a simple power-off test using two computers),
> and things don't look good. The problem is, even if the database tries to flush to disk
> for each commit, the operating system and/or hard disk does not always do that.
> For details see ACID. If you really want to enforce flushing to disk, you need to wait
> at least 0.1 seconds per transaction, and even Derby doesn't do that by default.
> That means, even Derby does *not* guarantee that all committed transactions will survive
> a power failure or an application crash. If you have other results using common
> hardware / default settings and this test, or if you find a way that is faster,
> please tell me! Hopefully the next generation hard drives (with integrated flash memory)
> will be better... But if you need 'no single point of failure' then you need
> clustering / mirroring anyway. H2 supports clustering, Derby does not.
>
The golden rule is that you should not rely on the file system for write operations unless you have some means to force-flush and check I/O completions. That is why Unix raw devices were made available almost 20 years ago: one could bypass the FS and use async I/O at the kernel level to retrieve the completion status of a particular I/O and make sure it made it to disk. There are techniques such as write-through, where the FS buffer is bypassed and a write I/O is expected to reach the disk every time you request it. It is a binary operation: either it works, or you get an I/O error because the I/O did not complete to disk. Relying on the FS and some UPS hardware device is OK, _but_ a UPS is NOT what you usually find in every embedded device or client desktop. You can't expect everyone to have a UPS to alleviate issues caused by a database system losing committed transactions and therefore not handling ACID durability as expected. I've worked at many database companies dealing with critical applications, and if I had told customers that they could lose committed transactions due to an application or system crash, I don't think these companies would have been as successful as they have been. Some things, such as not losing committed transactions, have to be handled at the database level, and that is what durability is all about. Today, Derby will not lose transactions that have been committed whether you have some UPS or not.
> Download size: I think David was talking about size of product download (16 MB for
> Derby versus 3 MB for H2) not about the jar file size (2.2 MB for Derby versus 1 MB
> for H2). By the way, the H2 jar also contains the Console web application and web server,
> and other tools. And debugging info (line numbers) is switched on in H2, and switched off
> in Derby. But I agree the jar file size is not the most important factor.
>
Download size is irrelevant in today's world except for web applications, and in that case one does NOT have to download the whole product. For embedded applications it is basically only one JAR file, and whether it is H2 or Derby, the size is not really an issue (as I mentioned in some earlier thread).
> Community: Yes, Derby has more developers (4, according to Ohloh, not sure if
> this is correct), but that doesn't necessarily mean a better product.
> Development of Derby started in 1996 or earlier, while H2 started in 2004
> (it is now one year online). H2 is a very young product, and currently doesn't
> have professional support from a bigger company. This will be available in the
> future when there is demand. You could also say Derby has a liability (big, old,
> slow code base). Anyway, H2 also has quite a big community, given how young it is.
> But of course Derby has the advantage of the Apache name ('branding'), but this doesn't
> mean it's better (there are many failed Apache projects).
>
Derby has more than 30 contributors. What you saw in Ohloh are the top committers for 2007 (new year, eh), and this is why it is 4. Last year 23 committers checked in code, so I'll let you do the stats as far as how many contributors there could be. Not every contributor is a committer to the project; that's how Apache works, and a lot of other open source projects too. Derby has developers from Sun (Java DB), IBM (Cloudscape) as well as other independent contributors or companies. Derby is _not_ big - The footprint is not big (2MB) for the engine compared to some other databases out there and is more than adequate for a lot of today's embedded applications. Apache is not just about branding: it has been and continues to be a set of communities for many successful projects with defined rules and guidelines. At the end of the day, it is all about Open Source projects, and quite a few of them have made lots of noise in the past many years and still continue to do so.
> But only time can tell which database is more successful.
>
Again, I was not bashing H2 if this is the way you felt - I clearly mentioned that one has to know what type of database(s) one is dealing with before claiming it is faster for *all* use case scenarios out there.
Posted by Francois Orsini on January 10, 2007 at 10:59 AM EST #
Posted by David Coldrick on January 10, 2007 at 01:30 PM EST #
> I mentioned TPC-B
OK. The performance test (open source by the way) used by H2 currently uses algorithms similar to TPC-A and TPC-C, I will add one that is similar to TPC-B when I have time. I'm quite sure that Derby is not that much faster using this benchmark. I'm still waiting for a benchmark where Derby is nearly as fast as H2.
> MVCC is good but not for applications which are doing intense updates and writes
You mean concurrent connections updating the same rows again and again. I don't think this is such a big problem. All the newer engines are based on MVCC (MySQL Falcon), and MVCC is added to the older ones. But supporting both locking and MVCC does make sense of course.
> Derby will not lose transactions that have been committed
Well, unfortunately, this is not what I have found. Derby did lose transactions sometimes. You can test it yourself: the source code and documentation to do that are included in H2. See http://www.h2database.com/html/advanced.html#acid and 'Your Hard Drive Lies to You' (http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252&tid=198&tid=128). Also, using FileDescriptor.sync() or FileChannel.force() after each file operation, only around 30 file operations per second can be made. And Derby does not call those functions for each commit.
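A minimal Java sketch of the kind of measurement described here: overwrite one byte, call FileChannel.force(true) after each write, and report how many such operations complete per second. This is not the test that ships with H2; the class name SyncRate, the temp-file name and the iteration count are made up for illustration, and the result depends heavily on the operating system, file system and drive write cache.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of H2 or Derby.
class SyncRate {

    /** Overwrites one byte at position 0 and forces it out, count times; returns operations per second. */
    static double forcedWritesPerSecond(Path file, int count) throws IOException {
        ByteBuffer oneByte = ByteBuffer.allocate(1);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                oneByte.clear();
                oneByte.put((byte) i);
                oneByte.flip();
                channel.write(oneByte, 0);  // always rewrite the same position
                channel.force(true);        // ask the OS to flush content and metadata
            }
            long elapsed = Math.max(1L, System.nanoTime() - start);
            return count * 1_000_000_000.0 / elapsed;
        }
    }

    /** Runs the measurement on a temporary file and removes the file afterwards. */
    static double measureOnTempFile(int count) {
        try {
            Path file = Files.createTempFile("sync-rate", ".bin");
            try {
                return forcedWritesPerSecond(file, count);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("forced writes per second: %.1f%n", measureOnTempFile(200));
    }
}
```

A rate far above what the drive's rotation speed allows suggests that a cache somewhere between the JVM and the platter is acknowledging the flush early.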
> Derby is _not_ big - The footprint is not big (2MB) for the engine compared to some other databases out there
Well, compared to other Java databases, Derby is by far the biggest, right? I'm not sure if there is really so much more functionality than in H2...
> and whether it is H2 or Derby, the size is not really an issue (as I mentioned in some earlier thread)
It is a problem for some people. For those where even the size of H2 is a problem, I usually recommend PointBase Micro (50 KB jar file size).
Posted by Thomas Mueller on January 10, 2007 at 10:36 PM EST #
Posted by Hani Suleiman on January 11, 2007 at 11:28 AM EST #
I don't know when you last used Derby, but the latest documentation set is very complete, with a lot of samples as well as individual guides for the various contexts you need information about. In fact, Derby documentation has always been quite complete. Is it perfect? No, but neither are the documentation sets of many products or open source projects out there, and it is certainly well above average. http://db.apache.org/derby/manuals/index.html#latest The community is very active and always keen to help users and developers.
I'm not sure how you can (especially without any vested interest, as you mentioned) state that H2 is a far superior product. Derby is obviously a more mature database system (1st version in '97), which is to be expected as H2 is fairly recent. As far as embedding them goes, both products require a JDBC driver class and a URL to connect to the engine, and both have their core database engine packaged as one JAR file.
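To make the embedding point concrete, here is a rough Java sketch of opening either engine in embedded mode through plain JDBC. The URL shapes (`jdbc:h2:./name` and `jdbc:derby:name;create=true`) follow each project's documented embedded form; the class name EmbeddedConnect and the database name "demo" are placeholders, and the sketch assumes the relevant engine JAR is on the classpath (otherwise it simply reports the engine as not available).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper for illustration; "demo" is a placeholder database name.
class EmbeddedConnect {

    /** Embedded H2 URL for a database file in the current directory. */
    static String h2Url(String name) {
        return "jdbc:h2:./" + name;
    }

    /** Embedded Derby URL; create=true creates the database if it does not exist. */
    static String derbyUrl(String name) {
        return "jdbc:derby:" + name + ";create=true";
    }

    /** Returns true if a driver on the classpath accepts the URL and the connection is usable. */
    static boolean canConnect(String url) {
        try (Connection connection = DriverManager.getConnection(url)) {
            return connection.isValid(1);
        } catch (SQLException e) {
            return false;  // no suitable driver, or the engine refused the connection
        }
    }

    public static void main(String[] args) {
        // Note: with the engine JARs present, this creates database files in the working directory.
        for (String url : new String[] { h2Url("demo"), derbyUrl("demo") }) {
            System.out.println(url + " -> " + (canConnect(url) ? "connected" : "not available"));
        }
    }
}
```

With JDBC 4 (Java 6 and later) the driver class is normally discovered from the JAR automatically, so no explicit Class.forName call appears in the sketch.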
Derby installation is documented under: http://db.apache.org/derby/docs/dev/getstart/getstart-single.html#cgsinstallingderby and Java DB (based on Apache Derby) is bundled as part of Sun JDK 6.
Maybe what you're asking for is some kind of installer, but once the distribution archive is extracted on disk, you just need to set 2 environment variables and a 3rd optional one.
Regards,
Posted by Francois Orsini on January 24, 2007 at 07:54 AM EST #
> after each file operation, only around 30 file operations per second can be made.
Only when single user. Derby will happily support many more transactions per second when multiple threads are executing transactions. This report shows up to 500/sec when disk caching is disabled.
Derby Performance
> And Derby does not call those functions for each commit.
Yes it does, with optimizations that a single sync() can satisfy the commit of multiple transactions.
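The idea that one sync() can satisfy several commits is often called group commit. The Java sketch below shows one possible way to arrange it, under stated assumptions: it is not Derby's actual implementation, the class name GroupCommitLog is invented, and the Runnable passed in merely stands in for a real FileDescriptor.sync() on the log file.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of group commit; not Derby's actual code.
class GroupCommitLog {
    private final Runnable syncAction;  // stands in for FileDescriptor.sync() on the log file
    private long appended = 0;          // commit records appended so far
    private long durable = 0;           // commit records covered by a finished sync
    private boolean syncing = false;    // true while some thread is running syncAction

    GroupCommitLog(Runnable syncAction) {
        this.syncAction = syncAction;
    }

    /** Appends a commit record and returns once a completed sync covers it. */
    void commit() {
        long mine;
        synchronized (this) {
            mine = ++appended;  // a real log would also write the record here
        }
        while (true) {
            long target;
            synchronized (this) {
                try {
                    while (syncing && durable < mine) {
                        wait();  // another thread is syncing; it may cover our record
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for sync", e);
                }
                if (durable >= mine) {
                    return;  // someone else's sync already covered this commit
                }
                syncing = true;
                target = appended;  // this sync covers every record appended so far
            }
            boolean ok = false;
            try {
                syncAction.run();
                ok = true;
            } finally {
                synchronized (this) {
                    if (ok) {
                        durable = target;
                    }
                    syncing = false;
                    notifyAll();
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger syncs = new AtomicInteger();
        GroupCommitLog log = new GroupCommitLog(() -> {
            try {
                Thread.sleep(5);  // pretend each sync costs 5 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            syncs.incrementAndGet();
        });
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 20; j++) {
                    log.commit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("commits: " + (workers.length * 20) + ", syncs: " + syncs.get());
    }
}
```

With a single thread every commit pays for its own sync; with several threads the sync count usually drops below the commit count, which is how throughput can exceed the single-user figure without skipping the flush.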
Posted by Dan Debrunner on January 26, 2007 at 04:18 AM EST #
Thanks for your comment! Using the write cache does not reduce the probability of recovery for H2. In the presentation, there is a scary statement: 'The write cache reduces probability of successful recovery after power failure' [for Derby]. Why is that? If this is true, then things don't look good for Derby. Because even when calling FileDescriptor.sync()/FileChannel.force() (this is called 'fsync'), data is not always flushed to disk. See also: 'Your Hard Drive Lies to You' http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252&tid=198&tid=128
Or you can re-run the power-off test that is included in H2, see also: http://www.h2database.com/html/advanced.html#acid (Durability).
If this sounds bad, it gets worse. Today I ran a test using FileDescriptor.sync(), FileChannel.force(), and Derby. When writing one byte and calling FileDescriptor.sync() or FileChannel.force(), I get about 50 operations per second. When using Derby, I get about 500 operations per second. That can only mean that Derby is actually not calling one of those functions. It looks like Derby uses RandomAccessFile(.., "rwd") instead (this is using O_SYNC flag when opening the file). This is actually not the same. fsync also writes through the hardware write cache. O_SYNC does not. With a hard drive running at 7200 rpm, you basically have an upper limit of about 7200/60=120 synchronous physical writes per second to the same position on the drive. This is also the result when using fsync. But instead of talking theory, I suggest you actually run the test. The source code is included in H2, as is the source code of the H2 benchmarks.
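For readers who want to try the comparison without the H2 test harness, here is a small Java sketch, offered as an assumption-laden stand-in rather than the original test. It computes the rotational bound from the paragraph above (rpm / 60) and times one-byte rewrites in two ways: a RandomAccessFile opened in "rwd" mode, and one opened in "rw" mode with an explicit FileDescriptor.sync() after each write. The class name SyncModes and the iteration counts are invented.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not taken from H2 or Derby.
class SyncModes {

    /** At most one rewrite of the same position per platter revolution: rpm / 60 per second. */
    static int maxRewritesPerSecond(int rpm) {
        return rpm / 60;
    }

    /** Rewrites one byte count times using the given open mode; returns operations per second. */
    static double rewritesPerSecond(String mode, boolean explicitSync, int count) {
        try {
            File file = File.createTempFile("sync-modes", ".bin");
            try (RandomAccessFile raf = new RandomAccessFile(file, mode)) {
                long start = System.nanoTime();
                for (int i = 0; i < count; i++) {
                    raf.seek(0);
                    raf.write(i);  // writes the low 8 bits of i
                    if (explicitSync) {
                        raf.getFD().sync();
                    }
                }
                long elapsed = Math.max(1L, System.nanoTime() - start);
                return count * 1_000_000_000.0 / elapsed;
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("7200 rpm bound: " + maxRewritesPerSecond(7200) + " rewrites/s");
        System.out.printf("\"rwd\" mode:       %.1f ops/s%n", rewritesPerSecond("rwd", false, 200));
        System.out.printf("\"rw\" + FD.sync(): %.1f ops/s%n", rewritesPerSecond("rw", true, 200));
    }
}
```

If either measured rate is well above the rotational bound on a spinning disk, some layer is caching the write; the sketch cannot tell which layer, only that the numbers disagree with the arithmetic.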
Unfortunately, the benchmark that was used in the presentation you refer to is closed source. I asked, and I was told it cannot be released (now). May I ask why you refer to a closed source benchmark? Is there no open source benchmark where Derby is faster than any competitor, 10 years after Cloudscape / Derby was started?
Thomas
Posted by Thomas Müller on January 27, 2007 at 10:40 AM EST #
Posted by Ivan Dolvich on April 19, 2007 at 06:26 AM EST #
Just had to add some flame to the fire. I discovered h2 at the end of 2007 and agree that
1) setup is a breeze compared to derby, assuming you don't use the derby that is now packaged alongside glassfish with the jdk. For anyone who doubts this, it's a simple download; try it yourself. The real kicker for h2 is the javascript sql client that fires up, derby 'ain't got nothin on it'.
2) in performance testing derby was a dog for me. Spent a week benching 6 different databases, and after reading all the forums and applying tweaks, etc., derby just wasn't up to the same speed as mysql (!), mckoi, hsqldb, or h2 (postgres was slowest for me). Durability was not a concern for me; I wanted db setup and teardown to be as quick as possible for some functional testing using selenium.
Whoever speculated that derby has yet to shed its Cloudscape heritage probably said it right.
Posted by Daniel Juliano on June 30, 2008 at 02:41 PM EST #
Hello David, I'm going to be a little off topic on this blog; I found it the only way I could get a message to you.
I'm looking for David Coldrick, in Switzerland. Your name popped up first in my search engine. I'm from Uganda and once studied at Muyenga High School.
Please do write back if any of this is familiar to you.
Regards.
Edger Serungogi
Posted by Edger Serungogi on October 18, 2008 at 03:50 PM EST #