Zdenek Kotala's weblog
COMMIT or ROLLBACK that is the question

Thursday, August 07, 2008

Battle with 64bit beast

As part of my work on the in-place upgrade project, I need to clean up the code and prepare a unified API for page content access. The work went smoothly and I thought I was ready with the first prototype. Everything worked fine and the regression tests passed without any problems. However, I expected the new interface to carry a performance penalty, and I wanted to know how big the impact was. I asked my colleagues from the performance team whether they could help. Paul van den Bogaard offered to help, but he needed 64-bit SPARC binaries. I answered, "No problem." I used the same compilation flags as we use for the official Solaris PostgreSQL packages. And of course, as a true hacker, I tried gmake check at the end. Oops...

...
     bit                  ... ok
     numeric              ... ok
     txid                 ... ok
     uuid                 ... FAILED (test process exited with exit code 2)
     enum                 ... FAILED (test process exited with exit code 2)
     money                ... ok
...

======================================================
 27 of 115 tests failed, 1 of these failures ignored. 
======================================================

Hmmm. That does not look good. A quick look into regression.diffs confirms my suspicion that the problem is in the hash index.

  -- btree and hash index creation test
  CREATE INDEX guid1_btree ON guid1 USING BTREE (guid_field);
  CREATE INDEX guid1_hash  ON guid1 USING HASH  (guid_field);
! server closed the connection unexpectedly
!       This probably means the server terminated abnormally
!       before or while processing the request.
! connection to server was lost

OK. A core dump shows more. Fortunately, I had set the system to store core files in a special directory with the "core.%f.%p" pattern (see coreadm(1M)), which keeps all generated core files separate. And here is what pstack(1) shows:

 00000001000858f8 _hash_metapinit (1006c2818, ff00, fc00, 1000038c9bc, 13d, 2) + 260
 000000010008238c hashbuild (ffffffff7fffa1c8, 1006c2708, 1006ba670, 100645840, 1004e2000, 1006c2818) + 84
 00000001002c66d8 OidFunctionCall3 (100082308, 1006c2708, 1006c2818, 100645840, 3, ffffffff7fffa570) + 48
 00000001000d1eb8 index_build (1006c2708, 1006c2818, 100645840, 1006c24c8, 1006c47f0, 0) + 3c
 00000001000d16c0 index_create (0, 1, 4085, 100645840, ffffffff7fffa7b8, 1006c2818) + 78c

Good. I had adjusted the HashMetaPageData structure and added a new HashPageGetMeta macro to get a pointer to the correct place, and of course the change had to stay backward compatible.

--- a/src/include/access/hash.h Fri Aug 01 13:16:08 2008 +0000
+++ b/src/include/access/hash.h Sat Aug 02 18:04:16 2008 +0200
@@ -138,7 +138,6 @@
 
 typedef struct HashMetaPageData
 {
-       PageHeaderData hashm_phdr;      /* pad for page header (do not use) */
        uint32          hashm_magic;    /* magic no. for hash tables */
        uint32          hashm_version;  /* version ID */
        double          hashm_ntuples;  /* number of tuples stored in the table */

...

+#define HashPageGetMeta(page) \
+ ((HashMetaPage) (((char *)(page)) + SizeOfPageHeaderData + sizeof(ItemIdData)))

Because I got SIGBUS (Bus Error), there must be a problem with memory alignment, yet the start address of hashm_magic is in the correct place. Strange. Time to take out the debugger:

_hash_metapinit+0x260:          clrx      [%l4 + 0x24]

After some investigation (thanks to Vita Batrla for help with SPARC assembler), the picture became clear. Register %l4 contains a pointer to the page, so the instruction stores to offset 0x24 (36) in the page. That is hashm_ntuples, which is a double, and a double has to be located at an 8-byte aligned address. But how can that happen when the structure starts at the same address as before? The problem is the size of the PageHeaderData structure. It is 28 bytes long, which is not divisible by 8. The compiler therefore had to put 4 padding bytes between hashm_version and hashm_ntuples in the original structure. That padding is not necessary in the new structure, so unfortunately its layout is different. I compared it with the code generated from head and the difference is clear:

_hash_metapinit+0x248:          clrx      [%i3 + 0x28]
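
To make the padding argument concrete, here is a minimal sketch in C that compares the two layouts. It is not the PostgreSQL source: FakePageHeader is a hypothetical 28-byte stand-in for PageHeaderData, OldMeta and NewMeta are illustrative names, and only the first three fields of the meta page are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 28-byte PageHeaderData (real fields omitted). */
typedef struct { uint32_t words[7]; } FakePageHeader;
static_assert(sizeof(FakePageHeader) == 28, "stand-in header must be 28 bytes");

/* Original layout: the header pad is part of the structure itself. */
typedef struct {
    FakePageHeader hashm_phdr;
    uint32_t       hashm_magic;
    uint32_t       hashm_version;
    double         hashm_ntuples;   /* compiler inserts padding before this */
} OldMeta;

/* New layout: the structure is overlaid on the page right after the header. */
typedef struct {
    uint32_t hashm_magic;
    uint32_t hashm_version;
    double   hashm_ntuples;         /* no padding needed inside the struct */
} NewMeta;

/* Offset of hashm_ntuples from the start of the page, original layout. */
size_t old_ntuples_offset(void)
{
    return offsetof(OldMeta, hashm_ntuples);
}

/* Offset of hashm_ntuples from the start of the page, new layout. */
size_t new_ntuples_offset(void)
{
    return sizeof(FakePageHeader) + offsetof(NewMeta, hashm_ntuples);
}
```

On an ABI where double is 8-byte aligned, old_ntuples_offset() gives 40 (0x28) and new_ntuples_offset() gives 36 (0x24), which matches the two offsets in the disassembly above.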

During the investigation I found another interesting thing. When I compiled PostgreSQL without optimization, the problem disappeared. A close look at the code shows what happened:

_hash_metapinit+0x32c:          std       %f36, [%l0 + 0x8]

The STD instruction works fine with the misaligned floating-point value.
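
Whether a misaligned access survives clearly depends on which instruction the compiler picks. As an aside, and not something I did in the patch, a common way to stay independent of that choice in C is to move the value with memcpy instead of dereferencing a double pointer. The helper names below are hypothetical and the page is just a byte buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store a double at any byte offset in a page buffer, aligned or not. */
void store_double_at(unsigned char *page, size_t offset, double value)
{
    memcpy(page + offset, &value, sizeof value);
}

/* Load a double from any byte offset in a page buffer, aligned or not. */
double load_double_at(const unsigned char *page, size_t offset)
{
    double value;
    memcpy(&value, page + offset, sizeof value);
    return value;
}

/* Usage sketch: round-trip a value through offset 36, which is not a multiple of 8. */
void demo_round_trip(void)
{
    unsigned char page[64] = {0};
    store_double_at(page, 36, 1234.5);
    assert(load_double_at(page, 36) == 1234.5);
}
```

This only illustrates the access pattern. It does not solve the compatibility question, because the on-disk offset of hashm_ntuples would still differ from the original layout.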

And what is the conclusion? I can either break backward compatibility and clean up the hash code, or create several versions of HashMetaPage for the different page layouts. Everything depends on how the new hash index implementation ends up. If it is integrated into PostgreSQL 8.4, compatibility will be broken anyway and the cleanup can be done.

Posted by hlipa (August 07, 2008, 03:26:14 PM CEST)

