Zdenek Kotala's weblog
COMMIT or ROLLBACK that is the question

Thursday, August 07, 2008

Battle with 64bit beast

As part of my work on the in-place upgrade project I need to clean up the code and prepare a unified API for page content access. The work went smoothly and I thought I was ready with the first prototype. Everything worked fine and the regression tests passed without any problems. However, I expected that the new interface would have a performance penalty and I wanted to know how big the impact is. I asked my colleagues from the performance team if they could help me. Paul van den Bogaard offered to help, but he needed 64-bit SPARC binaries. I answered "It's no problem". I used the same compilation flags as we use for the official Solaris PostgreSQL packages. And of course, as a true hacker, I tried gmake check at the end. Oops...

...
     bit                  ... ok
     numeric              ... ok
     txid                 ... ok
     uuid                 ... FAILED (test process exited with exit code 2)
     enum                 ... FAILED (test process exited with exit code 2)
     money                ... ok
...

======================================================
 27 of 115 tests failed, 1 of these failures ignored. 
======================================================

Hmmm. That does not look good. A quick look into regression.diffs confirms my suspicion that the problem is in the hash index.

  -- btree and hash index creation test
  CREATE INDEX guid1_btree ON guid1 USING BTREE (guid_field);
  CREATE INDEX guid1_hash  ON guid1 USING HASH  (guid_field);
! server closed the connection unexpectedly
!       This probably means the server terminated abnormally
!       before or while processing the request.
! connection to server was lost

OK. The core dump shows more. Fortunately, I had set up the system to store core files in a special directory with the "core.%f.%p" pattern (see coreadm(1M)). It helps to keep all generated core files separate. And here is what pstack(1) shows:

 00000001000858f8 _hash_metapinit (1006c2818, ff00, fc00, 1000038c9bc, 13d, 2) + 260
 000000010008238c hashbuild (ffffffff7fffa1c8, 1006c2708, 1006ba670, 100645840, 1004e2000, 1006c2818) + 84
 00000001002c66d8 OidFunctionCall3 (100082308, 1006c2708, 1006c2818, 100645840, 3, ffffffff7fffa570) + 48
 00000001000d1eb8 index_build (1006c2708, 1006c2818, 100645840, 1006c24c8, 1006c47f0, 0) + 3c
 00000001000d16c0 index_create (0, 1, 4085, 100645840, ffffffff7fffa7b8, 1006c2818) + 78c

Good. I had adjusted the HashMetaPageData structure and added a new HashPageGetMeta macro to get a pointer to the correct place, and of course the change must be backward compatible.

--- a/src/include/access/hash.h Fri Aug 01 13:16:08 2008 +0000
+++ b/src/include/access/hash.h Sat Aug 02 18:04:16 2008 +0200
@@ -138,7 +138,6 @@
 
 typedef struct HashMetaPageData
 {
-       PageHeaderData hashm_phdr;      /* pad for page header (do not use) */
        uint32          hashm_magic;    /* magic no. for hash tables */
        uint32          hashm_version;  /* version ID */
        double          hashm_ntuples;  /* number of tuples stored in the table */

...

+#define HashPageGetMeta(page) \
+ ((HashMetaPage) (((char *)(page)) + SizeOfPageHeaderData + sizeof(ItemIdData)))

Because I got SIGBUS (Bus Error), there is a problem with memory alignment, but the start address of hashm_magic is in the correct place. Strange. Time to take the debugger:

_hash_metapinit+0x260:          clrx      [%l4 + 0x24]

After some investigation (thanks to Vita Batrla for help with SPARC assembler) it turned out that register l4 contains a pointer to the page. That means the instruction stores to hashm_ntuples, which is a double, and a double has to be located at an 8-byte-aligned address. But how can this happen when the structure starts at the same address? The problem is the size of the PageHeaderData structure. It is 28 bytes long, which is not divisible by 8. In the original structure hashm_version ends at offset 36 (28 + 4 + 4), so the compiler has to put 4 padding bytes between hashm_version and hashm_ntuples, which moves hashm_ntuples to offset 40. In the new structure, which starts with hashm_magic, that padding is not necessary, so the new layout differs from the old one and hashm_ntuples ends up at page offset 0x24 (36), which is not 8-byte aligned. I compared it with the code generated from head and the difference is clear:

_hash_metapinit+0x248:          clrx      [%i3 + 0x28]

During the investigation I found another interesting thing. When I compiled PostgreSQL without optimization, the problem disappeared. A close look at the code shows what happened:

_hash_metapinit+0x32c:          std       %f36, [%l0 + 0x8]

The STD instruction works fine with a floating-point value that is not 8-byte aligned.

And what is the conclusion? I can break backward compatibility and clean up the hash code, or create several versions of HashMetaPage for the different page layouts. Everything depends on how the new hash index implementation ends up. If it is integrated into PostgreSQL 8.4, compatibility will be broken anyway and the cleanup can be done.

Posted by hlipa (August 07 2008, 03:26:14 PM CEST)

Thursday, May 29, 2008

PGCon 2008 - Let's go Elephant

PGCon 2008 is over. It was held in Ottawa, Canada last week and it was a really busy week. The conference began on Tuesday with two days of tutorials, followed by two days of talks in three separate tracks. The quality of the tutorials and talks was very high. It was very nice to be there again and meet people. Most of Sun's Elephants were there, so we could meet in person.

The conference is not only about talks. The social events are a big advantage of the conference. Meeting and talking with other developers and PostgreSQL users is very useful. You can see photos, for example, in Magnus Hagander's photo album. It is nice to connect a face with an email address.

My talk

My main task there was to present the in-place upgrade project. I discussed this topic with many people during the conference and it is a really big issue. A one-terabyte database is a usual size today, and migrating data from an old version to a new one is a DBA's nightmare. Everybody asked me if I had some code for testing. It has been a big motivation for me. I'm now focusing on the storage upgrade and, unfortunately, the code I have now is only an experiment. The changes between PostgreSQL 8.2 and 8.3 are big and I thought that online conversion would not be possible without major rework. Fortunately, during my long flight to Ottawa I got an idea how to process the old data structures online. I discussed it with Tom Lane during lunch, and on Friday morning I had to adjust my presentation. It seems that my presentation was well received. It is mentioned in Bruce's and Josh's blogs. I think my mission was accomplished successfully. I have a green light and now I can start coding.

Thanks to all who gave me feedback and offered help.

Posted by hlipa (May 29 2008, 01:56:11 PM CEST)

Friday, February 15, 2008

Prague PostgreSQL Developers' Day 2008

The Prague PostgreSQL Developers' Day 2008 conference is over. Big thanks to all who helped me and Marek Kocan prepare and organize it. Big thanks to all the speakers and attendees for the friendly atmosphere. And big thanks to Sun Microsystems, CZ.NIC and the Czech Technical University in Prague for their support. The conference was recorded by the AVC center (big thanks) and the video should be available soon.

About 76 people attended the conference and three of them came from Košice, Slovakia. We can say that it was an international conference :-).

You can find more information on Julo's [1] and Peter's [2] [3] blogs.

Posted by hlipa (February 15 2008, 04:04:45 PM CET)

Monday, May 28, 2007

PGCon 2007

PGCon 2007 took place last week in Ottawa. It started on Monday, May 21st, with two days of tutorials. I attended only Neil Conway's tutorial, Introduction to Hacking PostgreSQL. Neil presented how to write PostgreSQL code and what kinds of tools are useful for debugging, and he also described how to use memory management and handle errors. He mentioned DTrace as a good tool for debugging, but he did not have experience with it. I had a few OpenSolaris Developer Express Starter Kits and I gave him a copy.

Official registration started on Tuesday evening in the Royal Oak pub. A lot of hackers and users were already there when I came, discussing a lot of topics.

Jim's and my mission there was to discuss the PostgreSQL upgrade project. Nobody wants to have eight versions of PostgreSQL in Solaris, and there is no easy way to upgrade from one version to another, especially when the database is very big. We decided that only an on-the-fly upgrade, where the new PostgreSQL version understands the old database structures, is the correct approach. However, this solution requires a lot of changes in the current code and it is not possible to start without wide acceptance. We started a discussion with the EnterpriseDB folks, because they work on the PG Migrator tool. PG Migrator is useful for migration from 8.1 to 8.2, because there are mostly only catalog changes (the disk layout is not touched, except for two data types). However, PG Migrator is not suitable for the 8.2 to 8.3 migration. It is designed as an offline utility, and converting the disk structures of a terabyte database takes a long time. After a heated discussion about technical details, the fully integrated on-the-fly upgrade was accepted as a possible and correct solution.

I also had a 5-minute lightning talk about it and a lot of people were interested in this solution. Only two things are left to do now ... design and implementation.

We (Sun's folks) also had two other presentations. Jim Gates, Robert Lor, Josh Berkus and I presented Useful Solaris tools for PostgreSQL DBAs. I think there were over fifty attendees. Unfortunately, our presentation did not fit into one hour and we skipped SMF. People were mostly interested in the DTrace and ZFS features. After the presentation Robert spent a lot of time with people who were interested in DTrace and wanted to learn more.

Me, Jim and Robert

Julo Stroffek and Tomas Kovarik presented Execution plan optimization techniques. PostgreSQL uses a genetic algorithm to determine the execution plan for joins of twelve or more tables. This algorithm is not optimal, and Julo and Tomas presented a summary of suitable optimization methods. Their presentation was well received and PostgreSQL hackers are now interested in real results. However, a pluggable interface for the optimizer is the first step in their effort.

Julo and Tomas

There were a lot of excellent presentations and most of them are available on the conference web site. If I had to recommend one presentation, Scaling PostgreSQL on SMP Architectures -- An Update is very interesting.

I think it was an excellent conference and I'm looking forward to next year.

Posted by hlipa (May 28 2007, 04:58:51 PM CEST)

Wednesday, March 07, 2007

Elephant in Berlin

Last week (Feb 27 - Mar 2, 2007) the German Unix User Group organized the first OpenSolaris Developer Conference. I had the pleasure of attending this conference and presenting PostgreSQL and its integration into Solaris. I was also surprised how strong the Unix community is in Germany. The first day focused on the technology inside SMF, RBAC and DTrace, and also on the OpenSolaris project itself. The second day was about alternative Solaris distributions, like BeleniX, Nexenta and others.

All presentations were very good and I learned a lot of new and interesting information. If I had to choose only one presentation, Looking into the black-box - how the kernel may impact your application, presented by Thomas Nau, would be the winner. Thomas Nau described how to use a lot of system tools, including DTrace, to analyze application bottlenecks, and he demonstrated these tools on a real problem.

My presentation started at 4:30pm, when many people had already left the conference. However, about thirty-five folks stayed to listen to my talk. The talk presented how SMF and RBAC are implemented for PostgreSQL and also showed how DTrace user probes are integrated into PostgreSQL 8.2. The presentation generated a few questions, most of them focused on high availability and load balancing solutions for PostgreSQL. Unfortunately, PostgreSQL does not have functionality similar to Oracle cluster. However, SunCluster version 3.2 provides a failover data service for PostgreSQL.

Me in speaker role.

Conferences are also a good place to meet people and discuss various things. I talked with Sasa Gorshenev about the Sun Studio compiler and its optimizations. He gave me some ideas on how to tune PostgreSQL. Frank Batschulat also joined the conference and we discussed how to set up disk access to get the best performance for databases.

And what to write at the end? It was an amazing conference and I hope I will attend it again next year.

Posted by hlipa (March 07 2007, 08:39:46 PM CET)

Saturday, August 12, 2006

The Freeze Elephant

The Freeze Elephant is not the next volume of the Discworld book series by Terry Pratchett. It is not a historical discovery either. How many of you know that the elephant is the Postgres logo and that August 1st was the feature freeze date for the new release? All patches received before this date will be processed and, after review, integrated into release 8.2. And what will be new in 8.2? This question is heavily discussed on the hackers mailing list, because there are many improvements and small enhancements, but only a few new major features. However, some hackers and users expected more. It triggered a discussion about the release cycle, because Postgres has the fastest release cycle on the database market and major features require more time for implementation.

I think that sometimes it is better to take time for improvements and cleanup of old functionality than to keep creating new features. Some new features from 8.0 will not be fully functional until the upcoming version. It is probably necessary, because the TODO list contains a lot of new major features, but many of them require a lot of time and a lot of experience. The time for full implementation is in many cases longer than the release period. Partial implementation makes fast integration into the main branch possible. Review of such code is much easier and the developer gets fast feedback. It also avoids a lot of problems with source code merges. I expect a lot of major features in 9.0. And when can we expect it? Nobody knows.

And when will 8.2 be released and what will be in it? Bruce expects that Postgres 8.2 will be released in October or November, but it depends on the speed of the stabilization process. We can expect the following major features:

  • GIN (Generalized Inverted Index)
  • bitmap index
  • updatable views
  • PL/SQL debugger infrastructure

Most of them are not committed yet and some of them may still be rejected.

Somebody may ask what Sun brings to the new release. Sun announced support for Postgres in December 2005 and a new support team was established. This team (yes, I am a member of this team) started integrating Postgres with Solaris and has helped the community implement some items from the TODO list. You can see a short list of our work below:

  • DTrace
  • Kerberos support on Nevada
  • Improvements in the GUC (Grand Unified Configuration) module
  • 64-bit integer support in some SQL commands

Someone can say that it is not too much, but I think DTrace is Sun's big contribution to Postgres. The community started to use it immediately to search for lock contention problems on multiprocessor machines with more than four CPUs, and it will be useful in many other cases as well.

I think it is a good start and Sun has the power to bring more in the future. Now we do not have to wait until the elephant defrosts, and we can continue participating in Postgres development.

Posted by hlipa (August 12 2006, 10:47:41 PM CEST)

Friday, July 21, 2006

PostgreSQL code sprint

The Postgres conference finished on Sunday, but the Postgres hackers stayed in Toronto until Tuesday for the Code Sprint sponsored by Sun. Over 30 hackers were in one room and spent these extra days coding the last features for Postgres 8.2. Flipcharts filled up very quickly during brainstorming sessions. Tom Lane and Bruce Momjian did not have time to rest. Many problems were solved (at least on the flipchart) and I hope that Postgres 8.2 will be released soon. The feature freeze has been announced for July 31st.

Code Sprint

Bill Franklin attended this code sprint as well. He spoke with many hackers and was interested in their work. I think that the conference was a good experience for me, for my colleagues and for Sun too. See you at the next code sprint ...

Sun postgres hackers

Josh, Robert, Jim, Zdenek, Dhanaraj
Posted by hlipa (July 21 2006, 12:24:45 AM CEST)

Thursday, July 20, 2006

PostgreSQL Anniversary Summit

The PostgreSQL Anniversary Summit took place in Toronto on July 8th and 9th, 2006. This summit was the first occasion to meet Postgres hackers face to face in the project's ten-year history as an open source and free project. The conference unofficially started the day before, on the evening of July 7th, with small refreshments in the hotel lobby. Many people met for the first time and everybody tried to match email addresses to real people. It was a little bit funny. I had met my Indian colleague Dhanaraj in the hotel the day before, but I had never seen my US colleagues. For example, Jim's photo is a little bit out of date. Fortunately, Jim identified us. The whole group of hackers then moved to a boat for dinner, further identification of people and unlimited discussion about Postgres.

Toronto evening

The main conference was opened on Saturday morning by Josh Berkus. He welcomed all conference participants and thanked the conference sponsors. Sun was one of them. After this he gave the floor to Bruce Momjian. Bruce is one of the people who were present at the release of the Postgres source code ten years ago. He talked about what has happened since then.

The following presentations were divided into two separate tracks. Many of the presentations introduced new projects. Korry Douglas showed a PL/SQL debugger based on a Postgres proxy server which sits close to the main server. Communication is based on the standard plsql protocol and the debugger features are implemented as standard stored functions. Simon Riggs continued with a presentation about performance problems. However, some of the problems generated a big discussion and, unfortunately, Simon did not have time to finish his whole presentation.

Saturday afternoon was dedicated to Neil Conway's and Gavin Sherry's introduction to Postgres hacking. It was a very useful practical presentation based on the implementation of a real feature.

The contributors' dinner closed Saturday. I met and talked with many people. I spoke with Luke Lonergan, CTO of Greenplum, about ZFS and its scalability. He connected from his PDA to his home server and showed me the OpenSolaris version running on it. Yes, my OpenSolaris T-shirt identified me, and many people asked me about our Postgres support and future strategy. Many people trust us, but not all of them. I spoke with Skype developers from Estonia. I asked them if they plan a Skype version for Solaris. Unfortunately, no. They have a version for Linux, but they are afraid that the sound subsystem is different on Solaris and that Skype is not easy to port to Solaris. I'm not a Solaris sound guru, but if anybody is interested in it, I have their email addresses. The local band interrupted all discussions and I went back to the hotel.

The second day was opened by Robert Lor and his famous DTrace presentation. Robert showed only part of DTrace's capabilities, but it was enough to persuade the community. The community has accepted DTrace and it will be included in Postgres 8.2. Andrew Dunstan presented the status of the buildfarm framework, its problems and their solutions. However, some buildfarm problems are still open and need more investigation. Oleg Bartunov and Teodor Sigaev spoke about new features and improvements in the Tsearch2 project (full text search) and showed a big TODO list as well.

Robert Lor

I spent Sunday lunch in the Elephant pub on Yonge Street (the longest street in the world). Bruce came into the pub with a colleague a little bit later and invited me to join them for lunch. Yes, really, I had lunch with Bruce Momjian. You can see it in the picture below. I asked him about the future of his TODO list and whether there are any plans to replace it with a bug tracking system. He answered that there had been many discussions about it, but every bug tracking system is too complicated for Postgres's purposes, hackers primarily use email to solve problems, and no suitable system integrates with that.

Me and Bruce

Peter Eisentraut and Josh Berkus closed the conference and all the attendees went to take a group photo.

And what is the conclusion? Italy won the football World Cup (soccer, for US citizens :-). And the conference was a big success and a big benefit for all attendees. Interest in the conference was three times bigger than the available capacity. We can expect that the next conference will come soon, maybe next year. I'm looking forward to it.

Posted by hlipa (July 20 2006, 03:56:05 AM CEST)

