GullFOSS
OpenOffice.org Engineering at Sun
 
Wednesday, 19 Dec 2007
PDF Import: First milestone reached
Thorsten Behrens

We've now reached a first milestone with the PDF import extension: we're able to import typical PDF documents with good layout fidelity in Draw and Impress. Below, you can see a sample CAD PDF, imported in Draw, with the title slightly edited (it read "Airplane Engine" before):

Sample import screenshot

Combined with exporting back to PDF, or printing, this lets users perform basic editing on formerly read-only PDFs. Filling out forms is equally possible (though not overly convenient as of now, since the native PDF forms are not yet imported).

The next steps are probably improvements in the Writer import, which focuses more on editability and less on layout. The legal issues with an external PDF parser mentioned before are solved, so everything necessary to check this out is now available in CVS, as CWS picom.

Posted by Thorsten Behrens on 19 Dec 2007

Friday, 23 Nov 2007
News from the PDF import
Thorsten Behrens

As announced some time ago, we're busy implementing a PDF import extension for OOo. Building upon a prototype from Philipp Lohmann (of Aqua port fame), and experimenting with agile development methodologies, we're quickly progressing towards a demoable preview. Work happens in CWS picom (one can probably glean from the commit comments what we're currently hacking on).

Besides missing bits and pieces here and there (which we strive to iron out first), we need clearance for running a (GPLed) xpdf binary out-of-process. Once that is settled, we will provide the interested public with regular testing binaries.
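
Running the parser out-of-process means the host application never links the GPLed code; it only starts a separate program and reads what that program writes to a pipe. A minimal sketch of that arrangement follows. The helper command and the line-per-event output are made up for illustration and are not taken from CWS picom or from xpdf.

```python
import subprocess
import sys


def run_helper(command, timeout=30.0):
    """Start an external helper program and return the lines it prints.

    `command` is the argument list for the helper (hypothetical here).
    The helper runs in its own process; only its stdout crosses the boundary.
    """
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout/stderr through pipes
        text=True,            # decode bytes to str
        timeout=timeout,      # do not hang forever on a broken document
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Stand-in for a real parser binary: a tiny Python one-liner that
    # prints two invented "drawing events".
    fake_parser = [sys.executable, "-c", "print('page 1'); print('text 10 20 Hello')"]
    for line in run_helper(fake_parser):
        print(line)
```

The importing side would then translate each received line into shapes or text; that part is left out because the post does not describe it.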

Posted by Thorsten Behrens on 23 Nov 2007

Thursday, 08 Nov 2007
News in PDFExport
Philipp Lohmann

Thanks to the considerable efforts of Giuseppe Castagno (aka beppec56), two improvements are now on their way into OOo. Barring unexpected tragic events, OOo 2.4 will have support for PDF/A, a feature often requested in governmental environments. The other improvement concerns OOo's handling of internal and external links in exported PDF documents, which can now be customized so that the links point into a set of interlinked files.

This is really outstanding work. It met with some difficulties early on, but Giuseppe persevered, and now we have his two PDF-related CWSs in QA, just in time for feature freeze.

Kudos to beppec56.
 

Posted by Philipp Lohmann on 08 Nov 2007

Wednesday, 01 Aug 2007
Completing PDF support in OOo (Part II)
Kai Ahrens

OK, now that we've let the cat out of the bag, my inbox is filling up with mails asking for more information on the PDF import filter we're going to implement. So I'd like to give you some details that are already settled, but still open to discussion if somebody comes up with a better idea:

  • As already mentioned in my comment on the initial blog entry, importing the PDF content into a Writer document, with its flowing text and therefore flowing layout, is not an option for us. So we decided to write a filter that imports the PDF content as an OOo Draw/Impress document.
    With this solution, we get the full benefit of a page-oriented, fixed layout. All graphical elements will sit at the fixed positions given in the PDF file, and text portions will be combined as much as possible and anchored in text shapes, so that they keep their exact positions but remain editable by the user.
    The challenge with this solution is 'just' to find the most common bounding box for text portions that can be grouped together in one text shape. But this is nothing compared to the 'impossible', lifelong task of reconstructing or guessing the whole layout of the original document the PDF was created from. As you know, PDF files generally don't contain such structural information, apart from some tagged PDF files, on which we can't rely.

  • The next question for development is what kind of parser to use for reading the basic content of the PDF file. There is a well-known and widely used framework for this: the XPDF library and its derivatives, like Poppler. That would be a great, well-tested framework for us, but unfortunately its license doesn't match the OOo code licensing, at least at the moment. So we'll have to write our own parser for this task, which is not bad at all, since XPDF still lacks some features we would have to implement in either case.

  • The filter itself will be available as a downloadable extension to the standard OOo release. This fits perfectly into our roadmap to create a more modular OOo package, consisting of several 'standalone' components that are reusable in other contexts.

  • The most interesting question that came up is the timeline for this implementation. Please expect the product version of the filter to be ready for the OOo 3.0 release at the latest. A detailed release plan for OOo 3.0 is not known at the moment. But, as already mentioned, I expect first results to be available within a few months, so most of you will be able to play around with a pre-release of this filter by the end of this year. We will definitely need your feedback on this first release and on upcoming ones to add missing parts, fix bugs etc.

  • Some of you asked whether there will be additional goodies around the whole PDF story in OOo. The answer is 'Yes, there will be some more stuff around the pure import and export filters'. One example is support for PDF/A, a feature that is currently being implemented by community member Giuseppe Castagno.
    Another example is support for creating PDF documents that contain the original ODF document itself, so that any ODF-enabled application can read the original content without loss.
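
To make the bounding-box grouping from the first point a bit more concrete, here is a rough sketch of one possible approach: text portions whose boxes touch or lie within a small gap of each other are merged into one text shape. The `Box` type, the `gap` threshold and the naive text concatenation are assumptions made for illustration only; the actual filter may well use a different strategy.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of a text portion (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def is_close(a, b, gap):
    """True if the boxes overlap or are at most `gap` apart on both axes."""
    return not (a.x1 + gap < b.x0 or b.x1 + gap < a.x0 or
                a.y1 + gap < b.y0 or b.y1 + gap < a.y0)


def union(a, b):
    """Smallest box covering both; text is joined in argument order."""
    return Box(min(a.x0, b.x0), min(a.y0, b.y0),
               max(a.x1, b.x1), max(a.y1, b.y1),
               a.text + " " + b.text)


def group_portions(portions, gap=2.0):
    """Merge nearby text portions into candidate text shapes."""
    groups = []
    for p in portions:
        merged = Box(p.x0, p.y0, p.x1, p.y1, p.text)
        changed = True
        # A merge enlarges the box, which may bring further groups within
        # reach, so rescan until nothing more can be absorbed.
        while changed:
            changed = False
            for g in groups:
                if is_close(merged, g, gap):
                    groups.remove(g)
                    merged = union(g, merged)
                    changed = True
                    break
        groups.append(merged)
    return groups


if __name__ == "__main__":
    portions = [
        Box(0, 0, 10, 5, "Hello"),
        Box(11, 0, 20, 5, "world"),
        Box(100, 100, 120, 105, "far"),
    ]
    for shape in group_portions(portions):
        print(shape)
```

A real importer would also have to consider font, size and reading order before merging; the sketch only looks at geometry.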

I hope that this blog entry answers the most urgent questions for the moment. Please don't hesitate to add any comments, questions, suggestions etc. you have.


Posted by Kai Ahrens on 01 Aug 2007

Monday, 30 Jul 2007
Completing PDF support in OOo
Kai Ahrens

With a well-working, mature PDF export filter in OOo for several years now, it's time to take the final steps towards full PDF support. Yes, you're right: we're talking about implementing a native PDF import filter for OOo within the Sun OOo Graphics development team.

As trivial as this task might look at the moment, there are several topics that need to be discussed in detail before development can start. This begins with the OOo application that the import filter will be written for, and definitely doesn't end with the parser that will be used to read the PDF content itself.

I don't want to go into the details of the current planning and development phase right now, but please be assured that the final solution is planned to be a full replacement for the tools you currently use in your everyday workflow, preserving the layout as well as possible and offering editing capabilities for the imported document, a feature that you don't get for free with most common tools. Sounds great, doesn't it?

I don't want to be too optimistic, but we're planning for the first prototype to be available within the next few months. Please stay tuned for more details to be provided by the involved development team members within the next few days...




Posted by Kai Ahrens on 30 Jul 2007

Tuesday, 27 Mar 2007
Improved text output in PDF export
Philipp Lohmann

With the now nominated CWS glyphadv (to be integrated in 680m207 or perhaps 680m208), an improvement in text rendering in PDF becomes available. Overall text output of course wasn't bad already - the current method is basically unchanged since OOo 1.1 - but with the occasional strange font, ugly artifacts such as overlapping characters or an uneven right margin in justified text could sometimes happen. These artifacts always result from subtle differences between the glyph width that OOo assumes at rendering time and the width contained in the actual downloaded font - an effect that can grow considerably for fonts made artificially bold. The improved text output synchronizes these two possibly slightly different values, so the position of a glyph can be output more precisely.

As a bonus, the new text output saves some PDF code, reducing the produced PDF file size a little - in extreme cases (only PDF builtin fonts used, no images) by up to 30%. This is, however, true only for text on the same baseline, so not for vertical text. Of course there is also a small drawback (isn't there always?): font files have to be accessed one additional time to get their precise metrics before we know which characters we will actually use from them in the course of producing the PDF file. This effect is not dramatic, however, and correct PDF files are preferable to slightly faster PDF generation.
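
The width mismatch described above can be illustrated with a small sketch. It assumes that positions are corrected with PDF's `TJ` operator, whose numeric operands shift the pen in thousandths of a text space unit (a positive number moves it left in horizontal writing). Whether glyphadv works exactly this way is not stated here; the function names and the sample widths are invented for illustration.

```python
def accumulated_drift(glyphs, font_widths, layout_advances):
    """Offset (in 1/1000 em) between the layout's pen position and the
    position reached by using the font's own widths, after all glyphs."""
    return sum(wanted - font_widths[g]
               for g, wanted in zip(glyphs, layout_advances))


def tj_operands(glyphs, font_widths, layout_advances):
    """Build the operand list for one TJ operator on a single baseline.

    Glyphs whose layout advance equals the font width stay in one string
    run; a number is emitted only where the two values differ.
    """
    operands = []
    run = ""
    for g, wanted in zip(glyphs, layout_advances):
        run += g
        # TJ subtracts the number from the pen position, so a glyph that
        # the font makes wider than the layout wants needs a positive value.
        correction = font_widths[g] - wanted
        if correction != 0:
            operands.extend([run, correction])
            run = ""
    if run:
        operands.append(run)
    return operands


if __name__ == "__main__":
    font_widths = {"A": 722, "V": 667}   # sample metrics, 1/1000 em
    layout = [722, 650, 722]             # advances the layout assumed
    print(accumulated_drift("AVA", font_widths, layout))
    print(tj_operands("AVA", font_widths, layout))
```

When the assumed and actual widths agree for a whole run, the run collapses into a single string operand, which fits the observation that the new output needs less PDF code for text on one baseline.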

Posted by Philipp Lohmann on 27 Mar 2007
