
Bash and Python scripts to unzip and modify an OpenOffice odt document


.odt files are actually containers (you can see one using unzip -l document.odt). Within the container, content is in the content.xml file. Script info source
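The same listing can be done from Python. This is only a minimal sketch, assuming Python 3 and the standard zipfile module; the function name list_odt_members is made up for illustration.

```python
import zipfile


def list_odt_members(odt_path):
    """Return (name, uncompressed size) pairs for every entry in the container.

    list_odt_members is a hypothetical helper name, not part of any library.
    """
    with zipfile.ZipFile(odt_path) as odt:  # the default mode is read-only
        return [(info.filename, info.file_size) for info in odt.infolist()]
```

Calling list_odt_members('document.odt') should show content.xml alongside the other entries of the container, much like unzip -l does.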


Here's what I've figured out about opening, modifying, and saving the content of an .odt file:

  • To open the container for editing:
    # bash
    $ unzip path/container.odt content.xml
    $ unzip path/container.odt content.xml -d working/dir/path/
    # -d places content.xml in a different directory.
    # Either way, this creates a content.xml file where you want it.

    # python
    >>> import zipfile
    >>> odt_file = zipfile.ZipFile('path/to/file.odt', 'a')
    >>> # Mode options are 'r' (read only), 'w' (write a new archive), and 'a' (append to existing)
    >>> raw_xml = odt_file.read('content.xml')
    >>> # Reads content.xml in as a string (bytes in Python 3), doesn't place a file.

  • Modify the content.xml file by hand, or using a script, or using Python.
    >>> # Tip for using python: ElementTree is good at XML. ET.parse() expects a file,
    >>> # but ET.fromstring() parses a string directly, so there's no need to fake a file.

    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.fromstring(raw_xml) # Parse the string; returns the root element
    >>> # Make changes to your tree here.

  • To restore the container with modified content:
    # bash
    zip -j path/container.odt working/dir/path/content.xml
    # The -j flag junks the directory path, so the file is stored as content.xml instead of the useless working/dir/path/content.xml. You need this!
    rm working/dir/path/content.xml # Clean up

    # python
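    >>> new_xml = ET.tostring(tree) # If you're exporting from an ElementTree
    >>> odt_file.writestr('content.xml', new_xml)
    >>> # Note: in 'a' mode, writestr() appends a second content.xml entry rather than replacing the first.
    >>> odt_file.close()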
  • Putting it all together in bash:
    cd path/to/working/directory
    cp path/to/template_filename.odt working_file.odt
    unzip working_file.odt content.xml

    # Change the xml...somehow

    zip -j working_file.odt content.xml
    rm content.xml
  • Putting it all together in python:
    def edit_the_odt_content(template_filename):
        """Exposes the content of an .odt file so you can modify it."""
        import shutil, zipfile, xml.etree.ElementTree as ET

        shutil.copyfile(template_filename, 'working_file.odt') # Copy the template into a working file
        odt_file = zipfile.ZipFile('working_file.odt', 'a')
        xml_string = odt_file.read('content.xml') # Read the zipped content.xml within the .odt as a string
        tree = ET.fromstring(xml_string) # Convert raw string to ElementTree (returns the root element)

        office_namespace = '{urn:oasis:names:tc:opendocument:xmlns:office:1.0}'
        body = tree.find(office_namespace + 'body') # Search the tree to find the elements you want to change
        text = body.find(office_namespace + 'text')
        new_text = your_function_to_modify_the_xml(text) # You can now change the XML any way you wish
        body.remove(text) # Replace the old XML with the new
        body.append(new_text)

        new_xml = ET.tostring(tree) # Convert the modified ElementTree back into an XML string
        odt_file.writestr('content.xml', new_xml) # Write the string into the zipped content.xml
        odt_file.close() # Close the zip archive (important!)
  • Bug: Don't use the zip -m flag! It looks handy, claiming to delete the content.xml file from your file system after adding it to the archive...but in my experience it will unpredictably delete the file without adding it to the archive.

  • You can avoid the whole "containers" muddle by saving an OpenOffice document as a flat file (.fodt). There's no zipping or unzipping, just open the file in an editor - it's XML already. Open the modified .fodt with OpenOffice, and your document is right there. Er, be sure your version of OO supports .fodt before using it. My Mac doesn't, for example.

  • OpenOffice also has its own scripting API, called UNO, with bindings for Python and C++. However, I haven't taken time to dig around through it.
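One caveat about the Python route above: opening the archive in append mode and calling writestr() adds a second content.xml entry instead of replacing the first (zipfile warns about the duplicate name). A possible way around that is to copy every entry into a fresh archive and swap in the new content.xml along the way. This is a rough sketch, assuming Python 3; replace_content_xml is a made-up helper name, and it assumes the original archive already has the mimetype entry first and stored uncompressed, as the OpenDocument packaging rules ask for.

```python
import os
import tempfile
import zipfile


def replace_content_xml(odt_path, new_xml):
    """Rewrite odt_path so its single content.xml entry holds new_xml.

    replace_content_xml is a hypothetical helper, not a library function.
    new_xml may be bytes or str (str is encoded as UTF-8 by zipfile).
    """
    # Build the new archive next to the original so the final swap stays
    # on the same file system.
    folder = os.path.dirname(os.path.abspath(odt_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".odt", dir=folder)
    os.close(fd)
    try:
        with zipfile.ZipFile(odt_path) as src, \
                zipfile.ZipFile(tmp_path, "w") as dst:
            for info in src.infolist():
                if info.filename == "content.xml":
                    data = new_xml
                else:
                    data = src.read(info.filename)
                # Passing the original ZipInfo keeps the entry order and each
                # entry's compression method, so "mimetype" stays first and
                # stored if it was that way in the source archive.
                dst.writestr(info, data)
        os.replace(tmp_path, odt_path)  # overwrite the original archive
    except Exception:
        os.remove(tmp_path)
        raise
```

With this, the last two steps of the Python version become new_xml = ET.tostring(tree) followed by replace_content_xml('working_file.odt', new_xml), after closing any ZipFile still open on the working file.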
