Release 6.22.0: The bundle com.openexchange.textxtraction gets two configuration files

  • Release 6.22.0: The bundle com.openexchange.textxtraction gets two configuration files

    textxtraction.properties:
    # Specify the path to the Tika configuration file
    com.openexchange.textxtraction.tikaConfig=/opt/open-xchange/etc/tika-config.xml


    A Tika parser must implement org.apache.tika.parser.Parser. A parser is registered in tika-config.xml by its fully qualified class name. The parsers that ship with Tika are already registered there, and they determine which document types can be parsed by default.

    tika-config.xml:
    <config>
    <parser class="org.apache.tika.parser.html.HtmlParser" />
    <parser class="org.apache.tika.parser.microsoft.OfficeParser" />
    <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser" />
    <parser class="org.apache.tika.parser.odf.OpenDocumentParser" />
    <parser class="org.apache.tika.parser.pdf.PDFParser" />
    <parser class="org.apache.tika.parser.rtf.RTFParser" />
    <parser class="org.apache.tika.parser.txt.TXTParser" />
    <parser class="org.apache.tika.parser.xml.DcXMLParser" />
    </config>
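
To check which parsers a given installation actually registers, the two files can be read with the Java standard library alone. The following is only a rough sketch: the class TikaConfigInspector and its methods are hypothetical helpers, not part of Open-Xchange or Tika, and it assumes the <config>/<parser class="..."/> layout shown above with quoted attribute values.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not an Open-Xchange or Tika class.
final class TikaConfigInspector {

    // Property key taken from textxtraction.properties.
    static final String TIKA_CONFIG_KEY = "com.openexchange.textxtraction.tikaConfig";

    // Returns the configured path to tika-config.xml, or null if the key is absent.
    static String tikaConfigPath(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Unreadable properties text", e);
        }
        return props.getProperty(TIKA_CONFIG_KEY);
    }

    // Returns the class attribute of every <parser> element, in document order.
    static List<String> parserClasses(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("parser");
            List<String> classes = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                classes.add(((Element) nodes.item(i)).getAttribute("class"));
            }
            return classes;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Malformed tika-config.xml content", e);
        }
    }

    public static void main(String[] args) {
        // Inline sample data; a real check would read the files from disk instead.
        String props = "# Specify the path to the Tika configuration file\n"
            + TIKA_CONFIG_KEY + "=/opt/open-xchange/etc/tika-config.xml\n";
        String xml = "<config>"
            + "<parser class=\"org.apache.tika.parser.pdf.PDFParser\" />"
            + "<parser class=\"org.apache.tika.parser.txt.TXTParser\" />"
            + "</config>";
        System.out.println(tikaConfigPath(props));
        parserClasses(xml).forEach(System.out::println);
    }
}
```

A malformed file (for example, one with unquoted attribute values) makes parserClasses throw an IllegalArgumentException, which makes it easy to spot a broken tika-config.xml before restarting the server.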