MODE-2684 Removes the compile-time dependency of modeshape-core on Apache Tika. The MIME type extraction functionality still works as before if Tika is present, but there is now also an independent extension-based default, which is used if Tika is not on the classpath at runtime.

    • -0
    • +138
  1. … 18 more files in changeset.
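
For the MODE-2684 entry above, the idea of an extension-based default that applies only when Tika is absent can be sketched roughly as follows. This is a minimal illustration, not ModeShape's actual code: the class name `MimeTypeFallbackSketch`, its methods, and the small extension table are all hypothetical. The only real name is `org.apache.tika.Tika`, which is probed by name so that nothing here needs Tika at compile time.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from the ModeShape codebase.
final class MimeTypeFallbackSketch {

    // Tiny illustrative table; a real default would cover far more extensions.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "txt", "text/plain",
            "xml", "application/xml",
            "json", "application/json",
            "pdf", "application/pdf",
            "png", "image/png");

    /** Returns true if the named class can be loaded (without initializing it). */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, MimeTypeFallbackSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Guesses a MIME type from the file extension alone; returns null if unknown. */
    public static String mimeTypeFromName(String fileName) {
        if (fileName == null) {
            return null;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension, or the name ends with a dot
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(extension);
    }

    public static void main(String[] args) {
        // A caller would use the Tika-backed detector only when this is true.
        System.out.println("Tika available: " + isOnClasspath("org.apache.tika.Tika"));
        System.out.println(mimeTypeFromName("report.PDF"));
    }
}
```

Probing by class name keeps the dependency optional: the fallback path never touches a Tika type, so it still loads when the Tika JARs are missing.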
MODE-2528 Integrates the new relational provider with the modeshape codebase. This is a huge commit which makes the changes needed to remove all Infinispan configuration and dependencies, replacing them with the new mechanism. It also contains several changes to the relational provider design, made in response to various failing tests. Among other things, ModeShape now has to notify the provider once exclusive locks have been obtained as part of each transaction.

  1. … 304 more files in changeset.
MODE-2489 Refactored mime-type handling and made it possible to configure the repository to use "content"-based detection, "name"-based detection, or no mime-type detection at all.

    • -0
    • +140
    • -0
    • +154
  1. … 32 more files in changeset.
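
The three detection choices named in the MODE-2489 entry above could be modelled along these lines. This is only a sketch: the class `MimeTypeDetectionSketch` and its members are hypothetical, and while "content" and "name" are taken from the commit message, the literal "none" for the third option is an assumption.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical sketch; not the repository configuration API.
final class MimeTypeDetectionSketch {

    /** The three behaviours named in the commit message. */
    enum Mode { CONTENT, NAME, NONE }

    /** Parses a setting value. The literal "none" is an assumption of this sketch. */
    public static Mode parse(String value) {
        Objects.requireNonNull(value, "value");
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "content":
                return Mode.CONTENT;
            case "name":
                return Mode.NAME;
            case "none":
                return Mode.NONE;
            default:
                throw new IllegalArgumentException("Unknown mime-type detection mode: " + value);
        }
    }

    /** Says what each mode looks at when asked for a MIME type. */
    public static String describe(Mode mode) {
        switch (mode) {
            case CONTENT:
                return "inspect the binary content";
            case NAME:
                return "inspect only the file name";
            default:
                return "skip detection";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(parse("name")));
    }
}
```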
MODE-2321 Upgrade Tika from 1.3 to 1.6

  1. … 3 more files in changeset.
MODE-2097, MODE-2169, MODE-2197 Integrated the latest version of the jboss-integration BOM.

This commit includes changes for several different issues that snowballed:
- packaging Javadocs in a zip
- updating Apache POI

In addition, after integrating the BOM, a number of unit tests had to be updated to reflect changes in dependencies, both in functionality and in deprecations. The most significant change was the rewrite of the ConnectorTestCase (modeshape-jca), because the new versions of Arquillian + IronJacamar hold file locks on Windows:

  1. … 93 more files in changeset.
MODE-2081 Changed the remaining files over to the ASL 2.0 license

  1. … 1048 more files in changeset.
MODE-2148 Added checkstyle to our build, and corrected numerous potential problems or issues in the code. Also removed lots of meaningless JavaDoc

  1. … 366 more files in changeset.
MODE-1920 Corrected compiler warnings and JavaDoc errors

  1. … 26 more files in changeset.
MODE-1960 Updated the POI dependency to 3.10-beta1 and added back the MSOffice Sequencer and Tika Extractor, which were disabled as a result of

  1. … 15 more files in changeset.
MODE-1934 - "De-activated" all Apache POI dependencies. No code was removed, so that if the underlying issue is fixed in a future version of POI, we should be able to easily bring it back.

  1. … 15 more files in changeset.
MODE-1639 Minor improvements and fixes

A log message is output when no Tika mime type detectors could be found (meaning Tika is not on the classpath), stating that automatic MIME type detection will be disabled. Also ensured that the input stream used by Tika is always closed.

All unit and integration tests pass.

  1. … 4 more files in changeset.
MODE-1639, MODE-1640, MODE-1634 Replaced the Aperture-based MIME type detector with a Tika-based one

This required quite a bit of dependency gymnastics, since Tika has quite a few more transitive dependencies than the Aperture library (which we had successfully pared down several years ago). Tika references about 25 dependencies (including transitive dependencies), but this was reduced in 'modeshape-jcr' to about 8 for basic MIME type detection. Note that Tika usually includes two BouncyCastle libraries in its dependencies (used for encrypted PDFs, among other things), but ModeShape intentionally excludes these (as we don't want to ship or depend on any security-related JARs).

Not only do we get Tika's substantial MIME type database, we've also made it possible for users to edit the 'org/modeshape/custom-mimetypes.xml' file and provide the updated one on the application classpath. What goes in that file overrides all of the other sources (namely Tika's built-in file and its customization file, both of which are found on the classpath), so the easiest approach is simply to provide an updated version of this file at 'org/modeshape/custom-mimetypes.xml'. Be sure not to remove any of the (few) customizations that ModeShape includes; those are important.

As we upgrade Tika, we'll get updated versions of the media type data. This is far preferable to maintaining a ModeShape-specific version.

The MIME type related interfaces in ModeShape's public API (e.g., 'modeshape-jcr-api') have been removed. These were added sometime in one of the 3.0 releases, so removing them will not introduce compatibility issues for users.

Instead, we've decided to get out of the MIME type detection framework business and to switch to Tika for all MIME type detection. You can still write your own MIME type detector, but you do that by implementing Tika's interface and referencing the implementation class(es) in the corresponding service loader file in your JAR. (See the Tika documentation for details.)

However, internally we still have an abstraction. This is because it is possible to remove Tika (and its transitive dependencies) from a ModeShape installation, as long as your applications do not expect any kind of automatic MIME type detection. This is a perfectly valid use case: for example, using a repository to store data rather than files (and without using sequencers).

The AS7 kits required a bit more modification. There is now a new AS7 module for 'org.apache.tika' that contains all of the JARs, and this is used by the ModeShape module and by the Tika text extractor


All unit and integration tests pass with these changes. Several new tests were added.

    • -1057
    • +0
    • -98
    • +0
    • -0
    • +134
  1. … 65 more files in changeset.
MODE-1527 Migrated initial version of the text extractors from 2.x and updated the binary store to extract the text and mime-type of binary values

Working on this exposed how fragile, lock-wise, it is to work with the SharedLockingInputStream (FileSystemBinaryStore). Therefore, I've updated the mime-type detection so that mark & reset are avoided as much as possible, and made sure that streams are closed after each detector finishes with them.
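
One way to avoid mark & reset and still guarantee closing is to hand each detector its own freshly opened stream. The sketch below illustrates that pattern under stated assumptions: `StreamSource`, `Detector`, and the sample prefix-based detectors are hypothetical stand-ins, not the types used by the binary store.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these types are not the binary store's API.
final class FreshStreamDetectionSketch {

    /** Opens a new stream over the same binary value on every call. */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    /** Returns a MIME type, or null if this detector cannot decide. */
    interface Detector {
        String detect(InputStream in) throws IOException;
    }

    /** Two toy detectors that only look at the leading bytes. */
    static final List<Detector> SAMPLE_DETECTORS = List.of(
            in -> startsWith(in, "%PDF") ? "application/pdf" : null,
            in -> startsWith(in, "<?xml") ? "application/xml" : null);

    /** Runs the detectors in order; the first non-null answer wins. */
    public static String detect(StreamSource source, List<Detector> detectors) {
        for (Detector detector : detectors) {
            // A fresh stream per detector means no mark/reset is needed, and
            // try-with-resources closes it even if the detector throws.
            try (InputStream in = source.open()) {
                String type = detector.detect(in);
                if (type != null) {
                    return type;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return null;
    }

    static boolean startsWith(InputStream in, String prefix) throws IOException {
        byte[] expected = prefix.getBytes(StandardCharsets.US_ASCII);
        byte[] actual = in.readNBytes(expected.length); // Java 11+
        return Arrays.equals(expected, actual);
    }

    public static void main(String[] args) {
        byte[] data = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_8);
        StreamSource source = () -> new ByteArrayInputStream(data);
        System.out.println(detect(source, SAMPLE_DETECTORS)); // prints "application/xml"
    }
}
```

The trade-off is that the source must be cheap to reopen; in exchange, no detector can leave the stream at an unexpected position or hold it open for the next one.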

The Tika version was bumped to 1.1, which also required updating the POI version to 3.8.

    • -1379
    • +20
    • -38
    • +34
  1. … 56 more files in changeset.
MODE-1386 Updated Maven assemblies, corrected dependencies, and added examples

The Maven assemblies were corrected (a bit; there is still work to do to create usable distributions), several dependencies were removed, and two examples were added to the codebase (but not to the build yet).

  1. … 60 more files in changeset.
MODE-1368 Removed all legacy modules no longer needed in 3.x

ModeShape 3.x will not need a number of the 2.x modules. In particular:

- since 3.x will only have an AS7 kit, the AS5 or AS6 artifacts were removed
- all the connectors were removed, since they're no longer used
- the connector benchmark tests module was removed, replaced by our new performance test suite
- the JPA DDL generator utility has been removed
- the 'modeshape-graph', 'modeshape-repository', 'modeshape-search-lucene' and 'modeshape-clustering' modules have all been removed, since the new 'modeshape-jcr' module no longer uses them
- the DocBook modules were removed and are replaced by the Confluence space
- the two JDBC modules were moved out of the 'utils' directory to top-level modules

The build still works, but not all components have been included in the build. This is because the query functionality doesn't work yet, and quite a few web and JDBC driver modules depend on it.

The assembly profile has not yet been changed or corrected.

    • -0
    • +0
  1. … 3652 more files in changeset.