ModeShape

MODE-1117 Minor change to local variable name

As suggested by Brian, the name of a local variable in a JcrVersionHistoryNode method could

have been better, so it was changed. This is a pretty minor change.

MODE-1117 Corrected the logic that sets the 'jcr:predecessors' and 'jcr:successors' properties

Corrected the JcrVersionManager logic upon checkout to set the 'jcr:predecessors' property to the

'jcr:baseVersion' property value. Prior to this fix, checkout was also adding all of the values

from the base version's 'jcr:predecessors'; the net result of that was that the 'jcr:predecessors'

property always had all prior versions. See MODE-1117 for a more detailed explanation of the

problem.

Once JcrVersionManager was corrected (as described above), two TCK tests failed because of the

logic in the JcrVersionHistoryNode.removeVersion(...) method. Corrected this logic to properly set

the 'jcr:predecessors' on each of the removed version's successor nodes, and to correctly set

the 'jcr:successors' on each of the removed version's predecessor nodes.

After these changes, all unit and integration tests (including those added in MODE-1114) pass.
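
The relinking that removeVersion(...) has to perform can be sketched with a small in-memory graph. This is only an illustration, not the JcrVersionHistoryNode code: the `VersionGraph` class and its method names are hypothetical, and it assumes the usual splicing rule in which the removed version's predecessors become predecessors of its successors, and its successors become successors of its predecessors.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory version graph; not ModeShape's JcrVersionHistoryNode. */
class VersionGraph {
    private final Map<String, Set<String>> predecessors = new LinkedHashMap<>();
    private final Map<String, Set<String>> successors = new LinkedHashMap<>();

    /** Adds a version whose only predecessor is the base version (null for the root version). */
    void addVersion(String name, String baseVersion) {
        predecessors.put(name, new LinkedHashSet<>());
        successors.put(name, new LinkedHashSet<>());
        if (baseVersion != null) {
            // On checkout/checkin the predecessor is just the base version,
            // not the base version's own predecessors as well.
            predecessors.get(name).add(baseVersion);
            successors.get(baseVersion).add(name);
        }
    }

    /** Removes a version and splices its neighbors together. */
    void removeVersion(String name) {
        Set<String> preds = predecessors.remove(name);
        Set<String> succs = successors.remove(name);
        for (String succ : succs) {
            Set<String> p = predecessors.get(succ);
            p.remove(name);
            p.addAll(preds);
        }
        for (String pred : preds) {
            Set<String> s = successors.get(pred);
            s.remove(name);
            s.addAll(succs);
        }
    }

    Set<String> predecessorsOf(String name) { return predecessors.get(name); }

    Set<String> successorsOf(String name) { return successors.get(name); }
}
```

For a linear history root, 1.0, 1.1, removing 1.0 leaves 1.1 with root as its only predecessor and root with 1.1 as its only successor.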

MODE-1114 Corrected import (and replace) behavior when imported content includes versionable nodes

Added several new integration tests to try and replicate the problem. One of these tests did replicate it:

the trick was to import an XML file (with referenceable nodes and versionable nodes), to re-import the

same XML file while replacing content with duplicate UUIDs, and then to check out, change, and check

in one of the versionable nodes (one that is a descendant of another versionable node that was imported).

The checkin caused the problem because it attempted to resolve the 'jcr:predecessors' REFERENCE property,

which contains a UUID that does not correspond to a node in the repository.

After quite a bit of debugging (almost two days), I was able to figure out what was actually going on.

Basically, when the second import is happening and is importing a node with the same UUID as an existing

node, the existing node must be replaced (e.g., removed). If that existing node is also versionable,

then there is a version history and this was not being removed. Removing these version histories did not

work and resulted in what appeared to be invalid behavior, since it prevents recovery of previous versioning

information.

A second approach was more successful: leave the version history and remove the "mix:versionable" properties

from the newly-imported nodes (as was being done already). The JcrVersionManager code that initializes

the version history was always generating a new UUID for the base version, even though this new UUID is

not written to the version history if the version history already exists. (The JcrVersionManager code

creates a node with the UUID only if the node is absent.) This is the fundamental source of the problem.

Simply re-reading the root version's UUID and using it for the "jcr:baseVersion" property and in the

"jcr:predecessors" property appears to work.

The new integration tests test a number of permutations of operations on both in-memory and JDBC

connectors, and these tests pass - as do all other unit and integration tests. Therefore, this appears

to fix the issue, and results in identical behavior and version history content whether or not the

content is imported multiple times. In other words, the behavior when importing multiple times and modifying

the versioned content (using checkout, modify, checkin) is the same as importing once and doing the

same modifications.
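
The core of the fix, re-reading the stored root version UUID instead of trusting a freshly generated one, can be sketched as follows. This is a minimal illustration rather than the JcrVersionManager code; `VersionHistoryStore` and `initializeHistory` are hypothetical names, and a plain map stands in for the version storage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for version-history initialization; not the JcrVersionManager code. */
class VersionHistoryStore {
    // Maps a version history identifier to the UUID of its root version.
    private final Map<String, String> rootVersionUuidByHistory = new HashMap<>();

    /**
     * Creates the root version only if the history is absent, and always returns the UUID
     * that is actually stored, never the freshly generated candidate.
     */
    String initializeHistory(String historyId) {
        String candidate = UUID.randomUUID().toString();
        // The candidate is written only when no history exists yet.
        rootVersionUuidByHistory.putIfAbsent(historyId, candidate);
        // Re-read the stored UUID; this is the value to use for "jcr:baseVersion"
        // and "jcr:predecessors" on the newly imported node.
        return rootVersionUuidByHistory.get(historyId);
    }
}
```

Returning `candidate` directly would reproduce the bug: on a second import the returned UUID would not correspond to any stored node.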

However, during verification of the results, MODE-1117 was discovered. Although not technically related

to this issue, it does cause the "jcr:predecessors" and "jcr:successors" properties to be incorrect.

Merge branch 'mode-1113-2.2x' of https://github.com/rhauch/modeshape into rhauch-mode-1113-2.2x

MODE-1113 Corrected Repository initialization so that existing node types are read properly

The JcrRepository initialization logic was corrected so that any node types defined in prior executions

of the ModeShape engine are read from the "/jcr:system/jcr:nodeTypes" content. When the engine

starts up and initializes the RepositoryNodeTypeManager, it first reads any existing node types

in the system content (this is the new behavior), then registers any node type definitions that are

referenced in the configuration file, and then registers all of the built-in node type definitions.

In this way, the built-in definitions can never be overridden. This behavior also means that

if the configuration file references some CND files with user-defined node type definitions,

these definitions will override any changes that were made to those definitions before the last

shutdown.

These changes work for clustered configurations. All unit tests and integration tests pass,

including several new tests that verify this new behavior.
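
The precedence that results from this registration order can be sketched with plain maps. This is only an illustration under simplifying assumptions: node type definitions are reduced to strings, and `NodeTypeRegistrationOrder.initialize` is a hypothetical helper, not part of RepositoryNodeTypeManager.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the start-up registration order; definitions are reduced to strings. */
class NodeTypeRegistrationOrder {
    /** Sources registered later replace earlier definitions that have the same name. */
    static Map<String, String> initialize(Map<String, String> persisted,
                                          Map<String, String> fromConfiguration,
                                          Map<String, String> builtIn) {
        Map<String, String> registry = new LinkedHashMap<>();
        registry.putAll(persisted);          // 1. node types read from the system content
        registry.putAll(fromConfiguration);  // 2. node types referenced in the configuration file
        registry.putAll(builtIn);            // 3. built-in node types, which therefore always win
        return registry;
    }
}
```

Because the built-in definitions are registered last, nothing persisted or configured can replace them, and a configured CND definition replaces a persisted definition of the same name.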

    /modeshape-integration-tests/src/test/resources/config/configRepositoryForDroolsJpaNoNodeTypes.xml (+61, -0)
Corrected the JDBC driver POM to use the version variable.

Changed release number on branch to 2.2.1.GA to reflect correct version.

    /extensions/modeshape-clustering/pom.xml (+1, -1)
    … 38 more files in changeset.
MODE-1083 Corrected the expected results of an integration test using code that recently changed.

MODE-1083 Corrected the RepositoryNodeTypeManager's cache of property and child node definitions

Changed the ClusteredTest integration test to replicate the reported behavior.

In a clustered environment, whenever node types are registered (or re-registered) with the NodeTypeManager,

the RepositoryNodeTypeManager receiving the request persists the definitions in the system workspace,

and this generates events that are then received by all other RepositoryNodeTypeManager instances in the

cluster. They receive the events and update their in-memory representations of the node type definitions.

The RepositoryNodeTypeManager was correctly updating its map of node type definitions keyed by the name,

but was not updating the cached maps of the property definitions and child node definitions with those

in the new/altered/removed node definitions. The fix is relatively minor, and involves updating these

maps in the same block of code that is updating the maps of node type definitions.

All unit and integration tests pass with these changes.
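
The shape of this fix, refreshing the derived caches in the same block that updates the primary map, can be sketched as below. The `NodeTypeCache` class is hypothetical and heavily simplified (property definitions are just names); it is not the RepositoryNodeTypeManager implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified cache; property definitions are represented by their names only. */
class NodeTypeCache {
    // Primary map: node type name -> names of its property definitions.
    private final Map<String, List<String>> propertyNamesByNodeType = new HashMap<>();
    // Derived cache: "nodeType/property" keys for fast lookup.
    private final Set<String> propertyDefinitionKeys = new HashSet<>();

    /** Registers or re-registers a node type and refreshes the derived cache in the same step. */
    synchronized void register(String nodeType, List<String> propertyNames) {
        unregister(nodeType); // drop stale derived entries from any earlier registration
        propertyNamesByNodeType.put(nodeType, new ArrayList<>(propertyNames));
        for (String property : propertyNames) {
            propertyDefinitionKeys.add(nodeType + "/" + property);
        }
    }

    /** Removes a node type together with its derived cache entries. */
    synchronized void unregister(String nodeType) {
        List<String> old = propertyNamesByNodeType.remove(nodeType);
        if (old == null) return;
        for (String property : old) {
            propertyDefinitionKeys.remove(nodeType + "/" + property);
        }
    }

    synchronized boolean hasPropertyDefinition(String nodeType, String property) {
        return propertyDefinitionKeys.contains(nodeType + "/" + property);
    }
}
```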

MODE-1079 Corrected the copyright dates.

MODE-1073 Corrected another place where the wrong NodeConflictBehavior is used.

MODE-1073 Correcting the logic to prevent creation of extra intermediary nodes

Turns out that even though I had altered the StreamSequencerAdapter to ask for the nodes to be created only if

they didn't exist, the GraphBatchDestination (the class with the implementation of the new method StreamSequencerAdapter

is now calling) was not using the correct graph methods given the desired NodeConflictBehavior. So the

GraphBatchDestination was always appending.

I'm correcting the GraphBatchDestination code to do the appropriate call for the supplied NodeConflictBehavior.

MODE-1066, MODE-1071 Added a new background garbage collection facility to the connector API and ModeShape engine

This new facility includes:

- a new CollectGarbageRequest, with a field allowing the connector to specify whether the garbage collection

operation was completed or whether another pass could/should be made by the requestor

- a default implementation of the RequestProcessor.process(CollectGarbageRequest) that does nothing; existing

connectors by default implement this behavior and don't need to change

- a new field on RepositorySourceCapabilities that allows the connector to express whether it automatically

cleans up unused content (collects garbage), or whether it needs to be done with an explicit CollectGarbageRequest

- a new method on RepositoryService (which owns the library of RepositorySource instances) that will find the

RepositorySource instances that require explicit garbage collection, and upon these (if there are any)

iteratively make the CollectGarbageRequest; the source will be enqueued again if another collection pass is required

- a new background scheduled executor in ModeShapeEngine (the superclass of JcrEngine) to periodically (and

in a background thread) have the RepositoryService collect garbage on its sources.

- a new configuration property that defines the time interval for the periodic, background garbage collection

(the default is every 10 minutes).
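
The "enqueue again if another pass is required" loop can be sketched as follows. This is a minimal illustration, not the RepositoryService code: `CollectableSource` is a hypothetical stand-in for a source that requires explicit garbage collection, with a boolean result playing the role of the field on CollectGarbageRequest.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Queue;

/** Hypothetical sketch of one background garbage collection run over several sources. */
class GarbageCollectionPass {
    /** Hypothetical stand-in for a source that needs explicit garbage collection. */
    interface CollectableSource {
        /** Runs one collection pass; returns true if another pass should be made. */
        boolean collectGarbage();
    }

    /** Runs passes until no source asks for another one; returns the number of passes made. */
    static int collectAll(Collection<? extends CollectableSource> sources) {
        Queue<CollectableSource> queue = new ArrayDeque<>(sources);
        int passes = 0;
        while (!queue.isEmpty()) {
            CollectableSource source = queue.remove();
            passes++;
            if (source.collectGarbage()) {
                queue.add(source); // not finished yet: enqueue again for another pass
            }
        }
        return passes;
    }
}
```

A periodic background run could then be scheduled with the standard `ScheduledExecutorService.scheduleWithFixedDelay(...)` method, for example with a delay of 10 minutes to mirror the default interval mentioned above.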

Also changed the JPA connector to override the RequestProcessor.process(CollectGarbageRequest) by removing all

unused LargeValueEntity records in all workspaces, and to declare in its RepositorySourceCapabilities that manual

GC is required. Since this is now done explicitly, removing nodes no longer removes unused LargeValueEntity records.

This should improve the performance of certain operations, and because the LargeValueEntity records (keyed by

their SHA-1) stick around for a longer period of time, they remain available to be reused. For example, if repository

content is removed but replaced with mostly similar content, any large values (strings or binary property values)

that are the same actually would be reused.

Also, the MySQL-specific logic for removing LargeValueEntity (see MODE-691) was changed to use a single native SQL

delete statement. This is much more efficient, and on-par with the HQL delete statement used for other Hibernate

dialects. The native MySQL statement was needed because HQL was inadequate for deleting the unused LargeValueEntity

without resorting to a subquery that used the LargeValueEntity table, which is not supported by MySQL. So instead,

the native MySQL delete statement uses a left outer join between the LargeValueEntity and "usages" tables and a

criterion on the result to ensure that the only records returned are those without any "usages" records in the

tuples. This works because a left outer join between tables A and B always contains all records from A, whether

or not there are corresponding records from B. And, if a criterion is added such that a column from B that

can never be null is actually null, then we know the result will contain only those records from A that

do **not** have a corresponding B record. In our case, this ends up deleting only those LargeValueEntity records

that are not referenced in the "usages" table (i.e., they are not used).
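
The left-outer-join technique can be illustrated as follows. The table and column names in the statement (`LARGE_VALUES`, `USAGES`, `HASH`, `LARGE_VALUE_HASH`) are hypothetical and are not the JPA connector's actual schema; only the shape of the MySQL multi-table DELETE is the point. The `unusedHashes` method mimics the same logic in memory so the reasoning can be checked without a database.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of the "left outer join, keep rows with a null right side" delete. */
class UnusedLargeValues {
    // Hypothetical table and column names; the statement shape is MySQL's multi-table DELETE.
    static final String MYSQL_DELETE_UNUSED =
        "DELETE lv FROM LARGE_VALUES lv "
        + "LEFT OUTER JOIN USAGES u ON u.LARGE_VALUE_HASH = lv.HASH "
        + "WHERE u.LARGE_VALUE_HASH IS NULL";

    /** In-memory equivalent: returns the large value hashes that have no usage record. */
    static Set<String> unusedHashes(Set<String> largeValueHashes, Set<String> usedHashes) {
        Set<String> unused = new LinkedHashSet<>();
        for (String hash : largeValueHashes) {
            // Right side of the left outer join: null when there is no matching usage row.
            String usage = usedHashes.contains(hash) ? hash : null;
            if (usage == null) {
                unused.add(hash); // corresponds to "WHERE u.LARGE_VALUE_HASH IS NULL"
            }
        }
        return unused;
    }
}
```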

Updated a couple of test cases and added a few more that deal with the new feature. A new "mysql5_local" database

configuration was added to the POM, and this required modification of several test cases that depended on the

database configuration name to find files in test data.

All unit and integration tests pass while using HSQLDB and MySQL5.

    … 12 more files in changeset.
MODE-1073 Adjusted how several of the sequencing integration test cases wait for the sequenced information.

Several integration tests have recently been failing periodically, especially on Hudson (where the machines are generally slower). Many of the sequencing-related integration tests upload/publish one or more files, and then wait while ModeShape asynchronously sequences them and updates the indexes with the content derived from the uploaded artifact(s). This wait logic was relatively flawed, and resulted in periodic failures.

These test cases were changed to use a single method that looked for a particular node and, if the node was not available in the session or via query, would sleep for 200ms and then retry. This continues for approximately 10 seconds, after which the method causes an assertion failure. However, most of the time the method only has to wait a second or two (again, depending upon the speed of the machine); thus hopefully the 10 second timeout is sufficient at this time.
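
A retry helper of the kind described here might look like the sketch below. The `WaitFor` class and `waitUntil` method are hypothetical names, not the actual test utility; the condition would be whatever check finds the node in the session or via query, and the caller fails the assertion when `false` is returned.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for tests that wait on asynchronous sequencing. */
class WaitFor {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition held, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long sleepMillis) {
        long start = System.nanoTime();
        long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) return true;
            if (System.nanoTime() - start >= timeoutNanos) return false;
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

With the values from the description, a test would call `WaitFor.waitUntil(check, 10_000, 200)` and assert that the result is true.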

With these changes, none of the sequencing-related integration tests have been failing for me locally, even after repeated runs.

MODE-1073 Corrected the sequencing service's logic of creating intermediary nodes above the sequenced output to be more thread-safe.

It looks like this is a problem with the Sequencing Service and not the DDL sequencer. The problem appears to be

caused by a race condition between multiple sequencers running at once (in separate threads), and of course is

related to how quickly the files are uploaded (published) and how busy the service already is. For example, if

the files are published a bit more slowly, the threads are far less likely to run concurrently, and the problem

will not be seen.

I was able to replicate this with two new test cases that almost always exhibited the problem. (It is a race

condition, after all.)

However, the cause is actually the code that creates the intermediary nodes (the nodes above the sequenced output).

This code looks to see if the nodes already exist, and if so it doesn't recreate them. If two sequencers

are running concurrently looking for the same intermediary nodes (and neither has committed their changes to the

persisted data), neither will see any intermediary nodes and both will attempt to create them.

The fix is to have this logic create the intermediary nodes only if they don't exist; IOW, to ask that the

connector create them lazily. Unfortunately, an interface needed to be changed to add a new method (to create

nodes lazily), and several implementations changed to implement the new method. However, as they are new methods,

the impact should be isolated to this new logic. And since this logic is always used in all sequencing operations,

the impact should be pretty easily verified.

After I made the correction to the logic, I was not able to replicate the problem with the new test cases, even

after running almost a dozen times. I'll continue testing and running a full integration build before continuing.
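
The difference between "check, then create" and "create only if absent" can be sketched with a concurrent map. This is an analogy rather than the connector API: `IntermediateNodes` and `createIfMissing` are hypothetical names, and a `ConcurrentHashMap` stands in for the store that performs the lazy creation atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a store that creates intermediary nodes lazily. */
class IntermediateNodes {
    private final ConcurrentMap<String, String> nodesByPath = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    /**
     * Creates the node at the path only if it does not exist yet. The existence check and
     * the creation happen as one atomic step, so concurrent callers cannot both create it.
     */
    String createIfMissing(String path) {
        return nodesByPath.computeIfAbsent(path, p -> {
            created.incrementAndGet();
            return "node:" + p;
        });
    }

    /** Number of nodes actually created, regardless of how many callers asked. */
    int createdCount() {
        return created.get();
    }
}
```

Calling `createIfMissing` for the same path from many threads creates the node once; a separate "exists?" call followed by "create" would allow two threads to both see no node and both create one.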

    /modeshape-integration-tests/src/test/resources/config/configRepositoryForDdlSequencing.xml (+65, -0)
MODE-1078 Corrected the RepositoryLockManager.notify() method to use the correct "mode:workspace" property.

Indeed, the "mode:workspaceName" property (e.g., the ModeShapeLexicon.WORKSPACE_NAME constant) is used in

the configuration of the federation definitions, while "mode:workspace" is used within the lock manager

implementation. Therefore, the RepositoryLockManager.notify(...) method should be using

ModeShapeLexicon.WORKSPACE (the same as is already used in other methods of the class).

Also added an assertion to the RepositoryLockManager.getLockManager(...) method to ensure the workspace

name is not null, and added this pre-condition to the existing JavaDoc.

All unit and integration tests pass.

MODE-1077 Corrected the indexing logic to prevent nodes from not appearing in the indexes.

I was able to replicate this faulty behavior with an integration test that was sequencing a file and querying

the results, only after configuring the engine to use 'initial content'. In all our testing, we were able to

show that the content in the store was correct, and that the only problem appeared to be that some nodes were

missing from the indexes.

It turns out that the problem is in the 'modeshape-search-lucene' module's logic for how the ChangeRequest

objects are processed and used to update the indexes. Using 'initial content' changed the requests that were

being generated such that this condition occurred; when no 'initial content' was specified, the ChangeRequest

objects were different enough that this condition was not encountered.

The logic to process the ChangeRequests tries to figure out whether each change request contains all the

information necessary to update the indexes. Some change requests, like CreateNodeRequest or

DeleteSubgraphRequest or AddPropertyRequest, have all the information necessary to update the indexes. Other

ChangeRequests don't have all the information necessary (e.g., an UpdatePropertiesRequest doesn't have the

old values, which we have to remove from the indexes), so in these cases we crawl the content to retrieve the

information and replace the record for that node. For many of the requests where we don't have sufficient

information, we crawl just that node (e.g., a crawl depth of 1), but in a few cases we determine that it's more

effective just to crawl the entire subgraph and replace all of the index entries for the entire subgraph. When

crawling the entire subgraph, we can safely ignore all those ChangeRequests that have a changed location below

the crawled paths, so we prune these requests.

The problem was that we were pruning the requests that applied below the crawled path

**even when we were just crawling to a depth of 1**. That means we were pruning some requests, and the

indexes would not reflect those nodes. Only if the subgraph containing those nodes were re-crawled would

the nodes reappear, but that would only happen occasionally.

The fix is actually pretty simple. In the LuceneSearchEngine, where we are pruning the change requests that

apply below the crawl location, we simply have to check the depth and only prune if the depth > 1.
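
The depth check can be sketched as follows. This is an illustration only, not the LuceneSearchEngine code: change requests are reduced to the path they change, `ChangePruning.requestsToProcess` is a hypothetical helper, and a crawl depth greater than 1 is assumed to mean that the whole subgraph is re-crawled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of pruning change requests that a crawl already covers. */
class ChangePruning {
    /**
     * Returns the changed paths that still need to be processed individually.
     * Changes below the crawled path are pruned only when the crawl goes deeper than
     * the node itself (depth > 1), because only then does the crawl re-index them.
     */
    static List<String> requestsToProcess(List<String> changedPaths, String crawledPath, int crawlDepth) {
        List<String> remaining = new ArrayList<>();
        for (String path : changedPaths) {
            boolean belowCrawledPath = path.startsWith(crawledPath + "/");
            boolean coveredByCrawl = crawlDepth > 1 && belowCrawledPath;
            if (!coveredByCrawl) {
                remaining.add(path);
            }
        }
        return remaining;
    }
}
```

With a crawl depth of 1 at `/a/b`, a change at `/a/b/c` is kept and processed on its own; before the fix it would have been pruned and the node would have been missing from the indexes.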

Several test cases were added in the attempt to replicate, and these tests now pass. In fact, all unit and

integration tests pass after the changes are made.

    /modeshape-integration-tests/src/test/resources/config/configRepositoryForCndSequencingUsingJpa.xml (+74, -0)
Added Van to the list of developers in the parent POM.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2663 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1027 Made one additional tweak: if the new content is considered 'mix:versionable', then the RESTful service will do a checkin (which is how the JCR API is used when content is newly marked as versionable).

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2661 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1027 Improved the support for publishing versionable files.

Changed the RESTful service to support modifying a subgraph via a PUT. With this change, the same JSON request for a POST (for create) could be sent as a PUT to update the subgraph rather than delete it. Of course, the JSON request only needs to include properties that change, although every child node needs to be specified. The previous PUT JSON request format (containing the properties to change) is still supported and behaves as before. Note that, before updating a node, the RESTful service first checks if the node is 'mix:versionable'. If it is, then the service checks out the node, modifies it, and then checks it back in. If it is not versionable, it just directly modifies the node.

Also changed the REST client to no longer unpublish an existing file before publishing a new one. Instead, if the file is already there, it submits the new file as a PUT rather than a POST.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2660 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1069 - missing localization in JDBC driver

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2656 76366958-4244-0410-ad5e-bbfabb93f86b

Added test case that verified the behavior as brought up in a discussion thread.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2655 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1067 Removed duplicate test and ignored another problematic test.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2653 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1066 Rolling back the correction of a unit test, since the Hudson builds are failing with this change (though this change is required locally).

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2651 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1067 The code contained three ways in which nodes were retrieved while processing the results of a query. One of these properly handled the case where the node was deleted in persistent storage, but was returned in the query result tuples because the indexes weren't yet updated to reflect the deletion. The two other ways were not handling this case, so these were corrected.

All unit and integration tests pass with these changes.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2649 76366958-4244-0410-ad5e-bbfabb93f86b

This is a fix to only the 2.2.x branch to remove the duplicate files in the META-INF folder of the modeshape-jdbc-2.2.1-http-jar-with-dependencies.jar. This will now enable the signing of the jar.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2646 76366958-4244-0410-ad5e-bbfabb93f86b

    /utils/modeshape-jdbc/src/assembly/kit.xml (+2, -2)
MODE-997 - fix to the ResultSet date, time, and timestamp methods that take a calendar as an argument, to better handle conversions from the internal calendar representation to the expected result based on the target calendar passed in.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2644 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1049 Corrected four more unit tests that were failing due to recent changes with ISDESCENDANTNODE criteria.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2643 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1049 Corrected one of the test cases (and added another) that deals with ISDESCENDANTNODE criteria. The expected results were incorrect.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2641 76366958-4244-0410-ad5e-bbfabb93f86b

MODE-1052 Corrected the logic in the LuceneSearchEngine for handling NOT criteria.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2640 76366958-4244-0410-ad5e-bbfabb93f86b