MODE-2735: improper formatting of full text search query constraints with other than literal RHS

  1. … 1 more file in changeset.
MODE-2661 Changes the logic of some of the Lucene queries to improve search performance. The changes include both the logic of ConstantScoreWeightQuery, which no longer goes through all the documents directly, and making sure to use Lucene native TermQueries when applicable. They also refactor some of the query code to use Java 8 idioms. This commit also upgrades the Lucene version to the latest (6.4.1).

  1. … 15 more files in changeset.
support non-built-in namespaces in query selectors

MODE-2516 Updates JDK to 1.8 and jboss-parent to the latest version (19)

This is the first significant commit of the 5.x series and contains a number of significant changes:

- the naming of Maven version properties changed to adopt the standard pattern: 'version.<groupId>.<artifactId>'

- build system and dependency updates so that the latest Maven plugin versions function correctly

- updating source code to avoid compiler and javadoc warnings

  1. … 154 more files in changeset.
MODE-2166 Adds CAST dynamic operand for JCR-SQL2.

  1. … 9 more files in changeset.
MODE-2448 Fixed unicode support in regex matching for FTS.

  1. … 1 more file in changeset.
MODE-2347, MODE-1055 Made sure QOM behaves in the same way as SQL2 as far as missing selector columns in queries.

  1. … 2 more files in changeset.
MODE-2329 Fixed the handling of expanded form selector names for the query engine.

  1. … 2 more files in changeset.
MODE-2151 Added support for CHILDCOUNT dynamic operand

Pretty basic support that should prove quite useful in certain situations. This may be relatively

expensive when the repository has nodes with lots of children since it requires loading the parent

node's child references in order to obtain the count. The CHILDCOUNT criteria would therefore work

much better/faster as filtering criteria in a query that already defines criteria that indexes can

use.

  1. … 14 more files in changeset.
MODE-1671 Added 'mode:id' pseudocolumn for JCR-SQL2 queries

It is now possible to use the 'mode:id' pseudocolumn that exists on all selectors

to obtain the javax.jcr.Node.getIdentifier() value. It can be used in WHERE constraints

and JOIN criteria.

  1. … 17 more files in changeset.
MODE-2246 Implemented default FTS via strict regex matching. No stemming or punctuation processing is done by default (unlike what Lucene did in 3.x), which means that some of the tests had to be adapted. Also re-enabled the full text search tests that had previously been disabled.

  1. … 10 more files in changeset.
MODE-2018 Implemented new query engine.

Refactored the query functionality to now use several new service provider interfaces (SPI),

and implemented a new query engine that can take advantage of administrator-defined indexes.

When no such indexes are defined, the query engine is able to still answer the queries

by "scanning" all nodes in the repository. This is like a regular relational database:

all query functionality works (albeit slowly) even when no indexes are defined, though

to improve performance simply define an appropriate index based upon the query or queries

that are being used.

All of ModeShape's query parsing, planning, and optimization steps are basically unchanged

from the previous query system. There is one addition to the rule-based optimizer: a new

rule looks at query plans and adds the potential indexes that might be of use in each

access query portion of a query plan. Then, the query execution process (see below)

chooses one of the identified indexes based upon the selectivity and cardinality. If no index

is available for that portion of the query plan, then the query engine simply iterates

over all queryable nodes in the repository.
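
As a rough illustration of the selection step described above, here is a minimal sketch. The `CandidateIndex` and `IndexChooser` types are hypothetical and are not part of ModeShape's SPI; the sketch assumes each candidate index reports an estimated cardinality (how many nodes it would return) and that the engine prefers the smallest estimate, falling back to a full scan when no candidate exists.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; these types are illustrative, not ModeShape's SPI.
final class CandidateIndex {
    final String name;
    final long estimatedCardinality; // estimated number of nodes the index would return

    CandidateIndex(String name, long estimatedCardinality) {
        this.name = name;
        this.estimatedCardinality = estimatedCardinality;
    }
}

final class IndexChooser {
    // Returns the candidate expected to return the fewest nodes, or empty when
    // no index applies and the engine must iterate over all queryable nodes.
    static Optional<CandidateIndex> chooseIndex(List<CandidateIndex> candidates) {
        return candidates.stream()
                .min(Comparator.comparingLong(c -> c.estimatedCardinality));
    }

    public static void main(String[] args) {
        List<CandidateIndex> candidates = List.of(
                new CandidateIndex("byLastModified", 120),
                new CandidateIndex("byPrimaryType", 50_000));
        String plan = chooseIndex(candidates)
                .map(c -> "use index " + c.name)
                .orElse("scan all queryable nodes");
        System.out.println(plan);
    }
}
```

A real planner would weigh more than one number per index; a single estimate is used here only to keep the example short.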

A new kind of component, called a "query index provider", allows the query engine to delegate

various responsibilities around indexes to these providers. For example, a provider must

provide an index planner that can examine the constraints that apply to an access query

and determine if any of the provider's indexes can be used. When they are, ModeShape

adds those indexes to the query plan. If the query engine uses one of those indexes,

then the provider must be able to return all of those nodes that satisfy the criteria

as described earlier by its index planner. Finally, as ModeShape content changes, ModeShape

will notify the index providers of the changes so that they can ensure their indexes

are kept up-to-date with the content.

This means that a provider can implement the functionality using any kind of technology,

and consequently, that ModeShape can begin to leverage multiple kinds of search and index

technology within its query system. The ModeShape community anticipates having providers

that use Lucene, Solr, and ElasticSearch. ModeShape will also likely come with a provider

that maintains file-system based indexes. Additionally, providers can optionally support

indexes on one or more properties. Thus, it will be possible to mix and match

these providers, selecting the best technology for the specific kind of index.

The new query engine does the execution in a very different way than the previous engine,

which used Lucene to determine the tuples (that is, the values in each row) for each access

query and that were then further processed and combined to form the tuples that were returned

in the result set. The new engine instead uses a new concept of a "stream of node keys"

for each access query: what actually implements that stream depends on many factors.

A node sequence is an abstraction of a stream of "rows" containing one or more node keys.

The interfaces are designed to make it possible to lazily implement a stream in a very

efficient manner. Specifically, a node stream is actually composed of multiple "batches"

of rows, and batches can be of any size.
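
The batching idea can be sketched as follows. This is only an illustration under stated assumptions: `KeySequence` and `BatchedSequence` are hypothetical names that do not mirror ModeShape's actual NodeSequence interfaces, and node keys are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a lazily evaluated, batched stream of node keys.
// The names are illustrative and do not mirror ModeShape's NodeSequence API.
interface KeySequence {
    // Returns the next batch of node keys, or null when the stream is exhausted.
    List<String> nextBatch();
}

final class BatchedSequence implements KeySequence {
    private final Iterator<String> source;
    private final int batchSize;

    BatchedSequence(Iterator<String> source, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override
    public List<String> nextBatch() {
        if (!source.hasNext()) {
            return null;
        }
        List<String> batch = new ArrayList<>(batchSize);
        // Pull only as many keys as this batch needs; the rest stay unread.
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        KeySequence seq = new BatchedSequence(
                List.of("k1", "k2", "k3", "k4", "k5").iterator(), 2);
        for (List<String> batch = seq.nextBatch(); batch != null; batch = seq.nextBatch()) {
            System.out.println(batch);
        }
    }
}
```

Because each batch is built only when requested, a consumer that stops early never forces the remaining keys to be read.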

Consider when the engine finds that no indexes are available for a certain access query. The

engine simply uses a "node sequence" (or NodeSequence) implementation that returns in batches

a row for each node in the repository.

But if an access query involves a criterion on the path of a node, such as

"... WHERE ISSAMENODE('/foo/bar') ...", then ModeShape knows that this query (or portion of

a query) will have only one result, namely the node at "/foo/bar". ModeShape doesn't need

an index to quickly find this node; it merely has to navigate to that path to find the one

node that satisfies this query. ModeShape has several other optimizations, too: it knows

when a query involves all children or descendants of a node at a given path, and can take

this into account when optimizing and executing the query. All of these are handled with

special NodeSequence implementations optimized for each case.

For many access queries (i.e., part of a larger query), the engine will use one of the

indexes identified by one of the providers. When this happens, ModeShape uses other

NodeSequence implementations that utilize the underlying indexes to find the nodes that satisfy

some of the criteria.

The above describes how the engine uses a single NodeSequence instance for each access

query in a larger query. But how does the engine combine these to determine the ultimate

query results? Basically, the engine constructs a series of functions that process one or more

NodeSequence instances to filter and combine into other NodeSequences.

For example, a custom index might be used to find all nodes that have a 'jcr:lastModified'

timestamp within some range. Presumably this index is used because it has a higher selectivity,

meaning that it will filter out more nodes and return fewer nodes than other indexes.

Other criteria that are also applied to this access query might then be applied by a filter

that processes the actual nodes' property values.

While the result of this commit is a functioning query engine that is shown to work in most

of the query-related unit and integration tests, there still are a few areas that are not complete.

Specifically:

* The new engine does not support full-text search, and currently throws an exception

* No index providers are implemented. Therefore, all queries involve "scanning" the repository.

This can be time consuming, especially for federated repositories. Consequently, all such

tests that query federated content have been disabled/ignored.

  1. … 225 more files in changeset.
MODE-2081 Changed the remaining files over to the ASL 2.0 license

    • -18
    • +10
    ./DescendantNodeJoinCondition.java
  1. … 1036 more files in changeset.
MODE-2041 Corrected numerous compiler warnings, JavaDoc errors and warnings, and removed quite a few JavaDoc comments that are inherited via @Override.

  1. … 79 more files in changeset.
Corrected JavaDoc errors and compiler warnings.

  1. … 11 more files in changeset.
MODE-2062 Corrected full text search with a bind variable for expression

  1. … 2 more files in changeset.
MODE-2037: Extended like operation (reverse like implementation)

Added a reverse like (e.g., "RELIKE") constraint that is useful when

the LIKE pattern is stored in a node property and the intent is to

find all nodes that have a pattern that matches a given string:

SELECT *

FROM [service:Locator] AS locator

WHERE relike($phone, locator.[service:phonePattern])
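
To make the "reverse" direction concrete, here is a small sketch in which the patterns are the stored data and the string is the query input. It is not ModeShape's implementation of RELIKE: the `ReverseLike` class and its methods are hypothetical, and the sketch assumes only the `%` and `_` wildcards, without escape handling.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative sketch only; not ModeShape's implementation of RELIKE.
final class ReverseLike {
    // Translates a LIKE pattern into a regex: '%' matches any run of characters,
    // '_' matches exactly one character, everything else is taken literally.
    // Backslash escapes in the LIKE pattern are not handled in this sketch.
    static Pattern likeToRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Reverse LIKE: returns the stored patterns that match the given value.
    static List<String> matchingPatterns(String value, List<String> storedPatterns) {
        return storedPatterns.stream()
                .filter(p -> likeToRegex(p).matcher(value).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> stored = List.of("555-%", "800-___-____", "212-%");
        System.out.println(matchingPatterns("555-1234", stored));
    }
}
```

In the query above, `$phone` plays the role of `value` and each node's `service:phonePattern` property plays the role of a stored pattern.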

  1. … 12 more files in changeset.
MODE-1969 Updated querying to support simple references. Also added unit tests for the JCR-SQL2 REFERENCE() operand. This exposed a bug: previously, strong references were not stored in the ALL_REFERENCES Lucene field and were therefore never retrieved in queries.

  1. … 6 more files in changeset.
MODE-1969 Extended the standard JCR types with a new type: SIMPLEREFERENCE which acts as a weak reference, but doesn't cause any back-pointers to be stored. Also, properties of this type are not returned by the JCR specific methods handling normal references (strong & weak).

  1. … 39 more files in changeset.
MODE-1840 CONTAINS clause should allow use of bind variables

Per the JCR 2.0 specification, the 'CONTAINS' clause in JCR-SQL2 queries

allows using a bind variable in place of the full text search expression.

This commit adds this support, including two new test cases that

verify the functionality.

  1. … 5 more files in changeset.
MODE-1599 Corrected how result columns are qualified with table name for QOM queries

The org.apache.jackrabbit.test.api.query.qom.ColumnTest#testExpandColumnsForNodeType() TCK test runs

queries in both the Query Object Model (QOM) and JCR-SQL2 languages, and expects that the result set

columns for both are named like "selectorName.propertyName", even for queries such as

"SELECT s.* FROM [nt:unstructured] AS s".

The JCR-SQL2 query result set columns were named correctly, since we set the 'qualifyExpandedColumnNames'

field in the PlanHints for this language. However, the hints for the queries created programmatically

using the JCR-QOM QueryObjectModelFactory interface were not being set in the same way. This

change corrects that and no longer ignores the ColumnTest#testExpandColumnsForNodeType() TCK test.

  1. … 1 more file in changeset.
MODE-1496 JCR-3313 Added fixed JCR TCK test and corrected result set column behavior

Corrected the behavior of query result column names when the query contains a wildcard in the SELECT statement.

  1. … 9 more files in changeset.
MODE-1485 Corrected JOIN behaviors and other query-related issues

Several TCK tests were testing the JOIN behaviors, and ModeShape produced the incorrect results.

This was in part because of incorrect logic within the NestedLoopJoinComponent for JOINs.

In particular, the logic of which rows to include (and how) in the results was incorrect.

Additionally, the NULL handling logic was also incorrect: when evaluating the join criteria

for a pair of rows, NULL values should never match other NULL values. This is the behavior of

SQL and relational theory, although SQL-92 introduces the IS NOT NULL and IS NULL qualifiers

for the join criteria (while JCR-SQL2 does not).
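
The NULL rule can be sketched with a simple inner nested-loop equi-join. This is an illustration only and not the code of NestedLoopJoinComponent; `NullAwareJoin` and its methods are hypothetical names, and each row is reduced to a single join value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of an inner nested-loop equi-join in which NULL never matches NULL.
// This is not ModeShape's NestedLoopJoinComponent; the names are hypothetical.
final class NullAwareJoin {
    // SQL-style equality for join criteria: a NULL on either side means "no match".
    static boolean joinEquals(Object left, Object right) {
        return left != null && right != null && left.equals(right);
    }

    // Returns index pairs {leftIndex, rightIndex} for every matching pair of rows.
    static List<int[]> innerJoin(List<String> leftValues, List<String> rightValues) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < leftValues.size(); i++) {
            for (int j = 0; j < rightValues.size(); j++) {
                if (joinEquals(leftValues.get(i), rightValues.get(j))) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Arrays.asList permits null elements (List.of does not).
        List<String> left = Arrays.asList("a", null, "b");
        List<String> right = Arrays.asList(null, "b");
        for (int[] pair : innerJoin(left, right)) {
            System.out.println("left row " + pair[0] + " joins right row " + pair[1]);
        }
    }
}
```

Note that the two NULL values in the sample data do not produce a joined row; only the rows holding "b" match.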

Several methods within the query results were also corrected to ensure they cannot be called

when the query involves multiple selectors.

A number of TCK tests were re-enabled. Unfortunately, several query-related issues logged against the TCK

tests are still open, so some query-related tests are still disabled (and are captured as part of

MODE-1396).

  1. … 10 more files in changeset.
MODE-1468 Corrected JCR-JQOM functionality

Corrected a lot of incorrect JCR-JQOM functionality, especially in the QueryObjectModel

instances' string statements, which are now completely parsable as JCR-SQL2. Thus,

one can always convert QOM to JCR-SQL2 -- and since we internally parse the JCR-SQL2

as a QOM, we can actually go full circle.

That wasn't the only correction. When using the QOM, literal values can take on a different

form; the same form as when using explicit "CAST(...)" functions in JCR-SQL2. But since

CASTs are not often used, executing a typical JCR-SQL2 with string literals worked well

but executing a QOM with correctly-typed literals didn't. Now, executing a QOM with

string-form literals or properly-typed literals works the same way.

Additionally, many of the TCK QOM tests pointed out deficiencies in our QOM validation

and results. Most of these were corrected, although several outstanding problems are now

described by other issues (MODE-1485, MODE-1095, and JCR-3313).

All tests pass with these changes.

  1. … 54 more files in changeset.
MODE-1473 - Fixed parameter handling for JcrQueryObjectModelFactory.equiJoinCondition

MODE-1365 Added more support for queries

Lots of changes in the portions of the ModeShape code that use Lucene, including the

general interfaces and reusable queries (in org.modeshape.jcr.query.lucene) and the basic

schema that uses a single index (in org.modeshape.jcr.query.lucene.basic).

At this point, all query functionality appears to work except for full-text search.

All current tests pass with these changes, though more testing is required.

  1. … 97 more files in changeset.
MODE-1368 Removed all legacy modules no longer needed in 3.x

ModeShape 3.x will not need a number of the 2.x modules. In particular:

- since 3.x will only have an AS7 kit, the AS5 and AS6 artifacts were removed

- all the connectors were removed, since they're no longer used

- the connector benchmark tests module was removed, replaced by our new

performance test suite

- the JPA DDL generator utility has been removed

- the 'modeshape-graph', 'modeshape-repository', 'modeshape-search-lucene'

and 'modeshape-clustering' modules have all been removed, since the new

'modeshape-jcr' module no longer uses them

- the DocBook modules were removed and are replaced by the Confluence space

- the two JDBC modules were moved out of the 'utils' directory to top-level modules

The build still works, but not all components have been included in the build.

This is because the query functionality doesn't yet work, and quite a few web

and JDBC driver modules depend on it.

The assembly profile has not yet been changed or corrected.

  1. … 3639 more files in changeset.