MODE-2166 Adds CAST dynamic operand for JCR-SQL2.

  1. … 11 more files in changeset.
MODE-2151 Added support for CHILDCOUNT dynamic operand

Pretty basic support that should prove quite useful in certain situations. This may be relatively

expensive when the repository has nodes with lots of children since it requires loading the parent

node's child references in order to obtain the count. The CHILDCOUNT criterion would therefore work

much better/faster as a filtering criterion in a query that already defines criteria that indexes can

use.

  1. … 16 more files in changeset.
MODE-2160 Completed the first stab at a local index provider. There are only a few very limited test cases, but they do pass and show that the provider is able to be included in the query plan, properly selected for use, and properly used during query execution.

  1. … 59 more files in changeset.
MODE-1671 Added 'mode:id' pseudocolumn for JCR-SQL2 queries

It is now possible to use the 'mode:id' pseudocolumn that exists on all selectors

to obtain the javax.jcr.Node.getIdentifier() value. It can be used in WHERE constraints

and JOIN criteria.

  1. … 19 more files in changeset.
MODE-2160 Refactored the query engine and index provider SPI.

Changed how index providers are initialized, changed the indexing to use only events, changed the reindexing mechanism to use a much simplified IndexWriter, and added a partial LocalIndex and provider implementation (still needs work).

  1. … 115 more files in changeset.
MODE-2018 Implemented new query engine.

Refactored the query functionality to now use several new service provider interfaces (SPI),

and implemented a new query engine that can take advantage of administrator-defined indexes.

When no such indexes are defined, the query engine is able to still answer the queries

by "scanning" all nodes in the repository. This is like a regular relational database:

all query functionality works (albeit slowly) even when no indexes are defined. To improve

performance, simply define an appropriate index based upon the query or queries

that are being used.

All of ModeShape's query parsing, planning, and optimization steps are basically unchanged

from the previous query system. There is one addition to the rule-based optimizer: a new

rule looks at query plans and adds the potential indexes that might be of use in each

access query portion of a query plan. Then, the query execution process (see below)

chooses one of the identified indexes based upon the selectivity and cardinality. If no index

is available for that portion of the query plan, then the query engine simply iterates

over all queryable nodes in the repository.
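
As a rough illustration of the selection step described above, the following Java sketch picks the candidate index expected to return the fewest nodes and falls back to a full scan when no candidate exists. The names CandidateIndex and IndexChooser are hypothetical and are not part of ModeShape's SPI, and the cost estimate (selectivity multiplied by cardinality) is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical description of an index that a planner proposed for one access query.
final class CandidateIndex {
    final String name;
    final double selectivity; // assumption: fraction of covered nodes expected to match (0.0 to 1.0)
    final long cardinality;   // assumption: number of nodes the index covers

    CandidateIndex(String name, double selectivity, long cardinality) {
        this.name = name;
        this.selectivity = selectivity;
        this.cardinality = cardinality;
    }

    // Estimated number of nodes this index would return (assumed cost model).
    double estimatedRows() {
        return selectivity * cardinality;
    }
}

final class IndexChooser {
    // Returns the candidate with the lowest estimate, or empty to signal a full scan.
    static Optional<CandidateIndex> choose(List<CandidateIndex> candidates) {
        CandidateIndex best = null;
        for (CandidateIndex candidate : candidates) {
            if (best == null || candidate.estimatedRows() < best.estimatedRows()) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }

    // Describes the access strategy that would be used for the given candidates.
    static String describe(List<CandidateIndex> candidates) {
        return choose(candidates)
                .map(index -> "use index " + index.name)
                .orElse("scan all queryable nodes");
    }

    public static void main(String[] args) {
        List<CandidateIndex> candidates = Arrays.asList(
                new CandidateIndex("byType", 0.5, 1000),
                new CandidateIndex("byDate", 0.01, 1000));
        System.out.println(describe(candidates));
        System.out.println(describe(Collections.<CandidateIndex>emptyList()));
    }
}
```

A real engine would likely weigh more factors than this single estimate; the sketch only shows the "choose one, else scan" shape of the decision.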

A new kind of component, called a "query index provider", allows the query engine to delegate

various responsibilities around indexes to these providers. For example, a provider must

provide an index planner that can examine the constraints that apply to an access query

and determine if any of the provider's indexes can be used. When they can, ModeShape

adds those indexes to the query plan. If the query engine uses one of those indexes,

then the provider must be able to return all of those nodes that satisfy the criteria

as described earlier by its index planner. Finally, as ModeShape content changes, ModeShape

will notify the index providers of the changes so that they can ensure their indexes

are kept up-to-date with the content.

This means that a provider can implement the functionality using any kind of technology,

and consequently, that ModeShape can begin to leverage multiple kinds of search and index

technology within its query system. The ModeShape community anticipates having providers

that use Lucene, Solr, and ElasticSearch. ModeShape will also likely come with a provider

that maintains file-system based indexes. Additionally, providers can optionally support

indexes on one or more properties. Thus, it will be possible to mix and match

these providers, selecting the best technology for the specific kind of index.

The new query engine does the execution in a very different way than the previous engine,

which used Lucene to determine the tuples (that is, the values in each row) for each access

query, which were then further processed and combined to form the tuples that were returned

in the result set. The new engine instead uses a new concept of a "stream of node keys"

for each access query: what actually implements that stream depends on many factors.

A node sequence is an abstraction of a stream of "rows" containing one or more node keys.

The interfaces are designed to make it possible to lazily implement a stream in a very

efficient manner. Specifically, a node stream is actually composed of multiple "batches"

of rows, and batches can be of any size.
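
To make the batching idea concrete, here is a minimal Java sketch of a lazily evaluated sequence that hands out node keys in batches. BatchedKeySequence and nextBatch are hypothetical names chosen for this example; the actual NodeSequence interfaces are not shown in this commit message, and each "row" here is simplified to a single key held as a String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-in for a node sequence: a lazy stream of batches of node keys.
final class BatchedKeySequence {
    private final Iterator<String> keys; // source of node keys, consumed only on demand
    private final int batchSize;

    BatchedKeySequence(Iterator<String> keys, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be at least 1");
        this.keys = keys;
        this.batchSize = batchSize;
    }

    // Pulls at most batchSize keys from the source; returns null once the source is exhausted.
    List<String> nextBatch() {
        if (!keys.hasNext()) return null;
        List<String> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && keys.hasNext()) {
            batch.add(keys.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("k1", "k2", "k3", "k4", "k5").iterator();
        BatchedKeySequence sequence = new BatchedKeySequence(source, 2);
        // Each call materializes only the next batch of keys.
        for (List<String> batch = sequence.nextBatch(); batch != null; batch = sequence.nextBatch()) {
            System.out.println(batch);
        }
    }
}
```

Because the source iterator is only advanced inside nextBatch, nothing is loaded until a consumer asks for it, which is the property the paragraph above is describing.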

Consider when the engine finds that no indexes are available for a certain access query. The

engine simply uses a "node sequence" (or NodeSequence) implementation that returns in batches

a row for each node in the repository.

But if an access query involves a criterion on the path of a node, such as

"... WHERE ISSAMENODE('/foo/bar') ...", then ModeShape knows that this query (or portion of

a query) will have only one result, namely the node at "/foo/bar". ModeShape doesn't need

an index to quickly find this node; it merely has to navigate to that path to find the one

node that satisfies this query. ModeShape has several other optimizations, too: it knows

when a query involves all children or descendants of a node at a given path, and can take

this into account when optimizing and executing the query. All of these are handled with

special NodeSequence implementations optimized for each case.

For many access queries (i.e., part of a larger query), the engine will use one of the

indexes identified by one of the providers. When this happens, ModeShape uses other

NodeSequence implementations that utilize the underlying indexes to find the nodes that satisfy

some of the criteria.

The above describes how the engine uses a single NodeSequence instance for each access

query in a larger query. But how does the engine combine these to determine the ultimate

query results? Basically, the engine constructs a series of functions that process one or more

NodeSequence instances, filtering and combining them into other NodeSequences.

For example, a custom index might be used to find all nodes that have a 'jcr:lastModified'

timestamp within some range. Presumably this index is used because it has a higher selectivity,

meaning that it will filter out more nodes and return fewer nodes than other indexes.

Other criteria that are also applied to this access query might then be applied by a filter

that processes the actual nodes' property values.
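
The two-stage evaluation in this example (an index narrows the candidates, then a filter checks the remaining criteria against the actual property values) might look roughly like the following Java sketch. PostIndexFilter is a hypothetical name, and nodes are modeled as plain maps of property names to string values purely for illustration; this is not how ModeShape represents nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class PostIndexFilter {
    // Keeps only those keys returned by the index whose node properties also satisfy
    // the criteria that the index could not evaluate.
    static List<String> apply(List<String> keysFromIndex,
                              Map<String, Map<String, String>> propertiesByKey,
                              Predicate<Map<String, String>> remainingCriteria) {
        List<String> result = new ArrayList<>();
        for (String key : keysFromIndex) {
            Map<String, String> properties = propertiesByKey.get(key);
            if (properties != null && remainingCriteria.test(properties)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data: two nodes that the timestamp index already selected.
        Map<String, Map<String, String>> nodes = new HashMap<>();
        Map<String, String> first = new HashMap<>();
        first.put("jcr:primaryType", "nt:file");
        Map<String, String> second = new HashMap<>();
        second.put("jcr:primaryType", "nt:folder");
        nodes.put("n1", first);
        nodes.put("n2", second);

        List<String> kept = apply(Arrays.asList("n1", "n2"), nodes,
                properties -> "nt:file".equals(properties.get("jcr:primaryType")));
        System.out.println(kept);
    }
}
```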

While the result of this commit is a functioning query engine that is shown to work in most

of the query-related unit and integration tests, there still are a few areas that are not complete.

Specifically:

* The new engine does not support full-text search, and currently throws an exception.

* No index providers are implemented. Therefore, all queries involve "scanning" the repository.

This can be time consuming, especially for federated repositories. Consequently, all such

tests that query federated content have been disabled/ignored.

  1. … 229 more files in changeset.
MODE-2081 Changed the license for ModeShape code to ASL 2.0.

    ./QueryObjectModelConstants.java (-18, +10)
  1. … 545 more files in changeset.
MODE-2037: Extended like operation (reverse like implementation)

Added a reverse like (e.g., "RELIKE") constraint that is useful when

the LIKE pattern is stored in a node property and the intent is to

find all nodes that have a pattern that matches a given string:

SELECT *

FROM [service:Locator] AS locator

WHERE relike($phone, locator.[service:phonePattern])
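
The semantics of the reverse match can be sketched in plain Java as follows: the pattern comes from the stored property and the string comes from the caller, which is the opposite of an ordinary LIKE. ReverseLike is a hypothetical helper written for this illustration, and it assumes only the two common LIKE wildcards ('%' for any run of characters, '_' for exactly one) with no escape-character handling; ModeShape's actual implementation may differ.

```java
import java.util.regex.Pattern;

final class ReverseLike {
    // Translates a LIKE pattern into a regular expression.
    // Assumption: '%' matches any run of characters and '_' matches exactly one.
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char ch : likePattern.toCharArray()) {
            if (ch == '%') {
                regex.append(".*");
            } else if (ch == '_') {
                regex.append('.');
            } else {
                // Quote every literal character so regex metacharacters lose their meaning.
                regex.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Reverse LIKE: 'storedPattern' is the node's property value, 'value' is the supplied string.
    static boolean relike(String value, String storedPattern) {
        return toRegex(storedPattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(relike("+1 555 0100", "+1 555%"));
        System.out.println(relike("+44 20 0000", "+1 555%"));
    }
}
```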

  1. … 14 more files in changeset.
MODE-1549 JCR-SQL2 uses '<>' rather than '!=' as not-equal-to operator

ModeShape's abstract syntax tree used '!=' as the "not-equal-to" operator and

aliased '<>' to '!='. This change reverses this so that the '<>' token is considered

the primary operator, and '!=' is aliased.
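
One way to picture the primary/alias relationship is an operator enumeration that accepts both tokens when parsing but always writes out the primary one. The enum below is a hypothetical sketch written for this illustration; its name, constants, and methods are assumptions and do not reproduce ModeShape's actual operator class.

```java
// Hypothetical operator enum; the name and shape are illustrative only.
enum ComparisonOperator {
    EQUAL_TO("="),
    NOT_EQUAL_TO("<>", "!="), // '<>' is the primary symbol, '!=' is accepted as an alias
    LESS_THAN("<"),
    GREATER_THAN(">");

    private final String symbol;
    private final String[] aliases;

    ComparisonOperator(String symbol, String... aliases) {
        this.symbol = symbol;
        this.aliases = aliases;
    }

    // The primary symbol, used when a query is written back out as a string.
    String symbol() {
        return symbol;
    }

    // Resolves either the primary symbol or an alias; returns null for unknown symbols.
    static ComparisonOperator forSymbol(String text) {
        for (ComparisonOperator operator : values()) {
            if (operator.symbol.equals(text)) return operator;
            for (String alias : operator.aliases) {
                if (alias.equals(text)) return operator;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Both spellings resolve to the same operator, which prints its primary symbol.
        System.out.println(forSymbol("!=").symbol());
        System.out.println(forSymbol("<>").symbol());
    }
}
```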

  1. … 1 more file in changeset.
MODE-1468 Corrected JCR-JQOM functionality

Corrected a lot of incorrect JCR-JQOM functionality, especially in the QueryObjectModel

instances' string statements, which are now completely parsable as JCR-SQL2. Thus,

one can always convert QOM to JCR-SQL2 -- and since we internally parse the JCR-SQL2

as a QOM, we can actually go full circle.

That wasn't the only correction. When using the QOM, literal values can take on a different

form: the same form as when using explicit "CAST(...)" functions in JCR-SQL2. But since

CASTs are not often used, executing a typical JCR-SQL2 with string literals worked well

but executing a QOM with correctly-typed literals didn't. Now, executing a QOM with

string-form literals or properly-typed literals works the same way.

Additionally, many of the TCK QOM tests pointed out deficiencies in our QOM validation

and results. Most of these were corrected, although several outstanding problems are now

described by other issues (MODE-1485, MODE-1095, and JCR-3313).

All tests pass with these changes.

  1. … 56 more files in changeset.
MODE-1365 Migrated JCR query functionality

Migrated the JCR query functionality from 2.x into the 3.x codebase.

At this point, all parsing and query object model code and tests have been moved,

all of the query-related JCR interfaces have been implemented, and the internal

RepositoryQueryManager is created and wired up.

The repository configuration schema and RepositoryConfiguration class have been

changed to contain the indexing-related options, and we're creating and using the

Hibernate Search components correctly.

However, some work is still required:

- indexing content changes (upon session.save and upon creating Binary values),

- generating the Lucene Query objects for the various JCR-JQOM criteria

(these methods have not yet been migrated)

As a result, queries parse but never return results.

All unit and integration tests pass.

  1. … 326 more files in changeset.
MODE-1289 New approach for storing/caching JCR content

This is the first commit to start the 3.0 effort, which involves a major change to how

the JCR layer stores and caches information. The new approach is based upon Infinispan and uses

Infinispan's cache loaders for persistence, and JSON-like documents (that are in-memory

structures not needing to be parsed/written) are used to store information for each node.

There are several new Maven modules:

- modeshape-jcr-redux

- modeshape-schematic

The 'modeshape-jcr-redux' module will eventually replace the 'modeshape-jcr' module once

the implementation is far enough along. And the 'modeshape-schematic' module will likely

move into the Infinispan project, so that needs to remain separate.

Although it may seem strange and unkempt to have the new JCR implementation in a new module,

doing so means that we can continue to rebase from 'master' (and the 2.7 work) for at least

some time. When the new module becomes complete enough, we'll move it and replace the

existing 'modeshape-jcr' module. It's also convenient to have both the old and new implementations

around in the same codebase.

The build was changed to focus upon the (few) modules that are oriented around the new

implementation. So the following can be used to build the newer codebase:

mvn clean install

However, the build has a new Maven profile called "legacy" that can be used to build the

old modules. We kept this to make sure that any rebasing can be compiled and verified.

For example, to build everything, including the new modules and the 2.x-style modules,

use the following command:

mvn clean install -Plegacy

As the newer 'modeshape-jcr-redux' progresses and other modules (e.g., sequencers, web,

jboss, text extractors) are converted to use the new module, they should be moved

from the 'legacy' profile into the main set of modules in the top-level 'pom.xml'.

  1. … 447 more files in changeset.
MODE-869 Extended the JQOM interfaces (in the 'org.modeshape.jcr.api.query.qom' package) to enable creating subqueries, and implemented this in the JCR-SQL2 parser and factory classes.

All unit and integration tests pass.

git-svn-id: https://svn.jboss.org/repos/modeshape/branches/2.2.x@2229 76366958-4244-0410-ad5e-bbfabb93f86b

  1. … 4 more files in changeset.
MODE-869 Extended the JQOM interfaces (in the 'org.modeshape.jcr.api.query.qom' package) to enable creating subqueries, and implemented this in the JCR-SQL2 parser and factory classes.

All unit and integration tests pass.

git-svn-id: https://svn.jboss.org/repos/modeshape/trunk@2228 76366958-4244-0410-ad5e-bbfabb93f86b

  1. … 4 more files in changeset.
MODE-802, MODE-772 Applied the patch that fixes these issues. For an in-depth discussion of the changes, see the comments on MODE-802.

git-svn-id: https://svn.jboss.org/repos/modeshape/trunk@1917 76366958-4244-0410-ad5e-bbfabb93f86b

  1. … 25 more files in changeset.
MODE-768 Implemented the JCR2 Query Object Model interfaces (e.g., javax.jcr.query.qom) by extending the Graph API's Abstract Query Model classes. This required a number of changes to the AQM classes, including:

- Most of the getter methods were changed from the 'get<Name>()' form to just '<name>()', so that the new JCR QOM implementation subclasses can use the 'get<Name>()' form and return potentially different types than used by the AQM classes. For example, the AQM classes use a SelectorName representation (whereas the QOM uses a simple string), along with a number of enumerations (whereas the QOM uses simple strings).

- Most of the abstract classes in the AQM were transformed into interfaces. Without this, creating the QOM implementation classes would require multiple inheritance. With these changes, the QOM interface hierarchy, the QOM implementation class hierarchy, and the AQM class hierarchy all could be merged successfully.

- Because the ModeShape AQM classes are richer and have more capabilities, additional interfaces for these new components were added to the 'org.modeshape.jcr.query.qom' package in the 'modeshape-jcr-api' module, including an extension for the QueryObjectModelFactory, QueryObjectModelConstants, several new interfaces (e.g., Between, ArithmeticOperand, Limit, NodeDepth, NodePath, ReferenceValue) and several interfaces that represent a greater range of queries (e.g., QueryCommand, SelectQuery, SetQuery, and SetQueryObjectModel).
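
The reason for the getter renaming in the first item can be shown with a small Java sketch. Java cannot declare two methods in one class that differ only in return type, so an AQM accessor returning a rich type and a QOM accessor returning a String cannot both be called 'getSelectorName()'. All of the types below are hypothetical simplifications: QomSelector is only loosely modeled on javax.jcr.query.qom.Selector, and the rest are stand-ins invented for this example.

```java
// Hypothetical, simplified value type standing in for the AQM's SelectorName.
final class SelectorName {
    private final String name;

    SelectorName(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

// AQM-style interface: accessor without the 'get' prefix, returning the rich type.
interface AqmSelector {
    SelectorName selectorName();
}

// QOM-style interface: JavaBean-style accessor returning a plain String.
interface QomSelector {
    String getSelectorName();
}

// A single class can implement both interfaces because the method names differ.
// Had both been named 'getSelectorName()', the differing return types would not compile.
final class JcrSelector implements AqmSelector, QomSelector {
    private final SelectorName selectorName;

    JcrSelector(SelectorName selectorName) {
        this.selectorName = selectorName;
    }

    @Override
    public SelectorName selectorName() {
        return selectorName;
    }

    @Override
    public String getSelectorName() {
        return selectorName.name();
    }

    public static void main(String[] args) {
        JcrSelector selector = new JcrSelector(new SelectorName("nodes"));
        System.out.println(selector.getSelectorName());
    }
}
```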

Despite all of these changes, the grammars of all supported query languages, and the string representation of the AQM (which is equivalent to our extended JCR-SQL2 grammar) remain unchanged. Additionally, the new QOM implementation objects' string representation is simply that of the AQM (i.e., extended JCR-SQL2 grammar).

Finally, the javax.jcr.query.Query.JCR_JQOM query language was added to our JCR Repository implementation. Basically, this language's parser is a trivial subclass of our JCR-SQL2 parser. The JCR 2.0 specification does not explicitly define a textual query language or grammar for the Query Object Model (the grammar in the specification is written more in terms of the code of a Java-like language). However, per Section 6.9 of the JCR 2.0 specification, the QueryManager.createQuery(String, String) method:

'is used for languages that are string-based (i.e., most languages, such as JCR-SQL2) as well as for the string serializations of non-string-based languages (such as JCR-JQOM)'

At this point, all unit and integration tests pass.

git-svn-id: https://svn.jboss.org/repos/modeshape/trunk@1880 76366958-4244-0410-ad5e-bbfabb93f86b

    ./ArithmeticOperand.java (-0, +56)
    ./QueryObjectModelConstants.java (-0, +76)
    ./QueryObjectModelFactory.java (-0, +286)
    ./ReferenceValue.java (-0, +47)
    ./SetQueryObjectModel.java (-0, +61)
  1. … 151 more files in changeset.