An overview of the DFC Query Manager.
Blue Fish Development Group
701 Brazos St. #700
Austin, TX 78701
(512) 469-9300
The Query Manager in DFC is found in the package com.documentum.fc.client.qb. Its package documentation says simply “Provides interfaces to construct and run complex queries and SmartLists.” Now, most programmers who have worked in any way with DFC have at some point constructed and executed a query, without feeling an overwhelming need to be supported by a management infrastructure. So, since it clearly exists, what is it and what is it for?
Working with DFC can be a voyage of discovery, and for many the Query Manager may appear to be further than they need to travel. After all, DFC provides any number of facilities to compose, execute and analyse queries without complicating the situation by using some kind of ‘manager’ to control things. However, it would be a mistake to dismiss the Query Manager out of hand, since a little digging can unearth a number of useful facilities that could actually make things easier, not harder.
The first clue that the Query Manager may not be all that it seems comes from its package name. As noted, it is ‘qb’. Not ‘qm’, as might have been expected. ‘QB’ would be a more appropriate choice for a Query Builder package - and on closer inspection that is what a lot of the Query Manager is for - building queries. Managing queries, although important, is not the main purpose of the package.
Most programmers with some experience of DFC will have composed code along the following lines:
IDfQuery query = new DfQuery();
IDfCollection collection = null;
try {
    query.setDQL("select r_object_id from dm_sysobject where object_name = 'my_document' and folder('/Temp')");
    collection = query.execute(iDfSession, IDfQuery.DF_READ_QUERY);
    while (collection.next()) {
        IDfId id = collection.getId("r_object_id");
        // do something useful on the returned id
        process(id);
    }
} finally {
    if (collection != null) {
        collection.close();
        collection = null;
    }
}
There is nothing hidden, no complexity. For simple queries there is little to beat it. However, there is at least one point that might give pause for thought. That is, it is not possible to tell how much time may elapse while executing the query. Also, even if we know that it is going to be a long time, there is nothing that we can do to make use of that time. The code will block inside the call to query.execute, until all of the results have been returned.
A second point would be that the results of the query are consumed very close to the point where they are returned. If the underlying system actually required several modules to collaborate on the construction of the query and the analysis of the results, then the code would become a lot more complex. Certainly the method would need to be able to determine all of the modules that were interested in the results, which may not be easy to achieve, and that might not always be the same set of modules that built the query in the first place.
Following this line a bit further, the very nature of queries makes them hard to share. They are DQL statements, dressed up in an object-oriented framework. Now, to even a moderately experienced DQL user, the two queries:
select r_object_id from dm_sysobject
where object_name = 'my_document' and folder('/Temp')
and
select r_object_id from dm_sysobject
where folder('/Temp') and object_name = 'my_document'
are essentially the same. However, writing a program that will confirm this is quite hard. A more complex query, possibly featuring more than a single object type and a more complicated set of logical conditions, makes the task sufficiently difficult that alternative approaches may need to be adopted.
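To see how quickly string comparison runs out of road, here is a deliberately naive sketch in plain Java. The class and method names are hypothetical and nothing here is part of DFC. It copes with the two flat queries above by splitting the where clause on "and" and sorting the pieces, but it would be fooled by parentheses, "or", "between ... and ...", or a string literal that happens to contain the word "and".

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of DFC: compares flat "a and b and c" where clauses.
final class NaiveWhereComparer {

    // Split on the keyword "and", trim each condition, and sort them so that
    // the order in which the conditions were written no longer matters.
    static String canonicalize(String whereClause) {
        return Arrays.stream(whereClause.trim().split("(?i)\\s+and\\s+"))
                .map(String::trim)
                .sorted()
                .collect(Collectors.joining(" and "));
    }

    static boolean sameConditions(String first, String second) {
        return canonicalize(first).equals(canonicalize(second));
    }

    public static void main(String[] args) {
        String q1 = "object_name = 'my_document' and folder('/Temp')";
        String q2 = "folder('/Temp') and object_name = 'my_document'";
        System.out.println("same conditions: " + sameConditions(q1, q2));
    }
}
```

Anything beyond this toy case needs a real parser, which is the argument for holding the query in a structured form in the first place.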
Finally, many user interfaces are fashioned to present some pre-composed queries to the user that can then be customized according to the task that the user wishes to perform. Where the user is fluent in DQL this is not a problem - but this is not always the case. Further, these same user interfaces frequently want to allow the user to save their customized queries for subsequent reloading, additional customization and reuse. For applications with interface requirements along these lines, treating queries as simple strings is going to present problems.
The Query Manager, as previously observed, presents a substantial number of methods for building and examining queries. These facilities are tailored to situations where several independent modules are collaborating on the construction of a query. After all, when only a single module is interested in both the construction of the query and the analysis of the results, IDfQuery objects are more than satisfactory. So rather than handing around an IDfQuery object, an IDfQueryMgr object is used in its place. Each callee in the list is able to determine whether its own particular requirements for the query under construction have been met, and can make additions as necessary.
Also, each module that is interested in the returned result set can register its interest. By implementing the Observer pattern, the Query Manager can support any number of such modules without the need to construct any complex infrastructure. As is often the case with Observer patterns, the observer itself may be implemented as a separate thread, with the Query Manager handling all of the synchronization issues that are often time-consuming to get right.
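For readers who have not met the Observer pattern, the following is a minimal plain-Java sketch of the idea. ResultObserver and ResultBroadcaster are hypothetical names invented for illustration; they are not the Query Manager's own listener interfaces, and the object ids are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface; not a DFC type.
interface ResultObserver {
    void onResult(String objectId);
}

// Hypothetical fan-out helper: every registered observer sees every result.
final class ResultBroadcaster {
    // CopyOnWriteArrayList lets observers register while results are being published.
    private final List<ResultObserver> observers = new CopyOnWriteArrayList<>();

    void register(ResultObserver observer) {
        observers.add(observer);
    }

    void publish(String objectId) {
        for (ResultObserver observer : observers) {
            observer.onResult(objectId);
        }
    }

    public static void main(String[] args) {
        ResultBroadcaster broadcaster = new ResultBroadcaster();
        List<String> audit = new ArrayList<>();
        broadcaster.register(id -> System.out.println("display module saw " + id));
        broadcaster.register(audit::add);
        broadcaster.publish("made-up-id-1");
        System.out.println("audit module collected " + audit);
    }
}
```

The code that publishes results never needs to know which modules are listening, which is the decoupling the paragraph above describes.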
The Query Manager also provides facilities to avoid blocking within the execution of the query. This means that processing can continue, even in the same thread that is performing the query, while execution proceeds. If the results are distributed by means of Observers then there may not be any reason to poll the query to see whether it has completed.
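The general shape of "start the query, carry on, collect the results later" can be illustrated without DFC at all. In the sketch below, runBlockingQuery() is a hypothetical stand-in for a slow query and the ids are made up. CompletableFuture is a substitute technique chosen for the illustration, not a description of how the Query Manager is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class NonBlockingQuerySketch {

    // Stand-in for a slow, blocking query call; returns made-up ids.
    static List<String> runBlockingQuery() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Arrays.asList("made-up-id-1", "made-up-id-2");
    }

    // Hands the blocking call to a background thread and returns immediately.
    static CompletableFuture<List<String>> startQuery() {
        return CompletableFuture.supplyAsync(NonBlockingQuerySketch::runBlockingQuery);
    }

    public static void main(String[] args) {
        CompletableFuture<List<String>> pending = startQuery();
        System.out.println("query started; this thread is free to do other work");
        List<String> ids = pending.join(); // wait only when the results are needed
        System.out.println("results: " + ids);
    }
}
```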
The Query Manager even provides the facility to compose and execute a query that is addressed to more than one repository. This can allow the underlying distributed data model to be hidden from the upper layers of the application, without requiring too much fancy footwork at the actual point of implementation.
Lastly, the Query Manager allows queries to be stored in the local filesystem for subsequent reuse. This facility allows applications much more flexibility than would otherwise be available in presenting a personalized, customizable set of searches to the user.
The Query Manager is a useful tool; the issue is knowing when to use it. Clearly there is a cost associated with learning about its facilities and cutting over to using it. In addition, there is a recurring cost that can arise from using it for situations where it simply adds inefficiency. It has been known - in fact it was the motivation for this article - for programmers to use the Query Manager to construct a query, kick it off and then poll until the results were returned. This scenario was repeated several times per second. The users of the application were at a loss as to why performance was so poor.
The Query Manager places an abstraction on top of one of the most flexible objects in Java - the String. If something is expressible in DQL then it can be written into an IDfQuery object. However, it does not follow that an equivalent query can be built using the facilities provided by the Query Manager.
In addition, the Query Manager performs a number of
‘behind-the-scenes’ operations that can result in unexpected traffic
between the client and the repository. For example, the
setObjectType() method will verify that the specified object type is,
in fact, available in the targeted repositories. This information is
not cached - it is refetched for every Query Manager instance.
As always, a good guideline is to keep things as simple as possible. When deciding whether to use the Query Manager, consider whether the requirements of the application justify making use of the more heavyweight solution.
The Query Manager allows additional information to be communicated within the query, so modules that construct queries may provide display details to other modules whose function is to display the results. This is done within the Query Manager object. Attributes that are to be returned by the query are divided into two groups - displayed and hidden. Attributes that are configured to be displayable are accompanied by a ‘width’ parameter. This value (by convention) is used to inform the display module the field width that should be used for presenting the information to the user.
Attributes that are not necessarily for display are ‘hidden’. They may be recovered from the Query Manager object by using a different set of methods. This makes the task of synchronizing information across multiple, independent modules more straightforward.
While recognizing that queries can take many forms and perform many different activities, the large majority are of the common or garden ‘select’ variety. The Query Manager appears to address only this type of query.
Also, the Query Manager can be quirky to handle in that many of its methods do not return any indication of whether the operation was successful or not. Such “silent failures” can lead to a good deal of head-scratching.
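As an illustration of how a display module might honour the ‘width’ convention, here is a small plain-Java sketch. The Column class and formatRow() method are hypothetical and do not come from DFC. The sketch assumes every width is a positive number and simply pads or truncates each value to fit.

```java
import java.util.Arrays;
import java.util.List;

final class DisplayColumns {

    // Hypothetical pairing of an attribute name with its display width.
    static final class Column {
        final String name;
        final int width;

        Column(String name, int width) {
            this.name = name;
            this.width = width;
        }
    }

    // Pads or truncates each value to its column width, separated by one space.
    static String formatRow(List<Column> columns, List<String> values) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            int width = columns.get(i).width;
            String value = values.get(i);
            if (value.length() > width) {
                value = value.substring(0, width);
            }
            if (i > 0) {
                row.append(' ');
            }
            row.append(String.format("%-" + width + "s", value));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(new Column("object_name", 20), new Column("title", 10));
        System.out.println("[" + formatRow(columns, Arrays.asList("my_document", "Quarterly report")) + "]");
    }
}
```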
As is common within DFC, the objects that are handed around as IDfQueryMgr objects are actually interfaces. The concrete implementations are elsewhere in the package, in this case DfQueryMgr. So to instantiate a fresh object:
IDfQueryMgr queryManager = new DfQueryMgr();
The most essential component of the newly constructed object is the default session with which it is to be associated. This is attached:
queryManager.initialize(iDfSession);
A useful debugging tool when working with Query Manager objects is the
getDQL() method - this returns the DQL for the query as built so far.
The specification of the types of values to be returned by the query is controlled by inserting and removing attributes. There are two distinct kinds of attributes - display and hidden. These names - display and hidden - are assigned by Documentum. They may be usefully thought of as ‘markers’, rather than as defining the purpose of each attribute. There is nothing intrinsic in either type that makes it unsuitable for general purpose use. Each ‘marker’ has its own set of methods for administration.
So, to add a ‘display’ attribute:
queryManager.insertDisplayAttr(-1, "object_name", 20);
By convention, the value -1 indicates that the new attribute should be appended to the existing list. Since this is a display attribute, the method allows a display width value to be supplied. This value can be retrieved after the query has been executed.
Calling the getDQL() method at this point returns:
SELECT object_name AS "object_name" FROM dm_document
Note that the object type, dm_document, has been assigned by the system as a default value. It can be overridden in due course.
To add a ‘hidden’ attribute to the existing query:
queryManager.insertHiddenAttr(-1, "title");
Again, this value is appended to the list. Our query has now become (in DQL):
SELECT object_name AS "object_name", title as "title" FROM dm_document
Attributes may be removed from the query under construction. This is
done through the removeDisplayAttrs() and removeHiddenAttrs()
methods. These methods remove a range of attributes, allowing the
program to remove some or all of a particular kind of attribute in a
single operation. This involves determining the actual index that has
been assigned to an attribute - this can be done by iterating through
the set of attributes using the getDisplayAttrCount(),
getDisplayAttr(index) and similar methods.
Since removing an attribute may affect the index values of any other attributes in the set, it is safer to iterate through the set of attributes to determine the target indexes, then perform the removal.
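The index-shifting hazard is easy to reproduce with an ordinary list. The sketch below is plain Java over a List of attribute names, not the Query Manager API, and the class name is hypothetical. It collects the target indexes first, then removes from the highest index downwards so that no pending index is invalidated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class IndexedRemoval {

    // Removes every attribute named in 'unwanted' from 'attrs', in place.
    static void removeByName(List<String> attrs, List<String> unwanted) {
        // Pass 1: find the indexes without modifying the list.
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < attrs.size(); i++) {
            if (unwanted.contains(attrs.get(i))) {
                targets.add(i);
            }
        }
        // Pass 2: remove from the highest index down, so earlier removals
        // cannot shift the positions of the entries still to be removed.
        Collections.reverse(targets);
        for (int index : targets) {
            attrs.remove(index); // int argument: removes by position
        }
    }

    public static void main(String[] args) {
        List<String> attrs = new ArrayList<>(
                Arrays.asList("object_name", "title", "subject", "r_modify_date"));
        removeByName(attrs, Arrays.asList("title", "r_modify_date"));
        System.out.println(attrs);
    }
}
```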
Although the system has provided a default value for the query, it is better to explicitly specify the name of the type in which we are interested. [This also protects against the 'default' value ever changing, thereby potentially breaking the query.]
queryManager.setObjectType("dm_sysobject");
The query has now become:
SELECT object_name, title FROM dm_sysobject
The various components of a ‘where’ clause are controlled through IDfAttrLine objects. These objects contain (usually) an attribute, a relational operator and one or more values. They may also be joined together with logical operators to build up more complex conditions.
To make an AttrLine object, use the insertAttrLine() method. This
method takes three int parameters: type, index and group. ‘type’ is
ignored [per the documentation] and ‘group’ performs no apparent
function. ‘index’ uses the same convention as above, where ‘-1’
indicates “append” and the use of any other (positive) value relies on
knowing what has gone before.
So, we will take the safe route and append our attrLine:
IDfAttrLine line = queryMgr.insertAttrLine(-1, -1, -1);
Inspection of the query at this point shows that the Query Manager has made a modification on our behalf that was not necessarily expected:
SELECT object_name, title FROM dm_sysobject WHERE ((a_is_hidden = FALSE))
However, so long as this isn't going to significantly affect us, we can ignore it. What we were getting around to was restricting the values that are to be returned by the query, through the ‘where’ clause. So, we need to plug some values into the AttrLine object that we just created:
line.setAttr("r_modify_date");
line.setRelationalOp(IDfAttrLine.OPER_BETWEEN);
line.setValue("1/1/99");
line.setEndValue("1/1/00");
==> results in
SELECT object_name, title FROM dm_sysobject WHERE (((r_modify_date between date('1/1/99')
and date('1/1/00'))) and (a_is_hidden = FALSE))
This is a piece of shorthand implemented by the Query Manager to allow a range of values to be addressed. Of course, this could have been done longhand, by specifying two ‘lines’ and joining them together with a logical operator. As in:
IDfAttrLine attrLine = queryMgr.insertAttrLine(-1, -1, -1);
attrLine.setAttr("r_modify_date");
attrLine.setRelationalOp(IDfAttrLine.OPER_GREATEREQUAL);
attrLine.setValue("1/1/99");
attrLine.setLogicOp("and"); // glue the two lines together
IDfAttrLine attrLine2 = queryMgr.insertAttrLine(-1, -1, -1);
attrLine2.setAttr("r_modify_date");
attrLine2.setRelationalOp(IDfAttrLine.OPER_LESSEQUAL);
attrLine2.setValue("1/1/00");
==> results in
SELECT object_name, title FROM dm_sysobject WHERE (((r_modify_date >= date('1/1/99'))
and (r_modify_date <= date('1/1/00'))) and (a_is_hidden = FALSE))
In addition to being able to control the attribute value components of the where clause, the Query Manager also allows control of the ‘location’ components of the query. In fact, this is where some of the additional power of the Query Manager comes in, because control of location, achieved through IDfQueryLocation objects, can allow concurrent querying of multiple repositories.
IDfQueryLocation objects are created in the usual manner, by appending them to the existing query.
IDfQueryLocation location = queryMgr.insertLocation(-1); // append
location.setPath("/Temp");
==> results in
SELECT object_name, title FROM dm_sysobject WHERE (((r_modify_date >= date('1/1/99'))
and (r_modify_date <= date('1/1/00'))) and (a_is_hidden = FALSE)) AND (folder('/Temp', descend))
The query being constructed is becoming more complex. Interestingly, the Query Manager has attached the location specifier to the previous logical clause with the “and” operator. There does not appear to be any way of avoiding this.
If we were constructing a query that could indeed search two repositories concurrently, then we would append a second IDfQueryLocation object to the query and specify its docbase by means of a second IDfSession.
The startSearch() method initiates the query. In contrast to IDfQuery
operations, this is not a blocking call and so useful processing can
be performed while the execution of the query proceeds.
queryMgr.startSearch(); // starting search
If the results are awaited in the same thread, the query will probably
need to be polled to completion by using the isSearchFinished()
method.
boolean finished = queryMgr.isSearchFinished();
In addition, and again in contrast to IDfQuery operations, it is also
possible to curtail the execution of a query through the stopSearch()
method. This allows a user to intervene during excessively long
operations to stop them, rather than being forced to wait for a system
configured timeout to expire.
queryMgr.stopSearch();
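If polling is unavoidable, it is worth doing with a pause between checks and an upper limit on the wait. The sketch below shows one way to arrange that. AsyncSearch is a hypothetical interface that merely borrows the three method names used above, and FakeSearch is a stand-in that sleeps on a background thread; neither is a DFC class, and the timings are arbitrary.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical interface mirroring the three method names discussed above.
interface AsyncSearch {
    void startSearch();
    boolean isSearchFinished();
    void stopSearch();
}

// Stand-in search that "runs" by sleeping on a daemon thread.
final class FakeSearch implements AsyncSearch {
    private final long durationMillis;
    private final AtomicBoolean finished = new AtomicBoolean(false);
    private volatile Thread worker;

    FakeSearch(long durationMillis) {
        this.durationMillis = durationMillis;
    }

    @Override
    public void startSearch() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(durationMillis);
                finished.set(true);
            } catch (InterruptedException e) {
                // Stopped early: 'finished' stays false.
            }
        });
        t.setDaemon(true);
        worker = t;
        t.start();
    }

    @Override
    public boolean isSearchFinished() {
        return finished.get();
    }

    @Override
    public void stopSearch() {
        Thread t = worker;
        if (t != null) {
            t.interrupt();
        }
    }
}

final class SearchPoller {
    // Polls with a pause between checks; stops the search once the timeout passes.
    // Returns true if the search finished by itself, false if it was stopped.
    static boolean awaitOrStop(AsyncSearch search, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!search.isSearchFinished()) {
            if (System.nanoTime() - deadline >= 0) {
                search.stopSearch();
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                search.stopSearch();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AsyncSearch quick = new FakeSearch(100);
        quick.startSearch();
        System.out.println("finished in time: " + awaitOrStop(quick, 2000, 25));
    }
}
```

The poll interval is the knob that matters: a tight loop with no pause is what turns a convenience into a performance problem.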
Once the query has been executed its results are available for inspection. This is similar in style to the familiar methods used for IDfQuery operations. However, since the Query Manager can reduce the level of coupling between the construction and analysis of a query, it also allows the analysing code to determine precisely what was requested and what was returned.
The following code is a simple display routine, to separate the results into ‘display’ and ‘hidden’. Note that, to be strictly correct, the code should fetch all of the values as IDfValues and then make a determination as to their data type, and so how they should be displayed.
int count = queryMgr.getResultItemCount();
if (count > 0) {
    for (int i = 0; i < count; i++) {
        IDfQueryResultItem item = queryMgr.getResultItem(i);
        if (item != null) {
            System.out.println("docbaseName: " + item.getDocbaseName());
            IDfTypedObject typedObject = item.getTypedObject();
            // attributes marked for display
            int n = queryMgr.getDisplayAttrCount();
            for (int j = 0; j < n; j++) {
                String valueName = queryMgr.getDisplayAttr(j);
                System.out.println("display: " + valueName + ": " + typedObject.getString(valueName));
            }
            // attributes marked as hidden
            n = queryMgr.getHiddenAttrCount();
            for (int j = 0; j < n; j++) {
                String valueName = queryMgr.getHiddenAttr(j);
                System.out.println("hidden: " + valueName + " " + typedObject.getId(valueName).toString());
            }
        }
    }
}
Finally, having constructed this query it seems a shame to simply discard it. Instead we can preserve it on the local file system as follows:
queryMgr.save("/tmp/query");
This results in the serialized representation of the query being
written out to the file system. This representation can be
subsequently loaded and reused via the open() method.
The actual representation of the query in the file is:
[SmartList\Common]
Version=2.0
Information=
[SmartList\DQL]
DQLString=SELECT object_name, title FROM dm_sysobject WHERE (((r_modify_date >= date('1/1/99'))
and (r_modify_date <= date('1/1/00'))) and (a_is_hidden = FALSE)) AND (folder('/Temp', descend))
[SmartList\DisplayAttribute]
NumOfDisplays=1
DisplayAttr%%1=object_name
Width%%1=21
Visible%%1=T
[SmartList\SearchType]
ObjectType=dm_sysobject
TypeDocbase=jes_dev
[SmartList\Flags]
FindAllVersions=F
FindHiddenObjects=F
IgnoreCase=F
QueryType=14
QueryKind=1
QueryTitle=
[SmartList\WhereClause\Group0]
AndOr=and
AttrNum=1,2,
[SmartList\WhereClause]
AndOr%%1=and
Attr%%1=r_modify_date
Op%%1=>=
Value%%1=1/1/99
EscapeChar%%1=\
Visible%%1=T
EndValue%%1=
AndOr%%2=and
Attr%%2=r_modify_date
Op%%2=<=
Value%%2=1/1/00
EscapeChar%%2=\
Visible%%2=T
EndValue%%2=
NumOfExprs=2
[SmartList\Location]
NumOfLocs=1
Path=/Temp
Descend=T
Type%%1=DC_PATH
Docbase%%1=
Path%%1=/Temp
Descend%%1=T
RootVersion%%1=
FloatingVersion%%1=
Editable%%1=T
ObjectID%%1=
Username%%1=
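Since the saved file is plain text in an INI-like layout, it can be inspected with very little code. The sketch below is a hypothetical reader written for illustration, not the Query Manager's own open() logic. It assumes that every entry occupies a single line (so the DQLString value, which is wrapped in the listing above, is taken to be on one line in the real file) and that the first ‘=’ on a line separates key from value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SmartListFileReader {

    // Returns section name -> (key -> value), preserving the order in the file.
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        Map<String, String> current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // Section header such as [SmartList\Flags]
                String name = line.substring(1, line.length() - 1);
                current = sections.computeIfAbsent(name, k -> new LinkedHashMap<>());
            } else if (current != null) {
                // Split on the FIRST '=' only, so "Op%%1=>=" yields the value ">=".
                int eq = line.indexOf('=');
                if (eq > 0) {
                    current.put(line.substring(0, eq), line.substring(eq + 1));
                }
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String sample = "[SmartList\\SearchType]\nObjectType=dm_sysobject\n"
                + "[SmartList\\WhereClause]\nAttr%%1=r_modify_date\nOp%%1=>=\nValue%%1=1/1/99\n";
        System.out.println(parse(sample));
    }
}
```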