In this chapter, we highlight a number of FairCom DB features. These features are available in all FairCom DB database engine models unless noted otherwise.
Variable-Length Records
FairCom DB allows variable-length records, which are useful for storing data that varies in length. XML and JSON are typical text examples; images and PDFs are frequent types of binary documents stored in variable-length files. Any type of data that is not of a predetermined fixed size can be stored in the variable fields of a record. Variable-length records are limited to 2 GB, as the length is represented by the signed 4-byte varlen parameter. Adding and retrieving data is somewhat different in variable-length records, as described in "Fixed versus Variable-length Records".
Note that variable-length records can still contain a fixed-length portion. Specific variable-length API calls are used with variable-length records and are denoted with a V in the name to distinguish them from the corresponding fixed-length record functions: for example, AddRecord() for fixed-length records and AddVRecord() for variable-length records. However, fixed-length API calls can be used with variable-length records; only the fixed-length portion is returned in this case. This is a useful technique for performance, as fixed-length handling is, in general, more efficient than variable-length handling. Simply re-read the record if you need to access the variable-length portion. The opposite is not true: calling a variable-length function on a fixed-length file results in FMOD_ERR (48, data file is not a variable-length file).
Hint: The ReWritePartialRecord() API call is useful to avoid writing the entire variable-length portion of the record.
Each variable-length record is preceded by a 10-byte variable-length record header consisting of the following portions:
- 2-byte record mark (0xFAFA or 0xFDFD)
- 4-byte total length
- 4-byte utilized length
The record mark is 0xFAFA for an active variable-length record. Deleted variable-length records are marked 0xFDFD.
The total length is the space used by the header, the actual data image, and any extra space not currently used by the record. The utilized length is the length of the actual data, not including the header or any extra space. In a properly configured file, moving from the first byte of the record header by the total length should place you on the first byte of the next record in the file.
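As a sketch of this layout, the following C fragment parses a 10-byte header from an in-memory image and steps to the next record by the total length. The struct and function names are ours for illustration, not part of the c-tree API, and field byte order in an actual file follows the platform that wrote it; here we simply read native-order integers.

```c
#include <string.h>
#include <stdint.h>

/* Illustrative view of the 10-byte variable-length record header. */
typedef struct {
    uint16_t mark;      /* 0xFAFA active, 0xFDFD deleted, 0xFEFE resource */
    uint32_t total_len; /* header + data image + unused extra space */
    uint32_t used_len;  /* actual data only, excluding header and extra space */
} VarRecHeader;

/* Parse a header from a byte image; returns 0 on success, -1 on an
   unrecognized record mark. */
int parse_varrec_header(const unsigned char *p, VarRecHeader *h)
{
    memcpy(&h->mark, p, 2);
    memcpy(&h->total_len, p + 2, 4);
    memcpy(&h->used_len, p + 6, 4);
    if (h->mark != 0xFAFA && h->mark != 0xFDFD && h->mark != 0xFEFE)
        return -1;
    return 0;
}

/* Offset of the next record header: advance by the total length. */
uint32_t next_record_offset(uint32_t cur, const VarRecHeader *h)
{
    return cur + h->total_len;
}
```

In a well-formed file, repeatedly applying next_record_offset() walks the records front to back, which is exactly the invariant described above.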
Variable-length records are enabled with a filmod value of ctVLENGTH (0x0004) when creating a file. Index files CANNOT be created with a ctVLENGTH mode.
Deleted Space Management
Deleted space management is different in variable-length files compared to fixed length files. In variable-length files a space management index is embedded into the file and used to keep track of deleted space. With the FairCom DB server technology, a separate thread is used within the server to maintain this index.
A RECBYT index can improve the usage of this special index. Note that if the file is rebuilt, the space management index is invalidated under specific conditions, and the old index continues to consume file space. If this is a problem for larger files, consider compacting the file.
FYI - Resources are specialized types of variable-length records, beginning with 0xFEFE as the 2-byte record mark.
Resources
There are times that you may want to attach auxiliary information to a particular data file that is not easy to integrate into the structure of that file. This could be a file version number, special flags relating to the status of the file, and so forth. Generally, this is information that is not repeated in each record of the file.
You could create a special record, with a special key value, that you do not process as you do your regular records. However, this forces you to handle this record, and its key, as an exception to the regular records, placing a burden upon the programmer.
c-tree provides a special feature to handle this sort of information, called a Resource. Resources are special variable-length records stored in your file, whether you use fixed- or variable-length records, accessed via a special set of functions. These records do not require a key in an index, therefore your program will not access them if you are working with an index or using the ISAM functions.
Resources are an important part of the advanced features of c-tree. FairCom-defined resources are critical to the use of the c-tree Drivers, Incremental ISAM functionality, conditional index support, c-treeSQL, and c-treeDB support, and they will continue to be important as new technology is added in the future. These resources are added automatically by some features or manually by the developer, and require the RESOURCES define in the c-tree library, which is the default. This section focuses on user-defined resources added by you, the developer, but provides some background information on the uses of resources.
Resources and Files
There are three general types of resources that can be attached to a file:
- User defined resources: Information that you wish to store in the file, such as a file version number, or an infrequently accessed counter. Use resources to store information associated with a file that varies from the type of information stored repetitively in the data records.
- FairCom defined resources: There is a variety of information that, under certain circumstances, FairCom wishes to store in the file. This can be information relating to IFIL structures, alternate collating sequences, special characters for key padding, and so forth. Usually you do not access this information directly. It is available to a variety of c-tree Plus functions that use it.
- Third party resources: As other developers create utilities integrating with c-tree Plus, FairCom assigns resource identifiers when necessary.
Resources can be turned off when you create a data or index file. Generally you will not do this, as a number of c-tree features will not be available for this file.
Resource Identification
Within a given data file, a Resource is identified by three elements:
- Resource Type (unsigned long integer),
- Resource Number (unsigned long integer),
- Resource Name (null terminated string).
Within each file, you can identify a Resource by its unique combination of Type and Number, or by its Name.
Note: The Resource Name is not guaranteed to be unique.
Resource Type
Resource Type gathers Resources into related groups. Resource Types in the range of 1 to 127 are reserved for use by FairCom. These are special Resources used by the FairCom Server and d-tree. Resource Types in the range of 128 to 65536 are also reserved. These are assigned to third-party developers by FairCom. Resource Types from 65537 and above are available to the programmer to be assigned as desired.
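The three ranges above can be summarized in a small helper. This is purely illustrative; the enum and function names are ours, not part of the c-tree API.

```c
/* Classify a Resource Type per the documented ranges. */
typedef enum { RES_FAIRCOM, RES_THIRD_PARTY, RES_USER } ResTypeClass;

ResTypeClass classify_resource_type(unsigned long type)
{
    if (type <= 127)   return RES_FAIRCOM;     /* 1-127: reserved by FairCom */
    if (type <= 65536) return RES_THIRD_PARTY; /* 128-65536: assigned to third parties */
    return RES_USER;                           /* 65537+: available to your application */
}
```

For example, the type 0x10001 (65537) used in the Resource example later in this section falls in the application-defined range.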
Resource Number
There are no restrictions on assigning Resource Numbers. They separate various Resources within a given Resource Type. When adding a Resource to a file, c-tree can assign the next available value for this Resource Type in the referenced file.
Resource Name
You can use the Resource Name as a way to access Resources in a file instead of using the Type and Number. You may find it easier to understand your code if you use a symbolic name rather than a pair of numbers. If you are adding a number of Resources to a data file over a period of time you may not know what Resource Number is being used, particularly since you can have c-tree assign the next available Resource Number. Access via Resource Name can be simpler than trying to keep track of what Number your Resource has been assigned.
On the other hand, be careful in assigning Resource Names. c-tree cannot guarantee that each Resource Name is unique within a file. It is possible to add a Resource to a file that has a Name already used in this file.
Resource Names are optional. You do not have to assign one to a Resource. The Resource Name is a null terminated character string of any length, although we recommend that you do not make them too long. Names starting with FC! are reserved for use by FairCom. Names starting with RD! are reserved for Resources assigned to third-party developers by FairCom.
Disabling Resources
Resources are enabled unless you specify otherwise when the file is created. The only situation where you may want to disable Resources is if you do extensive processing of a data file in reverse physical order. With Resources enabled, you can no longer use LastRecord() or PreviousRecord() to traverse the file in reverse physical order. You can still use these functions to process the file in reverse index order, though. Resources introduce special variable-length records into the data file, so traversing in reverse by physical position is no longer possible, just as with variable-length record files.
Disable Resources for a data file by using the ctDISABLERES file mode at file creation. It is possible to enable Resources for a file later with EnableCtResource(), as described later. To totally disable Resources for all files, change #define RESOURCE to #define NO_RESOURCE in ctoptn.h/ctree.mak. See “Conceptualizing a c-tree Application” in the Quick Start and Product Overview Guide for more information about setting configuration defines.
Resource Functions
Five functions are provided to manage Resources (see c-tree Function Descriptions for declarations):
- AddCtResource(): Add Resource to file.
- UpdateCtResource(): Update Resource.
- DeleteCtResource(): Delete Resource from file.
- GetCtResource(): Retrieve Resource from file.
- EnableCtResource(): Enable Resources for a file.
Each of these functions has as a parameter a pointer to a Resource Data Block. This is an area you create containing the description of the Resource, as follows:
| Byte Offset | Data Type | Resource Elements |
|---|---|---|
| 0 - 3 | Unsigned long integer. | Resource Type |
| 4 - 7 | Unsigned long integer. | Resource Number |
| 8 - m | null terminated character string. | Resource Name |
| m+1 - n | Any collection of data types desired. | Resource Information |
The Resource Information can be any collection of data fields that you desire. You are not limited to a null terminated character string.
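The following sketch packs a Resource Data Block according to the table above. It is a hypothetical helper, not a c-tree function; it assumes 4-byte unsigned integers for Type and Number, as the byte offsets in the table imply.

```c
#include <string.h>
#include <stdint.h>

/* Build a Resource Data Block image: bytes 0-3 Type, 4-7 Number,
   then the null-terminated Name, then the Resource Information.
   Returns the total image length, or 0 if the buffer is too small. */
size_t build_res_block(unsigned char *buf, size_t bufsiz,
                       uint32_t type, uint32_t number,
                       const char *name, const void *info, size_t infolen)
{
    size_t namelen = strlen(name) + 1;  /* include the terminating null */
    size_t total = 8 + namelen + infolen;
    if (total > bufsiz)
        return 0;
    memcpy(buf, &type, 4);
    memcpy(buf + 4, &number, 4);
    memcpy(buf + 8, name, namelen);
    memcpy(buf + 8 + namelen, info, infolen);
    return total;
}
```

The returned length is also the value you would pass as the data block size when adding the Resource, matching the 8 + name + data arithmetic used in the Resource example later in this section.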
AddCtResource
Add a new Resource to a file. The combination of Resource Type and Resource Number must be unique to this file, or an error is returned. c-tree does not check to see if the Resource Name is in use yet. As mentioned, Resource Name is optional.
When passed a Resource Number of 0xffffffffL, c-tree assigns the next available Number for this Resource Type in this file. The assigned Number is placed in the appropriate field in the Resource Data Block.
When adding a Resource to a file, a special variable-length record is written to the file, containing the information from the Resource Data Block. This is done even if a data file uses fixed-length records. Now the information is available to any user accessing the file.
DeleteCtResource
Delete a resource from a given file. Since Resource Names are not guaranteed to be unique, the proper Resource Number and Type are required to delete a Resource. If the Resource record was locked via UpdateCtResource(), the lock is released.
UpdateCtResource
Update the information in a Resource. The Resource Type and Number is used to select a Resource, and then the new Resource Data Block is placed in the file, replacing the prior information.
Note: UpdateCtResource() automatically tries for a low-level record lock on the Resource record, unless the record is already locked via GetCtResource(). If the record cannot be locked, an error value is returned. The lock is automatically released when UpdateCtResource() is finished.
GetCtResource
Retrieves a Resource. The value returned by the function is the record address of the Resource record. The Resource Data Block is written to a buffer that you specify.
When retrieving a Resource, use one of the following resource modes, resmode, values:
- RES_FIRST: Finds the Resource with the lowest available Type and Number.
- RES_LENGTH: This mode must be OR-ed with a Resource retrieval mode such as RES_FIRST or RES_NAME. This mode returns three 4-byte integers: resource type, resource number, and resource length.
- RES_NAME: Finds the first Resource with a matching Name (c-tree cannot guarantee unique Resource Names). resptr points to a buffer containing the Name for the search.
- RES_NEXT: This mode allows the next physical resource to be retrieved from the specified file. Seed this mode with a resource type and resource number to receive the next greater resource.
- RES_POS: Reads the Resource at a given record address obtained in a prior call to UpdateCtResource(). resptr points to a long integer with the record address to be read.
- RES_TYPE: Finds the Resource with the matching Type and lowest available Number. resptr points to the resource Type.
- RES_TYPNUM: Finds the Resource with the matching Type and Number. resptr points to a resource Type followed by a resource Number.
In addition you can OR the resmode with the value RES_LOCK. This makes GetCtResource() ask for a low-level exclusive lock on the matching Resource record. This lock is released by UpdateCtResource() or DeleteCtResource() for the same Resource, or with LockCtData(). It is important to note that Resource locks are NOT released by explicit or implicit calls to LockISAM().
EnableCtResource
Enable Resources in a file created with Resources disabled. First open the file with a ctEXCLUSIVE file mode, then call EnableCtResource().
Resource Example
The following example demonstrates how to add a Resource, and then retrieve it.
struct { /* Resource Data Block structure */
ULONG resource_type;
ULONG resource_number;
TEXT variable_info[1016]; /* This holds the Resource Name and Data */
} my_resource;
my_resource.resource_type = 0x10001L; /* 65537*/
my_resource.resource_number = 0xffffffffL; /* c-tree assign # */
strcpy(my_resource.variable_info,"MY!resource");
strcpy(my_resource.variable_info+12,"Actual Resource Data");
if (AddCtResource(datno, &my_resource,
        (VRLEN)(8 + 12 + strlen(my_resource.variable_info+12))) == 0) {
    printf("\nThe resource has been assigned number %ld",
        my_resource.resource_number);
    printf("\nThe resource is located at byte offset %ld.",
        GetCtResource(datno, "MY!resource", &my_resource, 1024L, RES_NAME));
} else
    printf("\nCould not add resource. Error #%d", uerr_cod);
FairCom Defined Resources
FairCom uses resources to store file definitions in the data and index files. Generally, you will not refer directly to these resources. They are used by c-tree functions.
Note: If resources are disabled for a particular file, the following features cannot be used with that file.
IFIL Information
Incremental ISAM structure (IFIL) information is stored as a Resource if a file is created with CreateIFile(). This information is used by OpenFileWithResource(), CloseRFile(), GetIFile(), and PermIIndex().
DODA and Record Schema
DODA (Data Object Definition Array) information, such as is used in FairCom’s r-tree product, can be stored in a file as a Resource with the PutDODA() function. GetDODA() can be used to retrieve this information. For more information, refer to Record Schemas.
File Security
The c-tree file security system depends on resources. If you have disabled resources for a given data or index file, you will not be able to use the security system with that file.
Padding and Delimiter Values
If you change the padding or key delimiter value for an index with SetVariableBytes(), the information is stored in the index as a resource.
Alternate Collating Sequence (ACS)
As described in ISAM Functions, you can specify an alternate collating sequence for an index. This information is stored as a resource in the index file.
Resource Locks
c-tree assumes resource entries will be locked only for short intervals. This means you should process resource updates without permitting user interactions to occur during the actual update. Do not lock a resource and then request user input. If necessary, adopt a three buffer approach, as described in Multi-User Concepts, to permit a resource update requiring user interaction.
If a file is created with transaction control (either ctPREIMG or ctTRNLOG OR-ed into the file mode at create time), be careful not to include resource updates (additions, deletions, or updates) in long transactions. Locks cannot be released until a transaction commits or aborts.
Superfiles
When working on complex applications, you may find that you are creating an overwhelming number of data and index files. This proliferation of files can create problems for the operating environment and the user. Many operating environments place a limit on the number of files open at any given time. c-tree allows you to get around this by using the concept of virtual files, but the performance overhead this creates can be a problem in some cases. In addition, by having a large number of separate data and index files, you complicate the user’s backup procedure, and increase the risk of getting various files and indexes out of synchronization.
To help manage this problem, c-tree offers Superfiles. A Superfile is one physical file containing any number of logical data and index files. This limits the number of physical files open, and allows you to move or copy a group of files as a single unit.
The two major advantages of using superfiles are reduced directory clutter and reduced system resource demand. Since a superfile is a container file it acts like a drawer in a file cabinet that lets you hide all the individual files. Because it is a single file, only one system file descriptor and associated control block are needed, thereby reducing demand on system resources.
Superfile Members/Host
A c-tree file that is physically stored in a superfile is referred to as a superfile member. The file name for the member must be formed as follows:
<name of superfile>!<name of member>
Superfile members are created and accessed by prefixing the member file name with the superfile host name and the '!' name separator. For example, with a superfile named host.dat, to create a member file named member.dat, use the name host.dat!member.dat to create or open the member.
Note: For Windows programmers, the superfile member names are case sensitive unlike the file system or file names.
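A minimal sketch of forming a member name follows; member_name() is a hypothetical helper, not a c-tree function. Note that, per the caution above, the host and member strings are compared case sensitively by c-tree even on Windows.

```c
#include <stdio.h>
#include <string.h>

/* Form "<host>!<member>" into out; returns 0 on success,
   -1 if the combined name does not fit in the buffer. */
int member_name(char *out, size_t outsiz, const char *host, const char *member)
{
    int n = snprintf(out, outsiz, "%s!%s", host, member);
    return (n < 0 || (size_t)n >= outsiz) ? -1 : 0;
}
```

The resulting string is what you would pass to the standard create or open calls described below.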
Superfile members are created with the standard low level and/or ISAM level create and open calls, including the extended create and open calls.
To create a superfile host, simply OR ctSUPERFILE into the file mode at create time. You can create the superfile host at either the low level or ISAM level, but it is best to use CreateIFile().
When opening a superfile host, it uses two c-tree file numbers:
- One file number for the superfile
- One file number for the member index
Therefore, remember to allow for these two file numbers in your application.
Creating/Opening Superfiles and Members/Host
Superfiles are created with the standard CreateDataFile() function, and opened with the standard OpenCtFile() function. You will OR in the value ctSUPERFILE as a part of the file mode parameter.
c-tree automatically opens the superfile host if a superfile member is opened before the host. However, if c-tree is allowed to open the superfile host automatically, you will not be able to explicitly close it; instead, close the superfile host with CloseISAM() or StopUser().
Note: Unlike standard index files that can be opened with an active PAGE_SIZE larger than the one with which they were created, you must open a superfile with exactly the same PAGE_SIZE that was used to create it.
Closing Superfiles and Members
If you are using low-level file closes, (CloseCtFile()), it is very important that you close all superfile members before you close the superfile itself. If you close the superfile before all members are closed, c-tree will close the members of that superfile, but subsequent open operations during the same program execution may report unexpected FUSE_ERR errors.
If you are using ISAM closes, (CloseISAM()), c-tree automatically closes the members of superfiles before the superfile itself is closed.
Restrictions/Considerations
In general, after you have created/opened a superfile member you do not have to change how you process the files. There are, however, some restrictions and/or considerations when using superfiles.
Nested Superfiles
A superfile cannot contain another superfile.
Physical Order
Since multiple data and index files are stored in a superfile, the concept of processing a file in physical order no longer applies. Records from different files may be intermixed.
When called with a data file number, LastRecord() and PreviousRecord() return an error code. FirstRecord() and NextRecord() traverse all of the records, but the records are not guaranteed to be visited in increasing record offset order.
All of these functions continue to work as expected with an index file number.
Node Size
When a superfile is created, the current size of index nodes is a critical parameter.
An index node is equivalent to a record of the index file. On non-server systems, the index node size is controlled by the sect parameter in calls to InitCTree() and InitISAM(), or by the third parameter in an ISAM parameter file. On Server systems, the node size is controlled by the Server configuration, which defaults to 64 (8192 bytes).
When a superfile is opened, the index node size currently in effect must be the same as the index node size at the time the superfile was created. This is not usually a problem unless one wishes to move the file between different environments. Error SPAG_ERR (417) results if the node size does not match. By contrast, an ordinary index file can be opened as long as the current node size is not smaller than the node size at the time the index file was created.
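The sect-to-byte relationship implied above (a default sect of 64 yielding 8192 bytes) suggests a 128-byte sector unit. The helper below encodes that inference; it is our arithmetic sketch, not a c-tree function, so verify the unit against your c-tree headers.

```c
/* Node size in bytes from the sect parameter, assuming the 128-byte
   sector unit implied by the documented default (64 -> 8192 bytes). */
unsigned node_size_bytes(unsigned sect)
{
    return sect * 128u;
}
```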
Detect if Superfile Member Names are too Long
The length of a FairCom DB Superfile member name is restricted because the Superfile directory index has a restricted key length. A newer default behavior checks whether the member name would be truncated due to a small page size: if the Superfile member name would be truncated in the Superfile member directory index, the create of the member fails with SNAM_ERR (418). Without this behavior, member names truncated to the same string would cause DOPN_ERR (19) and/or KOPN_ERR (18) errors.
To address the situation where names may have been truncated without conflict, but would now fail with SNAM_ERR at create time, a provision has been made to revert to the original behavior. For standalone applications, the configuration option ctBEHAV_CHKMBRNAMLEN is on by default; it can be turned off at compile time by defining NO_ctBEHAV_CHKMBRNAMLEN.
When using the FairCom DB server, the configuration keyword COMPATIBILITY NO_CHKMBRNAMLEN has been introduced to revert to the original behavior.
Temporary Files
You should not use superfile members for temporary files or any other files that are frequently deleted. Repeated file creations and deletions will adversely affect the superfile space management since the space from a deleted file cannot be reused.
Rebuild
Rebuilding superfiles when c-tree is linked with the standalone multi-user FPUTFGET library is not supported; ctsbld enforces this with SBLF_ERR (512).
Be sure all files are closed prior to attempting to rebuild a superfile. The superfile pre-pass utility and function, ctsbld and SuperfilePrepassXtd(), and the rebuild function RebuildIFile(), open the files in exclusive mode and close the files upon completion.
The ctscmp utility can be used to compact a superfile and rebuild the indices in a single operation. However, you must have disk space for a copy of the superfile in order to do this.
Single System File Descriptor
You may see reduced file I/O performance because all the I/O for all the members is funneled through a single system file descriptor.
Risks
An embedded index is used to locate the individual members in a superfile. Because a single superfile contains multiple data and index files, damage to that one file puts all of them at risk. Superfiles are useful in the proper circumstances (i.e., if insufficient file descriptors are available, or if it is important to have just one physical file); however, file recovery is more difficult. This should be factored into the decision of whether or not to use superfiles.
In addition, because superfile members jump over one another as they expand, superfiles can become fragmented, and therefore may experience a higher risk of performance decrease and file size increase.
Batched Operations
To improve performance in situations in which a group of related records are to be operated on, c-tree offers the DoBatch() function. Currently this function provides the following capabilities:
- Retrieve: All records with key matching the partial key request are loaded into a buffer region. If there are more records than fit in the buffer, those that fit are loaded, and subsequent calls to DoBatch() can retrieve the rest.
- Delete: All records with a key matching the partial key request are deleted.
- Insert: (DoBatchXtd) All records with a key matching the partial key request are inserted into a c-tree data file.
DoBatch() is an ISAM-level Set function. Specify an index file to search and a partial key value for the search. DoBatch() attempts the specified operation on all records with keys matching the partial key value.
DoBatchXtd() is an enhanced version of the batch function, offering additional options such as batch record inserts, updates, index range retrievals, and batch reads in physical order.
Batch Parameters
The parameters for DoBatch() are:
- filno: the index file number to search,
- request: a pointer to a Partial Key Request structure, as defined later, defining the set of records,
- bufptr: a pointer to a buffer region into which the group of related records will be retrieved, if specified,
- mode: a value that specifies the type of action to be taken, together with a number of options.
DoBatchXtd() provides additional batch functionality. This function takes the same parameters as DoBatch except that the mode parameter is a ULONG instead of a UCOUNT. The function prototype is shown below:
ctCONV COUNT ctDECL DoBatchXtd(FILNO filno, pVOID request, pVOID bufptr, VRLEN bufsiz, ULONG mode);
Partial Key Request Structure
One of the parameters that is passed to DoBatch() is the Partial Key Request structure, PKEYREQ. This structure is used to specify which set to perform the batch operation on, and to return some statistics on that set. The PKEYREQ typedef is defined as:
typedef struct pkeyreq {
LONG btotal; /* total entries in set */
LONG bavail; /* # of entries locked */
LONG breturn; /* # of objects in buffer */
COUNT siglen; /* significant length */
TEXT target[MAXLEN]; /* partial key target */
} PKEYREQ;
When you call DoBatch() you pass a pointer to a PKEYREQ structure. The only fields that must be initialized are: target, containing the portion of a properly formed key to match, and siglen, specifying the significant number of bytes in the key. All keys in the appropriate index matching the first siglen bytes of target are processed.
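As a sketch of that initialization, the fragment below fills in only target and siglen and zeroes the rest. The struct mirrors the PKEYREQ typedef above using plain C types; MAXLEN is an assumed placeholder here, as the real value comes from the c-tree headers.

```c
#include <string.h>

#define MAXLEN 256  /* assumption: the actual value is defined by c-tree */

typedef struct pkeyreq {
    long  btotal;          /* total entries in set */
    long  bavail;          /* # of entries locked */
    long  breturn;         /* # of objects in buffer */
    short siglen;          /* significant length */
    char  target[MAXLEN];  /* partial key target */
} PKEYREQ;

/* Initialize only the fields DoBatch() requires on input. */
void init_pkeyreq(PKEYREQ *req, const void *partial_key, short siglen)
{
    memset(req, 0, sizeof *req);            /* clear result fields */
    memcpy(req->target, partial_key, (size_t)siglen);
    req->siglen = siglen;
}
```

After the call returns, btotal, bavail, and breturn hold the results described below.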
After DoBatch() has been called, check the values in the PKEYREQ structure for the results. btotal tells how many records match the partial key target. If locks were requested on the records, bavail contains the number of records locked successfully. breturn specifies the number of records that fit in the buffer region.
It may be necessary to make a series of DoBatch() calls to process a batch. The first call specifies the batch and returns information on the number of records in the batch. When reading the records in the batch into the buffer area, if the buffer is not large enough to hold all of the records, make additional calls to process the rest of the records. Finally, make a last call to close or cancel the batch, in preparation for the next call.
Multiple Batches
Multiple batches can be active at the same time. To do this, it is necessary to call ChangeBatch() before calling DoBatch(). For more about using multiple batches, see ChangeBatch.
First Call
On the first call to DoBatch(), the mode parameter must be one of the following:
- BAT_GET: Retrieve matching records.
- BAT_DEL: Delete matching records.
In addition, you can OR in one of the constants from any, or all, of the optional groups listed in the following topics:
Verify Option
BAT_VERIFY
Verify that the keys and records match. If a record is selected based on its key value in the index, the data record will be read and the key recreated from the information in the record, to double check that the key is accurate. This prevents a record from being selected if it is in the process of being updated, and the index value has not yet been rewritten. BAT_VERIFY is required to apply an active record filter to the batch results, otherwise active filters are ignored.
Default
The default is to have no verification.
Result Option
BAT_RET_REC
Return the actual records to the buffer. As many records as fit are placed in the buffer region, and the number actually read is in breturn.
Fixed-length records are stored with the 4-byte record offset first, followed by the actual data record. Variable-length records are stored with the 4-byte record offset first, then the 4-byte record length, followed by the actual data record. It is up to the programmer to use the buffer correctly.
Note: When using Huge Files, BAT_RET_REC mode returns an 8-byte record offset preceding the record body for each record returned in the batch buffer. For non-Huge Files, a 4-byte record offset is returned instead.
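The variable-length buffer layout described above can be walked with a fragment like the following. This is an illustrative helper, not a c-tree function; it assumes the non-Huge File 4-byte record offsets and no padding between entries.

```c
#include <string.h>
#include <stdint.h>

/* Step over one BAT_RET_REC entry for a variable-length file:
   4-byte record offset, 4-byte record length, then the data.
   Outputs the offset, length, and data pointer; returns a pointer
   to the next entry in the buffer. */
const unsigned char *next_batch_entry(const unsigned char *p,
                                      uint32_t *recofs, uint32_t *reclen,
                                      const unsigned char **data)
{
    memcpy(recofs, p, 4);
    memcpy(reclen, p + 4, 4);
    *data = p + 8;
    return p + 8 + *reclen;
}
```

Looping this function breturn times consumes exactly the records placed in the buffer by one DoBatch() call.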
BAT_RET_POS
With this option, only the 4-byte record offsets of the matching records are placed in the buffer. This allows you to do the actual reading of the records when you want.
BAT_RET_FIX
Operates in the same manner as BAT_RET_REC except that only the fixed length portion of a variable-length data record is returned. If BAT_RET_FIX is used for a fixed length data file the results will be the same as using BAT_RET_REC. When BAT_RET_FIX is called for a variable-length file, it returns information in the same manner as BAT_RET_REC called for a variable-length file except that only the fixed length portion is returned.
The form of the information returned is the same: record position, (entire) record size, fixed length portion. The size information provides the total record size of the variable-length record (even though only the fixed length portion is returned). Nowhere in the output is the fixed length size explicitly returned. In this, it is the same as when BAT_RET_REC is called for a fixed length file. It is assumed the application has knowledge of the fixed length size, say through a call to GetCtFileInfo(datno,RECLEN).
Default
The default is for the buffer not to be used. Only the first three fields of the PKEYREQ structure, (btotal, bavail, and breturn), are updated.
Lock Option
Note: Since this is an ISAM function, all locks can be released with LockISAM(ctFREE).
BAT_LOK_RED
With this option, a read lock will be placed on each record. This lock prevents other users from acquiring a write (update) lock on this record. If a lock cannot be obtained on a particular record then it will not be read. The PKEYREQ value bavail specifies the number of records that have been locked successfully. If bavail is less than btotal, not all records were locked.
BAT_LOK_WRT
Similar to BAT_LOK_RED, except that a write (update) lock will be acquired.
Default
No locks will be acquired.
Completion Option
BAT_COMPLETE
When BAT_COMPLETE is specified, DoBatch() returns a successful completion value only if ALL matching records are successfully locked. If a lock cannot be obtained on any record, DoBatch() signals failure, and all of the locks are released. This option makes sense only if you specify BAT_LOK_RED or BAT_LOK_WRT.
Default
Ignore unlocked records. Allow successful completion even if a lock has not been obtained on all records.
See also:
Automatic Batch Close Mode
Calls to BATSETX()/BATSET() that "exhaust" the batch do not typically close (or clean up) the batch. Instead, the next call returns a BTMT_ERR (428) indicating the batch has no more remaining entries, and only then is the batch closed. More specifically, this may occur when BAT_GET or BAT_NXT requests are made for: partial key requests (BAT_PKEY); greater or less than key requests (BAT_GKEY, BAT_LKEY); and physical order requests (BAT_PHYS). Automatic batch close mode detects these "exhausted" conditions, closes the batch immediately, and signals the application through sysiocod.
An application calling BATSETX() can now OR BAT_CLSE to the mode parameter. Then when a BAT_PKEY, BAT_RNGE, BAT_GKEY, BAT_LKEY or BAT_PHYS request exhausts all the entries, either in the first call to BATSETX() or in subsequent calls, the batch will be closed immediately, and sysiocod will be set to BTNO_COD to allow the application to know the batch has been closed.
By exhausting a batch we mean returning all the requested information (records, keys, or record positions) which satisfy the batch. Without this modification, an exhausted batch typically requires one additional call (BAT_NXT) to detect the batch is exhausted and cause the batch to be closed, or a call to explicitly request the batch close (BAT_CAN).
See also:
Continuation Calls
The breturn value of PKEYREQ tells how many records were read into the buffer. If breturn is less than bavail, the buffer was not large enough to hold all matching records. Make another call to DoBatch() to retrieve the rest of the records.
To process the next group of records in the batch, call DoBatch() with a mode value of BAT_NXT. btotal and bavail are not changed, but breturn reflects the number of records in the buffer for this DoBatch() call.
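The continuation protocol above can be sketched as pseudo-code (the mode flags shown and the process_records() helper are illustrative, not part of the c-tree API):

```
/* first call establishes the batch and fills the buffer */
retval = DoBatch(keyno, pirq, bufptr, bufsiz,
                 BAT_GET | BAT_PKEY | BAT_RET_REC);
while (retval == NO_ERROR) {
    process_records(bufptr, pirq->breturn); /* breturn records in buffer */
    retval = DoBatch(keyno, pirq, bufptr, bufsiz, BAT_NXT);
}
DoBatch(keyno, NULL, NULL, 0, BAT_CAN);     /* always close the batch */
```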
See also:
Complete or Cancel Call
When finished with a batch, call DoBatch() with a mode of BAT_CAN. Use this to cancel a batch before you have processed all records. It is also a good idea to make this call even after you have processed all of the records. There are some cases where DoBatch() closes the batch automatically; however, this does not always happen. For instance, if you do not specify either BAT_RET_REC or BAT_RET_POS, so that the only effect is for the PKEYREQ structure to be updated, DoBatch() does NOT automatically close the batch. If you then make another call of that same sort, you will see error BTST_ERR (427).
Therefore, we recommend that you always make a DoBatch() call with a mode of BAT_CAN to end a sequence of batch calls.
See also:
Signal Batch Clean-up on Client Side
Redundant calls to BATSETX()/BATSET() with a mode of BAT_CAN can be eliminated by notifying the client when a batch request has been completed.
This feature causes sysiocod to be set to BTNO_COD (-844) on calls to BATSETX()/BATSET() that result in the current batch being closed. If the client application finds sysiocod set to BTNO_COD, then the application does not need to issue a BAT_CAN call before issuing a new batch request.
This new behavior requires the following FairCom Server configuration entry:
COMPATIBILITY BATCH_SIGNAL
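With that configuration entry active, a client can check sysiocod after each batch call, as in this pseudo-code sketch:

```
retval = BATSETX(keyno, pirq, bufptr, bufsiz, BAT_NXT);
if (sysiocod == BTNO_COD) {
    /* server already closed the batch; no BAT_CAN call is needed */
} else {
    BATSETX(keyno, NULL, NULL, 0, BAT_CAN);
}
```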
Batch Read Operations
Several options are provided for reading a batched group of records.
Batch Read in Physical Order
A batch record read in physical order is performed by calling DoBatchXtd() with a mode of BAT_PHYS, specifying a request block comprised of 5 LONG integers. The first three values are the standard btotal, bavail and breturn from the PKEYREQ structure. The last two values are the high-order and low-order four-byte record position. For the first call to DoBatchXtd(), these last two LONG integers specify the starting position for the physical retrieval. On return from the first and subsequent calls to DoBatchXtd() these last two LONG integers specify the next record location (i.e., the physical record position following the last record included in the output buffer). Subsequent DoBatchXtd() calls continue to return records in physical order until the logical end of the data file is reached (i.e., no more records remain in physical order).
Note: The next record position is returned for informational purposes. It is not used on a subsequent call (BAT_NXT) to retrieve additional records. The server maintains the starting point for the next batch retrieval internally.
Batch Read in Key Decreasing Order
A batch record read can be performed in key decreasing order by calling DoBatchXtd() with a mode of BAT_LKEY. Such a call returns records having a key value less than or equal to the specified key value.
Batch Read with Index Range Definition
A batch record read can be performed under control of an index range definition by following these steps:
- Call AllocateRange() to establish the range criteria.
- Call DoBatchXtd() with a mode of BAT_RNGE.
When the BAT_RNGE mode is specified, the batch call behaves as when using the BAT_PKEY mode except that the records selected satisfy the range criteria established by the call to AllocateRange(). The siglen and target members of the partial key request (PKEYREQ) structure are ignored when the BAT_RNGE mode is specified.
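The two steps can be sketched as pseudo-code (the segment operator, limits, and mode flags are illustrative):

```
/* restrict a batch read to an index range */
NINT ops[1] = {CTIX_GE};
if (AllocateRange(keyno, 1, lowlimit, NULL, ops) == NO_ERROR)
    retval = DoBatchXtd(keyno, pirq, bufptr, bufsiz,
                        BAT_GET | BAT_RNGE | BAT_RET_REC);
```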
Additional DoBatchXtd() Options
When using a batch read for the purpose of reading blocks of records to pass to a batch insert call, the following batch read modes may be useful to include in the batch read call:
- BAT_RET_BLK: Return a contiguous block of records as if they had been lifted directly from data file including VHDR structure for variable-length records. For batch reads, this mode can only be used with the BAT_PHYS mode. This mode is designed to return blocks of records in an appropriate format for a batch insert operation.
Batch inserts using BAT_RET_BLK record organization should be marginally faster than BAT_RET_REC record organization when the target file has associated keys, and faster still when it has no keys, because the data records are written with one write operation when BAT_RET_BLK is in use. By comparison, BAT_RET_REC returns record images preceded by their 4- or 8-byte record position and 4-byte length, which must be stripped before writing the records to the target file.
- BAT_RET_RAW: In heterogeneous client/server implementations, turns off the conversion of record images, key values and record positions returned by a batch retrieval call. This may prove especially useful with calls to retrieve a batch intended to be inserted into another file with BAT_INS.
- BAT_LOK_ONE: An alternative locking strategy: only locks the record during the record read; original locking strategy keeps locks on during entire batch processing.
- BAT_LOK_BLK: Converts BAT_LOK_RED and BAT_LOK_WRT to blocking read and write locks, respectively.
Batch Insert Operations
The FairCom Server supports inserting a batch of records into a c-tree data file. Batch functions can even be used to load records from one file into another file of the same format.
To batch load records into a file, call DoBatchXtd() as follows:
- Specify the BAT_INS mode and either BAT_RET_BLK or BAT_RET_REC to indicate the format in which the records are organized in the input buffer:
- BAT_RET_BLK denotes that the records are essentially as they would be if read directly from disk, including variable-length record header information. That is, the records are packed in the batch buffer with no intervening information between them.
- BAT_RET_REC denotes that each record in the batch buffer is preceded by its four-byte record byte-offset value (eight-byte for HUGE files).
- For the request parameter, pass a pointer to an array of 5 LONG integers immediately followed by a buffer containing the records to be inserted.
- Set bufsiz to the size of this additional buffer in bytes. (Ordinarily, bufsiz specifies the size of the output buffer pointed to by bufptr, however, bufptr is ignored in calls for BAT_INS.)
- Only the third of the 5 LONG integers (breturn) is used on input. Set breturn to the number of records in the buffer.
On return from a BAT_INS call, only the DoBatchXtd() return value is significant.
For transaction controlled files, the batch insertion operation is treated as one all-or-nothing operation. If no explicit transaction is started, each BAT_INS call will start and end its own transaction. Even if an explicit transaction is started, each call to BAT_INS is treated independently through save points. One failed BAT_INS call does not invalidate prior or subsequent BAT_INS calls.
An optional mode that can be used with the BAT_INS mode is BAT_INP_RAW. Its effect is described as follows:
- BAT_INP_RAW: In heterogeneous client/server implementations, turns off the conversion of record images, key values and record positions passed to a batch function.
Note: Currently, all calls for batch insertion (BAT_INS) behave as if BAT_INP_RAW is turned on, regardless of the actual mode parameter.
Batch Update Operations
A collection of records, each identified by its unique key value, can be updated using batch operations. The BAT_UPD option is used for the mode parameter specified when calling the batch function. Several considerations should be kept in mind:
- BAT_UPD and BAT_UPD_KEY
- Batch Update operations: BAT_UPD and BAT_UPD_KEY
- Heterogeneous support for BAT_INS and BAT_UPD
- Record locking - BAT_LOK_BLK, BAT_LOK_KEEP and BAT_LOK_ONE
BAT_UPD and BAT_UPD_KEY
By default, BAT_UPD extracts the unique key from the record image, reads the record and then uses the record image provided in the input buffer to perform an Update operation. If the extracted key does not exist, a record ADD operation is performed. A mode called BAT_UPD_KEY is provided in which the old key can be specified for situations in which a rewrite might change the unique key used to locate the record.
Overview of Operations Performed
For each record in the input buffer, the batch update operation will do the following:
- Either begin a transaction or establish a save point if the user already has an active transaction.
- Build target key on the unique key.
- Read the record with the requested lock:
- If the record exists, update it.
- If the record does not exist, add it.
- In case of success or failure: Commit or abort the transaction or clear or restore the save point.
- Release record locks if requested.
Notice that a missing key implies that a new record will be added.
A unique key is required on the file. If the index allows duplicates, the operation will return an error, so it is important to specify which unique index to use.
The function will return the number of records that were added and the number of records that were updated.
Update Modes
This operation provides two update modes:
- BAT_UPD (default) - The batch update call sends a series of record images and the filno parameter corresponds to a unique key index.
- BAT_UPD_KEY - The record images are preceded by the key value that uniquely identifies the record to be rewritten or added.
In both cases, variable-length data files have the 4-byte length of the data record as the first field of each entry.
In the default BAT_UPD mode, entries are formatted as shown below:
- For fixed-length data file: <record image 1> <record image 2> ...
- For variable-length data file: <reclen 1> <record image 1> <reclen 2> <record image 2> ...
When BAT_UPD_KEY mode is used, entries are formatted as:
- For fixed-length data file: <old key 1> <record image 1> <old key 2> <record image 2> ...
- For variable-length data file: <reclen 1> <old key 1> <record image 1> <reclen 2> <old key 2> <record image 2> ...
The BAT_UPD_KEY mode should be used when a rewrite might change the unique key used to locate the record. The old key permits the record to be located and read before the rewrite operation.
Batch Updates and BAT_RET_BLK Format
Beginning with FairCom DB V11, batch update recognizes the BAT_RET_BLK record format. Previously, when batch update was used to add variable-length records that were read using a batch physical read with the BAT_RET_BLK option, the operation failed with error SDAT_ERR. The batch update logic has been modified to recognize the BAT_RET_BLK record format.
Additional Modes
Additional modes can be used with BAT_UPD and BAT_UPD_KEY to provide record locking. See Batch Update operations: BAT_UPD and BAT_UPD_KEY.
Batch Update Operations: BAT_UPD and BAT_UPD_KEY
The batch update call to DoBatchXtd() (short name BATSETX) is formatted as follows:
BATSETX(keyno,request,NULL,bufsiz,mode)
Where the parameters are:
- keyno - Index file number of a unique index associated with the target data file
- request –
- Input buffer comprised of 5 LONG integers followed by the record entries as described above. On input, ((pLONG)request)[2] holds the number of entries.
- Output buffer comprised of 5 LONG integers: elements 0 and 1 return the number of successful updates and adds, respectively, and the status word is built from the following bit values:
| Symbolic name | Value | Description |
|---|---|---|
| BAT_STT_UPDATE | 0x00000001 | error on rewrite |
| BAT_STT_ADD | 0x00000002 | error on insertion |
| BAT_STT_DELETE | 0x00000004 | error on delete |
| BAT_STT_REDIREC | 0x00000008 | error on reading old record |
| BAT_STT_EQLKEY | 0x00000010 | error on equal key |
| BAT_STT_BUFSIZ | 0x00000020 | record image area too small |
| BAT_STT_FRMKEY | 0x00000040 | error assembling or transforming unique key value |
| BAT_STT_GTVLEN | 0x00000080 | error on get var rec length |
| BAT_STT_GETMEM | 0x00000100 | error allocating record buf |
| BAT_STT_CONVERT | 0x00000200 | error converting rec image |
- bufsiz - Size of the input buffer region that follows the 5 LONG integers
- mode - BAT_UPD or (BAT_UPD | BAT_UPD_KEY).
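Putting the pieces together, a batch update call might look like this pseudo-code sketch (the buffer names and record packing are illustrative):

```
/* batch update of nrec packed record entries */
LONG req[5] = {0, 0, nrec, 0, 0};          /* element 2 = entry count */
memcpy(buffer, req, sizeof(req));          /* 5 LONGs lead the buffer */
memcpy(buffer + sizeof(req), entries, entbytes);
rc = BATSETX(unique_keyno, buffer, NULL, entbytes, BAT_UPD);
/* on return, elements 0 and 1 of the request area report updates/adds */
```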
Error Handling
BATSETX() returns an error code or NO_ERROR. This operation will stop if an error occurs. In case of an error, the number of successful Updates and ADDs are returned in elements 0 and 1 of the request output buffer as described above.
Because the operation stops on the first record that fails, adding the number of Updates and number of ADDs gives you the record number that failed.
Transaction Control - If an error occurs when the files are transaction controlled, none of the successful updates will be committed since BAT_UPD starts a transaction if one is not active or a save point if a transaction is active. In the event of an error, it either aborts the transaction or restores to the save point. For files that are not transaction controlled, the successful updates survive the error.
Status Word Zero - If an error code is returned and the status word is zero, the error occurred during the setup of the BAT_UPD and no attempts were made to update any records.
Limitations
See Heterogeneous support for BAT_INS and BAT_UPD.
Heterogeneous Support for BAT_INS and BAT_UPD
BAT_INS and BAT_UPD support heterogeneous client/server environments in which the record images formed by the client do not conform to the record images stored on disk. In addition, this modification detects alignment discrepancies between the client's record images and the alignment on disk.
The following are important details related to this new capability in heterogeneous client/server situations:
- BAT_INP_RAW, USERPRF_NDATA and USERPRF_NTKEY change the batch conversion process:
If BAT_INP_RAW is included in the BATSETX mode, then the collection of records, key values (BAT_UPD_KEY only) and control information such as variable record lengths are not converted.
If BAT_INP_RAW is not turned on, then if USERPRF_NDATA is turned on, the record images are not converted, but the record lengths (variable-length data files only) will be converted.
If BAT_INP_RAW is not turned on, then if USERPRF_NTKEY is turned on and if BAT_UPD_KEY is included in the BATSETX mode, the key will not be transformed before it is used to find an existing data record.
- If the conversions have not been turned off as noted above, then the server performs the necessary conversions. If the server predates this modification, and conversions are required, the BATSETX fails on the client-side (i.e., no call is made to the server) with SCNV_ERR (994).
- If an alignment discrepancy is found, BATSETX fails with ALGN_ERR (992).
Record Locking - BAT_LOK_BLK, BAT_LOK_KEEP and BAT_LOK_ONE
Record locking for BAT_UPD (as well as BAT_INS) is different from batch calls used to retrieve records. The retrieval batches use batch-specific protocols to manage the locks. BAT_UPD (and BAT_INS when records are added one at a time) use standard ISAM API calls to update the files; the locking behaves just as if the client made the ISAM API calls.
BAT_UPD includes support for these modifiers to affect record locking:
- BAT_LOK_BLK: Acquire blocking lock when reading the record.
- BAT_LOK_ONE: Release lock immediately after add/update.
- BAT_LOK_KEEP: Keep locks after batch ends.
- BAT_LOK_WRT: Acquire write lock when reading the record
If BAT_LOK_BLK and BAT_LOK_WRT are included with BAT_UPD, the ISAM lock state is changed to ctENABLE_BLK.
- If either a transaction or LKISAM is called before BATSETX is invoked, the LKISAM state will be changed to ctENABLE_BLK only if the existing ISAM lock state is ctENABLE at the time BATSETX is invoked (typically this will be the case).
If BAT_LOK_BLK is included with BAT_UPD, this will cause the lock requests for the internal read operations (that find the existing versions of the records to be updated) to sleep if the record is locked by another user. This will cause the batch update to sleep until such locks are freed.
If BAT_LOK_KEEP is included with BAT_UPD, the data records updated or added to the file by BATSETX will remain locked on return from the BATSETX call. If either a transaction or LKISAM is called before BATSETX is invoked, the locks will remain after BATSETX returns whether or not BAT_LOK_KEEP is used. If neither was invoked before BATSETX, then BAT_LOK_KEEP will also cause the ISAM lock state set in BATSETX to persist after BATSETX returns.
If BAT_LOK_ONE is included with BAT_UPD and the file to be updated is not transaction controlled, then the record is unlocked after each record is updated or added; otherwise the records are unlocked at the end of BATSETX unless BAT_LOK_KEEP is in effect.
- If the file is transaction controlled, BAT_LOK_ONE is ignored.
Performance Enhancement
Batch performance is improved for BAT_PKEY and BAT_RET_REC when BAT_VERIFY is not used: only the records that fit in the first batch buffer are read on the first DoBatch() call.
Note: BAT_VERIFY ensures the index entry and data record are in agreement. It is possible for an update operation to cause the index and data to be temporarily out of sync. Without BAT_VERIFY and record locks, one cannot be certain all records returned will actually satisfy the batch request.
In very early releases, a BAT_PKEY (partial key search) and BAT_RET_REC (return data record) that did NOT require verification (no BAT_VERIFY), still read all data records satisfying partial key criteria on the first DoBatch() call even though the records not fitting in the buffer must be reread on subsequent DoBatch() calls. In currently supported releases, to return to the pre-modification state, add the following code to the top of the c-tree Option Header, ctoptn.h:
#define ctPREV_66B_BATCH
Recall that ctoptn.h is dynamically created during execution of the make file created by m-tree, ctree.mak. If changes are to be made to ctoptn.h, be sure the changes are made to the ctoptn.h creation logic at the end of the make file ctree.mak.
Example Use of DoBatchXtd()
The following pseudo-code shows a strategy for using DoBatchXtd retrievals from one file to load into another file with the same record format.
ULONG batmode;
LONG irq[8192];
pPKEYREQ pirq = (pPKEYREQ) irq;
COUNT retval;
CHGBAT(0);
batmode= BAT_GET | BAT_PHYS | BAT_RET_BLK | BAT_LOK_RED |
BAT_LOK_BLK | BAT_LOK_ONE;
/*
** start batch retrieval at current ISAM position. [Can start at any
** legitimate record offset.]
*/
irq[4] = GETCURP(src_datno); /* low order word */
irq[3] = ctGETHGH();         /* high order word */
/*
** first batch [0] retrieval
*/
retval = BATSETX(src_datno,pirq,irq+5,
(VRLEN) sizeof(irq) - 5 * sizeof(LONG),batmode);
/*
** prepare batmode for subsequent retrievals
*/
batmode &= ~BAT_GET;
batmode |= BAT_NXT;
TRANBEG(ctTRANLOG);
do {
printf("\nstatus=%d tot=%ld avl=%ld ret=%ld nxthw=%lx nxtlw=%lx",
retval,irq[0],irq[1],irq[2],irq[3],irq[4]);
if (retval)
break;
/*
** switch to batch [1] for insertion
*/
CHGBAT(1);
if (BATSETX(dest_datno,pirq,0x1,
sizeof(irq) - 5 * sizeof(LONG),
BAT_INS | (batmode & (BAT_RET_REC | BAT_RET_BLK))))
{
printf("\nBATSETX BAT_INS: %d\n",isam_err);
break;
} else
printf("\nBATSETX BAT_INS: success");
CHGBAT(0);
retval = BATSETX(src_datno,pirq,irq + 5,
(VRLEN) sizeof(irq) - 5 * sizeof(LONG),
BAT_NXT /*batmode*/);
} while (1);
TRANEND(ctFREE);
CHGBAT(0);
BATSETX(src_datno,NULL,NULL,0,BAT_CAN);
CHGBAT(1);
BATSETX(dest_datno,NULL,NULL,0,BAT_CAN);
Index Ranges
c-tree’s index range support permits ISAM record reads in a manner similar to sets, but with more control. An index range is defined by specifying a range of values for one or more key segments. Each record retrieved must satisfy all the specified ranges.
The following segment range types are supported:
| Mode | Explanation |
|---|---|
| CTIX_EQ | Equality (key == target) |
| CTIX_GT | Strictly greater than (key > target) |
| CTIX_GE | Greater than or equal (key >= target) |
| CTIX_LE | Less than or equal (key <= target) |
| CTIX_LT | Strictly less than (key < target) |
| CTIX_NE | Not equal (key != target) |
| CTIX_BET | Between inclusive (targetLow <= key) && (key <= targetHigh) |
| CTIX_BET_IE | Between inclusive lower range (targetLow <= key) && (key < targetHigh) |
| CTIX_BET_EI | Between inclusive upper range (targetLow < key) && (key <= targetHigh) |
| CTIX_BET_EE | Between exclusive (targetLow < key) && (key < targetHigh) |
| CTIX_NOTBET | Not between (key < targetLow) || (key > targetHigh) |
You do not need to specify a range for all segments, but you cannot skip a segment. For instance, you could specify a range for the first segment or the first and second segment, etc. If you wanted to specify ranges for the first and third segment, you would need to specify ranges for the first three segments, and you would make the second segment all inclusive (for example, using a segment range such as CTIX_GE with a zero lower limit).
Each user can have only one range definition for each index. Specifying a new range for an index automatically frees an existing range definition for that index.
Using Index Ranges from the c-tree ISAM API
Applications that use the c-tree ISAM API follow these steps to use index ranges:
Allocating an Index Range
To allocate an index range using the c-tree ISAM API, call the AllocateRange() c-tree API function. The function prototype is as follows:
ctCONV COUNT ctDECL AllocateRange(FILNO keyno, NINT segcount, pVOID lrange,
pVOID urange, pNINT operators);
where
- keyno: Specifies the index file number;
- segcount: Specifies the number of segments with range specifications;
- lrange: Points to a buffer containing the limits for the segments;
- urange: Points to an optional buffer containing upper limits for the segments with between ranges (e.g., CTIX_BET); and
- operators: Points to an integer array of segment range types.
AllocateRange() returns NO_ERROR (0) on success.
Index ranges are ISAM context safe. That is, each ISAM context initiated via an OpenISAMContext() call can support its own index range.
The lrange and urange buffers must each be long enough to hold the first segcount segments for the index. For example, if segcount is 3, and if the first three segments are 10, 8, and 4 bytes long, then the range buffer(s) must be 22 bytes long.
If between ranges are not used, the urange buffer is ignored and may be passed as NULL. If between ranges are defined, then there must be a urange buffer, and it must be long enough to hold all segcount segments (in our example 22 bytes) even if only one between range segment is defined. For instance, if the first segment, 10 bytes long, does not use a between range, but the second segment does, then urange must hold the upper limit for the second segment starting at 10 bytes from the beginning of urange. Continuing this example, here is some pseudo code that assumes all the segments involved hold ASCII strings:
/*
** 3 segment example for keyno 12
*/
NINT ops[3] = {CTIX_LT,CTIX_BET,CTIX_NE};
TEXT lbuf[22];
TEXT ubuf[22];
COUNT rc;
memcpy(lbuf,"TUVWXYZ   ",10);
memcpy(lbuf + 10,"00001000",8);
memcpy(lbuf + 18,"AX00",4);
memset(ubuf,0,22);
memcpy(ubuf + 10,"00001999",8);
rc = AllocateRange(12,3,lbuf,ubuf,ops);
The index must be part of an ISAM set of files and segcount must be greater than zero and cannot exceed the number of key segments actually defined for the index.
Reading Records Using an Index Range
If the call to AllocateRange() succeeds, then FirstInRange(), FirstInVRange(), LastInRange(), or LastInVRange() must be called to activate the range and return the first or last data record in the range. If no record satisfies the range, then these routines return INOT_ERR (101). If the first/last range call succeeds, then NextInRange(), NextInVRange(), PreviousInRange(), and PreviousInVRange() may be used to traverse the range returning the next or previous record in the range. INOT_ERR is returned if no more records satisfy the range. It is permissible to call the first/last range routines any time to re-establish position at the beginning or end of the range. These routines are declared as follows:
ctCONV COUNT ctDECL FirstInRange(FILNO keyno, pVOID recptr);
ctCONV COUNT ctDECL LastInRange(FILNO keyno, pVOID recptr);
ctCONV COUNT ctDECL NextInRange(FILNO keyno, pVOID recptr);
ctCONV COUNT ctDECL PreviousInRange(FILNO keyno, pVOID recptr);
ctCONV COUNT ctDECL FirstInVRange(FILNO keyno, pVOID recptr, pVRLEN plen);
ctCONV COUNT ctDECL LastInVRange(FILNO keyno, pVOID recptr, pVRLEN plen);
ctCONV COUNT ctDECL NextInVRange(FILNO keyno, pVOID recptr, pVRLEN plen);
ctCONV COUNT ctDECL PreviousInVRange(FILNO keyno, pVOID recptr, pVRLEN plen);
The interpretation of the parameters in the range record read functions is the same as for the ISAM record read functions FirstRecord(), FirstVRecord(), etc. The range record read functions return NO_ERROR (0) on success.
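A typical traversal of an allocated range can be sketched as pseudo-code (RECSIZ is an illustrative buffer size):

```
TEXT recbuf[RECSIZ];
COUNT rc = FirstInRange(keyno, recbuf);    /* activates the range */
while (rc == NO_ERROR) {
    /* process recbuf */
    rc = NextInRange(keyno, recbuf);
}
/* rc == INOT_ERR means the range is exhausted; anything else is an error */
```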
Freeing an Index Range
Call the FRERNG() c-tree API function to free a range definition. The function prototype is as follows, where keyno is the file number of the index with which the range is associated:
ctCONV COUNT ctDECL FRERNG(FILNO keyno);
Index Range Support for Virtual Segment Groups
This new range operator can retrieve key ranges defined over a group of contiguous key segments.
The FairCom DB range retrieval function (e.g., AllocateRange) previously operated at the key segment level. In V11.5 and later, it has been enhanced to support a range operator that can be used to perform retrievals on key ranges that are defined over a set of contiguous key segments (referred to here as a “virtual segment group”).
This feature allows you to search for all entries at the key level between two targets: a low key serving as the lower bound and a high key serving as the upper bound. By introducing support for grouping two or more contiguous segments into one virtual segment group, it is now possible to perform a new type of range retrieval.
As an example, consider a database that contains two fields of one character each:
"First", CT_FSTRING, 1
"Second", CT_FSTRING, 1
The index would consist of two segments, one for each field.
For this example, assume the data in these fields is as follows:
(A, 1)
(A, 7)
(A, 9)
(A, 11)
(B, 1)
(B, 2)
(B, 5)
(C, 1)
(C, 2)
(C, 5)
It has always been possible to specify a key range that would get us all of the records where the first field is A or an even more complex range where the first field is between A and B and the second field is between 1 and 3. That would get us the following records:
(A, 1)
(B, 1)
(B, 2)
(C, 1)
(C, 2)
A sequential range of records between (A, 7) and (C, 2) is a more challenging case. Previously, it would have required maintaining a second index with a single segment spanning both fields, which was possible only if the two fields were adjacent to each other and in the correct order in the database.
With virtual segment group support, we can specify a key range involving two or more index segments, which are adjacent in the index, and treat them as a single segment. The fields need not be adjacent in the database. In our example, that would allow us to specify a key range from (A, 7) to (C, 2), which would get us the following records:
(A, 7)
(A, 9)
(A, 11)
(B, 1)
(B, 2)
(B, 5)
(C, 1)
(C, 2)
New Range Operator Modifiers
This revision introduces two range operator modifiers to be OR-ed in to the operator parameter values passed to AllocateRange():
- CTIX_SSG marks the beginning of a segment group
- CTIX_ESG marks the end of a segment group
Example:
NINT oper[2];
COUNT rc;
oper[0] = CTIX_BET | CTIX_SSG; /* first segment: start segment group */
oper[1] = CTIX_BET | CTIX_ESG; /* second segment: end segment group */
rc = AllocateRange(keyno, 2, "A4", "C6", oper);
Blocking Record Reads
The FairCom Server’s blocking record read feature permits an application to attempt a record read with an optional blocking condition, and if no records satisfy the read request, to wait until such a record exists or the block times out. This feature provides a convenient way for an application to process specified records from a file as they become available.
Using Blocking Record Reads
A blocking record read is performed by calling the BlockingISAMRead() c-tree API function. The BlockingISAMRead() function operates by performing the requested ISAM operation (FIRST(), NEXT(), etc.). If successful, it then checks if the optional blocking condition is satisfied. If no record was found or if the record does not satisfy the blocking condition, and a non-zero time-out is specified, then the read blocks (sleeps) for the specified time-out period. The sleep is interrupted if the target file is updated, and the process repeats.
Either a record is found that satisfies the condition, or the block times out. A time out condition is indicated by returning NTIM_ERR (156).
Some Parameters of BlockingISAMRead()
- opcode specifies one of the following ISAM read operations:
| ISAM Read Operation | Explanation |
|---|---|
| ctBLKIREC_FIRST | Read first record in physical or key order |
| ctBLKIREC_NEXT | Read next record in physical or key order |
| ctBLKIREC_PREV | Read previous record in physical or key order |
| ctBLKIREC_LAST | Read last record in physical or key order |
| ctBLKIREC_GT | Read record with key value greater than target key |
| ctBLKIREC_GTE | Read record with key value greater than or equal to target key |
| ctBLKIREC_EQL | Read record with key value equal to target key |
| ctBLKIREC_LTE | Read record with key value less than or equal to target key |
| ctBLKIREC_LT | Read record with key value less than target key |
- timeoutsec specifies the number of seconds the blocking read will wait before returning if no record satisfies the optional blockcond argument or no record is found. Set timeoutsec to zero to return immediately.
- blockcond is an optional conditional expression which may be a logical expression of the same form as used for conditional index support or a function callback as in a SetDataFilter() call. Set blockcond to an empty string ("") to specify no condition.
- target should be non-NULL only if filno specifies an index and the opcode requires a target value (for example, ctBLKIREC_GT).
- The record is read into the buffer pointed to by recptr.
- plen should be NULL for a fixed-length read, and should point to the length of the output buffer (recptr) for a variable-length read. *plen is updated to the actual record length upon successful return.
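A call might be sketched as follows. The parameter order shown is illustrative only (consult the BlockingISAMRead() function reference for the exact signature), and the filter expression assumes a field named status:

```
/* wait up to 30 seconds for the next record whose status is "READY" */
rc = BlockingISAMRead(datno, ctBLKIREC_NEXT, NULL, recbuf, &len,
                      "status == \"READY\"", 30);
if (rc == NTIM_ERR)
    ;   /* timed out: no matching record appeared */
```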
Data Filters
Developers consistently search for fast, efficient ways to retrieve data. While it is relatively easy to acquire huge masses of data, the more data available, the more critical quick, efficient retrieval of specific data becomes.
FairCom data filters are a powerful method of defining temporary criteria, expressed as dynamic conditional expressions, that restrict retrieval to only those records meeting the criteria. Indexes are generally also used to reduce unnecessary record retrievals; a filter, however, allows criteria to be applied on demand, as needed.
Unlike conditional index expressions, which become a permanent index definition (except for temporary indexes), a specific data filter lasts only until another filter is specified for the same data file or the data file is closed. Further, a specific filter applies only to the calling user: it does not affect retrieval of records by other users. Filters are similar to, though less powerful than, queries.
Data filters in client/server environments can substantially reduce network traffic as only records satisfying a given filter condition are returned from the server. ISAM-level record requests, including sets and batches, skip records failing filter criteria transparently to the user.
For example, given a data file of 100,000 records, if a filter excludes 90,000 of those records, a FirstRecord() - NextRecord() loop results in only 10,000 round-trip calls to the FairCom Server rather than traversing all 100,000 records individually. That's a 90% reduction with measurable performance gains!
Data filters are implemented with the SetDataFilter() API function. When a filter is established, record returns, such as from FirstRecord() and GetRecord(), can be limited to a specific range. Conditional expressions, in conjunction with the schema map and symbolic field names, describe the range of values returned. For example, the expression “(ZipCode >= 65000) && (ZipCode < 66000)” would limit returns to records with 65xxx ZipCodes. While there is a limit of one expression, expressions can be as complex as necessary to accomplish your goals.
For partitioned files, call SetDataFilter() for the host data file and the filter automatically applies to each partition member file.
See also
- SetDataFilter (SetDataFilter, /doc/ctreeplus/setdatafilter.htm)
- SETFLTRN (SETFLTRN, /doc/ctreeplus/setfltrn.htm)
- LoadFilter (LoadFilter, /doc/ctreeplus/loadfilter.htm)
- UnloadFilter (UnloadFilter, /doc/ctreeplus/unloadfilter.htm)
- FairCom DB Conditional Expression Parser
Conditional Indexes
A powerful way to enhance data filtering is directly with index-level control. Conditional indexes are sophisticated index-level filters. To illustrate, consider the following example:
An application requires an ability to dynamically view all new customers within the past 6 months who have purchased at least $50,000 worth of product.
This type of search can be accomplished with existing c-tree batch operations, set functions, or a compound index. In fact, with low-level c-tree functions, adding index keys only when necessary was a common approach, as developers maintained complete control over all data files and indexes.
Each of these methods can be quite efficient; however, an index will likely contain information unrelated to a specific request, requiring additional scanning and searching. Using low-level calls also precludes advanced ISAM features.
With conditional index support, an index can be uniquely defined containing only information satisfying exact search criteria, thereby providing a fast and elegant solution.
Conditional index expressions are managed with UpdateConditionalIndex() and GetConditionalIndex(). UpdateConditionalIndex() allows the conditional expression associated with an index to be added, removed, or changed. GetConditionalIndex() retrieves the current conditional expression for a given index.
With these functions, indexes can be limited to a specific range. The conditional expressions, in conjunction with the schema map and symbolic field names, describe the range of values included. For example, the expression “(ZipCode >= 65000) && (ZipCode < 66000)” would limit the index to entries with 65xxx ZipCodes. While there is a limit of one expression per index, the expressions can be as complex as necessary to accomplish your goals.
A potential disadvantage arises when a data file has a large number of indexes. Each additional index requires extra processing when adding, updating, or deleting key values, especially in batch processing situations. To minimize overhead caused by numerous indexes, see Faster ISAM Access (Disable Key Buffer Support for Faster ISAM Access, /doc/ctreeplus/36768.htm).
Note: Conditional index support requires ISAM support, RESOURCES, and a DODA in the data file to provide the schema map and symbolic field names.
See also
- UpdateConditionalIndex (UpdateConditionalIndex, /doc/ctreeplus/updateconditionalindex.htm)
- GetConditionalIndex (GetConditionalIndex, /doc/ctreeplus/getconditionalindex.htm)
- FairCom DB Conditional Expression Parser
FairCom DB Expression Parser and Grammar
A powerful expression parser/analyzer provides for complex conditional expressions that can be defined and evaluated at runtime.
Filter expression syntax closely follows the C language syntax for expressions, including order of precedence. An expression interpreted by the expression parser should compile without errors with a standard C compiler. As in C, you cannot compare strings directly like LastName > 'S'. However, the expression parser has a number of built-in functions that allow the comparison of strings. Example:
strcmp( LastName, "S" ) > 0
The expression handler ensures proper alignment is maintained and verifies that the buffer of any record being evaluated is large enough.
Routines that evaluate conditional expressions maintain fixed data record lengths and total data record lengths. This permits correct alignment adjustments and detects if insufficient data is available. The latter condition results in a CVAL_ERR (598) error. The easiest way to produce a CVAL_ERR (598) is to read only the fixed-length portion of a data record, and have an expression that relies on fields in the variable-length portion of the record.
For additional control, a Conditional Expression Callback Function is available, allowing advanced control through an external, user-created function.
See also
- Data Filters
- Conditional Index Support
- Partitioned Files
- cndxparse
- cndxeval
- cndxrun
- cndxfree
- ctparsedoda
- getcndxmem
- putcndxmem
Constants
The expression parser stores the constants below internally as 32-bit signed or unsigned integers:
- Character constants: any valid char enclosed in single quotes, e.g., 'a'
- Signed Integer constants: values from -2,147,483,648 to 2,147,483,647
- Unsigned Integer constants: values from 0 to 4,294,967,295
- Hexadecimal constants: values from 0x00000000 to 0xffffffff. Any combination of lower case or upper case letters are accepted, e.g., 0Xbc4f or 0xF5C56d
- Date constants:
{d'yyyy-[m]m-[d]d'}
yyyy is 4 digit year
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
Examples:
{d'2024-12-06'}
{d'2024-8-6'}
'[m]m-[d]d-yyyy' or '[m]m/[d]d/yyyy'
yyyy is 4 digit year
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
Examples:
'12-06-2024'
'8-6-2024'
'8/6/2024'
'yyyy-[m]m-[d]d' or 'yyyy/[m]m/[d]d'
yyyy is 4 digit year
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
Examples:
'2024-12-06'
'2024-8-6'
'2024/8/6'
'[d]d-mmm-yyyy' or '[d]d/mmm/yyyy'
[d]d is 1 or 2 digit day
mmm is 3 letter month name
yyyy is 4 digit year
month names (case insensitive) are:
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
Examples:
'1/jan/2024'
'12-feb-2024'
- Time constants:
{t'[h]h:[m]m:[s]s'}
[h]h is 1 or 2 digit hour
[m]m is 1 or 2 digit minute
[s]s is 1 or 2 digit second
Examples:
{t'23:11:33'}
{t'1:2:3'}
'[h]h:[m]m:[s]s[:[i][i]i]'
[h]h is 1 or 2 digit hour
[m]m is 1 or 2 digit minute
[s]s is 1 or 2 digit second
[i][i]i is optional 1 to 3 digit millisecond
Examples:
'1:2:3'
'12:33:44.123'
- Timestamp constants:
{ts'yyyy-[m]m-[d]d [h]h:[M]M:[s]s'}
yyyy is 4 digit year
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
[h]h is 1 or 2 digit hour
[M]M is 1 or 2 digit minute
[s]s is 1 or 2 digit second
Examples:
{ts'2024-08-12 22:33:44'}
{ts'2024-08-12 2:3:4'}
'[m]m-[d]d-yyyy [h]h:[M]M:[s]s[:[i][i]i]' or '[m]m/[d]d/yyyy [h]h:[M]M:[s]s[:[i][i]i]'
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
yyyy is 4 digit year
[h]h is 1 or 2 digit hour
[M]M is 1 or 2 digit minute
[s]s is 1 or 2 digit second
[i][i]i is optional 1 to 3 digit millisecond
Examples:
'08-12-2024 22:33:44'
'08/12/2024 2:3:4.123'
'yyyy-[m]m-[d]d [h]h:[M]M:[s]s[:[i][i]i]' or 'yyyy/[m]m/[d]d [h]h:[M]M:[s]s[:[i][i]i]'
yyyy is 4 digit year
[m]m is 1 or 2 digit month
[d]d is 1 or 2 digit day
[h]h is 1 or 2 digit hour
[M]M is 1 or 2 digit minute
[s]s is 1 or 2 digit second
[i][i]i is optional 1 to 3 digit millisecond
Examples:
'2024-08-12 22:33:44'
'2024/08/12 2:3:4.123'
'[d]d-mmm-yyyy [h]h:[M]M:[s]s[:[i][i]i]' or '[d]d/mmm/yyyy [h]h:[M]M:[s]s[:[i][i]i]'
[d]d is 1 or 2 digit day
mmm is 3 letter month name
yyyy is 4 digit year
[h]h is 1 or 2 digit hour
[M]M is 1 or 2 digit minute
[s]s is 1 or 2 digit second
[i][i]i is optional 1 to 3 digit millisecond
month names (case insensitive) are:
JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
Examples:
'08-dec-2024 22:33:44'
'08/dec/2024 2:3:4.123'
Any integer larger than the maximum size allowed for integers, any number with a decimal point, or any number in scientific notation is interpreted as a floating-point constant by the expression parser.
String constants are similar to C string constants and represent any text enclosed by double quotes, for example, "This is a string". The maximum size of a string constant defaults to 255 characters. The filter expression parser allows the following escape characters in string constants or character constants:
| Escape char | Value | Explanation |
|---|---|---|
| \a or \A | ASCII 7 | bell |
| \b or \B | ASCII 8 | backspace |
| \f or \F | ASCII 12 | Form Feed |
| \n or \N | ASCII 10 | Linefeed |
| \r or \R | ASCII 13 | Carriage Return |
| \t or \T | ASCII 9 | tab |
| \v or \V | ASCII 11 | vertical tab |
| \\ | ASCII 92 | backslash |
| \any | the character itself | any character not listed above |
Variables
A filter expression variable is the name of a field defined for the table. Variable names are limited to 128 characters and are case sensitive.
When a user specifies a variable name, the filter parser searches the table definition for a field of that name. The parser uses the type of the field and converts it to the types used internally by the expression evaluator. The conversion of field types is as follows:
| Field Type | Data Type | Field Type | Data Type |
|---|---|---|---|
| CT_BOOL | int | CT_SFLOAT | double |
| CT_CHAR | int | CT_DFLOAT | double |
| CT_CHARU | unsigned | CT_FSTRING | char* |
| CT_INT2 | int | CT_FPSTRING | char* |
| CT_INT2U | unsigned | CT_F2STRING | char* |
| CT_INT4 | int | CT_F4STRING | char* |
| CT_INT4U | unsigned | CT_STRING | char* |
| CT_DATE | unsigned | CT_PSTRING | char* |
| CT_TIME | unsigned | CT_2STRING | char* |
| CT_MONEY | int | CT_4STRING | char* |
Please note that "int" is a LONG, "unsigned" is a ULONG and "char*" is a pTEXT.
Field names that match a valid expression reserved word:
Consider a field named "year", which collides with the built-in function YEAR. Use the expression "[year] == 2000" to treat "year" as a field name rather than a function call.
Parentheses
Use parentheses exactly like they are used in C expressions. There are no limits on the number of parentheses you may use in an expression, as long as each open parenthesis has a closing parenthesis. Parentheses are also used to enclose the arguments of built-in functions.
Examples
The functions strcmp(), stricmp(), strncmp() and strnicmp() perform string comparisons. They return 0 when the strings match, -1 when the source string sorts before the comparison string, and 1 when the source string sorts after it. stricmp() and strnicmp() perform case insensitive comparisons. strncmp() and strnicmp() limit the comparison to the first N bytes. If a UTF-8 character uses more than one byte, such as "é", increase N accordingly. For example, "strncmp( name, \"Pelé\", 5 ) == 0".
Exact string matches
These examples show how to filter records by comparing a field value to a constant string.
Exact string match
Include records where the value of a "name" field exactly matches the string "Pele".
"name == \"Pele\""
"strcmp( name, \"Pele\" ) == 0"
"strncmp( name, \"Pele\", 4 ) == 0"
"match( name, \"Pele\" ) == 1"
Exact string non-match
Include records where the value of a "name" field does not match the string "Pele". Matches include "Michael Jordan", and "Babe Ruth".
"name != \"Pele\""
"strcmp( name, \"Pele\" ) != 0"
"strncmp( name, \"Pele\", 4 ) != 0"
"match( name, \"Pele\" ) == 0"
Less-than string match
Include records where the value of a "name" field is less than the string "Pele". Matches include "Michael Jordan", and "Babe Ruth".
"name < \"Pele\""
"strcmp( name, \"Pele\" ) < 0"
"strncmp( name, \"Pele\", 4 ) < 0"
Less-than-equal string match
Include records where the value of a "name" field is less than or equal to the string "Pele". Matches include "Pele", "Michael Jordan", and "Babe Ruth".
"name <= \"Pele\""
"strcmp( name, \"Pele\" ) <= 0"
"strncmp( name, \"Pele\", 4 ) <= 0"
Greater-than string match
Include records where the value of a "name" field is greater than the string "Pele". Matches include "Wayne Gretzky".
"name > \"Pele\""
"strcmp( name, \"Pele\" ) > 0"
"strncmp( name, \"Pele\", 4 ) > 0"
Greater-than-equal string match
Include records where the value of a "name" field is greater than or equal to the string "Pele". Matches include "Pele" and "Wayne Gretzky".
"name >= \"Pele\""
"strcmp( name, \"Pele\" ) >= 0"
"strncmp( name, \"Pele\", 4 ) >= 0"
Case insensitive matches
These examples show how to filter records by comparing a field value to a constant string while doing a case insensitive comparison. For example, the following strings match each other when doing a case insensitive comparison: "Pele", "pele", "PELE", "PeLe", "pELE", "pElE", and so forth.
Case insensitive string match
Include records where the value of a "name" field matches the string "pele" while doing a case insensitive comparison. Matches include "Pele" and "PELE".
"stricmp( name, \"pele\" ) == 0"
"strnicmp( name, \"pele\", 4 ) == 0"
"matchi( name, \"pele\" ) == 1"
Case insensitive string non-match
Include records where the value of a "name" field does not match the string "pele" while doing a case insensitive comparison. Matches include "Michael Jordan", and "Babe Ruth".
"stricmp( name, \"pele\" ) != 0"
"strnicmp( name, \"pele\", 4 ) != 0"
"matchi( name, \"pele\" ) == 0"
Case insensitive less-than string match
Include records where the value of a "name" field is less than the string "pele" while doing a case insensitive comparison. Matches include "Michael Jordan", and "Babe Ruth".
"stricmp( name, \"pele\" ) < 0"
"strnicmp( name, \"PELE\", 4 ) < 0"
Case insensitive less-than-equal string match
Include records where the value of a "name" field is less than or equal to the string "pele" while doing a case insensitive comparison. Matches include "Pele", "Michael Jordan", and "Babe Ruth".
"stricmp( name, \"PeLe\" ) <= 0"
"strnicmp( name, \"pele\", 4 ) <= 0"
Case insensitive greater-than string match
Include records where the value of a "name" field is greater than the string "pele" while doing a case insensitive comparison. Matches include "Wayne Gretzky".
"stricmp( name, \"pElE\" ) > 0"
"strnicmp( name, \"pele\", 4 ) > 0"
Case insensitive greater-than-equal string match
Include records where the value of a "name" field is greater than or equal to the string "pele" while doing a case insensitive comparison. Matches include "Pele" and "Wayne Gretzky".
"stricmp( name, \"PELE\" ) >= 0"
"strnicmp( name, \"pele\", 4 ) >= 0"
Partial matches
These examples show how to filter records by comparing against the first N characters of a field.
NOTES:
- When using strncmp(), characters in the comparison string are not included in the comparison when they occur after the specified character count. For example, "strncmp( name, \"Mike\", 1 ) == 0", matches names where the first letter is "M".
- When using strncmp(), you can use a string of any length for partial comparison. Be sure to specify the proper number of characters to include in the comparison. For example, "strncmp( name, \"Mich\", 4 ) == 0", matches names where the first four letters are "Mich", such as "Michael Jordan" and "Michael Schumacher", but not "Muhammad Ali".
- When using strncmp() to compare a UTF-8 string that uses more than one byte per character increase N accordingly. For example, "strncmp( name, \"Pelé\", 5 ) == 0" matches all names in the name field that begin with "Pelé".
Partial string match
Include records where the first character in a "name" field exactly matches the string "M". Matches include "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strncmp( name, \"M\", 1 ) == 0"
"match( name, \"M*\" ) == 1"
Partial string non-match
Include records where the first character in a "name" field does not match the string "M". Matches include "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strncmp( name, \"M\", 1 ) != 0"
"match( name, \"M*\" ) == 0"
Partial less-than string match
Include records where the first character in a "name" field is less than the string "M". Matches include "Babe Ruth".
"strncmp( name, \"M\", 1 ) < 0"
Partial less-than-equal string match
Include records where the first character in a "name" field is less than or equal to the string "M". Matches include "Babe Ruth", "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strncmp( name, \"M\", 1 ) <= 0"
Partial greater-than string match
Include records where the first character in a "name" field is greater than the string "M". Matches include "Pele" and "Wayne Gretzky".
"strncmp( name, \"M\", 1 ) > 0"
Partial greater-than-equal string match
Include records where the first character in a "name" field is greater than or equal to the string "M". Matches include "Michael Jordan", "Michael Schumacher", "Muhammad Ali", "Pele" and "Wayne Gretzky".
"strncmp( name, \"M\", 1 ) >= 0"
Case insensitive partial matches
These examples show how to filter records by comparing against the first N characters of a field while doing case insensitive string comparisons.
Case insensitive partial string match
Include records where the first character in a "name" field matches the string "M" or "m" while doing a case insensitive comparison. Matches include "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strnicmp( name, \"m\", 1 ) == 0"
"matchi( name, \"m*\" ) == 1"
Case insensitive partial string non-match
Include records where the first character in a "name" field does not match the string "M" or "m" while doing a case insensitive comparison. Matches include "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strnicmp( name, \"m\", 1 ) != 0"
"matchi( name, \"m*\" ) == 0"
Case insensitive partial less-than string match
Include records where the first character in a "name" field is less than the string "M" or "m" while doing a case insensitive comparison. Matches include "Babe Ruth".
"strnicmp( name, \"m\", 1 ) < 0"
Case insensitive partial less-than-equal string match
Include records where the first character in a "name" field is less than or equal to the string "M" or "m" while doing a case insensitive comparison. Matches include "Babe Ruth", "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"strnicmp( name, \"m\", 1 ) <= 0"
Case insensitive partial greater-than string match
Include records where the first character in a "name" field is greater than the string "M" or "m" while doing a case insensitive comparison. Matches include "Pele" and "Wayne Gretzky".
"strnicmp( name, \"m\", 1 ) > 0"
Case insensitive partial greater-than-equal string match
Include records where the first character in a "name" field is greater than or equal to the string "M" or "m" while doing a case insensitive comparison. Matches include "Michael Jordan", "Michael Schumacher", "Muhammad Ali", "Pele" and "Wayne Gretzky".
"strnicmp( name, \"m\", 1 ) >= 0"
Contains Matches
The locate() function finds the location of one string within another. It returns 0 when it doesn't find an exact, case sensitive match; otherwise, it returns the character position of the first match. The first character is position 1.
The substring() function returns a string out of another string starting at a specific character position for the specified number of bytes up to the end of the string. The first character is position 1.
The right() function returns the rightmost specified number of bytes of a string.
The left() function returns the leftmost specified number of bytes of a string.
TIP: A match can be made case insensitive by wrapping the field name in the lower() function and comparing it to a lower case string constant:
"locate( lower(favoriteSaying), \"there\" ) == 1"
Contains string
Include records where the "favoriteSaying" field contains "there". Matches include "There is no 'i' in team but there is in win." and "Once something is a passion, the motivation is there.".
"locate( favoriteSaying, \"there\" ) >= 1"
Does not contain string
Include records where the "favoriteSaying" field does not contain "is". Matches include "Every strike brings me closer to the next home run." and "Float like a butterfly, sting like a bee.".
"locate( favoriteSaying, \"is\" ) == 0"
Contains string after the Nth character
Include records where "favoriteSaying" field contains "there" after the first 10 bytes. Matches include "There is no 'i' in team but there is in win.".
"locate( favoriteSaying, \"there\", 10 ) >= 10"
Contains beginning-string
Include records where the first 3 characters in a "name" field match the string "Muh". Matches include "Muhammad Ali".
"left(name, 3) == \"Muh\""
Does not contain beginning-string
Include records where the first 3 characters in a "name" field do not match the string "Muh". Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Muhammad Ali".
"left(name, 3) != \"Muh\""
Contains end-string
Include records where the 3 rightmost characters in a "name" field match the string "Ali". Matches include "Muhammad Ali", but not "Babe Ruth", "Pele", and "Wayne Gretzky".
"right(name,3) == \"Ali\""
Does not contain end-string
Include records where the 3 rightmost characters in a "name" field do not match the string "Ali". Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Muhammad Ali".
"right(name,3) != \"Ali\""
Contains mid-string
Include records where the 3rd and 4th characters in a "name" field match the string "ch". Matches include strings like "Michael Jordan" and "Michael Schumacher".
"substring(name,3,2) == \"ch\""
Does not contain mid-string
Include records where the 3rd and 4th characters in a "name" field do not match the string "ch". Matches include strings like "Babe Ruth", "Pele", "Muhammad Ali", and "Wayne Gretzky", but not "Michael Jordan" and "Michael Schumacher".
"substring(name,3,2) != \"ch\""
Wildcard Matches
The match() function does exact string comparisons using wildcard characters. It returns 1 when the comparison matches and 0 when it does not.
The period "." character matches any one character.
The asterisk "*" character matches any number of characters; thus, when the asterisk character is used, it should be the last wildcard character.
Wildcard string match
Include records where the characters in a "name" field match the wildcard string "M.c.a*". Matches include "Michael Jordan" and "Michael Schumacher".
"match( name, \"M.c.a*\" ) == 1"
Wildcard string non-match
Include records where the characters in a "name" field do not match the wildcard string "M.c.a*". Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"match( name, \"M.c.a*\" ) == 0"
Wildcard end-string match
Include records where the 3 rightmost characters in a "name" field match the wildcard string "A.i". Matches include strings like "Muhammad Ali".
"match( right(name,3), \"A.i\" ) == 1"
Wildcard end-string non-match
Include records where the 3 rightmost characters in a "name" field do not match the wildcard string "A.i". Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Muhammad Ali".
"match( right(name,3), \"A.i\" ) == 0"
Wildcard mid-string match
Include records where the 3rd and 4th characters in a "name" field match the wildcard string "c.". Matches include strings like "Michael Jordan" and "Michael Schumacher".
"match( substring(name,3,2), \"c.\" ) == 1"
Wildcard mid-string non-match
Include records where the 3rd and 4th characters in a "name" field do not match the wildcard string "c.". Matches include strings like "Babe Ruth", "Pele", "Muhammad Ali", and "Wayne Gretzky", but not "Michael Jordan" and "Michael Schumacher".
"match( substring(name,3,2), \"c.\" ) == 0"
Case insensitive wildcard Matches
The matchi() function does case insensitive string comparisons using wildcard characters. It returns 1 when the comparison matches and 0 when it does not.
The period "." character matches any one character.
The asterisk "*" character matches any number of characters; thus, when the asterisk character is used, it should be the last wildcard character.
Case insensitive wildcard string match
Include records where the characters in a "name" field match the wildcard string "M.c.a*" while doing a case insensitive comparison. Matches include "Michael Jordan" and "Michael Schumacher".
"matchi( name, \"m.c.a*\" ) == 1"
Case insensitive wildcard string non-match
Include records where the characters in a "name" field do not match the wildcard string "M.c.a*" while doing a case insensitive comparison. Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Michael Jordan", "Michael Schumacher", and "Muhammad Ali".
"matchi( name, \"M.C.A*\" ) == 0"
Case insensitive wildcard end-string match
Include records where the 3 rightmost characters in a "name" field match the wildcard string "A.I" while doing a case insensitive comparison. Matches include strings like "Muhammad Ali".
"matchi( right(name,3), \"A.I\" ) == 1"
Case insensitive wildcard end-string non-match
Include records where the 3 rightmost characters in a "name" field do not match the wildcard string "a.i" while doing a case insensitive comparison. Matches include strings like "Babe Ruth", "Pele", and "Wayne Gretzky", but not "Muhammad Ali".
"matchi( right(name,3), \"a.i\" ) == 0"
Case insensitive wildcard mid-string match
Include records where the 3rd and 4th characters in a "name" field match the wildcard string "C." while doing a case insensitive comparison. Matches include strings like "Michael Jordan" and "Michael Schumacher".
"matchi( substring(name,3,2), \"C.\" ) == 1"
Case insensitive wildcard mid-string non-match
Include records where the 3rd and 4th characters in a "name" field do not match the wildcard string ".H" while doing a case insensitive comparison. Matches include strings like "Babe Ruth", "Pele", "Muhammad Ali", and "Wayne Gretzky", but not "Michael Jordan" and "Michael Schumacher".
"matchi( substring(name,3,2), \".H\" ) == 0"
Compare string fields in the same record
You can compare the values of two fields in the same record.
"name == favoritesaying"
"name != favoritesaying"
"name >= favoritesaying"
"strncmp( name, favoritesaying, 1 ) > 0"
"stricmp( name, favoritesaying ) == 0"
"strnicmp( name, favoritesaying, 1 ) > 0"
Predefined Functions
The FairCom DB conditional expression parser has numerous built-in functions for advanced conditional filtering possibilities. These are arranged and described in the following categories:
C Language Equivalents
These built-in functions closely follow C library functions.
| Function | Explanation |
|---|---|
| int atoi( char* String ) | convert string to integer |
| int atol( char* String ) | convert string to long |
| double atof( char* String ) | convert string to double |
| int cabs( int Value ) | return the absolute value of a complex number |
| int labs( int Value ) | return the absolute value of a long integer |
| double fabs( double Value ) | return the absolute value of a double |
| double ceil( double Value ) | return the ceiling of a value |
| double floor( double Value ) | return the floor of a value |
| double fmod( double r, double t ) | return the remainder of r by t as a double |
| int strlen( char* String ) | return the length of a string |
| int strcmp( char* s, char* t ) | compare two strings |
| int stricmp( char* s, char* t ) | compare strings without regard to case |
| int strncmp( char* s, char* t, int len) | compare characters of two strings |
| int strnicmp( char* s, char *t, int len ) | compare characters of two strings without regard to case |
| int memcmp(char* s, char *d, int len) | compare bytes of two memory locations |
| int u_strcmp( UChar* s1, UChar* s2 ) | compare two Unicode strings |
String and Text Functions
| Function | Return Type | Description |
|---|---|---|
| ASCII ( char_expression ) | int |
Returns the ASCII value of the first character of char_expression.
|
| CHR ( integer_expression ) | text |
Returns a character string with the first character having an ASCII value equal to the argument expression.
|
| CONCAT ( char_expression1 ,char_expression2 ) | text |
Returns a concatenated character string formed by concatenating char_expression1 with char_expression2.
|
| FIELD ( offset, size ) | text | Returns the content of record at offset with length size as if it were a CT_ARRAY field. |
| INSERT ( char_expression1, start, length, char_expression2 ) | text |
Returns a character string where length characters have been deleted from char_expr1 beginning at start and char_expr2 has been inserted into char_expr1, beginning at start.
|
| LEFT ( char_expression, count ) | text |
Returns the leftmost count of characters of char_expression.
|
| LENGTH ( char_expression ) | int |
Returns the number of characters in char_expression.
|
| LOCATE ( char-expr1, char-expr2 [ , start ] ) | int |
Returns the location of the first occurrence of char-expr2 in char-expr1. If the function includes an optional integer argument start LOCATE begins searching char-expr1 at that position. If the function omits start, LOCATE begins its search at the beginning of char-expr1. LOCATE denotes the first character position of a character expression as 1. If the search fails, LOCATE returns 0.
|
| LOWER ( char_expression ) | text |
Returns the result of the character expression after converting all characters to lowercase.
|
| LPAD ( char_expression, length [ , pad_expression ] ) | text |
Returns char_expression padded with optional pad_expression until the length of the resulting string is equal to argument length.
|
| LTRIM ( char_expression [, char_set ] ) | text |
Removes leading characters in char_expression that are present in char_set and returns the resultant string. Thus, the first character in the result is guaranteed not to be in char_set. If char_set is omitted, the function removes leading spaces from char_expression.
|
| MATCH ( char_expression, pattern_expression ) | int |
Compares char_expression to pattern_expression. MATCH returns 1 if the expression matches the pattern and zero if the expression does not match.
The following special characters can be used in pattern_expression:
|
| MATCHI ( char_expression, pattern_expression ) | int |
Compares char_expression to pattern_expression as a case-insensitive comparison. MATCHI returns 1 if the expression matches the pattern and zero if the expression does not match.
The following special characters can be used in pattern_expression:
|
| MEMCMP ( char_expression1, char_expression2, count ) | int |
Compares count bytes of char_expression1 and char_expression2 and returns zero if both strings are equal, a negative value if char_expression1 is less than char_expression2 , or a positive value if char_expression1 is greater than char_expression2.
|
| REPEAT ( char_expression, count ) | text |
Returns a character string composed of char_expression repeated count times.
|
| RIGHT ( char_expression, count ) | text |
Returns the rightmost count characters of char_expression.
|
| ROWID (0, 0) | double | Returns the numeric row ID value if available. |
| RPAD ( char_expression, length [ , pad_expression ] ) | text |
Returns char_expression padded with pad_expression until the length of the resulting string is equal to the length argument.
|
| RTRIM ( char_expression [, char_set ] ) | text |
Removes trailing characters in char_expression that are present in char_set and returns the resultant string. Thus, the last character in the result is guaranteed not to be in char_set. If char_set is omitted, the function removes trailing spaces from char_expression.
|
| SPACE ( expression ) | text |
Returns a character string consisting of a number of spaces specified by expression.
|
| SUBSTRING ( char_expression, start [, length ] ) | text |
Returns the substring of char_expression beginning at start and length characters long. If length is not specified, the substring from start to the end of char_expression is returned.
|
| UPPER ( char_expression ) | text |
Returns char_expression after converting all characters to uppercase.
|
Date and Time Functions
| Function | Return Type | Description |
|---|---|---|
| ADD_DAYS ( date_expression, integer_expression ) | DATE |
Adds the specified number of integer_expression days to date_expression and returns the resultant date value.
|
| ADD_MONTHS ( date_expression, integer_expression ) | DATE |
Adds the specified number of integer_expression months to date_expression and returns the resultant date value.
|
| CURDATE ( ) | DATE |
Returns the current date as a DATE value. This function takes no arguments. |
| CURTIME ( ) | TIME |
Returns the current time as a TIME value. This function takes no arguments. |
| DAYNAME ( date_expression ) | text |
Returns a character string for the day name portion of date_expression. date_expression can be the name of a column, the result of another scalar function, or a DATE or TIMESTAMP literal. The return value is one of SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY or SATURDAY. |
| DAYOFMONTH ( date_expression ) | int |
Returns the day of month of date_expression as a short integer value in the range 1 - 31.
|
| DAYOFWEEK ( date_expression ) | int |
Returns the day of week of date_expression as an integer value in the range of 1 – 7, where 1 is Sunday, 2 is Monday, etc.
|
| DAYOFYEAR ( date_expression ) | int |
Returns the day of year of date_expression as an integer value in the range 1 - 366.
|
| HOUR ( time_expression ) | int |
Returns the hour of time_expression as an integer value in the range 0 - 23.
|
| LAST_DAY ( date_expression ) | DATE |
Returns the date corresponding to the last day of the month containing date_expression.
|
| MINUTE ( time_expression ) | int |
Returns the minute value of time_expression as an integer in the range 0 - 59.
|
| MONTH ( date_expression ) | int |
Returns the month number of date_expression as an integer value in the range 1 - 12.
|
| MONTHNAME ( date_expression ) | text |
Returns the name of the month (for example, JANUARY through DECEMBER) for the month portion of date_expression.
|
| NOW () | TIMESTAMP |
Returns the current date and time as a TIMESTAMP value. This function takes no arguments. |
| QUARTER ( date_expression ) | int |
Returns the quarter of date_expression as an integer value in the range 1 - 4.
|
| SECOND ( time_expression ) | int |
Returns the seconds value of time_expression as an integer in the range 0 - 59.
|
| SYSDATE () | DATE |
Returns the current date as a DATE value. This function takes no arguments, and trailing parentheses are optional. This function is similar to CURDATE. |
| SYSTIME () | TIME |
Returns the current time as a TIME value. This function takes no arguments, and trailing parentheses are optional. This function is similar to CURTIME. |
| SYSTIMESTAMP () | TIMESTAMP |
Returns the current date and time as a TIMESTAMP value. This function takes no arguments, and trailing parentheses are optional. |
| TIMET2CTDATE | CTDATE | convert integer Unix time_t value to a c-tree Date type (CTDATE) |
| TIMET2CTTIME | CTTIME | convert integer Unix time_t value to a c-tree Time type (CTTIME) |
| TIMET2CTTIMES | CTTIMES | convert integer Unix time_t value to a c-tree Timestamp type (CTTIMES) |
| WEEK ( date_expression ) | int |
Returns the week of date_expression as an integer value in the range 1 - 53.
|
| YEAR ( date_expression ) | int |
Returns the year of date_expression as an integer value in the range 0 - 9999.
|
Mathematical Functions
| Function | Return Type | Description |
|---|---|---|
| ABS ( expression ) | int |
Computes the absolute value of expression.
|
| ACOS ( expression ) | double |
Returns the arccosine of expression. ACOS takes the ratio (expression) of two sides of a right triangle and returns the corresponding angle. The ratio is the length of the side adjacent to the angle divided by the length of the hypotenuse. The result is expressed in radians and is in the range 0 to PI radians. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
| ASIN ( expression ) | double |
Returns the arcsine of expression. ASIN takes the ratio (expression) of two sides of a right triangle and returns the corresponding angle. The ratio is the length of the side opposite the angle divided by the length of the hypotenuse. The result is expressed in radians and is in the range -PI/2 to PI/2 radians. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
| ATAN ( expression ) | double |
Returns the arctangent of expression. ATAN takes the ratio (expression) of two sides of a right triangle and returns the corresponding angle. The ratio is the length of the side opposite the angle divided by the length of the side adjacent to the angle. The result is expressed in radians and is in the range -PI/2 to PI/2 radians. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
| CABS ( expression ) | int | Calculates the absolute value of a complex number. |
| CEIL ( expression ) | double |
Returns the smallest integer greater than or equal to expression.
|
| CEILING ( expression ) | double |
Returns the smallest integer greater than or equal to expression.
|
| COS ( expression ) | double |
Returns the cosine of expression. COS takes an angle (expression) and returns the ratio of two sides of a right triangle. The ratio is the length of the side adjacent to the angle divided by the length of the hypotenuse. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
| EXP ( expression ) | double |
Returns the exponential value of expression (e raised to the power of expression).
|
| FABS ( expression ) | double |
Returns the absolute value of expression.
|
| FLOOR ( expression ) | int |
Returns the largest integer less than or equal to expression.
|
| FMOD ( expression1, expression2 ) | double |
Calculates expression1 modulo expression2, the remainder of expression1 / expression2.
|
| LABS ( expression ) | integer |
Computes the absolute value of an INTEGER value.
|
| LOG10( expression ) | double |
Returns the base 10 logarithm of expression.
|
| MOD ( expression1, expression2 ) | int |
Returns the remainder of expression1 divided by expression2.
|
| PI ( ) | number | Returns the constant value of pi as an approximated numeric value. This function takes no arguments. |
| POWER( expression1, expression2 ) | double |
Returns expression1 raised to the power of expression2.
|
| RAND( [ expression ] ) | int |
Returns a randomly-generated number, using expression as an optional seed value.
|
| SIGN ( expression ) | int |
Returns +1 if expression > 0, -1 if expression < 0, and 0 if expression = 0.
|
| SIN ( expression ) | double |
Returns the sine of expression. SIN takes an angle (expression) and returns the ratio of two sides of a right triangle. The ratio is the length of the side opposite the angle divided by the length of the hypotenuse. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
| SQRT ( expression ) | double |
Returns the square root of expression.
|
| TAN ( expression ) | number |
Returns the tangent of expression. TAN takes an angle (expression) and returns the ratio of two sides of a right triangle. The ratio is the length of the side opposite the angle divided by the length of the side adjacent to the angle. To convert degrees to radians, multiply degrees by PI/180. To convert radians to degrees, multiply radians by 180/PI.
|
Type Casting
The filter expression parser allows you to use explicit type casts in expressions. This is very useful if you are comparing fields of different types and want to control the result of an expression.
For example, suppose "Salary" is a CT_MONEY field and "Average" is a CT_DFLOAT field; type casts can be used as illustrated in the following expression: (Salary - (int)Average) > 500
The following type casts may be used in conditional expressions:
- (int) or (long): Convert the result of expression to integer (32 bit).
- (unsigned [int | long]): Convert the result of expression to unsigned integer (32 bit).
- (double): Convert the result of expression to double.
You cannot type cast a string expression.
Automatic Type Promotion
When mixing different types in an expression without explicit type casting, the conditional expression parser automatically promotes the types using the following rule:
- signed and unsigned integers - promoted to unsigned integer (64-bit)
- signed integer and double - promoted to double
- unsigned integer and double - promoted to double
In the great majority of cases, mixing strings with numeric values returns a parsing error.
Operators
The following operators are allowed in filters and conditional expressions.
Mathematical Operators
| + | Adds two operands |
| - | Subtracts two operands or negates an operand (e.g., -5) |
| * | Multiplication |
| / | Division |
| % | Modulus |
Relational Operators
| == | Equal to |
| != | Not equal to |
| < | Less than |
| <= | Less or equal to |
| > | Greater than |
| >= | Greater than or equal to |
Logical Operators
| && | And |
| || | Or |
| ! | Not |
Binary Operators
| & | And |
| | | Or |
| ~ | Not |
| ^ | Xor |
NULL Operators
| IS NULL |
| IS NOT NULL |
Custom Application Expressions
Use the conditional expression parser/analyzer to evaluate application-specific expressions. The FairCom DB expression parser is a full standalone feature in and of itself, and can be used directly in your own applications for purposes other than data filters, conditional indexes, or partitioned files. Two core API functions provide a powerful interface into this advanced expression handling.
FairCom DB’s expression parser/analyzer requires two key steps:
- Call cndxparse() to parse your expression, producing an expression tree that the expression analyzer later evaluates.
- Call cndxeval() to evaluate your expression tree using data from a buffer in memory.
For a complete sample program, see ctexpr.c in the ctree/source directory.
Parsing Expressions
Parsing your expression involves three steps:
- Define a DODA structure.
- Parse the DODA into a record schema and field name list.
- Parse your expression to produce an expression tree.
Sample code to perform these steps is shown below. This code assumes FairCom DB has been initialized prior to calling ctparsedoda().
Expression Parsing Example
#include "ctcndx.h" /* For PTREE type */
/* Define a DODA structure. */
DATOBJ doda[] = {
{"CustomerNumber", 0, CT_INT4U},
{"ZipCode", 4, CT_FSTRING, 9},
{"State", 13, CT_FSTRING, 2},
{"LastName", 15, CT_STRING, 37},
{"FirstName", 52, CT_STRING, 37},
{"Address", 89, CT_STRING, 49},
{"City", 138, CT_STRING, 37}
};
COUNT retval; /* Return code. */
pTEXT schema; /* Record schema. */
pTEXT names; /* Field name list. */
PTREE ptree; /* Expression tree. */
pTEXT expr; /* Expression string. */
/* Parse the DODA into a record schema and field name list. */
if ((retval = ctparsedoda(doda, 7, &schema, &names)) != 0)
printf("Error %d parsing DODA.\n", retval);
/* Parse your expression to produce an expression tree. */
expr = "stricmp(LastName, \"Smith\") == 0 "
       "&& CustomerNumber > 10000";
ptree = cndxparse(schema, names, expr, strlen(expr));
if (!ptree)
printf("Error: Unable to parse expression.\n");
else
printf("Successfully parsed expression.\n");
To invoke the expression parser to parse a string expression and produce an expression tree, call cndxparse(), which is declared as follows:
PTREE cndxparse(pConvMap Schema, pTEXT Names, pTEXT InputText,
NINT InputTextSize)
Schema is a pointer to a record schema derived from a DODA definition. Names is a pointer to a list of the field names from the DODA. InputText points to a NULL-terminated string expression, and InputTextSize is the length of InputText.
One of the most useful features of the expression parser is its ability to associate symbolic names in expressions with data in a buffer in memory. To use this ability, you must define a record schema, known as a DODA (Data Object Definition Array). A DODA is an array of field specifications, each of which contains a field name, field offset, field type, and a field length. By providing the expression parser with a DODA, you may include references to DODA field names in your expressions. See the sample code shown later in this section for an example of a DODA definition.
While a DODA is conveniently defined in your application using an array of DATOBJ structures, cndxparse() does not take a DODA in DATOBJ form, but instead accepts a record schema and a list of the field names from the DODA. In order to simplify converting your DODA into the required record schema and field name list, FairCom has written a utility function, ctparsedoda(). This function can be found in the sample file ctexpr.c.
ctparsedoda() is declared as follows:
COUNT ctparsedoda(pDATOBJ doda, UCOUNT numfld, ppTEXT ppschema,
ppTEXT ppnames)
Evaluating Expressions
Having produced an expression tree, you are ready to evaluate the expression using the expression analyzer. To evaluate your expression, call cndxeval(), which is declared as follows:
COUNT cndxeval(PTREE Tree, pVOID Recptr, pConvMap Schema)
- Tree is an expression tree returned by cndxparse().
- Recptr points to a buffer containing data you wish to evaluate with your expression.
- Schema is the record schema ctparsedoda() produced from your DODA definition. The record schema is used to associate data in Recptr with field names specified in your expression.
Note: Before you call cndxeval() the first time, you must ensure that a run-time stack has been allocated for the expression analyzer.
To summarize, evaluating your expression involves three steps:
- Allocate a run-time stack for the expression analyzer (first time only).
- Set up a buffer containing the field data used in the expression.
- Evaluate the expression.
If you wish, you can repeat steps 2) and 3) multiple times. Sample code to perform these steps is shown below. It is assumed the Get_Buffer() routine allocates a record buffer and initializes it with data conforming to the field definitions specified in the DODA.
Expression Evaluation Example
COUNT retcidx; /* Result of expression evaluation. */
pTEXT recbuf; /* Record buffer. */
/* Allocate a run-time stack for the expression analyzer (first time only). */
if (!ctcidxStk) {
ctcidxStk = (pVOID) getcndxmem(CNDX_MAX_STACK * ctSIZE(PLEAF));
if (!ctcidxStk) {
printf("Unable to allocate memory for run-time stack.\n");
ctrt_exit(1);
}
}
/* Set up a buffer containing the field data used in the expression. */
Get_Buffer(&recbuf);
/* Evaluate the expression. */
retcidx = cndxeval(ptree, recbuf, (pConvMap)schema);
if (retcidx<0)
printf("The expression cannot be evaluated for this record "
       "- error %d.\n", uerr_cod);
else if (retcidx)
printf("The expression evaluates to TRUE for this record.\n");
else
printf("The expression evaluates to FALSE for this record.\n");
Remember to always free any memory used by your record schema, field name list, and expression tree when you are finished with them. Use the following FairCom DB functions to do so:
- mbfree(schema)
- mbfree(names)
- cndxfree(ptree)
Variable-length Records with Conditional Expressions
When using data filters or conditional indexes with variable length data files, a record retrieval that does not bring back the entire record will return a CVAL_ERR error (598), which indicates the expression could not be evaluated.
There are two types of retrievals that result in less than the entire record being read:
- Calling the fixed-length versions of the ISAM routines such as FirstRecord() or NextRecord() instead of FirstVRecord() or NextVRecord(). The fixed-length calls cannot be used to read variable-length records with a data filter or conditional index.
- Calling the variable-length versions of the ISAM routines with a buffer length insufficient to hold the entire record.
When an ISAM call fails (with CVAL_ERR (598) or some other error), the current ISAM position is NOT updated. Therefore the following pseudo-code sequence will NOT work because the FirstRecord() did not establish the failing record as the current ISAM position, and the GETVLEN() call would reference the record at the current ISAM position before the FirstRecord() call:
SETFLTR(datno,...);
if (FRSREC(...) == CVAL_ERR) {
    vlen = GETVLEN(datno);
    rc = FRSVREC(..., &vlen);
}
Using the variable-length versions of the ISAM routines provides a workable approach. The following pseudo-code works, with one proviso - the subsequent calls to the ISAM routine can also fail with a CVAL_ERR (598) because they may have skipped forward to an even larger record:
SETFLTR(datno,...);
oldlen = vlen;
if (FRSVREC(..., bufr, &vlen) == CVAL_ERR && oldlen < vlen) {
    free(bufr);
    oldlen = vlen;
    bufr = calloc(vlen, 1);
    rc = FRSVREC(..., bufr, &vlen);
}
The second call to FirstVRecord() could also return the CVAL_ERR (598) because while the record that originally caused the CVAL_ERR (598) can now be read completely, if it failed the filter, the next records will be read automatically until a record is found that passes the filter; but these subsequent reads can also encounter a record that is bigger than the new buffer size.
The following pseudo-code loop should work with any of the variable length versions of the ISAM calls:
SETFLTR(datno,...);
oldlen= vlen;
while (xyzVREC(..., bufr, &vlen) == CVAL_ERR && oldlen < vlen) {
    free(bufr);
    oldlen = vlen;
    bufr = calloc(vlen, 1);
}
if (isam_err)
then problem or no record found
else
success
If one knows ahead of time that there is a maximum record length for the file, then simply using a buffer of this known maximum size eliminates the need to loop over the CVAL_ERR (598) caused by an insufficient buffer size.
Conditional Expression Callback Function
FairCom DB provides a developer-defined callback function to perform additional advanced custom data filtering and conditional index evaluation. Instead of calling c-tree’s internal expression evaluator to analyze criteria, a user-defined function is called, allowing advanced filtering and control with application specific code.
Callback Functions
The module ctclbk.c contains the routines declared below. ctfiltercb() is called when a user-defined filter must be evaluated. Whenever a file open retrieves stored conditional index callback expressions, when a conditional index callback expression is created, or when a SetDataFilter() callback expression is created, the ctfiltercb_init() routine is called with a pointer to the callback expression. Whenever a file is closed, or a SetDataFilter() is cleared, the ctfiltercb_uninit() routine is called for each callback expression.
NINT ctfiltercb_init(pTEXT Clbk pinHan)
NINT ctfiltercb(pTEXT Clbk, pVOID Recptr, pConvMap Schema,
VRLEN fixlen, VRLEN datlen pinHan)
NINT ctfiltercb_uninit(pTEXT Clbk pinHan)
| Parameter | Description |
|---|---|
| pTEXT Clbk | Pointer to a NULL terminated ASCII string beginning with the ctCNDXclbkCHR (@) character. Presumably, the string starting in the 2nd position (i.e., Clbk + 1) points to a callback function designator. |
| pVOID Recptr | Pointer to a record image |
| pConvMap Schema | Pointer to a record schema |
| VRLEN fixlen | Fixed length of the record image |
| VRLEN datlen | Full length of the record image |
| pinHan | A macro that expands to ctWNGV for standalone or client libraries defining ctNOGLOBALS, or to lctgv for Server or Bound Server code. |
ctfiltercb_init() is expected to return zero (0) on error and non-zero on successful initialization, though at this point the error return is ignored. ctfiltercb_init() should set a state variable so that ctfiltercb_uninit() knows whether initialization was performed for this particular callback filter.
ctfiltercb_uninit() is expected to return zero (0) on error and non-zero on successful uninitialization, though at this point the error return is ignored. ctfiltercb_uninit() should check the state variable to determine whether ctfiltercb_init() was called for this particular callback filter.
Stub functions with simple debug print statements are part of the base distribution. A developer taking advantage of expression callback routines must adapt these callback functions to their particular requirements.
Callback Expression String
As described above, Clbk points to an ASCII string beginning with the "at" sign ‘@’. The string following ‘@’ is completely arbitrary, and is interpreted to determine what type of callback routine is intended, assuming more than one type of callback is required by the application. A simple scheme would use callback strings with a unique character in the second position (following ‘@’). This permits a simple switch statement routing callbacks to desired code, as shown in the following pseudo-code:
/*
** my callback strings are: @CustomerNumber
** @ZeroBalance
** @TotalFunds
*/
switch (*(Clbk + 1)) {
case 'C':
do the Customer Number check
break;
case 'Z':
do the Zero Balance check
break;
case 'T':
do the Total Funds check
break;
default:
set the return code to -CVAL_ERR to indicate the
filter could not be evaluated
break;
}
In this example scheme, the only significant portion of the callback designator is the first character after the @.
Rebuild/Compact Utility Callback Feature
SetCallbackOnRebuild() sets a pointer to a callback function that is periodically called by the file reconstruction functions RebuildIFile(), RebuildIFileXtd(), CompactIFile(), and CompactIFileXtd(). The callback function can be used for progress notification and to implement a custom user interface for rebuild utility programs.
Within your application, create a function conforming to the prototype below:
void CallbackProc(ULONG tick, TEXT flag, TEXT* text);
Call SetCallbackOnRebuild() with a pointer to this function and a step value determining the polling frequency for the callback function. Your callback procedure can handle or ignore the various returns at your option. Our ctrblex.c example uses the information to display a progress bar.
To enable the callback feature, set the RB_CALLBACK define in ctoptn.h or the ctree.mak that creates ctoptn.h.
See the ctrblex.c example and SetCallbackOnRebuild for additional information.
Portable Data Through UNIFRMAT Support
The UNIFRMAT feature enables c-tree to store data in a byte order that is not native to the environment in which it is operating. When UNIFRMAT is defined on a HIGH_LOW platform, (Motorola 68000 processors, Sun Sparc Processors, and others), FairCom Servers or c-tree standalone applications will store data and index files in LOW_HIGH format, thus allowing interchangeable databases between these and LOW_HIGH platforms.
Conversely, when UNIFRMAT is defined on a LOW_HIGH platform, (Intel 80x86 processors, Alpha, and others), c-tree will store data in HIGH_LOW format, allowing it to be readable on HIGH_LOW platforms.
The general rule for UNIFRMAT starting with c-tree V7.12 is:
- Do not define UNIFRMAT on platforms with which you wish to operate in the native byte-order format.
- Define UNIFRMAT on any platforms which you wish to support opposite byte ordering.
Note: To determine whether your CPU is storing binary data in LOW_HIGH or HIGH_LOW byte order, compile and execute cttest.
Modes
There are two modes of UNIFRMAT available: Automatic and Manual. FairCom recommends using a DODA to implement Automatic mode.
Automatic Mode
This is the default mode when using the extended initialization routines, such as InitISAMXtd(). With automatic UNIFRMAT support, the byte flipping of all binary fields managed by FairCom DB and all of the binary fields in your data records is automatically handled by FairCom DB.
For FairCom DB to properly manipulate your binary fields, a dynamic object definition array, DODA, must be defined and a PutDODA() call must be made after the file is created.
Manual Mode
In this mode only the binary fields managed by FairCom DB are transposed. The byte flipping of any binary fields stored in your data records is your responsibility. To implement this mode use the non-extended initialization routines, such as InitISAM(), or make sure that USERPRF_NDATA is OR-ed into the user profile mask, userprof, argument of the extended initialization routines.
Note: This mode does not require, or use, the data object definition array (DODA).
UNIFRMAT Server
c-tree’s standalone models have supported the concept of UNIFRMAT for years. In fact, UNIFRMAT is one of c-tree’s most popular features. UNIFRMAT allows machines with alternate architectures related to byte ordering to read and operate on each other’s data.
In the early years of c-tree, the best example of the utilization of this feature was allowing applications in Apple’s Macintosh platform (HIGH_LOW environment) to operate on Windows (LOW_HIGH environment) data.
FairCom extends this feature to the FairCom Server:
- “HIGH_LOW Host” Server: Operates on LOW_HIGH data (that is, the FairCom Server on Sun Sparc may read Windows data), but not on native HIGH_LOW data.
- “LOW_HIGH Host” Server: Operates on HIGH_LOW data (that is, the FairCom Server on Intel Linux may read Sun data), but not on native LOW_HIGH data.
This allows an existing system to be replaced with an opposite system without converting the data files. For example, a standard FairCom Server on a Sun Sparc system (HIGH_LOW) could be replaced with a LOW_HIGH Host FairCom Server on a Linux (Intel) system by simply installing the new Linux FairCom Server software and copying the existing data to the Linux system.
Limitations
UNIFRMAT support is only available for non-transaction processed files. An application will receive error FTYP_ERR (53) if it attempts to access a TRNLOG, PREIMG, or LOGIDX file.
The FairCom Server requires a specially generated executable to support UNIFRMAT. You must also relink client applications with the latest libraries. Older clients return CUNF_ERR (735) when connecting to a UNIFRMAT Server.
A special UNIFRMAT Server can only access files of the opposite flavor to the operating system. If a client attempts to access a native-format file, the Server returns error HDR8_ERR (672). For example, a UNIFRMAT Server for an Intel-based Linux machine (native LOW_HIGH format) can only access HIGH_LOW files.
A DODA is required for automatic data translation and index management if key segments include binary fields. See Automatic Mode.
Data Alignment
Although byte ordering can be handled transparently by FairCom DB, data alignment is still YOUR RESPONSIBILITY in a stand-alone environment or when a DODA is not present.
To determine the alignment rules for each environment in use, compile and execute cttest. The best way to approach data alignment from a portability perspective is to align your structures for the most strictly aligned environment. For example, to port a data file from an environment that places all integers on a byte boundary to an environment that places all integers on a word (2-byte) boundary requires placing all integers on a word (2-byte) boundary.
It may be necessary to add char fields as padding to achieve proper alignment, or to reorder the members of structures. In addition, grouping all data types together in the record structure has the added benefit that once the first occurrence of a data type is aligned properly, all subsequent occurrences in the same group are also properly aligned.
Multiple Instance Support
The ctNOGLOBALS define, default in c-tree, places all c-tree global variables in an allocated c-tree global control structure, ctWNGV. This is handled invisibly in most c-tree models and should not interfere with the normal operation of your application.
If your application uses multiple instances or threads, a more detailed understanding of the global structures is required. See Multiple Instance Support and Multi-threaded API for more information.
Basics
FairCom DB ctNOGLOBALS support, defined by default, places all FairCom DB global variables in an allocated structure, the FairCom DB global control structure, ctWNGV. This logic was originally developed for two primary purposes:
- To provide a multi-user FairCom DB DLL. By moving the global variables to an allocated structure, it was possible to allocate FairCom DB structures for each application which shares the same FairCom DB DLL. This is useful in environments where FairCom DB is built as a DLL, such as Windows and NT.
- To reduce DGROUP requirements. Large d-tree applications such as the d-tree catalog (DTCATLOG) could exceed the 64K DGROUP space limitation with some 16-bit compilers. NO GLOBAL support resolves this problem because variables are placed in allocated structures rather than absorbing DGROUP space (data memory).
When a FairCom DB initialization call is made, FairCom DB checks to see if the structure has been allocated by the application. If the structure has not been allocated, FairCom DB allocates the structure automatically. If additional instances are necessary, use the instance functions listed below. If a pointer is needed to the global structure, declare ctWNGV in your application and set it to the current instance with:
ctWNGV = (cCTGVAR) GetCtreePointer((pTEXT) WhichCtree());
| RegisterCtree() | Register an instance to FairCom DB. |
| UnRegisterCtree() | Unregister an instance from FairCom DB. |
| SwitchCtree() | Make the supplied registration ID active. |
| NextCtree() | Change to the next registered instance. |
| WhichCtree() | Return the current FairCom DB instance ID. |
These functions are described in the function reference section. For examples of this feature, see the Windows samples wtixmg and wtlxmg.
Multiple Instance support is replaced by the thread management API when using the multi-threaded FPUTFGET and client libraries.
Automatic Windows 3.1 Instance Support
Multiple instance support can be controlled automatically for you. This is the recommended approach for Windows 3.1 due to the way memory is shared across multiple instances. This feature is not necessary for Windows 95 and Windows NT systems because these operating systems provide unique memory space to each process, unlike Windows 3.1. The #define ctPORTAUTOWIN enables automatic instance support. If this #define is used, the application is NOT required to call RegisterCtree(), SwitchCtree(), or UnRegisterCtree(). These functions will be called automatically.
However, it is still possible to call RegisterCtree() manually, which gives additional control: RegisterCtree() returns the pointer to the FairCom DB global structure, allowing the pointer to be held locally. With this pointer, you can print out the FairCom DB error variables, file number, and so on.
Note: If you prefer not to call RegisterCtree() manually, you can obtain the FairCom DB global structure pointer by embedding WhichCtree() inside a call to GetCtreePointer(). See RegisterCtree and WhichCtree in the function reference section for additional information.
In summary, you only need to do two things to take advantage of this Automatic Instance Support for Windows 3.1 feature:
- Define ctPORTAUTOWIN in ctoptn.h. The FairCom m-tree make system recreates the ctoptn.h during each execution of the generated make file.
- Ensure no calls to SwitchCtree() or UnRegisterCtree() are in the application.
Pre-Function Capability
A special function, ct_win_inst(), is automatically called prior to each FairCom DB API call. This function automatically calls RegisterCtree(), SwitchCtree(), and UnRegisterCtree() when the Automatic Instance Support is enabled for Windows 3.1. This capability can be used without the Automatic Windows Instance logic by enabling the following define:
#define ctPREFNC ct_win_inst
This allows you to call your own function by either placing your function logic within a function named ct_win_inst or by setting the ctPREFNC define to your function name. This is useful for any type of operation done prior to each function call. However, keep in mind the overhead introduced into applications by doing time-consuming operations.
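A sketch of wiring the hook to an application routine follows. The pre-function is shown taking no arguments, which is an assumption; verify the expected signature against your c-tree headers before use.

```c
/* In ctoptn.h (or before any FairCom DB headers): route the
   pre-function hook to an application routine. */
#define ctPREFNC my_prefnc

/* Called automatically before every FairCom DB API call.
   Keep it fast: its cost is paid on every call. */
VOID my_prefnc(VOID)
{
    /* e.g., per-call tracing, statistics, instance switching */
}
```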
Millions of Open Files
In licensed enterprise editions of V12 and later, c-tree supports opening more than 32,767 files. This enhancement has been designed so no application changes should be required. Each individual database connection is still limited to a maximum of 32,767 files (preserving the original definition of data and index file numbers as two-byte signed values, using the COUNT data type). Internally, new data types were introduced that support up to 2 billion files. Note that we have a compile-time limit of 1,000,000 files to reduce the memory footprint of the c-tree database engine. If you need more than 1,000,000 open files, please contact FairCom.
If opening many files, consider the possibility of using multiple c-tree database engines to host the files, rather than having one c-tree database engine hosting all the files. This is recommended because the files will compete for resources such as data and index cache, file system cache, operating system kernel memory, and transaction logs. Note the c-tree database engine is very resource friendly and it’s possible to execute multiple occurrences of the engine on the same host computer (be sure to comply with the c-tree license which requires a license per installed instance).
Compatibility Notes:
Transaction log compatibility: The checkpoint log entry now contains a 4-byte number of open files rather than a 2-byte value. This means that the transaction logs created by a server without 4-byte file number support are incompatible with a server that uses 4-byte file number support, and vice-versa. When this incompatibility is detected, the database engine fails to start up with error LFRM_ERR (666), "incompatible log format."
The following message in CTSTATUS.FCS indicates a server with 4-byte file numbers found transaction logs that use 2-byte file numbers:
- User# 00001 Incompatible log file format [10: 45800400x 47a00490x 02200090x]
- User# 00001 L0000001.FCS
The following message in CTSTATUS.FCS indicates a server with 2-byte file numbers found transaction logs that use 4-byte file numbers (or some other new feature that this server doesn't support):
- User# 00001 Incompatible log file format [5: 44800400x 07a00490x 43200090x]
- User# 00001 L0000001.FCS
Be sure to review the server upgrade best practices in the FairCom knowledgebase. The following document lists the only recommended procedures for safely upgrading to a server that has a transaction log file format change: Steps to Upgrade a FairCom DB Server.
Standalone Mode Utilities: Remember that this transaction log incompatibility also affects standalone mode utilities such as ctrdmp and ctldmp. These utilities must be compiled with 4-byte file number support in order to read transaction logs that were created by a database engine that supports 4-byte file numbers.
New API function: ctGetOpenFilesXtd()
The function ctGetOpenFiles() uses the ctFILINF structure, which contains a two-byte field for the system file number. To support retrieving the full four-byte system file number, we introduced the function ctGetOpenFilesXtd(), which uses the new ctFILINFX structure containing a 4-byte system file number field. An application that calls ctGetOpenFiles() still receives the ctFILINF structure, so the output for each file contains a 2-byte system file number; when the file number is greater than 32,767, this function returns -1 for the system file number. To avoid this limitation, call ctGetOpenFilesXtd() instead. Note that the ctadmn utility has been updated to call ctGetOpenFilesXtd(); if that call returns an unsupported function number, ctadmn falls back to ctGetOpenFiles().
State Variable: The c-tree connection-level state variable isam_fil is still a 16-bit value. A new state variable, isam_fil32, holds the full 32-bit file number in case of an error.
License
Support for handling more than 32,767 files is a licensed feature. The maximum number of files that can be specified by the FILES keyword is limited to 32,767 files when the feature is not enabled.
Performance Optimization
This section discusses information you can use to optimize performance, such as caching.
PAGE_SIZE (sect) optimization
This section applies to all FairCom DB operational models.
The sect argument (the third argument to InitISAM() and to the parameter file initialization record) determines the node size (PAGE_SIZE) of the index tree.
Note: If sect is changed, all indexes affected by the change MUST be rebuilt.
Due to the vast number of variables that affect the setting of the sect parameter (key length, number of key values, etc.) it is impossible to provide strict guidelines for setting this value. The best method for setting this value is by performing time trials with various settings on the target platform.
A good “rule of thumb” for determining a starting point for the sect setting is to set the node size of the index to the operating system page size (often referred to as block size).
index node size (in bytes) = sect * 128
Note: The client-side sect setting is ignored by FairCom DB. Instead of relying on the client sect setting, FairCom DB specifies the index node size using the configuration keyword PAGE_SIZE. This keyword accepts the index size in bytes, rather than a multiple of 128 as the sect setting accepts.
Index cache size
This section applies to single-user mode and Standalone Multi-user mode with READ_ONLY and ctEXCLUSIVE file modes.
The index cache size is approximated with the following formula:
memory (in bytes) = bufs * (sect * 128 + MAXLEN + 128)
bufs is the first argument to InitISAM() and the first number in the parameter file initialization record. Increasing bufs allocates more memory for the index cache. The cache uses a hashing algorithm, so the more memory available to the cache, the better the performance. If bufs is set too high, the initialization function (InitISAM() for ISAM structures or OpenISAM() for parameter files) fails with SPAC_ERR (10) or a similar out-of-memory error. MAXLEN is the maximum key length defined in ctopt2.h (defaults to 1024 bytes).
The FairCom DB Standalone Multi-user mode forces all reads and writes directly to disk (except for files opened with the file mode set to READ_ONLY or ctEXCLUSIVE), bypassing the cache. Therefore, set bufs to a minimal value, such as 6.
See IDX_MEMORY in the FairCom Server Administrator’s Guide for more information.
Data cache size
This section applies to single-user mode only.
The data cache size is calculated with the following formula:
dbufs * ((sect + 1) * 128)
Extended initialization calls, InitISAMXtd(), InitCTreeXtd(), and OpenISAMXtd(), allow the dbufs to be specified. If non-extended calls are made, dbufs defaults to bufs (index buffers).
See DAT_MEMORY in the FairCom Server Administrator’s Guide for additional information.
Rebuild and Compact Optimization
In addition to the other options given, the following adjustments will optimize rebuild and compact speeds.
Single-user Standalone Index Cache
Utilities built around the RebuildIFile() function should use Single-user Standalone libraries to take advantage of index caching. To increase the size of the index cache, increase the number of index buffers, i.e., the bufs parameter to InitISAM(). The index cache uses a hashing algorithm, so the larger the index cache, the faster the rebuild will be. The size of the index cache can be calculated as follows:
memory(in bytes) = bufs *(sect * 128 + MAXLEN + 128)
MAXLEN, the maximum key length defined in ctopt2.h, defaults to 1024 bytes.
The sort buffer size has a default maximum of 16,000 bytes. Use #define ctMAXSORTBUF in ctoptn.h/ctree.mak when building your library to specify a larger maximum for an additional speed boost.
Sort Files
Rebuild times can also be optimized by increasing the number of temporary sort files used. Increase the #define MAX_HANDLES found in ctsort.h from its default of 50 to a larger number, up to the maximum of 255. The best value is a function of the size of the index to be rebuilt and the number of available file handles, and is best determined by performing timing tests on the target system. The temporary files created during rebuild are FairCom DB files. After changing ctsort.h, recompile the entire FairCom DB library (removing the old object files and library first) as well as the rebuild application.
NO_IDXENT
This section applies to Standalone Multi-user mode only.
The NO_IDXENT #define can be added to ctoptn.h to improve Standalone Multi-user performance by disabling the internal count of index entries. When NO_IDXENT is defined, NbrOfKeyEntries() is not available. See NbrOfKeyEntries in the function reference section for further information.
Disable Key Buffer Support for Faster ISAM Access
ISAM record retrieval performance can be enhanced by suppressing the evaluation of all ISAM keys. For each user, each ISAM data file is assigned a set of key buffers containing the current key image of each index for the current ISAM position. To speed record access, UpdateHeader() accepts a mode, ctISAMKBUFhdr, which toggles support of the key buffers for a particular ISAM data file for the calling user.
Key buffer support is enabled and disabled with UpdateHeader() as shown in the sections below. Several considerations apply.
Disable key buffer support
UpdateHeader(datno, (LONG) 1, ctISAMKBUFhdr) turns off key buffer support. The effect on performance will be most noticeable when the data file supports many indexes, especially if the indexes take advantage of the omit/select logic. Any non-zero LONG value for the second parameter turns off support.
Enable key buffer support
UpdateHeader(datno,(LONG) 0,ctISAMKBUFhdr) turns on key buffer support, the default state of an ISAM file.
Considerations
- Without key buffer support, the application cannot perform updates to the file, including adds, deletes, or rewrites; updates require key buffers. Attempting an update returns error KBUF_ERR (121).
- Unlike the previous UpdateHeader() modes, this mode does not make a permanent change to the file header, does NOT require the file to be opened exclusively, and applies only to the user making the call.
- UpdateHeader(...,ctISAMKBUFhdr) returns FMOD_ERR (48) when called for an index file.
- When the key buffer support is turned off, the current ISAM position for the data file is set to zero which implies no current ISAM position.
- If key buffer support is off, you cannot traverse a file by one key and then perform a NextRecord() or PreviousRecord() for a different key. Only the key that established the ISAM position, through calls such as FirstRecord(), GetRecord(), or GetGTERecord(), can be used in NextRecord() or PreviousRecord(). Traversing the data file in physical order does not use the ISAM key buffers.
See UpdateHeader (UpdateHeader, UpdateHeader).
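A sketch of the toggle in use follows. It assumes datno and keyno come from a prior ISAM open, and process() and handle_error() are hypothetical application routines; verify the FirstRecord()/NextRecord() calling conventions shown here against the function reference.

```c
/* Speed up a read-only scan by disabling key buffer maintenance
   for this user on this data file (non-zero second parameter). */
if (UpdateHeader(datno, (LONG) 1, ctISAMKBUFhdr))
    handle_error();

if (!FirstRecord(keyno, recbuf)) {        /* establishes ISAM position */
    do {
        process(recbuf);
    } while (!NextRecord(keyno, recbuf)); /* same key only */
}

/* Re-enable before any add, delete, or rewrite; updates while
   key buffers are off fail with KBUF_ERR (121). */
UpdateHeader(datno, (LONG) 0, ctISAMKBUFhdr);
```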
Process All Files Forward and Backward
Frequently, the newest and most relevant data is appended at the end of data files, for example, application audit logs and other time-series data. While indexes can be used for reverse traversal, reading records backwards from the end of the file toward the beginning is the fastest method to process this data.
FairCom DB has always been able to read records backwards using an index, or when resources are not enabled on a data file. For example, you can create an index on RECBYT and use it to read records backwards in the order they are stored in the data file. This is fast, but it is even faster to read records directly out of a data file without using an index.
This enhancement allows FairCom DB to read records backwards directly out of a data file without using an index. It only applies to data files containing fixed-length records. Data files containing variable-length records still require an index to read records in reverse order.
Compatibility
- This enhancement works when the database runs as a server supporting one or more clients (client-server mode).
- It also works in non-server single-user mode and non-server multi-user mode.
- This new functionality does not affect the client library. Thus, existing applications can take advantage of this feature without recompiling or linking to a new version of the client library.
- You must install the latest version of the database server or recompile or link to the latest version of the database when using it in non-server mode.
Limitations
- This enhancement does not work in non-server multi-threaded multi-user mode.
- Data files containing variable-length records still require an index to read records in reverse order.
Database Internals
Before this enhancement, FairCom DB did not allow reading a data file in reverse physical order if that file contained resource records. For example, calling LSTREC() on such a file would fail with error FMOD_ERR (48).
FairCom DB imposed this restriction because resource records can span more than one fixed-length record block in the data file. Thus, scanning backward through a data file could cause the record pointer to end up in the body of a resource record. This is not a problem when reading records in forward order because the start of the resource includes its length, which makes it easy for FairCom DB to skip over the resource record.
FairCom DB eliminated this limitation by creating a list of resource records that contains the offset and length of each one. This list is constructed on the first backward scan call. When a resource is added, deleted, or updated, the resource list is refreshed.
Enhanced Precision for International Languages
FairCom supports the optional use of CharUpper() instead of toupper() when calling the c-tree function ctrt_toupper(). This enhanced support is for Windows only.
The differences between these two functions are rather subtle in general; however, they become quite obvious when dealing with international languages. This is best understood using an example:
- toupper() converts “Téc1” to “TéC1”
- CharUpper() converts “Téc1” to “TÉC1”
Client/Server Support
Follow these steps to enable this feature in client/server mode:
- Specify the following keyword in the FairCom Server configuration file:
COMPATIBILITY USE_CHARUPPER
- Add #define ctUSE_CHARUPPER to the makefile used to compile your c-tree client library as shown below:
echo #define ctUSE_CHARUPPER >>$(fcTOM)\ctoptn.h
Standalone Support
To enable this feature in standalone mode, add #define ctUSE_CHARUPPER to the makefile used to compile your c-tree standalone library as shown below:
echo #define ctUSE_CHARUPPER >>$(fcTOM)\ctoptn.h
FairCom DB API Multi-Record "Virtual" Tables
A common technique in many early applications was to combine multiple types of records into a single physical data file. Many of those applications have been very successful for many years and are still in service. This technique was a common practice when storage space was at a premium. However, this technique does not conform to the standard, single-schema, relational model necessary for SQL access.
Developers are looking at adding more complex features and interoperability over their data with new technologies such as web access and Windows .NET support. This has proven to be a challenge when the data is not of a consistent schema due to multiple record types in a single table.
FairCom DB allows a single table with multiple schemas (multiple record types) to appear to SQL as multiple virtual tables, each with a single schema.
For information about multi-record tables, see the FairCom DB API Developer's Guide.
Helpful Internal Functions
The following internal functions may be of use to developers. Although they are not part of the formal FairCom DB API, they demonstrate useful methods of accomplishing specific tasks. Developers may be interested in them as examples.
File System First/Next
FairCom DB reads file names from a disk volume with the internal function ct_findfile() using ‘findfirst/findnext’ type operations. Although not a formal FairCom DB API call, this “file names on disk list” function could be of low-level use to developers. The various implementations are controlled with #defines for specific platforms.
Look for the following function declaration in ctinod_a.c for more information.
LONG ct_findfile(pctOS_FILE_FIND_INFO ctOsFindFile);
Internal Disk Space Available Function
FairCom has created an internal function, ct_diskavl(), to determine the available disk space on a volume. Although not a formal part of the API, this cross-platform implementation may be of use to some developers working at a low level on several platforms. The various implementations are controlled by #defines. See the following function declaration in ctinod_a.c for more information.
ULONG ct_diskavl(pTEXT fn);
Internal Wildcard Matching Logic
Although not a formal FairCom DB API call, ctmatch() is a useful wildcard pattern matching function of low-level use to developers. This is the same wildcard logic used for years in FairCom’s r-tree report engine product.
NINT ctmatch(pp, ep, tp, et, casen)
pTEXT pp;    /* starting ptr to the pattern */
pTEXT ep;    /* ending ptr to pattern */
pTEXT tp;    /* starting ptr to target */
pTEXT et;    /* ending ptr to target */
NINT  casen; /* YES => case sensitivity */
For more information, see ctmatch() in ctsstr_a.c.
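A hypothetical usage sketch based on the declaration above. The return-value convention, the supported wildcard characters, and whether the ending pointers are inclusive are not documented in this chapter, so treat all three as assumptions to verify in ctsstr_a.c.

```c
#include <string.h>

/* Match a transaction log file name against a wildcard pattern. */
pTEXT pattern = (pTEXT) "L*.FCS";
pTEXT target  = (pTEXT) "L0000001.FCS";

NINT rc = ctmatch(pattern, pattern + strlen((char *) pattern),
                  target,  target  + strlen((char *) target),
                  YES);   /* case-sensitive compare */
```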
Date Routine Added to ctsadm
The r-tree date output routines are in the ctsadm.c module. This gives developers access to these routines, which is useful because the r-tree date format is used in the SystemLog() logic and for the FairCom Driver date fields.
Multi-Threaded API
FairCom DB includes a full featured cross-platform thread management API for the multi-user stand-alone model and the multi-threaded client/server model. This API gives applications enhanced performance and better logical program flow. In addition, this API takes care of all system specific threading issues when moving between operating systems, even those not supporting native threading. The ability for advanced inter-thread communication further strengthens the power of this robust API.
FairCom has years of experience in developing multi-threaded applications. Multi-threading has been built into all FairCom Servers since the late 80’s. Operating systems not supporting native threading required a custom thread manager, including a high speed task manager/scheduler. FairCom’s ctThrd API system encapsulates this expertise into an easy to use, system-independent, thread management interface. Extensive experience with a variety of system-specific thread management packages has further enhanced our development expertise.
Introduction to Threads
The concept of an application having multiple threads has been around for some time. Various microprocessor-based operating systems have supported threads, or some form of add-on thread package, for years; the pthreads library is a typical example on Unix and Linux systems. Multithreaded applications perform best when executing on hardware with multiple CPUs where true multitasking is available.
Each thread of an application can be considered as a separate flow of execution, each thread operating in parallel with the other threads. Different threads may be executing the same code sequence or different code sequences. In the latter case, one thread may handle keyboard input, another thread may handle a communications port, and yet another thread may perform complex calculations.
The power of threads not only resides in the ability to have multiple flows of execution within an application, but also in the clever way these multiple threads may interact and interrelate. Here, we will discuss some fundamentals related to multi-threaded application development, and introduce the ctThrd API, FairCom’s powerful, portable thread control/management subsystem.
Some Thread Fundamentals
Threads, sometimes referred to as lightweight tasks, allow an application process to be divided into precise sub-processes executing concurrently. Conceptually, a thread is the smallest amount of processor context state necessary to encapsulate a computation. Practically speaking, a thread consists of a register set, a program counter, and a stack. Due to the minimal context state requirements, threads have a very fast context switch time. Since threads operate within the application’s context, each thread has full application global access and shares the same address space, file access paths, and other system resources associated with the application.
Although threads have many benefits, the two most important are ease of logical program structure and performance. Program structure is simplified because each application task can be coded as an almost independent subsystem. If tasks interact and/or share resources, they must use synchronization objects, which the API provides. Performance is enhanced since some threads can make progress while one or more other threads may be in a wait state. For example, a keyboard thread waiting for a keystroke does not have to block all other code executions.
Concurrent thread execution means two or more threads are in process at the same time. If one thread blocks for some reason, another thread from the same program executes in its place. This feature is especially relevant to the FairCom Server, or any other I/O bound application. Parallelism occurs when two or more threads execute simultaneously across multiple processors, utilizing the power of multi-processor systems.
Multi-threaded benefits, particularly performance, are directly related to efficient thread management. This is precisely why FairCom determined that a well written, portable thread management subsystem was the key to its multi-thread support. By allowing our ctThrd API functions to handle your thread management needs, you will have an efficient, system-independent approach for your multi-threaded application development.
Thread API Summary
The FairCom DB Threading API is a thin layer over the native thread API, which should be considered an advantage. Besides taking care of the FairCom DB internal necessities (initialization needs), this API is also extremely portable: the ctThrd API is available for all FairCom platforms by supporting either native OS threads or FairCom’s proprietary thread manager. In addition, inter-thread communication has been incorporated, which allows threads to communicate with each other through an efficient queue mechanism.
Important: When using a multi-threaded FairCom DB library, ctThrdInit() MUST be called before c‑tree is initialized for the first time and only threads created with ctThrdCreate(), or attached with ctThrdAttach(), can call FairCom DB functions. The rest of the ctThrd API is an optional cross-platform threading standard.
ctThrd Function Overview
The multi-threaded client API is defined below. FairCom’s ctThrd API is grouped into three categories:
Thread Management Functions
ctThrdInit
Initialize thread management. REQUIRED.
ctThrdTerm
Terminate thread management.
ctThrdCreate
Create a thread.
ctThrdExit
Clean up after a thread.
ctThrdAttach
Make an existing thread FairCom DB compatible.
ctThrdDetach
Clean up the FairCom DB compatibility created with ctThrdAttach().
ctThrdSleep
Sleep a thread for a given time.
ctThrdData
Get pointer to private data area.
ctThrdDataSet
Set pointer to private data area.
ctThreadHandle
Return a small thread ID.
Thread Communication
Inter-thread FIFO/LIFO queue message handlers:
ctThrdQueueOpen
Open a queue returning a small queue handle.
ctThrdQueueClose
Close the queue specified by the queue handle.
ctThrdQueueWrite
Write a queue message to the queue specified by the queue handle.
ctThrdQueueRead
Read a queue message (with timeout).
ctThrdQueueWriteDirect
Pass a queue message pointer to the queue specified by the queue handle.
ctThrdQueueReadDirect
Read a queue message pointer (with timeout).
ctThrdLIFOWrite
Write message at the head of the queue.
ctThrdQueueMlen
Return length of the message at the head of the queue (with timeout).
ctThrdQueueCount
Return the number of messages in the queue.
Thread Synchronization
Thread synchronization can be separated into 3 groups as follows:
- Mutex - mutual exclusion synchronization. The same thread that gets the mutex must release it.
ctThrdMutexInit
Initialize a mutex.
ctThrdMutexGet
Acquire the mutex. Wait until available.
ctThrdMutexTry
Acquire the mutex immediately or return.
ctThrdMutexRel
Release the mutex, permitting other threads to contend for it.
ctThrdMutexCls
Close the mutex, making it inactive.
- Block - a block differs from a mutex in two significant respects:
- It may be released by a thread other than the thread that acquired it.
- A timeout value may be specified which causes a ctThrdBlockGet() or ctThrdBlockWait() to return with an NTIM_ERR (156) if the call could not succeed in the specified timeout interval. Like a mutex, a block can be acquired by only one thread at a time.
- Semaphore - a semaphore is a synchronization object that may be acquired by more than one thread at a time, and may be released by a thread that has not acquired it.
ctThrdSemapInit
Initialize a semaphore.
ctThrdSemapGet
Get one unit of the semaphore. The thread will block indefinitely until a unit of the semaphore is available.
ctThrdSemapTry
Same as ctThrdSemapGet() except that if no unit of the semaphore is available, NTIM_ERR (156) is returned immediately.
ctThrdSemapRel
Release one unit of the semaphore.
ctThrdSemapCls
Close the semaphore, making it inactive.
Thread API Details
This section provides details on the functions making up the FairCom Threading API. Note the following points:
- ctThrdInit() must be called before any FairCom DB API calls, including initialization.
- FairCom DB must be initialized in each thread individually, and only in threads created with ctThrdCreate() or attached with ctThrdAttach().
- The ctThrd API and the Instance API (RegisterCtree(), etc.) are not compatible in Multithreaded Standalone model. Each thread should contain exactly one instance of FairCom DB. This is not an issue with the Multithreaded Client model.
- Functions specifying a time interval may use ctNOWAIT to indicate a zero interval and ctWAITFOREVER to indicate an indefinite interval.
- Unless stated otherwise, a return of zero indicates success. uerr_cod is not set by these functions, however sysiocod is set when unexpected threading errors occur.
- To override the default stack size defined in ctatrd.h, add the following code to ctoptn.h (the FairCom DB Option Header) or to your application before any FairCom DB headers are included. If you add it to ctoptn.h, recall that ctoptn.h is dynamically created during each execution of the make file generated by m-tree, ctree.mak, so make the change in the ctoptn.h creation logic at the end of ctree.mak.
#define ctAppSTACKSIZE stack_size
The following error codes are particularly relevant to the threading API:
| Value | Symbolic Constant | Explanation |
|---|---|---|
| 90 | NQUE_ERR | Queue ID out of range. |
| 92 | QMRT_ERR | Queue memory error during write. |
| 95 | STSK_ERR | Could not start thread. |
| 129 | NINT_ERR | FairCom DB not initialized. |
| 156 | NTIM_ERR | Timeout occurred. |
| 514 | CQUE_ERR | Queue has been closed. |
| 635 | TSYC_ERR | Could not create thread synchronization object. |
| 636 | TSYF_ERR | Thread sync object Get failed. |
| 637 | TSYR_ERR | Thread sync object Rel(ease) failed. |
| 638 | TQUE_ERR | Queue message truncated to fit buflen. |
| 639 | TZRO_ERR | Semaphore must be initialized with count > 0. |
| 640 | TINT_ERR | Semaphore already initialized. |
| 641 | TSYX_ERR | Thread sync object close failed. |
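As a rough orientation, the required startup order can be sketched as follows. The prototypes and argument lists shown are assumptions for illustration only; check ctatrd.h and the function reference for the exact signatures in your version.

```c
/* ASSUMED signatures -- for illustration only. */
static VOID worker(pVOID arg) {
    /* FairCom DB must be initialized in each thread individually. */
    if (InitISAM(6, 2, 16))
        ;   /* handle isam_err here */
    /* ... database work ... */
    CloseISAM();
    ctThrdExit();   /* clean up after this thread */
}

int main(void) {
    /* REQUIRED before c-tree is initialized for the first time. */
    if (ctThrdInit(8 /* assumed: max threads */, 0, NULL))
        return 1;
    /* Only threads created with ctThrdCreate() (or attached with
       ctThrdAttach()) may call FairCom DB functions. */
    ctThrdCreate(worker, NULL, NULL);
    /* ... coordinate shutdown, e.g. via a ctThrd queue ... */
    ctThrdTerm();
    return 0;
}
```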
Examples
Following is a brief discussion of example programs, located in ctreeSDK\ctreeAPI\ctree\samples\isam, that demonstrate the FairCom threading API. See the full descriptions in the Sample Programs.
ctmtap.c and wtmtap.c - Multi-threaded testing programs
FairCom’s Multi-threaded API sample and performance testing programs. These sample programs demonstrate FairCom’s Multi-thread API: the application spawns one or more threads, each of which runs a trial of database operations.
ctmtap divides the screen into columns, with the output from each thread placed in its own column. If you specify more than eight threads, only the output from the first eight threads is shown. This limit is for on-screen clarity and can be changed by editing the #define for MAX_COLS in ctmtap.c. Regardless of the value of MAX_COLS, all threads will be running; however, output is shown for only MAX_COLS threads.
wtmtap is a Win32 GUI version of ctmtap.
In V10.3 and later, the ctmtap test program supports command-line options to specify the user name and password. If these options are not specified, ctmtap connects as guest. The options are:
- uUSERID
- pPASSWORD
The following example connects as user ADMIN to server FAIRCOMS, creates new files using the ctTRNLOG mode and runs 8 threads, each adding 100000 records:
ctmtap uADMIN pADMIN sFAIRCOMS fc mT t8 aA n100000
ctmtex.c - Multi-threaded testing program
A multi-threaded, incremental file based, variable-length sample based on ctixmg.c.