Partitioned Files

Use Cases

 

Quickly Purge Data

Many applications collect continuous data that goes out of scope within a period of time. Common examples are time based log and auditing data and even financial transactions. When this data is stored in a very large file, removing bulk portions is time consuming, generally requiring an operation such as the following

delete all data where key value older than target

An alternative is to store associated data sorted into key ranged data files. That requires application processing of new files and dropping old files, while keeping file naming on disk unique. This technique is classically known as sharding.

A better approach is a specialized data file that can be accessed as a single file, and automatically physically partition data based on a single key allowing rapid data management. This FairCom DB feature is termed Partitioned Files.

Applications requiring rapid purging of large data ranges is the target audience for this feature.

 

Archive Data

We described the most common use case of purging data. However, partitioned files can take this to another level as data doesn't have to be just deleted. Data partitions can be marked "offline" and archived for long term storage. This allows large volumes of data to be quickly removed and stored in alternate locations. Archived partitions can later be "reactivated" for searching when needed.

 

Partitioned Files Overview

A partitioned file is composed of a host data file and its associated indexes combined with a rule. The rule determines how to map the partition key value (e.g., a date, or invoice number, or some other characteristic) into a particular partition member of the host. The host file does not contain any actual data, but serves as a logical file that appears to contain all the data.

A partition member file has the same definition as the host, but only holds data whose partition key maps to the member. For example, customer orders may be partitioned by calendar quarter. All orders booked in the same quarter are stored in the same member. A member is composed of a standard c-tree data file and its associated indexes.

To add data to the partitioned file, simply call an add routine for the host. The code will add the record in the proper partition member, creating the partition member if necessary. Rewriting a data record may move the record between partitions if the partition key entry for the record is changed. Under transaction control, such a delete/add record operation is done atomically.

Searching the partitioned file in key order is fairly straightforward and efficient if the key is the partition key (the key used to determine which partition should contain the record). Searches on other, non-partition, keys are less efficient than normal because the record may exist in any of the partition members. The more active partition members there are in the logical file, the less efficient the non-partition key search will be.

It is possible to manage the partition members so that archiving or purging partitions is very quick and efficient.

 

Partitioned File Highlights

  • Maintain data in separate c-tree data/index files, while enjoying access from a single host file.
  • Rapidly purge or archive individual member files.
  • Partitioned files are easily implemented at the ISAM and FairCom DB API development levels with a single additional API call.
  • Easily partition an existing linked file with SQL.
  • Use all standard FairCom DB data searches on the entire file or directly on a member file.

 

Prerequisites

Partitioned file usage depends on specific file attributes to be present. With c-treeDB and SQL created files, these resources and extended headers are transparent to your application. C-based ISAM applications require these extended header attributes be defined at the application layer.

  • The file must be created with Xtd8 creation function (extended file header) with ctPARTAUTO in the Xtd8 file mode.
  • The relative partition index number must be defined in your Xtd8 create block.
XCREblk xcreblk[2] = {0, 0};
xcreblk[0].x8mode |= ctPARTAUTO;
xcreblk[0].prtkey = 0;
  • To use a DODA field in your partition rule, you must define the index with a SCHSEG mode. Beginning December 16, 2023, all other key segment modes are also recognized.
  • The partition rule must evaluate to an integer.

Supported FairCom DB Features

Partitioned files support most, but not all, FairCom DB features.

  • Partitioned files support SQL query across all partitions, with query optimizations for specific range searching.
  • Partitioned files can be created, accessed and administered via ISAM and c-treeDB, C/C++, .NET, and Java interfaces.
  • Partitioned files can be included in hot backups.
  • While not required, HUGE file support is strongly recommended as partitioned data are designed and intended for large volumes of data. However, functions requiring a record offset must use the ctSETHGH() and ctGETHGH() functions, even if the partitioned files are not HUGE to ensure high-order bytes are included.
  • Transaction Processing is fully supported. When enabled, requires transaction-dependent (TRANDEP) creates which is automatically enforced.
  • Auto-numbering SRLSEG is managed from the data file host header to maintain serial numbers used by key segments across all partitions.
  • Two levels of unique keys are supported. Global unique keys across all partitions as well as partition member specific unique keys.
  • Partitioned files can be encrypted being mindful of prior encryption keys required for any reactivated archived member.
  • Host file security attributes are propagated to all partitioned file members including file passwords.
  • Partitioned files can be rebuilt in entirety from the host, or independently at the member level.

 

Creating Partitioned Files

Partitioned File Key Rule Using Conditional Expressions

Conditional expressions make it easy to build a partition rule. A partition rule is a conditional expression ultimately evaluating to a number. That number associates data with a specific partition within the file.

FairCom DB expression parsing can now evaluate an expression into a numeric representation to be used for the partition rule. Many built-in conditional expression functions exist for flexible rule generation. Time and date based functions, numeric functions, and string manipulation functions are all available. FairCom DB expression syntax can even reference complex functions via an external shared library (DLL) in calculating partition numbers.

If the table has an embedded DODA resource and a conditional expression references schema segments, an expression can refer to index key fields explicitly by DODA name. Together with field names, complex expression rules can be crafted.

Prior to FairCom DB V11, partitioned files required these rules to be hard-coded at compile time (specified in ctpart.c) and did not allow run-time flexibility in adapting rules for unique deployed environments.

By combining existing FairCom DB advanced support for conditional expressions with our high performing partitioned files, applications can now create exact expression rules directly when creating partitioned files. In addition, existing partitioned files can have their partition rule changed and subsequently rebuilt based on a new rule logic allowing applications to grow and adapt with their customer needs.

Partitioned files are supported directly from the FairCom DB ISAM API, FairCom DB API and FairCom DB SQL.

Create with SQL

In FairCom DB SQL, a table is partitioned when a partition index is created. To create a partition index with conditional expressions for partitioning, use the following syntax:

CTREATE INDEX .... STORAGE_ATTRIBUTES 'partition=<expression>'

To create a partition index forcing hard-coded rules (pre-V11), the following is supported:

CTREATE INDEX .... STORAGE_ATTRIBUTES 'partition'

To change a partition rule on an existing partitioned file, call ALTER INDEX with your new rule in the STORAGE_ATTRIBUTES clause.

Example

This script demonstrates partition file rules in SQL. It creates a table, prtest, with an index on the integer field f7. The storage_attributes clause define a partition rule in which each record is stored in a partition number equal to the value of field f7 plus 1:

create table prtest (f1 integer, f2 char(50), f3 char (50), f4 char(50), f5 timestamp, f6 varchar(50), f7 integer, f8 time);
create index pridx on prtest (f7) storage_attributes 'partition=f7+1';

With a rich assortment of conditional expression functions available, much more complex rules can be created.

For example, a table containing a field "invoice_date" and requiring monthly partitions can be created with this simple expression:

month(invoice_date)

For syntax details, refer to Conditional Expression Parser topics in the FairCom DB Programmer's Reference and Function Reference Guide.

Example

Using functions to convert Unix time_t fields to c-tree Date and Time types (TIMET2CTDATE) in a partitioned file expression to partition into months since Jan, 2010:


CREATE TABLE unixtest (name CHAR(10), u_date INTEGER)
    or
CREATE TABLE unixtest (name CHAR(10), u_date BIGINT)

CREATE INDEX unixtest_date_idx ON unixtest (u_date) STORAGE_ATTRIBUTES 'partition=( ( YEAR( TIMET2CTDATE( u_date) ) -2010) * 12) + MONTH ( TIMET2CTDATE(u_date))'

    Date String   Unix Date Partition created
    Mon, 23 Feb 2015 11:01:32 GMT 1424689292 62
    Sat, 23 Jan 2016 11:01:32 GMT 1453546892 73
    Tue, 23 Feb 2016 11:01:32 GMT 1456225292 74
    Wed, 23 Mar 2016 11:01:32 GMT 1458730892 75

For syntax details, for the TIMET2* functions, see C Language Equivalents.

Create with FairCom DB API

Partitioned file support is extended to the FairCom DB API API. While creating a file it is possible to call ctdbSetTablePartitionIndexNbr to set the partition index, ctdbSetTablePartitionNumberBits to set the number of bits reserved for partition numbers, and ctdbSetTablePartitionRule to set the partition rule.

On existing tables, after calling the above function you then call ctdbAlterTable forcing an CTDB_ALTER_FULL action.

ctdbSetTablePartitionRule

This FairCom DB API function sets partition rules:

ctdbEXPORT CTDBRET ctdbDECL ctdbSetTablePartitionRule(CTHANDLE Handle, pTEXT expr);

  • expr - The expression, expr, will be evaluated against the key for the partition index. It must evaluate to an integer.

The partition rule uses standard c-tree expression syntax.

Create with ISAM

Partitioned files are created with standard c-tree APIs. A partitioning conditional expression is defined with the PTADMIN Partition Administration API call and the ptADMINrule parameter. Specify an extended (Xtd8) ctPARTAUTO mode in the x8mode parameter of an extended file creation block. While partitioned files require an extended (Xtd8) header they do not have to be HUGE files unless the logical file size will exceed the 2GB/4GB limit.

A partition key is set when the file is created using the prtkey parameter of the extended file creation block. The default is 0 (the first key associated with the data file). Set this value to the relative key number for the desired index if the default is not appropriate for your application.

A partition file name is the base data file name with a 3-digit raw partition number as the file extension. Only automatic naming is available in this mode.

Note: Alternative file naming is possible with custom modifications to the ctpart.c module and recompiling your FairCom Database Engine.

By default, 16 bits of the 64-bit record offset reference the raw partition number, allowing each partitioned file to support up to 65,535 member files over its lifetime, thus also somewhat limiting overall file size (i.e., the more bits used for partition numbering, the smaller the overall size of the file can be). The maximum value is 32-bits (4,294,967,295 member files). This numbering vs. size tradeoff is set at file create time using the callparm parameter of the extended file creation block (Xtd8), where a value of 0 defaults to 16-bits. Values less than 4-bits default to 4-bits (maximum 15 member files).

Note: Default partition naming, partition rules, and maximum number of partitions are used by default when not defined by the Xtd8 extended parameter block and the PTADMIN API call.

 

Managing Partitioned Files

Partitioned File Administration Utility

A partitioned file administration utility, ctpartadmin, is available to script partitioned file administrative actions. It provides all the operations of the API.

  • Add new partition
  • Delete existing partition
  • Archive existing partition
  • Activate archived parition
  • Set partition number low limit
  • Return partition status
  • Rebuild partition
  • Get active low partition number
  • Get active high parition number

Partitioned File Functions in SQL

FairCom DB SQL includes built-in procedures to administer partitions directly from your SQL application. The fc_ptadmin_num() procedure provides access to many basic administration functions.

call  fc_ptadmin_num('admin',  'custmast',  'archive',  123)

To identify current lowest and highest active partition numbers, the built-in SQL procedure fc_get_partbounds() returns this information.

call  fc_get_partbounds('admin',  'custmast')

Partitioned File API Function

The Partition Administration function, PartitionAdmin(), allows on-the-fly adjustment to the partitions associated with a given host file. This includes the capability to:

  • Add, remove, or archive partition(s)
  • Modify lower limit of the raw partition number
  • Modify limit on the number of partitions
  • Reuse the raw partition number of a purged member
  • Activate archived member(s)
  • Return a member file status

For additional information on PartitionAdmin(), see PartitionAdmin.

 

Examples

 

Rules

In V11 and later, user-defined conditional expressions can be used for partition rules, as described in the next section.

Legacy Versions

The partition key can be set when the file is created using the prtkey parameter of the extended file creation block. This value defaults to 0, indicating the first key associated with the data file. Set this value to the relative key number for the desired index if the default is not appropriate for your application.

The default rule, developed for testing purposes only, uses a simple algorithm to divide added records into partitions based on the first byte of the selected key as a test of the partition capability. See the function kprawno() in ctpart.c for the sample algorithm.

Cautions and Restrictions

See "Partition Ordering and Range Query" in Raw Partition Numbers for information on number generating rules.

 

Partitioned Files Available via FairCom DB SQL

Partitioned files are available directly through FairCom DB SQL. As partitioned files rely on a key value to partition the data, the option is part of the CREATE INDEX statement. By including the STORAGE_ATTRIBUTES clause with the PARTITION option, the file will be rebuilt to enable partitioned support. All other operations on the file will continue as usual.

Note: Partitioned file support requires a custom build of FairCom DB SQL with the partition rule compiled from the ctpart.c module.

FairCom DB SQL CREATE INDEX Syntax

CREATE [ UNIQUE ] INDEX index_name
         ON table_name
       ( {column_name [ASC | DESC]} [, ...])
       [ STORAGE_ATTRIBUTES 'attributes' ];

New Argument:

STORAGE_ATTRIBUTES 'attributes'

A quoted string that specifies index attributes that are specific to FairCom DB SQL indexes. FairCom DB SQL interprets the following attribute strings:

  • 'PARTITION' - Creates the index as the partition index to the table enabling multiple table partitions. This option can be specified after a table has been created, and the table will be rebuilt according to the partition rule in effect.

 

Partitioning by Windows Date Values

The previous date partition rule was implemented for a Unix time value (epoch 1 Jan 1970) and was based on a key value type of double (float).

This revision introduces a modification to the kprawno() function to interpret the first 8 bytes of the partition key value as a Windows local file time timestamp (that is, the timestamp contains the number of 100-nanosecond intervals since January 1, 1601). This option is useful when an application uses the .NET DateTime.ToFileTime() method to generate the timestamp values that are used in the partition key.

Internally, this support converts Windows format to Unix format for flexibility in the partition logic. As a result, it is only enabled for Windows compiles and allows for both formats to potentially be used.

 

Partition by Timestamp Rule Modified to Support GMT/UTC Time

When an attempt to read data from a partitioned file on a machine whose time zone differs from the time zone on the machine that stored the data, the data was not found. This happened as the function that maps the partition key, a timestamp, to a partition number, converts the timestamp to year, month, and day using the local time zone. To provide a consistent mapping of timestamp values to year, month, and day (which determines the partition number), an additional option has been added using GMT instead of the local time zone to convert the timestamp value.

 

New GETFIL() Modes to Retrieve First and Last Active Partition Members

When working with partitioned files, it is useful to quickly locate the first (the "oldest" when partitioning by date) and last ("newest") partition members for administrative purposes. Two modes have been added to the GetCtFileInfo() function to support this ability.

  • FRSACTPRT - returns the first active partition number for the file
  • LSTACTPRT - returns the last active partition number for the file

If GETFIL() returns -1, check the value of uerr_cod. If uerr_cod is zero, the file has no active partitions. If uerr_cod is non-zero, an error occurred. For example, if the specified file number does not correspond to a partition host file, GETFIL() returns -1 and sets uerr_cod to PHST_ERR.

Example

    COUNT datno;
    NINT  rc;
    LONG  partno;

if ((partno = GETFIL(datno, FRSACTPRT)) == -1L) {
if ((rc = uerr_cod))
printf(
"Error: Failed to get first active partition number: %d\n",
rc);
else
printf("The file has no active partitions.\n");
} else
printf("first active partition: %d\n", partno);

if ((partno = GETFIL(datno, LSTACTPRT)) == -1L) {
if ((rc = uerr_cod))
printf(
"Error: Failed to get last active partition number: %d\n",
rc);
else
printf("The file has no active partitions.\n");
} else
printf("last  active partition: %d\n", partno);

 

FairCom DB API Methods to Retrieve Partitions

Additional methods have been added to the FairCom DB API .NET API CTTable class to support retrieving the first (oldest) and last (newest) partition members.

FairCom DB API .NET API

LONG CTTable.GetFirstPartition()

LONG CTTable.GetLastPartition()

These methods return the partition rawno value of the partition member, which can then be used to purge or otherwise administer the partition member directly. A CTException is thrown if an error occurs and the specific FairCom DB error code should be examined from that.

This support is rooted in the FairCom DB API API with the following additions.

FairCom DB API C API

LONG ctdbGetFirstPartition(CTHANDLE Handle)

LONG ctdbGetLastPartition(CTHANDLE Handle)

Returns the rawno of the first or last partition for the file if partitions exist. If returns -1, call ctdbGetError() to retrieve the ISAM error code. If this value is zero, the file has no active partitions. If uerr_cod is non-zero, an error occurred. For example, if the specified file number does not correspond to a partition host file, ctdbGetFirstPartition() returns -1 and ctdbGetError() returns error PHST_ERR (713). ctdbGetError() will return CTDBRET_NOTACTIVE, or CTDBRET_NOTTABLE, if an invalid table handle is passed in.

FairCom DB API C++ API

LONG CTTable::GetFirstPartition()

LONG CTTable::GetLastPartition()

Returns the partition rawno value of the partition member. A CTException is thrown if an error occurs.

 

FairCom DB API .NET Interface Support for Partition File Management

The FairCom DB API and FairCom DB API .NET interface layers have been enhanced with added functions and methods for administering and managing FairCom DB partitioned files.

FairCom DB API C API

The following functions have been added to FairCom DB API for partitioned file administration:

  • ctdbPartAdminByName()
  • ctdbPartAdminByKey()
  • ctdbPartAdminByNumber()

FairCom DB API C++ API

Three methods have been added to the CTTable class for partitioned file administration:

  • PartAdminByName()
  • PartAdminByKey()
  • PartAdminByNumber()

FairCom DB API .NET

Three methods have been added to the CTTable class for partitioned file administration:

  • PartAdminByName()
  • PartAdminByKey()
  • PartAdminByNumber()

Descriptions for the .NET methods are in the Function Reference appendix in Partitioned File Management API.

 

Additional FairCom DB API .NET CTTable.PartAdminByKey() Method

The FairCom DB API .NET CTTable.PartAdminByKey() method initially had only a parameter requiring a IntPtr type and required marshaling of data into this type, which can be somewhat cumbersome. A new overloaded method has been added to allow a CTRecord type and an integer key length to be passed. PartAdminByKey() now internally calls CTRecord.BuildTargetKey() and passes the proper target key value to the partition administration method for ease of use.

COUNT PartAdminByKey(CTRecord Record, int KeyLen, CTPART_MODE_E PartMode)

Example

// allocate objects
MySession = new CTSession(SESSION_TYPE.SQL_SESSION);
MyDatabase = new CTDatabase(MySession);
MyTable = new CTTable(MyDatabase);

Console.WriteLine("\tLogon to server...");
MySession.Logon("FAIRCOMS", "ADMIN", "ADMIN");
MyDatabase.Connect("ctreeSQL");

Console.WriteLine("\tOpen table...");
MyTable.Open("custmast", OPEN_MODE.EXCLUSIVE_OPEN);

// Duplicate keys allow for 8 extra bytes
dupLen = 8;
Int64 ts = 121140092800000000;
int keyLen = 8 + dupLen;

CTRecord rec = new CTRecord(MyTable);
rec.Clear();
rec.SetDefaultIndex("custmast_idx_partition");
rec.SetFieldValue("entry_date", ts);
MyTable.PartAdminByKey(rec, keyLen, CTPART_MODE_E.PURGE);

 

Advanced Concepts

 

Maximum Partition Number vs File Size

By default, 16 bits of the 64-bit record offset are used to reference the raw partition number, allowing each partitioned file to support up to 65535 member files over its lifetime. This can be adjusted at create time using the callparm parameter of the extended file creation block, where a value of 0 defaults to 16 bits, values less than 4 bits default to 4 bits (maximum 15 member files), and 32 bits is the maximum value (4,294,967,295 member files). The number of bits determines the total number of raw partitions for the entire life of the host file. This is not the number of partitions active at one time. Raw partitions are not reassigned.

 

Raw Partition Numbers

The raw partition numbers must be 1 or greater. When passing a file position that includes a partition number to a routine, the partition number is encoded in the high-order bits of the high-order word. Ordinarily, the application will only get such information from a call to CurrentFileOffset() followed by a call to ctGETHGH().

Partition numbers are stored in the higher-order bytes of the 64-bit record offset. This allows the ISAM API calls to remain unchanged. Simply change the parameters of your file creation call, and your application is ready to use partitioned files. For this reason, functions requiring a record offset must use the ctSETHGH() and ctGETHGH() functions, even if the partitioned files are not HUGE to ensure these high-order bytes are included.

Partition Ordering and Range Query

Partitions are assigned in increasing order of the partition key values. That is, if KeyValue2 > KeyValue1, then the partition assigned to KeyValue2 will be the same as or after the partition assigned to KeyValue1.

We allow any user-defined expression that evaluates to a numeric value to be used as a partition rule. However, our partition search logic requires that a partition rule assigns partitions in increasing order of the partition key values. That is, the partition function is required to be a monotonically increasing function: for any two partition key values A and B, if A > B then the partition rule must output values p(A) and p(B) such that p(A) > p(B).

We don’t currently check that a user-defined partition rule meets the monotonically increasing property. If a rule is supplied that doesn't have this property, partition queries will return incorrect results such as not finding key values that exist in the table.One example of a function that does not meet this requirement is partitionRule = (partitionKeyValue MOD 12). Note that the values of this function increase then decrease again rather than always increasing as the partition key value increases.

It is up to the developer to be aware of this requirement and to only use partition rules that meet this requirement.

Partition Number Base

Use the PartitionAdmin() function to increase or decrease the lowest permitted partition number, called the “base” partition number. The system enforces an absolute lowest value for the base of one (1), but PartitionAdmin() can be used to change the base as long as it is one or greater. However, when changing this base value, PartitionAdmin() ensures no inconsistencies will arise. For example, one cannot increase the base value if it would eliminate any active or archived partitions (however it can eliminate purged partitions).

 

Partitioned File Naming

Partition file names are automatically created as the base file name with the 3-digit raw partition number as the file extension. This can be customized with the FairCom DB Server SDK See the function partnam() in ctpart.c defines the naming algorithm and is required to be compiled into the server binary at this time.

 

Optimized FairCom DB SQL Partitioned File Queries

A detailed analysis of how partitioned files were opened and queried by various SQL constructs was taken. Many enhancements were identified that could greatly improve performance when multiple physical files are taken into consideration:

  • Estimation of key values - FairCom DB SQL requires an estimation of key values as part of the query optimization phase. It was discovered that this phase of query execution frequently consumed the largest amount of time when working with large numbers of partitioned data files. It was found that the calling of key estimation routines opened large numbers of files to obtain the key estimate. To better optimize this phase, a sampling technique is now performed on a much smaller subset of partitions to reduce time spent in this critical phase. The partitions sampled are the first and last partitions that ordinarily would have been used, and one or more in the “middle” of the remaining active (or covering) partitions.
    • FairCom DB SQL defaults to three samplings. The following configuration keywords change this behavior:
    • PARTITION_ESTIMATE_LIMIT <limit> increases this limit to a desired value. A negative value resorts to the previous behavior of reading from each active partition (or covering partition).
    • PARTITION_ESTIMATE_LIMIT <limit>% increases this limit as a percentage of eligible partitions.
  • Active number of key values - An enhanced ability to return the active number of key values without having to examine each active partition member. All necessary information is now stored in the host Global Unique Index (GUIx).
  • Query logic modifications - Query logic was modified to check for empty covered ranges, and also to check for unexpected missing partitions in middle partitions that are sampled.
  • Range search - When a range search is performed on a unique index and the range criteria specify an equality match on all segments of the key, a direct equal key function is now called rather than a key range function. For a partitioned file global unique index that does not cover the partition key, this greatly improves performance when many active partitions exist as the equal key call can use the global unique partition host index to find the partition that contains the key value directly, avoiding costly searches through multiple partitions.
  • Improved hashing - An improved hashing mechanism for determining if a given file is already open. For large numbers of open files (such as when partitioned files are in use) this substantially reduces initial open times by reducing search times.

 

ALTER TABLE Add and Drop Columns Supported for Partitioned Files

The ability to add and drop columns for Partitioned Files via an ALTER TABLE (either via SQL or FairCom DB API) has been added. Previously an invalid argument error was returned (CTDBRET_INVARG) when attempting this operation. For very large data sets this could take time, as currently, every record is visited to update based on the new schema. In addition, if indexes require a rebuild, this will require additional time.

 

Improved Rebuilding of Partitioned Files

Enhanced partition file rebuild is supported via three modes:

  • Calling RBLIFILX8() for the partition host forces the host and all partitions to be rebuilt. This is the ctrbldif utility default.
  • Calling RBLIFILX8() for the partition host with tfilno set to badpartIFIL (which cannot be combined with updateIFIL), and, if the host is clean, rebuilds only specific partitions that are not clean.
  • Calling PTADMIN() for a specified partition using ptADMINrebuild mode.This is available in ctpartadmin utility.

For the first two rebuild modes of RBLIFILX8(), the XCREblk argument must be included as we check the x8mode member of XCREblk for partition attributes.

 

Updated Partition Admin Modes Reuse and Base

After purging a partition member, that partition number is no longer available for use, as the member is marked purged. The partition administration function PartAdmin(), had originally stubbed in a reuse mode, however, it was not implemented, and this mode is now available for use. The ptADMINreuse mode only supports reuse of a previously purged partition member.

As part of this change, partition instance numbers are introduced such that the host file can distinguish between different versions of the same partition. By “same” partition we mean partitions that contain the same range of partition key values. Reasons for having different versions of the same partition include purging a partition and then recreating the partition, or rebuilding a file partition (that could result in modified contents). The instance numbers are used in the host’s global unique index (GUIx), if any.

A GUIx contains key values from all partitions as a means of ensuring global uniqueness for a key value across all the partitions. The key value is stored in the GUIx, however, instead of storing a record location to go with the key, we stored the partition number that holds the record. This way, if we purge a partition, we do not need to find all the entries in the GUIx that correspond to the partition because if we find a duplicate conflict when trying to add a new key value, we can check if the existing key value is for a purged partition, and, if so, we can replace it by the new key value for an active partition.

The new implementation now stores not only the partition number in the GUIx, but also the instance number (that defaults to zero). Now it is not only possible to distinguish between purged and active partitions in the GUIx, we can also distinguish between a purged and recreated partition since we force them to have different instance numbers.

For non-huge files, the instance numbers are in the range of 0 to 255. Huge files use four bytes for the instance number.

The ptADMINbase mode behavior has also been improved such that it is possible to change the base raw partition number (that is, the lowest permitted partition number) to any desired value as long as it does not exceed any active partition. Instance numbers permit purged partitions to fall outside the new base number because we can distinguish between different versions of the partitions.

When a partition is opened, its instance number is checked against the host list of instance numbers by partition. If they don’t match the open fails with error PNST_ERR (927).

Note: The addition of instance numbers has caused the partition resource stored in the host data file to be revised, and assigned a version number 2. Prior code will not be able to open a partition file with a version 2 resource, and will fail with error PVRN_ERR (725).

 

Understanding FairCom's Partitioned File and FILESET Technologies

FairCom DB offers many valuable—and, in many cases, unique—features aimed at our goal of making it easy for developers to create highly efficient applications based on c-tree technology. Two of these technologies address the issue of searching and maintaining large numbers of files: Partitioned Files and FILESETs. Both of these features simplify operations by making a large number of files appear as a single file to your application.

This paper compares these technologies. The following questions are examined:

Before answering those questions, an overview and comparison of the two technologies is presented:

See Also

For additional reading, including details of implementation, see the following locations in the FairCom documentation:

 

How Do Partition Files Compare to FILESETs?

A Partitioned File allows a set of files to appear to the application as a single file (called the "Partition Host") with its associated index files. The partition host and its member files are all physical files that exist in the filesystem.

The FairCom DB FILESET feature creates a partition host with connection-level dynamic members. The FILESET is in essence a “runtime” partition host, allowing the user to select at runtime which c-tree files to umbrella under the host file.

 

The key architectural difference between the two technologies is as follows:

  • A Partition Host of a traditional Partitioned File maintains a single partition member list at the system-file level. All connections access the same set of partition members. The member accessed is determined by the partition rule.
  • The Partition Host of a FILESET contains no members at the system level. Each connection that opens the file uses the PTADMIN() API function to associate members with that open instance of the partition host. A FILESET does not have a partition rule.

Other differences exist, some of which are transparent to the application and some of which need to be understood by the developer.

The next sections explain more about these technologies:

 

Partitioned Files

The FairCom Server supports a unique feature known as Partitioned Files. A partitioned file logically appears to be one file (or more accurately one data file and its associated index files), but is actually a set of files whose contents are partitioned by the value of the partition key. Both the data files and index files are partitioned. This permits data with a defined range of values for the partition key to be rapidly purged or archived (instead of having to delete each record within this range record-by-record).

A partitioned file is composed of a host data file and its associated indices combined with a rule. The rule determines how to map the partition key value (e.g., a date, or invoice number, or some other characteristic) into a particular partition member of the host. The host file does not contain any actual data, but serves as a logical file that appears to the application to contain all the data.

A partition member file has the same definition as the host, but only holds data whose partition key maps to the member. For example, customer orders may be partitioned by calendar quarter. All orders booked in the same quarter are stored in the same member. A member is composed of a standard c-tree data file and its associated indices.

To add data to the partitioned file, simply call an add routine for the host. The code will add the record in the proper partition member, creating the partition member if necessary. Rewriting a data record may move the record between partitions if the partition key entry for the record is changed. Under transaction control, such a delete/add record operation is done atomically.

Searching the partitioned file in key order is fairly straightforward and efficient if the key is the partition key (the key used to determine which partition should contain the record). Searches on other, non-partition, keys are less efficient than normal because the record may exist in any of the partition members.

It is possible to manage partition members such that archiving or purging partitions is very quick and efficient.

For more information, see the Partitioned Files chapter in the FairCom DB Programmer's Reference Guide.

PTADMIN

The Partition Administration function, PartitionAdmin() or PTADMIN, allows on-the-fly adjustment to the partitions associated with a given host file, such as:

  • Add, remove, or archive partition(s)
  • Activate archived member(s)
  • Return a member file status
  • Modify lower limit of the raw partition number
  • Modify limit on the number of partitions
  • Reuse the raw partition number of a purged member

See Also

For additional information about this function, see PartitionAdmin in the FairCom DB Programmer's Reference Guide.

FILESETs

In FairCom DB V11 and later, the FILESET syntax allows simple SQL queries to operate on one or many files with little to no loss in performance. Bridging the gap between SQL and the more-efficient direct record access, this extension to SQL gives you the performance you need for challenging statistical analysis and other complex queries.

In many situations, traditional SQL views are not feasible due to the sheer number of tables involved and the ad hoc characteristics of the queries:

  • Data files are sometimes created “on-the-fly” by the application.
  • Traditional SQL queries require static SQL dictionary management consisting of defined entities (i.e., all entities are required to be present in the SQL Dictionary).
  • Performance concerns arise when building SQL views over 1,000s of tables.

The Dynamic FILESET

FairCom DB V11 introduces a concept of a dynamic FILESET. It allows you to treat a number of files as a single source when making SQL queries.

The SQL grammar has been extended to allow specifying the dynamic list of partitions when running queries on dynamic partitioned files. FairCom DB accesses the necessary files and makes them appear as a single table to SQL, thereby eliminating the overhead of creating a SQL view over a large number of files. The FILESET is created dynamically—on-the-fly—so that SQL sees the results as a simple, static table.

FILESETs Help You Find the “Needle in a Haystack”

The FILESET concept can be used to simplify SQL queries. Rather than writing a complex query across multiple files—possibly thousands—FILESETs allow you to write a simple query as though you were searching only a single table. A new function allows you to define a list of dynamic partition members so you can set the partition table dynamic members when using FILESETs.

For more information, see the FILESET topic in the FairCom DB V11 Update Guide.

Supported FILESET Features

The initial FILESET release supports reading data from dynamic partition members through the partition host file. It does not support the following operations on a partition host that has dynamic members:

  • ISAM record add, delete, and update
  • Index rebuild
  • File compact
  • SQL alter table
  • Adding a new permanent or temporary index

See Also

For additional information, see Working with Multiple Files (FILESET) and the topics under it, which explain the details of implementation and operation, in the FairCom DB API Developer's Guide.

 

Comparison Chart

The chart below highlights some of the most important differences between Partitioned Files and FILESETs:

 

  Partitioned File FILESET
Partition Rule Yes, the rule determines which member file a record belong is No. Members are specified at runtime
Definition of Member Files The Partition Rule determines members The members are specified when a connection is made
Member File List Maintained in the host and handled with a specific API Specified at runtime when the connection is made
Full Read-Write Access Yes No. Read-only
Direct Access to Members A partition member can be opened directly only if you specify read-only access, otherwise the open fails with error 733. Dynamic members of the FILESET can be opened for write access because access through the host is read-only.
File Block File block cannot be applied to a single partition member (you need to file bock the partition host to add/remove a partition from a partitioned file) File block cannot be applied to a single member file while it is used in a FILESET
Adding Member Files To add a partition member, add data that belongs to that partition member based on the partition rule The member files included in a FILESET are determined when the FILESET is opened
Archive/Purge Members Archiving a partition member makes it inactive; that member still exists on disk. Purging actually deletes that partition member file from disk. Individual member files can be archived and purged (because the FILESET is read-only, archiving is not an issue)
Common Schema with Variations All members must share the same schema—we do not support variations on a shared common schema All members must share the same schema—we do not support variations on a shared common schema
Full-Text Search Not presently supported. Yes. On the individual member files, but not on the FILESET
Replication The Partitioned File can be replicated Replication is not an issue for FILESETs because they are read-only

 

How Does the Host File Appear to the Application?

To an application, Partitioned Files and FILESETs appear to be standard files. Although internal differences exist, for the most part these differences are not visible to the application.

Partitioned Files

Because a partition host file maintains a single partition member list at the system-file level, all connections access the same set of partition members. The fact that the partition host points to a list of partition member files is transparent to the application.

Most c-tree operations can be performed on the partition host.

File passwords are supported for partitioned files. A partition created on-the-fly is assigned the same security information as the host file.

Encryption is not supported with partitioned files.

Replication of the Partitioned File (and, therefore, its member files) is supported.

FILESETs

A FILESET appears to the application similar to a standard (non-partitioned) file. The only difference is that the FILESET is created on an ad hoc basis for a single connection. The FILESET host file points to a set of member files that is determined when the connection is made. The individual members of a FILESET are standard files.

Although appearing fairly standard, the FILESET has several limitations, such as being read-only. See Supported FILESET Features in FILESETs.

 

Can We Perform Full-Text Search Queries on a Host File?

A Full-Text Search (FTS) provides fast, efficient access to character-type data elements. A Full-Text Index (FTI) can be defined by specifying which character-type fields to include in the index. Once an FTI is defined for a file, it is maintained in “real-time” along with any other b-tree type c-tree indices, including deferred index support. FTS allows you to specify a word or phrase for which you are searching. All records whose FTI-indexed fields contain that text are returned.

FTS operations are theoretically possible on a Partitioned File and a FILESET, although verification is still required. Furthermore, because a FILESET is a collection of individual files, FTS could be performed on an individual member file, however, this needs to be verified as well.

A Full-Text Index cannot be used as a partition index.

 

Can We Operate on a Single Member File Independently?

Partitioned Files and FILESETs are composed of a group of individual files. Some c-tree operations can be performed on the individual files in certain situations.

Partitioned Files

Partitioned Files allow data with a defined range of values for the partition key to be rapidly purged or archived (instead of having to delete record-by-record each record within this range). This administration is done through the FairCom DB PTADMIN API function or the fc_ptadmin_num() SQL stored procedure.

Individual member files can be purged and archived using these functions.

The member files of a Partitioned File cannot be opened by accessing them directly.

FILESETs

FILESETs are composed of individual files, which are standard c-tree files. As such, the member files can be accessed individually, and standard c-tree operations can be performed on them, when they are not part of a FILESET.

It is important that the individual files are not changed while they are being used as members of a FILESET.

 

Can We Do a File Block on an Individual Member File?

In some situations it can be advantageous to block access to a data file by current users. For example, an administrator may desire to compact a data file if the file system is near capacity. Because the compact operation requires exclusive access to the data file and indices, the application would need to be shut down. FairCom provides methods of temporarily blocking access to individual files, which allow the file to be compacted without shutting down the entire application.

Partitioned Files

A file block cannot be applied to a single partition member. To add/remove a partition from a partitioned file, you need to file bock the partition host. The FairCom DB PTADMIN API function or the fc_ptadmin_num() SQL stored procedure is used to archive and purge member files.

FILESETs

The file block concept does not apply to FILESETs because the FILESET is created at runtime and destroyed when the file is closed. While the file set is in effect, it actually behaves like a partitioned file so the same behavior applies.

 

Is It Possible to Add Extra User Fields to an Individual Member File?

FairCom has designed FairCom DB to be extremely flexible in its record definitions. As a truly multimodel database, it provides several means to accommodate non-traditional schema. In the case of Partitioned Files and FILESETS, what appears to the application to be a single file is actually multiple member files. Although all the member files must have a schema in common, variations are allowed in certain circumstances.

At this time, all member files of a Partitioned File or a FILESET must have the same schema. We do not have support variations in the schemas of the various member files, even if they all have a common portion.