Example

R3lib is very easy to use. All you need to know is three functions: r3t_create, r3t_destroy, r3t_findPairs. For quick start, here is an example function:

/* this function demonstrates usage of R3lib functions
* */

int example()
{
int result, i;
t_r3t r3t; // r3 tree (representation of repeats in text)
int* p2; // storage for p2 component of (p2,l) pair
int* l; // storage for l component of (p2, l) pair
int pair_cnt; // number of found pairs
int k; // minimal repeat length
int p1; // first component of maximal repeat
int limit; // maximal number of returned repeats, textLen is upper bound of all possible repeats

/* input text */
/* PATTERN occurs at positions 4, 16, 28 and 40 */

BYTE* text = "abcdPATTERNabceaPATTERNbcfabPATTERNcgabcPATTERNhabc";

/* size of input */
int textLen = strlen(text);

/* create r3 tree for given input */
result = r3t_create(&r3t, text, textLen);
if(result)
return result; /* allocate space for query results */
p2 = (int*) malloc(textLen*sizeof(int));
l = (int*) malloc(textLen*sizeof(int)); if(!p2 || !l) return -1;

/* define parameters for findPairs query */
p1 = 4; /* first component of maximal repeat */
k = 5; /* minimal repeat length */ limit = textLen; /* maximal number of returned repeats */

/* find all pairs (p2, l) such that (p1, p2, l) is maximal repeat in text and l >= k */
result = r3t_findPairs(&r3t, p1, k, &pair_cnt, limit, p2, l);
if(result)
return result;

/* now we should have
*
* pair_cnt == 3
* (p2[0], l[0]) == (16, 7)
* (p2[1], l[1]) == (28, 7)
* (p2[2], l[2]) == (40, 7)
*
*/
printf("*** Example of findPairs query ***\n");
printf("Text: '%s'\n", text);
printf("Position: %i\n", p1);
printf("Min. length: %i\n", k);

for(i=0; i<pair_cnt; i++) {
/* (p1, p2[i], l[i]) now represents maximal repeat */
printf("found repeat: (%i, %i, %i)\n", p1, p2[i], l[i]);
}

/* free resources held by r3 tree structure */
r3t_destroy(&r3t);
return 0; }

R3 tree functions

These functions comprise main functionality of R3lib. They are implemented in source file r3.c. All functions return 0 on success and non-null value on error. See conf.h for ERROR_xxx constants.

int r3t_create(t_r3t* r3t, BYTE* text, int textLen)
Create R3 tree for data of length textLen stored at text. Max peak memory usage during construction is (52*textLen + constant). Finished R3 tree will take up max (44*textLen + constant) bytes.
Argument Comment
r3t (out) - t_r3t structure to be initialised
text (in) - location of data
textLen (in) - size of input data
int r3t_findPairs(t_r3t* r3t, int p1, int k, int* count, int count_limit, int* p2, int* l)
Finds all pairs (p2, l) such that (p1, p2, l) is maximal repeat with l>=k. Pairs are stored as (p2[0], l[0]), ..., (p2[count-1], l[count-1]).
Argument Comment
r3t (in) - valid r3 tree
p1 (in) - position in text
k (in) - minimal repeat length
count (out) - number of found pairs (stored in *count)
count_limit (in) - maximal number of returned pairs. count won't exceed count_limit. Normally for small inputs, we can afford to allocate arryas p2 and l to size of input n. Number of found pairs naturally won't exceed n. If we can't afford to make p2 and l that long, we use count_limit to avoid overflow.
p2 (out) - storage location for p2 component of pair (supply pointer to location allocated to at least count_limit length)
l (out) - storage location for l component of pair (supply pointer to location allocated to at least count_limit length)
int r3t_destroy(t_r3t* r3t)
Frees memory allocated by r3t_create.
Argument Comment
r3t (in) - valid r3 tree

Other functions

These are auxilliary functions that may come in handy. See source file debug.c and suffixArray.c.

int output_r3t_html_file(t_r3t* r3t, char* output_filename)
Debugging output of r3 tree in html format. Use only for very small inputs.
Argument Comment
r3t (in) - valid r3 tree
output_filename (in) - output filename
int createSuffixArray(BYTE* text, int textLen, int** suffixArray)
Create suffix array for given input. for text of length n creates suffix array of length n+1. This function works with imaginary endmarker position at the end of input data.
Argument Comment
text (in) - input data
textLen (in) - size of input data +1 (include imaginary endmarker)
suffixArray (out) - suffix array (pointer *suffixArray will be initialised by this function)
int createLCPTable(BYTE* text, int textLen, int* sa, int** lcp)
Create lcp-table for given input and suffix array
Argument Comment
text (in) - input data
textLen (in) - size of input data +1 (include imaginary endmarker)
sa (in) - suffix array
lcp (out) - lcp-table (pointer *lcp will be initialised by this function)
void output_sa_lcp(BYTE* text, int textLen, int* suffixArray, int* lcp, char* filename)
Output suffix array and lcp-table in user readable format into text file. Also prints suffixes. Use only for small files.
Argument Comment
text (in) - input data
textLen (in) - size of input data +1 (include imaginary endmarker)
sa (in) - suffix array
lcp (in) - lcp-table
filename (in) - output file name