Example of Inverted-file Index Construction

Consider a small hypothetical protein A with 11 amino acid (AA) residues and 2 secondary structure elements (SSEs). All measurements are in angstroms (Å).

Figure S1: The C-alpha coordinates and SSE annotations in protein A.

We can represent protein A as a distance matrix of the pairwise distances between its C-alpha atoms.

Figure S2: Distance matrix of protein A. Shaded areas are the inter-SSE contact regions.
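As a minimal sketch of this step, the following Python code builds a symmetric matrix of pairwise Euclidean distances from a list of 3D coordinates. The function name `distance_matrix` and the three sample coordinates are illustrative assumptions; they are not the values shown in Figure S1.

```python
import math

def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dist[i][j] = dist[j][i] = d  # fill both halves of the matrix
    return dist

# Hypothetical C-alpha coordinates in angstroms (not taken from Figure S1).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
dm = distance_matrix(coords)
```

Only the upper triangle is computed explicitly; the lower triangle is mirrored, since the distance from residue i to residue j equals the distance from j to i.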

We can represent the two SSEs as vectors in 3D space.

Figure S3: Vector representation of two SSEs in protein A.
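The text does not state how the SSE vectors are derived from the coordinates. One simple possibility, assumed here purely for illustration, is to take the vector from the C-alpha atom of the first residue of an SSE to that of its last residue. The names `sse_vector` and `angle_degrees`, the residue ranges, and the coordinates below are all hypothetical.

```python
import math

def sse_vector(coords, first, last):
    """Vector from the C-alpha of residue `first` to that of residue `last`
    (0-based, inclusive). This end-to-end definition is an assumption."""
    (x0, y0, z0), (x1, y1, z1) = coords[first], coords[last]
    return (x1 - x0, y1 - y0, z1 - z0)

def angle_degrees(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical coordinates: residues 0-2 form one SSE, residues 3-5 another.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 5.0, 0.0), (7.6, 8.8, 0.0), (7.6, 12.6, 0.0),
]
v1 = sse_vector(coords, 0, 2)
v2 = sse_vector(coords, 3, 5)
angle = angle_degrees(v1, v2)
```

A least-squares fit of an axis through all residues of the SSE would be a more robust alternative to the end-to-end vector used in this sketch.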

We can extract a feature vector from each contact region Kab formed by SSE a and SSE b.

Figure S4: 3 feature vectors for 3 contact regions in protein A.
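The components of the feature vectors appear only in Figure S4 and are not reproduced here. The sketch below shows only the general shape of the step: select the sub-matrix of the distance matrix whose rows belong to SSE a and whose columns belong to SSE b, then summarize it as a small tuple of discrete values. The features chosen (region dimensions, binned minimum distance, binned mean distance), the names `contact_region` and `feature_vector`, the parameter `bin_width`, and the toy matrix are all assumptions made for illustration.

```python
def contact_region(dist, sse_a, sse_b):
    """Sub-matrix of `dist` with rows from SSE a and columns from SSE b."""
    return [[dist[i][j] for j in sse_b] for i in sse_a]

def feature_vector(region, bin_width=2.0):
    """Hypothetical features: region shape plus binned minimum and mean distance."""
    values = [d for row in region for d in row]
    min_bin = int(min(values) // bin_width)
    mean_bin = int((sum(values) / len(values)) // bin_width)
    return (len(region), len(region[0]), min_bin, mean_bin)

# Toy 4-residue distance matrix in angstroms (not the matrix of Figure S2).
dist = [
    [0.0, 3.8, 6.1, 9.0],
    [3.8, 0.0, 4.2, 6.5],
    [6.1, 4.2, 0.0, 3.8],
    [9.0, 6.5, 3.8, 0.0],
]
region = contact_region(dist, range(0, 2), range(2, 4))
fv = feature_vector(region)
```

Discretizing the values into bins means that contact regions with similar geometry map to the same tuple, which is what allows them to share a key in a hash table.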

We hash each feature vector into a hash table whose entries point to the posting lists of the proteins in which that feature vector occurs.

Figure S5: Sample portion of an inverted-file index.
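The structure described above can be sketched with a Python dictionary, which is itself a hash table. In this minimal example each key is a feature vector stored as a tuple, and each posting list is a sorted list of protein identifiers. The names `build_index` and `lookup`, the protein identifiers, and the feature values are hypothetical, and the posting lists are assumed to hold protein identifiers only.

```python
from collections import defaultdict

def build_index(protein_features):
    """Map each feature vector to the sorted list of proteins containing it."""
    index = defaultdict(set)
    for protein_id, features in protein_features.items():
        for fv in features:
            index[tuple(fv)].add(protein_id)  # tuples are hashable keys
    return {key: sorted(ids) for key, ids in index.items()}

def lookup(index, fv):
    """Return the posting list for a feature vector, or an empty list."""
    return index.get(tuple(fv), [])

# Hypothetical feature vectors for two proteins.
protein_features = {
    "A": [(2, 2, 2, 3), (3, 3, 1, 2), (2, 3, 4, 5)],
    "B": [(2, 2, 2, 3), (4, 4, 1, 1)],
}
index = build_index(protein_features)
hits = lookup(index, (2, 2, 2, 3))
```

A query protein can then be processed the same way: each of its feature vectors is looked up in the index, and the proteins that appear in the returned posting lists become candidates for closer comparison.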