Pair-cores detects reliable blocks (plus-blocks) in an input alignment of sequences with known 3D structures. A reliable block consists of sequence segments that correspond to each other in the alignment and share similar 3D structures. The required degree of similarity is controlled by the algorithm parameters. The algorithm proceeds in the following steps:
- Find geometrical cores for each pair of sequences in the input alignment.
- Filter the pair geometrical cores. Cores that cover too few alignment positions (fewer than the threshold) are discarded. The option "Ignore cores, owned by one SS element" can be enabled to discard pair cores that lie entirely within a single secondary structure (SS) element of the protein, because any two SS elements of the same type (alpha-helices or beta-strands) have similar 3D structures anyway.
- Create reliable blocks (plus-blocks). There are two implementations of this step: homology (the recommended default) and blocks3D.
- Homology. Amino acids at the same position of a pair geometrical core are considered homologous; within each alignment position, the homology relation is extended by transitive closure. Classes of homologous residues that occupy consecutive positions and belong to the same set of sequences are combined into a plus-block.
- blocks3D. Construct a graph whose vertices are all residues of the alignment. Two residues are connected by an edge if some pair geometrical core includes both of them. Cliques are then found in the graph and sorted by the number of sequences they span and then by the number of residues. The first clique is taken as a plus-block and removed from the graph; the search is then repeated for the next clique, and so on. Cliques are searched for with a time-limited Bron-Kerbosch algorithm; once the time limit is exceeded, a fast heuristic algorithm is run instead.
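The homology variant can be illustrated with a short Python sketch. It is not the Pair-cores implementation: the input format (a pair core given as `(seq_a, seq_b, columns)` with 0-based sequence and column indices), the function names, and the decision to ignore gaps are all assumptions made for the example. Within each column the homology relation is closed transitively with a union-find structure, and identical classes in consecutive columns are merged into one block.

```python
from collections import defaultdict


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def homology_classes(n_seqs, pair_cores, column):
    """Classes of homologous sequences in one column (transitive closure)."""
    parent = {s: s for s in range(n_seqs)}
    for a, b, columns in pair_cores:
        if column in columns:
            parent[find(parent, a)] = find(parent, b)
    groups = defaultdict(set)
    for s in range(n_seqs):
        groups[find(parent, s)].add(s)
    # A class needs at least two sequences to be of interest.
    return {frozenset(g) for g in groups.values() if len(g) > 1}


def plus_blocks(n_seqs, n_columns, pair_cores):
    """Merge identical classes in consecutive columns.

    Returns (sequences, first_column, last_column) tuples.
    """
    blocks = []
    open_blocks = {}  # set of sequences -> first column of the current run
    # One extra iteration with no classes closes the runs still open.
    for col in range(n_columns + 1):
        if col < n_columns:
            classes = homology_classes(n_seqs, pair_cores, col)
        else:
            classes = set()
        for seqs in list(open_blocks):
            if seqs not in classes:
                blocks.append((sorted(seqs), open_blocks.pop(seqs), col - 1))
        for seqs in classes:
            open_blocks.setdefault(seqs, col)
    return sorted(blocks, key=lambda b: (b[1], b[0]))
```

For example, with three sequences, six columns, and the hypothetical pair cores `(0, 1, {0, 1, 2, 3})` and `(1, 2, {2, 3, 4})`, sequences 0 and 2 become homologous in columns 2 and 3 through sequence 1, so `plus_blocks` returns three blocks: sequences 0 and 1 over columns 0-1, all three sequences over columns 2-3, and sequences 1 and 2 in column 4.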
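The blocks3D variant can be sketched in a similar way. Again this is only an illustration under stated assumptions, not the tool's code: residues are represented as `(sequence, column)` pairs, cliques are ranked with the largest first, the greedy fallback is a stand-in for the unspecified "fast heuristic", partial results are dropped on timeout, and the search stops once the best clique spans fewer than two sequences. All function names and the `time_limit` parameter are hypothetical.

```python
import time
from itertools import combinations


def build_graph(pair_cores):
    """Adjacency sets; two residues are linked if one pair core holds both."""
    adj = {}
    for a, b, columns in pair_cores:
        residues = [(s, c) for s in (a, b) for c in columns]
        for u, v in combinations(residues, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def clique_rank(clique):
    # Number of sequences spanned first, then number of residues.
    return (len({s for s, _ in clique}), len(clique))


def maximal_cliques(adj, deadline):
    """Iterative Bron-Kerbosch with pivoting; stops at the deadline."""
    stack = [(set(), set(adj), set())]
    while stack and time.monotonic() < deadline:
        r, p, x = stack.pop()
        if not p and not x:
            yield r
            continue
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in p - adj[pivot]:
            stack.append((r | {v}, p & adj[v], x & adj[v]))
            p = p - {v}
            x = x | {v}


def greedy_clique(adj):
    """Assumed fallback: grow a clique from the highest-degree vertex."""
    start = max(adj, key=lambda v: (len(adj[v]), v))
    clique, candidates = {start}, set(adj[start])
    while candidates:
        v = max(candidates, key=lambda u: (len(adj[u] & candidates), u))
        clique.add(v)
        candidates &= adj[v]
    return clique


def blocks3d(pair_cores, time_limit=1.0):
    """Repeatedly take the best clique as a block and remove it."""
    adj = build_graph(pair_cores)
    blocks = []
    while adj:
        deadline = time.monotonic() + time_limit
        found = list(maximal_cliques(adj, deadline))
        timed_out = time.monotonic() >= deadline
        if timed_out or not found:
            best = greedy_clique(adj)
        else:
            best = max(found, key=clique_rank)
        if clique_rank(best)[0] < 2:
            break  # assumption: a block must span at least two sequences
        blocks.append(best)
        for v in best:
            for u in adj.pop(v):
                if u in adj:
                    adj[u].discard(v)
        # Drop residues left without neighbours.
        adj = {v: nb for v, nb in adj.items() if nb}
    return blocks
```

Because every pair core contributes a complete subgraph on its own residues, a single core always yields at least one clique; larger blocks appear when cores between several sequence pairs overlap on the same columns.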