This study was published in Molecular Biology (Moscow) 2008 Jan-Feb;42(1):163-71.
An English translation of this paper is here:
Motivation: The “off-target” silencing effect hinders the development of siRNA-based therapeutic applications and complicates the interpretation of gene function and phenotypes. A common solution to this problem is to use BLAST, which may miss significant alignments. An exhaustive Smith–Waterman search returns accurate answers but is very time-consuming.
Results: We have developed a new algorithm, CRM (Comprehensive Redundancy Minimizer), that maps all unique short-string sequences (“targets”) 9 to 15 nt in size within large sets of sequences, e.g. the set of all known transcripts in a given organism. CRM produces an output file listing potential siRNA candidates for every transcript in a given transcriptome. This output file can be used as an input file for traditional “set-of-rules” siRNA prediction software. 91% of human transcripts are covered by candidate siRNAs with kernel targets of N = 15.
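To make the notion of a unique target concrete, here is a minimal Python sketch. It is not the CRM algorithm, whose internals are not described here, but a naive hash-based count under stated assumptions: a target counts as unique if it occurs in exactly one transcript of the set, only the given strand is scanned, and the whole set fits in memory. The function names unique_targets and coverage and the toy sequences are hypothetical.

```python
from collections import defaultdict


def unique_targets(transcripts, k=15):
    """Return {transcript_id: [k-mers found in that transcript and no other]}.

    Assumption: "unique" means the k-mer occurs in exactly one transcript
    (repeats inside that same transcript are allowed).
    """
    owners = defaultdict(set)  # k-mer -> IDs of transcripts containing it
    for tid, seq in transcripts.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(tid)

    result = {tid: [] for tid in transcripts}
    for kmer, tids in owners.items():
        if len(tids) == 1:
            result[next(iter(tids))].append(kmer)
    return result


def coverage(result):
    """Fraction of transcripts that have at least one unique target."""
    if not result:
        return 0.0
    return sum(1 for kmers in result.values() if kmers) / len(result)


# Toy input (hypothetical sequences, k shortened to 4 for readability).
toy = {"t1": "ACGTAC", "t2": "CGTACG"}
targets = unique_targets(toy, k=4)
print(targets)            # {'t1': ['ACGT'], 't2': ['TACG']}
print(coverage(targets))  # 1.0
```

Run over a full transcriptome with k = 15, a coverage figure of this kind corresponds to the share of transcripts that have at least one candidate target. A sketch like this keeps every k-mer in memory, so it is useful only as an illustration of the definition.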
Availability: An interactive database listing human siRNA candidates with minimized redundancy is available at http://18.104.22.168. The database is searchable gene by gene and returns all possible siRNAs with minimized off-target hybridization for a human gene of interest. The resulting list of siRNAs can be used as an input file for your favorite siRNA optimization software.
Please see information about the ongoing effort in this direction (jointly with the Department of Mathematics, CoS, GMU).