Generate schema from a list of motifs.
Arguments:
o motif_repository - A MotifRepository object holding all of the
motifs we want to convert to schemas.
o motif_percent - The proportion of motifs in the motif bank which
should be matched, given as a fraction between 0 and 1 (the code
compares it against matched_count / total_count). We'll try to
create schemas that match this proportion of motifs.
o num_ambiguous - The number of ambiguous characters to include
in each schema. The positions of these ambiguous characters will
be randomly selected.
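
The interplay of the three arguments can be illustrated with a small, self-contained sketch. It is not the library's implementation: a plain dict of motif counts stands in for the MotifRepository, "*" is an assumed wildcard symbol, and the helper names (make_schema, schema_matches, schemas_from_counts) are hypothetical. Unlike the real method, which delegates schema selection to _get_unique_schema, this sketch always seeds a schema from the most frequent motif still unmatched.

```python
import random


def make_schema(motif, num_ambiguous, rng, wildcard="*"):
    """Replace num_ambiguous randomly chosen positions with a wildcard."""
    positions = rng.sample(range(len(motif)), num_ambiguous)
    chars = list(motif)
    for pos in positions:
        chars[pos] = wildcard
    return "".join(chars)


def schema_matches(schema, motif, wildcard="*"):
    """True if motif agrees with schema at every non-wildcard position."""
    if len(schema) != len(motif):
        return False
    return all(s == wildcard or s == m for s, m in zip(schema, motif))


def schemas_from_counts(motif_counts, motif_percent, num_ambiguous, seed=0):
    """Build schemas until they cover motif_percent of all motif counts.

    motif_counts is a hypothetical stand-in for the repository: it maps
    each motif string to the number of times it was observed.
    """
    rng = random.Random(seed)
    # most frequent motifs first
    remaining = sorted(motif_counts, key=motif_counts.get, reverse=True)
    total_count = sum(motif_counts.values())
    assert total_count > 0, "Expected to have motifs to match"

    schema_info = {}
    matched_count = 0
    while remaining and matched_count / total_count < motif_percent:
        # a schema derived from a motif always matches that motif, so
        # every pass removes at least one motif and the loop terminates
        schema = make_schema(remaining[0], num_ambiguous, rng)
        matching = [m for m in remaining if schema_matches(schema, m)]

        # credit the schema with the counts of every motif it represents
        schema_counts = sum(motif_counts[m] for m in matching)
        remaining = [m for m in remaining if m not in matching]

        schema_info[schema] = schema_counts
        matched_count += schema_counts
    return schema_info


# three motifs with 10 observations in total; ask for 70% coverage
example = schemas_from_counts({"GATC": 5, "GATT": 3, "CCGG": 2},
                              motif_percent=0.7, num_ambiguous=1)
```

Because each matched motif is removed from the candidate list, its counts are credited to exactly one schema, which mirrors the all_motifs.remove(motif) step in the method body.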

    """Generate schema from a list of motifs.

    Arguments:

    o motif_repository - A MotifRepository object holding all of the
    motifs we want to convert to schemas.

    o motif_percent - The proportion of motifs in the motif bank which
    should be matched, given as a fraction between 0 and 1. We'll try
    to create schemas that match this proportion of motifs.

    o num_ambiguous - The number of ambiguous characters to include
    in each schema. The positions of these ambiguous characters will
    be randomly selected.
    """
    # get all of the motifs we can deal with
    all_motifs = motif_repository.get_top_percentage(motif_percent)

    # start building up schemas
    schema_info = {}

    # continue until we've built schemas matching the desired percentage
    # of motifs
    total_count = self._get_num_motifs(motif_repository, all_motifs)
    matched_count = 0
    assert total_count > 0, "Expected to have motifs to match"
    while (float(matched_count) / float(total_count)) < motif_percent:
        new_schema, matching_motifs = \
            self._get_unique_schema(schema_info.keys(),
                                    all_motifs, num_ambiguous)

        # get the number of counts for the new schema and clean up
        # the motif list
        schema_counts = 0
        for motif in matching_motifs:
            # get the counts for the motif
            schema_counts += motif_repository.count(motif)

            # remove the motif from the motif list since it is already
            # represented by this schema
            all_motifs.remove(motif)

        # record the schema info
        schema_info[new_schema] = schema_counts
        matched_count += schema_counts

        # print "percentage:", float(matched_count) / float(total_count)

    return PatternRepository(schema_info)

def _get_num_motifs(self, repository, motif_list):