Before 10.09.2004
The update page was launched.
The initiation pattern was changed to be from Arabidopsis thaliana.
An option to search for overlapping patterns was added.

10.09.2004
Background for Anopheles gambiae added. The upstream sequence collection was downloaded from Ensembl EnsMart and contains 14364 sequences (length 3000 bp). Note that not all sequences were marked as known genes in the Ensembl database. (The tip was given by Jesús Hernández from Mexico, thanks.)

18.05.2005
All background upstream sequence sets were updated. Each background now has two different models (full and clean): the full model is the sequence set as it is in ENSEMBL or TAIR, and the clean model is a sequence set from which sequences containing letters other than A, C, G or T have been removed. Since the number of sequences does not differ between the two sequence sets, the recommended one is the clean version.
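As an illustration only, the difference between the full and clean sets can be sketched in Python. This is not the code used to build the backgrounds; the function name `make_clean_set` and the example records are hypothetical, and it is assumed here that lower-case letters count as valid bases.

```python
# Minimal sketch: derive a "clean" set from a "full" set by dropping every
# sequence that contains a letter other than A, C, G or T.
# Names and example data are hypothetical, not taken from the tool.

ALLOWED = set("ACGT")


def make_clean_set(records):
    """Return only the (name, sequence) pairs made up purely of A, C, G, T.

    Assumption: case is ignored, so 'acgt' is treated like 'ACGT'.
    """
    return [(name, seq) for name, seq in records
            if set(seq.upper()) <= ALLOWED]


# "full" set: sequences exactly as downloaded
full = [
    ("gene1", "ACGTAC"),
    ("gene2", "ACNNGT"),   # contains N, so it is dropped from the clean set
    ("gene3", "ttgaca"),
]

clean = make_clean_set(full)
```

Here `clean` keeps `gene1` and `gene3` and drops `gene2`, whose sequence contains the ambiguity code N.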
Another change is that all upstream sequence sets (except A. thaliana) are now updated from ENSEMBL. This may have the biggest effect on the S. cerevisiae background, which was previously updated using RSA-tools.
Since the ENSEMBL ENSMART tool had a bug during the updates (the one-sequence-per-gene function was not working), the background upstream sequence sets were filtered manually. The manual filtering was performed with MySQL, where each instance with the same ENSEMBL Gene ID was accepted once. Although this is not the optimal way to filter the sequences, it offers consistency between the old and new background models. (The bug was reported nearly one week ago, but it still has not been fixed.)
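The one-sequence-per-gene filtering described above was done in MySQL. The same idea can be sketched in Python for illustration; this is not the original query. The function name `one_per_gene` and the gene IDs are hypothetical, and it is assumed that the first instance seen for each gene ID is the one that is kept, which the entry does not state.

```python
# Minimal sketch: accept each ENSEMBL Gene ID only once.
# Assumption: the first record seen for a gene ID is the one retained.
# Names and example IDs are hypothetical.


def one_per_gene(records):
    """Return (gene_id, sequence) pairs with duplicate gene IDs removed."""
    seen = set()
    kept = []
    for gene_id, seq in records:
        if gene_id not in seen:
            seen.add(gene_id)
            kept.append((gene_id, seq))
    return kept


records = [
    ("GENE_A", "ACGTACGT"),
    ("GENE_B", "TTGACAAT"),
    ("GENE_A", "ACGTTCGT"),  # second instance of GENE_A, skipped
]

filtered = one_per_gene(records)
```

With this input, `filtered` holds one record for `GENE_A` (the first) and one for `GENE_B`, in their original order.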


This tool was developed by Matti Kankainen, University of Helsinki
Contact the Webmaster.
© 2004 University of Helsinki