Showing posts with label pseudogene. Show all posts

Wednesday, August 22, 2007

Compiled Sequences for Human and Chimp

I am seeking sequences for all the pseudogenes listed in the flat files for (at minimum) human and chimp (9606.71.gtf and 9598.2.gtf). I looked at the assembled sets on the website, but I found compiled sequences only for processed or putative pseudogenes, not for duplicate pseudogenes. Are there files somewhere on the site that have sequence data for all pseudogenes listed in the species GTF files?

Sorry, we don't have that.

Monday, June 18, 2007

Adding Original Genes with Gene Names to Pseudogenes Website

On the Pseudogenes website, only a few (~300) of the 16K pseudogene hits could be linked to a gene in the RefSeq gene list file. Would it be possible to add a list of all the original genes, with their genome locations, to your website?

Most of the original genes are listed by Ensembl ID. You can look up their information at http://www.ensembl.org/

Tuesday, June 5, 2007

Pseudogene Sequence Data Download

We know there is a way to download the pseudogenes of each organism; the downloaded file comes with the name of each pseudogene, its start and end positions, etc. But we wanted to download the sequence of each pseudogene of each organism directly, and we didn't find a way to do that in the database. Is it possible to download the sequences? Or do we have to write a program that, given the genome of the organism and the start/end positions of each pseudogene, extracts the corresponding sequence?

None of the flat files contain the raw sequence information. For an individual pseudogene, however, you can query the system for either the amino acid or the nucleotide sequence. Simply search for the pseudogene you're looking for and, on the results page, click either the red or the yellow button.

(Example results page: http://www.pseudogene.org/cgi-bin/search-results.cgi?tax_id=9606&set_search=63&criterion0=&operator0=&searchValue0=&all=View+All+Pseudogenes&sort=1&output=html )

To get the sequence information for a large set of pseudogenes, however, it would probably be best to write the program you suggested.
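As a rough starting point, such a program might look like the Python sketch below. It is only an illustration under stated assumptions, not code from pseudogene.org: it assumes the genome is available as FASTA text, that coordinates are 1-based and inclusive (as in GTF files), and that the chromosome names in the annotation match the FASTA headers. The function names `read_fasta` and `extract_sequence` are hypothetical, and the whole genome is held in memory, which may not suit every machine.

```python
from typing import Dict, Iterable

# Translation table for complementing a DNA strand (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {sequence name: sequence}.

    The name is the first whitespace-delimited word after '>'.
    """
    genome: Dict[str, str] = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if name is not None:
                genome[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def extract_sequence(genome: Dict[str, str], chrom: str,
                     start: int, end: int, strand: str = "+") -> str:
    """Return the sequence for chrom:start-end.

    Assumes 1-based, inclusive coordinates. For the minus strand the
    reverse complement of the forward-strand slice is returned.
    """
    seq = genome[chrom][start - 1:end]
    if strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

To use it, open the genome file and pass the file object to `read_fasta` (for example `with open("genome.fa") as fh: genome = read_fasta(fh)`, where the file name is a placeholder), then call `extract_sequence` once per pseudogene with the chromosome, start, end, and strand read from the flat file.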