Tuesday, December 18, 2007
Affy CEL files for Tilescope
Tilescope does support CEL files. It is possible that it doesn't recognize your extension if it is not in all lower case. Since it is a web program, it doesn't matter if you're using a Mac or PC or any other OS as long as you're using a browser.
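If the extension case turns out to be the problem, renaming the files before upload is a quick thing to try. Here is a minimal Python sketch of that; the function name is made up for illustration, and it assumes the only issue is an upper-case or mixed-case ".CEL" suffix.

```python
from pathlib import Path


def lowercase_cel_extensions(directory):
    """Rename files such as sample.CEL to sample.cel; return the new paths."""
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() != ".cel":
            continue  # not a CEL file
        if path.suffix == ".cel":
            continue  # already lower case
        target = path.with_suffix(".cel")
        # Skip if a different file already occupies the target name.
        if target.exists() and not target.samefile(path):
            continue
        path.rename(target)
        renamed.append(target)
    return renamed
```

Calling lowercase_cel_extensions("my_arrays") on a folder of CEL files renames them in place and leaves every other file alone.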
Thursday, December 13, 2007
Tiling array analysis tools
You might want first to take a look at:
http://tiling.gersteinlab.org/platformcmp/
and the tool Tilescope as described in this paper:
http://papers.gersteinlab.org/papers/tilescope/
Monday, November 5, 2007
How to Cite Morph Server and CNS Script
Best to just cite
http://papers.gersteinlab.org/papers/molmovdb-update-nar/
http://papers.gersteinlab.org/papers/morphs-nar/
in the paper and end of suppl. material (if possible).
Wednesday, October 24, 2007
Mixed up figures in "Divergence of TFBS across related yeast species"
- Should Fig 2C be current Fig 2E?
- Should Fig 2D_1 be current Fig 2C?
- Should Fig 2E be current Fig 2D_1?
up with the Figure, but in the legend parts C-E are mixed up. It should
read:
Comparison of binding by Ste12 and Tec1 across S. cerevisiae (red), S.
mikatae (blue), and S. bayanus (green). (A) Conserved binding. (B)
Conserved binding with quantitative signal differences. (C) Species-specific
binding despite conserved consensus sequences. (D) Binding only in S.
mikatae and S. bayanus. (E) Conserved binding with loss of consensus
sequences in one species. ChIP-chip enrichment signals are shown (log2
ratios). Circles and squares represent matches to Tec1 PWM and Ste12
PWM, respectively. Triangles, nonconserved peaks; **, >2-fold difference
in peak signal intensity; *, >1.5-fold difference in peak signal
intensity.
Friday, October 19, 2007
Electronic Versions of Papers
Essentially all of my work is available on-line. Go to:
http://papers.gersteinlab.org
and click on the appropriate "preprint" link. You will get a preprint or (if appropriate) journal reprint of the paper you want. There should be NO password challenges or other barriers. Usually, the papers are in PDF format but some are in HTML. (Other formats are available directly from http://papers.gersteinlab.org/e-print.)
Thursday, October 18, 2007
How to input parameters in NucProt Calculator
The NucProt Packing-Eff calculator works on PDB structures NOT sequences, so they cannot mean that:
http://www.molmovdb.org/cgi-bin/voronoi.cgi
The NucProt PSV and Volume Calculator:
http://www.molmovdb.org/cgi-bin/psv.cgi
is not the best-written CGI code. It can take several sequences at once, but since I didn't understand HTTP GET and POST very well, I had it put all the information in the GET rather than the POST, so the browser limits how much input it can send. I don't think I have write access to this file anymore.
Technically, though, it can take several sequences in FASTA format.
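To make the GET limit concrete, here is a rough Python sketch that checks whether a set of FASTA records would fit in a single GET URL and, if not, splits them into smaller batches. The field name "seq" and the 2000-character limit are assumptions for illustration only; the real form field and the actual browser limit may differ.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.molmovdb.org/cgi-bin/psv.cgi"
ASSUMED_URL_LIMIT = 2000  # assumption: real browser limits vary
ASSUMED_FIELD = "seq"     # hypothetical form field name


def build_get_url(fasta_text):
    """Everything travels in the query string, as the CGI expects."""
    return BASE_URL + "?" + urlencode({ASSUMED_FIELD: fasta_text})


def fits_in_get(fasta_text, limit=ASSUMED_URL_LIMIT):
    return len(build_get_url(fasta_text)) <= limit


def split_fasta(fasta_text):
    """Split multi-FASTA text into one string per record."""
    records, current = [], []
    for line in fasta_text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def batch_for_get(fasta_text, limit=ASSUMED_URL_LIMIT):
    """Greedily group records so that each batch fits in one GET URL."""
    batches, current = [], []
    for record in split_fasta(fasta_text):
        candidate = current + [record]
        if current and not fits_in_get("\n".join(candidate), limit):
            batches.append("\n".join(current))
            current = [record]
        else:
            current = candidate
    if current:
        batches.append("\n".join(current))
    return batches
```

A single record that is already longer than the limit still ends up in a batch of its own, so it is worth checking fits_in_get on each batch before submitting.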
Wednesday, October 17, 2007
How to obtain datasets from interolog?
Please visit http://interolog.gersteinlab.org/
Tuesday, October 9, 2007
Are there errors in "Relating 3D structures to protein networks provides evolutionary insights"?
You're right. Please also refer to our website, sin.gersteinlab.org, where we list this typo.
Data for Ste12 and Tec1
Please find it on GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5421
Wednesday, October 3, 2007
A tricky problem with PDB files
The most common problems arise from use of a flavor of the PDB format that the morph server doesn't recognize, as you may be doing for the modelled structure. You could try submitting a truncated version of the crystal structure (which I'm guessing you downloaded from the PDB) as structure 1, change a few coordinates, and submit that modified file as structure 2. This will give us a quick test to determine if it is your modelled structure that is causing the problem.
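The test described above can be set up with a few lines of Python. This is only a sketch: it assumes standard fixed-column ATOM records (x coordinate in columns 31-38), and the function names, the number of atoms kept, and the 0.5 Å shift are arbitrary choices for illustration.

```python
def truncate_pdb(lines, max_atoms):
    """Keep only the first max_atoms ATOM records; drop everything else."""
    atoms = [ln.rstrip("\n") for ln in lines if ln.startswith("ATOM")]
    return atoms[:max_atoms]


def shift_x(line, delta):
    """Return the ATOM line with its x coordinate (columns 31-38) shifted."""
    x = float(line[30:38]) + delta
    return line[:30] + "%8.3f" % x + line[38:]


def perturb(atom_lines, n_atoms=3, delta=0.5):
    """Shift the x coordinate of the first n_atoms atoms by delta angstroms."""
    return [shift_x(ln, delta) if i < n_atoms else ln
            for i, ln in enumerate(atom_lines)]
```

Write the output of truncate_pdb (plus a final END line) to one file for structure 1, and the output of perturb on the same lines to a second file for structure 2.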
Monday, October 1, 2007
Morph Server functioning correctly?
The morph server has experienced an unexpected surge in popularity recently. We are discussing upgrading hardware and this should help. Also we will be investigating whether we have software issues that need to be addressed. I can tell you that new submissions are successfully being added daily, so the server does work. My suggestion for fast results is to try small proteins and ask for a small number of frames (maybe just 4 or so). Also we have three different morph engines as you may appreciate, so if one doesn't work you might try another the next day. Please don't flood our server with multiple submissions of similar proteins in a single day though.
Thursday, September 13, 2007
Tilescope not working
Currently Tilescope is going through some revisions and will be offline until further notice. Thank you for your patience.
Thursday, September 6, 2007
CK19 pseudogenes effect on primer design
cancer patients, I have a few questions to you and would be very grateful if you could answer them:
We designed our primers for the PCR approach in that way that the two pseudogenes (Genbank accession number M33101 (CK19a) and U85961 (CK19b)) mentioned in the literature cannot be amplified. You wrote in your article that there exist 4 pseudogenes of CK19 on chromosome 4, 6, 10 and 12. As we want to avoid false positive results in our detection approach it is very important for us to know the sequence of the other 2 pseudogenes. Could you provide us with the sequence or some further information on these pseudogenes – possibly the alignment of the CK19 sequence with the 4 pseudogenes? Do you know if CK19 pseudogenes can be transcribed? I looked in the literature and databases for this information but could not find an answer.
There are indeed 4 processed pseudogenes identified by the pipeline methodology developed in Dr. Gerstein's lab. You can retrieve the nucleotide sequences and the alignment of each pseudogene to CK19 from the following URL:
http://homes.gersteinlab.org/people/suganthi/outbox/CK19/
I have also included one other potentially pseudogenic fragment upstream of CK19 at Chr 17. This is labeled 'ambiguous' as only a small portion matches to the parent gene. I have provided nucleotide alignments as I assume that is what is relevant for PCR purposes. Also, please note that all the pseudogenes are processed pseudogenes.
Wednesday, August 22, 2007
Method for applying spectral biclustering algorithm
The code was written in MATLAB. The key steps involve using an iterative bi-normalization procedure followed by the standard SVD function. The subsequent steps involve partitioning of the left and right eigenvectors using a k-means like procedure.
We have not put all the components in one fully automated version. If you think that getting the components is useful for you I will dig in the directories of my old computer.
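In the meantime, the steps above can be sketched in a few lines of Python. This is not the original MATLAB code: it assumes a strictly positive, non-constant data matrix, normalizes rows and columns to unit mean (one possible bi-normalization), uses power iteration in place of a full SVD to get the leading non-trivial singular vectors, and uses a two-cluster 1D k-means as a simplified partitioning step. All function names are made up for illustration.

```python
import random


def binormalize(matrix, iterations=50):
    """Alternately rescale rows and columns until every mean approaches 1."""
    b = [list(map(float, row)) for row in matrix]
    n_rows, n_cols = len(b), len(b[0])
    for _ in range(iterations):
        for i in range(n_rows):
            mean = sum(b[i]) / n_cols
            b[i] = [x / mean for x in b[i]]
        for j in range(n_cols):
            mean = sum(b[i][j] for i in range(n_rows)) / n_rows
            for i in range(n_rows):
                b[i][j] /= mean
    return b


def _unit(vec):
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec]


def second_singular_vectors(b, iterations=200, seed=0):
    """Power iteration on b minus its constant (trivial) component."""
    n_rows, n_cols = len(b), len(b[0])
    grand_mean = sum(map(sum, b)) / (n_rows * n_cols)
    c = [[x - grand_mean for x in row] for row in b]
    rng = random.Random(seed)
    v = _unit([rng.uniform(-1, 1) for _ in range(n_cols)])
    for _ in range(iterations):
        u = _unit([sum(c[i][j] * v[j] for j in range(n_cols))
                   for i in range(n_rows)])
        v = _unit([sum(c[i][j] * u[i] for i in range(n_rows))
                   for j in range(n_cols)])
    return u, v


def two_means(values, iterations=20):
    """1D k-means with k=2, started from the minimum and maximum values."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(x - lo) <= abs(x - hi) else 1 for x in values]
        group0 = [x for x, lab in zip(values, labels) if lab == 0]
        group1 = [x for x, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels


def spectral_bicluster(matrix):
    """Return (row_labels, column_labels) for a two-way split."""
    u, v = second_singular_vectors(binormalize(matrix))
    return two_means(u), two_means(v)
```

On a small checkerboard-like matrix, rows and columns that share a block should receive the same label; which block gets label 0 or 1 is arbitrary.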
Turning morph to a movie
On the jmol morph page you will see a link that says:
Orient the molecule to your liking and then:
Generate high-res gif
Unfortunately it is just a wireframe animation. I had it set to do cartoons but for some reason it's back to wireframe now.
Alternatively, you can download the interpolated trajectory, then animate using vmd or jmol.
Look for an NMR-formatted PDB file called movie.pdb here:
http://www.molmovdb.org/uploads/b061676-11139
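If you want to work with the individual frames yourself, a short Python sketch like the one below can split the trajectory. It assumes movie.pdb marks each frame with MODEL and ENDMDL records, as NMR-style multi-model PDB files normally do; the function name is just for illustration.

```python
def split_models(pdb_text):
    """Return one list of coordinate lines per MODEL ... ENDMDL block."""
    frames, current = [], None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []  # start a new frame
        elif record == "ENDMDL":
            if current is not None:
                frames.append(current)
            current = None
        elif current is not None and record in ("ATOM", "HETATM"):
            current.append(line)
    return frames
```

The length of the returned list is the number of frames, and each entry can be written out as its own PDB file.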
Compiled Sequences for Human and Chimp
Sorry, we don't have that.
Monday, August 20, 2007
Local Clustering of Expression Data Software Download
(J Mol Biol. 2001 Dec 14;314(5):1053-66). I would really appreciate if
you provide me the link to download the software for my academic use.
http://bioinfo.mbb.yale.edu/expression/cluster/program.html
Friday, August 17, 2007
Real motions for residues in paper "Normal Modes for Predicting Protein Motions"
Regarding my "Normal modes" paper: the “real (observed) motions for residues” are not
actually “real motions” – as long as we could find two substantially different conformations for the same protein in PDB, we assumed that such motion (1st conformation and 2nd one) could potentially take place. We didn’t do any MD simulations in that paper (although, we did possess technologies that would allow us to simulate such motion – e.g. our MorphServer). In our normal modes paper we figured that any such simulation is unnecessary – all we needed for that study was just a set of vectors connecting the residues from the two conformations (to examine how they correlate with the NM-predicted motion vectors).
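As a rough illustration of the comparison, the Python sketch below builds the per-residue displacement vectors between two conformations and scores them against a predicted mode. The overlap used here (normalized absolute dot product) is one common choice and is an assumption, not necessarily the exact measure in the paper; it also assumes the two conformations are already superimposed and list the same residues in the same order.

```python
import math


def displacement(conf_a, conf_b):
    """Per-residue vectors pointing from conformation A to conformation B."""
    return [tuple(b - a for a, b in zip(pa, pb))
            for pa, pb in zip(conf_a, conf_b)]


def overlap(observed, predicted):
    """Normalized |dot product| between two sets of per-residue vectors."""
    dot = sum(o * p for vo, vp in zip(observed, predicted)
              for o, p in zip(vo, vp))
    norm_o = math.sqrt(sum(o * o for v in observed for o in v))
    norm_p = math.sqrt(sum(p * p for v in predicted for p in v))
    return abs(dot) / (norm_o * norm_p)
```

An overlap near 1 means the mode points along the observed displacement (in either direction); an overlap near 0 means the two are essentially unrelated.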
Wednesday, July 18, 2007
Non-PDF format for 1136174s_TableS6 in Paper
We have put up Table S6 (and other supplementary data) as .xls and .tab format on our website:
http://sin.gersteinlab.org
Monday, July 2, 2007
Information concerning source code of the calculation for betweenness
The supplementary website: http://www.gersteinlab.org/proj/bottleneck/ contains more information about this paper.
Saturday, June 23, 2007
Program for calculating number of waters inside protein
93), do you have an algorithm available for distribution (or know of one)
that can calculate the number of waters inside a protein ?
The only programs available at geometry.molmovdb.org calculate packing density; positioning waters is best done by other programs.
Monday, June 18, 2007
Adding Original Genes with Gene Names to Pseudogenes Website
Most of the original genes are listed by Ensembl ID. You can look up their information at http://www.ensembl.org/
Sunday, June 17, 2007
Deciding which Fragment is Flexible or not on MolmovDB
If your protein has been crystallized in two conformations you can submit a job to our "morph server." Visual inspection may then give you the information you seek. If you have only one structure, you can submit it to our HingeMaster server, and several different flexibility analysis programs will be used to find the hinge location. Note that this is designed for use with domain hinge bending proteins. It will not be very helpful if your protein moves by shear or order-disorder transition mechanisms.
We also have a motion prediction program, the Conformation Explorer, which predicts conformational change for hinge bending proteins, either induced by ligand or otherwise. It's still under development, and use would require further discussion.
Tuesday, June 12, 2007
MolMovDB Data Dump
Monday, June 11, 2007
How to Create a Movie of Motions if there's DNA Lesion
If you submit the protein and DNA separately this should work. You would then have the problem, though, of putting the two back together. What I would suggest is submitting the two jobs and emailing me when they are done. I would then remove them from the public part of the database, but you could still download the structural interpolations. I would suggest using a small number of interpolated frames, at least for a first try, to minimize compute time.
Tuesday, June 5, 2007
Pseudogene Sequence Data Download
None of the flatfiles contain the raw sequence information. On an individual pseudogene basis, however, you can query the system for either the amino acid or nucleotide sequence. Simply search for the pseudogene you're looking for and on the results page click either the red or yellow button.
(Example results page: http://www.pseudogene.org/cgi-bin/search-results.cgi?tax_id=9606&set_search=63&criterion0=&operator0=&searchValue0=&all=View+All+Pseudogenes&sort=1&output=html )
To get the sequence information for a large set of pseudogenes, however, it would probably be best to write the program you suggested.
Tuesday, May 29, 2007
Packing Software on Mac OS X
Does the packing software on the molmovdb geometry site work on Mac OS X? If not, can you help with porting it to work on Mac OS X?
There have been problems compiling the program and making it work on Mac OS X. The program was written by a lab member who is no longer available for this. Please continue using the web interface.
PNAS Paper Error
Regarding the PNAS paper titled Genomic analysis of the hierarchical structure of regulatory networks, I am having trouble understanding the organization of the columns. How was the placement of proteins in columns determined? And what is the purpose behind the duplication of level 1 proteins such as SPT8, and gaps within the columns?
Unfortunately, I think that is a typo introduced when PNAS edited our paper. It is not in the PDF file that we sent to PNAS.
Sunday, May 20, 2007
Background Probability
We used a spatial ellipsoid of the coordinate distribution in the alignment for each atom's position, and used a flat prior in estimating this.
Table 1 Discrepancy
Regarding the Science paper titled Relating Three-Dimensional Structures to Protein Networks Provides Evolutionary Insights the numbers in the Table 1 tell something different. According to Table 1, Simultaneously possible interactions have less fraction of same functions, as well as less fraction in co-expression correlation. Can you please clarify?
The table headings unfortunately got switched at some unknown point during copy editing. We have posted a note to that point on our site. Do let me know if you need more help on this!
Friday, May 18, 2007
Sieve-Fit Program
see http://geometry.molmovdb.org/
then
http://bioinfo.mbb.yale.edu/geometry/screw-axis/
http://geometry.molmovdb.org/files/geometry/readme.html
then
http://geometry.molmovdb.org/files/geometry/src-prog3/sieve-fit.main.c
Unix Version of Voronoi Calculator?
Where is there a stand alone version of the Voronoi packing efficiency calculator program for unix platform?
The website runs the packing-eff.exe program (in the src-prog3/ folder) from this package: http://geometry.molmovdb.org/files/libproteingeometry-2.2.tgz
Currently there is no unix version.
Monday, May 14, 2007
How do I get your original, published papers online?
Essentially all of my work is available on-line. Go to:
http://papers.gersteinlab.org
and click on the appropriate "preprint" link. You will get a preprint or (if appropriate) journal reprint of the paper you want. There should be NO password challenges or other barriers. Usually, the papers are in PDF format but some are in HTML. (Other formats are available directly from http://papers.gersteinlab.org/e-print.)
Please let me know if you have any problems with this service. If you can't get
what you want, we can easily post you normal paper reprints.
PS I'm CC'ing this message to our FAQ archive (faq@bioinfo.mbb.yale.edu) as it
enables me to track the popularity of various papers.
Monday, April 16, 2007
Complicated question regarding a protocol to decrease the transition
Do you think that this behavior is correct? Does it look strange that I have any energetic barrier of the transition?
I would be very cautious about assuming that a linear interpolation between two structures represents the thermodynamically most probable trajectory of motion. I don't know why using more frames would increase hysteresis -- presumably you mean that an energy difference resulted when the protein nearly finished its morph trajectory. Maybe the coarser interpolation jumped over a barrier and found a more favorable path. If you clarify what you are doing I may be able to provide a hint, though I think most likely I do not have a rigorous answer for you.
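For reference, here is what a purely linear interpolation between two structures looks like in Python. This is only an illustration of the geometric idea, not the morph server's own procedure, and it assumes both structures list the same atoms in the same order; the function name is made up.

```python
def interpolate(start, end, n_frames):
    """Return n_frames coordinate sets going linearly from start to end."""
    if n_frames < 2:
        raise ValueError("need at least two frames (the two end points)")
    frames = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frames.append([tuple(a + t * (b - a) for a, b in zip(pa, pb))
                       for pa, pb in zip(start, end)])
    return frames
```

Asking for more frames only makes each step smaller along the same straight-line path, so any difference in energies between a coarse and a fine run comes from whatever minimization or simulation is applied to the frames afterwards.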
Sunday, April 15, 2007
Atomic Structure
I think this is possible. I think we can put this on our data download page on
sin.gersteinlab.org . We'll follow up shortly on this.
Tuesday, January 30, 2007
Clarification for table 1 of "Relating Three-Dimensional Structures to Protein Networks Provides Evolutionary Insights"
Yes, you're right, unfortunately the column headings of table 1 got switched at a late stage of editing. Since we've been getting a fair share of email about this, there is a note regarding this on the paper website http://sin.gersteinlab.org