Is there any automatic way to convert a list of long gene names (like Cadherin_3453) to abbreviations, like CDHRN_3453? Is there any abbreviation naming convention in genomics or bioinformatics?
Sorry, no code herein
Since you did not mention a programming language, I am guessing that this is just a simple, one-time exercise.
Though it is not a true abbreviation, you could just remove all of the vowels in the gene name (as you may have done by accident in your example).
You could use:
http://www.togglecase.com/convert_to_disemvowelled_text.php
It changed Cadherin_3453 to Cdhrn_3453.
If you are looking to do this with a program that you can tailor to your specific needs, you can look into this SO question: String replace vowels in Python?
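For reference, here is a minimal Python sketch of the vowel-removal idea. The function name `disemvowel` and the `upper` flag are made up for illustration; it only handles ASCII vowels and leaves digits and underscores alone.

```python
import re

def disemvowel(name: str, upper: bool = False) -> str:
    """Remove ASCII vowels from a name (hypothetical helper, not a standard).

    Digits, underscores and consonants are kept as they are.
    """
    stripped = re.sub(r"[aeiouAEIOU]", "", name)
    return stripped.upper() if upper else stripped

names = ["Cadherin_3453"]
print([disemvowel(n) for n in names])              # ['Cdhrn_3453']
print([disemvowel(n, upper=True) for n in names])  # ['CDHRN_3453']
```

Keep in mind that two different long names can collapse to the same string once the vowels are gone, so it is worth checking the result for duplicates before relying on it.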
There is the HUGO database (maintained by the HUGO Gene Nomenclature Committee, HGNC), which tries to standardize human gene names and symbols. Depending on your use case, you can either query their online search each time or download the data and use your own local copy.
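If you go the download route, a lookup could be sketched roughly like this in Python. It assumes a tab-separated file with columns called `symbol` and `name`; check the header of the file you actually download, since those column names are an assumption here. The function name `load_symbol_table` is hypothetical, and the two sample rows only stand in for a real download.

```python
import csv
import io

def load_symbol_table(tsv_file, name_col="name", symbol_col="symbol"):
    """Build a {lower-cased full name: symbol} dict from a tab-separated file.

    The column names are assumptions; adjust them to match your download.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return {row[name_col].strip().lower(): row[symbol_col] for row in reader}

# Small in-memory sample standing in for the downloaded file.
sample = io.StringIO("symbol\tname\nCDH1\tcadherin 1\nCDH3\tcadherin 3\n")
table = load_symbol_table(sample)

print(table.get("cadherin 3"))     # CDH3
print(table.get("cadherin_3453"))  # None: names not in the table are not matched
```

Names like Cadherin_3453 will not match an approved name directly, so you would probably need to normalize your identifiers first or fall back to something like vowel removal for the ones that are not found.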