BioNLP Reading Group: Useful Links |
| Resources * Corpora * Tools * Conferences/Workshops/Shared Tasks * Mailing Lists * Biology |
| Resources |
Entrez-PubMed i.e. MEDLINE. Searchable collection of >12m biomedical abstracts. E-Utilities enable multiple query/article searches (from the command line). MeSH vocabulary can be used in advanced search.
PubMed Central (PMC). Searchable archive of full text articles (some in pdf format only), similar to regular PubMed.
BioMed Central (BMC). Online publishers. Full text articles.
Gene Ontology (GO). Three searchable ontologies: cellular component, biological process and molecular function.
Stanford Biomedical abbreviation server. Probability scored long form of abbreviations (and vice versa). Mined from PubMed.
LocusLink. Provides a single query interface to curated sequence and descriptive information about genetic loci. Good source of gene names/synonyms/symbols and gene function with evidence citations.
Edinburgh Mouse Atlas. 3D embryonic mouse database. Mouse Anatomical Nomenclature of mouse development. Contains links to MGI database.
Mouse Genome Informatics (MGI). Jackson Labs. Provides integrated access to data on the genetics, genomics, and biology of the laboratory mouse.
Nucleic Acids Research Journal 2005 special issue on biological databases. Clickable list of databases.
OBO: Open Biological Ontologies.
Protein Interaction Databases. A collection of protein interaction database links.
Unified Medical Language System (UMLS). Integration of several "controlled" vocabularies. Loosely, a biomedical WordNet.
UMLS Knowledge Source Server (UMLSKS). Online lexical tool to match terms to the UMLS Specialist Lexicon, Metathesaurus and Semantic Network. Get licensee login details from Sarah Luger.
Medical Subject Headings (MeSH). National Library of Medicine's (NLM) controlled vocabulary. Includes online vocabulary look-up aid.
NCI Metathesaurus Browser. Incorporates the UMLS Metathesaurus as well as vocabularies developed by the National Cancer Institute (NCI).
Online Medical Dictionary. Handy resource for the non-biologist.
HGNC: HUGO Gene Nomenclature Committee - giving unique and meaningful names to every human gene. Source of gene synonyms.
Synonyms. Lists of gene synonyms by species and orthologues.
IUBMB. International Union of Biochemistry and Molecular Biology - recommendations on biochemical & organic nomenclature, symbols & terminology etc. Includes protein names and function.
Swiss-Prot. Commonly cited, annotated protein sequence database.
BioNLP.org: a font of knowledge for everything BioNLP. Also has associated mailing list.
Concordia University BioNLP links site.
BioText. Marti Hearst's BioText project at UC Berkeley.
BioNLP Resources a la Alex Morgan.
Keep up to date with publications: BARF Bioinformatics Aggregated RSS Feed; Nature RSS feed; EBI RSS feed. Possible newsreader: Bloglines.
Keep up to date with Google searches: Google Alert - sends you mail each time Google indexes your word.
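The E-Utilities entry above mentions running PubMed queries from the command line. A minimal sketch of what such a query looks like follows, assuming the public ESearch endpoint and its `db`, `term` and `retmax` parameters. The helper name `build_pubmed_search_url` and the example search term are illustrative only and do not come from this page. The sketch only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI E-Utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, retmax=20):
    """Return an ESearch URL that looks up PubMed IDs matching `term`.

    `term` may use PubMed field tags such as [MeSH Terms];
    `retmax` caps the number of IDs returned.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical MeSH-qualified query, for illustration only.
    print(build_pubmed_search_url("protein binding[MeSH Terms] AND mouse", retmax=5))
```

The printed URL can be fetched with any HTTP client (for example `curl`) to get an XML list of matching PubMed IDs. Check NCBI's usage guidelines before running many queries in a batch.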
| Corpora |
Genia. 2,000 annotated MEDLINE abstracts.
MuchMore. A collection of medical journal articles with queries and corresponding relevance judgements for Information Retrieval. Articles and queries available in German and English for cross-language tasks.
BioCreative. Corpus used in the BioCreative 2004 task.
Nigel Collier's corpus of abstracts annotated with protein, DNA, RNA and several 'source' tags. The source tags denote species, cell type, tissue type etc.
PathBinder: a collection of sentences extracted from MEDLINE with every sentence containing 2 or more different biomolecules. Also contains a synonym index.
| Tools |
MedPost: a part-of-speech tagger for biomedical text. Full text article. Download.
ABNER: A Biomedical Named Entity Recognizer. Recognizes gene, protein, RNA, cell line and cell type (individually).
YAGI: Yet Another Gene Identifier. Recognizes gene, protein and RNA names as one named entity.
AbGene. A gene name tagger trained on MEDLINE abstracts.
BioPerl. Includes a set of modules that provide access to MEDLINE and OpenBQS-compliant servers using SOAP (Bio::Biblio). Bio::DB::Flat deals with indexing bio-databases.
BioPython: contains a collection of modules for dealing with biological databases such as MEDLINE and Swiss-Prot.
| Conferences/Workshops/Shared Tasks |
Upcoming:
ISMB 2005 13th International Conference on Intelligent Systems for Molecular Biology, Detroit, June 25-29, 2005.
BioLINK SIG: Linking Literature, Information and Knowledge for Biology, ISMB/ACL joint workshop, Detroit, June 24, 2005.
BioSysBio Bioinformatics and Systems Biology Conference, New Royal Infirmary, Edinburgh, July 14-15.
LLL05 Learning Language in Logic, ICML05 workshop, 7 August 2005, Bonn, Germany. Includes a challenge task based on protein interactions.
NaTeMed2005 Special Session on Natural Language Processing and Text Mining in Medicine at 9th International Conference on Knowledge-Based & Intelligent Information & Engineering Systems (KES2005), Melbourne, Australia, 14-16 September 2005
ECCB05 4th European Conference on Computational Biology (ECCB05), Madrid, Spain, September 28 - October 1, 2005
TREC Genomics 2005. Conference: Gaithersburg, Nov, 2005. Competition.
PSB06 Pacific Symposium on Biocomputing Maui, Hawaii, January 3-7, 2006
ISMB 2006 14th International Conference on Intelligent Systems for Molecular Biology, Fortaleza, Brazil, August 6-10, 2006.
Past:
SMBM. First International Symposium on Semantic Mining in Biomedicine, Hinxton, Cambridge, April, 2005.
| Mailing Lists |
Scottish Bioinformatics Forum (SBF)
Edinburgh bioinformatics mailing list. Email Majordomo@lists.ed.ac.uk with 'subscribe bifx' in the body of the message.
BioNLP mailing list from BioNLP.org
| Biology |
Central Dogma of Molecular Biology: Link 1; Link 2; Link 3.