BioNLP Reading Group: Useful Links

Resources * Corpora * Tools * Conferences/Workshops/Shared Tasks * Mailing Lists * Biology

Resources

Entrez-PubMed, i.e. MEDLINE. Searchable collection of more than 12 million biomedical abstracts. E-Utilities enable multiple query/article searches from the command line. The MeSH vocabulary can be used in advanced search.

PubMed Central (PMC). Searchable archive of full text articles (some in pdf format only), similar to regular PubMed.

BioMed Central (BMC). Online publishers. Full text articles.

Gene Ontology (GO). Three searchable ontologies: cellular component, biological process and molecular function.

Stanford Biomedical abbreviation server. Probability scored long form of abbreviations (and vice versa). Mined from PubMed.

LocusLink. Provides a single query interface to curated sequence and descriptive information about genetic loci. Good source of gene names/synonyms/symbols and gene function with evidence citations.

Edinburgh Mouse Atlas. 3D embryonic mouse database. Mouse Anatomical Nomenclature of mouse development. Contains links to MGI database.

Mouse Genome Informatics (MGI). Jackson Labs. Provides integrated access to data on the genetics, genomics, and biology of the laboratory mouse.

Nucleic Acids Research Journal 2005 special issue on biological databases. Clickable list of databases.

OBO: Open Biological Ontologies.

Protein Interaction Databases. A collection of protein interaction database links.

Unified Medical Language System (UMLS). Integration of several "controlled" vocabularies. Loosely, a biomedical WordNet.

UMLS Knowledge Source Server (UMLSKS). Online lexical tool to match terms to the UMLS Specialist Lexicon, Metathesaurus and Semantic Network. Get licensee login details from Sarah Luger.

Medical Subject Headings (MeSH). National Library of Medicine's (NLM) controlled vocabulary. Includes online vocabulary look-up aid.

NCI Metathesaurus Browser. Incorporates the UMLS Metathesaurus as well as vocabularies developed by the National Cancer Institute (NCI).

Online Medical Dictionary. Handy resource for the non-biologist.

HGNC: HUGO Gene Nomenclature Committee - giving unique and meaningful names to every human gene. Source of gene synonyms.

Synonyms. Lists of gene synonyms by species and orthologues.

IUBMB. International Union of Biochemistry and Molecular Biology - recommendations on biochemical & organic nomenclature, symbols & terminology etc. Includes protein names and function.

Swiss-Prot. Commonly cited, annotated protein sequence database.

BioNLP.org: a font of knowledge for everything BioNLP. Also has associated mailing list.

Concordia University BioNLP links site.

BioText. Marti Hearst's BioText project at UC Berkeley.

IE for Bio

Web Search and Mining

BioNLP Resources a la Alex Morgan.

Keep up to date with publications: BARF Bioinformatics Aggregated RSS Feed; Nature RSS feed; EBI RSS feed. Possible newsreader: Bloglines.

Keep up to date with Google searches: Google Alert - sends you mail each time Google indexes your word.
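
Scripted PubMed searches: the E-Utilities noted under Entrez-PubMed above take ordinary HTTP requests, so a query can be assembled and fetched from a script. The following is a minimal sketch, not an official client. It only builds an ESearch URL and makes no network call; the helper name esearch_url and the example search term are illustrative assumptions, and NCBI's current usage guidelines should be checked before sending requests in bulk.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-Utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """Build an ESearch URL asking PubMed for record IDs matching `term`.

    `esearch_url` is a hypothetical helper written for this page.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    # urlencode escapes spaces and brackets so field tags such as
    # [MeSH Terms] survive inside the query string.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Restrict the search to a MeSH heading, as in PubMed's advanced search.
    print(esearch_url("Mice[MeSH Terms]", retmax=5))
```

The printed URL can be opened in a browser or fetched with any HTTP client; the response lists matching PubMed IDs.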

Corpora

Genia. 2000 annotated MEDLINE abstracts.

MuchMore. A collection of medical journal articles with queries and corresponding relevance judgements for Information Retrieval. Articles and queries available in German and English for cross-language tasks.

BioCreative. Corpus used in the BioCreative 2004 task.

Nigel Collier's corpus of abstracts annotated with protein, DNA, RNA and several 'source' tags. The source tags denote species, cell type, tissue type etc.

PathBinder: a collection of sentences extracted from MEDLINE with every sentence containing 2 or more different biomolecules. Also contains a synonym index.

Tools

MedPost: a part-of-speech tagger for biomedical text. Full text article. Download.

ABNER: A Biomedical Named Entity Recognizer. Recognizes gene, protein, RNA, cell line and cell type (individually).

YAGI: Yet Another Gene Identifier. Recognizes genes, proteins and RNA names as one named entity.

AbGene. A gene name tagger trained on MEDLINE abstracts.

BioPerl. Includes a set of modules that provide access to MEDLINE and OpenBQS-compliant servers using SOAP (Bio::Biblio). Bio::DB::Flat deals with indexing bio-databases.

BioPython: contains a collection of modules for dealing with biological databases such as MEDLINE, Swiss-Prot etc.

Conferences/Workshops/Shared Tasks

Upcoming:

ISMB 2005 13th International Conference on Intelligent Systems for Molecular Biology, Detroit, June 25-29, 2005.

BioLINK SIG: Linking Literature, Information and Knowledge for Biology, ISMB/ACL joint workshop, Detroit, June 24, 2005.

BioSysBio Bioinformatics and Systems Biology Conference, New Royal Infirmary, Edinburgh, July 14-15.

LLL05 Learning Language in Logic, ICML05 workshop, 7 August 2005, Bonn, Germany. Includes a challenge task based on protein interactions.

NaTeMed2005 Special Session on Natural Language Processing and Text Mining in Medicine at 9th International Conference on Knowledge-Based & Intelligent Information & Engineering Systems (KES2005), Melbourne, Australia, 14-16 September 2005

ECCB05 4th European Conference on Computational Biology (ECCB05), Madrid, Spain, September 28 - October 1, 2005

TREC Genomics 2005. Conference: Gaithersburg, Nov, 2005. Competition.

PSB06 Pacific Symposium on Biocomputing Maui, Hawaii, January 3-7, 2006

ISMB 2006 14th International Conference on Intelligent Systems for Molecular Biology, Fortaleza, Brazil, August 6-10, 2006.

Past:

SMBM. First International Symposium on Semantic Mining in Biomedicine, Hinxton, Cambridge, April, 2005.
DBiBD Workshop on Database Issues in Biological Databases (DBiBD), National e-Science Centre, Edinburgh, Scotland, January 8-9, 2005. (Gail has hard copy of proceedings.)
ISMB 2004 12th International Conference on Intelligent Systems for Molecular Biology combined with 3rd European Conference on Computational Biology, Glasgow, July/Aug 2004.
The 2004 BioLink meeting ISMB SIG.
NLPBA/BioNLP 2004 Joint Workshop on Natural Language Processing in Biomedicine and its Applications, incorporating a bio-entity recognition Shared Task.
BioCreative Task 2004. Bio-entity recognition.
TREC Genomics Information Retrieval competition and conference, with plans to include information extraction.
2004 Data Mining and Text Mining for Bioinformatics. Workshop at ECML/PKDD 2004, Pisa, September 2004.
2003 Data Mining and Text Mining for Bioinformatics. Workshop at ECML/PKDD 2003, Dubrovnik-Cavtat, Croatia, 22 September 2003.
Workshop on Natural Language Processing in Biomedicine, 2003, ACL Workshop.
Workshop on Natural Language Processing in the Biomedical Domain, 2002, ACL Workshop.
KDD Cup 2002 Competition involving data mining in molecular biology domains.

Mailing Lists

Scottish Bioinformatics Forum (SBF)

Edinburgh bioinformatics mailing list. Email Majordomo@lists.ed.ac.uk with 'subscribe bifx' in the body of the message.

BioNLP mailing list from BioNLP.org

Biology

Central Dogma of Molecular Biology: Link 1; Link 2; Link 3.


Maintained by Gail Sinclair. This page was last updated on 17/05/2005.