1.Representation of Chemicals in Biomedical Terminologies
Stefan Schulz
Medical Informatics
Research Group
University Medical Center
Freiburg, Germany
2nd CHEBI User Group Workshop 2010, 23-24 June 2010, EMBL-EBI, Hinxton, Cambridge, CB10 1SD, UK
2.Purpose of this talk
To give an overview of sources of chemicals in biomedical terminologies based on the UMLS
To estimate their coverage related to ChEBI
To analyze the ontological representation in the sources
To discuss cross mapping with ChEBI
3.Overview of UMLS
4.Unified Medical Language System (UMLS)
Metathesaurus
Very large, multi-purpose and multi-lingual vocabulary database (158 sources)
Information about biomedical concepts (2M), their various names (8M), and the relationships among them (41M)
IP restrictions apply
Semantic Network
Semantic Types that provide a consistent categorization of all concepts represented in the UMLS Metathesaurus
9.Semantic Labeling of UMLS concepts
Semantic labeling
Done by the NLM
Each UMLS concept is assigned to one or more semantic types
10.Chemicals in UMLS and its sources
11.Semantic Network types for chemicals:
T103|Chemical
T104|Chemical Viewed Structurally
T109|Organic Chemical
T110|Steroid
T111|Eicosanoid
T114|Nucleic Acid, Nucleoside, or Nucleotide
T115|Organophosphorus Compound
T116|Amino Acid, Peptide, or Protein
T118|Carbohydrate
T119|Lipid
T120|Chemical Viewed Functionally
T121|Pharmacologic Substance
T122|Biomedical or Dental Material
T123|Biologically Active Substance
T124|Neuroreactive Substance or Biogenic Amine
T125|Hormone
T126|Enzyme
T195|Antibiotic
T192|Receptor
T127|Vitamin
T129|Immunologic Factor
T130|Indicator, Reagent, or Diagnostic Aid
T131|Hazardous or Poisonous Substance
T196|Element, Ion, or Isotope
T197|Inorganic Chemical
T200|Clinical Drug
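As a minimal sketch of how this list can be used, the following Python snippet treats a concept as a chemical if at least one of its semantic types is among those listed above. The simplified `ID|TUI` row layout and the concept identifiers `X1` to `X3` are assumptions made for illustration and are not real UMLS identifiers; T047 (Disease or Syndrome) serves as a non-chemical type.

```python
# Semantic type identifiers (TUIs) copied from the list above.
CHEMICAL_TUIS = {
    "T103", "T104", "T109", "T110", "T111", "T114", "T115", "T116",
    "T118", "T119", "T120", "T121", "T122", "T123", "T124", "T125",
    "T126", "T127", "T129", "T130", "T131", "T192", "T195", "T196",
    "T197", "T200",
}


def chemical_concepts(rows):
    """Return the IDs of concepts with at least one chemical semantic type.

    `rows` is an iterable of 'ID|TUI' strings; this simplified layout is
    an assumption for the sketch, not the exact UMLS file format.
    """
    result = set()
    for row in rows:
        concept_id, tui = row.strip().split("|")[:2]
        if tui in CHEMICAL_TUIS:
            result.add(concept_id)
    return result


# Hypothetical sample rows: X1 has two chemical types, X2 has none.
sample = ["X1|T109", "X1|T121", "X2|T047", "X3|T196"]
print(sorted(chemical_concepts(sample)))
```

Because a concept may carry several semantic types, the check is made per row and the results are collected in a set, so a concept typed both as Organic Chemical and as Pharmacologic Substance is counted once.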
20.Alternative Billing Concepts
ALT Matthiola graeca / giliflower
ALT Adoxa moschatellina / common moschatel
ALT Cinnamomum camphora, camphor, Homeopathic preparation
ALT Croton eleuteria / cascarilla / amber kabug / sweet bark
ALT Pediculus capitis, Homeopathic preparation
ALT Zea italica, corn silk, Homeopathic preparations
ALT Hippozaeninum / glanders nosode
ALT Cobalt
ALT Salvia officinalis, homeopathic preparation
ALT Arbutus andrachne preparation
ALT Andira araroba / chrysarobinum / chrysophan / goa powder
ALT Sedum acre / small houseleek
ALT Urinum humanum / human urine
ALT Aurum muriaticum natronatum / double chloride of gold and sodium / sodium chloroaurate
ALT Cistus canadensis preparation
ALT Xanthorrhea arborea preparation
ALT Cornus florida preparation
ALT Aquilegia vulgaris preparation
ALT Ergotinum, homeopathic preparation
ALT Mimulus lewisii / rose colored musk
ALT Lac vaccinum coagulatum / milk curds
ALT Robinia pseudoacacia / yellow locust
ALT Solidago virgaurea, homeopathic preparation
ALT Cholesterinum / cholesterine
ALT Benzinum dinitricum, benzinum, benzol, coal naphtha, Homeopathic preparation
ALT Derris pinnata / pongram
ALT Calcarea renalis, Homeopathic preparations
ALT Centella asiatica, homeopathic preparation
ALT Culex musca, Homeopathic preparation
ALT Python regia (homeopathic remedy)
ALT Darlingtonia californica / California pitcher plant
ALT Five flower formula / rescue remedy
21.Chemicals in UMLS source vocabularies
22.Hidden references to chemicals
Accidental poisoning by other opiates NOS
Accidental poisoning by codeine
Accidental poisoning by pethidine
Accidental poisoning by morphine
Accidental poisoning by opium
Accidental poisoning by aromatic analgesics NOS
Accidental poisoning by aromatic analgesics NEC
Accidental poisoning by acetanilide
Accidental poisoning by phenacetin
Accidental poisoning by aminophenazone
Accidental poisoning by antirheumatics NOS
Accidental poisoning by pentazocine
Accidental poisoning by pentobarbitone
Accidental poisoning by quinalbarbitone
Accidental poisoning by bromides
Accidental poisoning by cabromal derivatives
Accidental poisoning by carbamic esters
Accidental poisoning by chlorpromazine
Accidental poisoning by fluphenazine
Accidental poisoning by prochlorperazine
Accidental poisoning by promazine
Accidental poisoning by spiperone
Accidental poisoning by chlordiazepoxide
Example: ICD9-CM
Example:
*intox* or *poison* or *allerg* returns 10800 non-chemical concepts
roughly half of them refer to chemicals
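The wildcard search above can be approximated with a case-insensitive regular expression. This is only a sketch: the function name `candidate_terms` and the two terms that do not come from the slides ("Fracture of femur", "Allergy to penicillin") are made up for illustration, and deciding which hits actually refer to a chemical would still need a further step.

```python
import re

# Substring patterns corresponding to *intox*, *poison* and *allerg*.
PATTERN = re.compile(r"intox|poison|allerg", re.IGNORECASE)


def candidate_terms(terms):
    """Return the terms that match one of the three wildcard patterns."""
    return [t for t in terms if PATTERN.search(t)]


terms = [
    "Accidental poisoning by codeine",   # from the slide
    "Accidental poisoning by bromides",  # from the slide
    "Fracture of femur",                 # hypothetical non-matching term
    "Allergy to penicillin",             # hypothetical matching term
]
for term in candidate_terms(terms):
    print(term)
```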
23.Explicit reference to chemicals
24.Chemicals in UMLS: Summary
MeSH (85% Substance terms) is the most important source for chemicals
Health care related sources also include natural products, drugs, lab procedures
Pharmacy related sources include pharmaceutical preparations and products
Many sources are rather heterogeneous (UMLS typing not always consistent)
Implicit references to chemicals in most clinical terminologies
25.Ontology aspects of UMLS chemistry sources
26.Ontology aspects of UMLS chemistry sources
UMLS only includes Concept – Relation – Concept triplets
Only very few UMLS sources are “ontology-like”, i.e. they have some formal semantics, e.g. SNOMED CT or NDF-RT
UMLS distinguishes thesaurus-style broader/narrower hierarchy-building relations from more precise ones (“relation attributes”)
Only some of the latter describe the represented entities themselves (e.g. part-of, has-active-ingredient); others describe the representational units and the attached terms (“mapped-to”, “has-translation”)
27.Ontological relations involving chemicals (608,315)
Chemical – Rel – Non-Chemical
28.Ontological relations between chemicals (173,502)
Chemical – Rel – Chemical
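The split into chemical–non-chemical and chemical–chemical relations can be sketched as a simple count over Concept – Relation – Concept triples. The code below is an illustrative assumption, not the procedure used to obtain the figures on the slides: the triples, the relation names, and the set of chemical concepts are toy data invented for the example.

```python
from collections import Counter


def classify_triples(triples, chemicals):
    """Count triples by how many of their two concepts are chemicals."""
    counts = Counter()
    for subject, _relation, obj in triples:
        n_chemical = (subject in chemicals) + (obj in chemicals)
        if n_chemical == 2:
            counts["chemical-chemical"] += 1
        elif n_chemical == 1:
            counts["chemical-non-chemical"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy data: concept names and relations are invented for illustration.
chemicals = {"salt A", "base A", "drug B"}
triples = [
    ("salt A", "has_free_acid_or_base_form", "base A"),
    ("drug B", "may_treat", "disease C"),
    ("disease C", "has_finding_site", "organ D"),
]
print(classify_triples(triples, chemicals))
```

In practice the `chemicals` set would be derived from the semantic types assigned to each concept, so any type misassignment shifts triples between the two categories.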
29.Analysis of relations in UMLS
Broad spectrum and high number of relations between chemicals and non-chemicals. Of interest for relating chemicals to other concepts of biomedical interest.
Rather poor in terms of inter-chemical relations, often due to Semantic type misassignments
SNOMED CT: quinupristin-dalfopristin has_active_ingredient dalfopristin
NDF-RT: Raloxifene Hydrochloride has_mechanism_of_action Selective Estrogen Receptor Modulators
CRISP: Reserpine used_for reserpate derivative
NCI: Rimantadine Hydrochloride has_free_acid_or_base_form Rimantadine
30.MeSH in PubChem
Properties as parents in informal hierarchy
31.Mapping / Tagging
32.UMLS MetaMap / Medical Text Indexer
567 Morphinans [Organic Chemical]
577 Seconal [Organic Chemical,Pharmacologic Substance]
604 Talwin [Organic Chemical,Pharmacologic Substance]
627 Acetanilides [Organic Chemical,Pharmacologic Substance]
637 Aromatic (AROMATICS) [Organic Chemical,Pharmacologic Substance]
645 Esters [Organic Chemical]
645 derivatives [Chemical Viewed Structurally]
660 Acetanilid (acetanilide) [Organic Chemical,Pharmacologic Substance]
660 Amidophenazon (Aminopyrine) [Organic Chemical,Pharmacologic Substance]
660 Bromides [Inorganic Chemical]
660 Chlorpromazine [Organic Chemical,Pharmacologic Substance]
660 Codeine [Organic Chemical,Pharmacologic Substance]
660 Morphine [Organic Chemical,Pharmacologic Substance]
660 Opium [Organic Chemical,Pharmacologic Substance]
660 Pentazocine [Organic Chemical,Pharmacologic Substance]
660 Pentobarbitone (Pentobarbital) [Organic Chemical,Pharmacologic Substance]
660 Pethidine (Meperidine) [Organic Chemical,Pharmacologic Substance]
660 Phenacetin [Organic Chemical,Pharmacologic Substance]
660 Quinalbarbitone (Secobarbital) [Organic Chemical,Pharmacologic Substance]
1000 Fluphenazine [Organic Chemical,Pharmacologic Substance]
MetaMap Version Used: metamap09
MetaMap Options: -A+
Lexicon Used: 2009
Knowledge Source Used: 09
Input Text:
Accidental poisoning by codeine
Accidental poisoning by pethidine
Accidental poisoning by morphine
Accidental poisoning by opium
Accidental poisoning by aromatic analgesics NOS
Accidental poisoning by aromatic analgesics NEC
Accidental poisoning by acetanilide
Accidental poisoning by phenacetin
Accidental poisoning by aminophenazone
Accidental poisoning by antirheumatics NOS
Accidental poisoning by pentazocine
Accidental poisoning by pentobarbitone
Accidental poisoning by quinalbarbitone
Accidental poisoning by bromides
Accidental poisoning by cabromal derivatives
Accidental poisoning by carbamic esters
Accidental poisoning by chlorpromazine
Accidental poisoning by fluphenazine
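Result lines such as those shown on this slide can be turned into structured records for further processing. The sketch below assumes the layout seen here: a leading number (which appears to be the MetaMap score), the matched name, and the semantic types in square brackets separated by commas without a following space. It is not based on a documented MetaMap output specification, and the function name `parse_line` is made up.

```python
import re

# number, name, [semantic types]
LINE = re.compile(r"^(\d+)\s+(.+?)\s+\[(.+)\]$")


def parse_line(line):
    """Parse one result line; return None for lines in another format."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    score, name, types = match.groups()
    # Split only on commas not followed by a space, so that a type such as
    # "Element, Ion, or Isotope" stays in one piece.
    return int(score), name, re.split(r",(?! )", types)


print(parse_line(
    "660 Pentobarbitone (Pentobarbital) "
    "[Organic Chemical,Pharmacologic Substance]"
))
print(parse_line("MetaMap Version Used: metamap09"))
```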
33.Whatizit
34.Conclusions
Most biomedical terminologies contain chemical concepts, drugs, or concepts referring to them
MeSH has the highest coverage
Fairly good coverage of semantic relations linking chemicals to non-chemicals
No significant source for semantic relations between chemicals
Mappings ChEBI – UMLS:
to MeSH via PubChem, but only higher level MeSH terms
NLP tools (MetaMap, Medical Text Indexer, WhatIzIt) are not yet optimized for chemical names.