International Chemical Identifier
The IUPAC International Chemical Identifier (InChI, pronounced "IN-chee") is a chemical structure code that represents a molecule as a standardized character string. This string can then be used to search databases or the Internet more easily.
The code was developed by IUPAC and NIST between 2000 and 2004 and serves as a digital equivalent of the IUPAC name of a given chemical compound. In addition to the main level, the chemical structure is described by five kinds of information: connectivity, tautomerism, isotopes, stereochemistry and formal charge.
The InChI algorithm converts the structural information it is given into the identifier string in three steps: normalization (removal of redundant information), canonicalization (generation of a unique set of atom labels) and serialization (output of the information in a fixed order). The software and the format are protected by trademark law, but are published as free software under the LGPL.
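The three steps can be illustrated with a deliberately simplified sketch. It is not the InChI algorithm and does not produce InChI strings: the function names, the brute-force search over atom numberings and the output format are assumptions chosen only to show why the same molecule yields the same string regardless of how its atoms were numbered on input.

```python
from itertools import permutations


def normalize(atoms, bonds):
    """Remove redundant information: bond direction and duplicate bonds."""
    return list(atoms), {frozenset(bond) for bond in bonds}


def canonicalize(atoms, bonds):
    """Choose one atom numbering by brute force (toy approach, tiny molecules only)."""
    best = None
    for perm in permutations(range(len(atoms))):
        # perm[old_index] is the new index of that atom
        elements = [None] * len(atoms)
        for old, new in enumerate(perm):
            elements[new] = atoms[old]
        edges = sorted(tuple(sorted(perm[i] for i in bond)) for bond in bonds)
        candidate = (tuple(elements), tuple(edges))
        if best is None or candidate < best:
            best = candidate
    return best


def serialize(canonical):
    """Write the canonical labels and bonds in a fixed order (hypothetical format)."""
    elements, edges = canonical
    bond_part = ",".join(f"{a + 1}-{b + 1}" for a, b in edges)
    return "".join(elements) + "/" + bond_part


def toy_identifier(atoms, bonds):
    return serialize(canonicalize(*normalize(atoms, bonds)))


# Heavy-atom skeleton of ethanol, entered with two different atom numberings
first = toy_identifier(["C", "C", "O"], [(0, 1), (1, 2)])
second = toy_identifier(["O", "C", "C"], [(1, 0), (2, 1), (1, 2)])
print(first == second)
```

Trying every numbering grows factorially with the number of atoms, so the sketch is usable only for a handful of atoms; it stands in for the canonicalization step purely for illustration.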
Examples
- Methane, CH4: InChI=1S/CH4/h1H4
- Ethanol, CH3CH2OH: InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3
- L-Ascorbic acid: InChI=1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1
Levels
There are six InChI levels (called "layers" in the InChI documentation):
- Main level
- Charge level
- Stereochemistry level
- Isotope level
- Fixed-H level
- Reconnected level
Sub-levels
Each level can be divided into sub-levels. For example, the main level consists of three sub-levels:
- Chemical formula (no prefix)
- Atom connections (prefix: "c")
- Hydrogen atoms (prefix: "h")
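As a small illustration of the chemical formula sub-level, the following sketch counts the atoms in a formula such as "C2H6O". The function name `formula_counts` is hypothetical, and the sketch assumes a single-component formula without separators or leading multipliers.

```python
import re


def formula_counts(formula):
    """Count atoms per element in a simple formula sub-level such as 'C2H6O'."""
    counts = {}
    # An element symbol is one capital letter, optionally followed by a lowercase letter
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


print(formula_counts("C2H6O"))
```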
Notation
Levels and sub-levels are separated from one another by a "/". Every level and sub-level, except the chemical formula sub-level of the main level, begins with a lowercase letter that indicates the type of information it contains. In InChI=1S/CH4/h1H4, for example, "CH4" is the chemical formula sub-level and "h1H4" is the hydrogen sub-level.
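The notation can be taken apart with ordinary string operations. The sketch below is a minimal illustration, not a validating parser: the function name `split_inchi` and the returned dictionary layout are assumptions, and it presumes that a chemical formula sub-level directly follows the version.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula and prefixed sub-levels."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI" or not body:
        raise ValueError("not an InChI string")
    parts = body.split("/")
    version = parts[0]
    # Assumption: the first part after the version is the unprefixed formula
    formula = parts[1] if len(parts) > 1 else ""
    # Each remaining part starts with a lowercase letter naming its information type
    sublevels = [(part[0], part[1:]) for part in parts[2:] if part]
    return {"version": version, "formula": formula, "sublevels": sublevels}


ascorbic = "InChI=1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1"
for letter, content in split_inchi(ascorbic)["sublevels"]:
    print(letter, content)
```

The sub-levels are returned as a list of pairs rather than a dictionary so that their order in the string is preserved.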
Literature
- Ulrich Rößler: Saving structural formulas as text. In: Nachr. Chem. 60, No. 2, 2012, p. 140, doi:10.1002/nadc.201290083.