InChi generation options

yezrewot

InChi generation options

I am trying to generate InChI codes using pybel like this:

descriptor={}
descriptor['INCHI']=mol.write('INCHI')

While running, the output window reports an InChI warning that protons have been added or
removed. When I examine the structures, the InChI generation has produced a new tautomeric form
for many molecules, e.g. amides as imine-alcohols. The corresponding SMILES representation using
mol.write('CAN') works as expected, i.e. it gives the same tautomer as the input structure.

I am aware that InChI allows both mobile-H and fixed-H models. Can I pass
something to mol.write('INCHI') to generate a fixed-H representation?



-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
OpenBabel-scripting mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-scripting
Noel O'Boyle

Re: InChi generation options

pybel just uses the default options of each format. But even OpenBabel
currently doesn't seem to have the option you are looking for:

   InChI format
   IUPAC/NIST molecular identifier
   Write options, e.g. -xat
     X <Option string> List of InChI options:
     t add molecule name
     a output auxiliary information
     K output InChIKey
     u output only unique molecules
     U output only unique molecules and sort them
     e compare first molecule to others
     w don't warn on undef stereo or charge rearrangement

I am not very familiar with InChI beyond the basics. Perhaps Geoff or
Chris can comment on whether this option could be added.

Noel

Noel O'Boyle

Re: InChi generation options

Just to clarify, I assume that you are aware that the current
behaviour is the correct default. Different tautomers of the same
molecule should have the same InChI. That is one of the core ideas of
the InChI.

Noel

Noel O'Boyle

Re: InChi generation options

On 28/03/2008, lunchbox99 <[hidden email]> wrote:

> Thanks Noel. Perhaps I should also declare my relative newbie status when it
>  comes to matters InChI related. I usually use canonical SMILES.
>
>  I am aware that mobile-H treatment is deliberate and the default behavior,
>  presumably to facilitate comparisons between tautomers from various sources. I
>  am using openbabel via the python wrappers to write a structural profiler
>  script for a sample registration tool. The InChI is being tested as a
>  structure storage mechanism for a database of ~200k samples (real samples,
>  not virtual). I will also store canonical SMILES to provide a backup
>  solution if the InChI proves problematic.
>
>  If the stored InChI structure is read from the database and viewed,
>  chemists/biologists will not understand why the structures are "wrong"...
>  they would expect amides to look like amides, not imino-alcohols. If
>  openbabel does not support fixed-H InChI, maybe I will just stick with
>  cansmiles for structure viewing and only use InChI internally to facilitate
>  duplicate checking.

This may be a good solution. My understanding is that InChIs are
intended to provide a unique string for each molecule. In other
words, they are not really meant to be read and interpreted. It may be
possible to use OpenBabel to generate the various tautomers
corresponding to a particular structure though. Also, if storage space
is a problem, and you are not going to be reading the InChI, you may
want to use InChIKeys instead.
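
A minimal sketch of the "InChI for duplicate checking, SMILES for viewing" arrangement discussed above, in plain Python with no Open Babel calls. The function name `find_duplicates` and the record layout are assumptions made for illustration. The three triazole tautomers and their shared InChI are the ones reported later in this thread, and the ethanol entry is an illustrative extra.

```python
from collections import defaultdict

# Each record is (sample_id, display SMILES, InChI). The display SMILES keeps the
# tautomer the chemist drew; the mobile-H InChI is used only as a comparison key.
TRIAZOLE_INCHI = "InChI=1S/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)"
RECORDS = [
    ("115285", "c1([nH]c(nn1)N)CC(=O)O", TRIAZOLE_INCHI),
    ("154294", "c1(nc(n[nH]1)N)CC(=O)O", TRIAZOLE_INCHI),
    ("154307", "c1(nc([nH]n1)N)CC(=O)O", TRIAZOLE_INCHI),
    ("ethanol", "CCO", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
]


def find_duplicates(records):
    """Group records by InChI and return only the groups with more than one member."""
    groups = defaultdict(list)
    for sample_id, smiles, inchi in records:
        groups[inchi].append((sample_id, smiles))
    return {inchi: members for inchi, members in groups.items() if len(members) > 1}


for inchi, members in find_duplicates(RECORDS).items():
    print(inchi)
    for sample_id, smiles in members:
        print("  ", sample_id, smiles)
```

With this layout the viewer only ever renders the stored SMILES, so amides stay drawn as amides, while tautomers still collapse onto one key for registration checks.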

HTH

>  mc
Chris Morley-2

Re: InChi generation options

In reply to this post by Noel O'Boyle
Noel O'Boyle wrote:
> pybel just uses the default options of each format. But even OpenBabel
> currently doesn't seem to have the option you are looking for:

On 27/03/2008, [hidden email] <[hidden email]> wrote:
 > >  I am aware that InChI allows the generation of mobile H atom and fixed H atom models - can I pass
 > >  something to mol.write('INCHI') to generate a fixed-H atom representation?

On the babel commandline adding the option -xX "FixedH" would include a
fixed hydrogen layer in the InChI. This is a bit cumbersome so I have
added a couple of other options to the development code.
   F include fixed hydrogen layer
   M include bonds to metal

You can now call simple output options like these from C++ code like
conv.AddOption("F");
where conv is the OBConversion object.

I guess the syntax from Python is similar. Maybe Pybel could wrap option
calls for output formats somehow?
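
For the Python side, a rough sketch of the same call through the OBConversion bindings might look like the following. This is an assumption-laden example rather than a confirmed recipe: the function name `fixed_h_inchi` is made up for illustration, the import differs between Open Babel 2.x and 3.x, and whether the option is honoured depends on the InChI library the build links against.

```python
def fixed_h_inchi(smiles):
    """Return an InChI for a SMILES string, asking for a fixed-H layer (sketch)."""
    try:
        from openbabel import openbabel as ob  # Open Babel 3.x module layout
    except ImportError:
        import openbabel as ob                 # Open Babel 2.x module layout

    conv = ob.OBConversion()
    if not conv.SetInAndOutFormats("smi", "inchi"):
        raise RuntimeError("SMILES or InChI format not available in this build")

    # Same effect as -xX "FixedH" on the babel command line. In builds that have
    # the short option described above, conv.AddOption("F", conv.OUTOPTIONS)
    # should be the equivalent.
    conv.AddOption("X", conv.OUTOPTIONS, "FixedH")

    mol = ob.OBMol()
    if not conv.ReadString(mol, smiles):
        raise ValueError("could not parse SMILES: %r" % smiles)
    return conv.WriteString(mol).strip()
```

Calling `fixed_h_inchi("CC(=O)N")` on acetamide is a quick way to check whether a given installation accepts the option. As far as I know, later pybel releases also take write options directly, along the lines of `mol.write("inchi", opt={"F": None})`, but check the pybel version in use.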

Chris




Charlie Zhu

Re: InChi generation options

Reply Threaded More More options
Print post
Permalink
Hi Noel,

I am using the OBDotNet lib 2.0 and find that OBConversion doesn't add a FixedH layer to the InChI output by default.
For example, these three SMILES
c1([nH]c(nn1)N)CC(=O)O 115285.mol
c1(nc(n[nH]1)N)CC(=O)O 154294.mol
c1(nc([nH]n1)N)CC(=O)O 154307.mol

yield the same InChI (with no fixed-H layer):
InChI=1S/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)

And if I add the option below to the OBConversion object, it produces an empty string.

    // Convert a molecule string between formats, requesting a fixed-H layer on output.
    public static SqlString conversion(SqlString molecule, SqlString intype, SqlString outtype)
    {
        OBConversion obconv = new OBConversion();
        OBMol mol = new OBMol();

        // Pass "FixedH" through to the InChI library (as -xX "FixedH" does on the command line).
        obconv.AddOption("X", OBConversion.Option_type.OUTOPTIONS, "FixedH");
        obconv.SetInAndOutFormats(intype.ToString(), outtype.ToString());
        obconv.ReadString(mol, molecule.ToString());

        // Write the molecule once and return the result.
        return obconv.WriteString(mol);
    }

So I replaced "libstdinchi.dll" with a renamed copy of the "libinchi.dll" shipped with the Windows Open Babel GUI. The code then accepts the option and outputs the expected InChI:

InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h6,9H,5H2
InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h7,9H,5H2
InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h8-9H,5H2
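
To make the difference between the two sets of output easier to see, here is a small plain-Python sketch that separates the main layers from the fixed-H part (everything after `/f`). The helper name `split_layers` is an assumption for illustration and the parsing is deliberately naive; it is not an InChI parser.

```python
STANDARD = "InChI=1S/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)"
FIXED_H = [
    "InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h6,9H,5H2",
    "InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h7,9H,5H2",
    "InChI=1/C4H6N4O2/c5-4-6-2(7-8-4)1-3(9)10/h1H2,(H,9,10)(H3,5,6,7,8)/f/h8-9H,5H2",
]


def split_layers(inchi):
    """Return (main_layers, fixed_h_part); fixed_h_part is None without a /f layer."""
    body = inchi.split("=", 1)[1]              # drop the "InChI=" prefix
    _version, _, layers = body.partition("/")  # "1S" or "1", then the layers
    main, sep, fixed_h = layers.partition("/f")
    return main, (fixed_h if sep else None)


for inchi in [STANDARD] + FIXED_H:
    main, fixed_h = split_layers(inchi)
    print(main, "| fixed-H:", fixed_h)
```

All four strings share the same main layers. Only the part after `/f` differs, and it records which nitrogen carries the hydrogen in each tautomer.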

Could you tell me whether this is a defect or whether I am misusing the library? Thanks.

Charlie
