multiSubstitute.py#

usage: multiSubstitute.py [-h] [-ls]
                          [-if {xyz,log,com,gjf,sd,sdf,mol,mol2,out,dat,fchk,crest,xtb,sqmout,47,31,qout}]
                          [-s i,j,k,...=substituent1,substituent2,...] [-m]
                          [-o output destination]
                          [input file [input file ...]]

add or modify substituents with permutations

positional arguments:
  input file            a coordinate file

optional arguments:
  -h, --help            show this help message and exit
  -ls, --list           list available substituents
  -if {xyz,log,com,gjf,sd,sdf,mol,mol2,out,dat,fchk,crest,xtb,sqmout,47,31,qout}, --input-format {xyz,log,com,gjf,sd,sdf,mol,mol2,out,dat,fchk,crest,xtb,sqmout,47,31,qout}
                        file format of input - xyz is assumed if input is stdin
  -s i,j,k,...=substituent1,substituent2,..., --substitute i,j,k,...=substituent1,substituent2,...
                        substitution instructions
                        i, j, k are the 1-indexed position of the starting position of the
                        substituent(s) you are replacing
                        a substituent name prefixed by iupac: or smiles: (e.g. iupac:acetyl
                        or smiles:O=[N.]=O) will create the substituent from the
                        corresponding identifier
                        substituents can be separated by commas to use all combinations
  -m, --minimize        rotate substituents to try to minimize LJ energy
  -o output destination, --output output destination
                        output destination
                        $INFILE will be replaced with the name of the input file
                        $SUBSTITUTIONS will be replaced with the substitution info
                        Default: stdout

Examples#

Let’s say we want to generate a library of substituted pyridines. Here’s the numbering for our starting pyridine structure:

../_images/pyridine_numbered.png

For this example, we’ll be keeping positions numberd 7 and 11 the same as each other, and independently modifying position 9.

Here’s a command we can use to do this:

multiSubstitute.py pyridine.xyz -s 7,11=iupac:butyl,iupac:propyl,Et,Me,H -s 9=F,Cl,Br,CH2F,CHF2,CF3,NO2,iupac:acetyl,OMe,SMe,Me -o pyridines\$INFILE_$SUBSTITUTIONS.xyz

This command generates 55 structures. If 7 and 11 are modified independently but still use the same list of substituents (i.e. -s 7=... -s 11=...), it will generate 275 structures. Keep combinatorial explosions in mind when generating your libraries!