Properties calculations
From a molecular formula, it is possible to :
- calculate the molecular masses
- calculate the isotopic distributions
- simulate a mass spectrum providing and isotopic distribution and a peak width.
Parsing a chemical formula
The private function MSj.formula
takes an input string representing a chemical formula, such as "CH4", counts the number of atoms in the formula and returns a dictionary.
MSj.formula("CH3Br")
Dict("Br" => 1,"C" => 1,"H" => 3)
Dict{String,Int64} with 3 entries:
"Br" => 1
"C" => 1
"H" => 3
In the previous example, the dictionary had three elements corresponding to the three atomic species found in the "CH3Br" formula. The "Br" key has the value 1, the "C" key has also 1 and the "H" key has a value equals to 3. The output of this function will be used by all the other functions, which will calculate properties from it. The MSj.formula
function accepts stoichiometric regular molecular formulas as well as more developed forms. For hexane the following entries "C6H14", "CH3CH2CH2CH2CH2CH3" or "CH3(CH2)4CH3" are equivalents.
MSj.formula("C6H14");
Dict("C" => 6,"H" => 14)
MSj.formula("CH3CH2CH2CH2CH2CH3");
Dict("C" => 6,"H" => 14)
MSj.formula("CH3(CH2)4CH3");
Dict("C" => 6,"H" => 14)
The indices found after a parenthesis are used to multiply the elements inside the parenthesis. If no indices are found after ")", then the group is not multiply. The parenthesis are also used to specify an isotope, for example, consider ethane isotopically labelled with carbon 13:
MSj.formula("CH3(13C)H3");
Dict("C" => 1,"13C" => 1,"H" => 6)
These expressions are equivalent: "CH3(13C)H3", "CH3(13CH3)", "C(13C)H6". The following isotopes are recognized by MSj.formula:
- 1H for ${}^1H_{1}$ (protium)
- 2H for ${}^2H_{2}$ (deuterium)
- D is equivalent to 2H
- 12C for ${}^{12}C_{6}$
- 13C for ${}^{13}C_{6}$
- 14N for ${}^{14}N_{7}$
- 15N for ${}^{15}N_{7}$
- 16O for ${}^{16}O_{8}$
- 18O for ${}^{18}O_{8}$
- 32S for ${}^{32}S_{16}$
- 34S for ${}^{34}S_{16}$
The elements are stored in a dictionary called MSj.Elements. Each key of the MSj.Elements points to an Array
of MSj.Isotope
, which is a structure used to stored the different properties of the isotopes:
struct Isotope
m::Float64 # mass
f::Float64 # natural frequency
logf::Float64 # logarythm of the natural frequency
Z::Int # atomic number
A::Int # mass number
active::Bool # is radioactive
end
The isotopes are sorted by natural frequency. Hence, for instance, the first sulfur isotope is ${}^{32}S_{16}$ with a natural frenquency of 0.995 followed by ${}^{34}S_{16}$ with 0.043, etc.
julia> E = MSj.Elements["S"]
4-element Array{MSj.Isotope,1}:
MSj.Isotope(31.9720711741, 0.9498500119990401, -0.051451188958515866, 16, 32, false)
MSj.Isotope(33.96786703, 0.04252059835213182, -3.157766653355949, 16, 34, false)
MSj.Isotope(32.9714589101, 0.00751939844812415, -4.890269137820559, 16, 33, false)
MSj.Isotope(35.9670812, 0.00010999120070394368, -9.115110188972029, 16, 36, false)
The first element of the array, is the most naturally abundant isotope. The properties of the isotopes may be accessed by:
julia> E = MSj.Elements["S"];
julia> E[1] # returns the most abundant isotope of sulfur
MSj.Isotope(31.9720711741, 0.9498500119990401, -0.051451188958515866, 16, 32, false)
julia> E[2].f # returns natural abundance of the second most abundant istotope of sulfur.
0.04252059835213182
Molecular masses
The public function masses
, return a dictionary with three keys "Monoisotopic", "Average" and "Nominal", defined as:
Mass | Element | Molecule or ion |
---|---|---|
Nominal | Mass number of the most abundant stable isotope | Sum of the nominal masses of the most abundant stable isotopes |
Monoistopic | Atomic mass of the most abundant stable isotope | Sum of the atomic masses of the most abundant stable isotopes |
Average | Mass average function of the relative abundances | Sum of the average atomic weights of the constituents |
The masses for hexane are calculated by:
julia> m = masses("CH3(CH2)4CH3")
Dict("C" => 6,"H" => 14)
Dict{String,Float64} with 3 entries:
"Monoisotopic" => 86.1096
"Average" => 86.178
"Nominal" => 86.0
And the specific masses may be accessed by:
julia> m["Average"]
86.178
julia> m["Monoisotopic"]
86.10955045178
julia> m["Nominal"]
86.0
Isotopic distributions
The public function isotopic_distribution
calculates the isotopic distribution of the given formula. The function takes the following arguments:
- formula:
String
- target probability:
Real
number - charge: optional argument
Int
, by default = +1 - tau: optional
Real
number set by default to 0.1 - the elements dictionary: set by default to
MSj.Elements
.
The calculations is based on the implementation of the isospec
algorithm. Briefly, the algorithm search for the small set of isotopologues
for which the total abundance is equal to the target probability. The calculation return a vector with all the configurations found. The first column gives the masses, the second column the probabilities of the configurations, and the following columns gives the configurations, such as:
julia> I = isotopic_distribution("C6H14", 0.9999, charge = +1)
Dict("C" => 6,"H" => 14)
5×6 Array{Union{Float64, Int64, String},2}:
"Masses" "Probability" "12C" "13C" "1H" "2H"
86.1096 0.935476 6 0 14 0
87.1129 0.0612122 5 1 14 0
88.1163 0.00166891 4 2 14 0
87.1158 0.00151559 6 0 13 1
The resulting array is intend to be readable easily, and thus the first line contains the descriptors or the columns. It may also be easily exported to a delimited file.
Simulated mass spectra
The result of an isotopic distribution calculation may be convoluted with a peak shape to produce a simulated mass spectrum. Such an operatio, is achieved by the simulate
function. The function takes the following arguments:
- the
Array
resulting from theisotopic_distribution
calculation - the width in Dalton of the peak shape
- the model used for the peak shape. By default it is set to
model=:gauss
, but may bemodel=:lorentz
ormodel=:voigt
. - An
Int
value for the number of points in the mass spectrum. By default set toNpoints=1000
.
Hence, the simulated istopic distribution of haxane can be obtained and plotted like this:
julia> I = isotopic_distribution("C6H14", 0.9999, charge = +1);
Dict("C" => 6,"H" => 14)
julia> sim = simulate(I, 0.05, model= :lorentz);
julia> Using Plots
julia> plot(sim)
See Plotting for more information on how to plot mass spectra.