Mulliken constrained CI calculations#

Approximate methods have a tendancy to distribute populations continuously, even at bond distances approaching infinity. By artificially distributing the population within the molecule, we can model the correct behaviour using full configuration interaction calculations and an appropriate population scheme.

# Force the local gqcpy to be imported
import sys
sys.path.insert(0, '../../build/gqcpy/')

import gqcpy
import numpy as np
import math as m
from scipy import optimize
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

Molecular setup#

To clearly show the changing behaviour when the bond distance increases, we will look at $H_2$ at three different bond distances.

distances = [4.5, 6.5, 8.5]
molecules = []

for distance in distances:
    molecules.append(gqcpy.Molecule.HChain(2, distance, 0))

Constrained CI setup#

In order to make thins as clear as possible, we will start by generating a helper class that will facilitate the calculations. This class will provide us with the necessary tools for two types of calculations: a scan over a series of multiplier values and a more targetted optimization in order to find mu values that lead to pre-defined population values.

class MullikenConstrainedCI:
    
    def __init__(self, molecule, basis_set):
        """
        :param molecule:      The molecule required for the calculations.
        :param basis_set:     The basis set in which the calculations will take place.
        """

        # We will use GQCP to set up all the required variables for the calculation.
        # Set up a spin orbital basis.
        spin_orbital_basis = gqcpy.RSpinOrbitalBasis_d(molecule, basis_set)
        
        # Orthonormalize the basis and use it to quantize a Hamiltonian. 
        # The second quantized Hamiltonian representation will be stored within the object.
        spin_orbital_basis.lowdinOrthonormalize()
        fq_hamiltonian = gqcpy.FQMolecularHamiltonian(molecule)
        self.sq_hamiltonian = spin_orbital_basis.quantize(fq_hamiltonian)
        
        # For full CI calculations, we will need an ONV basis, which will also be stored within the object.
        # N_total is divided correctly in N_a alpha electrons and N_b beta electrons.    
        K = int(spin_orbital_basis.numberOfSpatialOrbitals())
        N_total = molecule.numberOfElectrons()
        
        if N_total % 2 == 0:
            N_a = int(N_total / 2)
            N_b = int(N_total / 2)
        else:
            N_a = int(np.ceil(N_total / 2))
            N_b = int(np.floor(N_total / 2))
        
        self.onv_basis = gqcpy.SpinResolvedONVBasis(K, N_a, N_b)
        
        # Calculate the nuclear repulsion term and store it within the object for later use as well.
        self.nuclear_repulsion = gqcpy.NuclearRepulsionOperator(molecule.nuclearFramework()).value()
        
        # In order to constrain the Mulliken population, we require the Mulliken-partitioned number operator.
        # We can store this operator within the object, since it is the same for any value of the Lagrangian multiplier.
        S = spin_orbital_basis.quantize(gqcpy.OverlapOperator())
        mulliken_domain = spin_orbital_basis.mullikenDomain(lambda shell: shell.nucleus().position()[1] == 0 and shell.nucleus().position()[2] == 0)
        
        self.S = S
        self.mulliken_domain = mulliken_domain
        P = self.mulliken_domain.projectionMatrix(spin_orbital_basis.expansion())
        self.mulliken_operator = S.partitioned(P)

    # Next, we will define a couple of methods for this class that will help us later on with our calculations.
    # We will define a private method that can solve the corresponding full configuration interaction eigenvalue problem.   
    def _solveCIEigenproblem(self, hamiltonian):
        """
        A method used to solve a CI eigenproblem.

        :param hamiltonian:      The SQHamiltonian used for the calculation.
        :param onv_basis:        The ONV basis used for the calculation.

        :returns:                The ground state energy.
        :returns:                The ground state parameters
        """
        # Use GQCP to set up a full CI calculation.
        CIsolver = gqcpy.EigenproblemSolver.Dense_d()
        CIenvironment = gqcpy.CIEnvironment.Dense(hamiltonian, self.onv_basis)
        qc_structure = gqcpy.CI(self.onv_basis).optimize(CIsolver, CIenvironment)

        return qc_structure.groundStateEnergy(), qc_structure.groundStateParameters()

    # We will define a method that calculates the energy and the population when a specific multiplier is provided.
    def calculateEnergyAndPopulationAtMultiplier(self, multiplier):
        """
        A method used to calculate the energy and the population at a given multiplier `mu`.

        :param multiplier:      The multiplier used to modify the Hamiltonian.

        :returns:               The energy at the given `mu` value.
        :returns:               The population at the given `mu` value.
        """
        # Modify the Hamiltonian with the given multiplier.
        modified_hamiltonian = self.sq_hamiltonian - multiplier * self.mulliken_operator

        # Use the private method to solve the full CI eigenproblem.
        gs_energy, gs_parameters = self._solveCIEigenproblem(modified_hamiltonian)

        # Use the ground state parameters to calculate the 1DM and use it to calculate the expectation value of the Mulliken operator.
        D = gs_parameters.calculate1DM()
        W = self.mulliken_operator.calculateExpectationValue(D)[0]

        # Calculate the energy by correcting the ground state energy of the modified Hamiltonian.
        energy = gs_energy + (multiplier * W) + self.nuclear_repulsion

        return energy, W

    # Another approach is that we use a line search algorithm to find the multiplier that leads to a certain population value.
    # Note that it can be difficult to find these data points, due to the discontinuous behaviour that we expect.
    def calculateMultiplierAtPopulation(self, population):
        """
        A method that uses an `optimize_function` to calculate the multiplier value belonging to a specified population value.

        :param population:               The population value for which we want to find the multiplier.
        
        :returns:                        The optimized mu value.
        """
        def objective_function(mu):
                
            # The constrained CI solver contains the necessary method to calculate the average population.
            _, W = self.calculateEnergyAndPopulationAtMultiplier(mu)
        
            # We want to optimize the difference between the calculated and the expected population. 
            # The square is used to make the function more well-behaved.
            return (W - population)**2

        optimized_mu = optimize.golden(objective_function)
        return optimized_mu

    # In order to guarantee that a correct multiplier value has been found, we perform a check that tells us whether the population corresponding to the calculated mu value leads to the population that we were trying to find.
    # If this is not the case, a `None` value will be returned. 
    def verifyMultiplierPopulationCombination(self, optimized_mu, expected_population, threshold=1e-5):
        """
        A method used to check whether the multiplier found corresponds to the expected population.

        :param optimized_mu:                      The optimized mu found with an optimization function.
        :param expected_population:               The population value expected at the optimized mu value.
        :param threshold:                         The threshold at which the calculated and expected population are compared.

        :returns:                                 The energy belonging to the multiplier/population combination. If the expected and calculated populations don't match, `None`is returned.
        """
        # We calculate the energy and population at the optimized multiplier value.
        E, W = self.calculateEnergyAndPopulationAtMultiplier(optimized_mu)

        # If the population corresponding to this multiplier value is incorrect, we set the energy value to `None`.
        if not m.isclose(W, expected_population, abs_tol=threshold):
            E = None
            
        return E

Scan over a range of Lagrange multipliers#

The populations can be influenced by applying a constraint on the Hamiltonian. This constraint can be implemented by modifying the Hamiltonian with the Mulliken operator, multiplied by a potential.

Lagrange_multipliers = np.arange(-3, 3.1, 0.1)

We will create a dictionary containing the data. The calculations will loop over the defined multiplier range and calculate a population and an energy at each multiplier.

energy_dictionary = {}
population_dictionary = {}
for i in range(len(distances)):
    setup = MullikenConstrainedCI(molecules[i], "STO-3G")
    energies = []
    populations = []
    for multiplier in Lagrange_multipliers:
        E, W = setup.calculateEnergyAndPopulationAtMultiplier(multiplier)
        energies.append(E)
        populations.append(W)
    energy_dictionary[distances[i]] = energies
    population_dictionary[distances[i]] = populations

We will now visualize the results. First we will show the energy values plotted against the Lagrange multiplier.

fig = go.Figure()
for distance, energies in energy_dictionary.items():
    fig.add_trace(go.Scatter(x=Lagrange_multipliers, 
                             y=energies, 
                             mode='markers',
                             name=distance))
fig.show()

We can also show the populations plotted against the multiplier.

fig = go.Figure()
for distance, populations in population_dictionary.items():
    fig.add_trace(go.Scatter(x=Lagrange_multipliers, 
                             y=populations, 
                             mode='markers',
                             name=distance))
fig.show()

Search for multipliers corresponding to population values#

For $H_2$ the populations must lie between 0 and 2. We will look for the populatipons in steps of 0.1.

populations = np.arange(0, 2.1, 0.1)

The calculation itself looks very similar to the previous one. Except that now we have to check whether the calculated mu value corresponds to the population we were looking for.

energy_dictionary_2 = {}
for i in range(len(distances)):
    setup = MullikenConstrainedCI(molecules[i], "STO-3G")
    energies = []
    for population in populations:
        optimized_mu = setup.calculateMultiplierAtPopulation(population)
        E = setup.verifyMultiplierPopulationCombination(optimized_mu, population)
        energies.append(E)
    energy_dictionary_2[distances[i]] = energies

Finally, we can also visualize the energies as a function of the populations.

fig = go.Figure()
for distance, energies in energy_dictionary_2.items():
    fig.add_trace(go.Scatter(x=populations, 
                             y=energies, 
                             mode='markers',
                             name=distance))
fig.show()

Spin resolved Mulliken constrained CI calculations#

The previous calculations were used to examine the total population as a function of a single Lagrange multiplier. With a few modifications however, we can perform the same study with a separate multiplier for the $\alpha$- and $\beta$ populations respectively.

Spin resolved constrained CI setup#

We will once again define a helper class that sores the required variabled and methods.

class SpinResolvedMullikenConstrainedCI:
    """
    :param molecule:         The molecule used for the calculations.
    :param basis_set:        The basis set used for the calculations.
    """
    def __init__(self, molecule, basis_set):
        
        # Initialize and orthonormalize a spin orbital basis.
        spin_orbital_basis = gqcpy.USpinOrbitalBasis_d(molecule, basis_set)
        spin_orbital_basis.lowdinOrthonormalize()
        self.basis = spin_orbital_basis

        # We can now create a first quantized hamiltonian and use the spin orbital basis to "quantize" it to second quantization.
        # The SQHamiltonian is stored in the class object.
        fq_hamiltonian = gqcpy.FQMolecularHamiltonian(molecule)
        self.sq_hamiltonian = spin_orbital_basis.quantize(fq_hamiltonian)

        # Since we are going to do full CI calculations, we need an ONV-basis.
        # From the total number of orbitals and total number of electrons we can set up an ONV basis.
        K = int(spin_orbital_basis.numberOfSpinors() / 2)
        N_total = molecule.numberOfElectrons()

        if N_total % 2 == 0:
            N_a = int(N_total / 2)
            N_b = int(N_total / 2)
        else:
            N_a = int(np.ceil(N_total / 2))
            N_b = int(np.floor(N_total / 2))

        # The ONV basis gets stored within the class object.
        self.onv_basis = gqcpy.SpinResolvedONVBasis(K, N_a, N_b)

        # Calculate the nuclear repulsion term and store it in the class object.
        self.nuclear_repulsion = gqcpy.NuclearRepulsionOperator(molecule.nuclearFramework()).value()

        # Calculate the Mulliken-partitioned number operator.
        # For this we need the overlap operator and the Mulliken partitioning.
        S = spin_orbital_basis.quantize(gqcpy.OverlapOperator())
        mulliken_domain = spin_orbital_basis.mullikenDomain(lambda shell: shell.nucleus().position()[1] == 0 and shell.nucleus().position()[2] == 0)

        # The overlap operator, Mulliken partitioning and Mulliken operator are all stored within the object.
        self.S = S
        self.mulliken_domain = mulliken_domain
        P = self.mulliken_domain.projectionMatrix(spin_orbital_basis.expansion())
        self.mulliken_operator = S.partitioned(P)
    

    def _solveCIEigenproblem(self, hamiltonian):
        """
        A method used to solve an unrestricted CI eigenproblem.

        :param hamiltonian:      The USQHamiltonian used for the calculation.
        :param onv_basis:        The ONV basis used for the calculation.

        :returns:                The ground state energy.
        :returns:                The ground state parameters
        """
        # Use GQCP to set up a full CI calculation.
        CIsolver = gqcpy.EigenproblemSolver.Dense_d()
        CIenvironment = gqcpy.CIEnvironment.Dense(hamiltonian, self.onv_basis)
        qc_structure = gqcpy.CI(self.onv_basis).optimize(CIsolver, CIenvironment)

        return qc_structure.groundStateEnergy(), qc_structure.groundStateParameters()
        

    def calculateEnergyAndPopulationAtMultiplier(self, multiplier_array):
        """
        A method used to calculate the energy and the population at a given multiplier `mu`.

        :param multiplier_array:      The multiplier used to modify the Hamiltonian. Must be an array/list that contains an alpha and beta value.

        :returns:                     The energy at the given `mu` values.
        :returns:                     The alpha/beta populations at the given `mu` values.
        :returns:                     The total population at the given multiplier combination.
        """
        # Raise an exception if the multiplier is not conform with the requirements.
        if type(multiplier_array) is float or len(multiplier_array) != 2:
             raise ValueError("Multiplier_array must be of dimension 2 as it must contain both an alpha and beta value.")

        # Modify the Hamiltonian with the given multiplier.
        # Do this respectively for the alpha and beta parts.
        u_P = self.mulliken_domain.projectionMatrix(self.basis.expansion())
        u_mulliken_operator = self.S.partitioned(u_P)
        u_mulliken_operator.alpha *= multiplier_array[0]
        u_mulliken_operator.beta *= multiplier_array[1]
        modified_hamiltonian = self.sq_hamiltonian - u_mulliken_operator

        # Use the private method to solve the full CI eigenproblem.
        gs_energy, gs_parameters = self._solveCIEigenproblem(modified_hamiltonian)

        # Use the ground state parameters to calculate the 1DM and use it to calculate the expectation value of the Mulliken operator.
        # The separate alpha and beta populations are calculated as well.
        D = gs_parameters.calculateSpinResolved1DM()

        W_tot = self.mulliken_operator.calculateExpectationValue(D)[0]
        W_a = self.mulliken_operator.alpha.calculateExpectationValue(D.alpha)[0]
        W_b = self.mulliken_operator.beta.calculateExpectationValue(D.beta)[0]

        # Calculate the energy by correcting the ground state energy of the modified Hamiltonian.
        energy = gs_energy + (multiplier_array[0] * W_a) + (multiplier_array[1] * W_b) + self.nuclear_repulsion

        return energy, [W_a, W_b], W_tot


    def calculateMultiplierAtPopulation(self, population_array):
        """
        A method that uses an `optimize_function` to calculate the multiplier value belonging to a specified population value.

        :param population_array:         The population value for which we want to find the multiplier.
        :param options:                  The possible options that can be passed to the optimization function.
        
        :returns:                        The optimized mu values for alpha and beta.
        """
        # Raise an exception if the multiplier is not conform with the requirements.
        if type(population_array) is float or len(population_array) != 2:
             raise ValueError("Population_array must be of dimension 2 as it must contain both an alpha and beta value.")

        def objective_function(mu):
            # Unpack the parameter to be optimized.     
            mua, mub = mu
        
            # Calculate the populations.
            _, spinResolved_W, _ = self.calculateEnergyAndPopulationAtMultiplier([mua, mub])

            # Create a delta N array. Squares are used to make the function more well behaved.  
            delta_N = np.array([(spinResolved_W[0] - population_array[0]) ** 2, (spinResolved_W[1] - population_array[1]) ** 2])
                    
            return delta_N.T @ delta_N
                
        # Use the global optimization algorithm `brute` in order to find the minimim.
        optimized_mu = optimize.brute(objective_function, ranges=((-1, 1), (-1, 1)), Ns=100)
       
        return optimized_mu
        
        
    def verifyMultiplierPopulationCombination(self, optimized_mu, expected_population_array, threshold=1e-5):
        """
        A method used to check whether the multiplier found corresponds to the expected population.

        :param optimized_mu:                            The optimized mu found with an optimization function.
        :param expected_population_array:               The population values expected at the optimized mu values for both alpha and beta.
        :param threshold:                               The threshold at which the calculated and expected population are compared.

        :returns:                                       The energy belonging to the multiplier/population combination. If the expected and calculated populations don't match, `None`is returned.
        """
        # Raise an exception if the multiplier is not conform with the requirements.
        if type(expected_population_array) is float or len(expected_population_array) != 2:
             raise ValueError("Expected_population_array must be of dimension 2 as it must contain both an alpha and beta value.")

        E, spinResolved_W, _ = self.calculateEnergyAndPopulationAtMultiplier(optimized_mu)
    
        if not m.isclose(spinResolved_W[0], expected_population_array[0], abs_tol=threshold) and not m.isclose(spinResolved_W[1], expected_population_array[1], abs_tol=threshold):
            E = None

        return E

Scan over a range of Lagrange multipliers#

We can once again perform the same two types of calculations. However, one should take note that because the calculation now takes two variables, the range becomes a grid, as for each $\alpha$-multiplier, wa can enter all possible $\beta$-multipliers.

multiplier_range = np.arange(-3, 3.1, 0.1)
multiplier_grid = np.meshgrid(multiplier_range, multiplier_range)
output_data_dictionary = {}
for i in range(len(distances)):
    setup = SpinResolvedMullikenConstrainedCI(molecules[i], "STO-3G")
    output_data = pd.DataFrame({'alpha_mus': multiplier_grid[0].flatten(), 'beta_mus': multiplier_grid[1].flatten()})

    output_array = output_data.apply(lambda row: setup.calculateEnergyAndPopulationAtMultiplier([row['alpha_mus'], row['beta_mus']]), axis=1)

    # Save the output array elements in their respective locations in the data frame
    for j in range(len(output_array)):
        output_data.loc[j, 'energies'] = output_array[j][0]
        output_data.loc[j, 'alpha_populations'] = output_array[j][1][0]
        output_data.loc[j, 'beta_populations'] = output_array[j][1][1]
        output_data.loc[j, 'total_populations'] = output_array[j][2]

    output_data_dictionary[distances[i]] = output_data

We can now visualize the energies as a function of both the $\alpha$- and $\beta$-multipliers.

fig = px.scatter_3d(output_data_dictionary[4.5], x='alpha_mus', y='beta_mus', z='energies')
fig.show()

We can do the same for the populations. However, let us use a different distance.

fig = px.scatter_3d(output_data_dictionary[6.5], x='alpha_mus', y='beta_mus', z='total_populations')
fig.show()

Search for multipliers corresponding to population values#

As with the scan over multipliers, when searching for specific multiplier values, we must also create a grid out of the populations. Take into account that while the total population range is 0-2, the $\alpha$- and $\beta$-population ranges are each equal to half of that.

population_range = np.arange(0, 1.1, 0.1)
population_grid = np.meshgrid(population_range, population_range)

Due to the expensiveness of the calculations, we will only calculate the data for one of the three distances.

setup = SpinResolvedMullikenConstrainedCI(molecules[0], "STO-3G")
output_data = pd.DataFrame({'alpha_populations': population_grid[0].flatten(), 'beta_populations': population_grid[1].flatten()})

output_array = output_data.apply(lambda row: setup.calculateMultiplierAtPopulation([row['alpha_populations'], row['beta_populations']]), axis=1)
        
# Save the output array elements in their respective locations in the data frame
for j in range(len(output_array)):
    output_data.loc[j, 'alpha_mus'] = output_array[j][0]
    output_data.loc[j, 'beta_mus'] = output_array[j][1]

output_data['energies'] = output_data.apply(lambda row: setup.verifyMultiplierPopulationCombination([row['alpha_mus'], row['beta_mus']], [row['alpha_populations'], row['beta_populations']]), axis=1)

Now we can also visualize the energy as a function of the spin resolved population. Any points that are missing in the figure have their energy values set to None, due to the optimization procedure not finding the correct multiplier value for the requested population value.

fig = px.scatter_3d(output_data, x='alpha_populations', y='beta_populations', z='energies')
fig.show()