**Objective**

Get a flavour of quantum mechanical calculations. Crystallize the differences between semi empirical, Hartree Fock and DFT calculations. Test QM/MM calculations using ONIOM. Perform minimizations with different approaches and consequently compare them together. Discuss chemical mechanisms in a biological system.

**Introduction**

In order to taste computational chemistry tools, we will work on a few aspects of the action mechanism of the chromophore bound to an antitumoral protein: the Neocarzinostatin(Kim et al.). Prior to the study have a look at the work of Myers et al. in J. Am. Chem. Soc. 1996 and get familiar with the system and the questions mentioned by this paper.

**Tutorial**

**1. Pre-analysis of NCS with its chromophore**

Open the pdb files of the holo-NCS in complex with its natural chromophore in Chimera (pdb 1nco). Look at the general form of the protein, the chromophore and the complex formed between them.

Discuss briefly and elaborate molecular hypothesis with respect of Myers' paper (possible sites of interaction of the thiol, participation of water molecules, discuss possible motions of the enzyme and its binding site, etc...)

**2. Initial and benchmarking calculations on the chromophore.**

In this part of the work, we will get familiar with simple quantum mechanical calculations that we will apply on the same chromophore.

**Optimizing structure with QM codes**

We will look for the minimum of the chromophore outside the protein using different approaches, namely: AM1, PM6 (semi-empirical both), HF and B3LYP. The two latest are quite slower and you might consider doing the calculations with these two over night or during the time you have before providing with the report of this section. You can also share your calculations with you colleagues to go faster. Nonetheless, please, inform the professor with whom you plan to share information with.

**1. Setting up the calculations:**

In Chimera, separate the chromophore from the protein and save it in a mol2 file.

Open Gaussview.

Open your chromophore structure.

Check the charge and the multiplicity of your system (should be 0 and 1).

Go to menu Calculate and select Gaussian Calculation setup.

In job type, ask for optimization

You should try the methods: AM1 and PM6, Hartree Fock and DFT B3LYP

Try calculation with the common basis set 6-31g and sto-3g

To take advantage of the multiprocessing in Link0 put Shared Processors-Specify...-4.

Also use a checkpoint file that will be useful for posterior calculations

Make a clear plan of your calculations. You have in total 8 calculations to do; some will be substantially long. Take into account that the following guidelines are indicative. You are free (and encouraged) to comment different aspects and detail whatever you believe is important from a molecular point of view.

**2.Analyzing the results:**

Structure: Compare several geometrical features of the optimized structures like the planarity of the different cyclic moities, the distances between atoms, the planarity of amine nitrogen, etc. Compare also with the initial structure of the system in the protein. What happened??? Which type of calculation do you think combine the best structural quality/time of calculation?

Energy: Compare the energies between each tye of calculations (look for the line SCF done). Do the differences in energy between methods make sense? Compare now between calculations performed with different basis set but using the same approach. What can you say based on the theoretical part of the course?

Overall run: Discuss the number of cycles necessary for the calculation to reach the minimum. Try to make a single point calculation of the energy with a high level method on the structure minimized with a low level one and compare the energies. What can you say on the difference between structure and energy?

From this part of the work, make a discussion on which methods should be used / are the most reliable. What would be the approach with the best accuracy/time ressources ratio?

**3.Post analysis on quantum mechanical runs**

We will calculate the vibrational modes of the system. These informatioin provide with interesting information regarding the natural motion of the molecular systems. They can also be compared to a certain extend with experimental data like spectroscopic.

Take the optimized structure of the DFT calculated structure with the 6-31g functional. Copy the input file and past the correct optimized structure in this file. Remove the opt option and add the freq keyword. Run the calculation. This will take time. continue on the following point of the tutorial while the calculation is running.

Analysis. Open gaussview. Go to vibration. Study the different vector of vibration. Could you find some clear motives?

**Setting up QM/MM calculations**

Since the chromophore on its own is substantially large (81 atoms), one would be pleased to make a partition of the chromophore even outside the protein. Your first task consists in defining a correct QM/MM partition. Once decided, Prepare the calculation following these lines:

a. Open the structure of your optimized chromophore

b. You will first have to define the different layers of the system. Go to Edit --> Select Layers

In the new pop up window, first do a select all and put set layer = low (all the atoms will pass in the layer that will be calculated with a low level approach i.e. MM)

Then apply (all the atoms should now be in lines)

Now we have to select all the atoms of your partition that will be dealt with a QM approach (the rest will remain in a low cost approach)

Use the left hand botton of your mouse to select atoms. You can use jointly with shift to select several at the same time.

Then prepare the calculation like in the previous part of the work but this time in method check the multilayer ONIOM Model. For each region select the methodology you want to use.

**Some insights on reactivity**

Comparing the entire pathways for the activation of the chromophore of the NCS through a full size QM/MM modeling stands beyond the limit of the present tutorial. We will just focus on several parts of an hypothetical study were mechanism 2 and 3 of Myers' paper are compared. We will focus on a limited structures along the reaction steps including products and reactants. This is mainly the calculation of the difference of energy between reactants and products. On these bases we will try to comment on whether 2 or 3 are the most likely mechanisms (if this actually fits into the model we are using...).

The calculations need therefore the thiol, the water, the chromophore and the final products. You will have to build these structures. To do so, you'll have to manipulate a bit the structure of your compoundsusing a molecular editor. Gaussview has one but you might prefer the Build structure interface in Chimera. In this case, go to Tools, Structure Editing, Build Structure.

Using this tool, generate the two possible products of the action of small thiols with the holo-NCS (structure 2 and 3 of Myers' paper).

In this interface, use modify structure and adjust bond essentially. To complete or expand valences, use addh and modify the hydrogen atoms to the atoms you are interested in. To remove valences use the delete selection "del sel" command in the command line.

To facilitate your life, you'll have to minimize the structure using the AMBER force Field (GAFF). You'll have to make the correct atom type assignation. Look at the time it takes. Why are those calculations so fast? Would you definitively reckon MM calculations to study the reactivity for the holo-NCS with thiols?

Prepare product 2 and 3, the thiol and the water. Calculate these structures with the different basis set and the metho PM6, B3LYP (overnight), PM6:UFF and B3LYP:UFF. Which calculation are you thinking is the most accurate? What are those that are the closest?

Calculate the energy of reaction for pathways 2 and 3.

Complete the study so that you could discuss all the aspects of the present tutorial and write a short report on your activity.

Some ideas to write down your report:

- Look at the impact of the basis set in term of geometric quality, computational time (ressources) and energy. You can look at the number of basis functions generated in each calculation and compare with what was mentioned in the theoretical course.
- Look at the absolute energy. Are they different between methods? Why?
- Structural differences between the optimized geometries

Script to read gaussian outputs in chimera: here

File to fill to prepare and write down the report: here