Fig. 1 Influences of crystal lattice contacts on Pol η structures. a. The dominant lattice contact between two hpol η molecules (silver and gold) in the type 1 crystals. b. A close-up view of the hydrophobic interactions between W339 of one molecule and V372 and C406 of the other. These three residues and E335 were mutated as listed on the right to break the lattice contact. All mutant proteins retain polymerase activity (data not shown). c. Comparison of the LF-DNA interaction in the final Nrm structure with that in the type 1 structure and in the structure of yeast Pol η complexed with cisplatin (PDB: 2R8J). The green spheres represent the cations in the active site. The rmsd between the two hpol η structures is 1.4 Å over 335 pairs of Cα atoms, and between human (Nrm) and yeast Pol η is 2.3 Å over 334 pairs of Cα atoms.
Fig. 2 Comparison of the palm domain and active site of hpol η and Pol β. Each palm domain is shown in rainbow colors from blue N- to red C-terminus. Extended loops are trimmed off for clarity. The PDB accession code of Pol β is 2FMS.
Fig. 3 Comparison of Y-family polymerases. a. Superposition of hpol η and homologous polymerases based on the palm domain. The view is similar to that in Fig. 1a. The orange arrowheads point to the finger domain, which is closed in the ternary complexes of hpol η and other Y-family members except for the structures of the yeast Pol η-cisplatin complexes (PDB: 2R8J, 2R8K). The ovals encircle the thumb and LF domains, which are shifted in hpol η compared to Dpo4, Pol ι and Pol κ. Letters b and c indicate where LF of hpol η interacts with the catalytic core. b-c. Detailed diagram of the interfaces between LF and the catalytic core. Palm is shown in pink, finger in light blue, LF in violet and thumb in green.
Fig. 4 Structural models. a. A 6-4PP (PDB: 3EI1, shown as orange/red/blue sticks) is superimposed onto the CPD in TT1 (shown in yellow) and would clash severely with the Cα of S62. The 6-4PP cannot base pair with the incoming dNTP either. b. When modeled at the -3 position (3 bp from the replicating base pair), the CPD (colored in orange-red) loses favorable hydrogen bonds that normal nucleotides (shown as grey sticks with oxygen atoms highlighted in raspberry color) make with the protein (grey carbon, blue nitrogen and red oxygen), as indicated by yellow dashed lines. A CPD at the -3 position would also clash with hpol η, as indicated by the black double arrowheads.
Fig. 5 Sequence alignment of Pol η orthologs. Names of species are Hs, Homo sapiens; Xl, Xenopus laevis; Dm, Drosophila melanogaster; At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Sp, Schizosaccharomyces pombe; Sc, Saccharomyces cerevisiae. Letters in red, green and blue indicate degrees of conservation from high to modest. Red and orange boxes indicate residues forming the active site and interacting with the incoming nucleotide, respectively. The green and blue boxes indicate residues that contact the primer and template strands, respectively. The blue bars indicate residues that strengthen the molecular splint of Pol η. Q38 and R61 are marked by the red triangles. Seven XPV mutations are indicated by the yellow triangles. The eighth, R361, is outside of the aligned region. R81, R84 and W339, which may interact with the downstream DNA duplex, are marked by purple dots.
Fig. 6 CPD-containing DNAs are bent, unstacked, and segmented when complexed with the DNA repair proteins photolyase, T4 endonuclease V and yeast Rad4. Their PDB accession codes are shown in parentheses and references can be found in the main text. The CPD complexed with yeast Rad4 was disordered in the crystal structure.
Fig. 7 Mapping of all missense mutations of hpol η found in XPV and melanoma patients. The protein is represented by the Cα trace (pink palm, light-green thumb, light-blue finger and light-purple LF), DNA as tube-and-ladders, and the altered residues are shown as ball-and-sticks. The 8 mutations predicted to alter the polymerase activity are highlighted in cyan and the remaining 3 in pea-green. V99 forms hydrophobic interactions, where Met can be readily accommodated. I272 is near the outer surface of the thumb domain, where Thr substitution is unlikely to alter the structure or polymerase activity of Pol η. W174 is in the back of the palm domain in the most flexible region of hpol η. The base substitution that leads to the W174C mutation has been suggested to cause XPV by altering splicing and the mRNA level of Pol η (ref. 13). The catalytic carboxylates (pink/red) and two Mg2+ ions (purple) are also shown. Residues labeled in the figure: R361, V99, R111, A117, T122, G263/A264, G295, F290, W174, I272.
Fig. 8 Potential interactions with downstream DNA. a. A front view of the symmetry-related DNA molecules in the P6₁ crystals (Nrm, TT1, TT3 and TT4). hpol η is shown as a blue-grey cartoon with a semi-transparent molecular surface. DNAs are colored in orange (template) and yellow (primer). The 5′ end of the symmetry-related primer strand stacks with the exposed W339 (shown in magenta) on the back of LF. b. The back view of the complex. The symmetry-related DNA in the P2₁2₁2₁ crystal (TT2) is also included and colored in darker shades. It may represent a second location of the downstream duplex or mimic additional DNA hairpin structures at a fragile site. R81 and R84 near the symmetry-related DNA are highlighted in blue.
Fig. 9 Diagram of hpol η-DNA interactions. Nrm (the undamaged DNA complex) is used as an example. Hydrogen bonds are defined as within 3.2 Å and van der Waals contacts as within 4.2 Å. The template base is labeled +1, and upstream from it are -1, -2, etc. W42 is base stacked with dA at +3.
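The distance criteria in the Fig. 9 legend can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name, the `both_polar` flag (standing in for a donor/acceptor check) and the inclusive treatment of "within" are not from the source; only the 3.2 Å and 4.2 Å cutoffs are.

```python
import math

# Cutoffs quoted in the Fig. 9 legend (in Å).
HBOND_CUTOFF = 3.2
VDW_CUTOFF = 4.2


def classify_contact(atom_a, atom_b, both_polar):
    """Classify a protein-DNA atom pair by interatomic distance.

    atom_a, atom_b: (x, y, z) coordinates in Å.
    both_polar: hypothetical flag, True when the pair is a plausible
        hydrogen-bond donor/acceptor pair (e.g. N or O atoms). The legend
        states only the distance cutoffs, so this check is an assumption.
    """
    d = math.dist(atom_a, atom_b)
    if both_polar and d <= HBOND_CUTOFF:
        return "hydrogen bond"
    if d <= VDW_CUTOFF:
        return "van der Waals"
    return "none"
```

For example, two polar atoms 3.0 Å apart would be drawn as a hydrogen bond, while the same distance between non-polar atoms would count only as a van der Waals contact.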
Supplementary Table 1 Crystals of hpol η ternary complexes

| Structure name | DNA sequence (template 3′→5′ / primer 5′→3′) | Incoming dNTP |
|---|---|---|
| Nrm (Type 3) | 3′ TCGCAGTATTACT 5′ / 5′ TAGCGTCAT 3′ | dAMPNPP |
| TT1 (Type 3) | 3′ TCGCAGTATTAC 5′ / 5′ TAGCGTCAT 3′ | dAMPNPP |
| TT2 (Type 2) | 3′ CGCAGTATTCAAT 5′ / 5′ TGCGTCATA 3′ | dAMPNPP |
| TT3 (Type 3) | 3′ CGCAGTATTCAAT 5′ / 5′ ACGTCATAA 3′ | dGMPNPP |
| TT4 (Type 3) | 3′ CGAGCATTACTAC 5′ / 5′ TCTCGTAAT 3′ | dGMPNPP |
| Native (Type 2) | 3′ ATCGCAGTATTA 5′ / 5′ TAGCGTCAT 3′ | dAMPNPP |
| SeMet (Type 2) | 3′ ATCGCAGTATTA 5′ / 5′ TAGCGTCAT 3′ | dAMPNPP |
| Type 1 | 3′ CCCCCTTCCTAAGTTTCT 5′ / 5′ GGGGGAAGGATTC 3′ | dATP |
| Type 1 | 3′ GCACGGATCGCATGTATG 5′ / 5′ GTGCCTAGCGTA 3′ | dCTP |
| Type 1 | 3′ CACGCACGGATCGCATGTATG 5′ / 5′ GTGCGTGCCTAGCGTA 3′ | dCTP |

TT denotes cis-syn thymine dimer. The Protein column lists C406M for three of the structures.
Supplementary Table 2 Data collection, phasing and refinement statistics

Data collection

| | Nrm | TT1 | TT2 | TT3 | TT4 | SeMet (peak / inflection / remote) | Native | Type 1 |
|---|---|---|---|---|---|---|---|---|
| Space group | P6₁ | P6₁ | P2₁2₁2₁ | P6₁ | P6₁ | P2₁2₁2₁ | P2₁2₁2₁ | P6₁ |
| Cell dimensions a, b, c (Å) | 98.46, 98.46, 82.50 | 98.24, 98.24, 82.31 | 61.24, 80.26, 139.76 | 99.13, 99.13, 81.70 | 98.04, 98.04, 82.05 | 63.26, 81.45, 139.39 | 63.51, 80.16, 139.26 | 137.06, 137.06, 65.59 |
| Wavelength (Å) | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9794 / 0.9796 / 0.9494 | 1.0000 | 1.0000 |
| Resolution (Å) | 45.0-1.83 | 40.0-1.75 | 30.0-2.15 | 30.0-1.80 | 30.0-1.90 | 60-3.50 / 60-3.50 / 60-3.50 | 40-2.64 | 30-2.90 |
| Rsym (%)* | 6.3 (51.7) | 8.0 (59.6) | 8.6 (59.3) | 6.6 (57.4) | 7.8 (53.4) | 12.1 (23.8) / 8.2 (31.1) / 9.1 (29.8) | 4.5 (62.7) | 9.6 (67.4) |
| I/σI* | 17.9 (3.2) | 13.9 (3.5) | 12.4 (2.9) | 15.1 (2.6) | 14.5 (3.3) | 10.5 (5.1) / 11.8 (4.1) / 10.9 (4.5) | 18.3 (2.0) | 23.6 (2.0) |
| Completeness (%)* | 98.9 (90.6) | 99.5 (99.6) | 99.0 (97.1) | 99.3 (99.6) | 98.9 (98.0) | 97.2 (95.1) / 93.9 (95.8) / 95.5 (95.9) | 92.1 (93.8) | 99.0 (92.0) |
| Redundancy* | 4.4 (4.0) | 4.0 (4.0) | 8.4 (6.8) | 4.1 (4.1) | 3.3 (3.2) | 3.4 (2.0) / 2.0 (2.0) / 2.0 (2.0) | 2.4 (2.4) | 7.2 (5.7) |

Refinement

| | Nrm | TT1 | TT2 | TT3 | TT4 |
|---|---|---|---|---|---|
| Resolution (Å) | 45.0-1.83 | 40.0-1.75 | 30.0-2.15 | 30.0-1.80 | 30.0-1.90 |
| No. reflections | 39732 | 45282 | 37705 | 42013 | 34965 |
| Rwork / Rfree | 17.4 / 19.7 | 18.9 / 20.6 | 24.2 / 26.2 | 19.1 / 20.7 | 19.6 / 21.8 |
| No. atoms: Protein / DNA | 3355 / 391 | 3334 / 406 | 3310 / 402 | 3347 / 395 | 3340 / 390 |
| No. atoms: dNMPNPP / Mg2+ | 30 / 2 | 30 / 2 | 30 / 2 | 31 / 2 | 31 / 2 |
| No. atoms: Water / Solutes | 402 / 47 | 431 / 47 | 156 / 84 | 367 / 78 | 357 / 54 |
| B-factors: Protein / DNA | 22.2 / 25.8 | 19.7 / 23.6 | 63.9 / 78.9 | 24.8 / 32.8 | 24.3 / 29.5 |
| B-factors: dNMPNPP / Mg2+ | 14.3 / 13.2 | 10.2 / 9.6 | 51.1 / 38.5 | 13.9 / 14.0 | 14.7 / 9.8 |
| B-factors: Water / Solutes | 38.0 / 42.2 | 34.7 / 40.0 | 70.3 / 96.5 | 39.8 / 51.5 | 34.3 / 45.9 |
| R.m.s. deviations: Bond lengths (Å) | 0.018 | 0.011 | 0.013 | 0.009 | 0.009 |
| R.m.s. deviations: Bond angles (°) | 1.6 | 1.7 | 1.8 | 1.1 | 1.4 |

*Values for the highest-resolution shell are shown in parentheses.
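Supplementary Table 3 Kinetic parameters of dNTP incorporation by Pol η (1-511) and its Q38A and R61A variants on normal and CPD templates

| Enzyme | Template : dNTP | KM (μM) | kcat (min⁻¹) | kcat/KM (μM⁻¹ min⁻¹) |
|---|---|---|---|---|
| Pol η (1-511) | ND (3′ T) : dATP | 4.1 ± 1 | 200 ± 10 | 49 |
| Pol η (1-511) | ND (3′ T) : dGTP | 17 ± 1 | 50 ± 10 | 3.0 |
| Pol η (1-511) | CPD (3′ T) : dATP | 4.1 ± 0.7 | 90 ± 5 | 20 |
| Pol η (1-511) | CPD (3′ T) : dGTP | 110 ± 40 | 60 ± 3 | 0.55 |
| Pol η (1-511) Q38A | ND (3′ T) : dATP | 8.8 ± 0 | 210 ± 20 | 24 |
| Pol η (1-511) Q38A | ND (3′ T) : dGTP | 80 ± 0 | 90 ± 7 | 1.1 |
| Pol η (1-511) Q38A | CPD (3′ T) : dATP | 15 ± 1 | 160 ± 20 | 11 |
| Pol η (1-511) Q38A | CPD (3′ T) : dGTP | 330 ± 30 | 40 ± 2 | 0.12 |
| Pol η (1-511) R61A | ND (3′ T) : dATP | 6.5 ± 0.4 | 150 ± 6 | 22 |
| Pol η (1-511) R61A | ND (3′ T) : dGTP | 170 ± 50 | 80 ± 16 | 0.47 |
| Pol η (1-511) R61A | CPD (3′ T) : dATP | 13 ± 2 | 100 ± 3 | 7.5 |
| Pol η (1-511) R61A | CPD (3′ T) : dGTP | 450 ± 40 | 20 ± 1 | 0.044 |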
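The last column of Supplementary Table 3 is the catalytic efficiency, kcat divided by KM. A minimal sketch of that arithmetic for the Pol η (1-511) rows follows; the function and variable names are illustrative, and the dATP-over-dGTP ratio is an added convenience, not a quantity reported in the table. Values recomputed from the rounded KM and kcat entries may differ from the tabulated efficiencies in the last digit.

```python
def catalytic_efficiency(k_cat, k_m):
    """Return k_cat / K_M in uM^-1 min^-1 (k_cat in min^-1, K_M in uM)."""
    if k_m <= 0:
        raise ValueError("K_M must be positive")
    return k_cat / k_m


# (K_M in uM, k_cat in min^-1) copied from the Pol eta (1-511) rows of
# Supplementary Table 3; error estimates are omitted in this sketch.
pol_eta = {
    ("ND", "dATP"): (4.1, 200),
    ("ND", "dGTP"): (17, 50),
    ("CPD", "dATP"): (4.1, 90),
    ("CPD", "dGTP"): (110, 60),
}

efficiency = {
    key: catalytic_efficiency(k_cat, k_m)
    for key, (k_m, k_cat) in pol_eta.items()
}


def datp_over_dgtp(template):
    """Ratio of dATP to dGTP efficiency on one template (illustrative only)."""
    return efficiency[(template, "dATP")] / efficiency[(template, "dGTP")]


for template in ("ND", "CPD"):
    print(template, round(datp_over_dgtp(template), 1))
```

A ratio above 1 means dATP is incorporated more efficiently than dGTP opposite the 3′ T of that template.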