Research Interests

Protein-nucleic acid interactions play many important cellular roles, including regulating DNA replication, controlling gene expression, and DNA housekeeping, such as maintaining a supercoiled state, repairing damage and mismatches, allowing recombination, restricting foreign DNA, and ligating, methylating and degrading DNA.  Our research goal is to provide a better understanding of sequence-dependent and sequence-independent recognition between proteins and DNA/RNA at the atomic level.  We would like to find out why some proteins recognize only specific sequences whereas others recognize DNA without sequence preference.  We work on several endonucleases that cleave DNA either site-specifically or randomly.  Using X-ray diffraction methods, we determine the three-dimensional structures of these proteins and protein/DNA complexes to reveal the structural basis of their selectivity.  Biochemical and mutational approaches are also used to characterize the enzymatic function of these endonucleases.  Two main projects are briefly described below.


1.   The H-N-H family of endonucleases

 The H-N-H motif was first recognized through sequence similarity among several intron-encoded homing endonucleases and bacterial toxins.  The number of proteins found to contain the H-N-H motif has increased substantially in the past several years; more than one hundred such proteins are listed in databases (see the Pfam Protein Families database).  The largest subgroups in the H-N-H family are the group I and group II homing endonucleases, which initiate the transfer of a mobile intervening sequence into a homologous allele that lacks the sequence.  These homing endonucleases recognize and cleave DNA site-specifically at target alleles.  The second largest subgroup in the H-N-H family consists of bacterial toxins, such as the Escherichia coli colicins (ColE7 and ColE9) and the Pseudomonas aeruginosa pyocins (S1 and S2).  All of these toxins share a highly homologous C-terminal nuclease domain capable of hydrolyzing DNA nonspecifically in target cells.


HNH consensus                          E HH  P    GG     NL       H   H
Bacteriocins	Colicin E7	SGKRTSFELHHEKPISQNGGVYDMDNISVVTPKRHIDIHRGK-576
		Colicin E2	VGGRERFELHHDKPISQDGGVYDMNNIRVTTPKRHIDIHRGK-581
		Colicin E8	VGGRRSFELHHDKPISQDGGVYDMDNLRITTPKRHIDIHRGQ
		Colicin E9	VGGRVKYELHHDKPISQGGEVYDMDNIRVTTPKRHIDIHRGK-582
		Pyocin S1	AGGRIKIEIHHKVRVADGGGVYNMGNLVAVTPKRHIEIHKGG-617
		Pyocin S2	AGGRIKIEIHHKVRIADGGGVYNMGNLVAVTPKRHIEIHKGG-770
Group I homing	I-HmuI		EGYEEGLVVDHKD...GNKDNNLSTNLRWVTQKINVENQMSR-77
endonuclease	I-HmuII 	GGYEESLVVDHID...RNRHNNHFSNLRWVSRKENSSNISAD-104
		I-HmuIII	EGYGEDLVVDHID...QDRDNNHCSNLRWVSRKENSNNISAD-103
		yosQ		YDIPKGMFVNHID...GNKLNNHVRNLEIVTPKENTLHAMKI-102
		I-TevIII	DSDGRTDEIHHKD...GNRENNDLDNLMCLSIQEHYDIHLAQ-55
		ORF253 		VT-DKNKYIDHIN...GNPLDNRRNNLRVVSHQENMMNKKTY-192
Group II homing	Avi		CPMC9 WNIHHIIKRHMGGGDEL.DNLVLLHPNCHRQLH
endonuclease	Cpc1    	CSHC9 IEIDHIIP.KSQGGKDVYDNLQALHRHCHDVKTATD-568
		Cpc2		CSEC9 MEVHHIDQNR..GNNKL.SNLTLVHRHCHDIIH
		PetD		INSIP.YELHHILP.KRFGGKDTPNNMVLLCKSPCHQLVSSSI-574
Restriction or	McrA  		CENC14LEVHHVIP.LSSGGADTTDNCVALCPNCHRELHYSK-259
Repair enzyme	S.g. mtMSH	ICGAPADAVHHIKP6LCNRKLNRRSNLVPVCSSCHLDIHRNK-953

Sequence alignment of several H-N-H family proteins in the H-N-H motif region.  The dots (.) represent gaps and the numbers within the sequences represent insertions of amino acids.  The first row shows the consensus sequence of the H-N-H motif, with the most conserved Asn and His residues displayed in bold and underlined.  The H-N-H proteins are classified into four subgroups: (1) bacterial toxins, including colicins and pyocins; (2) group I homing endonucleases, including I-HmuI, I-HmuII, I-HmuIII, yosQ, I-TevIII and ORF253; (3) group II homing endonucleases, including Avi, Cpc and PetD; (4) restriction or repair enzymes, including McrA and S. g. mtMSH.
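As a rough illustration of how similar the toxin nuclease domains are in this region, the following minimal sketch counts identical positions between two ungapped rows of the alignment.  The sequences are copied from the bacteriocin rows above; the function names (identical_positions, percent_identity) are hypothetical helpers written for this example and are not part of any published analysis.  The sketch assumes both rows have equal length and contain no gap or insertion markers, which holds for the colicin and pyocin rows but not for the homing endonuclease rows.

```python
# Motif-region sequences copied from the alignment above (bacteriocin rows only).
COLICIN_E7 = "SGKRTSFELHHEKPISQNGGVYDMDNISVVTPKRHIDIHRGK"
COLICIN_E9 = "VGGRVKYELHHDKPISQGGEVYDMDNIRVTTPKRHIDIHRGK"
PYOCIN_S1 = "AGGRIKIEIHHKVRVADGGGVYNMGNLVAVTPKRHIEIHKGG"
PYOCIN_S2 = "AGGRIKIEIHHKVRIADGGGVYNMGNLVAVTPKRHIEIHKGG"


def identical_positions(seq_a: str, seq_b: str) -> int:
    """Count aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of identical aligned positions."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    return 100.0 * identical_positions(seq_a, seq_b) / len(seq_a)


if __name__ == "__main__":
    pairs = [
        ("Colicin E7", COLICIN_E7, "Colicin E9", COLICIN_E9),
        ("Pyocin S1", PYOCIN_S1, "Pyocin S2", PYOCIN_S2),
        ("Colicin E7", COLICIN_E7, "Pyocin S1", PYOCIN_S1),
    ]
    for name_a, seq_a, name_b, seq_b in pairs:
        print(f"{name_a} vs {name_b}: {percent_identity(seq_a, seq_b):.1f}% identical")
```

Extending the comparison to rows that contain dots or insertion counts would first require expanding or masking those positions, which this sketch deliberately does not attempt.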


Our group reported the crystal structure of the endonuclease domain of colicin E7 (nuclease-ColE7) in complex with its inhibitor Im7, the first structure to reveal the fold of an H-N-H motif.  The topology of the H-N-H motif and the coordination of its metal ion share several features with classic zinc finger motifs, which contain an antiparallel β-sheet linked to a C-terminal α-helix with a centrally located zinc ion.

The crystal structure model of the H-N-H motif in the DNase domain of ColE7.  This motif has a topology similar to that of the classical zinc finger motif, with two antiparallel β-strands linked to an α-helix by a Zn2+ ion.  The Zn2+ ion is bound to three histidine residues, His544, His569, and His573, and a phosphate molecule in a distorted tetrahedral geometry.


A general "ββα-metal" fold, similar to that of the H-N-H motif, has previously been identified in the active sites of several nucleases, including the His-Cys box homing endonuclease I-PpoI, Serratia nuclease, and phage T4 endonuclease VII.  ColE7 shares no sequence identity with these nucleases; however, the similar active-site structures suggest that the H-N-H endonucleases likely bind and hydrolyze DNA in a manner comparable to those nucleases.

Structural comparison of dimeric nuclease-ColE7 with other dimeric nucleases.  Ribbon models for the crystal structures of (a) I-PpoI in complex with DNA; (b) Serratia nuclease; (c) phage T4 endonuclease VII; and (d) nuclease-ColE7.  The common ββα-fold of the active sites is displayed in red.  The H-N-H motifs in nuclease-ColE7 are arranged similarly to the ββα-folds of the active sites in I-PpoI.

To date, neither the role of the zinc ion nor the mode of DNA binding is known for DNA hydrolysis mediated by the H-N-H family proteins.  We would like to find out why the H-N-H homing endonucleases cleave DNA at specific sites whereas the H-N-H toxins cleave DNA randomly.  Our ultimate goal is to use this system to reveal the structural basis of protein/DNA recognition in general, and of the DNA cleavage process carried out by endonucleases containing a ββα-metal fold active site in particular.


2.   Other endonucleases containing a "ββα-metal" fold active site

We are also working on several endonucleases that do not contain an H-N-H motif in their sequences but do contain a similar "ββα-metal" fold active site.  One example is the Vibrio vulnificus nuclease (Vvn), a non-specific endonuclease capable of digesting both DNA and RNA.  This protein is located in the periplasm and appears to be involved in protecting the cell by preventing the uptake of foreign DNA during transformation.  The 2.3 Å crystal structure of the magnesium ion-bound Vvn, determined in our laboratory by the multiwavelength anomalous dispersion (MAD) method, showed that Vvn has a novel mixed α/β topology containing four disulfide bridges.  The overall structure of Vvn shows no similarity to other endonucleases; however, a known endonuclease motif containing a "ββα-metal" fold is identified in the central cleft region.

Ribbon diagram of the Vvn structure. Vvn has a novel V-shaped mixed α/β fold. The four disulfide bridges are displayed in a ball-and-stick model. The ββα-metal motif in Vvn is colored in red with a magnesium ion located in the center.

The crystal structure of the Vvn mutant H80A in complex with duplex DNA further demonstrates that Vvn binds mainly at the minor groove of DNA, bending the duplex towards the major groove by about 20°. Only the DNA phosphate backbones make hydrogen bonds with Vvn, suggesting a structural basis for its sequence-independent recognition of DNA and RNA. Based on the crystal structure of the Vvn/DNA complex, a catalytic mechanism is proposed. We suggest that Vvn hydrolyzes DNA by a general single-metal ion mechanism, and we show how non-specific DNA-binding proteins may recognize DNA.

Overall crystal structure of two Vvn molecules bound to two DNA octamers in one asymmetric unit. The general "ββα-metal" fold active site is in red.