We have determined the DNA sequence of the long unique region (UL) in the genome of herpes simplex virus type 1 (HSV-1) strain 17. The UL sequence contained 107943 residues and had a base composition of 66.9% G+C. Together with our previous work, this completes the sequence of HSV-1 DNA, giving a total genome length of 152260 residues of base composition 68.3% G+C. Genes in the UL region were located by the use of published mapping analyses, transcript structures and sequence data, and by examination of DNA sequence characteristics. Fifty-six genes were identified, accounting for most of the sequence. Some 28 of these are at present of unknown function. The gene layout for UL was found to be very similar to that for the corresponding part of the genome of varicella-zoster virus, the only other completely sequenced alphaherpesvirus, and the amino acid sequences of equivalent proteins showed a range of similarities. In the whole genome of HSV-1 we now recognize 72 genes which encode 70 distinct proteins.


