Q07791 · YD23B_YEAST
- Protein: Transposon Ty2-DR3 Gag-Pol polyprotein
- Gene: TY2B-DR3
- Status: UniProtKB reviewed (Swiss-Prot)
- Amino acids: 1770
- Protein existence: Inferred from homology
- Annotation score: 5/5
Function
Capsid protein (CA) is the structural component of the virus-like particle (VLP), forming the shell that encapsulates the retrotransposon's dimeric RNA genome. The particles are assembled from trimer-clustered units, and holes in the capsid shells allow macromolecules to diffuse through. CA also has nucleocapsid-like chaperone activity, promoting primer tRNA(i)-Met annealing to the multipartite primer-binding site (PBS), dimerization of Ty2 RNA and initiation of reverse transcription (By similarity).
The aspartyl protease (PR) mediates the proteolytic cleavages of the Gag and Gag-Pol polyproteins after assembly of the VLP.
Reverse transcriptase/ribonuclease H (RT) is a multifunctional enzyme that catalyzes the conversion of the retro-element's RNA genome into dsDNA within the VLP. The enzyme displays a DNA polymerase activity that can copy either DNA or RNA templates, and a ribonuclease H (RNase H) activity that cleaves the RNA strand of RNA-DNA heteroduplexes during plus-strand synthesis and hydrolyzes RNA primers. The conversion leads to a linear dsDNA copy of the retrotransposon that includes long terminal repeats (LTRs) at both ends (By similarity).
Integrase (IN) targets the VLP to the nucleus, where a subparticle preintegration complex (PIC) containing at least integrase and the newly synthesized dsDNA copy of the retrotransposon must transit the nuclear membrane. Once in the nucleus, integrase performs the integration of the dsDNA into the host genome (By similarity).
Miscellaneous
Retrotransposons are mobile genetic entities that replicate via an RNA intermediate and a reverse transcription step. In contrast to retroviruses, retrotransposons are non-infectious, lack an envelope and remain intracellular. Ty2 retrotransposons belong to the copia elements (Pseudoviridae).
Catalytic activity
- a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) = diphosphate + DNA(n+1)
- a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) = diphosphate + DNA(n+1)
Features
Showing features for site, active site, binding site.
Type | Position(s) | Description | Sequence
---|---|---|---
Site | 397-398 | Cleavage; by Ty2 protease | HN
Active site | 457 | For protease activity; shared with dimeric partner | D
Site | 578-579 | Cleavage; by Ty2 protease | NN
Binding site | 667 | Mg2+ 1; catalytic; for integrase activity | D
Binding site | 732 | Mg2+ 1; catalytic; for integrase activity | D
Site | 1232-1233 | Cleavage; by Ty2 protease | AA
Binding site | 1361 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1442 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1443 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1625 | Mg2+ 3; catalytic; for RNase H activity | D
Binding site | 1667 | Mg2+ 3; catalytic; for RNase H activity | E
Binding site | 1700 | Mg2+ 3; catalytic; for RNase H activity | D
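The three protease cleavage sites listed above determine where the 1770-residue polyprotein is cut. The following Python sketch derives the resulting fragment boundaries from those sites. It assumes each site names the two residues flanking the cleaved bond and that every site is cut exactly once; the function name `chains_from_cleavage_sites` is illustrative and not part of any UniProt tooling.

```python
# Cleavage sites copied from the feature table: each pair names the residues
# on either side of the cleaved bond (1-based positions in the polyprotein).
CLEAVAGE_SITES = [(397, 398), (578, 579), (1232, 1233)]
POLYPROTEIN_LENGTH = 1770


def chains_from_cleavage_sites(sites, length):
    """Return 1-based inclusive (start, end) ranges of the cleavage products.

    Assumes every listed site is cut exactly once (a simplification).
    """
    chains = []
    start = 1
    for left, right in sorted(sites):
        if right != left + 1:
            raise ValueError(f"site {left}-{right} does not flank a single bond")
        chains.append((start, left))  # fragment ends at the residue before the cut
        start = right                 # next fragment begins right after the cut
    chains.append((start, length))    # final fragment runs to the C-terminus
    return chains


chain_ranges = chains_from_cleavage_sites(CLEAVAGE_SITES, POLYPROTEIN_LENGTH)
for start, end in chain_ranges:
    print(f"{start}-{end} ({end - start + 1} aa)")
```

Under these assumptions the ranges come out as 1-397, 398-578, 579-1232 and 1233-1770, which can be compared with the chain coordinates annotated in the PTM/Processing section.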
GO annotations
Aspect | Term
---|---
Cellular Component | cytoplasm
Cellular Component | nucleus
Molecular Function | aspartic-type endopeptidase activity
Molecular Function | ATP binding
Molecular Function | DNA binding
Molecular Function | DNA-directed DNA polymerase activity
Molecular Function | metal ion binding
Molecular Function | RNA binding
Molecular Function | RNA-directed DNA polymerase activity
Molecular Function | RNA-DNA hybrid ribonuclease activity
Biological Process | DNA integration
Biological Process | DNA recombination
Biological Process | proteolysis
Biological Process | transposition
Biological Process | viral translational frameshifting
Keywords
- Molecular function
- Biological process
- Ligand
Protein family/group databases
Names & Taxonomy
Protein names
- Recommended name: Transposon Ty2-DR3 Gag-Pol polyprotein
- Alternative names
- Cleaved into 4 chains
Gene names
Organism names
- Strain
- Taxonomic lineage: Eukaryota > Fungi > Dikarya > Ascomycota > Saccharomycotina > Saccharomycetes > Saccharomycetales > Saccharomycetaceae > Saccharomyces
Accessions
- Primary accession: Q07791
- Secondary accessions
Proteomes
Organism-specific databases
PTM/Processing
Features
Showing features for chain.
Type | ID | Position(s) | Description | |||
---|---|---|---|---|---|---|
Chain | PRO_0000279301 | 1-397 | Capsid protein | |||
Sequence: MESQQLHQNPHSLHGSAAASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENHHHVSPQPASVPPPQNGQYQQHGMMTPNKAMASNWAHYQQPSMMTCSHYQTSPAYYQPDPHYPLPQYIPPLSTSSPDPIDSQNQHSEVPQAKTKVRNNVLPPHTLTSEENFSTWVKFYIRFLKNSNLGDIIPNDQGEIKRQMTYEEHAYIYNTFQAFAPFHLLPTWVKQILEINYADILTVLCKSVSKMQTNNQELKDWIALANLEYDGSTSADTFEITVSTIIQRLKENNINVSDRLACQLILKGLSGDFKYLRNQYRTKTNMKLSQLFAEIQLIYDENKIMNLNKPSQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAH | ||||||
Chain | PRO_0000279300 | 1-1770 | Transposon Ty2-DR3 Gag-Pol polyprotein | |||
Sequence: MESQQLHQNPHSLHGSAAASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENHHHVSPQPASVPPPQNGQYQQHGMMTPNKAMASNWAHYQQPSMMTCSHYQTSPAYYQPDPHYPLPQYIPPLSTSSPDPIDSQNQHSEVPQAKTKVRNNVLPPHTLTSEENFSTWVKFYIRFLKNSNLGDIIPNDQGEIKRQMTYEEHAYIYNTFQAFAPFHLLPTWVKQILEINYADILTVLCKSVSKMQTNNQELKDWIALANLEYDGSTSADTFEITVSTIIQRLKENNINVSDRLACQLILKGLSGDFKYLRNQYRTKTNMKLSQLFAEIQLIYDENKIMNLNKPSQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAHNIATSSKFSRVNNDHINESTVSSQYLSDDNELSLGQQQKESKPTRTIDSNDELPDHLLIDSGASQTLVRSAHYLHHATPNSEINIVDAQKQDIPINAIGNLHFNFQNGTKTSIKALHTPNIAYDLLSLSELANQNITACFTRNTLERSDGTVLAPIVKHGDFYWLSKKYLIPSHISKLTINNVNKSKSVNKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEWSNASTYQCPDCLIGKSTKHRHIKGSRLKYQESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQWVYPLHDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTLLNDCRTLLHCSGLPNHLWFSAVEFSTIIRNSLVSPKNDKSARQHAGLAGLDITTILPFGQPVIVNNHNPDSKIHPRGIPGYALHPSRNSYGYIIYLPSLKKTVDTTNYVILQDNQSKLDQFNYDTLTFDDDLNRLTAHNQSFIEQNETEQSYDQNTESDHDYQSEIEINSDPLVNDFSSQSLNPLQLDKEPVQKVRAPKEVDADISEYNILPSTIRSRTPHIINKESTEMGGTIESDTTSPRHSSTFTARNQKRPGSPNDMIDLTSQDRVNYGLENIKTTRLGGTEEPYIQRNSDTNIKYRTTNSTPSIDDRSSNSESTTPIISIETKAVCDNTPSIDTDPPEYRSSDHATPNIMPDKSSKNVTADSILDDLPLPDLTNKSPTDTSDVSKDIPHIHSRQTNSSLGGMDDSNVLTTTKSKKRSLEDNETEIEVSRDTWNNKNMRSLEPPRSKKRINLIAAIKGVKSIKPVRTTLRYDEAITYNKDNKEKDRYVEAYHKEISQLLKMNTWDTNKYYDRNDIDPKKVINSMFIFNKKRDGTHKARFVARGDIQHPDTYDSDMQSNTVHHYALMTSLSIALDNDYYITQLDISSAYLYADIKEELYIRPPPHLGLNDKLLRLRKSLYGLKQSGANWYETIKSYLINCCDMQEVRGWSCVFKNSQVTICLFVDDMILFSKDLNANKKIITTLKKQYDTKIINLGEGDNEIQYDILGLEIKYQRSKYMKLGMEKSLTEKLPKLNVPLNPKGKKLRAPGQPGHYIDQDELEIDEDEYKEKVHEMQKLIGLASYVGYKFRFDLLYYINTLAQHILFPSRQVLDMTYELIQFIWNTRDKQLIWHKSKPVKPTNKLVVISDASYGNQPYYKSQIGNIYLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQELNKKPIIKGLLTDSRSTISIIKSTNEEKFRNRFFGTKAMRLRDEVSGNNLYVYYIETKMNIADVMTKPLPIKTFKLLTNKWIH | ||||||
Chain | PRO_0000279302 | 398-578 | Ty2 protease | |||
Sequence: NIATSSKFSRVNNDHINESTVSSQYLSDDNELSLGQQQKESKPTRTIDSNDELPDHLLIDSGASQTLVRSAHYLHHATPNSEINIVDAQKQDIPINAIGNLHFNFQNGTKTSIKALHTPNIAYDLLSLSELANQNITACFTRNTLERSDGTVLAPIVKHGDFYWLSKKYLIPSHISKLTIN | ||||||
Chain | PRO_0000279303 | 579-1232 | Integrase | |||
Sequence: NVNKSKSVNKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEWSNASTYQCPDCLIGKSTKHRHIKGSRLKYQESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQWVYPLHDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTLLNDCRTLLHCSGLPNHLWFSAVEFSTIIRNSLVSPKNDKSARQHAGLAGLDITTILPFGQPVIVNNHNPDSKIHPRGIPGYALHPSRNSYGYIIYLPSLKKTVDTTNYVILQDNQSKLDQFNYDTLTFDDDLNRLTAHNQSFIEQNETEQSYDQNTESDHDYQSEIEINSDPLVNDFSSQSLNPLQLDKEPVQKVRAPKEVDADISEYNILPSTIRSRTPHIINKESTEMGGTIESDTTSPRHSSTFTARNQKRPGSPNDMIDLTSQDRVNYGLENIKTTRLGGTEEPYIQRNSDTNIKYRTTNSTPSIDDRSSNSESTTPIISIETKAVCDNTPSIDTDPPEYRSSDHATPNIMPDKSSKNVTADSILDDLPLPDLTNKSPTDTSDVSKDIPHIHSRQTNSSLGGMDDSNVLTTTKSKKRSLEDNETEIEVSRDTWNNKNMRSLEPPRSKKRINLIA | ||||||
Chain | PRO_0000279304 | 1233-1770 | Reverse transcriptase/ribonuclease H | |||
Sequence: AIKGVKSIKPVRTTLRYDEAITYNKDNKEKDRYVEAYHKEISQLLKMNTWDTNKYYDRNDIDPKKVINSMFIFNKKRDGTHKARFVARGDIQHPDTYDSDMQSNTVHHYALMTSLSIALDNDYYITQLDISSAYLYADIKEELYIRPPPHLGLNDKLLRLRKSLYGLKQSGANWYETIKSYLINCCDMQEVRGWSCVFKNSQVTICLFVDDMILFSKDLNANKKIITTLKKQYDTKIINLGEGDNEIQYDILGLEIKYQRSKYMKLGMEKSLTEKLPKLNVPLNPKGKKLRAPGQPGHYIDQDELEIDEDEYKEKVHEMQKLIGLASYVGYKFRFDLLYYINTLAQHILFPSRQVLDMTYELIQFIWNTRDKQLIWHKSKPVKPTNKLVVISDASYGNQPYYKSQIGNIYLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQELNKKPIIKGLLTDSRSTISIIKSTNEEKFRNRFFGTKAMRLRDEVSGNNLYVYYIETKMNIADVMTKPLPIKTFKLLTNKWIH |
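Because all positional annotation in this entry refers to the full-length polyprotein, it can help to convert a polyprotein coordinate into a coordinate within the mature chain that contains it. The Python sketch below does this for the active-site and metal-binding positions annotated in the Function section. The chain ranges and positions are copied from this entry; the helper name `locate` and the dictionary layout are illustrative assumptions.

```python
# Mature chains and their 1-based inclusive ranges, copied from the chain table.
CHAINS = {
    "Capsid protein": (1, 397),
    "Ty2 protease": (398, 578),
    "Integrase": (579, 1232),
    "Reverse transcriptase/ribonuclease H": (1233, 1770),
}

# Active-site and Mg2+-binding positions from the Function feature table.
CATALYTIC_RESIDUES = [457, 667, 732, 1361, 1442, 1443, 1625, 1667, 1700]


def locate(position):
    """Map a polyprotein position to (chain name, position within that chain)."""
    for name, (start, end) in CHAINS.items():
        if start <= position <= end:
            return name, position - start + 1  # chain-local numbering is 1-based
    raise ValueError(f"position {position} lies outside every annotated chain")


located = {pos: locate(pos) for pos in CATALYTIC_RESIDUES}
for pos, (chain, local) in located.items():
    print(f"{pos} -> {chain}, residue {local}")
```

For example, the protease active-site aspartate at polyprotein position 457 maps to residue 60 of the Ty2 protease chain, and the integrase positions 667 and 732 map to residues 89 and 154 of the integrase chain.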
Post-translational modification
Initially, virus-like particles (VLPs) are composed of the structural unprocessed proteins Gag and Gag-Pol, and also contain the host initiator methionine tRNA (tRNA(i)-Met) which serves as a primer for minus-strand DNA synthesis, and a dimer of genomic Ty RNA. Processing of the polyproteins occurs within the particle and proceeds by an ordered pathway, called maturation. First, the protease (PR) is released by autocatalytic cleavage of the Gag-Pol polyprotein, and this cleavage is a prerequisite for subsequent processing at the remaining sites to release the mature structural and catalytic proteins. Maturation takes place prior to the RT reaction and is required to produce transposition-competent VLPs (By similarity).
Proteomic databases
PTM databases
Structure
Family & Domains
Features
Showing features for compositional bias, region, domain, motif.
Type | Position(s) | Description | Sequence
---|---|---|---
Compositional bias | 1-67 | Polar residues | MESQQLHQNPHSLHGSAAASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENH
Region | 1-89 | Disordered | MESQQLHQNPHSLHGSAAASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENHHHVSPQPASVPPPQNGQYQQHG
Region | 295-397 | RNA-binding | ENNINVSDRLACQLILKGLSGDFKYLRNQYRTKTNMKLSQLFAEIQLIYDENKIMNLNKPSQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAH
Compositional bias | 355-440 | Polar residues | SQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAHNIATSSKFSRVNNDHINESTVSSQYLSDDNELSLGQQQKESKP
Region | 355-449 | Disordered | SQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAHNIATSSKFSRVNNDHINESTVSSQYLSDDNELSLGQQQKESKPTRTIDSNDE
Region | 579-636 | Integrase-type zinc finger-like | NVNKSKSVNKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEWSNASTYQCPDC
Domain | 656-831 | Integrase catalytic | ESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQWVYPLHDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTLLNDCRTLLHCSGLPNHLWFSAVEFSTIIRNSLVSPKNDKSARQHAGLAGLDITTILP
Region | 1005-1038 | Disordered | GGTIESDTTSPRHSSTFTARNQKRPGSPNDMIDL
Region | 1057-1205 | Disordered | GGTEEPYIQRNSDTNIKYRTTNSTPSIDDRSSNSESTTPIISIETKAVCDNTPSIDTDPPEYRSSDHATPNIMPDKSSKNVTADSILDDLPLPDLTNKSPTDTSDVSKDIPHIHSRQTNSSLGGMDDSNVLTTTKSKKRSLEDNETEIE
Compositional bias | 1060-1108 | Polar residues | EEPYIQRNSDTNIKYRTTNSTPSIDDRSSNSESTTPIISIETKAVCDNT
Compositional bias | 1126-1140 | Polar residues | PNIMPDKSSKNVTAD
Compositional bias | 1169-1187 | Polar residues | IHSRQTNSSLGGMDDSNVL
Compositional bias | 1188-1205 | Basic and acidic residues | TTTKSKKRSLEDNETEIE
Motif | 1193-1227 | Bipartite nuclear localization signal | KKRSLEDNETEIEVSRDTWNNKNMRSLEPPRSKKR
Domain | 1353-1491 | Reverse transcriptase Ty1/copia-type | NDYYITQLDISSAYLYADIKEELYIRPPPHLGLNDKLLRLRKSLYGLKQSGANWYETIKSYLINCCDMQEVRGWSCVFKNSQVTICLFVDDMILFSKDLNANKKIITTLKKQYDTKIINLGEGDNEIQYDILGLEIKYQ
Domain | 1625-1767 | RNase H Ty1/copia-type | DASYGNQPYYKSQIGNIYLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQELNKKPIIKGLLTDSRSTISIIKSTNEEKFRNRFFGTKAMRLRDEVSGNNLYVYYIETKMNIADVMTKPLPIKTFKLLTNK
Domain
The C-terminal RNA-binding region of CA is sufficient for all its nucleocapsid-like chaperone activities.
The integrase core domain contains the D-x(n)-D-x35-E motif, named for the phylogenetically conserved glutamic acid and aspartic acid residues and the invariant 35-amino-acid spacing between the second and third acidic residues. Each acidic residue of the D,D35E motif is independently essential for the 3'-processing and strand transfer activities of purified integrase protein (By similarity).
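The fixed spacing in the D,D35E motif can be checked against the integrase catalytic domain sequence (residues 656-831) with a simple pattern search. The Python sketch below scans for an aspartate followed by exactly 35 residues and then a glutamate. It is a spacing check only: the domain sequence and start position are copied from this entry, the function name `find_d35e_pairs` is illustrative, and any glutamate it reports is inferred from spacing alone rather than taken from an annotated binding site.

```python
import re

# Integrase catalytic domain (residues 656-831), copied from the domain table.
DOMAIN_START = 656
DOMAIN_SEQ = (
    "ESYEPFQYLHTDIFGPVHHLPKSAPSYFISFTDEKTRFQWVYPL"
    "HDRREESILNVFTSILAFIKNQFNARVLVIQMDRGSEYTNKTLH"
    "KFFTNRGITACYTTTADSRAHGVAERLNRTLLNDCRTLLHCSGL"
    "PNHLWFSAVEFSTIIRNSLVSPKNDKSARQHAGLAGLDITTILP"
)


def find_d35e_pairs(seq, start):
    """Return (D position, E position) pairs separated by exactly 35 residues.

    The lookahead lets overlapping candidates be reported; positions are
    1-based and expressed in polyprotein numbering via `start`.
    """
    pairs = []
    for match in re.finditer(r"(?=D.{35}E)", seq):
        d_pos = start + match.start()
        pairs.append((d_pos, d_pos + 36))  # 35 spacer residues, then the E
    return pairs


hits = find_d35e_pairs(DOMAIN_SEQ, DOMAIN_START)
print(hits)

# The first aspartate of the motif is annotated separately at position 667.
print(DOMAIN_SEQ[667 - DOMAIN_START])
```

With the sequence as given, the only D/E pair with this spacing starts at the annotated Mg2+-binding aspartate 732 and ends at a glutamate at position 768; the variable-length x(n) stretch separates it from the upstream aspartate at 667.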
Keywords
- Domain
Phylogenomic databases
Family and domain databases
Sequence & Isoform
- Sequence status: Complete
This entry describes 2 isoforms produced by ribosomal frameshifting. The Gag-Pol polyprotein is generated by a +1 ribosomal frameshift.
Q07791-1
This isoform has been chosen as the canonical sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
- Name: Transposon Ty2-DR3 Gag-Pol polyprotein
- Note: Produced by +1 ribosomal frameshifting between codons Leu-431 and Gly-432 of the YDR261W-A ORF.
- Length: 1,770
- Mass (Da): 201,962
- Last updated: 1996-11-01 v1
- Checksum: C7D9F5E2103BD6F9
Q99303-1
The sequence of this isoform can be found in the external entry linked below. Isoforms of the same protein are often annotated in two different entries if their sequences differ significantly.
- Name: Transposon Ty2-DR3 Gag polyprotein
Sequence caution
Features
Showing features for compositional bias.
Type | Position(s) | Description | Sequence
---|---|---|---
Compositional bias | 1-67 | Polar residues | MESQQLHQNPHSLHGSAAASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENH
Compositional bias | 355-440 | Polar residues | SQYKQHSEYKNVSRTSPNTTNTKVTSRNYHRTNSSKPRAAKAHNIATSSKFSRVNNDHINESTVSSQYLSDDNELSLGQQQKESKP
Compositional bias | 1060-1108 | Polar residues | EEPYIQRNSDTNIKYRTTNSTPSIDDRSSNSESTTPIISIETKAVCDNT
Compositional bias | 1126-1140 | Polar residues | PNIMPDKSSKNVTAD
Compositional bias | 1169-1187 | Polar residues | IHSRQTNSSLGGMDDSNVL
Compositional bias | 1188-1205 | Basic and acidic residues | TTTKSKKRSLEDNETEIE
Keywords
- Coding sequence diversity
- Technical term
Sequence databases
Nucleotide Sequence (EMBL/GenBank/DDBJ) | Protein Sequence (EMBL/GenBank/DDBJ) | Molecule Type | Status
---|---|---|---
Z68329 | CAA92721.1 | Genomic DNA | Sequence problems.
Z74387 | CAA98914.1 | Genomic DNA |
BK006938 | DAA12102.1 | Genomic DNA |