P47024 · YJ41B_YEAST
- Protein: Transposon Ty4-J Gag-Pol polyprotein
- Gene: TY4B-J
- Status: UniProtKB reviewed (Swiss-Prot)
- Amino acids: 1803
- Protein existence: Inferred from homology
- Annotation score: 5/5
Function
Capsid protein (CA) is the structural component of the virus-like particle (VLP), forming the shell that encapsulates the retrotransposon's dimeric RNA genome.
The aspartyl protease (PR) mediates the proteolytic cleavages of the Gag and Gag-Pol polyproteins after assembly of the VLP.
Reverse transcriptase/ribonuclease H (RT) is a multifunctional enzyme that catalyzes the conversion of the retro-element's RNA genome into dsDNA within the VLP. The enzyme displays a DNA polymerase activity that can copy either DNA or RNA templates, and a ribonuclease H (RNase H) activity that cleaves the RNA strand of RNA-DNA heteroduplexes during plus-strand synthesis and hydrolyzes RNA primers. The conversion leads to a linear dsDNA copy of the retrotransposon that includes long terminal repeats (LTRs) at both ends (By similarity).
Integrase (IN) targets the VLP to the nucleus, where a subparticle preintegration complex (PIC) containing at least integrase and the newly synthesized dsDNA copy of the retrotransposon must transit the nuclear membrane. Once in the nucleus, integrase performs the integration of the dsDNA into the host genome (By similarity).
Miscellaneous
Retrotransposons are mobile genetic entities that are able to replicate via an RNA intermediate and a reverse transcription step. In contrast to retroviruses, retrotransposons are non-infectious, lack an envelope and remain intracellular. Ty4 retrotransposons belong to the copia elements (Pseudoviridae).
Catalytic activity
- a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) = diphosphate + DNA(n+1)
Features
Showing features for active site, binding site.
Type | Position(s) | Description | Sequence
---|---|---|---
Active site | 415 | For protease activity; shared with dimeric partner | D
Binding site | 631 | Mg2+ 1; catalytic; for integrase activity | D
Binding site | 696 | Mg2+ 1; catalytic; for integrase activity | D
Binding site | 1384 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1463 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1464 | Mg2+ 2; catalytic; for reverse transcriptase activity | D
Binding site | 1645 | Mg2+ 3; catalytic; for RNase H activity | D
Binding site | 1687 | Mg2+ 3; catalytic; for RNase H activity | E
Binding site | 1721 | Mg2+ 3; catalytic; for RNase H activity | D
GO annotations
Aspect | Term
---|---
Cellular Component | mitochondrion
Cellular Component | nucleus
Molecular Function | aspartic-type endopeptidase activity
Molecular Function | ATP binding
Molecular Function | DNA binding
Molecular Function | DNA-directed DNA polymerase activity
Molecular Function | metal ion binding
Molecular Function | RNA binding
Molecular Function | RNA-directed DNA polymerase activity
Molecular Function | RNA-DNA hybrid ribonuclease activity
Biological Process | DNA integration
Biological Process | DNA recombination
Biological Process | proteolysis
Biological Process | transposition
Biological Process | viral translational frameshifting
Keywords
- Molecular function
- Biological process
- Ligand
Names & Taxonomy
Protein names
- Recommended name: Transposon Ty4-J Gag-Pol polyprotein
- Alternative names
Including 4 domains:
- Recommended name: Capsid protein
- Short names: CA
- Recommended name: Ty4 protease
- EC number
- Short names: PR
- Recommended name: Integrase
- Short names: IN
- Recommended name: Reverse transcriptase/ribonuclease H
- EC number
- Short names: RT; RT-RH
Gene names
Organism names
- Strains
- Taxonomic lineage: Eukaryota > Fungi > Dikarya > Ascomycota > Saccharomycotina > Saccharomycetes > Saccharomycetales > Saccharomycetaceae > Saccharomyces
Accessions
- Primary accession: P47024
- Secondary accessions
Proteomes
Organism-specific databases
PTM/Processing
Features
Showing features for chain.
Type | ID | Position(s) | Description
---|---|---|---
Chain | PRO_0000203502 | 1-1803 | Transposon Ty4-J Gag-Pol polyprotein
Sequence: MATPVRGETRNVIDDNISARIQSKVKTNDTVRQTPSSLRKVSIKDEQVRQYQRNLNRFKTILNGLKAEEEKLSEADDIQMLAEKLLKLGETIDKVENRIVDLVEKIQLLETNENNNILHEHIDATGTYYLFDTLTSTNKRFYPKDCVFDYRTNNVENIPILLNNFKKFIKKYQFDDVFENDIIEIDPRENEILCKIIKEGLGESLDIMNTNTTDIFRIIDGLKKQNIEVCMVEMSELEPGEKVLVDTTCRNSALLMNKLQKLVLMEKWIFSKCCQDCPNLKDYLQEAIMGTLHESLRNSVKQRLYNIPHDVGIDHEEFLINTVIETVIDLSPIADDQIENSCMYCKSVFHCSINCKKKPNRELGLTRPISQKPIIYKVHRDNNHLSPVQNEQKSWNKTQKRSNKVYNSKKLVIIDTGSGVNITNDKTLLHNYEDSNRSTRFFGIGKNSSVSVKGYGYIKIKNGHNNTDNKCLLTYYVPEEESTIISCYDLAKKTKMVLSRKYTRLGNKIIKIKTKIVNGVIHVKMNELIERPSDDSKINAIKPTSSPGFKLNKRSITLEDAHKRMGHTGIQQIENSIKHNHYEESLDLIKEPNEFWCQTCKISKATKRNHYTGSMNNHSTDHEPGSSWCMDIFGPVSSSNADTKRYMLIMVDNNTRYCMTSTHFNKNAETILAQVRKNIQYVETQFDRKVREINSDRGTEFTNDQIEEYFISKGIHHILTSTQDHAANGRAERYIRTIITDATTLLRQSNLRVKFWEYAVTSATNIRNYLEHKSTGKLPLKAISRQPVTVRLMSFLPFGEKGIIWNHNHKKLKPSGLPSIILCKDPNSYGYKFFIPSKNKIVTSDNYTIPNYTMDGRVRNTQNINKSHQFSSDNDDEEDQIETVTNLCEALENYEDDNKPITRLEDLFTEEELSQIDSNAKYPSPSNNLEGDLDYVFSDVEESGDYDVESELSTTNNSISTDKNKILSNKDFNSELASTEISISGIDKKGLINTSHIDEDKYDEKVHRIPSIIQEKLVGSKNTIKINDENKISDRIRSKNIGSILNTGLSRCVDITDESITNKDESMHNAKPELIQEQLKKTNHETSFPKEGSIGTNVKFRNTNNEISLKTGDTSLPIKTLESINNHHSNDYSTNKVEKFEKENHHPPPIEDIVDMSDQTDMESNCQDGNNLKELKVTDKNVPTDNGTNVSPRLEQNIEASGSPVQTVNKSAFLNKEFSSLNMKRKRKRHDKNNSLTSYELERDKKRSKKNRVKLIPDNMETVSAPKIRAIYYNEAISKNPDLKEKHEYKQAYHKELQNLKDMKVFDVDVKYSRSEIPDNLIVPTNTIFTKKRNGIYKARIVCRGDTQSPDTYSVITTESLNHNHIKIFLMIANNRNMFMKTLDINHAFLYAKLEEEIYIPHPHDRRCVVKLNKALYGLKQSPKEWNDHLRQYLNGIGLKDNSYTPGLYQTEDKNLMIAVYVDDCVIAASNEQRLDEFINKLKSNFELKITGTLIDDVLDTDILGMDLVYNKRLGTIDLTLKSFINRMDKKYNEELKKIRKSSIPHMSTYKIDPKKDVLQMSEEEFRQGVLKLQQLLGELNYVRHKCRYDIEFAVKKVARLVNYPHERVFYMIYKIIQYLVRYKDIGIHYDRDCNKDKKVIAITDASVGSEYDAQSRIGVILWYGMNIFNVYSNKSTNRCVSSTEAELHAIYEGYADSETLKVTLKELGEGDNNDIVMITDSKPAIQGLNRSYQQPKEKFTWIKTEIIKEKIKEKSIKLLKITGKGNIADLLTKPVSASDFKRFIQVLKNKITSQDILASTDY |
Post-translational modification
Proteolytically processed into capsid protein (CA), Ty4 protease (PR), integrase (IN) and reverse transcriptase/ribonuclease H (RT) proteins (Probable). Initially, virus-like particles (VLPs) are composed of the structural unprocessed proteins Gag and Gag-Pol, and also contain the host initiator methionine tRNA (tRNA(i)-Met) which serves as a primer for minus-strand DNA synthesis, and a dimer of genomic Ty RNA. Processing of the polyproteins occurs within the particle and proceeds by an ordered pathway, called maturation. First, the protease (PR) is released by autocatalytic cleavage of the Gag-Pol polyprotein, and this cleavage is a prerequisite for subsequent processing at the remaining sites to release the mature structural and catalytic proteins. Maturation takes place prior to the RT reaction and is required to produce transposition-competent VLPs (By similarity).
Proteomic databases
Structure
Family & Domains
Features
Showing features for coiled coil, region, domain, compositional bias.
Type | Position(s) | Description | Sequence
---|---|---|---
Coiled coil | 39-115 | | RKVSIKDEQVRQYQRNLNRFKTILNGLKAEEEKLSEADDIQMLAEKLLKLGETIDKVENRIVDLVEKIQLLETNENN
Region | 382-502 | Ty4 protease | NNHLSPVQNEQKSWNKTQKRSNKVYNSKKLVIIDTGSGVNITNDKTLLHNYEDSNRSTRFFGIGKNSSVSVKGYGYIKIKNGHNNTDNKCLLTYYVPEEESTIISCYDLAKKTKMVLSRKY
Region | 540-600 | Integrase-type zinc finger-like | AIKPTSSPGFKLNKRSITLEDAHKRMGHTGIQQIENSIKHNHYEESLDLIKEPNEFWCQTC
Domain | 620-787 | Integrase catalytic | TDHEPGSSWCMDIFGPVSSSNADTKRYMLIMVDNNTRYCMTSTHFNKNAETILAQVRKNIQYVETQFDRKVREINSDRGTEFTNDQIEEYFISKGIHHILTSTQDHAANGRAERYIRTIITDATTLLRQSNLRVKFWEYAVTSATNIRNYLEHKSTGKLPLKAISRQP
Region | 1224-1250 | Disordered | KRKRKRHDKNNSLTSYELERDKKRSKK
Compositional bias | 1233-1250 | Basic and acidic residues | NNSLTSYELERDKKRSKK
Domain | 1376-1511 | Reverse transcriptase Ty1/copia-type | RNMFMKTLDINHAFLYAKLEEEIYIPHPHDRRCVVKLNKALYGLKQSPKEWNDHLRQYLNGIGLKDNSYTPGLYQTEDKNLMIAVYVDDCVIAASNEQRLDEFINKLKSNFELKITGTLIDDVLDTDILGMDLVYN
Domain | 1645-1791 | RNase H Ty1/copia-type | DASVGSEYDAQSRIGVILWYGMNIFNVYSNKSTNRCVSSTEAELHAIYEGYADSETLKVTLKELGEGDNNDIVMITDSKPAIQGLNRSYQQPKEKFTWIKTEIIKEKIKEKSIKLLKITGKGNIADLLTKPVSASDFKRFIQVLKNK
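The active-site and binding-site positions listed under Function can be cross-checked against the region and domain ranges in this table. The following is a minimal sketch, not part of the UniProt entry: the names `REGIONS`, `SITES`, and `region_for` are hypothetical, and the coordinates are simply copied from the two feature tables (1-based, inclusive).

```python
# Region/domain ranges copied from the Family & Domains features (1-based, inclusive).
REGIONS = {
    "Ty4 protease": (382, 502),
    "Integrase-type zinc finger-like": (540, 600),
    "Integrase catalytic": (620, 787),
    "Reverse transcriptase Ty1/copia-type": (1376, 1511),
    "RNase H Ty1/copia-type": (1645, 1791),
}

# Active/binding sites copied from the Function features: position -> annotated activity.
SITES = {
    415: "protease",
    631: "integrase",
    696: "integrase",
    1384: "reverse transcriptase",
    1463: "reverse transcriptase",
    1464: "reverse transcriptase",
    1645: "RNase H",
    1687: "RNase H",
    1721: "RNase H",
}


def region_for(position):
    """Return the name of the first region that contains `position`, or None."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return None


# Map every annotated site onto the region or domain it falls in.
site_regions = {pos: region_for(pos) for pos in SITES}

for pos, region in site_regions.items():
    print(f"{pos} ({SITES[pos]}) -> {region}")
```

Under these coordinates each catalytic site falls inside the region or domain that matches its annotated activity, which is a quick consistency check when the two tables are handled separately.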
Domain
The integrase core domain contains the D-x(n)-D-x35-E motif, named for the phylogenetically conserved glutamic acid and aspartic acid residues and the invariant 35 amino acid spacing between the second and third acidic residues. Each acidic residue of the D,D35E motif is independently essential for the 3'-processing and strand transfer activities of purified integrase protein (By similarity).
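The spacing rule of the D,D35E motif can be illustrated on the integrase catalytic domain of this entry (positions 620-787). This is a hedged sketch: the two aspartate positions (631 and 696) are the annotated Mg2+ binding sites, while the glutamate position is computed here from the stated 35-residue spacing and is an assumption, not an annotation from the entry. The names `DOMAIN_START`, `INTEGRASE_CATALYTIC`, and `residue_at` are hypothetical.

```python
# Integrase catalytic domain of P47024, positions 620-787 (copied from the features table).
DOMAIN_START = 620
INTEGRASE_CATALYTIC = (
    "TDHEPGSSWCMDIFGPVSSSNADTKRYMLIMVDNNTRYCMTSTHFNKNAETILAQVRKNI"
    "QYVETQFDRKVREINSDRGTEFTNDQIEEYFISKGIHHILTSTQDHAANGRAERYIRTII"
    "TDATTLLRQSNLRVKFWEYAVTSATNIRNYLEHKSTGKLPLKAISRQP"
)


def residue_at(position):
    """Return the residue at a 1-based polyprotein position inside the domain."""
    return INTEGRASE_CATALYTIC[position - DOMAIN_START]


# Annotated Mg2+ binding sites for integrase activity.
first_asp, second_asp = 631, 696

# "x35" means 35 residues lie between the second D and the E,
# so the E sits 36 positions after the second D (assumption derived from the motif).
glu = second_asp + 35 + 1

motif = (residue_at(first_asp), residue_at(second_asp), residue_at(glu))
print(first_asp, second_asp, glu, motif)
```

The check only confirms that the residues at the expected offsets are acidic; it says nothing about catalytic function, which the entry infers by similarity.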
Keywords
- Domain
Phylogenomic databases
Family and domain databases
Sequence & Isoform
- Sequence status: Complete
This entry describes 2 isoforms produced by ribosomal frameshifting. The Gag-Pol polyprotein is generated by a +1 ribosomal frameshift.
P47024-1
This isoform has been chosen as the canonical sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
- Name: Transposon Ty4-J Gag-Pol polyprotein
- Note: Produced by +1 ribosomal frameshifting between codon Leu-363 and Gly-364 of the YJL114W ORF.
- Length: 1,803
- Mass (Da): 207,710
- Last updated: 2010-11-02 v3
- Checksum: A58D1D4E96F7C9C9
P47023-1
The sequence of this isoform can be found in the external entry linked below. Isoforms of the same protein are often annotated in two different entries if their sequences differ significantly.
- Name: Transposon Ty4-J Gag polyprotein
Sequence caution
Features
Showing features for sequence conflict, compositional bias.
Type | Position(s) | Description | Sequence
---|---|---|---
Sequence conflict | 452 | in Ref. 1; X67284 | V → L
Sequence conflict | 684 | in Ref. 1; X67284 | T → A
Sequence conflict | 920 | in Ref. 1; X67284 | A → S
Sequence conflict | 1020 | in Ref. 1; X67284 | S → R
Compositional bias | 1233-1250 | Basic and acidic residues | NNSLTSYELERDKKRSKK
Sequence conflict | 1803 | in Ref. 1; X67284 | Y → YLINEVLNTQISVEVQ
Keywords
- Coding sequence diversity
- Technical term
Sequence databases
Nucleotide Sequence (EMBL/GenBank/DDBJ) | Protein Sequence (EMBL/GenBank/DDBJ) | Molecule Type | Status
---|---|---|---
X67284 | - | Genomic DNA | No translation available.
Z49389 | CAA89409.1 | Genomic DNA | Sequence problems.
BK006943 | DAA08686.2 | Genomic DNA | Frameshift