P04588 · POL_HV1MA
- Protein: Gag-Pol polyprotein
- Gene: gag-pol
- Status: UniProtKB reviewed (Swiss-Prot)
- Amino acids: 1440
- Protein existence: Evidence at protein level
- Annotation score: 5/5
Function
Gag-Pol polyprotein
Matrix protein p17
Capsid protein p24
Host restriction factors such as TRIM5-alpha or TRIMCyp bind retroviral capsids and cause premature capsid disassembly, leading to blocks in reverse transcription. Capsid restriction by TRIM5 is one of the factors that restrict HIV-1 to the human species. Host PIN1 apparently facilitates virion uncoating. In contrast, interactions with PDZD8 or CYPA stabilize the capsid.
Nucleocapsid protein p7
Protease
Reverse transcriptase/ribonuclease H
Integrase
Miscellaneous
Reverse transcriptase/ribonuclease H
Catalytic activity
- Endohydrolysis of RNA in RNA/DNA hybrids. Three different cleavage modes:
  1. Sequence-specific internal cleavage of RNA. Human immunodeficiency virus type 1 and Moloney murine leukemia virus enzymes prefer to cleave the RNA strand one nucleotide away from the RNA-DNA junction.
  2. RNA 5'-end directed cleavage 13-19 nucleotides from the RNA end.
  3. DNA 3'-end directed cleavage 15-20 nucleotides away from the primer terminus.
- a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) = diphosphate + DNA(n+1)
- a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) = diphosphate + DNA(n+1)
Cofactor
Activity regulation
Features
Showing features for site, active site, binding site, dna binding.
Type | Position(s) | Description | Sequence
---|---|---|---
Site | 138-139 | Cleavage; by viral protease | YP
Site | 227-228 | Cis/trans isomerization of proline peptide bond; by human PPIA/CYPA | GP
Site | 369-370 | Cleavage; by viral protease | LA
Site | 383-384 | Cleavage; by viral protease | IM
Site | 438-439 | Cleavage; by viral protease | NF
Site | 446-447 | Cleavage; by viral protease | FP
Site | 493-494 | Cleavage; by viral protease | FP
Active site | 518 | For protease activity; shared with dimeric partner | D
Site | 592-593 | Cleavage; by viral protease | FP
Binding site | 702 | Mg2+ 1; catalytic; for reverse transcriptase activity | D
Binding site | 777 | Mg2+ 1; catalytic; for reverse transcriptase activity | D
Binding site | 778 | Mg2+ 1; catalytic; for reverse transcriptase activity | D
Site | 993 | Essential for RT p66/p51 heterodimerization | W
Site | 1006 | Essential for RT p66/p51 heterodimerization | W
Site | 1032-1033 | Cleavage; by viral protease; partial | FY
Binding site | 1035 | Mg2+ 2; catalytic; for RNase H activity | D
Binding site | 1070 | Mg2+ 2; catalytic; for RNase H activity | E
Binding site | 1090 | Mg2+ 2; catalytic; for RNase H activity | D
Binding site | 1141 | Mg2+ 2; catalytic; for RNase H activity | D
Site | 1152-1153 | Cleavage; by viral protease | LF
Binding site | 1164 | Zn2+ | H
Binding site | 1168 | Zn2+ | H
Binding site | 1192 | Zn2+ | C
Binding site | 1195 | Zn2+ | C
Binding site | 1216 | Mg2+ 3; catalytic; for integrase activity | D
Binding site | 1268 | Mg2+ 3; catalytic; for integrase activity | D
Binding site | 1304 | Mg2+ 3; catalytic; for integrase activity | E
DNA binding | 1375-1422 | Integrase-type | FRVYYRDNRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRD
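The protease cleavage sites listed above imply the boundaries of the mature products. The following is a minimal Python sketch of that bookkeeping, not part of the UniProt entry: the function name `fragments` and the constants are illustrative assumptions, the site positions are copied from the table, and the first residue is taken as 2 because the entry annotates the initiator methionine as removed.

```python
# Protease cleavage sites as (P1, P1') pairs, copied from the feature table.
# The 227-228 site is an isomerization site, not a cleavage site, so it is omitted.
CLEAVAGE_SITES = [
    (138, 139), (369, 370), (383, 384), (438, 439), (446, 447),
    (493, 494), (592, 593), (1032, 1033), (1152, 1153),
]
# The entry marks this cleavage as partial.
PARTIAL_SITES = {(1032, 1033)}

FIRST_RESIDUE = 2     # initiator methionine (position 1) is removed
LAST_RESIDUE = 1440   # length of the canonical sequence


def fragments(sites, first, last):
    """Return 1-based inclusive (start, end) ranges produced by cutting at each site."""
    ranges = []
    start = first
    for p1, p1_prime in sorted(sites):
        ranges.append((start, p1))   # fragment ends at the P1 residue
        start = p1_prime             # next fragment starts at P1'
    ranges.append((start, last))
    return ranges


# Products when the partial site is left uncut (illustrative assumption).
complete_only = fragments(
    [s for s in CLEAVAGE_SITES if s not in PARTIAL_SITES],
    FIRST_RESIDUE, LAST_RESIDUE,
)
# Products when every listed site is cut.
all_sites = fragments(CLEAVAGE_SITES, FIRST_RESIDUE, LAST_RESIDUE)
```

With the partial site left uncut, the ranges include 593-1152, which the entry annotates as reverse transcriptase/ribonuclease H. Cutting at 1032-1033 as well splits that range into 593-1032 and 1033-1152, annotated as p51 RT and p15.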
GO annotations
Keywords
- Molecular function
- Biological process
- Ligand
Protein family/group databases
Names & Taxonomy
Protein names
- Recommended name: Gag-Pol polyprotein
- Alternative names
- Cleaved into 11 chains
Gene names
Organism names
- Taxonomic lineage: Viruses > Riboviria > Pararnavirae > Artverviricota > Revtraviricetes > Ortervirales > Retroviridae > Orthoretrovirinae > Lentivirus > Human immunodeficiency virus 1
- Virus hosts
Accessions
- Primary accession: P04588
- Secondary accessions
Proteomes
Subcellular Location
Gag-Pol polyprotein
Matrix protein p17
Capsid protein p24
Nucleocapsid protein p7
Reverse transcriptase/ribonuclease H
Integrase
Keywords
- Cellular component
Phenotypes & Variants
Keywords
- Disease
PTM/Processing
Features
Showing features for initiator methionine, lipidation, chain, modified residue, peptide.
Type | ID | Position(s) | Description | |||
---|---|---|---|---|---|---|
Initiator methionine | | 1 | Removed; by host | |||
Sequence: M | ||||||
Lipidation | | 2 | N-myristoyl glycine; by host | |||
Sequence: G | ||||||
Chain | PRO_0000042385 | 2-138 | Matrix protein p17 | |||
Sequence: GARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPGLLETGEGCQQIMEQLQSTLKTGSEEIKSLYNTVATLYCVHQRIDVKDTKEALDKIEEIQNKSRQKTQQAAAAQQAAAATKNSSSVSQNY | ||||||
Chain | PRO_0000261270 | 2-1440 | Gag-Pol polyprotein | |||
Sequence: GARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPGLLETGEGCQQIMEQLQSTLKTGSEEIKSLYNTVATLYCVHQRIDVKDTKEALDKIEEIQNKSRQKTQQAAAAQQAAAATKNSSSVSQNYPIVQNAQGQMIHQAISPRTLNAWVKVIEEKAFSPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAADWDRVHPVHAGPIPPGQMREPRGSDIAGTTSTLQEQIGWMTSNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPSHKARVLAEAMSQATNSTAAIMMQRGNFKGQKRIKCFNCGKEGHLARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLRENLAFPQGKAREFPSEQTRANSPTSRELRVWGGDKTLSETGAERQGIVSFSFPQITLWQRPVVTVRVGGQLKEALLDTGADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTILVGPTPVNIIGRNMLTQIGCTLNFPISPIETVPVKLKPGMDGPRVKQWPLTEEKIKALTEICKDMEKEGKILKIGPENPYNTPVFAIKKKDSTKWRKLVNFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTKNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIQLPDKESWTVNDIQKLVGKLNWASQIYPGIKVKQLCKLLRGAKALTDIVPLTAEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEQYKNLKTGKYARIKSAHTNDVKQLTEAVQKIAQESIVIWGKTPKFRLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLETEPIVGAETFYVDGAANRETKKGKAGYVTDRGRQKVVSLTETTNQKTELQAIHLALQDSGSEVNIVTDSQYALGIIQAQPDKSESEIVNQIIEQLIQKDKVYLSWVPAHKGIGGNEQVDKLVSSGIRKVLFLDGIDKAQEEHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKIIIVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVVHTDNGSNFTSAAVKAACWWANIKQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDMIATDIQTKELQKQITKIQNFRVYYRDNRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVAGGQDED | ||||||
Modified residue | | 138 | Phosphotyrosine; by host | |||
Sequence: Y | ||||||
Chain | PRO_0000042386 | 139-369 | Capsid protein p24 | |||
Sequence: PIVQNAQGQMIHQAISPRTLNAWVKVIEEKAFSPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAADWDRVHPVHAGPIPPGQMREPRGSDIAGTTSTLQEQIGWMTSNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPSHKARVL | ||||||
Peptide | PRO_0000042387 | 370-383 | Spacer peptide 1 | |||
Sequence: AEAMSQATNSTAAI | ||||||
Chain | PRO_0000042388 | 384-438 | Nucleocapsid protein p7 | |||
Sequence: MMQRGNFKGQKRIKCFNCGKEGHLARNCRAPRKKGCWKCGKEGHQMKDCTERQAN | ||||||
Peptide | PRO_0000246719 | 439-446 | Transframe peptide | |||
Sequence: FLRENLAF | ||||||
Chain | PRO_0000042389 | 447-493 | p6-pol | |||
Sequence: PQGKAREFPSEQTRANSPTSRELRVWGGDKTLSETGAERQGIVSFSF | ||||||
Chain | PRO_0000038659 | 494-592 | Protease | |||
Sequence: PQITLWQRPVVTVRVGGQLKEALLDTGADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTILVGPTPVNIIGRNMLTQIGCTLNF | ||||||
Chain | PRO_0000042391 | 593-1032 | p51 RT | |||
Sequence: PISPIETVPVKLKPGMDGPRVKQWPLTEEKIKALTEICKDMEKEGKILKIGPENPYNTPVFAIKKKDSTKWRKLVNFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTKNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIQLPDKESWTVNDIQKLVGKLNWASQIYPGIKVKQLCKLLRGAKALTDIVPLTAEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEQYKNLKTGKYARIKSAHTNDVKQLTEAVQKIAQESIVIWGKTPKFRLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLETEPIVGAETF | ||||||
Chain | PRO_0000042390 | 593-1152 | Reverse transcriptase/ribonuclease H | |||
Sequence: PISPIETVPVKLKPGMDGPRVKQWPLTEEKIKALTEICKDMEKEGKILKIGPENPYNTPVFAIKKKDSTKWRKLVNFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTKNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIQLPDKESWTVNDIQKLVGKLNWASQIYPGIKVKQLCKLLRGAKALTDIVPLTAEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEQYKNLKTGKYARIKSAHTNDVKQLTEAVQKIAQESIVIWGKTPKFRLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLETEPIVGAETFYVDGAANRETKKGKAGYVTDRGRQKVVSLTETTNQKTELQAIHLALQDSGSEVNIVTDSQYALGIIQAQPDKSESEIVNQIIEQLIQKDKVYLSWVPAHKGIGGNEQVDKLVSSGIRKVL | ||||||
Chain | PRO_0000042392 | 1033-1152 | p15 | |||
Sequence: YVDGAANRETKKGKAGYVTDRGRQKVVSLTETTNQKTELQAIHLALQDSGSEVNIVTDSQYALGIIQAQPDKSESEIVNQIIEQLIQKDKVYLSWVPAHKGIGGNEQVDKLVSSGIRKVL | ||||||
Chain | PRO_0000042393 | 1153-1440 | Integrase | |||
Sequence: FLDGIDKAQEEHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKIIIVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVVHTDNGSNFTSAAVKAACWWANIKQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDMIATDIQTKELQKQITKIQNFRVYYRDNRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVAGGQDED |
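The chain coordinates above are 1-based and inclusive, so a chain's length is `end - start + 1`. As a hedged consistency check (the helper names `length` and `tiles` are illustrative, not UniProt tooling), this Python sketch confirms that the non-overlapping products cover residues 2-1440 without gaps, and that p51 RT and p15 together span the reverse transcriptase/ribonuclease H chain.

```python
# Chain and peptide coordinates copied from the feature table (1-based, inclusive).
CHAINS = {
    "Matrix protein p17": (2, 138),
    "Capsid protein p24": (139, 369),
    "Spacer peptide 1": (370, 383),
    "Nucleocapsid protein p7": (384, 438),
    "Transframe peptide": (439, 446),
    "p6-pol": (447, 493),
    "Protease": (494, 592),
    "Reverse transcriptase/ribonuclease H": (593, 1152),
    "Integrase": (1153, 1440),
}
# Alternative products of the partially cleaved RT chain.
RT_SUBCHAINS = {"p51 RT": (593, 1032), "p15": (1033, 1152)}
POLYPROTEIN = (2, 1440)


def length(span):
    """Number of residues in a 1-based inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def tiles(spans, whole):
    """True if the spans cover `whole` exactly, with no gaps or overlaps."""
    ordered = sorted(spans)
    if ordered[0][0] != whole[0] or ordered[-1][1] != whole[1]:
        return False
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


total = sum(length(span) for span in CHAINS.values())
```

Here `total` equals `length(POLYPROTEIN)`, which is 1439 residues because position 1 is the removed initiator methionine.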
Post-translational modification
Gag-Pol polyprotein
Matrix protein p17
Capsid protein p24
Nucleocapsid protein p7
Keywords
- PTM
Interaction
Subunit
Matrix protein p17
Interacts with gp41 (via C-terminus) (By similarity).
Interacts with host CALM1; this interaction induces a conformational change in the Matrix protein, triggering exposure of the myristate group (By similarity).
Interacts with host AP3D1; this interaction allows the polyprotein trafficking to multivesicular bodies during virus assembly (By similarity).
Part of the pre-integration complex (PIC) which is composed of viral genome, matrix protein, Vpr and integrase (By similarity).
Capsid protein p24
Interacts with host PDZD8; this interaction stabilizes the capsid (By similarity).
Interacts with monkey TRIM5; this interaction destabilizes the capsid (By similarity).
Protease
Reverse transcriptase/ribonuclease H
Heterodimerization of RT is essential for DNA polymerase activity (By similarity).
The overall folding of the subdomains is similar in p66 RT and p51 RT but the spatial arrangements of the subdomains are dramatically different (By similarity).
Integrase
Part of the pre-integration complex (PIC) which is composed of viral genome, matrix protein, Vpr and integrase. Interacts with human SMARCB1/INI1 and human PSIP1/LEDGF isoform 1. Interacts with human KPNA3; this interaction might play a role in nuclear import of the pre-integration complex (By similarity).
Interacts with human NUP153; this interaction might play a role in nuclear import of the pre-integration complex (By similarity).
Structure
Family & Domains
Features
Showing features for region, motif, zinc finger, domain.
Type | Position(s) | Description | ||||
---|---|---|---|---|---|---|
Region | 7-31 | Interaction with Gp41 | ||||
Sequence: VLSGGKLDAWEKIRLRPGGKKKYRL | ||||||
Region | 8-43 | Interaction with host CALM1 | ||||
Sequence: LSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELER | ||||||
Region | 12-19 | Interaction with host AP3D1 | ||||
Sequence: KLDAWEKI | ||||||
Region | 14-33 | Interaction with membrane phosphatidylinositol 4,5-bisphosphate and RNA | ||||
Sequence: DAWEKIRLRPGGKKKYRLKH | ||||||
Motif | 16-22 | Nuclear export signal | ||||
Sequence: WEKIRLR | ||||||
Motif | 26-32 | Nuclear localization signal | ||||
Sequence: KKKYRLK | ||||||
Region | 73-77 | Interaction with membrane phosphatidylinositol 4,5-bisphosphate | ||||
Sequence: EEIKS | ||||||
Region | 195-233 | Interaction with human PPIA/CYPA and NUP153 | ||||
Sequence: NIVGGHQAAMQMLKDTINEEAADWDRVHPVHAGPIPPGQ | ||||||
Region | 283-369 | Dimerization/Multimerization of capsid protein p24 | ||||
Sequence: YSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPSHKARVL | ||||||
Zinc finger | 396-413 | CCHC-type 1 | ||||
Sequence: IKCFNCGKEGHLARNCRA | ||||||
Zinc finger | 417-434 | CCHC-type 2 | ||||
Sequence: KGCWKCGKEGHQMKDCTE | ||||||
Region | 494-498 | Dimerization of protease | ||||
Sequence: PQITL | ||||||
Domain | 513-582 | Peptidase A2 | ||||
Sequence: KEALLDTGADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTILVGPTPVNIIGRNM | ||||||
Region | 542-548 | Dimerization of protease | ||||
Sequence: GIGGFIK | ||||||
Region | 581-593 | Dimerization of protease | ||||
Sequence: NMLTQIGCTLNFP | ||||||
Domain | 636-826 | Reverse transcriptase | ||||
Sequence: EGKILKIGPENPYNTPVFAIKKKDSTKWRKLVNFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTKNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLKWGFTTPDKKHQKEPPFLWMGYEL | ||||||
Region | 819-827 | RT 'primer grip' | ||||
Sequence: FLWMGYELH | ||||||
Motif | 990-1006 | Tryptophan repeat motif | ||||
Sequence: WEAWWTEYWQATWIPEW | ||||||
Domain | 1026-1149 | RNase H type-1 | ||||
Sequence: IVGAETFYVDGAANRETKKGKAGYVTDRGRQKVVSLTETTNQKTELQAIHLALQDSGSEVNIVTDSQYALGIIQAQPDKSESEIVNQIIEQLIQKDKVYLSWVPAHKGIGGNEQVDKLVSSGIR | ||||||
Zinc finger | 1155-1196 | Integrase-type | ||||
Sequence: DGIDKAQEEHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQ | ||||||
Domain | 1206-1356 | Integrase catalytic | ||||
Sequence: VDCSPGIWQLDCTHLEGKIIIVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVVHTDNGSNFTSAAVKAACWWANIKQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDMI |
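The two CCHC-type zinc fingers can be located by pattern matching. This Python sketch assumes the usual retroviral zinc-knuckle spacing C-X2-C-X4-H-X4-C, which the entry itself does not state. The nucleocapsid sequence and its start position (384) are copied from the PTM/Processing table; the names `NC_SEQUENCE`, `CCHC` and `find_cchc` are illustrative.

```python
import re

# Nucleocapsid protein p7, residues 384-438 of the polyprotein (from the entry).
NC_SEQUENCE = "MMQRGNFKGQKRIKCFNCGKEGHLARNCRAPRKKGCWKCGKEGHQMKDCTERQAN"
NC_START = 384

# Assumed zinc-knuckle spacing: C-X2-C-X4-H-X4-C ("." matches any residue).
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")


def find_cchc(sequence, offset):
    """Return (start, end, match) tuples in 1-based polyprotein coordinates."""
    hits = []
    for m in CCHC.finditer(sequence):
        start = offset + m.start()     # m.start() is 0-based within the chain
        end = offset + m.end() - 1     # m.end() is exclusive, so subtract 1
        hits.append((start, end, m.group()))
    return hits


knuckles = find_cchc(NC_SEQUENCE, NC_START)
```

Under that assumption the two matches span residues 398-411 and 419-432, which fall inside the annotated zinc-finger ranges 396-413 and 417-434.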
Domain
Reverse transcriptase/ribonuclease H
Reverse transcriptase/ribonuclease H
Integrase
Keywords
- Domain
Family and domain databases
Sequence & Isoform
- Sequence status: Complete
This entry describes 2 isoforms produced by ribosomal frameshifting. Translation yields the Gag polyprotein most of the time. Ribosomal frameshifting at the gag-pol gene boundary occurs at low frequency and produces the Gag-Pol polyprotein. This translation strategy probably allows the virus to modulate the quantity of each viral protein. Maintaining a correct Gag to Gag-Pol ratio is essential for RNA dimerization and viral infectivity.
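To make the reading-frame change concrete, here is a toy Python sketch of a -1 frameshift, the direction given in this entry's isoform note. The RNA string and the slip position are invented for illustration and are not the viral sequence. The function `read_codons` is a hypothetical helper that only splits the message into codons; it does not translate them.

```python
def read_codons(rna, slip_after=None):
    """Split `rna` into codons starting at position 0.

    If `slip_after` is given, the reader steps back one nucleotide
    (a -1 frameshift) once that many codons have been read.
    """
    codons = []
    pos = 0
    while pos + 3 <= len(rna):
        codons.append(rna[pos:pos + 3])
        pos += 3
        if slip_after is not None and len(codons) == slip_after:
            pos -= 1  # re-read the last nucleotide as the start of the next codon
    return codons


# Invented toy message, not taken from the HIV-1 genome.
TOY_RNA = "AUGGCUUUUUUAGGGAAGAUC"

in_frame = read_codons(TOY_RNA)               # no frameshift
shifted = read_codons(TOY_RNA, slip_after=4)  # -1 slip after the fourth codon
```

Both reads share their first four codons. After the slip, the last nucleotide of the fourth codon is read again as the first nucleotide of the fifth, so every downstream codon differs. In the same way, the two isoforms share their N-terminal region and diverge after the frameshift site.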
P04588-1
This isoform has been chosen as the canonical sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
- Name: Gag-Pol polyprotein
- Note: Produced by -1 ribosomal frameshifting.
- Length: 1,440
- Mass (Da): 162,122
- Last updated: 2007-01-23 v3
- Checksum: D212FABD311A9AB8
P04594-1
The sequence of this isoform can be found in the external entry linked below. Isoforms of the same protein are often annotated in two different entries if their sequences differ significantly.
- Name: Gag polyprotein
Keywords
- Coding sequence diversity
- Technical term