Protein Structure & Folding

Biochemistry · Proteins & Amino acids · lean revision notes

Proteins are linear polymers of L-α-amino acids whose biological function is dictated entirely by their three-dimensional shape. This topic links basic chemistry (bonds, forces) to classic clinical disease — scurvy, Ehlers-Danlos, prion disorders and amyloidoses — and is a recurring source of "one-liner" NEET PG questions.

The four levels of protein structure

Protein architecture is described at four hierarchical levels. Each higher level is built on, and constrained by, the one below it.

Level | Definition | Stabilising force(s) | Classic example
Primary | Linear sequence of amino acids joined by peptide (covalent) bonds | Peptide bonds (and disulphide bonds) | Insulin sequence (Sanger)
Secondary | Local regular folding: α-helix, β-pleated sheet, β-turns | Hydrogen bonds between backbone C=O and N–H | α-keratin (helix), β-keratin/fibroin (sheet)
Tertiary | Overall 3-D folding of a single polypeptide | Hydrophobic interactions (major), H-bonds, ionic/salt bridges, disulphide bonds, van der Waals | Myoglobin
Quaternary | Spatial arrangement of ≥2 subunits | Same non-covalent forces between subunits | Haemoglobin (α₂β₂), creatine kinase

High-yield: The peptide bond is the covalent bond that defines primary structure, while hydrogen bonds stabilise secondary structure. The single most important force driving tertiary folding is the hydrophobic interaction (burial of non-polar side chains away from water).

The peptide bond — exam favourite

  • Formed by condensation between the α-carboxyl of one amino acid and α-amino of the next, with loss of water.
  • It has partial double-bond character (resonance) → it is rigid and planar, and almost always in the trans configuration (cis only common before proline).
  • It is uncharged but polar, and does not rotate. Rotation occurs only about the two bonds flanking the α-carbon — the phi (φ, N–Cα) and psi (ψ, Cα–C) angles, plotted in the Ramachandran plot to show sterically allowed conformations.

High-yield: Allowed φ/ψ combinations are read off the Ramachandran plot. Glycine (smallest R-group) has the widest allowed area; proline is the most restricted (ring locks φ).
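Since φ and ψ are simply dihedral (torsion) angles, a short calculation makes the idea concrete. The sketch below is a minimal illustration, assuming atom positions are supplied as plain (x, y, z) tuples; the function name `dihedral` and the toy coordinates are hypothetical and not taken from any structure file. By the usual convention, φ is measured over C(i−1)–N–Cα–C and ψ over N–Cα–C–N(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    For phi pass C(i-1), N, CA, C; for psi pass N, CA, C, N(i+1).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)          # the central bond being rotated about
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)        # normal to the plane p0-p1-p2
    n2 = _cross(b2, b3)        # normal to the plane p1-p2-p3
    b2_len = math.sqrt(_dot(b2, b2))
    b2_unit = tuple(x / b2_len for x in b2)
    y = _dot(b2_unit, _cross(n1, n2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (illustrative only, not real atoms):
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))     # eclipsed
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))  # anti
```

A Ramachandran plot is then just a scatter of the (φ, ψ) pair computed this way for each residue.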

Secondary structure details

Feature | α-helix | β-pleated sheet
Shape | Right-handed coil | Extended, side-by-side strands
H-bond direction | Intra-chain, parallel to axis (residue n to n+4) | Inter-strand (can be intra-chain in hairpins)
Residues per turn | 3.6 | -
Helix breakers | Proline (no NH for H-bond, ring kink), glycine (too flexible) | -
Orientation | - | Parallel or antiparallel

  • Proline is a classic α-helix and β-sheet breaker — it lacks the amide hydrogen needed to donate an H-bond and its ring imposes a kink.
  • β-turns (reverse turns) frequently contain proline and glycine and reverse the direction of the polypeptide chain.
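The α-helix numbers above (3.6 residues per turn, H-bond from residue n to n+4) lend themselves to quick arithmetic. The following is a rough sketch only: the rise of 1.5 Å per residue is a standard textbook value that is not stated in these notes, and the function name `helix_summary` is made up for illustration.

```python
RESIDUES_PER_TURN = 3.6    # from the notes
RISE_PER_RESIDUE_A = 1.5   # assumption: textbook axial rise per residue, in angstroms

def helix_summary(n_residues):
    """Return (turns, length in angstroms, list of n -> n+4 H-bond pairs)."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE_A
    # The C=O of residue n pairs with the N-H of residue n+4 (1-based numbering).
    hbonds = [(n, n + 4) for n in range(1, n_residues - 3)]
    return turns, length, hbonds

turns, length, hbonds = helix_summary(18)
```

For an 18-residue helix this gives about five turns and 14 backbone H-bond pairs, the first being (1, 5). The last four C=O groups have no n+4 partner, which is one reason helix ends are less stable.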

Forces that stabilise the folded protein

Primary (covalent) → Peptide & disulphide bonds
Non-covalent (govern folding) → Hydrophobic > Hydrogen > Ionic (salt bridges) > Van der Waals

  • Disulphide bonds (–S–S–) form between two cysteine residues; abundant in secreted/extracellular proteins (e.g. insulin, immunoglobulins) where the oxidising environment favours them. Intracellular proteins have few because the cytosol is reducing.
  • Hydrophobic interactions drive the non-polar core formation and are the dominant determinant of the final fold.
  • Ionic bonds / salt bridges form between oppositely charged side chains (e.g. Lys–Glu).

High-yield: Insulin has three disulphide bonds — two inter-chain (linking A & B chains) and one intra-chain (within the A chain). The A & B chains arise by cleavage of a single proinsulin with removal of C-peptide.
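The "two inter-chain, one intra-chain" count can be checked mechanically. The sketch below assumes the standard textbook cysteine positions for human insulin (A6–A11, A7–B7, A20–B19), which are not given in these notes; the names `INSULIN_DISULPHIDES` and `classify_bridges` are hypothetical.

```python
# Each bridge is a pair of (chain, residue number) cysteines.
# Assumption: standard textbook positions for human insulin.
INSULIN_DISULPHIDES = [
    (("A", 6), ("A", 11)),
    (("A", 7), ("B", 7)),
    (("A", 20), ("B", 19)),
]

def classify_bridges(bridges):
    """Count disulphide bridges within one chain versus between chains."""
    counts = {"intra-chain": 0, "inter-chain": 0}
    for (chain1, _), (chain2, _) in bridges:
        kind = "intra-chain" if chain1 == chain2 else "inter-chain"
        counts[kind] += 1
    return counts

counts = classify_bridges(INSULIN_DISULPHIDES)
```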

Protein folding & chaperones

Anfinsen's classic ribonuclease experiment established that the primary sequence contains all information needed to fold (the "thermodynamic hypothesis") — denatured RNase refolds spontaneously to native, active form once denaturant is removed. In the crowded cell, however, many proteins need help to avoid aggregation.

  • Molecular chaperones bind exposed hydrophobic patches of nascent/partly folded chains, prevent premature aggregation, and give time for correct folding. They are not part of the final product and do not dictate the fold — they only facilitate it.
  • Heat-shock proteins (HSPs) are inducible chaperones (Hsp70, Hsp90, Hsp60) upregulated by stress.
  • Chaperonins (e.g. GroEL–GroES in bacteria; TRiC/CCT in eukaryotes) form a barrel that sequesters a folding intermediate in an isolated chamber — folding occurs at the expense of ATP.
  • Protein disulphide isomerase (PDI) and peptidyl-prolyl isomerase (PPI) are folding catalysts that shuffle disulphides and isomerise X-Pro bonds respectively.

Folding flow: Nascent chain → chaperone (Hsp70) binds hydrophobic regions → delivery to chaperonin (ATP-dependent) → native fold → release. Misfolded proteins → ubiquitin–proteasome degradation (ATP- and ubiquitin-dependent) or autophagy.

High-yield: The cell tags misfolded/short-lived proteins with ubiquitin for destruction by the 26S proteasome — an ATP-dependent, cytosolic, lysosome-independent pathway. (Lysosomal/autophagic degradation handles long-lived proteins and organelles.)

Protein misfolding diseases

When folding fails and proteins aggregate into insoluble, β-sheet–rich amyloid, disease results. Amyloid stains with Congo red, showing pathognomonic apple-green birefringence under polarised light, and shows a β-pleated sheet on X-ray diffraction.

Disease | Misfolded protein / deposit
Alzheimer disease | Amyloid-β (Aβ) plaques + hyperphosphorylated tau (neurofibrillary tangles)
Parkinson disease | α-synuclein (Lewy bodies)
Prion diseases (CJD, kuru) | PrP^Sc
Huntington disease | Mutant huntingtin (polyglutamine)
Type 2 DM (islets) | Amylin (IAPP)
Systemic AL amyloid | Immunoglobulin light chains
Systemic AA amyloid | Serum amyloid A (chronic inflammation)

Prion diseases — the protein-only hypothesis

  • A prion (PrP^Sc) is an infectious agent containing no nucleic acid — it is a misfolded form of the normal cellular protein PrP^C.
  • The conformational change is from α-helix–rich (PrP^C) to β-sheet–rich (PrP^Sc); PrP^Sc is protease-resistant and templates conversion of normal PrP^C — a self-propagating chain reaction.
  • Diseases: Creutzfeldt–Jakob disease (CJD), variant CJD (the human disease linked to bovine spongiform encephalopathy, "mad cow" disease), Kuru (ritual cannibalism, Fore tribe), Gerstmann–Sträussler–Scheinker syndrome, fatal familial insomnia.
  • Pathology: spongiform encephalopathy; clinically rapidly progressive dementia with myoclonus; EEG shows periodic sharp-wave complexes; CSF 14-3-3 protein and RT-QuIC aid diagnosis.

High-yield: Prions are infectious proteins with NO DNA/RNA; pathogenesis = α-helix → β-sheet conversion making the protein protease-resistant and aggregation-prone. Stanley Prusiner won the Nobel for this.

High-yield: Alzheimer = Aβ + tau; Parkinson/Lewy body = α-synuclein; Huntington = polyglutamine expansion. These are the most-tested protein-aggregation associations.

Collagen — structure, synthesis and disease

Collagen is the most abundant protein in the human body and a perennial exam topic because its synthesis intersects with vitamin C, copper and several named diseases.

Structure

  • A right-handed triple helix of three left-handed α-chains.
  • Characteristic repeating sequence Gly–X–Y, where X is frequently proline and Y is frequently hydroxyproline.
  • Glycine at every third residue is essential — its tiny side chain (just H) is the only one that fits in the crowded centre of the triple helix.

High-yield: Glycine is the most abundant amino acid in collagen (every 3rd residue). The repeating triplet is Gly–X–Y (often Gly–Pro–Hyp).
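The Gly–X–Y rule is easy to test on a sequence. This is a minimal sketch, assuming a one-letter amino-acid string that starts in frame on a glycine; the function name `glycine_breaks` and the example sequences are invented for illustration and are not real collagen sequences.

```python
def glycine_breaks(seq, frame=0):
    """Return 0-based positions where the Gly-X-Y repeat lacks its glycine.

    Every third residue, starting at `frame`, is expected to be G.
    """
    seq = seq.upper()
    return [i for i in range(frame, len(seq), 3) if seq[i] != "G"]

normal = glycine_breaks("GPPGPAGEPGPP")   # every triplet starts with G
mutant = glycine_breaks("GPPAPAGEP")      # second triplet starts with A
```

An empty list means the repeat is intact. A reported position marks a glycine substitution of the kind that disrupts triple-helix packing, as in osteogenesis imperfecta.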

Synthesis steps (intracellular → extracellular)

  1. Synthesis of preprocollagen (prepro-α chain) → cleavage of signal sequence → pro-α chain.
  2. Hydroxylation of proline and lysine residues — by prolyl hydroxylase and lysyl hydroxylase — requiring vitamin C (ascorbate), Fe²⁺, O₂ and α-ketoglutarate. (Intracellular, in rough ER.)
  3. Glycosylation of hydroxylysine.
  4. Triple-helix formation (procollagen) and secretion.
  5. Extracellular cleavage of propeptides by procollagen peptidases → tropocollagen.
  6. Cross-linking: oxidative deamination of lysine/hydroxylysine by lysyl oxidase (a copper-dependent enzyme) to form covalent cross-links → mature collagen fibril.

High-yield: Vitamin C is the cofactor for prolyl & lysyl hydroxylase (intracellular step). Lysyl oxidase needs copper and acts extracellularly for cross-linking.

Collagen-related diseases

Disease | Defect | Key clinical clue
Scurvy | Vitamin C deficiency → defective proline/lysine hydroxylation → weak triple helix | Bleeding gums, perifollicular haemorrhage, corkscrew hairs, poor wound healing, subperiosteal haemorrhage
Osteogenesis imperfecta | Defect in type I collagen (COL1A1/A2), often Gly substitution | Multiple fractures, blue sclerae, dental + hearing problems
Ehlers-Danlos syndrome | Defects in collagen processing (e.g. type III collagen, lysyl hydroxylase, procollagen peptidase) | Hyperextensible skin, hypermobile joints, vascular/organ rupture (vascular type)
Menkes disease | Copper deficiency (ATP7A defect) → ↓ lysyl oxidase → poor cross-linking | Kinky/steely hair, hypothermia, vascular tortuosity, neurodegeneration
Alport syndrome | Type IV collagen defect | Haematuria, sensorineural deafness, ocular changes

High-yield: Scurvy = failed hydroxylation (vit C) → unstable helix; Menkes = failed cross-linking (copper/lysyl oxidase). Distinguish carefully — both weaken collagen but at different steps.
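One way to keep the scurvy/Menkes distinction straight is to treat the synthesis steps as an ordered list and ask which step a missing cofactor blocks. The sketch below simply encodes the six steps listed earlier; the data structure and the function name `first_blocked_step` are illustrative assumptions, not a model of real kinetics.

```python
# (step, location, cofactor whose absence blocks the step)
COLLAGEN_STEPS = [
    ("signal sequence cleavage", "intracellular", None),
    ("hydroxylation of Pro/Lys", "intracellular", "vitamin C"),
    ("glycosylation of hydroxylysine", "intracellular", None),
    ("triple-helix formation and secretion", "intracellular", None),
    ("propeptide cleavage", "extracellular", None),
    ("cross-linking by lysyl oxidase", "extracellular", "copper"),
]

def first_blocked_step(missing_cofactor):
    """Return (step number, step, location) for the first step needing the cofactor."""
    for number, (step, location, cofactor) in enumerate(COLLAGEN_STEPS, start=1):
        if cofactor == missing_cofactor:
            return number, step, location
    return None  # no listed step depends on this cofactor

scurvy = first_blocked_step("vitamin C")   # step 2, intracellular
menkes = first_blocked_step("copper")      # step 6, extracellular
```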

Collagen types to memorise (mnemonic — "Be So Totally Cool, Read Books"):

  • Type I: Bone, skin, tendon (most common; defective in OI)
  • Type II: Cartilage (also vitreous, nucleus pulposus)
  • Type III: Reticular fibres, blood vessels, granulation tissue (Ehlers-Danlos vascular type)
  • Type IV: Basement membrane ("4 under the floor"; Alport, Goodpasture target)

Haemoglobin — the model of quaternary structure & cooperativity

  • HbA = α₂β₂ (four subunits, four haem groups) — classic quaternary protein.
  • Cooperative O₂ binding gives a sigmoidal dissociation curve (myoglobin, a monomer, is hyperbolic).
  • T (tense, deoxy) ↔ R (relaxed, oxy) states; binding of O₂ to one subunit eases binding to the next (allostery).
  • Right shift (↑ O₂ release): ↑ CO₂, ↑ H⁺ (↓ pH = Bohr effect), ↑ 2,3-BPG, ↑ temperature. Left shift: opposite, plus HbF, CO, methaemoglobin.

High-yield: Cooperativity and the sigmoid curve require quaternary structure — that is why monomeric myoglobin is hyperbolic and stores rather than transports O₂.
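The sigmoid-versus-hyperbolic contrast can be reproduced with the Hill equation, Y = pO₂ⁿ / (P50ⁿ + pO₂ⁿ). The sketch below is a simplified model rather than a full description of haemoglobin. The parameter values are assumed typical textbook figures that these notes do not state: a Hill coefficient of about 2.8 and P50 of about 26 mmHg for haemoglobin, and n = 1 with P50 of about 2 mmHg for myoglobin.

```python
def saturation(po2, p50, n):
    """Fractional O2 saturation from the Hill equation (po2 and p50 in mmHg)."""
    return po2 ** n / (p50 ** n + po2 ** n)

# Assumed typical values, for illustration only.
HB = {"p50": 26.0, "n": 2.8}   # cooperative tetramer: sigmoid curve
MB = {"p50": 2.0, "n": 1.0}    # monomer: hyperbolic curve

def unloading(protein, lung_po2=100.0, tissue_po2=40.0):
    """Fraction of bound O2 released between lung and tissue pO2."""
    return saturation(lung_po2, **protein) - saturation(tissue_po2, **protein)

hb_release = unloading(HB)
mb_release = unloading(MB)
```

Under these assumptions haemoglobin releases a far larger fraction of its oxygen between lung and tissue pO₂ than myoglobin does, which is the quantitative sense in which haemoglobin transports O₂ and myoglobin stores it.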

Denaturation

Loss of higher-order structure (secondary/tertiary/quaternary) without breaking peptide bonds — primary structure is preserved.

  • Agents: heat, strong acid/alkali, urea/guanidine (disrupt H-bonds), organic solvents, heavy metals, detergents (SDS), β-mercaptoethanol (reduces disulphides).
  • Usually causes loss of function and decreased solubility; may be reversible (Anfinsen) or irreversible.

Diagnosis & investigation of choice (lab angle)

  • Amyloid → Congo red stain → apple-green birefringence under polarised light (investigation of choice for tissue amyloid).
  • Prion disease → MRI DWI cortical ribboning, EEG periodic complexes, CSF RT-QuIC (high specificity), 14-3-3 protein; definitive = neuropathology (spongiform change).
  • Collagen/structural proteins → genetic testing (COL genes), skin biopsy for EDS subtyping.
  • Protein structure determination (academic): X-ray crystallography, cryo-EM, NMR; sequence by Edman degradation/mass spectrometry.

Management / treatment angles

  • Scurvy → oral vitamin C (ascorbic acid) replacement; dramatic, rapid response.
  • Menkes → parenteral copper-histidine (limited benefit if started late).
  • Prion disease → no curative treatment; supportive care; strict instrument sterilisation (prions resist routine autoclaving and need extended/alkaline protocols).
  • Alzheimer → symptomatic (cholinesterase inhibitors, memantine); newer anti-amyloid monoclonal antibodies (e.g. lecanemab, aducanumab) target Aβ.

Key differentials & comparisons

  • Scurvy vs Menkes: both weaken collagen — scurvy blocks hydroxylation (vit C), Menkes blocks cross-linking (Cu/lysyl oxidase).
  • Ehlers-Danlos vs Osteogenesis imperfecta: EDS = mainly type III/processing → skin & joint laxity; OI = type I → fragile bones, blue sclerae.
  • α-helix vs β-sheet: intra-chain vs inter-strand H-bonds; proline breaks both.
  • Myoglobin vs haemoglobin: monomer/hyperbolic/storage vs tetramer/sigmoid/transport.
  • PrP^C vs PrP^Sc: same sequence, different conformation (α vs β), protease-sensitive vs protease-resistant.

Recently asked / exam angle

  • Which bond stabilises secondary structure? → Hydrogen bond. (Peptide bond = primary.)
  • Force chiefly responsible for tertiary folding → hydrophobic interaction.
  • Amino acid that breaks the α-helix → proline (and glycine).
  • Most abundant amino acid in collagen → glycine; repeating unit → Gly-X-Y.
  • Enzyme needing vitamin C in collagen synthesis → prolyl/lysyl hydroxylase; enzyme needing copper → lysyl oxidase.
  • Prion pathogenesis → conformational change α-helix → β-sheet, no nucleic acid.
  • Protein in Alzheimer (Aβ/tau), Parkinson (α-synuclein), Huntington (polyglutamine).
  • Stain for amyloid → Congo red, apple-green birefringence.
  • Chaperone function & GroEL/Hsp70; degradation by ubiquitin-proteasome (ATP-dependent).
  • Number/type of disulphide bonds in insulin (3: two inter-chain, one intra-chain).

Rapid revision

  1. Peptide bond = covalent, planar, rigid, trans, partial double-bond; stabilises primary structure.
  2. Secondary structure stabilised by H-bonds (α-helix 3.6 residues/turn; β-sheet parallel/antiparallel).
  3. Proline breaks both α-helix and β-sheet; abundant in turns with glycine.
  4. Tertiary fold driven mainly by hydrophobic interactions; quaternary = ≥2 subunits.
  5. Ramachandran plot shows allowed φ/ψ angles; glycine most permissive, proline most restricted.
  6. Disulphide bonds join two cysteines; common in secreted proteins (insulin, Ig).
  7. Chaperones (Hsp70, GroEL) prevent aggregation, are ATP-dependent, and don't dictate the fold (Anfinsen: sequence does).
  8. Misfolded proteins are cleared by the ubiquitin–proteasome system (ATP-dependent, lysosome-independent).
  9. Prions = infectious proteins, no nucleic acid, α→β conversion, protease-resistant; cause spongiform encephalopathy.
  10. Collagen = triple helix, Gly-X-Y, glycine most abundant; vit C for hydroxylation, copper/lysyl oxidase for cross-linking.
  11. Scurvy (vit C↓) = failed hydroxylation; Menkes (Cu↓) = failed cross-linking; OI = type I; EDS = type III/processing.
  12. Amyloid = β-sheet, Congo red → apple-green birefringence; Alzheimer (Aβ+tau), Parkinson (α-synuclein), Huntington (polyQ).