Proteins are the most abundant biomolecules of the living cell and are built from α-amino acids. An α-amino acid carries both an amino group (–NH2) and a carboxyl group (–COOH) attached to the same (α) carbon, with the general formula R–CH(NH2)–COOH. The side chain R distinguishes the twenty standard amino acids.
Classification of amino acids. By the R group: neutral (equal numbers of amino and carboxyl groups, e.g. glycine, alanine), acidic (an extra –COOH, e.g. glutamic acid, aspartic acid) and basic (an extra basic nitrogen group, e.g. the –NH2 of lysine or the guanidino group of arginine). Nutritionally they are essential (must come from the diet, e.g. valine, leucine, lysine) or non-essential (the body can synthesise them, e.g. glycine, alanine). Except for glycine, all are chiral, and natural proteins use the L-forms.
Zwitterion. Amino acids exist not as neutral R–CH(NH2)–COOH but as internal salts: the proton moves from –COOH to –NH2, giving R–CH(NH3+)–COO−, a zwitterion (dipolar ion). This explains their high melting points, water solubility and amphoteric behaviour (they react as both acids and bases). In acidic solution the –COO− accepts H+ and the molecule becomes a cation; in basic solution the –NH3+ loses H+ and it becomes an anion.
Isoelectric point (pI). The pH at which the amino acid exists almost entirely as the zwitterion with zero net charge is the isoelectric point. At this pH it does not migrate in an electric field and its solubility is lowest. Each amino acid has a characteristic pI, which is the basis of separation by electrophoresis.
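The charge behaviour around the isoelectric point can be sketched numerically. The following is a minimal illustration, not a rigorous treatment: it assumes approximate textbook pKa values for glycine (2.34 for –COOH and 9.60 for –NH3+, neither of which is given in these notes), uses the Henderson-Hasselbalch relation for each ionisable group, and takes pI as the mean of the two pKa values, which holds only when the side chain is not ionisable. The function names are hypothetical.

```python
# Approximate textbook pKa values for glycine (assumed for illustration only).
PKA_CARBOXYL = 2.34  # -COOH  <->  -COO-  + H+
PKA_AMMONIUM = 9.60  # -NH3+  <->  -NH2   + H+


def net_charge(ph, pka_acid=PKA_CARBOXYL, pka_base=PKA_AMMONIUM):
    """Average net charge of an amino acid with a non-ionisable side chain."""
    # Fraction of carboxyl groups present as -COO- (each contributes -1).
    frac_coo_minus = 1.0 / (1.0 + 10 ** (pka_acid - ph))
    # Fraction of amino groups present as -NH3+ (each contributes +1).
    frac_nh3_plus = 1.0 / (1.0 + 10 ** (ph - pka_base))
    return frac_nh3_plus - frac_coo_minus


def isoelectric_point(pka_acid=PKA_CARBOXYL, pka_base=PKA_AMMONIUM):
    """pI as the mean of the two pKa values (valid for neutral side chains)."""
    return (pka_acid + pka_base) / 2.0


if __name__ == "__main__":
    pi = isoelectric_point()
    print(f"pI = {pi:.2f}")
    for ph in (1.0, pi, 12.0):
        print(f"pH {ph:5.2f}: net charge {net_charge(ph):+.3f}")
```

Under these assumed values the net charge is positive well below the pI (cation), essentially zero at the pI (zwitterion) and negative well above it (anion), which is why the molecule does not migrate in an electric field at its pI.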
Peptide bond. The –COOH of one amino acid condenses with the –NH2 of the next, eliminating water and forming an amide (–CO–NH–) link called a peptide bond. Two amino acids give a dipeptide, many give a polypeptide; a protein is a polypeptide with a definite three-dimensional shape.
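Because one water molecule is eliminated for every peptide bond formed, the molar mass of a peptide is the sum of the free amino acid masses minus 18.015 g/mol per bond. A minimal sketch of that bookkeeping follows, assuming rounded molar masses for glycine and alanine (values not given in these notes) and a hypothetical helper name.

```python
WATER = 18.015  # g/mol, lost once per peptide bond formed

# Rounded molar masses of the free amino acids in g/mol (assumed values).
AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09}


def peptide_mass(residues):
    """Molar mass of a linear peptide built from the listed amino acids."""
    if not residues:
        raise ValueError("a peptide needs at least one amino acid")
    # n amino acids in a linear chain are joined by n - 1 peptide bonds.
    n_bonds = len(residues) - 1
    return sum(AMINO_ACID_MASS[r] for r in residues) - n_bonds * WATER


if __name__ == "__main__":
    dipeptide = ["Gly", "Ala"]
    print(f"Gly-Ala: {peptide_mass(dipeptide):.2f} g/mol")
```

A dipeptide therefore contains one peptide bond and a tripeptide two; in general a linear chain of n amino acids contains n − 1.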
Structure of proteins is described at four levels. Primary structure is the exact sequence of amino acids in the chain; even a single substitution can alter function (e.g. sickle-cell haemoglobin, in which valine replaces glutamic acid at one position of the β-chain). Secondary structure is the local folding held by hydrogen bonds between backbone C=O and N–H groups, giving the right-handed α-helix (intramolecular H-bonds, as in keratin) or the pleated β-sheet (intermolecular H-bonds between stretched chains, as in silk fibroin). Tertiary structure is the overall 3-D folding of the whole chain (fibrous, e.g. keratin/collagen, or globular, e.g. insulin/enzymes), stabilised by H-bonds, disulphide (–S–S–) bridges, ionic interactions and van der Waals forces. Quaternary structure is the assembly of two or more polypeptide sub-units, e.g. haemoglobin's four chains.
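Since primary structure is simply an ordered sequence, a change in it can be located by comparing two sequences position by position. The sketch below is illustrative only: it uses the first eight residues of the normal and sickle-cell haemoglobin β-chains (sequence data not given in these notes), and the function name is hypothetical.

```python
# First eight residues of the haemoglobin beta-chain (illustrative excerpt).
NORMAL_BETA = ["Val", "His", "Leu", "Thr", "Pro", "Glu", "Glu", "Lys"]
SICKLE_BETA = ["Val", "His", "Leu", "Thr", "Pro", "Val", "Glu", "Lys"]


def substitutions(reference, variant):
    """Return (position, reference residue, variant residue) for each mismatch.

    Positions are 1-based, following the usual convention for numbering
    residues from the N-terminus.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


if __name__ == "__main__":
    for position, ref, var in substitutions(NORMAL_BETA, SICKLE_BETA):
        print(f"position {position}: {ref} -> {var}")
```

For these two excerpts the only difference is at position 6, where glutamic acid is replaced by valine; the secondary, tertiary and quaternary levels all build on this one sequence.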
Denaturation. Heat, strong acid/base, heavy-metal ions or alcohol break the H-bonds and other weak forces (but not peptide bonds). The secondary and tertiary structures unravel while the primary sequence stays intact, so the protein loses biological activity — e.g. the coagulation of egg white on boiling or the curdling of milk.
Enzymes are biological catalysts, almost all of them globular proteins, that speed up cellular reactions by lowering the activation energy, e.g. maltase, urease, pepsin. They are remarkably specific: each enzyme acts on one substrate or one type of reaction, a selectivity usually explained by the lock-and-key model (or its refinement, the induced-fit model). The substrate binds at the active site, the reaction occurs, and the product is released, regenerating the enzyme. Enzyme names usually end in -ase, although some older names such as pepsin do not.
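The effect of lowering the activation energy can be estimated with the Arrhenius equation, k = A·exp(−Ea/RT). The sketch below is a rough illustration under stated assumptions: the catalysed and uncatalysed reactions are taken to share the same pre-exponential factor A, the temperature is body temperature (310 K), and the two activation energies are hypothetical numbers chosen only to show the size of the effect, not data for any real enzyme.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def rate_enhancement(ea_uncatalysed, ea_catalysed, temperature_k=310.0):
    """Ratio k_catalysed / k_uncatalysed from the Arrhenius equation.

    Activation energies are in J/mol. The pre-exponential factor A is
    assumed to be the same for both pathways, so it cancels in the ratio.
    """
    return math.exp((ea_uncatalysed - ea_catalysed) / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical activation energies, for illustration only.
    factor = rate_enhancement(ea_uncatalysed=75_000.0, ea_catalysed=55_000.0)
    print(f"rate increases by a factor of about {factor:.0f}")
```

With these assumed numbers, a drop of 20 kJ/mol in activation energy speeds the reaction up by a factor of a few thousand, which shows why even a modest lowering of the barrier has a large effect on rate.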