The cullin-based E3 ubiquitin ligases are a superfamily of modular E3s that are responsible for the vast majority of eukaryotic protein ubiquitylation.
The SCF ubiquitin ligase, the best-studied cullin-based E3, is made up of a modular E3 core containing CUL1 and RBX1 (also called ROC1), and a substrate specificity module containing SKP1 and a member of the F-box family of proteins (Feldman et al. 1997; Skowyra et al. 1997; Cardozo and Pagano 2004).
The CUL1/RBX1 complex acts as a scaffold that brings the E2 ubiquitin-conjugating enzyme together with the substrate specificity module (Zheng et al. 2002).
The C Terminus Of CUL1
The C terminus of CUL1 interacts with RBX1, and the N terminus of CUL1 interacts with SKP1. F-box proteins interact with SKP1 via the F-box motif, a 40-amino acid sequence that was first discovered in budding yeast Cdc4p and human cyclin F (Bai et al. 1996).
Through additional protein interaction domains, F-box proteins also bind the substrates that are to be ubiquitylated.
Each member of the superfamily of SCF-like ubiquitin ligases that employ cullin proteins as a scaffold shares essential features of the SCF complex architecture.
All five known cullins (CUL1-5) have been shown to interact with RBX1 or RBX2, but they each employ unique specificity modules that share many structural and functional similarities with the SKP1/F-box protein module.
CUL2 and CUL5 are known to interact with the SKP1-like protein elongin C, which in turn interacts with F-box protein-like specificity factors called BC/SOCS-box proteins (Deshaies 1999; Guardavaccaro and Pagano 2003).
In addition, CUL3 interacts with the BTB/POZ family of proteins (Furukawa et al. 2003; Geyer et al. 2003; Pintard et al. 2003; Xu et al. 2003), which appear to combine the functions of SKP1 and the F-box protein into a single polypeptide.
The BTB domain displays structural relationships with SKP1 (Schulman et al. 2000; Xu et al. 2003).
DDB1/DDB2 and CSA, which are components of CUL4 complexes, appear to serve as substrate specificity modules (Groisman et al. 2003).
A cluster of similarity between the proteins Cdc4, β-TrCP, Met30, Scon2, and MD6 was first noticed by Kumar and Paietta in 1995; this cluster is now known as the F-box.
Not until 1996, when Bai et al. recognised that the F-box was a widespread motif required for protein-protein interaction, were the ramifications of the homology fully understood. Bai et al. named the motif the F-box because it is present in cyclin F.
Leucine-rich repeats (LRRs) and WD repeats are the most common motifs found in the carboxy-terminal half of F-box proteins, with the F-box motif itself typically located in the amino-terminal half.
The Human Genome Organization’s proposed naming scheme for human F-box proteins is based on the schemes proposed by Cenciarelli et al. and Winston et al.:
FBXL for an F-box protein with LRRs, FBXW for an F-box protein with WD repeats, and FBXO for an F-box protein with either a different motif or no other motif.
In mice, a similar naming scheme is used, but in other organisms, protein names do not currently indicate whether a protein contains an F-box.
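The naming rule above can be written as a small lookup. The following is a minimal sketch only: the function name `fbx_prefix` and the motif labels `"LRR"` and `"WD"` are hypothetical choices made for illustration, and the handling of a protein carrying both repeat types is an assumption, since the scheme as described here does not cover that case.

```python
def fbx_prefix(motifs):
    """Return the class prefix (FBXL, FBXW or FBXO) for an F-box protein.

    `motifs` lists the motifs found in the protein besides the F-box itself.
    The labels "LRR" and "WD" are illustrative, not an official vocabulary.
    """
    found = {m.upper() for m in motifs}
    has_lrr = "LRR" in found
    has_wd = "WD" in found
    if has_lrr and has_wd:
        # Assumption: the scheme described in the text does not say how to
        # classify this case, so the sketch refuses to guess.
        raise ValueError("both LRR and WD repeats present; class is ambiguous")
    if has_lrr:
        return "FBXL"
    if has_wd:
        return "FBXW"
    # Any other motif, or no other motif at all, falls into the FBXO class.
    return "FBXO"
```

For example, `fbx_prefix(["LRR"])` gives `"FBXL"`, while `fbx_prefix(["zinc finger"])` and `fbx_prefix([])` both give `"FBXO"`.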
Historical Progression Through Evolution
There are at least 38 F-box proteins in humans (see Table 1 and Additional data file 1), 326 predicted in Caenorhabditis elegans, 22 in Drosophila, and 11 in the completed Saccharomyces cerevisiae genome, but none in prokaryotes.
Many different types of secondary motifs, such as zinc fingers, cyclin domains, leucine zippers, RING fingers, tetratricopeptide (TPR) repeats, and proline-rich regions, are found within F-box proteins.
Since F-box motifs are found in such a wide variety of proteins, it is likely that they have been inserted into preexisting proteins on multiple occasions during eukaryotic evolution.
Some families of F-box proteins are more tightly constrained evolutionarily: all human FBXW and FBXL proteins have counterparts in C. elegans, with most also conserved in yeast, but only about half of the human FBXO class of proteins is conserved in nematodes or yeast.
Specific Structural Characteristics
The F-box motif is approximately 50 residues long. The consensus sequence (Figure 1) shows that few positions are strongly conserved.
Among the 234 F-box proteins used to create the consensus, 92% contain leucine or methionine at position 8, 92% contain proline at position 9, 86% contain isoleucine or valine at position 16, 81% contain leucine or methionine at position 20, and 92% contain serine or cysteine at position 32.
Because the consensus is so weak, it is prudent to use search algorithms to identify F-boxes, as visual inspection can be misleading.
At this time, the Prosite and Pfam databases contain the two most effective search algorithms.
It is recommended to search in both databases, as there are cases where one will give a significant score to an F-box in a given protein when the other does not.
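To make the weakness of the consensus concrete, the sketch below checks a candidate motif against the five best-conserved positions listed above. It is an illustration only and not a replacement for the Prosite or Pfam searches: it assumes the candidate has already been aligned to the consensus so that position 1 of the string is position 1 of the motif, and the names `CONSERVED`, `conserved_matches` and `match_count` are hypothetical.

```python
# Best-conserved positions (1-based, in the aligned motif) and the residues
# accepted at each one, taken from the frequencies quoted in the text.
CONSERVED = {8: "LM", 9: "P", 16: "IV", 20: "LM", 32: "SC"}


def conserved_matches(aligned_motif):
    """Map each conserved position to True if the motif has an accepted residue.

    Assumption: `aligned_motif` is a one-letter amino acid string already
    aligned to the consensus. Gaps ("-") and positions beyond the end of the
    string count as mismatches.
    """
    seq = aligned_motif.upper()
    result = {}
    for pos, allowed in CONSERVED.items():
        residue = seq[pos - 1] if pos <= len(seq) else "-"
        result[pos] = residue in allowed
    return result


def match_count(aligned_motif):
    """Count how many of the five conserved positions are satisfied."""
    return sum(conserved_matches(aligned_motif).values())
```

Since even the best-conserved positions are absent from roughly one in ten to one in five of the proteins used to build the consensus, a low count from this check does not rule out a genuine F-box, which is why the profile-based database searches remain the method of choice.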