Colorectal Cancer

Colorectal cancer (CRC) is a significant global health concern, ranking as the third most common cancer and the fourth-leading cause of cancer death worldwide. In Western European and North American populations, the lifetime risk of developing CRC is approximately 5%. The development of CRC is influenced by a complex interplay of both genetic and environmental factors, with inherited genetic factors contributing to about one-third of the disease variance [1].

Historically, the understanding of genetic predisposition to CRC focused on rare, highly penetrant variants in a limited number of genes, including DNA mismatch repair (MMR) genes, APC, SMAD4, BMPR1A, and MUTYH [2]. However, recent advancements in genome-wide association studies (GWAS) have revealed that common genetic variations, or single nucleotide polymorphisms (SNPs), also play a role in CRC risk. These common variants typically exert modest individual effects, but much larger risks are observed in carriers of multiple risk alleles.

Studies have identified several susceptibility loci associated with CRC risk. These include regions such as 8q24 [3], 18q21 (involving the SMAD7 gene), 14q22.2 (near BMP4), 16q22.1 (near CDH1), 19q13.1 (near RHPN2), and 20p12.3 [4]. Some minor alleles are associated with an increased risk in a dose-dependent manner, meaning the risk is higher for homozygous carriers compared to heterozygous carriers. Conversely, other minor alleles may be associated with a decreased risk.

The identification of these common genetic variants holds substantial clinical and public health relevance. By understanding the genetic landscape of CRC risk, it may be possible to identify individuals at higher risk, enabling more personalized screening strategies and early intervention. Further research is ongoing to characterize the functional consequences of these genetic variations and their relationship to the development of colorectal cancer, considering factors such as tumor site (colon/rectum), microsatellite instability (MSI) status, family history, gender, and age at diagnosis.

Understanding the genetic underpinnings of colorectal cancer is a complex endeavor, and current research, while highly informative, operates within several inherent limitations. These constraints can influence the scope of discoveries, the precision of risk estimation, and the generalizability of findings to diverse populations.

Methodological and Statistical Constraints

The power of genome-wide association (GWA) studies is inherently limited, particularly for detecting alleles with small effect sizes or minor allele frequencies (MAFs) below 0.1. While current studies are well powered to detect common loci conferring relative risks of 1.2 or greater, that power does not extend to the potentially numerous variants with more modest effects or lower frequencies, which likely remain undiscovered [3]. Furthermore, GWA-based strategies are not optimally configured to identify low-frequency variants that may exert stronger effects, nor are the arrays ideally formatted to capture copy number variants; both could significantly affect colorectal cancer risk.

Another constraint lies in the coverage of genetic variation. The tagging SNPs typically used in GWA studies capture approximately 80% of common SNPs in European populations, but this efficiency drops sharply for less common alleles: only about 12% of SNPs with MAFs of 5–10% are adequately tagged [3]. This limitation restricts the ability to detect susceptibility alleles within this frequency range. Additionally, the stringent thresholds needed to establish statistical significance, combined with financial constraints on the number of variants that can be followed up, can lead to genuinely associated variants being omitted; for instance, a previously robustly associated variant was not captured in a meta-analysis because its P-value did not pass the significance threshold [3].

Genetic Complexity and Remaining Heritability Gaps

A significant proportion of the inherited risk for colorectal cancer remains unexplained by currently identified genetic factors. While inherited susceptibility contributes to approximately 35% of all colorectal cancer cases, known high-risk germline mutations in genes such as APC, mismatch repair genes, MUTYH, SMAD4, BMPR1A, and STK11/LKB1 account for less than 6% of cases. This substantial gap points to considerable “missing heritability,” where common, low-risk variants identified through GWA studies only partially account for the inherited predisposition.

The estimated contribution of identified loci to excess familial risk is likely conservative, as the true effect of a causal variant can often be larger than the association detected through a tag SNP. It is also probable that multiple causal variants, including low-frequency variants with potentially larger effects, exist at each locus, contributing to the overall colorectal cancer risk. The complex allelic architecture, with a spectrum of common and rare alleles contributing to risk, suggests that many low-penetrance variants are yet to be discovered.

Generalizability and Phenotypic Specificity

The generalizability of findings from current studies is primarily limited to populations of European ancestry, as study participants have often been exclusively of European descent and from regions like the UK [3]. This raises important questions about the applicability of these results to non-European populations, which may exhibit different genetic risk profiles and varying prevalences of colorectal cancer. Future research is essential to explore how these findings translate across diverse ancestral backgrounds.

Furthermore, the ascertainment criteria for study participants often introduce specific phenotypic biases. Cases are frequently selected based on factors such as early-onset colorectal cancer, the presence of multiple adenomas, or large/aggressive adenomas, often with a family history of colorectal neoplasia. Controls are typically chosen as unaffected spouses or partners without a personal or family history of colorectal neoplasia [3]. These highly specific inclusion and exclusion criteria mean that the findings may not be fully representative of the broader colorectal cancer patient population or the general population’s risk factors.

Several genetic variants across multiple genes and non-coding RNA loci are associated with colorectal cancer risk, influencing key biological pathways that drive tumor development and progression. These variants often reside in regulatory regions, impacting gene expression, protein function, or cellular signaling, and collectively contribute to an individual’s susceptibility to the disease.

The 8q24 chromosomal region is a well-established locus strongly linked to colorectal cancer, featuring several important genes and non-coding RNAs. Variants such as rs6983267, rs7013278, rs12682374, and rs4871022 are found within or near genes like CASC8 (Colon Adenocarcinoma Associated Transcript 8), CCAT2 (Colon Cancer Associated Transcript 2), POU5F1B, and PCAT1 (Prostate Cancer Associated Transcript 1). CASC8 and CCAT2 are long non-coding RNAs (lncRNAs) known to regulate cell proliferation, migration, and apoptosis, processes fundamental to tumor growth and spread. POU5F1B is a pseudogene related to pluripotency, while PCAT1 is another lncRNA implicated in various cancers. These variants are believed to modulate the expression or activity of these genes and lncRNAs, thereby influencing the oncogenic pathways they control and contributing to colorectal cancer susceptibility.

The SMAD7 gene, located on chromosome 18q21, is a crucial negative regulator of the transforming growth factor-beta (TGF-β) signaling pathway, which governs cell growth, differentiation, and programmed cell death. Variants in SMAD7, including rs2337113, rs11874392, and rs7226855, can alter its expression or function, potentially weakening its inhibitory effect on TGF-β signaling. This dysregulation can lead to uncontrolled cell proliferation and contribute to colorectal cancer development. In other genomic regions, the RNA5SP299 - LINC02676 locus contains variants like rs11255841, rs11255815, and rs7894531, while the LINC00536 - EIF3H region includes rs16892766, rs2437844, and rs2450115. LINC00536 and LINC02676 are lncRNAs, and EIF3H encodes a subunit of the eukaryotic translation initiation factor 3, a complex frequently overactive in cancer, promoting the translation of oncogenic proteins. Variants in these regions can affect the stability or expression of these non-coding RNAs or protein-coding genes, impacting cellular processes vital for cancer progression.

Further variants influence essential cellular functions such as cell adhesion, cytoskeletal dynamics, and immune regulation. For instance, LAMA5 (Laminin Subunit Alpha 5) encodes a vital component of the basement membrane, critical for cell-matrix interactions, adhesion, and migration; variants like rs1741640, rs493809, and rs4925386 can alter these interactions, influencing tumor invasion and metastasis. RHPN2 (Rhophilin 2) interacts with Rho GTPases to regulate the actin cytoskeleton, affecting cell shape, movement, and adhesion, making its variants rs28840750, rs73039433, and rs73039434 relevant to cellular motility and invasiveness. POU2AF2 (POU Class 2 Homeobox Associating Factor 2) is a transcriptional co-activator involved in immune responses; its variants rs3087967 and rs7130173 may influence immune surveillance or inflammatory pathways, both crucial in colorectal carcinogenesis. Additionally, the SCG5 - GREM1-AS1 locus, with variants such as rs1554865, rs58658771, and rs16970016, encompasses GREM1-AS1, an antisense lncRNA that regulates GREM1, an antagonist of bone morphogenetic protein (BMP) signaling, which is often dysregulated in colorectal cancer. Lastly, the CASC20 - LINC01713 region, including rs6140071, rs4813802, and rs6085662, represents another complex locus where non-coding RNAs may play regulatory roles in oncogenic pathways.

| RS ID | Gene | Related Traits |
| --- | --- | --- |
| rs2337113, rs11874392, rs7226855 | SMAD7 | erythrocyte volume; colorectal cancer |
| rs6983267 | CASC8, CCAT2, POU5F1B, PCAT1 | prostate carcinoma; colorectal cancer; cancer; polyp of colon; rectum cancer |
| rs7013278, rs12682374, rs4871022 | CASC8, POU5F1B, PCAT1 | colorectal cancer |
| rs11255841, rs11255815, rs7894531 | RNA5SP299 - LINC02676 | colorectal cancer |
| rs3087967, rs7130173 | POU2AF2 | colorectal cancer; rectum cancer; peptide YY measurement; polyp of colon |
| rs16892766, rs2437844, rs2450115 | LINC00536 - EIF3H | colorectal cancer; AGRP/NPY protein level ratio in blood; rectum cancer; benign colon neoplasm; colon carcinoma |
| rs1741640, rs493809, rs4925386 | LAMA5 | colorectal cancer; colorectal carcinoma |
| rs1554865, rs58658771, rs16970016 | SCG5 - GREM1-AS1 | colorectal cancer |
| rs6140071, rs4813802, rs6085662 | CASC20 - LINC01713 | colorectal cancer; body height |
| rs28840750, rs73039433, rs73039434 | RHPN2 | colorectal cancer; alkaline phosphatase measurement; optic disc size trait; bone tissue density; protein-glutamine gamma-glutamyltransferase E measurement |

Classification, Definition, and Terminology

Colorectal cancer (CRC) is defined as a malignant neoplasm originating in the colon or rectum. Within scientific studies, CRC cases are identified according to the ninth revision of the International Classification of Diseases (ICD-9) using codes 153–154 [5]. All cases included in these studies had pathologically proven adenocarcinoma [6]. Colorectal cancer is considered a complex trait, with research identifying specific genetic susceptibility loci and common genetic risk factors [6]. The condition is also broadly referred to as “colorectal neoplasia” [6].

The classification and study of colorectal cancer incorporate several key factors and related terms:

  • Site: CRC is classified based on its primary location within the digestive tract, distinguishing between cancer of the colon and cancer of the rectum [6].
  • MSI Status: Microsatellite instability (MSI) status is a molecular characteristic used to classify CRC, indicating specific genetic alterations within the tumor [6].
  • Family History: The presence of a family history of CRC is a significant classification factor. This is typically defined as having at least one first-degree relative diagnosed with colorectal cancer [6].
  • Age at Diagnosis: The age at which an individual is diagnosed with CRC is an important variable. Studies often stratify cases into groups based on the median age at diagnosis [6].
    • Early Age at Onset: A specific criterion, such as diagnosis at age 55 or younger, is sometimes used to identify cases that may have a stronger genetic predisposition [6].
  • Gender: Differences in the prevalence, risk, and characteristics of CRC are often examined in relation to gender [6].
  • Colorectal Adenoma: This term refers to a non-cancerous growth or polyp in the colon or rectum that has the potential to develop into cancer. Specific criteria for adenomas considered in studies include [6]:
    • Any colorectal adenoma diagnosed at age 45 or younger.
    • Three or more colorectal adenomas diagnosed at age 75 or younger.
    • A large (greater than 1 cm in diameter) or aggressive (villous and/or severely dysplastic) adenoma diagnosed at age 75 or younger.
  • Exclusion Criteria: To focus on common genetic variants, certain hereditary conditions that significantly increase CRC risk are often excluded from study populations. These include known dominant polyposis syndromes, hereditary nonpolyposis colorectal carcinoma (also known as Lynch syndrome), and individuals carrying bi-allelic MUTYH mutations [6].
  • Ancestry: Research studies have primarily focused on populations of European ancestry. It has been noted that findings may require further investigation to determine their applicability to non-European populations, which may exhibit different prevalences of CRC [6].

Colorectal cancer (CRC) is characterized by pathologically proven adenocarcinoma. For research purposes, it has been primarily defined according to the International Classification of Diseases, ninth revision (ICD-9) by codes 153–154 [7].

Typical presentations observed in study participants included specific phenotypes related to colorectal neoplasia. Identification criteria often involved a family history of CRC, such as at least one first-degree relative affected by CRC. Additionally, cases were identified by any of the following phenotypes: CRC diagnosed at age 75 or less; any colorectal adenoma at age 45 or less; three or more colorectal adenomas at age 75 or less; or a large (>1 cm diameter) or aggressive (villous and/or severely dysplastic) adenoma at age 75 or less [6].
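The phenotype criteria above amount to a small rule set, which can be sketched in code. This is a minimal illustration only: the `Adenoma` record, its field names, and the `meets_case_phenotype` function are hypothetical and do not come from the cited studies. It also assumes that family history and the exclusion criteria are checked separately.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Adenoma:
    """Hypothetical record for one colorectal adenoma (field names are illustrative)."""
    age_at_diagnosis: int
    diameter_cm: float
    villous: bool = False
    severely_dysplastic: bool = False


def meets_case_phenotype(crc_age: Optional[int], adenomas: List[Adenoma]) -> bool:
    """Return True if any one of the four phenotype criteria is satisfied."""
    # Criterion 1: CRC diagnosed at age 75 or less.
    if crc_age is not None and crc_age <= 75:
        return True
    # Criterion 2: any colorectal adenoma at age 45 or less.
    if any(a.age_at_diagnosis <= 45 for a in adenomas):
        return True
    # Criteria 3 and 4 only consider adenomas diagnosed at age 75 or less.
    by_75 = [a for a in adenomas if a.age_at_diagnosis <= 75]
    # Criterion 3: three or more adenomas at age 75 or less.
    if len(by_75) >= 3:
        return True
    # Criterion 4: a large (>1 cm) or aggressive adenoma at age 75 or less.
    return any(
        a.diameter_cm > 1.0 or a.villous or a.severely_dysplastic for a in by_75
    )
```

For example, a single 0.5 cm non-villous adenoma found at age 60 would not qualify on its own, whereas the same adenoma found at age 40 would.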

The diagnosis of colorectal cancer in studies was confirmed pathologically [6]. Beyond pathological confirmation, specific molecular characteristics were assessed:

  • Microsatellite Instability (MSI): The MSI status of colorectal tumors was determined by analyzing DNA extracted from formalin-fixed paraffin-embedded tumor sections. Researchers microdissected regions containing at least 60% tumor cells. Tumor DNA was then genotyped for the mononucleotide microsatellite loci BAT25 and BAT26. Samples showing novel alleles at either BAT25 or BAT26, or both, were classified as having MSI [6].
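The two-marker MSI call described above reduces to a logical OR, sketched below. The function name and boolean inputs are hypothetical. Labelling every other sample "MSS" (microsatellite stable) is an assumption of this sketch, since the text only states when a sample is called MSI.

```python
def classify_msi(novel_allele_bat25: bool, novel_allele_bat26: bool) -> str:
    """Classify a tumor sample from its BAT25 and BAT26 genotyping results.

    A novel allele at either marker (or both) gives "MSI".
    Returning "MSS" otherwise is an assumption of this sketch.
    """
    if novel_allele_bat25 or novel_allele_bat26:
        return "MSI"
    return "MSS"
```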

The presentation and characteristics of colorectal cancer can vary across individuals and populations:

  • Age at Diagnosis: The mean age at diagnosis for CRC cases in various study cohorts has been observed to be around 59 to 61 years, with standard deviations indicating variability around this mean [6].
  • Tumor Site: Research has investigated differences in risk between colon cancer and rectal cancer, suggesting potential variability based on the tumor’s anatomical location [6].
  • Gender: For certain genetic susceptibility alleles, such as rs9929218, the allele associated with increased risk has been found to be more common in females than in males [6].
  • Microsatellite Instability (MSI) Status: Genetic associations with CRC can differ based on the tumor’s MSI status. For instance, the association between rs4444235 (BMP4) and CRC has been noted to be significantly stronger in cases with microsatellite stable tumors compared to those with microsatellite instability [6].
  • Ancestry: The studies primarily involved participants of European ancestry. It is recognized that findings may need further assessment to understand their applicability to non-European populations, which can exhibit considerably lower prevalence of CRC [6].
  • Family History: A familial component contributes to the genetic susceptibility of CRC, with a recognized increase in risk for individuals with affected first-degree relatives [6].

Colorectal cancer (CRC) is a complex disease influenced by both genetic and environmental factors. Approximately one-third of the disease variance is attributed to inherited genetic factors [1].

Genetic contributions to colorectal cancer risk can be broadly categorized into rare, high-penetrance variants and more common genetic variations.

Historically, a significant genetic contribution to CRC has been linked to rare, highly penetrant mutations in a few specific genes. These include DNA mismatch repair (MMR) genes [8], APC, SMAD4, BMPR1A, and MUTYH [9]. Mutations in these genes are associated with substantially increased lifetime risk.

More recently, genome-wide association studies (GWAS) have identified common genetic variations that contribute modestly to CRC risk. These include susceptibility loci initially found in the 8q24 region [3] and the 18q21 region (involving SMAD7) [4].

Further large-scale meta-analyses of GWAS data have revealed additional common susceptibility loci [4]. These include regions at:

  • 14q22.2 (near the BMP4 gene) [4]
  • 16q22.1 (near the CDH1 gene) [4]
  • 19q13.1 (near the RHPN2 gene) [4]
  • 20p12.3 [4]

While individual common alleles typically exert only small effects on risk, carrying multiple such risk alleles can lead to a significantly larger overall risk [4]. It is thought that these genetic associations may arise from regulatory sequence variants or position effects, rather than changes in protein-coding sequences [4].
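To illustrate how several small per-allele effects can add up, the sketch below combines them under a log-additive (multiplicative) model. That model is an assumption made for illustration, not a method taken from the cited studies. The SNP labels and odds ratios in `HYPOTHETICAL_OR` are invented placeholders, not published estimates.

```python
import math
from typing import Dict

# Hypothetical per-allele odds ratios (placeholders, not published values).
HYPOTHETICAL_OR: Dict[str, float] = {"snp_a": 1.10, "snp_b": 1.15, "snp_c": 1.20}


def combined_odds_ratio(
    per_allele_or: Dict[str, float], risk_allele_counts: Dict[str, int]
) -> float:
    """Combine per-allele odds ratios, assuming effects multiply across alleles.

    risk_allele_counts maps each SNP to 0, 1, or 2 copies of the risk allele.
    """
    log_or = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"risk allele count for {snp} must be 0, 1, or 2")
        # Each copy of the risk allele adds log(OR) on the log-odds scale.
        log_or += count * math.log(per_allele_or[snp])
    return math.exp(log_or)
```

Under these placeholder values, a person homozygous at `snp_a` and heterozygous at `snp_b` would have a combined odds ratio of 1.10 × 1.10 × 1.15 ≈ 1.39 relative to someone carrying no risk alleles. No single allele in this example exceeds 1.20.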

Environmental factors also play a role in the development of colorectal cancer [1]. While the specific environmental factors are not detailed, their contribution alongside genetic factors highlights the multifactorial nature of the disease.

Colorectal cancer (CRC) is a common malignancy, ranking as the third most frequent cancer and the fourth-leading cause of cancer death globally. In Western European and North American populations, the lifetime risk of developing CRC is approximately 5%. The development of CRC is influenced by both genetic and environmental factors, with about one-third of the disease variance attributed to inherited genetic factors [1].

Historically, the known genetic basis for CRC primarily involved rare, highly penetrant variants in specific genes. These include:

  • DNA mismatch repair (MMR) genes: Mutations in these genes are a defined genetic contribution to CRC and are associated with familial colorectal cancer risk [8].
  • APC: High-risk germline mutations in this gene contribute to inherited susceptibility [9].
  • SMAD4: This gene is also linked to high-risk germline mutations in CRC [9].
  • BMPR1A: High-risk germline mutations in this gene are recognized contributors to CRC susceptibility [9].
  • MUTYH (MYH): Germline defects in this base-excision repair gene lead to susceptibility to CRC [10]. An association between MUTYH and colorectal cancer has been observed [9].

More recently, research has identified common genetic variations that also play a role in CRC risk. These include regions such as 8q24 [3] and 18q21 (involving SMAD7) [9]. Further large-scale genetic studies have identified four additional susceptibility loci:

  • 14q22.2 (rs4444235, near BMP4) [9].
  • 16q22.1 (rs9929218, within CDH1) [9].
  • 19q13.1 (rs10411210, near RHPN2) [9].
  • 20p12.3 (rs961253) [9].

Several molecular and cellular pathways are critically involved in the development and progression of colorectal cancer:

  • DNA Repair Mechanisms: Genes like DNA mismatch repair genes and MUTYH are crucial for maintaining genomic integrity. Defects in these genes can lead to an accumulation of mutations, contributing to cancer development[10].
  • Wnt–β-catenin Signaling Pathway: Aberrant activation of this pathway is considered an initiating event in the development of CRC [9].
  • CDH1 (E-cadherin): This gene, located at the 16q22.1 locus, has an established role in CRC. Somatic inactivation of CDH1, through mutation or promoter methylation, occurs frequently in CRC. This inactivation leads to increased activity of the β-catenin–TCF transcription factor, a key component of the Wnt pathway. CDH1 is essential for adherens junction formation, and its downregulation promotes invasiveness in several cancers, including CRC, by disrupting these junctions [9].

The identification of genetic susceptibility loci for colorectal cancer (CRC) holds significant clinical relevance for understanding disease risk and progression. Although individual genetic variants (alleles) typically have small effects, carrying multiple risk alleles can substantially increase an individual’s overall risk of developing CRC [6]. This cumulative effect underscores the potential public health importance of these identified susceptibility loci, especially as further genetic factors are discovered [6].

Genotype-phenotype correlations provide insights into the diverse characteristics of CRC. For instance, the association between the rs4444235 variant, located near the BMP4 gene, and CRC risk was found to be notably stronger in cases with microsatellite stable (MSS) tumors compared to those with microsatellite instability (MSI) [6]. Another observed correlation indicates that the susceptibility allele for rs9929218 is more prevalent in females than in males [6]. These findings contribute to a deeper understanding of the genetic architecture underlying CRC etiology [6].

Further research is essential to fully characterize the genetic variation at these loci and to determine the functional consequences that drive CRC development [6]. Such advancements could pave the way for more personalized risk assessments, targeted prevention strategies, and potentially influence treatment approaches in the future.

Frequently Asked Questions About Colorectal Cancer

These questions address the most important and specific aspects of colorectal cancer based on current genetic research.


No, not necessarily. While inherited genetic factors contribute to about one-third of colorectal cancer risk, having a family history doesn’t guarantee you’ll develop it. Many factors, both genetic and environmental, play a role. However, it does mean you might be at a higher risk, making personalized screening important.

Yes, you can. Colorectal cancer development is a complex interplay of both your genetics and your environment. While a healthy lifestyle can certainly lower your risk, inherited genetic factors contribute significantly to the disease, meaning even very healthy individuals can still be susceptible.

It’s possible. Most of the current research on genetic risk factors, especially from large-scale studies, has primarily involved people of European descent. This means that the genetic risk profiles for non-European populations might be different, and more research is needed to fully understand how ancestry impacts your specific risk.

Yes, it can be very useful, especially if you have a family history or other risk factors. Genetic testing can identify specific common genetic variations or rarer, highly penetrant mutations in genes like APC or MMR genes that increase your risk. This information can help your doctor recommend personalized screening strategies and early interventions.

It’s complex because even within families, genetic inheritance and environmental exposures vary. While you share many genes, you might not have inherited the same combination of common risk variants or rare, high-risk mutations (like in APC or MUTYH) as your sibling. Lifestyle differences also play a significant role in who develops the disease.

Historically, we focused on rare, highly penetrant gene mutations, like those in APC or the MMR genes; MMR gene mutations cause conditions like Lynch syndrome. However, these account for less than 6% of all colorectal cancer cases. We now know that common genetic variations (SNPs) with modest individual effects also play a significant role, especially when you carry several of them.

7. Does having more genetic risks make my chance higher?

Yes, absolutely. Research shows that while individual common genetic variants (SNPs) might only have a modest effect on your risk, carrying multiple of these risk alleles significantly increases your overall chance of developing colorectal cancer. The risk can even be dose-dependent, meaning homozygous carriers have a higher risk than heterozygous carriers.

Yes, if colon cancer runs in your family, it’s generally recommended to discuss earlier or more frequent screening with your doctor. Identifying individuals at higher genetic risk, whether from rare mutations or multiple common variants, allows for more personalized screening strategies and can lead to earlier detection and intervention.

Despite significant progress, a substantial portion of the inherited risk for colorectal cancer, known as “missing heritability,” remains unexplained. Current genetic studies are excellent at finding common variants with moderate effects, but they often miss rarer variants with stronger effects or many low-penetrance variants that collectively contribute to risk.

Yes, your daily habits can definitely influence your overall risk, even with an inherited predisposition. Colorectal cancer is influenced by a complex interplay of both your genetic makeup and environmental factors. While you can’t change your genes, adopting a healthy lifestyle can help mitigate some of the inherited risk and is a crucial part of prevention.


This FAQ was automatically generated based on current genetic research and may be updated as new information becomes available.

Disclaimer: This information is for educational purposes only and should not be used as a substitute for professional medical advice. Always consult with a healthcare provider for personalized medical guidance.

[1] Lichtenstein P, et al. “Environmental and heritable factors in the causation of cancer—analyses of cohorts of twins from Sweden, Denmark, and Finland.” N. Engl. J. Med., vol. 343, 2000, pp. 78–85.

[2] Aaltonen L, et al. “Explaining the familial colorectal cancer risk associated with mismatch repair (MMR)-deficient and MMR-stable tumors.” Clin. Cancer Res., vol. 13, 2007, pp. 356–361.

[3] Tomlinson I, et al. “A Genome-Wide Association Scan of Tag SNPs Identifies a Susceptibility Variant for Colorectal Cancer at 8q24.21.” Nat. Genet., vol. 39, 2007, pp. 984–988.

[4] COGENT Study. “Meta-analysis of Genome-Wide Association Data Identifies Four New Susceptibility Loci for Colorectal Cancer.” Nat. Genet., 2010.

[5] International Classification of Diseases, Ninth Revision.

[6] Houlston RS, et al. “Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24.” Nat. Genet., vol. 39, 2007, pp. 989–994.

[7] World Health Organization. International Classification of Diseases, Ninth Revision.

[8] Barnetson RA, et al. “Identification and Survival of Carriers of Mutations in DNA Mismatch-Repair Genes in Colon Cancer.” N. Engl. J. Med., vol. 354, 2006, pp. 2751–2763.

[9] Tenesa A, et al. “Association of MUTYH and Colorectal Cancer.” Br. J. Cancer, vol. 95, 2006, pp. 239–242.

[10] Farrington SM, et al. “Germline susceptibility to colorectal cancer due to base-excision repair gene defects.” Am. J. Hum. Genet., vol. 77, 2005, pp. 112–119.