CpG islands are short genomic regions that contain a high concentration of the two nucleotides cytosine and guanine. The letter ‘p’ in CpG denotes the phosphodiester bond that links the nucleotides C and G. CpG islands were first identified by Tykocinski and Max as small regions of the genome that contain recognition sites for the restriction enzyme HpaII, and were thus originally called HpaII Tiny Fragment (HTF) islands.
A formal definition of CpG islands was first offered by Gardiner-Garden and Frommer (GGF) in 1987. Their definition sets three criteria for a candidate region: its length must exceed 200 bp, its G+C content must be higher than 50%, and its observed/expected (O/E) CpG ratio must surpass 0.6. Takai and Jones tightened the GGF definition in 2002. Their modified definition requires a minimum length of 500 bp, a G+C content of at least 55%, and an O/E ratio of at least 0.65.
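The two sets of criteria can be illustrated with a minimal Python sketch. It is not the method proposed in this study; it only checks whether a single candidate region meets the thresholds quoted above. The O/E ratio is computed in the commonly used form (number of CpG × N) / (number of C × number of G), where N is the region length; this formula is not spelled out in the text and is stated here as an assumption. The names `Criteria`, `gc_content`, `cpg_oe_ratio`, and `satisfies` are hypothetical, as is the handling of values that fall exactly on a threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """Thresholds for a candidate CpG island (hypothetical helper type)."""
    min_length: int   # region length in bp
    min_gc: float     # G+C content as a fraction (0.55 means 55%)
    min_oe: float     # observed/expected CpG ratio
    strict: bool      # True: values must exceed thresholds; False: may equal them


# Thresholds as quoted in the text; the boundary handling is an assumption.
GGF = Criteria(min_length=200, min_gc=0.50, min_oe=0.60, strict=True)
TAKAI_JONES = Criteria(min_length=500, min_gc=0.55, min_oe=0.65, strict=False)


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_oe_ratio(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G) (assumed formula)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # the ratio is undefined without both bases; treated as 0 here
    return seq.count("CG") * len(seq) / (c * g)


def satisfies(seq: str, crit: Criteria) -> bool:
    """True if seq meets all three thresholds of crit."""
    values = (len(seq), gc_content(seq), cpg_oe_ratio(seq))
    thresholds = (crit.min_length, crit.min_gc, crit.min_oe)
    if crit.strict:
        return all(v > t for v, t in zip(values, thresholds))
    return all(v >= t for v, t in zip(values, thresholds))


if __name__ == "__main__":
    region = "CG" * 150  # artificial 300 bp region, for illustration only
    print(satisfies(region, GGF), satisfies(region, TAKAI_JONES))
```

For the artificial 300 bp region in the usage example, the GGF criteria are met while the Takai and Jones criteria are not, because the region is shorter than 500 bp. Scanning a genome for islands would additionally require a windowing or search strategy, which this sketch does not attempt.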
In this study, we propose two new prediction methods, called CPSO and CGA, which use complementary particle swarm optimization (CPSO) and a complementary genetic algorithm (CGA) to predict CpG islands in the human genome.
This work is partly supported by the National Science Council in Taiwan under grants NSC96-2221-E-214-050-MY3, NSC98-2221-E-151-040-, NSC 98-2622-E-151-001-CC2 and 98-2622-E-151-024-CC3.