Genomics is expanding the horizons of epidemiology, providing a new dimension for classical epidemiological studies and inspiring the development of large-scale multicenter studies with the statistical power necessary to assess gene-gene and gene-environment interactions in cancer etiology and prognosis. This paper describes the methodology of the Clinical Genome of Cancer Project in São Paulo, Brazil (CGCP), which includes patients with nine types of tumors and controls. Three major epidemiological designs were used to reach specific objectives: cross-sectional studies to examine gene expression, case-control studies to evaluate etiological factors, and follow-up studies to analyze the role of genetic profiles in prognosis. The clinical groups entered patients' data into the electronic database over the Internet. Two approaches were used for data quality control: continuous data evaluation and assessment of data entry consistency. A total of 1749 cases and 1509 controls were entered into the CGCP database from the first trimester of 2002 to the end of 2004. Continuous evaluation showed that, for all tumors taken together, only 0.5% of the general form fields still included potential inconsistencies by the end of 2004. Regarding data entry consistency, the highest percentage of errors (11.8%) was observed for the follow-up form, followed by 6.7% for the clinical form, 4.0% for the general form, and only 1.1% for the pathology form. Good data quality is required to transform data into useful information for clinical application and for preventive measures. The use of the Internet for communication among researchers and for data entry is perhaps the most innovative feature of the CGCP. Monitoring of the patients' data ensured their quality.