Accurate software effort estimation is an important part of the software process. Originally, estimation was performed using only human expertise, but more recently attention has turned to a variety of machine learning methods. This paper attempts to evaluate critically the potential of genetic programming (GP) in software effort estimation when compared with previously published approaches, in terms of accuracy and ease of use. The comparison is based on the well-known Desharnais data set of 81 software projects derived from a Canadian software house in the late 1980s. The input variables are restricted to those available from the specification stage and significant effort is put into the GP and all of the other solution strategies to offer a realistic and fair comparison. There is evidence that GP can offer significant improvements in accuracy but this depends on the measure and interpretation of accuracy used. GP has the potential to be a valid additional tool for software effort estimation but set up and running effort is high and interpretation difficult, as it is for any complex meta-heuristic technique.