Statistical testing is an important method for justifying system and software design with direct empirical evidence. This paper explores the potential inadequacies of traditional 'black-box' reliability estimation models and develops new 'white-box' models. These new models represent a system as a number of linked components with sets of possible interactions, which allows different degrees of system complexity to be introduced. It is shown that increasing complexity requires more testing to justify a given reliability level, indicating that results from black-box representations of complex systems should be considered optimistic. The new models allow the potential increase in testing to be ameliorated by the use of justifiable prior estimates at the component level. The result is a new and rigorous approach to justifying new systems constructed from a number of commercial off-the-shelf (COTS) components.