Radiological images play a central role in radiotherapy, especially in target volume delineation. Radiomic feature extraction has demonstrated its potential for predicting patient outcomes and assessing cancer risk prior to treatment. However, inherent methodological challenges such as severe class imbalance, small training sample sizes, multi-centre data, and weak correlation between image representations and outcomes have yet to be addressed adequately. Current radiomic analysis relies on segmented images (e.g., of tumours) for feature extraction, discarding important contextual information from the surrounding tissue. In this work, we examine the correlation between radiomics and clinical outcomes by combining two data modalities: pre-treatment computed tomography (CT) imaging data and contours of segmented gross tumour volumes (GTVs). We focus on a clinical head & neck cancer dataset and design an efficient convolutional neural network (CNN) architecture, together with appropriate machine learning strategies, to cope with these challenges. During training on two cohorts, our algorithm learns to predict clinical outcomes by automatically extracting radiomic features. Test results on two further cohorts show state-of-the-art performance in predicting different clinical endpoints (distant metastasis: AUC = 0.91; loco-regional failure: AUC = 0.78; overall survival: AUC = 0.70 on segmented CT data) compared with prior studies. Furthermore, we conduct extensive experiments on both the whole CT dataset and a combination of CT and GTV contours to investigate different learning strategies for this task. These experiments indicate that overall survival prediction improves significantly, to an AUC of 0.83, when CT and GTV contours are combined as inputs, and that the combination provides more intuitive visual explanations of patient outcome predictions.
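All endpoint results above are reported as AUC, the area under the ROC curve. As a point of reference for how that figure is obtained (this is the standard metric, not code from the paper; the helper name `roc_auc` is illustrative), AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, which a minimal pure-Python sketch can compute directly:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a
    random negative; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive/negative pair of predicted scores.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: two events (label 1) and two non-events (label 0).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # -> 0.75
```

This pairwise formulation also shows why AUC is a natural choice for the severely class-imbalanced endpoints discussed above: it depends only on the ranking of positives against negatives, not on the class proportions.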