Traditional image processing approaches first extract low-level image features and then pass them to a classifier or recognizer for further processing. In contrast to this step-by-step pipeline, the majority of recent studies prefer layered architectures that perform both feature extraction and classification or recognition. These architectures, referred to as deep learning techniques, are applicable when a sufficient amount of labeled data is available and the minimum system requirements are met. In practice, however, the data is often insufficient or the computational resources are limited. In this study, we investigate how an effective visual representation can still be obtained by combining low-level visual features with features from a simple deep learning model. The combined features achieved an accuracy of 0.80 on the image data set, while low-level features and deep learning features alone achieved 0.70 and 0.74, respectively.
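
The sketch below illustrates the general feature-combination idea described above, not the exact pipeline of this study: a simple hand-crafted descriptor (here, a per-channel colour histogram) is concatenated with features taken from the penultimate layer of a small CNN, and the combined vector is fed to a linear SVM. The 64x64 image size, the network layout, and the toy random data are all illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


def low_level_features(images):
    """Per-channel colour histograms (8 bins each) as a simple hand-crafted descriptor."""
    feats = []
    for img in images:
        hist = [np.histogram(img[..., c], bins=8, range=(0.0, 1.0))[0] for c in range(3)]
        feats.append(np.concatenate(hist) / img[..., 0].size)
    return np.asarray(feats, dtype=np.float32)


def build_small_cnn(input_shape=(64, 64, 3), n_classes=10):
    """A deliberately small CNN; its penultimate layer supplies the deep features."""
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    feats = tf.keras.layers.Dense(64, activation="relu", name="deep_features")(x)
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(feats)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


# Toy random data stands in for the real image set used in the study.
rng = np.random.default_rng(0)
X = rng.random((200, 64, 64, 3)).astype(np.float32)
y = rng.integers(0, 10, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Train the small CNN, then read features from its penultimate layer.
cnn = build_small_cnn()
cnn.fit(X_tr, y_tr, epochs=2, batch_size=32, verbose=0)
extractor = tf.keras.Model(cnn.input, cnn.get_layer("deep_features").output)
deep_tr = extractor.predict(X_tr, verbose=0)
deep_te = extractor.predict(X_te, verbose=0)

# Combine hand-crafted and deep features by simple concatenation.
comb_tr = np.hstack([low_level_features(X_tr), deep_tr])
comb_te = np.hstack([low_level_features(X_te), deep_te])

# A linear SVM serves as the final classifier on the combined representation.
clf = LinearSVC().fit(comb_tr, y_tr)
print("combined-feature accuracy:", accuracy_score(y_te, clf.predict(comb_te)))
```

The same evaluation can be repeated with `low_level_features(...)` or the deep features alone to compare the three representations, which is the comparison the reported accuracies (0.70, 0.74, and 0.80) refer to.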