Deep convolutional neural networks (DCNNs) have attracted significant interest in the computer vision community in recent years and have achieved high performance on many computer vision problems, such as image classification. We address pixel-level depth prediction from a single image by combining a DCNN with a sparsely connected conditional random field (CRF). The invariance properties that make DCNNs suitable for high-level tasks also mean that their outputs are generally not localized enough for detailed pixel-level regression. We combine a multiscale DCNN with a sparsely connected CRF to overcome this localization weakness. We evaluate our framework on the well-known NYU V2 depth dataset, and the results show that the proposed method improves depth prediction accuracy both qualitatively and quantitatively compared with previous work. This finding shows the potential of the proposed method for three-dimensional (3-D) modeling or 3-D video production from two-dimensional (2-D) images or 2-D videos.
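
To make the notion of a quantitative comparison concrete, the following Python sketch computes three error measures that are widely used in the single-image depth-estimation literature: root-mean-square error, mean absolute relative error, and threshold accuracy. The abstract does not state which measures were used, so the choice of measures, the function name `depth_metrics`, and the threshold value of 1.25 are assumptions made for illustration only, not a description of the evaluation protocol of this work.

```python
import math


def depth_metrics(pred, target, threshold=1.25):
    """Compute common depth-estimation error measures (illustrative sketch).

    pred, target: equal-length sequences of positive depths, e.g. in metres.
    The measures and the default threshold are assumptions, not taken from
    the abstract.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    n = len(pred)
    pairs = list(zip(pred, target))

    # Root-mean-square error, in the same unit as the depths.
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)

    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(p - t) / t for p, t in pairs) / n

    # Fraction of pixels whose ratio max(p/t, t/p) is below the threshold.
    accuracy = sum(1 for p, t in pairs if max(p / t, t / p) < threshold) / n

    return {"rmse": rmse, "abs_rel": abs_rel, "accuracy": accuracy}
```

Lower values of the first two measures and a higher value of the third indicate better agreement between predicted and ground-truth depth. In practice the inputs would be the flattened valid pixels of a predicted depth map and its ground-truth counterpart.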