Video transcription with a deep learning module for recognizing sounds