This hands-on course explores how convolutional and recurrent neural networks can be combined to generate effective descriptions of content within images and video clips.
Learn how to train a network using TensorFlow and the MSCOCO dataset to generate captions from images and video by:
- Implementing deep learning workflows like image segmentation and text generation
- Comparing and contrasting data types, workflows, and frameworks
- Combining computer vision and natural language processing
Upon completion, you’ll be able to solve deep learning problems that require multiple types of data inputs.