Multimodal Unsupervised Image-to-Image Translation

Research output: Contribution to journal › Conference article › Research › peer-review

Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any examples of corresponding image pairs. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image (MUNIT) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image. Code and pretrained models are available at https://github.com/nvlabs/MUNIT.

Original language: English
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages (from-to): 179-196
Number of pages: 18
ISSN: 0302-9743
Publication status: Published - 2018
Externally published: Yes
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 8 Sep 2018 – 14 Sep 2018

Conference

Conference: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 08/09/2018 – 14/09/2018

Bibliographical note

Publisher Copyright:
© 2018, Springer Nature Switzerland AG.

Research areas

• GANs, Image-to-image translation, Style transfer

ID: 301825957