AI for Content Creation Workshop

@ CVPR 2023

19th June 2023 — 9am PDT
East Exhibit Hall A, Vancouver Convention Center
+ CVPR Virtual Platform (Zoom link behind login)



Summary

The AI for Content Creation (AI4CC) workshop at CVPR brings together researchers in computer vision, machine learning, and AI. Content creation is required for simulation and training data generation, media like photography and videography, virtual reality and gaming, art and design, and documents and advertising (to name just a few application domains). Recent progress in machine learning, deep learning, and AI techniques has allowed us to turn hours of manual, painstaking content creation work into minutes or seconds of automated or interactive work. For instance, generative adversarial networks (GANs) can produce photorealistic images of 2D and 3D content such as humans, landscapes, interior scenes, virtual environments, and even industrial designs. Neural networks can super-resolve videos and interpolate them into smooth slow motion, interpolate between photos with intermediate novel views and even extrapolate beyond them, and transfer styles to convincingly render and reinterpret content. In addition to creating awe-inspiring artistic images, these techniques offer unique opportunities for generating additional, more diverse training data. Learned priors can also be combined with explicit appearance and geometric constraints, perceptual understanding, or even functional and semantic constraints on objects.
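As a concrete illustration of that speed-up, here is a minimal sketch of text-to-image generation with the open-source Hugging Face diffusers library (an assumed example setup, not workshop code; the checkpoint named below is one publicly released model):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained text-to-image diffusion model (weights download on first run).
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # example checkpoint; any compatible model works
        torch_dtype=torch.float16,
    ).to("cuda")

    # A single text prompt replaces hours of manual content creation work.
    image = pipe("a photorealistic sunlit interior scene").images[0]
    image.save("interior.png")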

AI for content creation lies at the intersection of the graphics, computer vision, and design communities. However, researchers and professionals in these fields may not be aware of its full potential and inner workings. As such, the workshop comprises two parts: techniques for content creation and applications for content creation. The workshop has three goals:

  1. To cover introductory concepts that help interested researchers from other fields get started in this exciting area.
  2. To present success stories to show how deep learning can be used for content creation.
  3. To discuss pain points that designers face using content creation tools.

More broadly, we hope that the workshop will serve as a forum to discuss the latest topics in content creation and the challenges that vision and learning researchers can help solve.

Welcome, from the organizers:
Deqing Sun (Google)
Lingjie Liu (University of Pennsylvania)
Fitsum Reda (NVIDIA)
Huiwen Chang (Google)
Lu Jiang (Google)
Seungjun Nah (NVIDIA)
Yijun Li (Adobe)
Ting-Chun Wang (NVIDIA)
Jun-Yan Zhu (Carnegie Mellon University)
James Tompkin (Brown University)




[Images: Dall-E 2 (OpenAI, 2022), SuperSlomo (NVIDIA, 2018), GauGAN2 (NVIDIA, 2021), Imagen (Google, 2022)]

Awards


Schedule — Video Recording

Click ▶ to jump to each talk!

Morning session (all times PDT):
09:00 Welcome and introductions 👋
09:10 Ben Mildenhall (Google) — neural fields
09:40 Ryan Murdock — ML+Art
10:10 Coffee break
10:20 Angela Dai (TU Munich) — 3D
10:50 Tim Salimans (Google) — images, video
11:20 Poster session 1 — West Exhibit Hall #93-#107
  1. AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
    Seungwoo Lee, Chaerin Kong, Donghyeon Jeon, Nojun Kwak
  2. We never go out of Style: Motion Disentanglement by Subspace Decomposition of Latent Space
    Rishubh Parihar, Raghav Magazine, Piyush Tiwari, Venkatesh Babu Radhakrishnan
  3. Matching-based Data Valuation for Generative Model
    Jiaxi Yang, Wenlong Deng, Benlin Liu, Yangsibo Huang, Xiaoxiao Li
  4. SVS: Adversarial refinement for sparse novel view synthesis
    Violeta Menendez Gonzalez, Andrew Gilbert, Graeme Phillipson, Stephen Jolly, Simon Hadfield [https://bmvc2022.mpi-inf.mpg.de/886/] — BMVC 2022
  5. SpaText: Spatio-Textual Representation for Controllable Image Generation
    Omri Avrahami, Thomas F Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin [https://omriavrahami.com/spatext/] — CVPR 2023
  6. Can We Find Strong Lottery Tickets in Generative Models?
    Sangyeop Yeo, Yoojin Jang, Jy-yong Sohn, Dongyoon Han, Jaejun Yoo — AAAI 2023
  7. Exploring Gradient-based Multi-directional Controls in GANs
    Zikun Chen, Ruowei Jiang, Brendan Duke, Han Zhao, Parham Aarabi — ECCV 2022
  8. 3DAvatarGAN: Bridging Domains for Personalized Editable Avatars
    Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov [https://rameenabdal.github.io/3DAvatarGAN/] — CVPR 2023
  9. Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
    Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A Efros, Krishna Kumar Singh [https://sumith1896.github.io/affordance-insertion/] — CVPR 2023
  10. CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes
    Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh [https://clip-actor.github.io/] — ECCV 2022
  11. High-fidelity 3D Human Digitization from Single 2K Resolution Images
    Sang-Hun Han, Min-Gyu Park, Ju Hong Yoon, Ju-Mi Kang, Young-Jae Park, Hae-Gon Jeon [https://sanghunhan92.github.io/conference/2K2K/] — CVPR 2023
  12. Visual Prompt Tuning for Generative Transfer Learning
    Kihyuk Sohn, Huiwen Chang, Jose Lezama, Luisa Polania Cabrera, Han Zhang, Yuan Hao, Irfan Essa, Lu Jiang — CVPR 2023
  13. Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
    Kim Sung-Bin, Arda Senocak, Hyunwoo Ha, Andrew Owens, Tae-Hyun Oh — CVPR 2023
  14. Discrete Predictor-Corrector Diffusion Models for Image Synthesis
    Jose Lezama, Tim Salimans, Lu Jiang, Huiwen Chang, Jonathan Ho, Irfan Essa — ICLR 2023
  15. BLT: Bidirectional Layout Transformer for Controllable Layout Generation
    Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa [https://shawnkx.github.io/blt] — ECCV 2022
12:30 Lunch break 🥪


Afternoon session (all times PDT):
13:30 Oral session + best paper announcement
14:00 Jiajun Wu (Stanford University) — representations
14:30 Shizhao Sun (Microsoft) — design
15:00 Coffee break
15:15 Yuanzhen Li (Google) — images, video
15:45 Xingang Pan (NTU) — DragGAN — late-breaking speaker! 🦜
16:15 Panel discussion — AI Hopes and Fears for Practical Content Creation 🗣️
  • Larry Gritz
    Distinguished Engineer, Sony Pictures Imageworks
  • Carl Jarrett
    Senior Art Director, Electronic Arts (EA)
  • Daryl Anselmo
    Art Director / Visual Artist, 'On sabbatical' (previously Midwinter)
17:15 Poster session 2 — West Exhibit Hall #94-#107
  1. MetaDance: Few-shot Dancing Video Retargeting via Temporal-aware Meta-learning
    Yuying Ge, Ruimao Zhang, Yibing Song
  2. Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models
    Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim
  3. The Nuts and Bolts of Adopting Transformer in GANs
    Rui Xu, Xiangyu Xu, Kai Chen, Bolei Zhou, Chen Change Loy
  4. LPMM: Intuitive Pose Control for Neural Talking-Head Model via Landmark-Parameter Morphable Model
    Kwangho Lee, Patrick Kwon, Myung Ki Lee, Namhyuk Ahn, Junsoo Lee [https://khlee369.github.io/LPMM/]
  5. LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation
    Jiaxin Cheng, Xiao Liang, Xingjian Shi, Tong He, Tianjun Xiao, Mu Li [https://github.com/cplusx/layout_diffuse]
  6. Reference-based Image Composition with Sketch via Structure-aware Diffusion Model
    Kangyeol Kim, Sunghyun Park, Junsoo Lee, Jaegul Choo [https://github.com/kangyeolk/Paint-by-Sketch]
  7. Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space
    Kai Katsumata, MinhDuc Vo, Bei Liu, Hideki Nakayama
  8. Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models
    Naoki Matsunaga, Masato Ishii, Akio Hayakawa, Kenji Suzuki, Takuya Narihira
  9. 'Tax-free' 3DMM Conditional Face Generation
    Nick Huang, Zhiqiu Yu, Xinjie Yi, Yue Wang, James Tompkin
  10. Instance-Aware Image Completion
    Jinoh Cho, Minguk Kang, Vibhav Vineet, Jaesik Park
  11. Context-Preserving Two-Stage Video Domain Translation for Portrait Stylization
    Doyeon Kim, Eunji Ko, Hyunsu Kim, Yunji Kim, Junho Kim, Dongchan Min, Junmo Kim, Sung Ju Hwang
  12. Neural Sign Reenactor: Deep Photorealistic Sign Language Retargeting
    Christina Ourania Tze, Panagiotis Filntisis, Athanasia-Lida Dimou, Anastasios Roussos, Petros Maragos
  13. Paste and Harmonize via Denoising: Towards Controllable Exemplar-based Image Editing
    Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa [https://sites.google.com/view/phd-demo-page]
  14. Text-to-image Editing by Image Information Removal
    Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer

Previous Workshops (including session videos)