Yesterday, I attended the amazing “GANs for Good” panel discussion hosted by deeplearning.ai, and here are my takeaways:
Generative adversarial networks (GANs) have improved steadily over the years and are starting to see real-world adoption in domains such as health, art, and augmented reality. A conversation about progress and responsible use is needed.
Progress over successive iterations of GANs has taken us from generating simple, low-resolution images to high-resolution, realistic ones. Beyond image generation, other applications are now starting to surface.
One interesting application of GANs is designing dental crowns, which speeds up the whole process for the patient: a procedure that used to take weeks can now be completed with high precision in hours. Read this paper for more.
GANs are also being used to improve augmented reality (AR) scenes: incomplete environment maps can be completed using the generative capabilities of GANs, which learn the statistical structure of the world. Other AR-related use cases include environment texturing, which enables realistic lighting and reflections.
Other work adopts differential privacy when training GANs, so that the model learns the data distribution without disclosing too much information about individual training samples. Read more in this survey about differentially private GANs.
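The core recipe behind most differentially private training, which DP-GAN variants typically apply to the discriminator's updates, can be sketched as follows. This is a minimal illustration of the clip-then-add-noise idea, not any specific paper's implementation; the function name and parameter values are hypothetical:

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD-style gradient step: clip each per-sample gradient to a
    fixed L2 norm, average, then add calibrated Gaussian noise so no single
    training example dominates (or is revealed by) the update."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # shrink only if norm > clip_norm
        clipped.append(g * scale)
    mean_grad = np.mean(clipped, axis=0)
    # Noise scale is proportional to the clipping bound (the per-example sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return mean_grad + noise
```

The clipping bound caps each example's influence, and the noise masks whatever influence remains; the privacy guarantee comes from accounting for the noise scale across all training steps.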
People have wondered whether GANs can generate training data for low-data regimes. There hasn't been much success there, but it is possible to meld two sources of data to generate more realistic and useful training data. For instance, a research team at Apple showed that you can feed large amounts of unlabeled real data to a GAN-powered refiner, which is trained to turn labeled synthetic data into more realistic training data. This can reduce the cost of building supervised datasets and help with a variety of machine learning tasks that weren't tractable before. Read more about this work here.
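The refiner's objective can be sketched as an adversarial realism term plus a self-regularization term that keeps the refined output close to the labeled synthetic input, so the original annotations stay valid. A minimal illustration; the function name and the λ weighting are assumptions, not Apple's implementation:

```python
import numpy as np

def refiner_loss(adv_loss, synthetic, refined, lam=0.5):
    """Hypothetical refiner objective: the adversarial term pushes refined
    images toward the real-image distribution, while an L1 self-regularization
    term keeps each refined image close to its labeled synthetic source so
    the label (e.g., a gaze direction) still describes the refined image."""
    self_reg = np.abs(refined - synthetic).mean()
    return adv_loss + lam * self_reg
```

Raising λ trades realism for label fidelity: a large λ pins the refiner to the synthetic input, a small λ lets it drift toward realism at the risk of invalidating the labels.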
GANs have also been used to generate data for QuickPath, a gestural text-entry feature for iOS.
GANs have also been used to generate data “for good.” In other words, it's possible to exert more control over how data is sampled, for instance mitigating bias by oversampling an underrepresented category, which supports accessibility, inclusiveness, and fairness. Sources to look at to learn more:
Anima Anandkumar also talked about the importance of understanding GANs and how to better train them by rethinking optimization.
By applying an idea called Competitive Gradient Descent (CGD), researchers are able to stabilize the training of GANs; the idea has also been extended into the realm of reinforcement learning. The key is modeling the interaction between the agents, i.e., the discriminator and the generator. In standard training, gradient descent updates are applied simultaneously but independently to each agent, so the interaction between them is never modeled. CGD models this dynamic interaction explicitly, which encourages stable GAN training.
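To see why modeling the interaction matters, consider the classic scalar bilinear game f(x, y) = xy, where x descends on f and y ascends. The sketch below is my own minimal illustration, not the authors' code: it compares simultaneous gradient descent-ascent with the closed-form CGD update for this game.

```python
def gda_step(x, y, eta=0.2):
    """Simultaneous gradient descent-ascent: each player updates using only
    its own gradient, ignoring how the opponent is about to move."""
    return x - eta * y, y + eta * x

def cgd_step(x, y, eta=0.2):
    """CGD step for f(x, y) = x*y: each player best-responds to the other's
    anticipated update, which for this bilinear game reduces to the closed
    form below (a contraction toward the equilibrium at the origin)."""
    d = 1.0 + eta * eta
    return (x - eta * y) / d, (y + eta * x) / d

def run(step, steps=100):
    """Iterate a step rule from (1, 1) and report the squared distance
    from the equilibrium (0, 0)."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        x, y = step(x, y)
    return x * x + y * y
```

Running `run(gda_step)` shows the iterates spiraling away from the equilibrium (the squared distance grows every step), while `run(cgd_step)` shrinks it toward zero, which is the intuition behind CGD's more stable GAN training.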
A lot of researchers are interested in disentanglement in GANs. Better disentanglement quality can enable compositionality: improving the disentanglement of the various factors of variation gives more control over the generation process. Controllable generation helps in applications such as image editing and 3D scene rendering. To improve disentanglement, one work finds that using a small amount of supervised data improves its quality. You can then do really interesting things like adding sunglasses and other attributes to generated faces, all while maintaining high generation quality, which has been a challenge. Read more here.
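One common way to exploit a disentangled latent space for controllable generation is to move a latent code along a learned attribute direction before decoding. The sketch below is a hypothetical illustration of that idea; the difference-of-means estimator and all names are assumptions, not the cited work's method:

```python
import numpy as np

def attribute_direction(z_with, z_without):
    """Estimate a latent 'attribute' direction (e.g., sunglasses) from a small
    labeled set of latent codes: the normalized difference between the mean
    code of examples with the attribute and the mean without it."""
    d = z_with.mean(axis=0) - z_without.mean(axis=0)
    return d / np.linalg.norm(d)

def edit(z, direction, strength=1.5):
    """Move a latent code along the attribute direction; decoding the edited
    code with the generator would add the attribute to the output image."""
    return z + strength * direction
```

With a well-disentangled space, sliding `strength` adds more or less of the attribute without disturbing the other factors of variation, which is exactly the control the panel discussed.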
Another interesting line of research is to better understand the underlying interaction among the agents in a GAN to better explain its performance.
Comments, advice, thoughts from the panelists (Anima Anandkumar, Ian Goodfellow, Alexei A. Efros, Andrew Ng, and Sharon Zhou) to researchers and practitioners:
Some interesting research topics involving GANs include, but are not limited to, improving disentanglement, applying contrastive learning, and training more stable GANs.
Build tools that help practitioners and researchers easily go from proof of concepts to real-world applications. This will inspire more creative use of GANs.
Deploying GANs and other large models like BERT at scale is a challenge but an important endeavor in the community.
Participate in more interdisciplinary studies and engage with experts to better understand how to more responsibly conduct the research and apply these ideas in the real world.
Borrow inspiration from understanding the mechanisms (e.g., feedback mechanism) of the brain to help build more robust models. This is essential as we aim to put these models in production and do it in a more reliable way.
Differentially private GANs are an important topic where data privacy is concerned.
GANs can help get over the data scarcity problem, particularly in areas like medicine where labeled data is scarce and costly.
The ability of GANs to produce potentially useful training data can also support research on languages that are underserved today.
GANs can allow photography that is more inclusive and can better capture a variety of styles (hence the focus on disentanglement).
Deep learning has well-established recipes that make it easier and more reliable to build production-ready machine learning models. With GAN-based models, however, we are still figuring out what those ingredients and recipes are, which limits their adoption and usage in the real world. Standardizing the techniques and tricks for training stable GANs is an ongoing effort.
Evaluation metrics are also a challenge for GANs but that seems to be the case in the entire field of machine learning.
If you are a researcher, don’t be too focused on the short-term. Don’t be too worried about staying up to date with all the relevant literature (most of it won’t be too helpful anyway). Also, don’t rush to publish.
When you see a new trick proposed in the literature, don't be too quick to adopt it. If the trick or technique is genuinely useful, say, effective when applied to complex tasks, it will be widely adopted by the community and potentially incorporated into libraries you can use.
We need to be more open about issues affecting the community at large such as inclusion and diversity. Every opinion counts. Find allies, communities, and spread awareness on the issues that matter to you and your communities.
Learn how to communicate your ideas well.
Being part of a community is a great way to stay up to date with the latest trends. Encourage open paper-reading sessions without too much focus on having perfect interpretations or explanations. This takes time, and mistakes will be made. Just keep practicing and do your due diligence.
Rather than spending your time tracking and reading every relevant paper that comes out, focus on asking why a problem is hard. Think about something new that hasn't been tried. Only once you have made progress on those questions does it make sense to look at what others have done previously.
Older papers are a great source of inspiration. Very few people read them but it doesn’t mean that they are not relevant.
Start with the foundations if you haven't. For instance, how can we apply the lessons of game theory to reinforcement learning? This is where you pick up the strong intuitions that help you understand and tackle problems worth solving.
Help talk about and share how GANs are being used in the real world; it may inspire someone working on a similar problem. You never know! By the same token, work on something different from what everyone else is doing.
Think of GAN applications beyond images.
Let's be very careful about how we apply these models in the wild, and extra cautious when bringing GANs into real-world applications.
Have fun and enjoy what you do. Ideas often get scooped, but your goal should be to pursue what makes you happy and what motivates you, not what excites the community at large.
Interdisciplinary studies continue to be even more important today. Let’s think about how studies outside our machine learning realm, such as in economics and philosophy, can help us to better pose research questions. We can learn a lot from experts in other fields on how to best apply models in the real world.
Let’s not just try to hand out knowledge via papers. We can go the extra mile and publish demos, additional useful metrics for practitioners (e.g., inference latency, energy efficiency, etc.), and be more involved in feedback/discussion.
There are great opportunities to help train more efficient models for fast rendering and for supporting multimodal data, as demanded by hard problems like self-driving vehicles.
If you know that your ML project will do harm in some way, avoid publishing it. With great power comes great responsibility!
If you are interested in more articles like this in the future, please subscribe below: