VRYNT (HashChi) wanted a GAN-based NFT generation platform that lets their community members easily build unique NFTs by combining components (latent space) and then buy or sell them in their marketplace.
Rigorous research and data-driven insights drive success. Leverage our expertise, cutting-edge tools, and meticulous approach for informed choices in a dynamic landscape
Generative modeling is an unsupervised learning task in which a model discovers the patterns in its input data so that it can generate new images or variations of existing ones. Generative Adversarial Networks (GANs), introduced by Ian Goodfellow and colleagues in 2014, are particularly good at creating synthetic images that resemble real data. A GAN trains two sub-models against each other: a generator that produces images and a discriminator that tries to tell them apart from real ones. Later architectures such as DCGAN, ProGAN, and BigGAN improved training stability, fidelity, and control over image features, and made it possible to generate high-resolution, diverse images.
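The interplay between generator and discriminator can be written as a two-player minimax game. The formulation below is the standard one from the original GAN paper; the symbols (G for the generator, D for the discriminator, z for the input noise, x for a real image) are notation introduced here for illustration and are not taken from the project itself.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

The discriminator is rewarded for assigning high probability to real images and low probability to generated ones, while the generator is rewarded for producing images that the discriminator scores as real.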
For our use case, StyleGAN2 and StyleGAN-ADA are the best fit, as they generate high-resolution, diverse images and offer control over image features. The StyleGAN2 model maps latent codes to images, and the mapping can also be run in the reverse direction. The reverse direction is known as image inversion: any image from the same domain as the trained model's data can be embedded into the latent space.
The first step for image manipulation in GANs is to map a given image into the latent space. A popular approach is to train an encoder for this purpose. After training the generator with the GAN loss, we freeze the generator (decoder) weights and train an encoder that maps images to latent codes. The generator then produces a synthetic image from the predicted latent code, and the encoder is trained like an autoencoder by comparing the original image with the generator’s output. The StyleGAN generator itself does not contain an encoder; the official StyleGAN2 code instead provides a projector that inverts an image by optimizing its latent code directly.
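The reconstruction objective behind inversion can be illustrated without training an encoder, by optimizing a single latent code against a frozen generator. The following is a minimal sketch under strong assumptions: the "generator" is a toy linear map standing in for StyleGAN, and the names `generator` and `invert`, the dimensions, and the step count are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim = 4, 16

# Frozen weights of a toy linear "generator" (an assumption, not StyleGAN).
A = rng.standard_normal((image_dim, latent_dim))


def generator(w):
    """Map a latent code to a flattened toy 'image'."""
    return A @ w


def invert(target, steps=2000):
    """Find a latent code whose generated image matches `target`.

    Minimizes the squared reconstruction error by gradient descent
    while the generator weights stay fixed.
    """
    w = np.zeros(latent_dim)
    # Step size 1/L, where L = 2 * sigma_max(A)^2 bounds the gradient's slope.
    lr = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    for _ in range(steps):
        residual = generator(w) - target
        grad = 2.0 * A.T @ residual
        w -= lr * grad
    return w


w_true = rng.standard_normal(latent_dim)
target_image = generator(w_true)
w_hat = invert(target_image)
reconstruction_error = float(np.sum((generator(w_hat) - target_image) ** 2))
```

An encoder-based approach amortizes this search: instead of running an optimization per image, a network is trained once to predict the latent code, using the same kind of reconstruction loss.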
StyleGAN introduced a mapping network to disentangle image features. The goal of the mapping network is to convert the input latent vector into an intermediate vector whose elements control different visual features. In other words, StyleGAN warps a space that can be sampled from a uniform or normal distribution into an intermediate latent space in which features are disentangled. The mapping function is implemented as an 8-layer MLP (8 fully connected layers). The output of the mapping network (w) passes through a learned affine transformation (A) before it enters the synthesis network, which uses the AdaIN (Adaptive Instance Normalization) module. The synthesis network converts the encoded mapping into the generated image. To give more control over the generated image, the synthesis network applies the style at different levels of detail (resolutions) as the image is synthesized and upscaled from a constant vector to a resolution of 1024, as shown in the figure. These different levels are defined as
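The path from latent vector to styled feature map can be sketched in a few lines of NumPy. This is an illustrative sketch only, with random untrained weights: the function names, the 8-channel 4x4 feature map, and the weight initialization are assumptions made for the example, and the leaky ReLU slope and input normalization follow the commonly described StyleGAN setup rather than anything stated above.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, channels = 512, 8


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def mapping_network(z, weights, biases):
    """8-layer MLP that maps an input latent z to an intermediate latent w."""
    w = z / np.sqrt(np.mean(z ** 2) + 1e-8)  # normalize the input latent
    for W, b in zip(weights, biases):
        w = leaky_relu(w @ W + b)
    return w


def adain(x, scale, bias, eps=1e-8):
    """Adaptive Instance Normalization on a (channels, height, width) map.

    Each channel is normalized to zero mean and unit variance, then
    rescaled and shifted by the per-channel style values.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    return scale[:, None, None] * x_norm + bias[:, None, None]


# Random stand-in weights for the 8 fully connected layers.
weights = [rng.normal(0.0, 1.0 / np.sqrt(latent_dim), (latent_dim, latent_dim))
           for _ in range(8)]
biases = [np.zeros(latent_dim) for _ in range(8)]

# Learned affine transformation "A": w -> per-channel (scale, bias).
affine_W = rng.normal(0.0, 0.01, (latent_dim, 2 * channels))
affine_b = np.concatenate([np.ones(channels), np.zeros(channels)])

z = rng.standard_normal(latent_dim)
w = mapping_network(z, weights, biases)
style = w @ affine_W + affine_b
scale, bias = style[:channels], style[channels:]

features = rng.standard_normal((channels, 4, 4))  # one synthesis-layer activation
styled = adain(features, scale, bias)
```

Because every synthesis layer receives its own style through its own affine transformation, changing the style at one resolution affects only the level of detail handled at that resolution.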
Control over image features can be achieved by identifying latent directions with Principal Component Analysis (PCA) in activation space, and then manually labeling each identified direction according to the changes it produces in the image.
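Finding candidate directions with PCA can be sketched as follows. This is a minimal sketch, assuming the sampled vectors are already collected in a matrix; here random correlated data stands in for real activations, and the names `samples`, `directions`, and `edit`, as well as the choice of 10 components, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for activations collected from many random samples:
# 1000 vectors of dimension 64 with correlated components.
samples = rng.standard_normal((1000, 64)) @ rng.standard_normal((64, 64))

# PCA via SVD of the centered data; rows of vt are principal directions.
mean = samples.mean(axis=0)
_, singular_values, vt = np.linalg.svd(samples - mean, full_matrices=False)

num_directions = 10
directions = vt[:num_directions]
# Standard deviation of the data along each direction.
stddevs = singular_values[:num_directions] / np.sqrt(len(samples) - 1)


def edit(vector, k, strength):
    """Move `vector` along principal direction k, in units of its std dev."""
    return vector + strength * stddevs[k] * directions[k]


original = samples[0]
edited = edit(original, k=0, strength=2.0)
```

PCA only proposes directions ordered by how much variation they explain. What each direction means (for example a change in color or pose) still has to be determined by generating images at several strengths and labeling the result by hand, as described above.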
Style mixing is another way to manipulate image features: it imparts features of a source image onto a target image. This is done by combining the latent vectors of the two images (source and target) layer by layer. The more layers that take their latent vector from the source, the more the result resembles the source, and vice versa.
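Layer-wise mixing can be sketched as below. The sketch assumes each image is represented by one 512-dimensional latent vector per synthesis layer, with 18 layers for a 1024-resolution model; the function name `style_mix` and the random latents are hypothetical stand-ins for inverted images.

```python
import numpy as np

rng = np.random.default_rng(0)
num_layers, latent_dim = 18, 512

# Stand-ins for the per-layer latent codes of two inverted images.
w_source = rng.standard_normal((num_layers, latent_dim))
w_target = rng.standard_normal((num_layers, latent_dim))


def style_mix(w_source, w_target, source_layers):
    """Return the target's latents with the chosen layers taken from the source."""
    mixed = w_target.copy()
    mixed[source_layers] = w_source[source_layers]
    return mixed


# Take only the first four (lowest-resolution) layers from the source.
coarse_mix = style_mix(w_source, w_target, source_layers=[0, 1, 2, 3])

# Take every layer from the source: the result is the source itself.
full_mix = style_mix(w_source, w_target, source_layers=list(range(num_layers)))
```

Which layers are swapped matters as much as how many: early layers act on low resolutions and tend to carry overall structure, while later layers act on high resolutions and tend to carry finer detail.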
Training a new model with StyleGAN-ADA requires fewer images than StyleGAN, which needs a large number of images. StyleGAN-ADA applies adaptive augmentation to the images shown to the discriminator, which keeps the discriminator from overfitting a small dataset and reduces the time needed to train a new model on limited images. Training a StyleGAN-ADA model is a GPU-intensive task. The Fréchet Inception Distance (FID) score, where lower is better, is used to track how well a model is performing while it is trained from scratch.
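FID compares the mean and covariance of feature vectors extracted from real and generated images. The sketch below computes the distance itself, assuming the features have already been extracted; in practice they come from an Inception network, whereas here low-dimensional random vectors are used as stand-ins. The function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    d^2 = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 * sqrt(C_r C_f))
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr(sqrt(C_r C_f)) equals the sum of square roots of the eigenvalues
    # of C_r C_f, which are real and non-negative up to rounding error.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


rng = np.random.default_rng(0)
real = rng.standard_normal((500, 8))  # stand-in for real-image features
fake_shifted = real + 3.0             # same spread, shifted mean

score_identical = frechet_distance(real, real)
score_shifted = frechet_distance(real, fake_shifted)
```

Identical feature sets give a distance of approximately zero, and the distance grows as the generated distribution drifts away from the real one, which is why a falling FID during training indicates an improving model.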
The VRYNT project has been completed and deployed on the client's server, with more than 20 GAN models and 2 million images in gallery collections for creating and blending unique, one-of-a-kind NFTs to place in the marketplace.