
PM Learning AI – Week 3 (Introduction to Generative AI)


Firstly, happy new year! And now, on to the cool buzzword: “GenAI”!

Generative AI (GenAI) is a type of AI that can create new content such as text, images, videos, audio, and music. It works by using large AI models called ‘foundation models’ that learn patterns from data and use them to generate new content.

So, if you have heard of or used ChatGPT, Google’s Gemini, etc., you have already interacted with GenAI. 🙂

A generative AI model can be unimodal (it works with a single type of data, such as text only) or multimodal (it works across several types, such as text and images).

There are several families of GenAI models, and each one processes data differently and suits different uses.

Generative AI Models
  1. Variational Autoencoders (VAEs) – Combine the principles of neural networks and probabilistic graphical models (a minimal code sketch follows this list).
    • Use the concepts of encoders, latent space representation, and decoders.
      • The encoder maps the input data to a lower-dimensional latent space representation.
      • The decoder then reconstructs the original input data from the latent representation.
      • The latent space representation captures the key features of the data and is passed to the decoder.
    • Applications: Image generation, anomaly detection, data imputation, and semi-supervised learning (e.g., improving task performance with limited labeled data).
  2. Generative Adversarial Networks (GANs) – Designed to generate new, realistic data samples (see the GAN sketch after this list).
    • They pit two neural networks against each other: a Generator and a Discriminator.
      • The goal of the generator is to fool the discriminator into believing that the generated data is real.
      • The goal of the discriminator is to tell the difference between real and fake data.
      • The two networks train simultaneously: the generator works on making its output closer to real data, while the discriminator works on improving its ability to detect fake data.
    • Applications: Image synthesis, style transfer, data augmentation
  3. Autoregressive Models – Create data sequentially, using the context of previously generated elements. They capture dependencies in the data by modeling each data point as a function of the previous ones (see the bigram sketch after this list).
    • Applications: Text generation, image generation, audio synthesis
  4. Transformers – Mostly used for Natural Language Processing (NLP) tasks (see the transformer sketch after this list).
    • Consist of encoder and decoder layers to generate text sequences and translations.
    • As raw data is input, it is converted into embeddings and positional information is added to preserve sequence order. The input is then processed by multiple encoder layers. Using the encoder’s output, the decoder generates the output, predicting output tokens one by one.
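To make the VAE idea concrete, here is a minimal sketch in PyTorch: an encoder maps the input to a latent distribution, and a decoder reconstructs the input from a latent sample. All the sizes here (784 inputs, 20 latent units) are assumptions for illustration, not from any particular paper.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=20):  # hypothetical sizes
        super().__init__()
        # Encoder: maps input to the parameters of a latent Gaussian.
        self.encoder = nn.Sequential(nn.Linear(input_dim, 400), nn.ReLU())
        self.fc_mu = nn.Linear(400, latent_dim)      # mean of latent distribution
        self.fc_logvar = nn.Linear(400, latent_dim)  # log-variance of latent distribution
        # Decoder: reconstructs the input from a latent sample.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 400), nn.ReLU(),
            nn.Linear(400, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, so gradients can flow.
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)
        return self.decoder(z), mu, logvar

model = VAE()
x = torch.rand(8, 784)  # stand-in for a batch of flattened 28x28 images
recon, mu, logvar = model(x)
# Training minimizes reconstruction loss plus a KL term pulling the
# latent distribution toward a standard normal:
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```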
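And here is a minimal sketch of one GAN training step, again in PyTorch. The network sizes are arbitrary and the “real” batch is random noise standing in for actual data; a real setup would load a dataset.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # hypothetical sizes
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

real = torch.rand(16, data_dim)  # placeholder for a batch of real samples

# Discriminator step: learn to score real data as 1 and fake data as 0.
fake = generator(torch.randn(16, latent_dim)).detach()
d_loss = loss_fn(discriminator(real), torch.ones(16, 1)) + \
         loss_fn(discriminator(fake), torch.zeros(16, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to fool the discriminator into scoring fakes as real.
fake = generator(torch.randn(16, latent_dim))
g_loss = loss_fn(discriminator(fake), torch.ones(16, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```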
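Autoregressive generation can be shown with a toy example: a bigram character model where each new character is sampled conditioned on the previous one. The corpus here is made up; real autoregressive models (like GPT-style LLMs) condition on much longer contexts, but the sequential generation loop is the same idea.

```python
import random
from collections import defaultdict, Counter

corpus = "the cat sat on the mat the cat ran"

# Model each character as a function of the previous one: P(next | prev).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(prev):
    chars, weights = zip(*counts[prev].items())
    return random.choices(chars, weights=weights)[0]

# Generation is sequential: each element depends on what came before.
text = "t"
for _ in range(20):
    text += sample_next(text[-1])
print(text)
```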
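Finally, a minimal sketch of the transformer flow just described, using PyTorch’s built-in nn.Transformer: tokens become embeddings, positional information is added, the encoder processes the input, and the decoder uses the encoder’s output to predict tokens one by one. The vocabulary and layer sizes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 1000, 128, 10  # hypothetical sizes

embedding = nn.Embedding(vocab_size, d_model)   # tokens -> embeddings
pos_embedding = nn.Embedding(seq_len, d_model)  # encodes sequence order
transformer = nn.Transformer(d_model=d_model, nhead=8,
                             num_encoder_layers=2, num_decoder_layers=2,
                             batch_first=True)
to_logits = nn.Linear(d_model, vocab_size)      # scores for the next token

src = torch.randint(0, vocab_size, (1, seq_len))  # source token ids
tgt = torch.randint(0, vocab_size, (1, seq_len))  # target-so-far token ids
positions = torch.arange(seq_len).unsqueeze(0)

# Embed tokens and add positional information to preserve sequence order.
src_x = embedding(src) + pos_embedding(positions)
tgt_x = embedding(tgt) + pos_embedding(positions)

# Causal mask so each target position only sees earlier positions.
tgt_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
out = transformer(src_x, tgt_x, tgt_mask=tgt_mask)  # decoder attends to encoder output
logits = to_logits(out)                             # one distribution per position
next_token = logits[0, -1].argmax()                 # output tokens predicted one by one
```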

With this, we’ll meet again soon! 🙂

