Exploring Transformers and Diffusers

In this post, we will explore Transformers and Diffusers, two popular open-source Python libraries from Hugging Face. Transformers provides pre-trained models built on the transformer architecture, while Diffusers focuses on diffusion models for generative tasks.

Hugging Face Transformers Library

The Hugging Face Transformers library provides state-of-the-art pre-trained models for natural language processing (NLP), computer vision, and audio tasks. It supports popular transformer architectures like BERT, GPT, RoBERTa, ViT, and more.

Key features of the Transformers library include:

Thousands of pre-trained models that can be used for transfer learning or fine-tuning on downstream tasks
Interoperability between PyTorch, TensorFlow, and JAX frameworks
High-level APIs like pipeline() for easy inference on common tasks
Low-level APIs for more flexibility and customization
Detailed documentation, tutorials, and an active community

Installation

The Transformers library can be installed with pip. The examples below also need a deep learning backend such as PyTorch to be installed:

pip install transformers

Example: Text Classification

Here’s an example of using a pre-trained DistilBERT model, fine-tuned on SST-2 for sentiment analysis, for text classification with the pipeline API:

from transformers import pipeline

classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

result = classifier("I absolutely loved this movie! The acting was superb.")
print(result)

Output:

[{'label': 'POSITIVE', 'score': 0.9998801946640015}]
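
To make the `score` field less of a black box: for single-label classification, a result like the one above is typically obtained by applying a softmax to the model's raw output logits and reporting the highest-probability label. The sketch below reproduces only that post-processing step in plain Python. The logit values and the helper name `to_label_and_score` are made up for illustration and are not taken from the actual model.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def to_label_and_score(logits, labels):
    """Pick the most probable label and report its probability as the score."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}

labels = ["NEGATIVE", "POSITIVE"]
hypothetical_logits = [-4.3, 4.7]  # assumed values, not real model output
result = to_label_and_score(hypothetical_logits, labels)
print(result)
```

A large gap between the two logits is what produces a score very close to 1.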

Example: Question Answering

Here’s an example of using a pre-trained model for question answering:

from transformers import pipeline

qa_model = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

context = "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France."
question = "Where is the Eiffel Tower located?"

result = qa_model(question=question, context=context)
print(result)

Output (illustrative; the exact score depends on the model and library version):

{'score': 0.9940124392509461, 'start': 56, 'end': 69, 'answer': 'Champ de Mars'}
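
The `start` and `end` fields are character offsets into `context`, so slicing the context with them should give back the `answer` string. Here is a small plain-Python check of that relationship. It locates the span with `str.find` rather than running the model, and the helper name `span_of` is only for illustration.

```python
context = "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France."
answer = "Champ de Mars"

def span_of(answer, context):
    """Return (start, end) character offsets of answer inside context."""
    start = context.find(answer)
    if start == -1:
        raise ValueError("answer does not occur in context")
    return start, start + len(answer)

start, end = span_of(answer, context)
print(start, end)

# The end offset is exclusive, so a plain slice recovers the answer text
assert context[start:end] == answer
```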

The Transformers library provides a wide range of capabilities for NLP tasks. You can explore more examples and tutorials in the official documentation [4].

Hugging Face Diffusers Library

The Hugging Face Diffusers library focuses on diffusion models for generative tasks like image generation, audio generation, and even generating 3D structures of molecules. It provides pre-trained diffusion models, interchangeable noise schedulers, and modular components for building custom diffusion systems.

Key features of the Diffusers library include:

State-of-the-art diffusion pipelines for inference with just a few lines of code
Flexibility to balance trade-offs between generation speed and quality
Modular design for creating custom end-to-end diffusion systems
Integration with the Hugging Face Hub for sharing and discovering models
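
To give a feel for what a noise scheduler manages, here is a minimal plain-Python sketch of a linear beta schedule of the kind used in the original DDPM formulation. It does not use the Diffusers API. The function names and the specific values (1000 steps, betas from 1e-4 to 0.02) are assumptions chosen for illustration. The cumulative product of (1 - beta) tells us how much of the original signal survives after t noising steps.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Noise variances that grow linearly from beta_start to beta_end."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal variance kept at each step."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, 1)."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)   # fixed seed so the sketch is repeatable
x0 = [1.0, -1.0, 0.5]    # toy "image" with three values
for t in (0, 499, 999):
    print(t, round(alpha_bars[t], 4), add_noise(x0, t, alpha_bars, rng))
```

At the first step the sample is almost identical to the input, and at the last step it is almost pure noise. Generation runs this process in reverse, and visiting fewer timesteps is one way to trade quality for speed.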

Installation

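The Diffusers library can be installed with pip. The Stable Diffusion examples below also require PyTorch and the transformers package (used for the text encoder), and they assume a CUDA-capable GPU: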

pip install diffusers

Example: Text-to-Image Generation

Here’s an example of using a pre-trained Stable Diffusion model for text-to-image generation:

from diffusers import DiffusionPipeline
import torch

model_id = "runwayml/stable-diffusion-v1-5"
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_horse.png")

This code snippet loads the Stable Diffusion v1.5 model, moves it to the GPU, and generates an image based on the provided text prompt [1] [3].

Example: Image-to-Image Translation

Here’s an example of using a pre-trained model for image-to-image translation:

import torch
import requests
from io import BytesIO
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Download the input sketch and resize it
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))

# Image-to-image uses the dedicated img2img pipeline class, which accepts an input image
model_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "A fantasy landscape, trending on artstation"

# strength (0 to 1) sets how far the result may depart from the input image;
# guidance_scale sets how strongly the prompt steers the generation
images = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images
images[0].save("fantasy_landscape.png")

This code snippet loads an input image, resizes it, and then uses the Stable Diffusion model to generate a new image based on the input image and the provided text prompt [1] [3].
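
The two numeric arguments in the call above are easier to reason about with a small sketch. As best we understand current Diffusers releases, `strength` determines how many of the scheduled denoising steps actually run on the noised input image, and `guidance_scale` weights the text-conditioned noise prediction against the unconditional one (classifier-free guidance). The plain-Python functions below mirror that logic under those assumptions. The names `steps_actually_run` and `apply_guidance`, the choice of 50 inference steps, and the toy noise values are illustrative and not part of the library's API.

```python
def steps_actually_run(num_inference_steps, strength):
    """Number of denoising steps applied to the noised input image (assumed rule)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)

def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Classifier-free guidance: move from the unconditional prediction toward the text-conditioned one."""
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_text)]

# With 50 scheduled steps and strength=0.75, only part of the schedule is run
print(steps_actually_run(50, 0.75))

# Toy two-element "noise predictions"; real ones are large tensors
print(apply_guidance([0.0, 1.0], [1.0, 1.0], 7.5))
```

A strength near 0 leaves the input image almost unchanged, while a strength of 1 runs the full schedule and largely ignores it. A guidance scale of 1 reduces to the text-conditioned prediction alone.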

The Diffusers library provides a powerful toolset for generative tasks using diffusion models. You can explore more examples and tutorials in the official documentation [1] [11] [18].

In summary, the Hugging Face Transformers and Diffusers libraries are practical tools for working with state-of-the-art models in NLP, computer vision, and generative AI. They provide pre-trained models, easy-to-use APIs, and extensive documentation to help you get started quickly [4] [10] [12].

References

[1] towardsdatascience.com: Hugging Face Just Released the Diffusers Library
[2] microsoft.com: What are Hugging Face Transformers? - Azure Databricks
[3] learnopencv.com: Introduction to Hugging Face Diffusers
[4] freecodecamp.org: Hugging Face Transformer Library Overview
[5] philschmid.de: Hugging Face Transformers Examples
[6] datacamp.com: An Introduction to Using Transformers and Hugging Face
[7] youtube.com: Hugging Face Transformers Tutorial - Getting Started with NLP
[8] youtube.com: Hugging Face Transformers - Intro to the Library
[9] linkedin.com: How to Get Started with the Diffusers Library by Hugging Face: A Guide
[10] huggingface.co: Transformers Notebooks
[11] huggingface.co: Diffusers Documentation
[12] huggingface.co: Transformers Documentation
[13] huggingface.co: Transformers Documentation v4.15.0
[14] huggingface.co: Diffusers Training Overview
[15] huggingface.co: Transformers on the Hugging Face Hub
[16] github.com: Hugging Face Diffusers Repository
[17] huggingface.co: Diffusers Basic Training
[18] huggingface.co: Diffusers Documentation
[19] huggingface.co: Diffusers on the Hugging Face Hub
[20] huggingface.co: Diffusers Tutorials Overview

Assisted by claude-3-opus on perplexity.ai

Written on April 10, 2024