API Reference Bumblebee v0.7.0
Modules
High-level tasks related to audio processing.
Whisper model family.
Whisper featurizer for audio data.
An interface for configurable entities.
ControlNet model with two spatial dimensions and conditioning state.
Denoising diffusion implicit models (DDIMs).
Latent Consistency Model (LCM) sampling.
Pseudo numerical methods for diffusion models (PNDMs).
High-level tasks based on Stable Diffusion.
A CLIP-based model for detecting unsafe image content.
High-level tasks based on Stable Diffusion with ControlNet.
U-Net model with two spatial dimensions and conditioning state.
Variational autoencoder (VAE) with Kullback–Leibler divergence (KL) loss.
An interface for configuring and applying featurizers.
An interface for configuring and using logits processors.
An interface for configuring and building models based on the same architecture.
The BLIP model for text-image similarity.
The CLIP model for text-image similarity.
LayoutLM model family.
An interface for configuring and using schedulers.
High-level tasks related to text processing.
ALBERT model family.
BART model family.
BERT model family.
Blenderbot model family.
The BLIP model for text encoding.
The CLIP model for text encoding.
DistilBERT model family.
Gemma model family.
Gemma 3 model family.
An interface for language models supporting sequence generation.
A set of configuration options controlling text generation.
GPT-2 model family.
GPT-BigCode model family.
GPT-NeoX model family.
LLaMA model family.
M2M100 model family.
mBART model family.
Mistral model family.
ModernBERT model family.
ModernBERT Decoder model family.
MPNet model family.
Nomic BERT model family.
Phi model family.
Phi-3 model family.
Wraps a pre-trained tokenizer from the Tokenizers library.
Qwen3 model family.
RoBERTa model family.
SmolLM3 is a 3B-parameter language model designed to push the boundaries of small models. It supports dual-mode reasoning, six languages, and long context. SmolLM3 is a fully open model that offers strong performance at the 3B–4B scale.
T5 model family.
A set of Whisper-specific configuration options controlling text generation.
An interface for configuring and applying tokenizers.
High-level tasks related to vision.
BiT featurizer for image data.
BLIP featurizer for image data.
The BLIP model for image encoding.
CLIP featurizer for image data.
The CLIP model for image encoding.
ConvNeXT model family.
ConvNeXT featurizer for image data.
DeiT model family.
DeiT featurizer for image data.
DINOv2 model family.
ResNet model family.
Swin Transformer model.
ViT model family.
ViT featurizer for image data.