The model above is a Stable Diffusion Image Variations model fine-tuned to accept multiple CLIP image embeddings as input. Users can combine the embeddings of several images to mix their concepts, and can add text concepts for greater variation. It runs locally or on Lambda GPU Cloud and produces a 640×640 image.
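To make the idea of combining several embeddings concrete, here is a minimal sketch in plain Python. It is not the model's actual code. It assumes that each input (image or text) yields one CLIP embedding, that each embedding is scaled by a user-chosen strength, and that the scaled embeddings are stacked as a sequence of conditioning tokens. The names `fake_clip_embedding`, `build_conditioning`, and `EMBED_DIM`, the width of 768, and the use of random vectors in place of a real CLIP encoder are all hypothetical and chosen only for illustration.

```python
import random

EMBED_DIM = 768  # assumed CLIP embedding width; hypothetical for this sketch


def fake_clip_embedding(seed, dim=EMBED_DIM):
    """Stand-in for a real CLIP encoder: returns a unit-length random vector."""
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec]


def build_conditioning(embeddings, strengths):
    """Scale each embedding by its strength and stack the results as tokens.

    Assumption: the model consumes one conditioning token per input concept.
    """
    if len(embeddings) != len(strengths):
        raise ValueError("need exactly one strength per embedding")
    return [[s * x for x in emb] for emb, s in zip(embeddings, strengths)]


# Two image concepts and one text concept, each represented by a placeholder vector.
image_a = fake_clip_embedding(0)
image_b = fake_clip_embedding(1)
text_c = fake_clip_embedding(2)

# Higher strength means the concept should weigh more in the mix.
conditioning = build_conditioning([image_a, image_b, text_c], [1.0, 0.5, 0.25])
print(len(conditioning), len(conditioning[0]))  # prints "3 768"
```

In a real pipeline the placeholder vectors would come from a CLIP image or text encoder, and the stacked result would be passed to the diffusion model as its conditioning input. Setting a strength to 0 effectively removes that concept from the mix.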



