X/Y plot of algorithmically generated AI art of a European-style castle in Japan, demonstrating DDIM diffusion steps
An X/Y plot of algorithmically generated AI artworks depicting a European-style castle in Japan, created with the Stable Diffusion V1-5 diffusion model. The plot demonstrates the U-Net denoising process under the DDIM sampling method. Diffusion models generate images by repeatedly removing Gaussian noise, step by step, and then decoding the denoised latent into pixel space. Shown here is a subset of the steps within a 40-step generation process.
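The step-by-step denoising described above can be sketched as a toy deterministic DDIM loop. This is a minimal illustration, not Stable Diffusion's actual pipeline: the linear beta schedule, the latent shape, and the stand-in noise predictor are all illustrative assumptions replacing the real U-Net.

```python
import numpy as np

def ddim_denoise(x_T, predict_noise, alphas_cumprod, timesteps):
    """Deterministic DDIM (eta = 0): repeatedly remove predicted noise.

    x_T:            initial Gaussian-noise latent
    predict_noise:  stand-in for the U-Net noise predictor
    alphas_cumprod: cumulative alpha-bar schedule, one entry per training step
    timesteps:      decreasing subset of training steps to visit (e.g. 40 of 1000)
    """
    x = x_T
    for i, t in enumerate(timesteps):
        a_t = alphas_cumprod[t]
        # alpha-bar of the next (less noisy) step; 1.0 once fully denoised
        a_prev = alphas_cumprod[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = predict_noise(x, t)
        # Predicted fully-denoised sample x0 from the current noisy latent
        x0 = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        # Re-noise x0 down to the previous timestep's noise level
        x = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x

# Toy setup: linear beta schedule and a dummy "noise predictor".
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
x_T = rng.standard_normal((4, 64, 96))            # a latent, not pixels
dummy_unet = lambda x, t: 0.1 * x                 # placeholder predictor
timesteps = np.linspace(999, 0, 40).astype(int)   # visit 40 of 1000 steps
out = ddim_denoise(x_T, dummy_unet, alphas_cumprod, timesteps)
```

In the real pipeline the final denoised latent is then decoded into pixel space by the VAE decoder.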
- Procedure/Methodology
These images were generated on an NVIDIA RTX 4090. Since Ada Lovelace GPUs (compute capability 8.9, which requires CUDA 11.8) are not yet fully supported by the PyTorch dependency libraries currently used by Stable Diffusion, I used a custom build of xformers, along with PyTorch cu116 and cuDNN v8.6, as a temporary workaround. The front-end used for the entire generation process was the Stable Diffusion web UI created by AUTOMATIC1111.
A batch of 512x768 images was generated with txt2img using the following prompts:
Prompt: a (european castle:1.3) in japan. by Albert Bierstadt, ray traced, octane render, 8k
Negative prompt: None
Settings: Sampler: DDIM, CFG scale: 7, Size: 512x768
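The CFG scale of 7 in the settings above controls classifier-free guidance: at each step the sampler blends an unconditional noise prediction with the text-conditioned one. A minimal sketch of that arithmetic follows; `apply_cfg` is an illustrative helper name, and the toy arrays stand in for real U-Net outputs.

```python
import numpy as np

def apply_cfg(eps_uncond, eps_cond, guidance_scale):
    # Classifier-free guidance: push the noise prediction away from the
    # unconditional estimate, toward the text-conditioned estimate.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy example with guidance scale 7, as used for this plot.
eps_u = np.zeros((2, 2))   # stand-in unconditional prediction
eps_c = np.ones((2, 2))    # stand-in text-conditioned prediction
guided = apply_cfg(eps_u, eps_c, 7.0)
```

Higher scales follow the prompt more literally at the cost of image variety; a scale of 1 reduces to the purely text-conditioned prediction.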
While this batch was being generated, the X/Y plot was produced using the "X/Y plot" txt2img script with the following settings:
- X-axis:
Steps: 1, 2, 3, 5, 8, 10, 15, 20, 30, 40
- Y-axis: None
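Each X-axis entry above is a DDIM step count, and each count selects an evenly spaced subset of the model's 1000 training timesteps to visit. The "leading" spacing sketched below is one common convention and an assumption on my part, not necessarily the web UI's exact schedule.

```python
def ddim_timesteps(num_steps, num_train_timesteps=1000):
    # Evenly spaced subset of training timesteps, visited from most
    # noisy (high t) to least noisy (t = 0).
    stride = num_train_timesteps // num_steps
    return list(range(0, num_train_timesteps, stride))[::-1]

# Fewer steps means larger jumps along the same noise schedule:
for n in (1, 2, 5, 40):
    print(n, ddim_timesteps(n))
```

This is why the low-step columns of the plot look blurry or unresolved: with only 1 or 2 visits, each DDIM update must remove a very large amount of noise at once.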
- Output images
As the creator of the output images, I release this image under the license displayed in the template below.
- Stable Diffusion AI model
The Stable Diffusion AI model is released under the CreativeML OpenRAIL-M License, which "does not impose any restrictions on reuse, distribution, commercialization, adaptation" as long as the model is not intentionally used to cause harm to individuals, for instance to deliberately mislead or deceive. As stipulated by the license, the authors of the AI model claim no rights over any generated image outputs.
- Addendum on datasets used to teach AI neural networks