When an A.I. model is trained to create images from text, it uses a huge dataset of images and their corresponding captions. During training, the model is shown each caption and tries to recreate the associated image as closely as possible. In the process it learns general concepts that appear across millions of images, such as what humans look like, as well as more specific and more uniquely identifiable details, such as textures, environments, poses and compositions.
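To make the "show the caption, try to recreate the image" idea concrete, here is a deliberately tiny sketch in Python. It is not how production image generators work: a simple linear model stands in for the real network, the four-pixel "images" and their captions are made up, and every name in it (`DATASET`, `encode`, `train` and so on) is hypothetical. It only illustrates the loop of predicting an image from a caption, measuring the error, and nudging the model to shrink that error.

```python
# Toy illustration only: a linear "caption -> image" model trained by
# gradient descent on reconstruction error. Real text-to-image systems
# are vastly larger and differ in architecture and training objective.

# Hypothetical dataset: each caption is paired with a 2x2 grayscale image,
# flattened to [top-left, top-right, bottom-left, bottom-right].
DATASET = [
    ("bright top", [1.0, 1.0, 0.0, 0.0]),
    ("bright bottom", [0.0, 0.0, 1.0, 1.0]),
    ("bright left", [1.0, 0.0, 1.0, 0.0]),
    ("bright right", [0.0, 1.0, 0.0, 1.0]),
]


def build_vocab(data):
    """Collect every distinct caption word, in a stable order."""
    return sorted({word for caption, _ in data for word in caption.split()})


def encode(caption, vocab):
    """Turn a caption into a bag-of-words vector (1.0 if the word is present)."""
    words = caption.split()
    return [1.0 if word in words else 0.0 for word in vocab]


def predict(weights, features):
    """weights[i][j] is the contribution of vocabulary word i to pixel j."""
    n_pixels = len(weights[0])
    return [
        sum(f * row[j] for f, row in zip(features, weights))
        for j in range(n_pixels)
    ]


def mean_squared_error(weights, data, vocab):
    """Average squared pixel difference between predictions and true images."""
    total, count = 0.0, 0
    for caption, image in data:
        predicted = predict(weights, encode(caption, vocab))
        total += sum((p - t) ** 2 for p, t in zip(predicted, image))
        count += len(image)
    return total / count


def train(data, vocab, steps=200, learning_rate=0.1):
    """Show the model each caption and adjust it toward the paired image."""
    n_pixels = len(data[0][1])
    weights = [[0.0] * n_pixels for _ in vocab]
    for _ in range(steps):
        for caption, image in data:
            features = encode(caption, vocab)
            predicted = predict(weights, features)
            for i, f in enumerate(features):
                for j in range(n_pixels):
                    # Gradient of 0.5 * (predicted - target)^2 w.r.t. weights[i][j].
                    weights[i][j] -= learning_rate * (predicted[j] - image[j]) * f
    return weights


if __name__ == "__main__":
    vocab = build_vocab(DATASET)
    weights = train(DATASET, vocab)
    print("error after training:", mean_squared_error(weights, DATASET, vocab))
    print("image for 'bright top':", predict(weights, encode("bright top", vocab)))
```

Even in this toy, the trained weights end up encoding what each caption word tends to look like in pixel terms, which is a very rough analogue of a large model picking up concepts and details from its training images.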