Wk4 - AI Generated Images
Richard Lui
AI-Generated Art
https://indianexpress.com/article/explained/explained-culture/artificial-intelligence-generated-artwork-top-prize-us-art-competition-8129855
http://startupbeat.hkej.com/?p=123546
• Jason Allen, a media artist, used the AI software Midjourney to create art pieces
for a competition.
• He generated over a hundred images with Midjourney, then edited and refined them
using Adobe Photoshop and Gigapixel software.
• From this collection he selected three artworks, printed them on canvas, and
submitted them to the competition.
• One of his AI-generated artworks, "Théâtre D’opéra Spatial," won first place in the
digital art category of the Colorado State Fair Fine Arts Competition.
• He shared his creative process and inspiration for the work on the
Midjourney Discord channel.
• The win gained widespread attention on social media platforms such as Twitter and Reddit.
• Some argued that using AI in art competitions was unfair to human artists, while
others pointed out that Allen had disclosed his use of AI in his works.
• Digital artist Gokul Pillai shared images created using Midjourney on Instagram.
• The images reimagine some of the world's richest people as individuals who look as if
they are living in poverty.
• He was inspired by the movie Slumdog Millionaire and wanted to create a humorous
contrast between the billionaires and the reality of poverty.
• The images went viral on social media and sparked many reactions from viewers.
https://edition.cnn.com/style/article/japan-first-ai-generated-manga-art-intl-hnk/index.html
Midjourney
• One of the leading tools for creating high-quality images
• Uses a proprietary machine learning model
• How far Midjourney has come in just a year
DALL-E 2
• A text-to-image model developed by OpenAI
• Integrated into Bing Chat
Stable Diffusion
• A deep learning text-to-image model that can generate realistic and detailed
images based on natural language descriptions
• An open-source project whose source code is freely available
• Runs on most consumer hardware with a GPU and does not require a cloud service
• Versions
  • V1.4, August 2022
  • V1.5, October 2022
  • V2.0, November 2022
  • V2.1, December 2022
  • SDXL 1.0, July 2023
Prompt:
A photorealistic hello kitty wearing a yellow t-shirt playing
guitar in a forest in front of a lake, rainy day
Model: Stable Diffusion XL
Stable Diffusion Models
Prompt:
A puppy in space
Stable Diffusion Keywords
• Words that can change the style, format, or perspective of the image
• “Magic words” that may improve image quality
  • E.g. highly detailed, professional, UHD, 64K
• Give your character a name (why?)
• https://www.behindthename.com/random/
Example prompts:
• A photo of a young woman
• A photo of a young woman, cinematic, outdoor, sun, top photographer
• A digital illustration of a young woman, cinematic, outdoor, sun, anime style
• A young woman, cinematic, outdoor, sun
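One way to compare the effect of style keywords is to hold the subject fixed and vary only the modifiers. The sketch below assumes a hypothetical helper, `with_keywords`, which is not part of any Stable Diffusion tool; it only assembles the prompt text.

```python
from typing import Sequence


def with_keywords(subject: str, keywords: Sequence[str]) -> str:
    """Append comma-separated style keywords to a base subject."""
    parts = [subject.strip()]
    # Skip empty entries so stray commas never end up in the prompt.
    parts.extend(k.strip() for k in keywords if k.strip())
    return ", ".join(parts)


base = "A photo of a young woman"
variants = [
    with_keywords(base, []),
    with_keywords(base, ["cinematic", "outdoor", "sun", "top photographer"]),
]
for prompt in variants:
    print(prompt)
```

Generating each variant with the same model and settings makes it easier to see which keywords actually change the result.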
Model: Stable Diffusion XL
In the style of …
• Generates images influenced by a specific artistic style
• The style can come from a painter, a photographer, a filmmaker, …
Prompt Engineering for Stable Diffusion
A well-written prompt combines descriptive modifiers with a clear sentence structure.
Example
Do you want a photo or a painting? => painting
What’s the subject? Person, animal, landscape => a boy
What details do you want to add? => wearing a suit
Special lighting? Soft, ambient, ring light, neon => natural light
Environment? Indoor, outdoor, underwater, in space => grassland with a hill
Color scheme? Vibrant, dark, pastel => with bright colors
A specific art style? => Art by Masashi Kishimoto
Prompt:
A painting of a boy, wearing a suit, natural light, grassland with
a hill, with bright colors, Art by Masashi Kishimoto
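The question-by-question recipe above can be sketched as a small function that joins the answers in order. The function name `build_prompt` and its parameter names are assumptions made for illustration; they do not come from any Stable Diffusion tool.

```python
def build_prompt(
    medium: str,
    subject: str,
    details: str = "",
    lighting: str = "",
    environment: str = "",
    color_scheme: str = "",
    art_style: str = "",
) -> str:
    """Join the answers to the prompt questions into one comma-separated prompt."""
    head = f"A {medium} of {subject}"
    modifiers = [details, lighting, environment, color_scheme, art_style]
    # Unanswered questions (empty strings) are simply left out.
    return ", ".join([head] + [m for m in modifiers if m])


print(
    build_prompt(
        medium="painting",
        subject="a boy",
        details="wearing a suit",
        lighting="natural light",
        environment="grassland with a hill",
        color_scheme="with bright colors",
        art_style="Art by Masashi Kishimoto",
    )
)
```

Keeping each answer in its own slot makes it easy to change one aspect, such as the lighting, while leaving the rest of the prompt untouched.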
Prompt Guidance
• Controls how closely the generated image follows the prompt
• Higher values make the image follow the prompt more closely
• If the value is too high, the image may become distorted
• When the guidance scale is set to 0, the text prompt may be ignored, and the
image may become more creative
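Stable Diffusion implementations typically realise the guidance scale through classifier-free guidance: at each denoising step the model makes one prediction without the prompt and one with it, and the scale decides how far to move from the first towards (and past) the second. The sketch below uses short made-up vectors in place of real model predictions, so the numbers are assumptions for illustration only. Some tools also interpret very low scale values differently.

```python
from typing import List, Sequence


def apply_guidance(
    uncond: Sequence[float], cond: Sequence[float], scale: float
) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy stand-ins for the model's two predictions (not real model output).
uncond = [0.0, 1.0]  # prediction with an empty prompt
cond = [1.0, 3.0]    # prediction with the text prompt

for scale in (0, 1, 7):
    print(scale, apply_guidance(uncond, cond, scale))
```

With a scale of 0 the result equals the unconditional prediction, which is why the prompt may be ignored; large scales push the result far beyond the conditional prediction, which is one intuition for why very high values can distort the image.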
Prompt:
A photorealistic hello kitty wearing a yellow t-shirt playing
guitar in a forest in front of a lake, rainy day
Model: Stable Diffusion 1.5, Prompt Guidance: 2
Model: Stable Diffusion 1.5, Prompt Guidance: 7
Model: Stable Diffusion 1.5, Prompt Guidance: 30
Negative Prompt
• Specify what you don't want to see in the generated images
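In many Stable Diffusion implementations the negative prompt takes the place of the empty prompt as the baseline that guidance moves away from. The sketch below illustrates that idea with made-up two-element vectors (a "forest" component and a "blur" component); the function name `guide_with_negative` and all values are assumptions for illustration, not real model output.

```python
from typing import List, Sequence


def guide_with_negative(
    baseline: Sequence[float], positive: Sequence[float], scale: float
) -> List[float]:
    """Move from the baseline prediction towards the positive-prompt prediction."""
    return [b + scale * (p - b) for b, p in zip(baseline, positive)]


# Toy predictions laid out as [forest, blur]; the values are invented.
positive = [1.0, 0.25]  # prediction for the prompt itself
empty = [0.0, 0.0]      # baseline when no negative prompt is given
negative = [0.0, 1.0]   # baseline for a negative prompt such as "blurry"

print(guide_with_negative(empty, positive, 4))
print(guide_with_negative(negative, positive, 4))
```

In this toy case the "forest" component is the same in both runs, while the "blur" component is pushed below zero once the negative prompt supplies the baseline.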
Example Prompt:
peaceful elven forest, thick forest, large living trees are visible in
the background, by alan lee, michal karcz, smooth details, lord
of the rings, game of thrones, smooth, detailed terrain, oil
painting, trending artstation, concept, fantasy matte painting
https://getimg.ai/guides/guide-to-negative-prompts-in-stable-diffusion
Image to image (Img2Img)
• Generates new images from an input image and a corresponding text prompt
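Img2Img pipelines commonly work by adding noise to the input image and then denoising it under the text prompt. Many tools expose a `strength` setting for this; the slides do not mention it, so treat it as an assumption here. The sketch shows how such a setting could decide how much of the denoising schedule is rerun, using a hypothetical helper `img2img_steps`.

```python
from typing import Tuple


def img2img_steps(num_steps: int, strength: float) -> Tuple[int, int]:
    """Return (first_step, steps_to_run) for a given strength between 0 and 1.

    strength 0 keeps the input image as-is (no steps run);
    strength 1 reruns the whole schedule, so the input is mostly ignored.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_to_run = min(int(num_steps * strength), num_steps)
    first_step = num_steps - steps_to_run
    return first_step, steps_to_run


print(img2img_steps(40, 0.75))
```

A lower strength keeps more of the original composition; a higher strength gives the prompt more influence.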
Prompt:
building with trees, university campus
Model: Stable Diffusion 1.5
Filter: ReVAnimated
Prompt:
Grassland with mountain and river,
photorealism, highly detailed
Inpainting
• Fill in some part of an image that is missing or has been removed.
Prompt:
A tree with red leaves
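Inpainting is usually driven by a mask that marks the region to fill. The sketch below shows only the final compositing idea, assuming the generated content already exists: masked pixels come from the generated image and all others are kept from the original. Images are tiny grayscale grids of invented values, and `composite` is a hypothetical helper.

```python
from typing import List

Image = List[List[float]]


def composite(original: Image, generated: Image, mask: Image) -> Image:
    """Blend per pixel: mask 1.0 takes the generated value, 0.0 keeps the original."""
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(row_o, row_g, row_m)]
        for row_o, row_g, row_m in zip(original, generated, mask)
    ]


original = [[0.25, 0.25], [0.25, 0.25]]
generated = [[0.75, 0.75], [0.75, 0.75]]
mask = [[0.0, 1.0], [0.0, 0.0]]  # only the top-right pixel is to be filled

for row in composite(original, generated, mask):
    print(row)
```

Fractional mask values would blend the two images, which is one way to soften the seam around the filled region.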
Outpainting
• Extend the boundaries of an image by adding more pixels that are consistent with
the original image
Prompt:
A lake with duck
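Outpainting can be thought of as inpainting on an enlarged canvas: the original pixels are kept, and a mask marks the new area for the model to fill. The sketch below prepares such a canvas and mask for an extension to the right; `extend_canvas` is a hypothetical helper and the image is a tiny grayscale grid of invented values.

```python
from typing import List, Tuple


def extend_canvas(
    image: List[List[float]], pad_right: int, fill: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Widen the image by pad_right columns and mark the new pixels with 1."""
    canvas = [row + [fill] * pad_right for row in image]
    mask = [[0] * len(row) + [1] * pad_right for row in image]
    return canvas, mask


canvas, mask = extend_canvas([[0.5, 0.5], [0.5, 0.5]], pad_right=2)
print(canvas)
print(mask)
```

The masked columns would then be generated from the prompt while staying consistent with the untouched original pixels.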
ControlNet
• Provides more precise and versatile control over image generation
• Applications
  • Human pose to image generation, black-and-white artwork colorization, architectural
  rendering, design brainstorming, storyboarding, etc.
ControlNet Preprocessors
OpenPose: Extracts human poses and facial expressions.
A pose graph is a representation of the human body using a set of keypoints and limbs.
• Each keypoint corresponds to a body part, such as the head, shoulders, elbows, wrists, etc.
• Each limb connects two keypoints with a colored line segment
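A pose graph of this kind can be sketched as a dictionary of keypoints plus a list of limbs that connect them. The keypoint names and coordinates below are illustrative assumptions, not the exact keypoint set that OpenPose outputs.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Illustrative keypoints as (x, y) pixel positions; not the real OpenPose layout.
KEYPOINTS: Dict[str, Point] = {
    "head": (50, 10),
    "neck": (50, 25),
    "right_shoulder": (35, 30),
    "right_elbow": (30, 50),
    "right_wrist": (28, 70),
}

# Each limb connects two keypoints.
LIMBS: List[Tuple[str, str]] = [
    ("head", "neck"),
    ("neck", "right_shoulder"),
    ("right_shoulder", "right_elbow"),
    ("right_elbow", "right_wrist"),
]


def limb_segments(
    keypoints: Dict[str, Point], limbs: List[Tuple[str, str]]
) -> List[Tuple[Point, Point]]:
    """Return one line segment per limb, skipping limbs with a missing keypoint."""
    return [
        (keypoints[a], keypoints[b])
        for a, b in limbs
        if a in keypoints and b in keypoints
    ]


for start, end in limb_segments(KEYPOINTS, LIMBS):
    print(start, "->", end)
```

Drawing each segment in its own color would give the kind of stick-figure image that is passed to ControlNet as the pose condition.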
Canny: Extracts the edges (outlines) of an image. It is useful for retaining the composition of the original image.
Depth: Estimates depth information from the reference image (how far or close things are in the original image).
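To give a feel for what an edge preprocessor produces, the sketch below marks pixels where brightness changes sharply. It is a deliberately simplified gradient threshold, not the Canny algorithm itself, which also applies smoothing, non-maximum suppression, and hysteresis thresholding. The function name `edge_map` and the tiny test image are assumptions for illustration.

```python
from typing import List


def edge_map(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Mark a pixel as an edge (1) when brightness jumps to its right or below."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Forward differences; pixels on the last row/column have no neighbor.
            gx = gray[y][x + 1] - gray[y][x] if x + 1 < width else 0
            gy = gray[y + 1][x] - gray[y][x] if y + 1 < height else 0
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
for row in edge_map(image, threshold=5):
    print(row)
```

The resulting map keeps only the outline between the two regions, which is the kind of information ControlNet uses to preserve the composition of the reference image.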
Prompt:
A pretty girl holding a banner "I love PolyU"
References:
1. Thought the Pope Francis puffer photo was real? Here's how not to get fooled by AI next time
2. Why AI art struggles with hands
What’s next?
https://openai.com/dall-e-3