
AI’s New Canvas: The Visual Creators in the Zoo
A photorealistic astronaut riding a whale through a swirling nebula. A technical schematic for a product that doesn’t exist yet. A dozen logo concepts that perfectly capture a brand’s ethos. A year ago, these images required skilled artists and significant budgets. Today, they can be generated in seconds from a single line of text.
This explosion in AI image generation represents a fundamental shift in visual creation. But with a new model or tool announced seemingly every week, knowing where to start can feel daunting.
Last month, in our “Who’s Who in the Zoo” series, we worked with machines to write; this month, we work with them to create images. We’ll get straight to what matters: how these systems work, which platforms are right for you, and the practical techniques to turn a simple prompt into an image. We’ll even see how TechArena’s own Rachel Horton uses these tools to build our visuals.
Let’s demystify how these tools create images from words. Most use a clever technique where they learn to create by learning to deconstruct. During training, the AI takes a clear picture and adds layers of random “static” until the original is completely gone. By doing this millions of times, it becomes an expert at reversing the process. When you give it a prompt, the AI starts with a fresh canvas of pure static. Then, guided by your description, it skillfully cleans away that static step-by-step to reveal your final image.
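To make that intuition concrete, here’s a toy sketch in Python using nothing but NumPy. It is purely illustrative: the stand-in “denoiser” below does almost nothing, whereas a real diffusion model uses a trained neural network, guided by your prompt, to predict and subtract the noise at each step.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def add_noise(image, num_steps=1000, noise_scale=0.05):
    """Forward process: bury a clean image under layers of random static."""
    noisy = image.copy()
    for _ in range(num_steps):
        noisy += rng.normal(0.0, noise_scale, size=image.shape)
    return noisy  # after enough steps, the original is effectively gone

def generate(shape, denoise_step, num_steps=1000):
    """Reverse process: start from pure static and clean it up step by step."""
    canvas = rng.normal(0.0, 1.0, size=shape)   # a fresh canvas of static
    for step in reversed(range(num_steps)):
        canvas = denoise_step(canvas, step)     # remove a little noise each step
    return canvas

# Stand-in "denoiser" for illustration only; a real model is a neural network
# trained on millions of noising examples and conditioned on your text prompt.
toy_denoiser = lambda canvas, step: canvas * 0.995
image = generate(shape=(64, 64), denoise_step=toy_denoiser)
```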
A few terms to help you speak the language:
- Prompt: The text description of what you want.
- Negative prompt: A way to tell the model what to avoid.
- Aspect ratio: Controls the dimensions of the image, like 1:1 for a square or 16:9 for widescreen.
- Seed: A number that sets the starting noise pattern; fixing the seed lets you reproduce a result.
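To see how those knobs show up in practice, here’s a minimal sketch using the open-source Hugging Face diffusers library, one common way to drive Stable Diffusion (which we’ll meet shortly) from Python. The checkpoint name and settings are just examples, and running it assumes a machine with a CUDA GPU:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Example public checkpoint; assumes a CUDA GPU with enough memory.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a photorealistic astronaut riding a whale through a nebula",
    negative_prompt="blurry, low quality, watermark",   # what to avoid
    width=1344, height=768,                             # roughly 16:9 widescreen
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed = reproducible
).images[0]
image.save("astronaut.png")
```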
With that in mind, let’s meet a few of the key players in the image gen field.
Midjourney—The Artist
When you see stunning AI art on social media, there’s a good chance it’s from Midjourney. This tool is known for creating beautiful, artistic images with rich colors, interesting light, and a dreamy feel. Because its default style is so artistic, it’s fantastic for fantasy scenes and concept art, but might be too dramatic for simple, photo-realistic pictures.
Tips for Use:
Start with evocative prompts. Because the model thrives on mood and texture, think like a novelist: “misty morning in a cyberpunk Tokyo alley, neon signs reflecting off wet cobblestones.”
Don’t expect perfection on the first try. Pick the best thumbnail, upscale it, then click “Vary” to explore subtle changes. Iteration is the secret.
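You can also steer Midjourney with parameters appended to the end of a prompt. A couple of the most useful ones, with example values (check Midjourney’s documentation for the full list):

```
misty morning in a cyberpunk Tokyo alley, neon signs reflecting
off wet cobblestones --ar 16:9 --seed 42 --no cars
```

Here --ar sets the aspect ratio, --seed fixes the starting noise so you can reproduce a result, and --no acts as a lightweight negative prompt.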
OpenAI’s DALL-E 3—The Literal Interpreter
If Midjourney is an impressionist painter, OpenAI’s DALL-E 3 is a technical illustrator. The latest version integrates directly with ChatGPT, allowing users with ChatGPT Plus or Team accounts to create images through chat. Compared with earlier models, DALL-E 3 has been engineered to better understand nuanced prompts and adhere closely to complex descriptions. It can render legible text within images and follow detailed instructions, making it useful for infographics and diagrams.
Tips for Use:
Chat with it like an art director. DALL-E 3’s biggest strength is its integration with ChatGPT, allowing it to understand natural language and conversational requests. Instead of just a list of keywords, describe exactly what you want in full sentences.
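The chat integration needs no code at all, but DALL-E 3 is also reachable programmatically. Here’s a minimal sketch using OpenAI’s official Python SDK, assuming you have an API key set in your environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",
    prompt=(
        "A clean, flat-design infographic of the water cycle with "
        "labeled arrows for evaporation, condensation, and precipitation"
    ),
    size="1792x1024",    # widescreen; 1024x1024 and 1024x1792 also work
    quality="standard",  # or "hd" for finer detail
    n=1,                 # DALL-E 3 generates one image per request
)
print(response.data[0].url)  # temporary URL to download the image
```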
Stable Diffusion—The Open-Source Powerhouse
Stable Diffusion is the engine powering many image generation websites and desktop apps. It’s a family of open-source models capable of generating realistic images. Being open source means anyone can run the model on their own hardware, modify it, or train it further. That openness comes with tradeoffs. The original model was trained on images scraped from the public internet, which raises important questions around copyright. Its outputs sometimes distort hands, and it struggles to render legible text. Running the model locally also requires a powerful GPU and some technical know-how.
Tips for Use:
Try a hosted service with preset configurations, such as Leonardo.ai, before installing the model locally. These services provide user-friendly frontends with sensible defaults.
Experiment with different checkpoints or fine-tune a small model to create a custom look; the sketch below shows how little code a checkpoint swap takes.
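Swapping checkpoints in diffusers is essentially a one-line change. The model IDs below are public examples from the Hugging Face Hub; substitute whichever checkpoints suit your style (again assuming a CUDA GPU):

```python
import torch
from diffusers import AutoPipelineForText2Image

# Two publicly available checkpoints with noticeably different looks.
checkpoints = [
    "stabilityai/stable-diffusion-2-1",
    "stabilityai/stable-diffusion-xl-base-1.0",
]

for model_id in checkpoints:
    pipe = AutoPipelineForText2Image.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    image = pipe(prompt="a watercolor fox in an autumn forest").images[0]
    image.save(f"{model_id.split('/')[-1]}.png")  # one file per checkpoint
```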
Adobe Firefly—Brand-Safe by Design
Among the crowded field of image generators, Adobe’s Firefly takes a different path: safety. Adobe says Firefly’s generative assets are trained on properly licensed content, including media from Adobe Stock and public domain material. According to Adobe, this means you can use Firefly images in commercial projects without worrying about copyright infringement. Firefly is a collection of models available both online and within Creative Cloud applications.
Tips for Use:
Use Firefly inside the apps you already know: in Photoshop, try Generative Fill to extend or clean up a photo.
Turn to Firefly when you need peace of mind about licensing, such as images for marketing campaigns or customer collateral.
An Editorial Perspective: Bringing Images to Life
Our own TechArena editorial director, Rachel Horton, uses Midjourney and Sora daily to generate images for articles. Rather than mastering new prompt craft every day, Rachel uses Gemini and Claude to write the image prompts for her: she feeds the article into the AI and asks it to generate five Midjourney and five Sora prompts using the latest best practices in prompt craft. From there, she reads through the prompts, revises them as she sees fit, and creates two images at a time, placing her favorite Sora prompt into Sora and her favorite Midjourney prompt into Midjourney, then choosing whichever result she likes better. If neither works, she iterates on the prompt behind the image she prefers or falls back to her second-favorite ideas from Claude or Gemini.
Conclusion: Create Responsibly
The visual side of generative AI is amazing. Within seconds, we can create scenes that once required teams of artists. But these are still technical tools. The fun comes from combining your imagination with each model’s capabilities—whether you’re working on a specific composition or simply illustrating a blog post without searching stock sites for hours.
As you set off on your own image generation adventure, start simple. Play with prompts. Try different models. Learn where to place your energy and what you can ignore. Most importantly, remember the ethics. Some models are trained on scraped data, raising legitimate copyright and consent issues. Others, like Firefly, try to alleviate those concerns. Always respect the rights of creators, avoid generating images of real people or misleading edits, and be honest with your audience about how your visuals were made.
Next month, we’ll continue our AI exploration with a look at agents. Until then, stay curious, and keep experimenting. The AI Zoo is growing!