
AI SPECIAL 

Using AI for image creation & editing

Much of publishers’ initial interest in AI was text-based, but AI’s potential to make a huge impact on image creation and editing soon became apparent. Derek Milne, commercial pixometrist at Pixometry, looks at what’s possible.

By Derek Milne


Q: How are publishers currently using AI with images?

A: Image generation is one of the more tangible realisations of the capabilities of artificial intelligence. The ability to create stunning images in seconds, add precise metadata and manipulate the content with words offers a transformative approach for publishers.

Understandably, there’s considerable excitement around this topic, but what aspects are realistic and, most importantly, beneficial to publishers?

AI image generation technology can be divided into two categories: attention-grabbing image creation and less conspicuous, yet often more valuable, processes essential for everyday operations.

Visual Nirvana

There are several AI image generation engines available, including Adobe Firefly, DALL-E, Midjourney and Stable Diffusion. These platforms, which seem to improve weekly, are leading the charge in technology development.

These engines operate on text prompts, meaning users create images by simply describing their desired scene, such as “a nonchalant cat holding a big fish” or “a 70s style model”. The process is really that straightforward.

Regarding the engines themselves, Firefly and DALL-E are significantly more accessible for everyday use. Midjourney and Stable Diffusion require varying levels of advanced technical knowledge but yield more artistic results.

Images are created quickly, with each prompt typically generating 3-4 variations that can be somewhat similar or wildly different in style and composition. Still, it’s not all about glamorous cover shots; these engines also create mouth-watering food images, breathtaking scenery, conceptual ideas, logos and more. The only limit is the prompter’s imagination and command of the English language.

As one-off pieces of art, the results are perfect for cover shots, hero images, supporting visuals and so on. However, none of the AI engines can currently repeat an image style, so it’s impossible to get a thematic collection of images to use across a story. Additionally, unlike working with a traditional imaging department, it’s not possible to review and change individual elements in an image; the process generates new variations every time.

All four engines offer impressive results. However, there is a standout tool in Firefly that is arguably the most beneficial and practical interpretation of AI image generation.

Photoshop’s ‘generative fill’ feature, which is powered by Firefly, extends the background of an image with content that blends naturally with the existing scene. Imagine a portrait-oriented photo of the king and queen walking through Windsor Park when the layout requires a landscape-oriented frame. In seconds, an operator can add realistic trees, grass and sky to fill the surrounding white space.
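To give a feel for how much new content such a fill has to invent, here is a minimal sketch of the canvas arithmetic. It is not part of Photoshop or Firefly; the function name, the 2000 x 3000 pixel photo and the 3:2 target frame are all assumptions chosen for illustration.

```python
from fractions import Fraction


def landscape_padding(width, height, target_ratio=Fraction(3, 2)):
    """Return (left, right) pixels to add so width:height reaches target_ratio.

    The height is kept as-is; only the canvas width grows.
    """
    # Ceiling division so the extended canvas is never narrower than the target
    target_width = -(-height * target_ratio.numerator // target_ratio.denominator)
    extra = max(0, target_width - width)
    left = extra // 2
    return left, extra - left


# Hypothetical portrait photo, 2000 px wide by 3000 px tall, going into a 3:2 frame
left, right = landscape_padding(2000, 3000)
print(left, right)  # 1250 1250
```

In this assumed case the generated trees, grass and sky would cover more of the final frame (2,500 of 4,500 pixels across) than the original photo does.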

Practical Perspectives

The second category covers a multitude of technologies, some wide-ranging, others particularly niche. All of them add significant intelligence and automation to the imaging workflow, and these are the ones most beneficial and practical to any publisher on a daily basis.

Imagine you’re viewing a photo of a wet Labrador caught mid-shake in a bath. We all instinctively know what it is; AI technologies, however, have only recently been able to ‘understand’ and label the content with increasing accuracy.

There are various engines currently available, with Google Vision and Amazon Rekognition being the most relevant for publishers. Vision offers greater accuracy, adding precise keywords to metadata, aiding content management and discoverability. Rekognition excels in celebrity identification and provides more editorial-friendly language.
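As a rough illustration of how recognition results might become metadata keywords, the sketch below keeps only confident labels and removes duplicates. It does not call either service; the (label, score) pairs, the 0.8 threshold and the function name are assumptions, and the sample data is invented.

```python
def keywords_from_labels(labels, min_score=0.8, limit=5):
    """Pick the most confident labels as metadata keywords.

    labels: iterable of (description, score) pairs, score assumed to be 0..1.
    """
    best = {}
    for description, score in labels:
        key = description.strip().lower()
        # Keep the highest score seen for each normalised label
        if score >= min_score and score > best.get(key, 0.0):
            best[key] = score
    # Most confident first, alphabetical on ties
    ranked = sorted(best, key=lambda k: (-best[k], k))
    return ranked[:limit]


# Invented labels for the wet Labrador photo, not real engine output
sample = [("Dog", 0.98), ("Labrador", 0.93), ("Bath", 0.88),
          ("Water", 0.85), ("Toy", 0.41), ("dog", 0.90)]
print(keywords_from_labels(sample))  # ['dog', 'labrador', 'bath', 'water']
```

The threshold is the editorial decision here: set it too low and doubtful keywords pollute the archive, too high and useful ones are lost.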

Image enhancement and optimisation leverage this understanding. Content awareness enables very specific toning and corrections to be made, resulting in standout images optimised for their specific output on paper or screen.

Automated background removal tools are well established and extremely precise. They deliver incredible results quickly, and real-world use sees a 90% success rate over thousands of images.
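A 90% success rate still leaves work for the imaging desk, which a quick estimate makes concrete. The batch of 5,000 images below is a hypothetical figure, and the function name is an assumption.

```python
def manual_review_estimate(total_images, success_rate=0.90):
    """Split a batch into (automated, needs_manual_attention) counts."""
    automated = round(total_images * success_rate)
    return automated, total_images - automated


# Hypothetical batch of 5,000 cut-outs at the 90% rate quoted above
print(manual_review_estimate(5000))  # (4500, 500)
```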

Image upscaling takes a low-resolution image and recreates missing pixels to enhance the image’s clarity, ensuring it displays with detail at larger sizes.
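To show what “recreating missing pixels” means at its simplest, the sketch below uses classic bilinear interpolation on a tiny greyscale grid. This is deliberately not an AI method: AI upscalers use trained models to infer plausible detail, whereas this technique only blends neighbouring values. The function name and sample values are assumptions.

```python
def upscale_bilinear(pixels, factor):
    """Upscale a greyscale image (list of rows of numbers) by an integer factor."""
    src_h, src_w = len(pixels), len(pixels[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output row back to a fractional position in the source
        sy = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            sx = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four surrounding source pixels by distance
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


tiny = [[0, 90], [90, 180]]        # invented 2 x 2 greyscale image
big = upscale_bilinear(tiny, 2)    # becomes 4 x 4
print(big[0])  # [0, 30, 60, 90]
```

The new pixels are smooth averages of their neighbours, which is why plain interpolation looks soft at large sizes.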

Q: What will be possible in the future?

A: Development will continue apace. Illustrating this point perfectly, ChatGPT launched new versions as this was being written.

  • Expect more meaningful integrations of these technologies into content management and page layout systems, empowering editorial departments to quickly and accurately handle tasks typically sent to other departments or external organisations.
  • Image generation engines will deliver more refined and realistic results, particularly in facial rendering.
  • Users will gain greater control over the placement and modification of specific elements within the image, and there’s a key focus on repeatability and consistency across results.

Three top tips

  1. Embrace the technology and experiment. Try different AI tools to discover what best suits the needs of the business. It’s not a one-size-fits-all approach and the solutions are constantly evolving.
  2. It’s an ‘AND’ not an ‘OR’ approach. Maximise the full potential of these imaging technologies by combining the solutions with the existing processes and skill sets in the business.
  3. Don’t be distracted. However stunning the individual results from the image generators are, the real day-to-day benefits currently lie with generative fills and automated services: metadata generation, image enhancement, background removal and upscaling.

Derek and the other contributors to our AI Special took part in an ‘AI Special – Q&A’ webinar on Wednesday, 26 June. You can watch the recording by registering here.


About us

For over 25 years, Pixometry’s advanced image enhancement software has been powering the imaging workflows of publishers worldwide. Continuously evolving, our software now incorporates the latest AI imaging technologies to enhance and enrich images, perfect for engaging readers in print and digital media.

Email: derek.milne@pixometry.com

Website: www.pixometry.com


This article was included in the AI Special, published by InPublishing in June 2024. Click here to see the other articles in this special feature.