LTP 134: Photography’s AI Spectrum




In this solo show Bart explains why we can’t have a single unified opinion on ‘AI in photography’ — there are a whole spectrum of different AI-powered tools that interact with photography in different ways, and they all have their own subjective pros and cons.

While this podcast is free for you to enjoy, it’s not free for Bart to create. Please consider supporting the show by becoming a patron on Patreon.

Reminder – you can submit questions for future Q & A shows at http://lets-talk.ie/photoq

MP3 Download | RSS Feed | iTunes

A lot of discussions try to boil all of AI down to a single “good or bad” opinion, but that just doesn’t make any sense!

AI isn’t one thing, it doesn’t do one thing, and even when you confine yourself to just AI's intersection with photography, there are still a lot of different applications to consider. Each of those applications has its own pros and cons, and many of those are very subjective.

When it comes to AI’s effect on the art and craft of photography there’s a huge spectrum of opportunities and threats, and we’re all likely to rank them differently. I’d like to share my current thinking, not in the hope of convincing you to agree with me, but to set you thinking!

To help organise my thoughts I've broken AI in photography down into three topics:

  1. Image classifiers — 'what's in this photo?'
  2. Image editors — 'make this photo look better'
  3. Image generators — 'make me a photorealistic image of …'

That second grouping, editors, is very broad, and so I've added my own sub-groupings. These are not any kind of formal terms, they're just my best attempt to put sensible labels on groups of features so I can talk about them in a coherent way:

  1. Filters — one-button black-box changes
  2. Image pipelines — under the hood image processing between the sensors and the 'original' image we see
  3. Smart editors — tools that move sliders for us, so we can see what they're doing, and tweak it
  4. Generative editors — tools that need to create some new content to 'fix' something in the original

With that structure in mind, let's get stuck in!

Classifiers

The first time AI encountered photography was when facial recognition arrived in our photo management apps to help us automatically tag our friends and family.

To say the first classifiers were limited is putting it mildly — by themselves, the algorithms could only draw a box around everything they thought might be a face, and some of the false positives were hilarious! When one did successfully find some faces you could tell it who they were, and it would ask you to train it by showing you more photos and asking whether or not each showed the same person. Again, results were mixed, with siblings and other family members often getting confused with each other, sometimes even across genders, leading to some awkward moments in some families!
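
To make that era concrete, here's a minimal sketch of the kind of pre-deep-learning face detection those early tools relied on, using OpenCV's classic Haar-cascade detector. The file names are hypothetical, and the real apps used their own proprietary detectors:

    import cv2

    # Load the pre-trained frontal-face cascade that ships with OpenCV.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )

    image = cv2.imread("family_photo.jpg")          # hypothetical file name
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # the detector works on grayscale

    # Returns a box for everything the cascade thinks might be a face,
    # false positives and all.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imwrite("family_photo_boxed.jpg", image)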

No one thought those early classifiers were in any way a cause for concern — a source of entertainment, sure, and eventually, a useful tool, maybe, but definitely not an existential crisis!

Like with all of AI, progress has of course continued, and as it did, that same technology developed a more sinister tone when Facebook rolled it out for all users, and enabled it by default (of course)! Facebook’s version was much smarter than those in Aperture or Lightroom because Facebook had the thing AI needs most — oceans of training data, and what’s more, training data with built-in context! The people in your photos are most likely to be you, your friends, and your family, who are also most likely to be the people you follow. So Facebook's AI classifiers could quickly learn which names belonged to which faces without needing to ask the humans for nearly as much help as our desktop apps needed. If it sounds creepy, it was! In fact, it was so creepy even Facebook realised they’d crossed a line, and seeing the writing on the wall, chose to retire the feature before governments could write the laws to force them to.

The technology may have hit its moral nadir with Facebook, but it didn’t stop getting better, and now we have extremely intelligent classifiers built right into our default photo apps. When your iPhone offers to tell you what that blossom, bug, or building you snapped is, those are very smart classifiers offering you a helping hand. Even more impressively, we can now search our photos using everyday descriptions like “all my photos of orange flowers in Ireland in spring”!
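
For the curious, here's a rough sketch of how that kind of natural-language search can work under the hood: a CLIP-style model embeds both images and text into the same vector space, so a text query can be matched against photos directly. The model choice and file names here are illustrative; Apple's actual implementation is private:

    from PIL import Image
    from sentence_transformers import SentenceTransformer, util

    # CLIP embeds images and text into the same vector space.
    model = SentenceTransformer("clip-ViT-B-32")

    photo_paths = ["IMG_0001.jpg", "IMG_0002.jpg", "IMG_0003.jpg"]  # hypothetical
    photo_embeddings = model.encode([Image.open(p) for p in photo_paths])

    query_embedding = model.encode("orange flowers in spring")

    # Rank the photos by cosine similarity between text and image embeddings.
    scores = util.cos_sim(query_embedding, photo_embeddings)[0].tolist()
    for path, score in sorted(zip(photo_paths, scores), key=lambda t: -t[1]):
        print(f"{score:.3f}  {path}")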

Despite that brief detour into the dark side, I now find classifiers to be one of the most useful and underappreciated branches of AI.

It strikes me that if we’d all been great about tagging and describing all our images we’d get much less value from classifiers, but of course, despite our best efforts, I doubt any of us kept that up consistently!

Just to note that outside of photography, classifiers are a major concern, especially their use for facial recognition. There are big problems with biases in the training data, and let’s not even begin to discuss the legal and ethical minefield of their use by law enforcement!

Image Editors

After helping us find people and things in our photos, and photos of specific people and things, the next use AI was put to was helping us edit our images to get the best out of them.

This is such a slippery slope — where exactly the boundary lies between editing and creating from whole cloth is impossible to really define. As I mentioned in the introduction, I've grouped all these editing tools under some headings of my own invention. These are not formal definitions, they're just my way of trying to bring some order to a big mess of features.

Filters

In my view of these things, the earliest and most primitive AI editing tools were black-box one-button tools designed to transform an image in some way. The user didn't get any kind of visibility into what the filter did, or control over how it did it. You just tapped the button and your photo changed. You could keep the change, or revert it, but that was the extent of your control!

The two earliest examples I remember were the original magic wand button in Apple Photos, and of course those infamous one-push Instagram filters.

Exactly what these filters did was a mystery to everyone, and whether or not they fit the definition of AI is debatable: some were just naive algorithms, but some probably relied on some kind of early AI technology. But that debate doesn't matter, because they were the start of the journey that eventually leads us to the most controversial editing features — the generative ones we'll get to shortly!

I never liked these tools because they were too crude. I rarely liked their results, and since I had no ability to tweak their behaviour, I just ignored them!

Image Pipelines

If you think filters are opaque black boxes, at least you choose to use them! There are actually even more opaque black boxes than filters, and if you shoot with a smartphone, you are using them, with absolutely no choice in the matter!

All our camera phones perform masses of image processing on every photo we shoot before we even get our first glimpse of them, and a lot of that processing is now powered by AI.

The teeny tiny sensors and very flat lenses in our phone cameras are just big balls of compromises. The raw data coming out of the sensors is a mess! It’s noisy and distorted and would just look terrible if it were presented to us as-is, and yet, the photos we see from our modern high-end phones are just amazing! They have better dynamic range, colour accuracy, and low light performance than the last DSLR I bought. How is that possible?

Well, as I hinted at before, the answer is AI, lots and lots and lots of AI! What Apple have generally branded Machine Learning or ML is just a particular kind of AI.

The processing high-end phone cameras now do is impressive and multi-staged, often combining data from multiple sensors captured through multiple lenses into a single image, and that data might even be augmented with depth data from a LiDAR sensor too. This chain has AI peppered throughout, and you’ll often hear Apple refer to the whole thing as their imaging pipeline. Another jargon term you might hear is DSP, DISP, or ISP; these are all variants of the term Digital Image Signal Processor, and they just refer to a chip that processes image data.
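
To give a feel for the shape of such a pipeline, here's a deliberately toy sketch that merges a burst of exposures, denoises, and tone-maps. Every stage is a crude stand-in (real phone pipelines align frames, fuse depth data, and run learned models at several of these steps), and none of it reflects Apple's private implementation:

    import numpy as np

    def merge_exposures(frames):
        """Naive HDR merge: average aligned exposures. Real pipelines align,
        weight, and fuse frames far more cleverly, often with learned models."""
        return np.mean(np.stack(frames), axis=0)

    def denoise(img):
        """Toy denoiser: a 3x3 box blur. Phone ISPs use learned denoisers."""
        padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
        out = np.zeros_like(img)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += padded[1 + dy : 1 + dy + img.shape[0],
                              1 + dx : 1 + dx + img.shape[1]]
        return out / 9.0

    def tone_map(img):
        """Simple global tone curve squeezing HDR data into display range."""
        return img / (1.0 + img)

    def imaging_pipeline(raw_frames):
        return tone_map(denoise(merge_exposures(raw_frames)))

    # Three fake 'exposures' standing in for a burst capture.
    burst = [np.random.rand(120, 160, 3) * s for s in (0.5, 1.0, 2.0)]
    final = imaging_pipeline(burst)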

The closest I can see to a dark side here is that we now have technology beyond the control of the photographer pre-processing our images. But that’s always been a weak argument to me, because even in the film days, photographers outsourced vast swaths of the chemistry to film producers, darkroom suppliers, and photo labs. Also, Apple and others are starting to give some control back to photographers by introducing new RAW formats that expose more of the data assembled by the imaging pipeline.

Smart Editors

Turning back to AI editors controlled by the user, a big step forward from the filter-style one-button black boxes was the move to filters and presets that move the sliders for you. Instead of presenting you with a fait accompli, they provide you with a starting point from which you can make your own refinements.

Because all the AI has done is move the sliders, you're still in full control, and can tweak and fine-tune each aspect of the edit to your heart’s content. These AIs don’t quite show you what they're 'thinking', because you still have no idea why it makes sense to reduce the contrast or whatever, but at least you can see what the AI is doing. This is how the modern magic wand works on the iPhone, and this change is what converted me from someone who never touched that button to someone who uses it as their starting point on just about every edit!
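
As a toy illustration of that interaction pattern, imagine a 'smart' auto button that, instead of returning a finished image, returns proposed slider values the user can inspect and override before anything is applied. The heuristics here are made up; real smart editors use trained models:

    import numpy as np

    def suggest_edits(img):
        """Return proposed slider positions rather than a finished image."""
        luminance = img.mean(axis=2)            # rough brightness per pixel
        exposure = 0.5 - luminance.mean()       # push the average toward mid-grey
        contrast = 0.25 - luminance.std()       # widen or narrow the tonal spread
        return {"exposure": round(float(exposure), 2),
                "contrast": round(float(contrast), 2)}

    def apply_edits(img, sliders):
        out = img + sliders["exposure"]
        out = (out - 0.5) * (1.0 + sliders["contrast"]) + 0.5
        return np.clip(out, 0.0, 1.0)

    photo = np.random.rand(120, 160, 3)   # stand-in for a real image
    sliders = suggest_edits(photo)        # the AI's proposal...
    sliders["contrast"] += 0.1            # ...which the user is free to override
    edited = apply_edits(photo, sliders)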

I really can't see a dark side to these more transparent editing tools.

Generative Editors

Image editing tools have of course continued to evolve, and their use of AI has continued to expand, and that includes aspects of editing that have always been controversial.

We may not worry about it now, but even blemish removal was initially controversial because you’re changing the content of the photo rather than just the way the content is displayed.

Initially photographers removed blemishes by stamping and cloning one part of the photo over another. You had to select the source, and then you stamped it over the blemish. There was no new data being created, and the photographer was completely in charge, but that soon changed.

The first automated tools I can remember were the relatively naive algorithms for removing the red eye caused by the on-camera flashes our cheap point-and-shoot cameras all had. You simply clicked on the eye you wanted to fix, and the algorithm removed the unnatural red glint.
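
That kind of red-eye fix is simple enough to sketch: inside the clicked region, find pixels where red strongly dominates, and tame the red channel. This is my own naive reconstruction of the approach, not any specific app's algorithm:

    import numpy as np

    def fix_red_eye(img, x, y, radius):
        """img is an HxWx3 float RGB array; (x, y) is the clicked eye centre."""
        out = img.copy()
        h, w = img.shape[:2]
        ys, xs = np.ogrid[:h, :w]
        in_circle = (xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2

        r, g, b = img[..., 0], img[..., 1], img[..., 2]
        red_glint = in_circle & (r > 1.5 * (g + b) / 2)   # red clearly dominates

        out[..., 0][red_glint] = ((g + b) / 2)[red_glint]  # tame the red channel
        return out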

The next evolution was to automate blemish removal by using an algorithm of some kind to choose the cloning source for you. There was still no new data being created, but the computer was deciding the source of the copies.
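
A naive version of that source-picking step is also easy to sketch: scan the patches near the blemish and clone over the one that matches best. Real healing tools also blend texture and lighting; this sum-of-squared-differences search is just the core idea:

    import numpy as np

    def heal(img, y, x, size=9, search=40):
        """Clone the best-matching nearby patch over a size x size region at (y, x).
        (A real tool would match against the area *around* the blemish rather
        than the blemish itself, and blend the result in.)"""
        h, w = img.shape[:2]
        target = img[y : y + size, x : x + size]
        best, best_err = None, np.inf
        for sy in range(max(0, y - search), min(h - size, y + search)):
            for sx in range(max(0, x - search), min(w - size, x + search)):
                if abs(sy - y) < size and abs(sx - x) < size:
                    continue  # skip patches overlapping the blemish itself
                candidate = img[sy : sy + size, sx : sx + size]
                err = np.sum((candidate - target) ** 2)  # sum of squared differences
                if err < best_err:
                    best, best_err = candidate, err
        out = img.copy()
        out[y : y + size, x : x + size] = best
        return out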

None of this ever concerned me in any way, but I can see why photo journalists had problems with even these naive tools.

Where things really took a turn into the grey was Adobe’s release of Generative Fill and Generative Expand last year. These tools allow photographers to replace parts of an image with AI-generated content, and to expand images beyond the edge of the shot frame, filling the empty space with generated content.

Google soon added similar, and even more advanced, tools to their high-end Pixel phones. Using just the default photos app on a modern Pixel phone anyone can remove a person from a photo, or even move them to another part of the image, and AI generates content to fill the gaps.

Just within the last month Apple has joined the generative fill party with the Clean Up feature in Photos, part of Apple Intelligence.

Clearly, this kind of democratisation of photographic fakery has societal impacts! Personally, I never thought it was appropriate to just trust photos (see LTP 121: Truth in Photography?), but many, probably most, regular folks do, and they need to unlearn that instinct, fast! Hopefully before enough of us get manipulated into doing great damage to our societies!

Image Generators

Finally, we arrive at the extreme end of the spectrum — completely artificial photo-realistic images. They look like photographs, but the scenes they depict simply never existed. No photons of light bounced off anything, passed through any lens, or registered on any sensor to create these images. These are not photos, but they sure look like them!

It was image generating tools like DALL-E bursting onto the scene two years ago that triggered the societal debates we’re all living through now.

I’ve already shared my thoughts on those big questions in (many!) previous episodes, so I’m not going to repeat myself this time.

Final Thoughts

So, I hope you now understand why I don’t have one view on AI, and why I think it's nonsense to even try!

I really like how classifiers have helped me get the most from my photo library. I really appreciate and admire our modern image pipelines, and I love smart editors that move the sliders for me but let me tweak and tune the results. I’m ambivalent about generative editing tools. I use generative fill from time to time, but not often, and I have no interest in moving people around in my shots. I appreciate the emerging art of images co-created by humans and generative AI.

But, I really worry that society isn’t going to deal with the fact that we just can’t trust the reality of images when they’re removed from their context. A photo purporting to show something meaningful is meaningless by itself; it only gets value when it’s vouched for by a source that’s earned our trust! Societally, we need to move from trusting images to trusting sources, and I'm not seeing that happen. If anything, I'm seeing social media further separate images from their sources, stripping away all context, and leaving them open to massive abuse. That's already true of genuine photos presented dishonestly, but it's even more true for generatively edited or created images. I think it's inevitable, at least in the short term, that people will choose to trust really toxic sources that will happily abuse AI's potential in all kinds of ways.

Like I say, I don’t have the answers, and I wasn’t trying to advocate for any. I just hope I’ve convinced you to think of AI as a great big ecosystem with many different tools, and not as some kind of single species. I hope you agree that you can’t have one opinion on all of AI, but that you need a spectrum of opinions.
