This column marks the one-year anniversary of my first foray into the AI realms in this journal. Much has transpired since. Over the last few months, AI (artificial intelligence) issues and disputes have kept lawyers busy, caused Hollywood writers and actors to go on strike, and sent fear and loathing through the professional photo and video creative community. On the other side of the coin, AI has created some intriguing options for image creation, some more sensible than others. Just for balance, I’ll begin this AI redux with the troublesome stuff. Then I’ll get to some interesting apps and developments on the creative side.
AI Redux: The Debate
Let’s put aside copyright and other legal and even ethical battles. A new uproar relates to AI’s potential for fraud and deception in the previously reliable use of images as information and as evidence. The concern is not about fairly benign image enhancement or changes (making a gray sky sunny, the common use of super-wide-angle lenses to make a hotel room browsed online look larger, or even retouching senior portraits). Rather, it is about the sinister and quite dangerous production of convincing fakes and frauds for propaganda, insurance claims, online selling, politics, science, reportage and, my goodness, even dating apps.
Indeed, there is much discussion about how all this could affect our trust in photography as a credible visual medium. This existential threat has set off alarm bells and spurred action among photography and imaging companies. It also led to the formation of a movement called the Content Authenticity Initiative. Camera companies, trade and news organizations, photo associations, software companies and other concerned parties make up this large group.
A main focus is to develop ways and means to verify that an image is what it claims to be and not a fake. Akin to the fact-checking process in the photojournalism trade, this might take the form of file embeds within recorded image data. It could also utilize GPS, time stamps and the like. Moreover, camera companies are taking some first steps toward installing an “authentication” code as part of the sidecar data that travels with the image.
While this is a step in the right direction, it strikes me as more useful for sophisticated legal arguments and cases of flagrant violation than for planting easy-to-locate red flags that would trigger a kind of watermark warning when a false image is copied or displayed. But I guess it’s a start. It could give organizations with the proper resources, like news agencies, a way to verify an image.
And one last thing . . . it gets weirder. In a recent item about the so-called “Doomsday Clock” (a kind of countdown to the Apocalypse) from the Bulletin of the Atomic Scientists, generative AI is listed among the factors that could make the clock hit midnight faster. What’s more, the scientists raise questions about how to control a technology that could improve, or indeed threaten, civilization in countless ways.
And now for some lighter fare.
Text Generation, Old Hat?
If you have taken the time to play with early text prompt image generators or even look at samples of their output, you have probably encountered what I call the “1950s greeting card syndrome.” It’s a kind of corny, overproduced translation that’s good for the unicorn and Hobbit crowd but not much else. Well, I guess it had to start somewhere.
Development has certainly not stopped at the six-finger stage (the propensity of early generators to become bollixed when configuring a human hand). The next generation is dubbed “hybrid AI” generators. They allow for the editing of “traditional” images with AI prompts.
AI will also soon permit enhancing and editing an existing image with voice and text prompts, rather than building one from written instructions and style sheets alone. Perhaps the most interesting developments are in the “interactive” space. Here the creator has an active role throughout the process, albeit with the bot doing the work and probably making choices from its own library of effects and actions.
One, dubbed “instant AI,” begins generation as you type in prompts, which can develop into a kind of verbal-to-visual conversation. Here, one image morphs into another as you work. This contrasts with earlier setups, which only got to work after the text prompt was complete. In effect, it allows the creator to be a director, not just an audience.
Hands-Off Photo Editing
Work is also being done to change how images (yes, real images) are edited, with “visual/verbal edit prompts” as a kind of hands-off editing function. A basic procedure might involve verbal commands such as “darken sky,” “change the color temperature” or “add contrast.” No doubt this will evolve into other command procedures.
So, in addition to talking to your refrigerator to make up your shopping list, you can command your smart photo organizer to find all the pictures of cousin Jimmy (now done through facial recognition); choose ins and outs; and make an online custom birthday card that’s immediately sent to his e-mail address. Of course, you have to input all that info at some point. Still, it’s as if Siri (or Alexa, if you will) has put on a lab coat, selected the negs to print and enlarge, and packed and shipped them to the birthday boy.
There’s also work being done to modify photos on smartphones after they are taken, using verbal prompts to enhance or add content for instant compositional fixes and the like. This strikes me as a natural extension of the interactive nature of the smartphone. I’m sure it will take a good deal of practice for photographers, although 12-year-olds will love it right out of the box.
Hobbies and Rough Drafts
There’s always one cousin who has taken on the task of tracking family history, perhaps spurred by the wonderful PBS program Finding Your Roots. One of my favorite AI offerings for the amateur genealogist is dubbed “MyHeritage.” It is touted to help estimate when a photo was taken. Introduce the photo into the system and it will give you a ballpark guess. The estimate rests on the AI setup having been trained on tens of thousands of curated, definitively dated historical photos.
For those who only began photography using a cameraphone, this is of course of little practical use. However, for those like me who have shoeboxes filled with old black-and-white family photos from 100+ years ago, it’s a fascinating tool. I can explore family genealogy, do historical research and even date flea market finds of tintypes and cartes de visite. And I do also want to ID the year that portrait of me at a be-in in Central Park was made.
Generate a New Friend?
Of course, you can also bypass all that social interaction stuff and use AI to generate a synthetic human. Name the gender, age, facial expressions and more, give the system 10 seconds and, hey, you have a new friend. You can also introduce that friend to the marketing crowd. They can then create custom content and even brochures tailored to its (I guess that’s the right pronoun) education, tastes, social class, income and nationality. There you have the whole customization package. Create an e-mail address for the droid, step out of the way, and watch them interact.
Additionally, for all you plein air painters, hobbyist artists and even photo pros doing roughs of sets and layouts, there’s Freepik Pikaso. This is an AI sketching setup with dual “windows.” One is for the user to work on a rough sketch that is simultaneously run through an AI prompt and stock content filter. The result is “artistic” images generated from the roughs in near real time. I haven’t had the chance to work with this, but the idea sounds like a stylus-and-filter combo made in heaven.
Fear Not, They Say
According to the AI folks, the fear of AI should end and the exploitation of its potential should commence with all due dispatch. They tell us not to see it as a threat that will put an end to photography as we know it. Instead, they say we should embrace it and accept it as the staircase to the future it represents. To some, this might sound like the subtitle to Dr. Strangelove (mangled here as “How I Learned to Stop Worrying and Love AI”). To others, it could be a wonderful opportunity that opens the door to a whole new world of exploration, even if it leaves photography as we know it behind.
The best bet is to stay informed and see if and how you want to enter the debate. You can be pro or con or somewhere in the middle, but I think it’s important for everyone in the industry to stay aware. I do recommend visiting www.visual1st.biz for their informative newsletters, as well as following a very wise fellow, Paul Melcher, of Melcher System LLC, on his LinkedIn space.