A few months ago, I was sitting on the train next to a woman who ran a suicide hotline. She was telling me about how she was redesigning her website to make it resonate more with people struggling with suicidal thoughts. She showed me how she had added art to all the pages, comparing it to other sites which were bland and lacking personality (because apparently under late-stage capitalism even suicide hotlines compete for business). She figured that having this art would make the hotline feel more welcoming to people struggling with depression.
As we were going through her site, I was taken aback by one page with an AI-generated image of a couple walking along the beach, holding hands, in an impressionist style, under the caption “There is hope.” What shocked me more, however, was that the image really did a good job of rounding out the page – aside from the fact that the sun’s reflection on the water extended onto the sand.
I had previously thought the idea that AI could replace artists was preposterous - like many other people, I always instinctively believed that there was something about art that a machine couldn’t quite “get.” But this conversation prompted me to reflect on my naive conclusion: I realized that artists may be right to worry about their jobs. For commercial applications like website graphics, AI seems to do the trick. But I asked myself whether there was any type of art AI could not do, even for commercial applications.
As an amateur photographer, I wondered whether AI image generation could replace even photography. My first thought was that the photograph seems to do something a large language model never could: turn reality into art. But as AI image generation keeps improving, it is not unimaginable that AI could generate an image so detailed and realistic that it would be indistinguishable from a photograph (indeed, we are almost there already; the only noticeable difference now is a strange yellow tint or plasticky film on ChatGPT’s photos). Our first reaction to such an image might be to say “that isn’t even photography.” But why not? Is there something about photography as an art which makes it more than just a realistic representation of reality? Could AI images be considered photography?
I: What is Photography?
To answer this question, we must first begin with the more fundamental question of what a photograph is, or rather, what a photograph does. When light, reflected from objects, is captured by a camera, a photograph is born from the fertile soil of reality. The role of the photographer is to facilitate this transformation from reality to art, balancing two fundamental elements of the photograph: the documentary and the aesthetic.
These two elements are not opposed to each other; a balance must be struck between the two. This relationship is mediated by the purpose of the photograph. For example, if one takes a photo at a family event, the purpose of that photo is more documentary than that of, say, a close-up shot of a concrete sidewalk taken to explore texture and light, which would be more aesthetic. The need to document the family’s likeness is the purpose of the photo, and the aesthetics of the photo (framing, composition, lighting, etc.) are subordinated to this purpose.
In an abstract photograph, this relationship is reversed. Take, for example, Edward Weston’s “Pepper No. 30.” This photo is obviously making no claim to photorealism: it is about manipulating the subject through framing, light, and composition to create an aesthetically innovative photo. In a certain sense, reality is being molded by the camera (and the photographer), and what is being documented becomes incidental to the activity of artistic creation.
If the relation between the photograph’s documentary and aesthetic aspects is improperly mediated, we have a “bad” photograph. Imagine that a photojournalist were tasked with photographing a protest, but took the photo from so far away that all details relevant to reporting on the protest were lost. The photographer would have failed to fulfill their main goal, documenting the events. The aesthetic and documentary aspects would not be in tune, which would result in a “bad” photograph.
Thus, photography as an art requires three things: the ability to document reality, aesthetic sense, and purpose. Having answered the question of what a photograph does, we can now return to the question of AI. Does AI have these abilities?
II: AI Photography
At first, it seems preposterous to even suggest that AI could represent reality the way a photograph does. A camera captures light on its sensor and converts it into pixels, whereas an AI generates an image based on text input and the corpus of data it was trained on. Yet despite this difference in method, the outcome is the same: any image on a screen is a collection of pixels. If a human sat down and filled in the pixels one by one (imagine a sort of photorealistic digital painting), could they not achieve the same result a photograph does? The next logical step is to imagine that one could prompt AI to do the same thing. Imagine a simple picture of a cup on a table in a white room. I can tell the AI to create a white room, then draw a table occupying exactly the part of the frame where it needs to be; I can say the cup is in the exact center; I can tell it the direction the light is coming from, how strong it is, and so on. This is easy to imagine for simple photographs, but we can take it further: with enough information, a large enough model, enough computing power, and enough prompting, it is theoretically possible to create a representation of any scene through AI generation that would be indistinguishable from a photograph. It is merely a question of the amount of data and the size of the model.
It is easy to get hung up on the idea that taking a photo is somehow different from feeding information into a language model. Yet if both the “real” photograph and the AI image are pixels on a screen, and even the exact same pixels, is there any true difference between them, regardless of the medium through which they were created? Is it really fair to say that one is less real or accurate than the other?
We can make this point clearer with help from the Austrian philosopher Ludwig Wittgenstein. In his famous Tractatus Logico-Philosophicus, Wittgenstein formulates his picture theory of representation. To Wittgenstein, a picture accurately represents reality if the objects in the picture relate to each other in the same way they relate to each other in reality. If the cup is at the center of the table in reality, and we have a photo of the same cup, on the same table, in the same position, we can say the picture accurately models reality. As he writes: “The picture is a model of reality. To the [real] objects correspond in the picture the elements of the picture.” So long as the objects in the picture (here, any representation, including language) correspond to reality, the picture is a true representation.
If the AI-generated image is a true representation of reality, how could it not be documenting it? Take, for example, language: if someone writes an account of a battle and we read that account two thousand years later, we say the author has documented this event. Insofar as the linguistic account matches reality (which we verify through cross-checking with other sources), it is a true picture. The objects of the picture – the words – relate to each other in the same way the objects of reality – the people, places, actions – relate to each other.
So it is safe to say that AI can represent reality in a manner indistinguishable from a photograph. But worry not – photography as an art can still be saved. Next we must ask if AI has the aesthetic sense to manipulate the composition of photos, and whether or not it can do so with a specific purpose.
AI seems to have a basic ability to alter the aesthetics of a generated image in order to elicit a certain affective response. Based on its large corpus of data – in this case, critical analyses of photographs – it has acquired a conceptual understanding of how to create emotional responses. For example, ChatGPT may zoom further into an image in order to make it more emotionally intense. If you were to prompt AI to “make a depressing photograph of a young boy,” it would use its conceptual understanding of what makes a photo depressing (e.g., a tightly zoomed-in frame) and apply it to a representation of its concept of a male child.
This may be able to achieve a desired aesthetic effect – it may be able to create good enough art for certain purposes, like for a website or advertisement. But to create truly great art, one must combine new ideas with each other based on a certain, indefinable unconscious impulse. Without consciousness and an affective experience of objects, AI is incapable of making new connections between ideas, precisely because it has no ideas. It does not have an intuitive understanding of how to manipulate an image to create a response because it does not represent the photo to itself in consciousness.
As Bill Evans once said of jazz, to create art, the artist must develop an intuitive understanding by internalizing the rules to such an extent that they become unconscious, and then apply these rules subconsciously to their craft. To paraphrase Kant, beauty does not arise out of connections between concepts; judgements of beauty are based on feeling, or affective response. Because the AI has no subjective, affective experience of reality, it cannot do anything new with art. It can only ever create based on the sum of what everyone else has done before it; it can only ever follow the rules it is given. (Try telling it to write a poem – it will sound horribly contrived, precisely because it cannot say anything new.) We can say, then, that AI could make a good photo – that is to say, one that fulfills a specific purpose. But AI could never create a great photo, one that pushes the boundaries of art. The AI is unable to do this by its very nature: it is not conscious, and it only ever follows the rules others give it; it can never break them.
Even if AI does have basic aesthetic sense, it does not have its own intention in creating art. AI images are always created for a specific purpose. The purpose, however, is not AI’s own intent, but the intent of whoever is instructing the AI. To create a good image for the suicide hotline website, the AI still needed the intention of the woman. In this sense, perhaps AI is a tool: it isn’t making art on its own. So the way the cultural dialogue has framed the question of whether or not AI can create art is fundamentally misguided. It implies that the AI has control of the entire artistic process. In reality, we know that to whatever extent AI can create art, it does so under the direction of people.
III: The Infinite Monkey Theorem
But there is an interesting thought experiment that complicates this conclusion, and must be taken seriously in our discussions of art and AI. Namely, what if we were to ask AI to create every possible realistic representation of an object? Imagine an object in free space, say a chair: we could take an infinite number of photos of the chair from an infinite number of perspectives, varying the distance, angle, lighting, and so on. It is quite possible that this method could create art.
Most people have heard of the “Infinite Monkey Theorem”: if there were an infinite number of monkeys pressing one key a second on a typewriter, at random, eventually one of them would type out Shakespeare’s entire corpus. The AI instructed to create every possible representation of an object is a more efficient version of the monkeys. Eventually, it is bound to come up with a representation of the object that elicits an affective response in the viewer. It is bound to stumble upon beauty. If we told AI to create as many photorealistic representations as possible, it would eventually create one image that the viewer would consider art.
The idea would be to eventually create the best version of the photograph by trying every way of representing an object. It's like guessing the password to an account with every possible combination of characters until you get it right: a “brute force” attack on art. And this could certainly be a valid artistic process – it was the first piece of advice my photography teacher ever gave me in high school: take as many photos as you can of the scene you find, and you’re bound to come up with a good one.
If the monkeys typed up Hamlet before Shakespeare had ever written it, would Hamlet still be one of the best plays ever written? I am inclined to say yes, and to say that it would still be art. But this is ultimately up to the reader: is a work art if it is aesthetically beautiful, or is it art because the creator intended for it to be art?
IV: The Fundamental Question
In part I, I talked about the process by which the photographer turns reality into art. Then in part II, we examined whether or not AI can do photography, determining that while AI has the capacity to create photographs, it is not creating art by itself; it still needs a human with artistic intention guiding the model. In part III, I complicated this conclusion by asking what would happen if AI were to stumble upon art by accident. These two paths to creating art, deliberate reflection and accidental creation, raise the central question of whether or not AI can create art.
It seems that underlying all of this analysis is the crucial human intention behind art, the human subjective consciousness which creates art as such. The AI will only generate an image based on the specifications of the prompter. In a sense, then, the prompter and the photographer are the same, but with different tools. They are both molding reality according to their artistic purpose. Even in the infinite monkey case, someone is still attempting to create art.
If we believe that an AI-generated image and a photograph would be indistinguishable, then so long as the observer considers the image art, it would be art. If someone read Hamlet for the first time unaware of whether it had been written by a monkey or by Shakespeare, they would doubtless consider it art. And here is why artists are right to fear for their jobs – if AI can create art that the vast majority of people consider adequate, it doesn’t matter who is making it.
But perhaps this could create the space for more innovative art: artists must strive to create something new, better than the AI. But if they lose their financial stability to AI, it will become even more difficult to innovate. I saw a tweet the other day which read something like “I want AI to do the dishes so I can make art, not the other way around.” I would reply to this with: AI can create good art so humans can create great art. But it would be nice if we could make it do the dishes, too.
The power of a photograph in particular is that it represents reality, as it is, without any edits. Often when we look at a photo, we assume its contents to be true, because it is usually easy to tell when a photo has been edited. If we can create images with AI that look as if they are photographs, this undermines the power of a photograph. Yet this is an epistemological issue rather than an ontological issue. That is, it is more a problem with how much we can trust photographs to provide accurate knowledge than a question of the nature of the photograph itself.