I thrashed the RTX 4090 for 8 hours straight training Stable Diffusion to paint like my uncle Hermann
One of our favourite pieces from this year, originally published October 27, 2022.
I've been playing with the AI art tool, Stable Diffusion, a lot since the Automatic1111 web UI version first launched. I'm not much of a command line kinda guy, so having a simple mouseable interface is much more up my street. And it's a fun plaything for a man without a visual artistic bone in his body. I've pictured the hitchhiker's guide to the galaxy, a Monet painting of Boris Johnson sitting on the toilet in the middle of a pond, and Donald Trump reading my beloved PC Format.
But nothing has affected me so much as hammering the Nvidia RTX 4090 for eight and a half hours straight, training it to paint like my great uncle Hermann.
You won't know the name Hermann Kahn. I would also be incredibly surprised if you recognised him by the name he was actually more widely known by, Aharon Kahana. Honestly, I didn't know him either; sadly he died well before I was born.
But I have heard so many stories, so much talk about Uncle Hermann from both my mother and late grandmother as I grew up, that I feel like I do kind of know him. At least part of him anyway.
The familial bond is strong, ever more so since travelling to Tel Aviv just before the birth of my three-year-old son. It was the place my gran, Inge, and great grandmother, Rosa Kahn, fled to from a pre-Kristallnacht Germany in the mid '30s. And the place Hermann Kahn settled after meeting his wife while studying art in Berlin.
I walked the streets they walked, passed the apartment my gran grew up in, travelled the road to Haifa Rosa took each morning for work, and visited Hermann's home in Ramat Gan.
That home he shared with his wife, Mideh, has become a museum to his art and while it was closed when I visited, and clearly had been for some time, it has seemingly since re-opened and is hosting exhibitions again.
Kahana's art style is distinctive, and a distinct feature of my childhood. I was surrounded by his ceramics and both early and late style paintings in my parents' and grandparents' homes. Even as a child I was drawn to them. There's a particular vase that I could never not see as the starship Enterprise, thanks to its Trek-like saucer section.
An entirely abstract geometric image of what I always assumed was a loving couple adorned our chimney breast, while an image of Parisian rooftops and a stormy looking beach scene in thick oil paint ran up our stairs.
But inevitably this early 20th century German-Israeli painter and ceramicist has not been included as one of Stable Diffusion's listed artists. And although I experimented with detailed prompts, messed around with X/Y plots to try and find levers to pull to get a close approximation of the abstract paintings he produced, I never really got there.
The Stable Diffusion checkpoint file simply doesn't have the necessary reference points. But there are ways to encourage the AI to understand different, related images, and build from those specifically. They're called embeddings and people have used them to train the tool to recognise their own faces. That way you can include yourself in all the wild furry AI-painted fantasies you could ever desire.
But I wanted to train it to recognise and understand—as best a relatively simple AI could—the art of Aharon Kahana. It's a surprisingly powerful tool, especially given the caveats in the embeddings explanation that "the feature is very raw, use at own risk". Thanks to the latest release of the web UI app on Github, however, it can all be done through a browser.
You'll need Stable Diffusion, and therefore Python, already up and running on your machine, but you can then pull together a folder of images under a particular name, and it will thrash your GPU to 100% load, and 50% of your CPU, for hours to create reference points that Stable Diffusion can use when prompted with the exact name of the embedding.
Sounds relatively simple, but it certainly took some trial and error on my part. Not least after the realisation that, once I'd downloaded 70-odd images of my great uncle's work from various auction sites around the world, I actually had to label them with something vaguely detailed in order for the training to have any impact.
That queued up a lot of time figuring out the medium and subjects of each of the pieces I'd downloaded, and then renaming each file by hand. And when you're working with sometimes seriously abstract imagery that's not always so easy.
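For anyone who would rather not rename 70-odd files by hand, that labelling step could be roughed out in a few lines of Python. This is only a sketch under assumptions: it supposes the training step reads its description from each file name, as described above, and the helper names (caption_to_filename, plan_renames, apply_renames) and any captions you feed it are hypothetical, not part of Stable Diffusion or the web UI. You still have to write the descriptions yourself.

```python
import re
from pathlib import Path


def caption_to_filename(caption: str, index: int, suffix: str) -> str:
    """Turn a free-text description into a safe, descriptive file name."""
    # Lowercase, then collapse every run of non-alphanumeric characters into one space.
    words = re.sub(r"[^a-z0-9]+", " ", caption.lower()).strip()
    # The numeric prefix keeps names unique when two descriptions are identical.
    return f"{index:03d} {words}{suffix.lower()}"


def plan_renames(folder: Path, captions: dict[str, str]) -> list[tuple[Path, Path]]:
    """Return (old, new) path pairs for each captioned file that exists in folder."""
    plan = []
    for index, (name, caption) in enumerate(sorted(captions.items()), start=1):
        old = folder / name
        if old.exists():  # silently skip captions whose file is missing
            new = folder / caption_to_filename(caption, index, old.suffix)
            plan.append((old, new))
    return plan


def apply_renames(plan: list[tuple[Path, Path]]) -> None:
    """Rename the files on disk; inspect the plan before calling this."""
    for old, new in plan:
        old.rename(new)
```

The idea would be to keep a dictionary mapping each downloaded file name to a hand-written description of its medium and subject, call plan_renames on the image folder, read through the proposed names, and only then call apply_renames.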
I then pointed the RTX 4090 and my Core i9 10900K at the relevant folder, created the embedding wrapper, and left it beavering away for over eight and a half hours to come to terms with what I'd fed it. All 16,384 cores and a healthy chunk of the 24GB of memory in the new Nvidia card, as well as half my 10th Gen Core i9, were employed on this task.
I'm not going to pretend to be smart enough to truly understand what I'd tasked the most powerful consumer GPU in the world with, but when I checked in with it over the evening I could see it had been taking the input images and making its own approximations.
It was like some teaching from beyond the grave, like my PC had spent the night learning from Hermann, doodling away in some homage to his style to try and figure out how to do it without the artist's help.
By the morning the embedding was finished and I could boot up the web UI again—now listed with one textual inversion embedding—and affix the 'by aharon_kahana' text to the end of any prompt and see what the AI had learned overnight.
And it was remarkable. My computer was creating homage after homage to my great uncle, more fascinating still when it was making images of things Kahana would never have painted. I'm an absolute novice when it comes to the mystic art of the prompt, but even my basic requests delivered images that evoked the memory of the artist.
What it lacked in pure soul and understanding of what it was actually doing, it made up for in strange digital creativity and GPU-backed effort. Certainly, it was all recognisably and inextricably linked to his art style.
I know a lot of modern artists are railing against the AI art development, frustrated at the glut of pictures of fantasy women created by people with no artistic talent—along with said furry fantasies—and I don't pretend to know exactly how Aharon Kahana would have felt, but I can't help but feel he would have embraced this new tool.
And that's what it is, a tool. As much as I've been impressed by how close Stable Diffusion has come to recreating his art style, that's all it can really do: recreate. It's not really going to evolve the style on its own; it's still going to take a human artist to take the art any further. And it still needs detailed human input to give it enough of a subject to build from.
Rather than something that's going to replace artists, it's just another tool, like high resolution SLRs and Photoshop have become for landscape painters, that will slot into the arsenal of artists interested in taking the technology to new, interesting places.
AI art then, at its current level, feels like a starting point rather than something capable of truly creating the finished product. But that's probably not going to stop me from filling my PC with a million colourful, endlessly abstract images. All inspired by part of my family I've never really known yet still hope to embrace.