Image prompt: A cubist painting of robot hell

The Face That AI Built

My explorations in image generation

Sponsored By: Alts

This essay is brought to you by Alts, the best place for uncovering new and interesting alternative assets to invest in.

To begin, we must establish that I am a moron. 

My code-writing skills are so subpar that applying the word “skill” to my abilities is an insult to the word. GitHub is frightening, engineers are wizards, etc., etc. Combine this idiocy with my MacBook Air from 2018 (which has water damage and randomly restarts twice a day), and I shouldn’t even be able to type. Despite that, I was able to use open-source software and 5 pictures of myself to make a self-portrait using AI. It took maybe 15 minutes.

I then decided to go to space. 

Then I wanted to see what I would look like in Cyberpunk Jersey Shore. 

Then I chose to become a neckbeard.

Then I sat down for a portrait session with Monet. 

These creations are remarkable in and of themselves. Add in the fact that all of the code for this was released in the last 6 weeks, and it becomes overwhelming. By combining Stable Diffusion 1.4 with a fine-tuning technique from Google called DreamBooth, anyone can do what I did. If you are looking to follow in my footsteps, just follow the instructions in this video. The end result is that you can make a portrait of anyone doing anything. All it takes is 5 photos of their face and some free time. It doesn’t even cost money.

AI is exciting because it has the potential to remake the power structures of software. By 2020, we had pretty much established who made money in the software industry versus who got their margins squeezed to zero. Crypto promised to totally upend the power dynamics of the internet (and has thus far utterly failed in that promise)—but I think AI can do it for real. 

I will warn you, my thesis on this topic is still developing. I remain unsure of how it will all shake out. However, my dumb little portraits are useful because they are a visual reminder of what is possible. Today will be a quick post where I walk through the components of how this works and the ethical implications.

AI Power Dynamics

There are 4 necessary components for my pictures.  

  1. Backend Compute: The computing power that the models run on
  2. Foundational Model: This is the trained AI model that has some sort of broad applicability
  3. Fine Tuning: In some cases, foundational models can be tuned to specific use cases. For example, I may use a foundational language model but then use a fine-tuning of it specifically around creating marketing copy. 
  4. Access Point: The end user will have access to the tuned model through some sort of endpoint. In some cases, it will be directly integrated into existing software and in other cases, it will be a stand-alone application. 

For my portraits this stack worked out like this:

  1. Backend Compute: These image-generation AIs require a specialized chip known as a GPU. Depending on which model you use, there are specific requirements for which GPUs can run it. It is much easier to run this locally, but because I have a very bad computer I had to run mine in the cloud. In this case, I used an NVIDIA T4.
  2. Foundational Model: The foundational model used was Stable Diffusion, which is an open-source AI model.
  3. Fine Tuning: On its own, Stable Diffusion doesn’t let me upload my own images as training data, so I used DreamBooth to teach the model my face.
  4. Access Point: While the makers of Stable Diffusion have their own UI (confusingly called DreamStudio), it doesn’t currently allow me to use DreamBooth. I’m reduced to something much more basic from Google called Colab, a hosted notebook environment for research that offers free access to GPUs.
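
For readers who think in code, the four layers above can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class name `ImageStack`, the `build_prompt` helper, and the placeholder subject token `sks person` are hypothetical names made up for this sketch, not part of Stable Diffusion, DreamBooth, or Colab, and nothing here actually calls a model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImageStack:
    """One choice for each of the four layers described above (hypothetical type)."""
    backend_compute: str
    foundational_model: str
    fine_tuning: str
    access_point: str

    def layers(self):
        # Ordered from the hardware at the bottom to what the user touches at the top.
        return [
            ("Backend Compute", self.backend_compute),
            ("Foundational Model", self.foundational_model),
            ("Fine Tuning", self.fine_tuning),
            ("Access Point", self.access_point),
        ]

    def describe(self):
        # Render the stack as a numbered list, one layer per line.
        return "\n".join(
            f"{i}. {name}: {choice}"
            for i, (name, choice) in enumerate(self.layers(), start=1)
        )


# The choices named in this essay for the portraits.
portrait_stack = ImageStack(
    backend_compute="NVIDIA T4 (cloud)",
    foundational_model="Stable Diffusion 1.4",
    fine_tuning="DreamBooth",
    access_point="Google Colab",
)

# Swapping one layer leaves the others untouched. "a local GPU" is an
# assumed alternative for illustration, not something I actually ran.
local_stack = replace(portrait_stack, backend_compute="a local GPU")


def build_prompt(subject_token, scene):
    """Hypothetical helper: pair the fine-tuned subject with a scene.

    subject_token stands in for whatever name the fine-tuned model
    was taught for the person in the training photos (an assumption).
    """
    return f"a portrait of {subject_token} {scene}"


if __name__ == "__main__":
    print(portrait_stack.describe())
    print(build_prompt("sks person", "painted by Monet"))
```

The point of the sketch is that each layer is an independent choice: a better GPU, a different foundational model, or a friendlier access point can be swapped in without touching the rest.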

Again, I am a moron. These instructions sound scary, but I promise it is stupid easy.

Even so, it’s fascinating to consider what happens when this technology is not so intimidating. I predict that everywhere there is an “upload image” button on the internet, there will soon be a “generate image” button. When GIFs first hit the internet, they spread like wildfire. Every conversation embedded those little animated images, and they permeated the cultural zeitgeist. These AI image generators will do the same. However, they will extend far beyond consumer communication tools: generating images will occur in website design, in product design, in Photoshop, and in all sorts of B2B applications. With the current state of this technology, these use cases could occur right now. What about in 2-3 years, after the technology is 5-10x better? Images will totally change in their cultural value.

This technology is exciting! However, I find myself more than a little troubled by the implications. 

Ick

As I was doing my research for this piece on AI internet forums, I found multiple guides explaining how to turn off the “Not Suitable For Work” filters on Stable Diffusion. Soon after, I found discussions on how to generate photorealistic porn with just a text prompt. After apologizing to my wife for the search history I was about to create, I found even grosser forums where users were talking about how to generate porn with people’s faces without their consent. The users were freely swapping results of using AI to put celebrities’ faces on porn. Within 6 weeks of Stable Diffusion’s release, it is already being used to exploit women. 

To put it simply, if there are 5 photos of your face on the internet, someone can now generate porn with you in it with ~30 minutes of work. Unless they upload it to the internet somewhere, you’ll never know about it. The image generation algorithms are all open-source, the code can be run locally, and no one can stop a bad actor. To be fair, doing a similar thing has been possible since the 90s with Photoshop. And since 2017, there has been popular discourse around deepfakes and the role that AI could play in non-consensual porn generation.

However, this new generation of tooling makes it possible for me, a moron, to do so. Imagine what someone with a working computer who was actually halfway decent at any of this could do. By dramatically lowering the oversight and barriers to entry for image generation, these tools have created a whole host of ugly problems.

Other ethical concerns abound. For one, this will decrease the total number of graphic designers needed in the global economy. Creative destruction is a net positive for society, but that doesn’t mean we can rejoice about the many people who will be hurt along the way.

The lazy answer for these conundrums would be “ban all open-source image generation AI!” I expect that we will see takes like that in popular media/politics in the upcoming weeks or months.

There are no easy answers, but there is easy image generation. Technology doesn’t care about the answer; it cares about progress. So I expect all of this stuff to continue to progress quickly while the rest of the world scrambles to catch up.

In the next two weeks, I’ll be publishing a fairly extensive market map of AI companies and the power dynamics at play. Make sure you subscribe so you can get access to it.

