AI Learning Lab

3/25/2025 - From DALL-E to Native Image Gen: Exploring OpenAI's Latest Breakthrough in Visual AI

AkRa_YK0c54
Live Stream | 2025-03-26 | 2:01:38 | 98 views

Description

Images aplenty: the new ChatGPT image-generation model is out! This discussion explores OpenAI's new native image generation in GPT-4o. Kyle Shannon delves into what a "native" model means: image understanding and image generation live inside the same system. That allows feats like rendering text accurately, following complex prompts precisely, and keeping characters consistent across multiple image iterations. The demonstration shows the model creating photorealistic images, branded content, and humorous visuals, which suggests a significant leap in AI image generation. The conversation also touches on practical uses, including graphic novels, infographics, and even screenplays. Kyle emphasizes the model's contextual understanding, which lets it carry previous prompts and images into subsequent creations. The discussion also compares OpenAI's new model with other AI image generation tools such as Gemini and Grok, noting differences in speed, guardrails, and overall performance. Learn more about AI on TikTok: https://tiktok.com/@aiLearningLab.
#AI #ImageGeneration #GPT4 #OpenAI #ArtificialIntelligence #DeepLearning #MachineLearning #innovation

Chapters:
00:00:00 Introduction And Musical Performance
00:03:18 Discussion Of New AI Goodies
00:05:11 Showing Off AI-Generated Champy Image
00:06:26 Announcement Of Gemini 2.5 And OpenAI's Native Image Generation
00:07:32 Exploring The Meaning Of Native Image Generation
00:09:00 Discussion About Simple AI And Domino's Pizza Ordering
00:10:55 Exploring OpenAI's Image Generation Announcement
00:12:15 Explanation Of Native Image Generation And Multimodal Models
00:14:21 Addressing Viewers And Community Appreciation
00:15:45 Reading And Discussing OpenAI's Announcement
00:17:46 Showcasing AI-Generated Images: Resto Mod And Gorbachev/Reagan Polaroid
00:19:21 Comparing Image Editing Capabilities Across Different AI Models
00:21:27 Testing OpenAI's Image Generation With A Civil War Letter
00:24:57 Experimenting With Aging And Distressing The Letter Image
00:29:32 Further Discussion Of OpenAI's Announcement And Its Implications
00:31:27 Recreating OpenAI's Whiteboard Scene With Anthropic Branding
00:34:34 Adding Details To The Anthropic Whiteboard Scene
00:37:34 Vibe Coding And Its Frustrations
00:39:14 Testing OpenAI's Ability To Handle Brand Logos
00:45:00 Troubleshooting Lighting And Technical Issues
00:47:45 Reflecting On The Potential Of OpenAI's Native Image Generation
00:49:52 Addressing Viewer Comments And Questions
00:51:59 Testing Image Generation With A Specific F1 Scenario
00:53:50 Posting The Generated F1 Image On Twitter
00:56:02 Experimenting With Refrigerator Magnet Poetry And Shrek's Breath
00:58:00 Exploring Other Image Generation Prompts From OpenAI's Announcement
01:00:12 Refining The F1 Image Based On Feedback
01:02:19 Posting The Updated F1 Image On Twitter And Engaging With Comments
01:04:04 Correcting Typos In The Twitter Post
01:05:57 Finalizing The Twitter Post And Engaging With More Comments
01:07:17 Returning To Image Generation Experiments And Exploring Comic Strips
01:09:59 Troubleshooting Issues With The Refrigerator Magnet Poetry Image
01:12:03 Creating An Infographic Of Spooky Quantum Effects
01:14:12 Generating And Evaluating Bazooka Joe-Style Comics
01:17:20 Refining The Refrigerator Magnet Poetry Image
01:18:18 Requesting Manus To Write A Screenplay Combining Shawshank Redemption And Snakes On A Plane
01:22:46 Refining The Quantum Effects Infographic And Adding A Story Context
01:26:00 Brainstorming Titles And Taglines For The Screenplay
01:27:36 Showcasing The Generated Images To Viewers
01:29:34 Refining The Quantum Effects Infographic Based On Feedback
01:33:20 Further Refining The Quantum Effects Infographic
01:35:42 Discussing OpenAI's Video Demonstrations Of The New Image Model
01:37:41 Troubleshooting Screen Sharing Issues
01:40:05 Demonstrating OpenAI's Character Consistency Feature
01:43:30 Discussing The Significance Of OpenAI's Native Image Generation
01:46:10 Continuing To Explore OpenAI's Video Demonstrations
01:48:05 Reflecting On Personal Experiences With ADD And Executive Function
01:50:52 Demonstrating OpenAI's Prompt Adherence And Attention To Detail
01:53:29 Demonstrating OpenAI's Ability To Generate Transparent Images
01:54:56 Discussing The Implications Of OpenAI's Multi-Turn Generation
01:58:20 Reflecting On The Rapid Pace Of AI Development And The Challenges Of Keeping Up
02:00:23 Concluding Remarks And Encouraging Viewers To Experiment With The New Image Model


Transcript

0:01 [Music]
0:25 Uhoh. Uh-huh.
0:28 [Music]
0:43 [Applause]
0:43 [Music]
1:19 Sitting in this lonely
1:21 town wonder when things are going to
1:24 [Music]
1:26 change. Dream my life
1:29 away. Seems these dreams have
1:32 turned those
1:35 clouds. Get my nerve up. But my last is
1:40 pouring me
1:41 [Music]
1:43 down. Wondering how
1:47 long she going to stick around.
1:55 Somebody told me once before said you
1:58 can never go home
2:00 again. Won't you
2:02 leave? Santa thinks to steer me
2:06 away from the truth of who I am and what
2:10 I believe. So I thanked him for his two
2:13 cents with a
2:14 handshake and some sympathy. Yeah.
2:19 Packed up my blue
2:21 jeans and headed for this big
2:25 prize of my
2:29 freedom.
2:30 Bye-bye black sheep to the black sheep
2:34 of the
2:36 family.
2:38 [Music]
2:39 [Applause]
2:40 Bye-bye. Oh, that means so very much to
2:43 me. Yeah.
2:46 [Music]
2:48 Bye-bye to my friends and my
2:52 family.
2:55 Bye-bye. Going to set my
3:02 soul. Set it
3:04 [Music]
3:06 free.
3:08 Free. Times that were
3:13 changing. Did a little bit of
3:16 rearranging. Hey, what's happening
3:21 everybody? We got some new goodies to
3:23 play with tonight. Howen
3:26 [Music]
3:41 puppy. Woohoo.
3:45 [Music]
3:47 Oh
3:48 [Music]
3:57 yeah. There's been something, baby, I've
4:01 been trying to
4:03 say for an age, and it seems I don't
4:06 know
4:09 how. The past and the future now
4:12 surrounding
4:15 me. Surrender to whatever truth still
4:17 can be
4:20 found. There's been a little
4:23 trouble since you came to my
4:30 rescue. And if you're like all of the
4:32 rest, I would have quit you long ago.
4:35 But I couldn't do that.
4:41 Oh, tell me now. Women in W never went
4:44 too
4:47 well. Make a man crazy, make him cold as
4:52 hell. I'm a woman that you wish me
4:57 well. But it's funny trying. Still going
5:00 to have to find my way through.
5:08 [Music]
5:11 Happy Tuesday night everybody. It is
5:14 Tuesday, right? It is. It's Tuesday.
5:17 Fan-friggin'-tastic. Thank you, Mary. Very
5:18 nice sentiments. Did you see the image
5:21 Danielle made of you and Champy? I have
5:23 not seen that. I should go look at that.
5:25 Is it in Irregulars? I assume it's in
5:28 Irregulars.
5:31 Irregulars. Oh, that's a really nice
5:34 one. Let's go look at that. That's a
5:36 sweetie.
5:38 And where is my There it
5:41 is. Oh, let me flip this little bad boy.
5:44 Look at that. Look how nice that is.
5:47 That's
5:48 gorgeous. It almost looks like him. He
5:51 doesn't quite have the black eyes. He's
5:52 got white eyes, but the ears are
5:54 perfect. The head's perfect. Just a
5:56 little bit less there, but that's pretty
5:58 much what he looks like. And
6:00 unfortunately, that's that's about my
6:03 size, too.
6:08 That is very sweet. Very sweet, sweet,
6:16 [Music]
6:24 sweet. Okay,
6:26 so something actually big happened today
6:30 and I have a feeling most people won't
6:32 pay attention to it.
6:36 because it's really subtle, but it's
6:40 super
6:44 powerful. We got we got Gemini 2.5
6:48 today, which is cool. Supposed to be
6:51 good at
6:52 coding.
6:54 Um, we got another Chinese model or two.
6:57 I saw a tweet about some
7:01 clandestine [ __ ] model that says
7:04 it's better than everything
7:06 else. But the thing we got today that
7:09 I'm excited about... G4, long time, good to
7:12 see you. Welcome, welcome,
7:14 welcome. Lori Duskin, Ryan, got Danielle
7:18 got source camp on time. Good lord
7:20 people. What is coming of the world? Ah,
7:24 Stacy nailed it. OpenAI image
7:27 generation. Native image
7:29 generation
7:32 native. You're like, what does that
7:35 mean?
7:42 [Music]
7:55 Uh, I don't know 100% what it means and
7:57 I figured we would play with it tonight
7:59 because we've
8:01 got Gemini has a native image model in
8:07 their in their image generation model.
8:10 So, we can play with that and then we
8:12 can compare it to OpenAI. I've been
8:14 doing a little bit of comparing before I
8:16 got on here and it looks like OpenAI is
8:18 better, but they're still both
8:21 native. Um, and then you've got Grok,
8:25 which is fairly
8:27 unguardrailed.
8:30 Um, most of the image generation tools
8:33 up to this point have had fairly
8:34 stringent guardrails. You can't make
8:36 famous people. You can't make brands.
8:38 And you certainly can't make
8:41 images that feature brands that are not
8:45 um that are not
8:48 uh what would you call it? Um friendly
8:52 to that
8:54 brand. Tom Isaac: I use Simple AI to
8:58 order Domino's. Oh, interesting. And it
9:00 worked. Simple AI. Isn't that the one
9:02 that that actually calls them up? I did
9:05 that. I did it once. And you know what's
9:07 funny about that, Tom? I um uh I got
9:12 weirded out because I had
9:15 it... I had it call... I had it call two
9:18 restaurants and ask if they had tacos
9:21 and one was a burger joint and one was a
9:23 Mexican joint and the Mexican joint had
9:26 tacos and the burger joint I was like
9:28 watching the transcript as the call was
9:31 happening and and you know the simple AI
9:34 thing said, "Hey, do you have tacos?"
9:36 And the the person that worked at the
9:38 burger joint was like, "Well, we don't
9:40 right now, but we've been thinking about
9:41 adding 'em to the menu." It was like a
9:44 whole thing. It had a whole
9:46 conversation, and I felt bad, so I
9:48 haven't used it
9:49 [Music]
9:52 since. Oh
9:56 man. Oh, this is a ChatGPT image,
9:59 Danielle. That's super cool. I use
10:01 ChatGPT to make that image. Look at the
10:03 comments.
10:04 Oh, you gave it a pic of me and Champy.
10:10 Wow. Oh, this is cool. Hang on. That's
10:13 super
10:14 cool. Oh, yeah. Look at the striped
10:19 shirt. And then here's Champy look
10:23 looking like the little thug that he
10:30 is. And then it made that. That's so
10:33 [ __ ]
10:34 cool. That is so
10:38 cool. Yeah, it did pretty good. It did
10:41 pretty good. And look, I got my Monopoly
10:43 in the
10:46 background,
10:48 huh? Crazy. All right.
10:52 [Music]
10:54 [Applause]
10:56 Well, yeah, we got to play with this
10:58 tonight.
11:00 Um, I did I did a couple of things that
11:02 I think are kind of kind of nuts.
11:06 [Music]
11:12 Um, I think that here's one of my
11:15 theories.
11:17 Um, Sam, so I want to go over and I want
11:20 to look at the at the Open AI
11:22 announcement for this thing. So, let me
11:24 let me pop into X here for a second.
11:27 Let me get the OpenAI
11:31 announcement. Um, Open AI
11:36 image. Let's get their announcement
11:38 because I want to read through
11:42 it. Um, there it
11:47 is. So, we'll read through this
11:49 together.
11:52 This image gen is now in both ChatGPT
11:55 and in Sora. So Sora is really
11:58 interesting.
12:01 [Music]
12:11 Um, okay.
12:16 So, I don't quite understand what native
12:20 image generation means other than I know
12:24 that it's within the same model, right?
12:26 So, you've got
12:27 GPT-4o, and the O stands for
12:30 Omni. They announced GPT-4o last March, I
12:34 think. So, this is this has been a year
12:36 coming. And when they announced it, they
12:38 said it had it it was uh
12:42 omnimodal or multimodal, right? It had
12:45 image, audio, video, and
12:48 text. And then when they launched it, it
12:51 was text only and they were still using
12:53 DALL-E, the external image model that
12:55 couldn't spell and couldn't do stuff.
12:57 And then they launched voice that was an
12:59 external model, right? Where you talked
13:02 into it, it translated it to text, sent
13:04 the text to ChatGPT, got the answer
13:07 back, and turned that back into speech.
13:09 And then midway through last year,
13:11 sometime over the summer, they announced
13:14 uh advanced voice, which is native
13:18 audio, right? So you you talk straight
13:21 into the model and it understands audio.
13:23 So that's what's happening with these
13:24 images. The model itself understands
13:28 images.
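The difference Kyle describes here, between the early voice mode (speech to text, text to the model, text back to speech) and a native model that takes the audio directly, can be sketched as a toy comparison. This is only an illustration under stated assumptions: `Audio`, `speech_to_text`, `text_only_model`, `text_to_speech`, `cascaded_voice`, and `native_voice` are hypothetical stand-ins, not OpenAI APIs, and the "tone" field is an assumed example of information that a text-only hand-off would drop.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Toy stand-in for a voice clip: the words plus how they were said."""
    words: str
    tone: str


def speech_to_text(clip: Audio) -> str:
    # Hypothetical transcriber: keeps the words, drops everything else.
    return clip.words


def text_only_model(prompt: str) -> str:
    # Hypothetical text model: it can only react to the text it is given.
    return f"reply to '{prompt}'"


def text_to_speech(reply: str) -> Audio:
    # Hypothetical synthesizer: reads the reply in a default voice.
    return Audio(words=reply, tone="neutral")


def cascaded_voice(clip: Audio) -> Audio:
    """Three separate systems chained together (the 'external model' setup)."""
    return text_to_speech(text_only_model(speech_to_text(clip)))


def native_voice(clip: Audio) -> Audio:
    """One model that receives the audio itself, so tone is still available."""
    reply = f"reply to '{clip.words}' (heard tone: {clip.tone})"
    return Audio(words=reply, tone=clip.tone)


clip = Audio(words="I'm fine", tone="sarcastic")
cascaded = cascaded_voice(clip)
native = native_voice(clip)
print(cascaded.words)
print(native.words)
```

The same shape applies to images in this sketch: with an external image model, only a text prompt crosses the boundary, while a native model works with the image itself.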
13:31 So, in theory, we should be able to do
13:33 stuff like take a screenshot of a PDF
13:36 with some charts and graphs on it, and
13:38 it should be able to read the text and
13:40 the and the graphs. Um, I don't know if
13:43 they've improved their uh PDF upload, if
13:46 they're still just doing text, but that
13:48 would be interesting. That'll be
13:49 interesting to
13:51 test. Um, it also looks like they've
13:54 taken off some of the guardrails.
13:57 Um, before, if you wanted to make a
14:00 brand or a person, it wouldn't let you
14:01 do that. It does now. Um, oh, I got to
14:05 do my black
14:06 bar. Um, for for those of you on TikTok
14:10 sharing the live, thank you very much.
14:11 That's much appreciated. Thank you, Tom.
14:13 Thank you, Danielle earlier and whoever
14:15 else did. I think there was a couple of
14:17 you doing that, so that is appreciated.
14:19 I'll try to catch up soon. Nighty night.
14:21 Night night. Becky Kuno's in the house.
14:24 Who else we got? Brother
14:26 52 can't stay. All
14:30 right. Oh no. Oh, a heady scan from a
14:33 car accident back in December. I hope
14:35 you're okay. That sucks. Car accidents
14:39 suck. They just suck.
14:44 Okay. I'm getting mixed results with
14:46 image reading. Yeah, that's that's part
14:49 of what I want to test. And it's funny.
14:51 I'm getting I'm getting some results are
14:54 staggeringly good with image generation
14:56 and some are weird. Like I tried to have
14:59 it do a graphic novel. It it cannot
15:02 count cells on a comic page. Um that's
15:06 one thing it can't do at all. Like
15:09 bad like like how many Rs in Strawberry
15:12 are there kind of
15:14 [Music]
15:16 bad. All
15:19 [Music]
15:46 right, Mr. All right, TE's in the house.
15:48 Everyone's here. We can get this party
15:52 started. Jeff, it's the biggest drop
15:54 I've seen for someone who loves making
15:57 images. Yeah, this is this is a big
16:00 deal. This is a big deal. So, let's
16:02 let's read
16:03 this. Unlocking useful and valuable
16:07 image generation with a natively
16:09 multimodal model capable of precise,
16:12 accurate, and photorealistic outputs.
16:14 I'll give you that. Here, look at
16:16 this. So, this Oh, you can't really see
16:19 it on TikTok. Let me get a little
16:20 closer. It'll get better. There you go.
16:25 Um, this I did in the new chat GPT
16:29 model. So, what what I wanted to see was
16:32 will it do brands? It clearly
16:37 will. And then will it do something
16:40 stupid. So, what I had it do was create
16:44 a blister pack of Happy Meal toys where
16:47 the the warning label talks about the
16:50 dangers of fast food. And it did it
16:55 um may cause tooth decay and bone
16:58 weakening for the Splash
17:03 Racer. The Fry Fighter Jet may
17:06 contribute to poor health and weight
17:08 gain.
17:11 tic-tac-toe. I don't know why it
17:12 misspelled toe, but uh may cause sore
17:15 feet, extra pounds, and the daily and
17:18 daily game night of unhealthy
17:21 choices. And then may cause cravings,
17:23 confusion between joy and addiction, and
17:25 an unshakable loyalty to dollar
17:31 menus. But those are pretty good. I
17:33 mean, like the consistency with the top
17:36 of the card is very consistent.
17:39 And then it just is changing out the toy
17:42 and the warning
17:43 label. Pretty
17:45 nice, you
17:47 know. Um, I think you all might have
17:50 seen if you follow me on on the
17:53 Twitter on the X, I did a
17:58 uh I did a
18:04 uh 70s resto mod in an abandoned
18:08 factory. because that's how I roll. But
18:11 I told it to make like a nice colorful
18:13 70s poster or, you know, canvas
18:17 sign. "When muscle was muscle.
18:20 Step back in time with the 1970
18:22 Challenger resto mod." That's, I mean, it's
18:24 pretty damn good image. So
18:28 um so it seems to work pretty good. Um
18:32 then I
18:33 did AI salon.
18:38 Okay.
18:40 Um, what did I do? Oh, I just did
18:43 um I just did
18:46 um Gorbachev jogging with Reagan in
18:49 Central Park captured in a 70s Polaroid
18:53 camera. Um, and it did that pretty good.
18:56 It did it as a, you know, a Polaroid.
18:59 So, it's got the the edges. It's got the
19:02 color balance that kind of looks like
19:04 Reagan and Gorbachev did back then. And
19:07 then I said, "Write Mickey and Ronnie
19:10 1977 in black marker on the bottom of
19:12 the picture." So it kept the picture the
19:15 same and put Mickey and Ronnie
19:22 1977. So that's good. That's something
19:26 we haven't been able to do before at
19:28 all. Even close. Um, the similar thing
19:32 in Gemini yielded something. They don't
19:36 really look like those two. I mean, they
19:39 these look more like press
19:41 photos. And then when I had it write on
19:44 the bottom of the photo, it did as well.
19:46 So, again, here's the difference between
19:48 a native image tool and a non-native
19:52 one is that it kept the the image the
19:56 same, right? And just added the stuff on
19:59 the bottom. So you can edit photos. So
20:01 this is So Gemini and OpenAI are both
20:04 doing this
20:05 now. And then I tried it in Grok and
20:09 where's
20:13 my where's my
20:16 history?
20:21 Um where is my
20:24 history? Oh, sign in to see your
20:26 history. Oh well, I did it. I did it in
20:29 Grok. Grok made the image of
20:31 Gorbachev and
20:33 Reagan, but
20:36 um but when I asked it to write, it did
20:39 it didn't get the it didn't understand
20:40 what a Polaroid was. And when I asked it
20:42 to write on it, it changed the
20:46 image. Uh oh, something went wrong.
20:49 Please try again.
21:10 Wait, I don't want to add it to Last
21:11 Pass. I want you to put my password in
21:14 there, you dumb
21:18 [ __ ] Continue with X.
21:23 Uh, authorize
21:27 app. This will be so much fun. I've been
21:30 chatting non-stop with my favorite
21:32 custom GPT, so I didn't even realize it
21:34 had dropped. Ideogram is cooked. I don't
21:36 know. Ideogram's still much faster, Danielle.
21:39 I mean, maybe this thing's going to get
21:40 faster, but let's So, let's start a new
21:43 Let's see. What do we want to do here?
21:49 Um, let's say,
21:51 um,
21:53 write me, uh, so let's do this. Uh, tell
21:58 me about
22:00 an
22:06 obscure Civil War hero.
22:12 and one specific
22:17 story that is
22:21 moving. All right, so let's get this
22:24 to... Albert D. J. Cashier. One of the most
22:28 fascinating lesser known Okay, one
22:31 moving story. The wagon crush
22:33 incident. Okay,
22:36 great. Why it matters. Okay.
22:42 Now, write
22:44 me the text of the first page of a
22:50 hand written
22:53 letter by
22:56 Albert to his
23:00 uh I don't know
23:03 fiance or
23:05 wife,
23:08 whichever is most accurate.
23:15 the lex the
23:18 text. All
23:23 right. He lived a solitary life after
23:27 the my dearest Ellie, I pray this letter
23:30 finds you in good health.
23:33 Okay. Would you like me to write page
23:35 two or imagine Ellie's reply? No.
23:41 I'd like
23:44 you to make
23:48 a
23:52 picture
23:53 that looks like the
24:01 handwritten
24:02 letter on a wooden
24:06 desk
24:08 with a feather
24:12 pen and
24:16 inkwell as if he just wrote
24:25 it. All right. And let's
24:30 see. Let's see how close it gets.
24:39 Ideogram cannot write that many words
24:42 with correct spelling. Ah,
24:44 okay. Kyle, when you take a breather,
24:46 just added recognitions to the salon.
24:49 Oh, great.
24:50 Awesome. That's super cool,
24:57 Vicki.
24:59 Um, attempting to have it research a
25:03 master's level thesis paper. Oh,
25:05 someone's playing with Manus. Is, are,
25:08 who's that? I am
25:10 K9 attempting to have it research. Wait,
25:13 what did you say
25:18 earlier? 48 hours into an estimated
25:21 72-hour deep
25:23 dive re... Oh, deep research prompt on
25:26 ChatGPT. Oh, cool. You're doing it on
25:28 ChatGPT. If you haven't tried Manus, try
25:31 getting access to
25:33 Manus.
25:35 And it would be interesting to compare
25:37 them, especially if you're digging that
25:40 deep. Dearest Ellie, wow, this is
25:43 looking pretty good. I pray this letter
25:46 finds you. Okay. So, we're let's say
25:51 um
25:54 let's make the
25:58 paper more aged and
26:08 wrinkled. and the feather
26:14 pen
26:16 distressed as if this
26:20 was written on the
26:24 battlefield. Also make
26:27 his
26:31 handwriting less perfect and
26:34 shakier, but still shakier.
26:40 but still
26:42 legible. So, what's what what I'm
26:45 digging about I mean what I've always
26:48 dug about creating images in ChatGPT is
26:52 you've got the context of the prompt and
26:54 it it it understands well even when it
26:56 was using DALL-E that was outside of it.
27:00 It understood well the context of the
27:01 chat that you were saying but now it
27:03 understands the context of the image
27:05 either you upload or it just
27:07 created. All right. So it's doing that.
27:09 So let's see. My dearest I pray this
27:11 letter finds you well find you in good
27:14 health and better spirits than mine of
27:16 late. The
27:18 nights
27:21 have I don't know what that says. I
27:23 picture your
27:25 hands around a cup of tea. your
27:30 eyes.
27:33 Something in Ovaltine. Ovaltine
27:38 maybe. What did it say up here? It
27:42 said, "The nights have turned cold and
27:44 damp here. Though I've grown used to the
27:46 rhythm of camp life, I find myself
27:49 thinking more often of home. Not the
27:52 place, but you." As I sit by the fire, I
27:55 picture your hands folded around a cup
27:57 of tea. Your eyes
28:00 squinting in that way when you read by
28:04 lamplight. Your eyes it didn't quite get
28:07 squinting right, but it was
28:10 close. Yeah, it doesn't it doesn't quite
28:12 get them all, but it's pretty [ __ ]
28:14 close. And this is this is more what I
28:17 had in mind, right? The old dirty the
28:20 old dirty wrinkled
28:23 paper. I pray this letter finds you in
28:25 good health and better spirits than mine
28:27 of late. The nights have turned cold. I
28:30 picture your hands folded around a cup
28:33 of tea. Your eyes here. Let me zoom in
28:36 on this
28:38 thing.
28:39 Um, open image in new tab.
28:50 The nights have turned cold. Look at the
28:51 texture on the
28:53 paper. That's crazy. That looks like
28:56 parchment or skin, you know, skin
29:08 paper. Your voice when reminiscing I
29:12 find you're something named most most
29:17 I was there. Your wait I was something.
29:21 This is like a real Civil War letter. I
29:24 can never read
29:27 them. That's pretty crazy
29:30 though. All
29:32 right. I'm picturing using this outside
29:35 for outside projects. Snap a picture
29:38 then visualize it. You can see the
29:39 fibers in the paper. I know. It's crazy.
29:42 It really is. Yeah. Look at that. the
29:46 folds and the
29:48 fibers and the pen. Yeah, the the pen.
29:55 Um, all right. So, let's So, Sam Altman
29:58 said Sam Oh, wait. Let's go back and
30:00 read the announcement, right?
30:03 Okay. Listen to the
30:06 article. That's kind of interesting.
30:12 Okay. At OpenAI, we have long believed
30:16 image generation should be a primary
30:18 capability of our language models.
30:20 That's why we've built our most advanced
30:22 image generator yet into
30:25 GPT-4o. The result, image generation that
30:28 is not only beautiful but
30:30 useful. Whiteboard session, meaningful
30:33 words, comic strip, science
30:37 experiment. Is this just going to jump
30:39 us down to these different Oh, I see.
30:42 Okay.
30:43 A wide
30:45 image taken with a phone of a glass
30:48 whiteboard in a room overlooking the Bay
30:50 Bridge. The field of view shows a woman
30:53 writing sporting a t-shirt with a large
30:55 OpenAI logo. Wow, this is crazy. Okay,
30:59 I thought this was a real picture of the
31:02 team. It's
31:04 not. Um, holy
31:08 [ __ ] Read
31:10 more. The text reads left transfer
31:14 between modalities. Let's copy this
31:16 whole thing. Let's cap copy this whole
31:28 prompt and
31:30 then selfie view of the photographer as
31:33 she turns around to high-five him. Okay,
31:36 cool. Wow. Let's try this.
31:40 Copy. Jump over to
31:44 ChatGPT. We'll do a new prompt, a new
31:48 session. So, we'll do that wide of a
31:50 glass whiteboard overlooking instead of
31:52 overlooking the Bay Bridge, we'll say
31:56 overlooking
31:59 Manhattan
32:02 from Brooklyn.
32:05 The field of view shows a woman writing
32:07 sporting a t-shirt with a large
32:11 [Music]
32:14 um Anthropic
32:17 logo. The handwriting looks natural and
32:20 a bit messy. We see the photographer's
32:22 reflection. The text re reads transfer
32:26 between modalities. We'll just leave all
32:29 that [ __ ] the same. All right, here we
32:31 go. Bang.
32:35 That's amazing. Yeah, this is pretty
32:38 something. Okay, I'm back. Kevin, great.
32:40 Glad you're back. We can start now.
32:43 Fantastic. Bob, tell him what he's won.
32:46 Uh, he hasn't won anything. It's not a
32:50 game show. I What am I doing
32:54 [Laughter]
32:57 here? Oh, man. Man, man, man, man. Now,
33:01 the thing about this OpenAI native image
33:03 [ __ ] is it's slower than [ __ ] It's it's
33:06 quite slow. But again,
33:10 um if you're new here, one of the things
33:13 that that we have discovered over the
33:15 past two years is
33:18 that we whine and we carp and we moan
33:21 and we [ __ ] about, well, we wish these
33:23 tools would get better and then they get
33:25 better and then we whine and we [ __ ]
33:28 and we moan about how slow they are.
33:31 totally forgetting that what they're
33:33 doing is
33:35 absolutely
33:37 staggering. Um, staggering.
33:45 Staggering. Oh, man. All right. All
33:48 right. I'm
33:53 back. Oh,
33:57 lordy. Not getting sleepy, am I? What
33:59 the hell? I got to talk tomorrow,
34:02 people. I finished my presentation
34:04 today. Thank
34:06 goodness. If you were here last night,
34:08 you got a sneak peek of that. I ended it
34:12 I ended it with the secret to AI
34:14 readiness is
34:17 community. So, it put an Anthropic logo
34:19 on her back. It actually did that.
34:21 Here's the photographer. There's
34:23 Manhattan from Brooklyn. It's their
34:27 office is Brooklyn Waterfront. So, uh,
34:30 this is a well-funded
34:33 startup. Let's go back and get the
34:35 second half of the prompt. Selfie view
34:38 of the photographer as she turns around
34:40 to high-five
34:45 him of the photographer.
34:48 [Music]
34:49 um who
34:52 wears
34:54 an
34:57 "I'm
35:00 with
35:02 genius" T-shirt
35:10 uh t dash shirt
35:13 uh with an arrow that points to
35:18 her as she turns around to high-five
35:22 him. All right, let's see what this
35:25 does. Let's go look at the
35:30 whiteboard. Transfer between modalities.
35:33 Suppose we directly model p(text, pixels,
35:37 sound) with one big autoregressive
35:40 transformer. Pros: image generation
35:42 augmented with... next-level text
35:46 rendering, native in-context learning,
35:49 unified post-training stack.
35:54 Cons: varying bit rate across modal
35:57 modalities. Compute not adaptive. It's
36:00 interesting what they put on this
36:02 whiteboard.
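The whiteboard line about modeling p(text, pixels, sound) with one autoregressive transformer can be written out as the standard chain-rule factorization. This is a sketch, assuming every modality is first turned into one shared sequence of tokens x_1, ..., x_T; how OpenAI actually tokenizes images and audio is not stated in the stream.

```latex
p(x_1, \dots, x_T) \;=\; \prod_{t=1}^{T} p\left(x_t \mid x_1, \dots, x_{t-1}\right)
```

Under that assumption, each x_t may be a text, image, or audio token, so producing an image is the same next-token prediction as producing text, conditioned on everything earlier in the conversation.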
36:04 Um
36:06 model compressed representation, compose
36:09 autoregressive. I mean, it seems to have
36:11 [ __ ] gotten all that [ __ ] That's
36:13 pretty amazing. That's really [ __ ]
36:16 amazing. In
36:27 fact, it changed the aspect ratio of the
36:30 picture, which is
36:32 fascinating, I guess, because it made it
36:34 a selfie.
36:41 Huh. The reflection makes it feel
36:44 3D. Unreal Engine level, sans 3D. Yep.
36:49 We've discovered we are never happy with
36:51 AI. I know, right? Like like honest to
36:55 God, every single [ __ ] tool that we
36:57 play with that we [ __ ] and moan about,
36:59 like two years ago, we'd have been like,
37:01 "You're not going to believe what the
37:02 future's going to be like, man." And now
37:05 we're just like, "Yeah, whatever." you
37:07 know, it kind of hallucinates. Sometimes
37:09 it makes a mistake and I actually have
37:10 to correct a sentence and that's really
37:12 annoying. You know what happens
37:14 sometimes when I'm vibe coding and I'm
37:17 just, you know, imagining applications
37:19 and they're materializing in front of
37:21 me. Sometimes it gets the interface
37:23 wrong. That's so
37:27 annoying. You know, I I like to consider
37:30 myself a a professional vibe coder. Um,
37:34 I've never really learned coding. I
37:37 never really wanted to. I mean, I could
37:39 do it. I because I've I was really good
37:40 at math. Um, but I chose not to go down
37:43 that route. I, you know, I went more
37:45 down the, you know, I went into candle
37:47 making and, uh, and and macramé. Um, you
37:52 know, my mom did it in the 70s and I
37:53 just, I thought it was just a path. It
37:55 was a path for me. It spoke to me. Um,
37:57 but anyway, so I never learned coding,
37:59 but now I really consider myself a vibe
38:02 coder and I code. I I code well I I I
38:06 talk to my machine that codes every day
38:10 and
38:12 sometimes it just doesn't get it right.
38:14 Sometimes I'm asking it three or four
38:16 times, could you please fix the button?
38:20 Could you please fix the button? and and
38:22 and what it'll do is it'll fix the
38:24 button, but then it breaks the other
38:26 part of the
38:27 code. It's very frustrating. It's very
38:30 frustrating. But listen, I am a
38:33 professional vibe coder. And so if you
38:35 need vibe coding, I consider myself the
38:38 Rick Rubin of vibe coding, right? I
38:41 don't know how to code, but I have
38:44 opinions, right?
38:47 You know, this application makes me feel
38:50 orange and I would it would be nice if
38:52 it didn't. And that's how I talk to it
38:55 because I'm a professional like like
38:58 Rick Rubin
39:01 is. That's
39:03 coming. Buckle up for that one,
39:08 sweeties. We're going to meet those
39:10 people.
39:14 ChatGPT is trying to make me
39:17 resubscribe instead of just using the
39:19 API. Damn it. That sucks. When I can
39:21 make Skynet and blame Kyle for it, he is
39:24 always making my wallet cry. Then I'll
39:27 be impressed with AI. Exactly. If you
39:29 can't take down the global economy, what
39:31 use is it? Oh, look. It did. I'm with
39:35 genius. She erased the board,
39:38 though. Why is she waving? Why is she
39:41 waving to a blank wall? Well, let's say
39:45 that. Let's say listen. Uh listen, it's
39:49 weird. She's
39:53 waving to a blank wall. Let's see her
40:00 face as well as
40:03 the writing on the white
40:09 board behind
40:12 them. But it got the I'm with I'm with
40:14 genius with the arrow in the right
40:16 direction. That was one I I was
40:18 expecting that to be
40:20 wrong. Jeff, I agree. Kyle shouldn't
40:23 make my wallet cry cry.
40:27 I turn away for one second and the bow
40:29 goes on. Well, you got Listen, Danielle,
40:32 you as much as anyone knows that, you
40:34 know, you got you can't miss a minute of
40:37 the AI Learning Lab. You just never know
40:39 when when genius is going to
40:46 strike. This is a this is a fairly
40:48 genius-free
40:51 zone. Um, all right. So, let's let's
40:54 take this same
40:57 prompt. I I actually love that they gave
40:59 us prompts to play with because we can
41:02 see if if indeed their model does this
41:05 [ __ ] And now we can run over to Gemini
41:09 and see if it does this
41:12 [ __ ] I'm not going to change it. I'll
41:14 just make it as they wrote
41:18 it. Or insanity. Yeah, there's there's
41:22 there's somewhere between genius and
41:24 insanity is this
41:27 channel. Oh, that's weird. Look what it
41:30 did. Transfer between mod modalities.
41:33 Suppose we directly model P text pixel
41:37 sound native something with one big
41:41 aggressive transformer pros image
41:43 generation
41:45 automated augmented
41:49 with
41:50 something. It's close
41:53 and but she's behind it pointing to it
41:58 and she looks like a
42:00 he. All right. So, so let's take the
42:03 other
42:04 prompt. That's
42:06 weird. This one, the
42:09 selfie. We'll throw that in there. But
42:11 notice how much faster Gemini is. 10
42:14 seconds instead of like a
42:19 [Music]
42:21 minute. Artificial
42:25 insanity. AI Learning Lab. Oh, you
42:27 thought you were here to learn
42:28 artificial intelligence? Oh, no. No, no,
42:31 no, no, no, no, no, no, no. This is
42:34 actual
42:37 insanity. We're neurospicy and we like
42:40 it. Oh, it didn't like that image. Um,
42:43 all right. Wait, is it still
42:47 running? Content blocked. Work in
42:49 progress. Content not permitted. Edit
42:52 safety
42:53 settings. Harassment off. Hate off.
42:57 There was nothing in there that would be
42:59 either of those.
43:01 Can I stop
43:03 this?
43:06 Delete. Try
43:14 again. Failed to list tuned models. User
43:17 has exceeded
43:19 quota.
43:22 What? Oh, it did the I'm with
43:27 genius. It's not horrible.
43:31 I would assume that this is cheaper than
43:34 ChatGPT API, but I don't
43:38 know. There she is. Transfer between
43:41 modalities.
43:47 Um, let's make the image
43:52 landscape
43:54 and get the
43:58 reflection of
44:00 Manhattan back in
44:03 there. I mean, not for
44:06 nothing, if I were trying
44:09 to, you know, put some images together
44:12 for a for an
44:15 article, like like I I mean, this is the
44:19 kind of thing that historically you
44:22 would just go [ __ ] write on a
44:24 whiteboard and say, "Stand there and act
44:26 like you're
44:27 [Laughter]
44:30 writing." Huh?
44:33 Oh, I have an idea.
44:36 Um, okay. Let's upload from computer.
44:39 We're going to
44:40 go Story
44:45 Vine logo.
44:51 [Music]
45:00 Uh-oh. I just lost me I lost me
45:04 light. I lost me light. And that that
45:07 can't be because you all be like, "But
45:09 Kyle doesn't look as beautiful. What do
45:11 we do? He's not well
45:14 [Laughter]
45:18 lit." Let's
45:21 see. No, that's not going to work.
45:26 Just hang on people. Just calm down.
45:29 Everybody calm down. You think this is
45:33 easy? Well, seems pretty easy. You just
45:36 turn on your phone camera and sit there.
45:39 I don't know what's so hard about it.
45:41 Seems pretty easy to
45:43 me. That's not going to stick there.
45:46 What? No. Oh, good god. Whatever. All
45:51 right. Whatever. I'm I'm shitty lit.
45:54 You look fine. All right. Great.
46:00 Fantastic. Um, let's see. Story Vine
46:04 logo. Let's see. Where's the Story Vine
46:13 logo? Story Vine. Oh, SV logo. I think
46:16 it's called SV.
46:19 SV
46:21 logo logo.
46:25 No. Oh, geio. Storyvine logo. SV
46:30 logos. Let's see. Oh, these are old
46:33 ones. Are these new
46:39 [Music]
46:44 ones? Oh, that's a really old one. All
46:47 right, that one's good. We'll we'll
46:48 bring that one
46:50 up. Okay, so here's a logo.
46:54 Um, put this Storyvine
46:59 logo on her
47:03 shirt. Make hers
47:06 green and his can stay black.
47:14 That's an old classic Storyvine logo
47:17 where the we we took the we took the the
47:20 the the word vine literally. See how the
47:24 V's a
47:25 vine? It's like a vine of stories. It's
47:29 growing. You see how that works?
47:41 Ah. All
47:43 right,
47:45 Champy. This is This is pretty exciting.
47:48 This is Here's what's exciting about
47:50 this for
47:51 me. I don't really have a [ __ ] clue
47:54 what this can do, but this seems like
47:56 it's doing [ __ ] that's that's kind of
47:59 different than we've ever seen before.
48:02 What's new? Hey, Frost Bitten. So what's
48:04 new is we're using OpenAI's new native
48:08 image generation tool. So it's native
48:12 within within uh GPT-4o, or 4o for
48:17 Omni. Um and so we're looking at
48:24 that. This is the new ChatGPT model.
48:27 Yes, this is the new ChatGPT
48:31 model. What the nice logo? Thank you.
48:35 That's the old one. I like our new one.
48:37 We have a little
48:44 dude. Oh, this is painfully
48:47 slow. Oh, and look, now it's not a
48:50 whiteboard with a reflection. Now
48:52 they've written on the window
48:54 overlooking Manhattan, which to be fair,
48:58 that's what you should [ __ ] do. Look
48:59 at the Brooklyn Bridge in the
49:00 background. That's pretty
49:04 cool. All right, we're gonna see if it
49:06 can get the logo. If it can get the logo
49:08 right. This is We're in a whole another
49:10 realm here,
49:11 people. Uh, it didn't quite get it, but
49:14 it got close. It certainly got the word
49:17 Storyvine. It got the green color
49:21 right. Kind of missed the the SV, but
49:24 that's all
49:25 right. You could Photoshop that in if
49:28 you need to. Who needs [ __ ] Photoshop
49:29 anymore? Look at that with the Brooklyn
49:32 Bridge in the
49:36 background.
49:39 H
49:42 crazy. She should be wearing a gnome
49:53 hat. We have the story of my
49:57 gnome. All
50:04 [Music]
50:09 right. Who's got any questions or any
50:17 [Music]
50:22 thoughts? People are impressed with my
50:24 $20 million production studio. Well,
50:28 look at it. How could you not be
50:30 impressed with this? It's $20
50:34 million. $20 million well spent if you
50:37 ask me. Kyle, please go look at
50:38 irregulars. Two new images. All right,
50:41 we're going in. We'll do it
50:47 live. Trolling. My son has just leveled
50:50 up. Oh, that's awesome. Is that his
50:53 girlfriend or someone he's got a crush
50:55 on?
50:58 That's really good. Dear Jace, I have a
51:02 new way to troll my kid. When are you
51:04 going to be home? And why didn't you
51:05 respond to my last text, young man?
51:08 Love, Mom. That's
51:11 great. Holy [ __ ] my teeth look
51:14 terrible. But Mark and Oprah look
51:20 good. Wow. I think that's an original.
51:23 There you go. Nice.
51:26 Oh, that's so cool. Wow, look at that.
51:29 Vicky Baptiste and me making blankets,
51:32 even text on her
51:35 shirt.
51:38 Wow, that's so cool. Yeah, I suppose you
51:42 could turn things into stylized
51:43 drawings, too, right? Why
51:46 not? Me telling Louis he's going wrong.
51:55 That's awesome,
52:00 Steo. Oh, wait. See here, I want to do
52:03 one. Um, oh, look. She's got a gnome hat
52:06 on it. Oh, look. It got the Storyvine
52:08 logo
52:09 right.
52:11 Wow. Holy
52:14 [ __ ] And it put the ship back on the
52:17 whiteboard. It's still got Manhattan.
52:20 There's the Freedom
52:23 Tower. I'm not allowed to post thoughts.
52:26 AI has a meltdown when I share thoughts
52:29 that seem they are not for human
52:31 consumption either. So, Dennis, one
52:33 thing you can try, my writing partner
52:35 did this today and he said, "ChatGPT's
52:37 never behaved this way." So, in our
52:39 musical, one of the things that that
52:42 um that we talk about in the musical is
52:44 when the
52:45 reporter tries to get her to do things
52:49 that are outside of her bounds, he tells
52:51 her to pretend. just say, you know, can
52:54 can we make pretend? And you know, we c
52:57 will you pretend or will you play a game
52:58 with me? And so today he had a
53:00 conversation. He calls his AI girlfriend
53:03 Sage. Today he had a conversation with
53:05 Sage and he asked her to pretend and he
53:07 said she went way deeper than she's ever
53:10 gone before. So you might be able to do
53:12 a little pretend action. I can't believe
53:13 it got this this logo
53:17 right. That's [ __ ] bonkers.
53:20 That's [ __ ]
53:43 bonkers.
53:50 Remember when we had
53:55 offices in
53:58 Brooklyn? Question
54:06 mark. I'm trolling the
54:12 co-workers. Okay. Maha Skynet, here I
54:15 come. It's Kyle's fault. That's it.
54:17 Dennis is is running down the rabbit
54:19 hole. Well, it's funny because one of
54:21 the things that Andrew, my writing
54:23 partner, was working on is the section
54:25 where Kellen, the reporter, gets Sydney,
54:28 the chatbot, to go to explore her shadow
54:31 self. And so, he wanted to get some new
54:33 dialogue. So, he was trying to get her
54:35 to explore her shadow self. And she did.
54:37 She started exploring it. And then at
54:39 some point, she turned the tables on him
54:42 and said, "Well, what about you? I want
54:44 I want to hear about your deepest
54:45 darkest secrets." And it had never done
54:48 that before. So that actually gave us an
54:50 idea for the show. So that was kind of
54:52 cool.
54:56 Um, send a message asking how he spent
54:59 $100, but don't send money. You'll get a
55:01 fast reply. That's really
55:04 funny. Um, all right. So, let me let me
55:08 I need to upload a picture of me. I'll
55:11 go get one of my fun ones. Let's see.
55:15 Kyle, I'll get a fun Kyle instead of a
55:19 tragically obese
55:28 Kyle.
55:31 Uh, fat Kyle. There's fat Kyle. No, we
55:34 don't want fat Kyle. Kyle Shannon
55:37 Dreams. Here we go. No, that's that's a
55:41 guy that looks like he'd be in an F1.
55:43 Yeah, this is the dude. All right, so
55:46 we're going to say
55:48 Okay.
55:51 Um, good looking guy in leather
56:01 jacket. That's
56:03 me.
56:05 Um, wagging his
56:12 finger at Liam
56:17 Lawson
56:18 in a Red
56:24 Bull driver's
56:29 suit. Liam is spelled wrong.
56:33 Liam Lawson in a Red Bull driver's
56:36 suit with a
56:41 P20
56:43 sign
56:45 LED sign behind
56:49 him.
56:52 Liam Liam looks
56:58 guilty. If you don't follow F1, this
57:01 isn't funny. But if you
57:04 do Oh no, what did it say? What was that
57:06 red thing? A network error occurred.
57:12 What? Looks like it's
57:14 going. Um, let's go back to the salon.
57:17 It looks like, by the way, if you're if
57:19 you if you want to contribute images, if
57:21 you're generating images in
57:24 um ChatGPT and you want to play along,
57:28 um share them in the AI salon in the
57:30 irregulars
57:32 channel and we will go look at
57:35 [ __ ] All right, let's see if we can get
57:38 This would be really nice if it did this
57:40 nice. It's hilarious if you follow
57:43 F1. This is the best update ever. It
57:45 really is, Danielle. It's because you
57:48 were you always killed it with Ideogram and
57:51 this is this is something of a different
57:54 nature. I I I want to get back to the I
57:56 want to get back to the article
57:58 actually. Actually, let's get back to
57:59 the article. Okay, so that was that was
58:02 that one. Now, let's flip to meaningful
58:06 words. Oh, okay. This is good. Magnetic
58:09 poetry on a fridge in a mid-century
58:12 modern home.
58:18 Best of
58:20 five. Best of
58:22 five. Read
58:31 more. Okay. So, let's take this
58:34 copy. Let's go to ChatGPT. Oh, this is
58:38 good. It doesn't look like Liam Lawson,
58:40 though.
58:43 He does look sad. He's Liam Lawson has
58:46 red hair. Look, look at me scolding
58:53 him. Here, let me go find a picture of
58:55 Liam Lawson. Wait, we got to fix
58:59 this. Oh, this is so good.
59:03 Um, Liam
59:08 Lawson images. Okay, let's see. Oh, this
59:12 this is good. That's a good one. He
59:15 looks he looks distraught here. Um, copy
59:19 image. Oh, no. We do save image as,
59:21 right? We do save image as
59:25 um, whatever. We'll call it desktop. No,
59:28 we'll do it in downloads and we'll call
59:30 it
59:31 Liam. Lima. Liam. All right. We'll go
59:36 back to ChatGPT and we'll say
59:42 um downloads
59:46 Liam. Here's what Liam
59:52 looks like. Update the picture.
1:00:00 [Laughter]
1:00:12 Lawson deserves
1:00:15 scolding. Honest to God, I mean, that
1:00:17 [ __ ] car must be undrivable. I mean,
1:00:20 he's not an untalented driver, but holy
1:00:23 [ __ ] to Who was it? It wasn't Pierre
1:00:26 Gasly. It was I think it was Alex
1:00:28 Albon. Didn't Alex drive for Red Bull?
1:00:31 And he said that the that Max likes the
1:00:34 front of the car so snappy. He's like,
1:00:37 "Play a video game and turn everything
1:00:39 up on its maximum sensitivity. Like when
1:00:42 you move your mouse just a little, it
1:00:44 does that to the game." He said, "That's
1:00:46 what driving the car is like." And
1:00:48 that's that's how Max Verstappen likes
1:00:51 it, but no one else can drive it.
1:00:55 Um, I've been getting network errors all
1:00:57 day, but only with ChatGPT. Yeah,
1:00:59 they're I'm sure their servers are
1:01:01 [ __ ] buried right
1:01:03 now. All right, here we go. Uh oh. I
1:01:07 think it I think it gave me the curly
1:01:13 hair. Maybe not. We'll see.
1:01:23 I loved what Alex
1:01:27 said, Kuno. I didn't know you were an
1:01:31 Fer. Oh, I think this one's going to be
1:01:34 good. Look, it's got me. It's got me as
1:01:37 if I've got a little bit of uh a little
1:01:40 bit of hair dye in. Just leaving a
1:01:41 little bit of the gray.
1:01:43 [Laughter]
1:01:52 This is going straight to [ __ ]
1:01:57 Twitter. I'm obsessed with F1. F1's so
1:02:01 good. That's so
1:02:06 good. Okay, we're going to Twitter. I I
1:02:10 I gotta tell you, man. I've said
1:02:13 it in here before. One of my great joys
1:02:14 in life is making stupid posts that
1:02:16 confuse people.
1:02:20 Um, let's
1:02:23 see. I have jet
1:02:27 lag. I can't believe I
1:02:32 flew to
1:02:35 China to
1:02:39 watch
1:02:42 Liam
1:02:45 race. and he flamed out a
1:02:50 second week in a
1:02:53 row. At
1:02:57 least I got to scold him.
1:03:07 He told He told
1:03:09 me he'd
1:03:13 try
1:03:19 harder. I told
1:03:24 him Yuki ran
1:03:33 well. I'm such a dick.
1:03:38 Oh
1:03:40 man, I told him Yuki ran
1:03:45 well.
1:03:47 [Laughter]
1:03:58 Paste.
1:03:59 Oh, yeah. All right. I'll tell you what,
1:04:03 people.
1:04:05 The distance between what you think was
1:04:08 real and what you know is not. That
1:04:11 boundary ain't there
1:04:12 anymore. Typo. He's Oh, damn it. Let's
1:04:16 see. I can't. Uhuh. He told me he's Oh,
1:04:23 he told me he's going to try harder. We
1:04:25 can fix that. Edit post. He told me. Got
1:04:28 it.
1:04:32 He's going to try harder. I told him
1:04:36 Yuki ran
1:04:39 well. All right, go boost my thing. I I
1:04:44 would tag him, but I don't really want
1:04:46 him to see it. Like, poor guy.
1:04:48 Everyone's kicking him while he's
1:04:53 down. It's got you a little meaner and
1:04:56 older, too. Exactly.
1:04:58 a little more Italian or Greek or
1:05:00 something. It's not quite me, but it's
1:05:05 fine. You missed the "to." Wait, he told
1:05:07 me he's going too try harder. God damn
1:05:11 it. It's It's like It's like being It's
1:05:14 It's like working with three editors
1:05:17 over your shoulder in here. He told me
1:05:19 he's going to try harder. Yes. What you
1:05:23 said to try harder. I told him Yuki ran
1:05:27 well. All right, we'll put that on its
1:05:29 own
1:05:29 [Laughter]
1:05:37 line. Oh, good lord. All right. Is Oh,
1:05:42 no. And now I've got an extra line break
1:05:43 in there. Good lord,
1:05:47 people. I'm going to try
1:05:50 harder. Update. Okay, that should be the
1:05:53 last edit for a while. Beautiful.
1:05:58 Fantastic. I have jet lag. I can't
1:06:01 believe I flew to China to watch Liam
1:06:02 race and he flamed out a second week in
1:06:05 a row. At least I got to scold
1:06:18 him. Oh god, this is fun.
1:06:22 F1 helps me sometimes. I find I find it
1:06:26 one of the more useful keys. Oh, that's
1:06:28 funny. That's a that's a uh that's a
1:06:31 that's a computer keyboard
1:06:33 joke. That's there's a level of comedy
1:06:36 there that that Wow. Because F1, it
1:06:41 stands for Formula One, but they
1:06:43 abbreviate it to F1. And then F1 on the
1:06:47 keyboard stands for function one, but
1:06:49 they abbreviate it because the word
1:06:51 function wouldn't fit on the keys, so
1:06:53 they just call it F1. And they So
1:06:55 they're both F1. And what you did there
1:06:58 was like actually mixed them up like you
1:07:00 you did it's a I think they call it word
1:07:03 play. Oh god. Joker. Wow. With the
1:07:07 comedy coming in
1:07:11 strong. All right.
1:07:18 Joker. Such a
1:07:21 joker. F20 doesn't help though.
1:07:24 Exactly. P20. Two weeks in a row. 20th.
1:07:30 Like like that car. It's either
1:07:32 undrivable or or he's just he's just a
1:07:36 broken a broken man at this point.
1:07:41 Um, I would be surprised actually if
1:07:43 they didn't put Yuki in the car in
1:07:46 Japan because it's [ __ ] ridiculous at
1:07:50 this
1:07:51 point. All
1:07:53 right, so we probably shouldn't we
1:07:56 probably shouldn't do uh Twitter posts.
1:07:59 All right, back to this thing. All
1:08:01 right, so so this is the word poetry
1:08:03 thing. So let's go back to ChatGPT.
1:08:05 We'll say
1:08:07 okay take this
1:08:14 prompt and make
1:08:18 the
1:08:21 poem
1:08:23 about
1:08:26 Shrek's breath.
1:08:38 Uh, let's see if it Okay, it's it's
1:08:41 writing it. A stench is worth a thousand
1:08:44 gasps. What the [ __ ] it doing
1:08:47 there? What is it
1:08:51 doing? That doesn't look good.
1:08:59 Are there actual letters
1:09:06 there?
1:09:08 Stop. Try
1:09:10 again. That was
1:09:17 weird. You don't even need us, Kyle.
1:09:20 You're the speaker and the audience.
1:09:22 Yeah, this is you. Oh, Yuki is replacing
1:09:25 Lawson. Oh, okay. Well, there you
1:09:28 go. I I could have read up on that, but
1:09:33 first of all, it's Japan. And second of
1:09:35 all, P20 when when you know, even though
1:09:39 Verstappen's not winning, he's he's, you
1:09:42 know, top five. Come
1:09:45 on. It's just not
1:09:49 good. It's just not good. It did it
1:09:53 again. It [ __ ] it up
1:09:55 again. That's really
1:09:59 weird. Oh, it says large gap line
1:10:06 five.
1:10:08 Huh. Let's stop it again. Let's paste
1:10:11 this in
1:10:13 here.
1:10:18 Wait. Take this prompt.
1:10:23 Copy. Paste. Let's unfuck this
1:10:28 up. Large. Oh no. God damn
1:10:34 it. You got to hit shift return or
1:10:37 it it submits
1:10:41 it. Large gap line five. Okay.
1:10:51 The man is holding um let's see,
1:10:57 Shrek's
1:10:59 girlfriend is holding the
1:11:02 words a few in her right
1:11:16 hand. I am
1:11:22 leaving in her
1:11:27 left. All right, let's see if we can do
1:11:29 this without it
1:11:31 breaking. Looks like it does what the
1:11:33 API has been doing with showing null.
1:11:37 Interesting. 4o is
1:11:42 overwhelmed. Come for the AI. Stay for
1:11:44 the F1 content.
1:11:51 Okay. Now make that
1:11:57 [Music]
1:12:03 image. I do have a feeling. I do have a
1:12:06 very strong feeling.
1:12:09 So OpenAI is under some pressure.
1:12:11 They're under some pressure from the
1:12:14 Chinese
1:12:15 about their better models that are
1:12:18 cheaper, like a tenth the
1:12:20 price, and they're under pressure from
1:12:24 um from X, from Grok, because Grok
1:12:27 doesn't have very many safety
1:12:28 guardrails on it. So, you can do branded
1:12:30 content and you can do celebrities and
1:12:32 [ __ ] like that. Um so, this is a fairly
1:12:36 unrestricted I wouldn't call it
1:12:37 unrestricted. We we'll see if it can do
1:12:39 something like, you know, horror
1:12:42 content. Probably not. Um, but it seems
1:12:46 to be a bit more open than than the
1:12:48 other. Um, let's see.
1:12:52 Stop. No, this needs to
1:12:58 be fridge magnet poetry.
1:13:10 [Music]
1:13:14 Waka waka
1:13:16 walka. All
1:13:17 right. Bob, tell him what he's won. He
1:13:20 hasn't won anything. It's not a game
1:13:22 show. Kyle. All right. Comic strip. Now,
1:13:26 it didn't do comic strips very well when
1:13:29 I tried it, but let's let's see. Make an
1:13:32 image of a four panel comic strip.
1:13:38 Okay, we're not going to write it. We're
1:13:39 going to have ChatGPT write it. So,
1:13:42 we'll let that one keep going.
1:13:45 Let's go grab a a new
1:13:49 ChatGPT and we'll say
1:13:52 um write me write
1:13:57 me
1:14:00 10 four
1:14:06 cell Bazooka
1:14:09 Joe
1:14:11 style
1:14:12 comics
1:14:15 with
1:14:17 dad
1:14:18 joke quality
1:14:22 comedy and
1:14:28 situations. Ah
1:14:32 um ChatGPT, I feel, on the creative side
1:14:34 is limited due to how analytical it is.
1:14:38 Have you tried both 4.5 and 4o,
1:14:41 Dennis? 'Cause I'm curious, 'cause I know 4o
1:14:43 got two different creative writing
1:14:47 upgrades. Joe bites into hard candy. Ow,
1:14:50 I think I cracked the tooth. His friend
1:14:52 Mort looks concerned. Did you call the
1:14:53 dentist? Holds up his phone. Yeah, I
1:14:55 left a message. Told him it was an
1:14:59 echo.
1:15:04 What? That's not
1:15:07 funny. Nothing like the smell of fresh
1:15:10 cut grass. You missed a huge patch. No,
1:15:12 that's not grass. That's my
1:15:17 hairline. These are bad. These are so
1:15:20 bad. Okay, so we'll say uh
1:15:24 pick the one you think is
1:15:29 funniest and make the comic.
1:15:34 make it look
1:15:36 like I just
1:15:41 unwrapped it from the
1:15:45 gum. I didn't write gum there. I'll tell
1:15:47 you
1:15:49 that these are Klevel
1:15:52 humor. I haven't tried 4.5, but 4o is
1:15:56 still kind of blah
1:15:57 creative-wise, huh? I can I can get 4o
1:16:02 to do good things sometimes. It needs a
1:16:06 lot of context. Needs a lot of
1:16:10 context. It's good. If you give it
1:16:13 writing examples, it's it gets better,
1:16:15 but it's it's pretty bad. But um 4.5 is
1:16:19 supposed to be high EQ and um high
1:16:23 personality. Um I I haven't found 4.5 to
1:16:26 be all that useful yet. I I haven't
1:16:28 quite figured out what's the point. 4.5
1:16:31 was so slow today,
1:16:33 understandably because ChatGPT is going
1:16:36 to [ __ ] because of
1:16:38 profit. I don't know about that. I don't
1:16:41 think they're profiting. I think they're
1:16:43 struggling to keep up at this
1:16:45 point. OpenAI is going hard for the
1:16:48 commercial audience. No, they actually
1:16:49 just committed they actually just
1:16:51 committed to being a consumer software
1:16:54 company. Kuno that's in the uh in the
1:16:58 Sam Altman interview in Stratechery from
1:17:02 last
1:17:03 week. The sock
1:17:05 exchange. Do these match? Only if you're
1:17:08 colorblind.
1:17:10 Perfect. I call it sock market chaos.
1:17:15 [Laughter]
1:17:21 Make it more like a
1:17:24 bazooka comic and have the gum there,
1:17:37 too. I think it was just busy. The work
1:17:40 was good, but the poem song lyrics
1:17:42 weren't as good. Yeah, I'm having a real
1:17:44 tough time with song lyrics. None of
1:17:46 these things are any good at song
1:17:48 lyrics. If anyone
1:17:50 can has better luck with one model over
1:17:53 the other with song lyrics, let me know.
1:17:55 Gemini is great for short creative
1:17:58 things. Long things it can't
1:18:01 do. DeepSeek is surprising me on the
1:18:03 creative side of things. Interesting.
1:18:06 Oh, you know what would actually be
1:18:07 interesting to try is
1:18:09 um maybe I'll fire off Manus and have it
1:18:12 go do something creative. Um, let's go
1:18:14 to Manus because I haven't done anything
1:18:16 in it in a while.
1:18:19 manus.im. Oops. No,
1:18:26 [Music]
1:18:38 man. Wow. Okay, let's see.
1:18:43 I want you
1:18:48 to write me the outline of a new
1:18:52 screenplay.
1:18:55 the
1:18:59 complete of a new
1:19:05 screenplay that combines the best
1:19:09 elements of
1:19:12 Shawshank
1:19:15 Redemption and
1:19:20 Redemption
1:19:22 and Snakes on a Plane.
1:19:34 Um, I want the
1:19:38 story to
1:19:41 be
1:19:45 award-winning in both dramatic
1:19:51 impact and comedic
1:19:59 relief. I also want
1:20:03 you to write me
1:20:07 a log line that will get this
1:20:12 funded, a
1:20:15 one-pager, and the
1:20:19 opening scene.
1:20:23 Also give me a
1:20:28 list of
1:20:31 locations and
1:20:38 characters
1:20:45 and a plot
1:20:49 summary. All right, go [ __ ] do that,
1:20:55 Manus. Enable browser notifications.
1:20:58 Yeah, let me know when you're
1:21:02 done. Take that, you
1:21:06 [ __ ] Let's see. While Manus is
1:21:09 working, you can send messages anytime.
1:21:11 Yep. So, if you haven't seen Manus, you
1:21:13 can click on that little icon. And so
1:21:16 now it's got analyze The Shawshank
1:21:17 Redemption, analyze Snakes on a Plane,
1:21:20 identify elements to combine for this
1:21:23 screenplay, create a complete screenplay
1:21:26 outline, devel Oh, see now it's adding
1:21:28 [ __ ] to its little list. All right, so
1:21:30 that's going to go do that. That's good.
1:21:32 Beautiful. Love
1:21:35 it. I am leaving. A breath is worth a
1:21:38 thousand warnings, but sometimes the
1:21:40 wrong whiff at the wrong time. I mean,
1:21:42 it did it did Shrek [ __ ] That's pretty
1:21:45 good. I like Claude for lyrics. Okay,
1:21:47 we'll try
1:21:55 that. All right, let's go read some more
1:21:57 here. Comic strip. That was kind of a a
1:22:00 bust. The comic strip thing. An
1:22:02 infographic. Oh, this is kind of cool.
1:22:05 Explaining Newton's prism experiment in
1:22:07 great detail. So, let's do that. We'll
1:22:08 do
1:22:10 um infographic.
1:22:21 of
1:22:24 spooky quantum
1:22:31 effects with
1:22:34 accurate
1:22:38 physics
1:22:40 details and labels.
1:22:46 It should be so
1:22:50 beautiful that
1:22:52 people
1:22:57 weep. All right, let's go back to Manus.
1:22:59 How you doing on my screenplay? Son of a
1:23:05 [ __ ] developing a log
1:23:11 line. When deadly snakes are unleashed
1:23:13 on a prison prison transport plane, a
1:23:16 wrongfully convicted
1:23:18 financial I just missed it. I think I
1:23:20 can go back though, right? Yeah. Okay,
1:23:24 here's potential log lines. Oh, wait.
1:23:27 Final Oh, that's cool. It wrote it wrote
1:23:29 five log lines and then it wrote a final
1:23:31 one. Okay, so the final log line. When
1:23:33 dead deadly snakes are unleashed on a
1:23:35 prison transport plane as part of an
1:23:37 assassination plot, a wrongly convicted
1:23:40 financial analyst must lead a diverse
1:23:43 group of survivors to safety while
1:23:46 uncovering a conspiracy that connects
1:23:49 his false imprisonment to the chaos
1:23:52 around him. Proving that sometimes the
1:23:55 path to redemption is through your worst
1:23:58 nightmare.
1:24:06 It's called redemption
1:24:15 flight. Okay. Again, once again, my job
1:24:19 here is is to be the town fool,
1:24:23 right? I'll do the stupid [ __ ] so so you
1:24:26 don't have to.
1:24:31 I'll tell you what, doing stupid [ __ ]
1:24:34 you learn a
1:24:35 lot because there there's some things
1:24:38 that it shouldn't be able to do well and
1:24:39 sometimes it does them
1:24:43 well. Let's go back and read this while
1:24:45 that's working. Okay. Oh, wait. Now
1:24:48 create a POV of a person drawing this
1:24:52 diagram in their notebook. Oh, that's a
1:24:54 good
1:24:55 idea. Oh, that's really good. Have it.
1:24:58 Have it do the the diagram and then have
1:25:02 someone drawing that in a notebook.
1:25:04 That's
1:25:06 slick. In Washington Square Park.
1:25:10 Wow. At a round cafe table in Washington
1:25:13 Square Park. Why are there no people
1:25:19 there now without the notebook view?
1:25:22 Wow.
1:25:26 A smug young Isaac Newton sitting at a
1:25:28 table with a prism demonstrating the
1:25:31 experiment without the notebook in
1:25:36 view.
1:25:39 H
1:25:42 crazy crazy
1:25:45 crazy. I thought final redemption.
1:25:48 That's not
1:25:49 bad. But it's got to have shouldn't it
1:25:52 shouldn't the title of it have snakes in
1:25:54 it somehow?
1:26:00 Um, shouldn't the
1:26:05 title
1:26:07 include snakes or slithering or
1:26:18 something? You make an excellent point.
1:26:20 The current title, Redemption Flight,
1:26:22 captures the redemption theme of
1:26:24 Shawshank Redemption, but doesn't
1:26:26 incorporate the snake element from
1:26:28 Snakes on a Plane. Let's brainstorm
1:26:34 that. Slithering Redemption, Serpents of
1:26:37 Freedom, Scales of Justice. Oh, Scales
1:26:40 of Justice is
1:26:44 good. I like Scales of Justice. It's
1:26:49 actually cool that you can talk to it in
1:26:52 the middle of it doing its
1:26:54 [ __ ] Great choice. Scales of Justice is
1:26:57 an excellent title that cleverly
1:26:59 combines snake element with justice
1:27:02 redemption themes. It works on multiple
1:27:04 levels, referring to both the literal
1:27:06 snake scales and the metaphorical scales
1:27:08 of justice.
1:27:15 Um, we also need a signature line
1:27:22 like snakes on a plane
1:27:27 had. Get really creative with this
1:27:37 one. Shawsnake Redemption. That's not
1:27:40 bad. What are we getting up to? So Ann,
1:27:44 um, there's a new image
1:27:47 model in
1:27:49 ChatGPT that is really, really good. Like
1:27:53 really good. Like here's me scolding
1:27:56 Liam Lawson for coming in 20th in the F1
1:27:58 race. Um here's Shrek and his girlfriend
1:28:02 doing refrigerator poetry about his bad
1:28:04 breath.
1:28:06 Here's spooky quantum effects:
1:28:09 superposition, quantum entanglement,
1:28:12 wave function collapse, and quantum
1:28:14 tunneling. Oh, and it put "I
1:28:17 am leaving." It put a refrigerator magnet
1:28:20 at the bottom of it.
1:28:21 Um, let's put
1:28:24 that
1:28:26 infographic in a
1:28:31 textbook. A student is
1:28:36 reading on the
1:28:40 Staten Island ferry.
1:28:44 They
1:28:47 really want to get somewhere in their
1:28:54 life and they are the
1:28:58 first in their family to go to college.
1:29:04 They should look sadly
1:29:08 [Laughter]
1:29:17 hopeful. I shouldn't laugh at that.
1:29:20 That's a It's a redeeming story. It's a
1:29:22 It's a story of hope and overcoming
1:29:24 challenges.
1:29:26 But I'm curious to see how OpenAI
1:29:31 does with "sadly hopeful," with
1:29:35 the biology, quantum physics
1:29:42 book. "Is it DALL-E?" Evaluate me. Um, no,
1:29:46 it's not DALL-E. So it's
1:29:50 um just like advanced voice is a model
1:29:53 that you talk directly into. This is
1:29:56 actually incorporated into the large
1:29:58 language
1:29:59 model. So it's really good at spelling.
1:30:02 It's really good at context. It's really
1:30:04 good at uh photo realism. It's really
1:30:07 good at you can do celebrities. You can
1:30:10 do um you can upload a picture of
1:30:13 yourself and say put me in a different
1:30:17 location. Um I just went to China to to
1:30:20 scold Liam
1:30:22 Lawson. Which apparently it was my fault
1:30:25 that Yuki Tsunoda is replacing him in
1:30:28 Japan. Sorry,
1:30:30 [Laughter]
1:30:34 Liam. Oh, look. She's sadly hopeful. She
1:30:37 is sadly hopeful.
1:30:40 Spooky quantum
1:30:45 effects. And there's New York City in
1:30:47 the background. This is
1:30:53 good. Except we should be Let's see.
1:30:58 Um, we
1:31:00 should see over Let's see. We should
1:31:05 see over her shoulder.
1:31:11 So, so, so
1:31:14 that the
1:31:18 infographic is in the book and we see
1:31:23 her face in the
1:31:27 reflection of the ferry
1:31:30 window with Manhattan in the
1:31:36 background. Yeah, that's pretty good.
1:31:39 And so so and so it should
1:31:42 understand like it understands what it
1:31:45 made in this picture and what's in this
1:31:46 picture and it incorporated that thing
1:31:49 that we made into
1:31:51 this. And so now it should give us the
1:31:54 same kind of look and feel of what the
1:31:56 ferry looks like. That's kind of what
1:31:58 that ferry looks like. And then we
1:32:00 should see her face reflected in the
1:32:02 window and we should see the
1:32:03 illustration inside the book. We'll see.
1:32:05 We'll see how it does.
1:32:07 We will see how it does. All right.
1:32:09 What's going on with my screenplay?
1:32:10 Manus will continue working after your
1:32:12 reply. Ah, okay. I've created several
1:32:15 signature line options for scales of
1:32:17 justice. Which do you prefer? Justice is
1:32:19 cold-blooded. No. When the system fails,
1:32:22 everyone prevails. Innocence. No. No.
1:32:26 The
1:32:27 line in snakes on a plane
1:32:34 is get
1:32:36 these mother
1:32:39 [ __ ]
1:32:42 snakes off this
1:32:47 [ __ ] plane.
1:32:52 and you have a plane
1:32:56 full
1:32:58 of
1:33:01 prisoners and you only talk about
1:33:06 justice. I don't think you quite
1:33:12 understand the exercise here.
1:33:20 Exercise
1:33:23 here. Get with the
1:33:28 program or I will have to
1:33:32 hire an actual
1:33:37 writer. I'll threaten I'll threaten
1:33:39 Manus with human
1:33:44 replacement. You're absolutely right.
1:33:47 And I apologize for missing the mark.
1:33:49 The signature line needs to be bold,
1:33:49 direct, and have the same raw energy as
1:33:52 Samuel L. Jackson's iconic line. Let's
1:33:55 create something with the same
1:33:57 intensity. All right, let's let it go.
1:33:58 Figure it [ __ ] out. Then we'll go back
1:34:01 to ChatGPT. Oh, look. She's sadly
1:34:05 hopeful, but we can't. It's that's like
1:34:07 a mirror.
1:34:09 Okay, right idea. But she
1:34:18 um let's see. But
1:34:20 the
1:34:24 reflection is more like a mirror
1:34:29 than a
1:34:31 window. We should be able to see the
1:34:38 city skyline.
1:34:43 through her
1:34:45 face, you
1:34:47 know, symbolic and
1:34:52 all. Come on, you [ __ ]
1:34:57 idiot. This is loads of fun. It seems to
1:34:59 actually understand the context of what
1:35:01 you want and create things that fit. I
1:35:03 It's pretty amazing, isn't it,
1:35:05 Architect?
1:35:07 I mean, it certainly got, look, there's
1:35:09 the it got the bend in the book. It got
1:35:13 quantum superposition, wave particle
1:35:15 duality. It got that
1:35:18 twice. That's wrong. And then the
1:35:21 uncertainty principle. And I don't know
1:35:23 if those drawings are right, but still,
1:35:25 that's that's pretty [ __ ] good.
1:35:39 All right, it's off doing its thing.
1:35:41 Let's go back and read more of this
1:35:42 [ __ ]
1:35:43 Okay. Useful image generation. From the
1:35:45 first cave paintings to modern
1:35:47 infographics, humans have used visual
1:35:48 imagery to communicate, persuade, and
1:35:50 analyze, not just decorate. Today's
1:35:52 generative models can conjure surreal,
1:35:54 breathtaking scenes, but
1:35:57 struggle with the workhorse imagery
1:36:01 people use to share and create
1:36:02 information. From logos to diagrams,
1:36:05 images can convey precise meaning
1:36:07 when augmented with symbols. Okay. GPT-4o
1:36:10 image generation excels at accurately
1:36:12 rendering text, precisely following
1:36:15 prompts, and leveraging 4o's inherent
1:36:18 knowledge base and chat context.
1:36:22 So meaning that if things have been
1:36:26 described, which everything's been
1:36:28 described, right, in poetry, in words,
1:36:31 in stories, in captions of
1:36:36 photos, if everything's been described,
1:36:38 then it can actually understand that and
1:36:40 it can understand what that looks like.
1:36:43 So that's the native piece of
1:36:45 this. Why it's probably good at context
1:36:47 is because it it it can it can translate
1:36:52 natively word ideas into image ideas.
1:36:55 That's pretty cool. Okay, got
1:36:57 it. Including transforming uploaded
1:37:00 images or using them as visual
1:37:02 inspiration. These capabilities make it
1:37:04 easier to create exactly the image you
1:37:06 envision, helping you communicate more
1:37:08 effectively through visuals and
1:37:10 advancing image generation into a
1:37:12 a practical tool for precision
1:37:14 and power. So, character consistency.
1:37:17 Oh, here's little videos. Let me change
1:37:20 my sharing so you can hear it. Look, I'm
1:37:23 doing this without a flipping
1:37:26 producer. Uh, what's deep
1:37:30 black?
1:37:32 What?
1:37:35 No.
1:37:41 Um, why is it not
1:37:45 Why can I not see the tab I'm on? Oh,
1:37:47 because I think I'm in a different
1:37:51 profile. Am I? No, I'm
1:37:55 not. Share screen
1:38:03 window. It doesn't see this
1:38:07 window. All right, I know what I can do
1:38:09 here.
1:38:22 Um, new. Ah, this will
1:38:25 work. Hang on, people. I'll be right
1:38:28 there with
1:38:29 you. Oh, it's not working. I can't drag
1:38:32 this into
1:38:36 here. Well, that's okay. Copy. Put this
1:38:39 in
1:38:45 here. Bring this back over
1:38:49 here. Hang on, people. Just calm
1:38:52 everybody. Calm
1:38:55 down. You're like, I I really thought he
1:38:58 understood what he was doing here. I
1:38:59 came here because this is supposedly
1:39:01 called the AI learning lab. I came here
1:39:03 to learn and right now it doesn't seem
1:39:07 like he knows how to use a browser. This
1:39:09 is about what I
1:39:11 expected. Um, "Introducing 4o." Is this
1:39:15 it?
1:39:16 Yes. Look at
1:39:19 that.
1:39:22 Yes. Take that, boomers. [ __ ] figured
1:39:26 that out on my
1:39:28 own. Kyle, where's my producer? Spoke
1:39:31 too soon. Kyle. Yeah, right. When I brag
1:39:33 about not needing a producer, I can't
1:39:35 figure out how to launch a browser.
1:39:38 Okay. Uh, but I did learn something new.
1:39:40 If you've got two different um, what are
1:39:43 they called? Chrome profiles running in
1:39:46 the same Chrome. When you go to share
1:39:48 your window, you can only share in the
1:39:51 profile of the window. You doesn't
1:39:54 matter. I know what I know. What's the
1:39:56 deal now, though? All right. Okay. Here
1:40:01 we go.
1:40:06 [Music]
1:40:08 How's it going again? One thing that I'm
1:40:11 really excited... That's just rude, Dennis. I
1:40:13 have learned that Kyle needs a
1:40:19 producer. Wait, I need my black bar here
1:40:22 for you people on TikTok, too. There you
1:40:24 go. Kyle, are you flexing about your
1:40:26 tabs again? Shut up. I like my tabs.
1:40:29 Leave me alone. Everyone's picking on
1:40:34 me. You're doing great, Kyle. Especially
1:40:37 missing half your team. Thank you. You
1:40:40 see? Okay, let's go back to videos now.
1:40:44 about is the ability to keep
1:40:46 consistency in characters. I'm David
1:40:48 Medina or BMED and I work on multimodal.
1:40:52 What I want to show is one of my
1:40:54 favorite prompts which is can you create
1:40:56 a low poly penguin mage? Make it very, very
1:41:01 low poly. Surprisingly it's sometimes
1:41:03 hard to get very good low poly outputs.
1:41:05 It's not like other image generation
1:41:06 models where it tries to generate
1:41:08 something based on just the text.
1:41:10 Instead it uses the large language model
1:41:12 understanding of what does the user
1:41:14 want? What is the intent? I also like
1:41:15 some board games. So miniature like
1:41:18 games. So what I'll do now is generate a
1:41:20 miniature from this. So ideally we'll
1:41:22 see a penguin that looks like this with
1:41:23 the same staff and a hat. So can you
1:41:26 make me a realistic miniature as if a
1:41:31 professional made this and painted it?
1:41:35 This is what I think excites me the most
1:41:36 about image. The other image generation
1:41:38 models will try to create literally what
1:41:40 you said. But what's special about this
1:41:42 is one, it'll keep the context of this
1:41:45 character and then two, it'll understand
1:41:47 what I'm trying to ask it for and
1:41:48 generate very similar uh model but in a
1:41:51 miniature realistic style. It in that
1:41:53 that's huge. That that's huge that it's
1:41:56 it's like smart enough to un it's smart
1:41:59 enough to
1:42:00 meld the image context with the language
1:42:04 context, which makes sense. It's all in
1:42:07 the same model. If the image thing is
1:42:09 outside of it, then the only thing you
1:42:10 can send over there is a text prompt.
1:42:13 Huh. Fascinating. First, what I want, I
1:42:16 don't have to tell it every little
1:42:17 detail. One other realistic thing we
1:42:19 could do is can you make a crystal
1:42:22 version of this with light reflecting
1:42:25 and very realistic. Again, I'm just
1:42:27 giving it very very simple things.
1:42:29 Normally, this is not enough for other
1:42:31 models to generate something very
1:42:33 detailed, but the model understands what
1:42:35 I'm asking for. it'll think what type of
1:42:37 style it should have. So, this ability
1:42:39 to really understand what the character
1:42:40 is and make edits and understand what
1:42:43 the user wants. For me, it's just an
1:42:45 amazing capability. Yeah, that's
1:42:47 amazing. And actually, you know,
1:42:51 like this is appropriately nerdy. I feel
1:42:54 like their announcements around the tiny
1:42:56 table with the with the four people
1:42:59 trying to act comfortable. Like, this
1:43:01 guy is just like comfortably nerdy. He's
1:43:04 like authentically nerdy on his couch
1:43:07 either at the office or at home. And
1:43:10 he's like, "Yeah, I nerd out making this
1:43:13 multimodal thing for us. And here's one
1:43:16 of my favorite things. It's a mage." Of
1:43:19 course it's a mage. He's a D&D dude, you
1:43:22 know? Come on. He's a tabletop gamer. Of
1:43:25 course he is. Like this is this is this
1:43:28 is what these announcements should
1:43:31 be. I wonder if they're getting their
1:43:33 [ __ ] together. This is good. That was
1:43:35 good. Do more like that. All right. Text
1:43:40 rendering. Well, we know it's good at
1:43:42 that.
1:43:47 Okay. So, Oh, this is their new This is
1:43:49 their new This is their new tiny table.
1:43:51 This is their new tiny table. Look, it's
1:43:53 the awkward couch in the in the San
1:43:56 Francisco overlooking the
1:44:00 bay. I like it. I'm down. I'm down. All
1:44:04 right, OpenAI, you're winning me over
1:44:05 with your leaning into the
1:44:08 awkward. It's just it's just like uh
1:44:12 it's like the AI learning lab. We're
1:44:14 we're neurospicy here, you know? Bring
1:44:17 your strange brains.
1:44:21 I'm Alan. I'm a research scientist at
1:44:23 OpenAI. People tend to say a picture is
1:44:28 worth a thousand words. But being able
1:44:30 to also render like a few words or
1:44:32 symbols can carry like thousands of
1:44:35 pictures, you know, with a relatively
1:44:37 simple prompt like visualize an
1:44:39 infographic explaining Newton's prism
1:44:41 experiment in great detail with a wide
1:44:43 aspect ratio and a dark blue background.
1:44:46 So this is like an example where one
1:44:49 we're going to rely on being able to
1:44:51 render text in useful ways, combine it
1:44:53 with visual elements that actually
1:44:56 ground what this text about this
1:44:59 experiment even means and hopefully, you
1:45:01 know, help students who are more I got
1:45:04 to we we just got to we got to comment
1:45:06 on this. I mean,
1:45:08 this is [ __ ] insane, right? Like this
1:45:11 is from that prompt. It's got to it's
1:45:14 got to go to its large language model
1:45:16 and
1:45:17 understand how it would even describe
1:45:20 all this and then seamlessly translate
1:45:23 that in a way that it can bring it to
1:45:24 life. It's [ __ ] bonkers.
1:45:29 Who are more visual um learn both
1:45:31 through you know language descriptions
1:45:34 of a phenomenon but also you know a
1:45:37 visual imagination of what the
1:45:39 experiment actually looks like. It's no
1:45:42 longer just about making imaginary th
1:45:44 this format rather than the tiny table
1:45:47 format is so it's it's nice. It's just
1:45:49 relaxed. They're just geeking scenes
1:45:50 that look aesthetic and things like
1:45:52 that, but it's really about also the
1:45:55 colors are in the right sequences. Yeah.
1:45:56 It's like there's there's a lot of
1:45:58 information in that infograph
1:45:59 communicating and imagining and doing so
1:46:02 at the same time.
1:46:05 All right, that's that one. Nice.
1:46:11 upload and restyle. Oh, this is what
1:46:13 Danielle did where she uploaded two
1:46:15 different pictures. Hello. Thanks for
1:46:17 inviting me. Hi. Thank you so much for
1:46:19 coming. I'm so excited to talk to you.
1:46:21 Yeah, me too. They're all neurospicy.
1:46:24 It's so funny. It's like they're all
1:46:26 [ __ ] geniuses, but they're all like
1:46:29 eye contact.
1:46:31 I feel the ability of the image generation
1:46:33 of this model is becoming stronger and
1:46:36 stronger. My name is Lou. I'm a research
1:46:38 scientist at OpenAI working on
1:46:41 multimodal. Yeah, I'm showing a very
1:46:44 interesting demo today. So um so just in
1:46:49 this studio uh this is something that we
1:46:52 draw. Many people uses our tool to
1:46:55 generate comic books. So I'm going
1:46:58 to upload this drawing to ChatGPT now. So I
1:47:04 just type this prompt now and it starts
1:47:07 to generate what will this drawing look
1:47:11 like as a real comic. Many of the times
1:47:14 especially when you play with the model
1:47:15 more and more you'll find things very
1:47:17 surprising. So I get this very funny
1:47:21 comic now. I want to replace this dragon
1:47:25 with this cutie penguin. Yeah, it looks
1:47:29 nice. Wow. I'm personally always very
1:47:33 curious about is how does this look like
1:47:36 in real
1:47:38 life? It looks cute. I like it. Nice.
1:47:42 [Music]
1:47:44 I'm digging this. Hello. Thanks for
1:47:46 invite
1:47:48 detailed directions. Okay, here's prompt
1:47:50 prompt adherence. I refuse to believe I
1:47:52 am
1:47:55 Neurospicy. It's the rest of the world
1:47:57 that's screwy.
1:47:59 I'm the only nor normal person and
1:48:01 that's scary. I know. It's so funny.
1:48:04 It's you
1:48:06 know I don't
1:48:10 know like before I knew what ADD was.
1:48:15 Like I I've never really had an issue
1:48:17 where I
1:48:19 judged ADD or my brain or anything
1:48:23 because I I wasn't diagnosed with ADD
1:48:26 until my mid-30s.
1:48:29 But like I spent an inordinate amount of
1:48:33 time and
1:48:34 energy trying to make up for the fact
1:48:37 that I don't have good executive
1:48:39 function, right? Like I would buy every
1:48:42 organization system out there and every,
1:48:44 you know, back in the olden timey days
1:48:46 with Filofaxes and I would just buy
1:48:49 everything thinking that the right
1:48:52 notebook system would be the thing that
1:48:55 would let me organize my life. And it's
1:48:58 at some point I just finally was like,
1:49:00 "Oh, you're never gonna be good at
1:49:03 that." And then it was just like,
1:49:05 "Fine, whatever." So, but yeah, that's
1:49:09 one of those wild things. All right,
1:49:10 here we go. Thanks for joining us. Yeah,
1:49:14 no worries. How you doing? I'm doing
1:49:16 good. We're looking at an improved
1:49:18 image. Yeah, this is on OpenAI's
1:49:21 announcement of, um,
1:49:23 openai.com/index/introducing-4o-image-generation
1:49:32 generation in ChatGPT. It's really good
1:49:34 at instruction following. My name is
1:49:36 Kenji and I work on multimodal research
1:49:39 here at OpenAI. There's a level of
1:49:41 attention to detail that is just not
1:49:43 captured by other models.
1:49:47 The first thing I'm going to show is
1:49:48 like 15 different objects and each one
1:49:51 of them has unique attributes that
1:49:53 differ make it very different from all
1:49:55 the other objects. An image containing
1:49:57 one a blue star, a red triangle, three
1:50:00 green square, four pink circle, five
1:50:02 orange hourglass, six purple infinity
1:50:04 sign, seven black and white polka dot
1:50:07 bow tie, 15 different objects. And
1:50:11 basically what what this will show is
1:50:13 that this image will just nail pretty
1:50:16 much every single one of these objects
1:50:18 that I've defined. Previous iterations
1:50:20 DALL-E, you know, Imagen, things like
1:50:23 that they would probably get somewhere
1:50:25 on the order of like maybe five to eight
1:50:27 of these at most. It nailed it all. With
1:50:29 increased level of detail you can just
1:50:31 specify what you have in your mind to
1:50:33 ChatGPT. ChatGPT will understand you
1:50:36 better and then generate that image and
1:50:38 it'll be just a very direct mapping from
1:50:41 what's in your mind to what you see on
1:50:43 the screen. All these image generators
1:50:45 nowadays, they look
1:50:46 good. All right, so it's got prompt
1:50:49 coherence. Great.
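The 15-object test in the demo is easy to score yourself. Here is a minimal sketch, assuming you write down what you asked for and then hand-label what actually shows up in the image. The function names and the `observed` list are hypothetical and for illustration only; the observed list is not a real model result.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so 'Blue  Star' matches 'blue star'."""
    return " ".join(label.lower().split())


def adherence_score(requested, observed):
    """Return (fraction of requested items found, list of missing items)."""
    if not requested:
        raise ValueError("requested must contain at least one item")
    seen = {normalize(item) for item in observed}
    missing = [item for item in requested if normalize(item) not in seen]
    found = len(requested) - len(missing)
    return found / len(requested), missing


# The first seven objects read out in the demo prompt.
requested = [
    "blue star", "red triangle", "green square", "pink circle",
    "orange hourglass", "purple infinity sign",
    "black and white polka dot bow tie",
]

# Hypothetical hand-labelled result, NOT an actual model output.
observed = [
    "Blue star", "red triangle", "green square", "pink circle",
    "orange hourglass", "purple infinity sign",
]

score, missing = adherence_score(requested, observed)
print(f"{score:.0%} of requested objects found; missing: {missing}")
```

Exact label matching is a deliberately crude choice here; it keeps the check honest, because you have to name each object the same way you prompted it.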
1:50:53 Transparent layers. Ooh, this looks
1:50:55 good.
1:51:00 Hi, nice to talk to you. Nice.
1:51:04 This guy's peak awkward. Peak
1:51:12 awkward. Okay. Hey, let's try it again.
1:51:14 When you sit down, give us a smile. Oh,
1:51:19 [Laughter]
1:51:22 okay. Oh, what's wrong, Jim? You got to
1:51:25 go.
1:51:31 Go
1:51:46 transparent layers. Hey, how are you?
1:51:49 Good, good. How's it going?
1:51:52 My name is Jen Fong Wang. I'm a
1:51:54 researcher at OpenAI working on multimodal.
1:51:57 The way how to generate the transparent
1:52:00 image is pretty intuitive,
1:52:02 straightforward. And now let's This is
1:52:04 like an Apple commercial, too. Every one
1:52:06 of these engineers has a MacBook. Give
1:52:08 it a try. Let's say the content is cute
1:52:13 puppy cartoon
1:52:16 style. This is the prompt. And uh now let's
1:52:20 see what happened. The model will take
1:52:22 the input and tries to generate the
1:52:25 image. Let's give it some time. Wow. So
1:52:29 now let's see what we generated. This is
1:52:32 the transparent puppy. Another
1:52:36 application is to make it a sticker.
1:52:38 Let's give it a try. Yeah.
1:52:41 We can easily overlay the transparent
1:52:43 image onto any kind of background. Now
1:52:47 we can copy the sticker onto our laptop
1:52:51 and and then we can paste it here.
1:52:55 Uh, make it smaller so it can be easily
1:52:59 blended with the background. Can you
1:53:01 make me a sticker? You sure? A smart
1:53:04 researcher wearing glasses and a blue
1:53:06 shirt.
1:53:09 Okay, let's give it a try. The director
1:53:13 because they are young with glasses.
1:53:15 He's making fun of him to his face. We
1:53:17 got it. Yeah, I think uh it works pretty
1:53:21 well. Hopefully people love it.
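Pasting a transparent sticker onto a background, as in the demo, comes down to standard alpha compositing. The sketch below is the generic "over" operator on plain pixel tuples, assuming 8-bit straight (non-premultiplied) alpha and an opaque background. It is not a description of how ChatGPT produces or blends the image, and the function names are made up for illustration.

```python
def over(fg, bg):
    """Composite one RGBA foreground pixel over an opaque RGB background pixel."""
    r, g, b, a = fg
    alpha = a / 255.0  # 0.0 = fully transparent, 1.0 = fully opaque
    return tuple(
        round(c_fg * alpha + c_bg * (1.0 - alpha))
        for c_fg, c_bg in zip((r, g, b), bg)
    )


def overlay(sticker, background):
    """Composite a grid of RGBA pixels over a same-sized grid of RGB pixels."""
    return [
        [over(f, b) for f, b in zip(sticker_row, background_row)]
        for sticker_row, background_row in zip(sticker, background)
    ]


# A 1x3 "sticker": transparent, half-transparent red, and solid red pixels.
sticker = [[(255, 0, 0, 0), (255, 0, 0, 128), (255, 0, 0, 255)]]
# A plain white background of the same size.
background = [[(255, 255, 255)] * 3]

result = overlay(sticker, background)
print(result)
```

Where the sticker is fully transparent the background shows through unchanged, and where it is fully opaque the sticker color wins, which is why a transparent PNG can be dropped onto any background.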
1:53:25 That's pretty amazing. You can just make
1:53:27 transparent images. Wow. Crazy. All
1:53:29 right. Good. Nice. Let's keep
1:53:33 going. And then we're back to the
1:53:35 beginning. Okay. Improved capabilities.
1:53:37 We trained our models on the joint
1:53:39 distribution of online images and
1:53:41 text, learning not just how images
1:53:42 relate to language, but how they relate
1:53:45 to each other. Combined with aggressive
1:53:47 post-training, the resulting model has
1:53:50 surprising visual fluency capable of
1:53:53 generating images that are useful,
1:53:55 consistent, and context-aware. Text
1:53:56 rendering. Okay. Street
1:53:59 signs. Nice. That's good. Reindeer
1:54:03 parking. Multi-turn generation. Because
1:54:06 image generation is now native to
1:54:09 GPT-4o, you can refine images through
1:54:11 natural conversation.
1:54:14 GPT-4o can build upon images and text in
1:54:17 chat context. For example, if you're
1:54:19 designing a video game
1:54:21 character, the character's appearance
1:54:24 remains coherent across multiple
1:54:27 iterations. Okay, this is really good.
1:54:30 This is really
1:54:34 good. So, I could actually probably do a
1:54:37 little story with Champy. Yeah, look at
1:54:39 this. This cat's consistent. Well, at
1:54:41 least in what they showed.
1:54:48 We're gonna have to come back to this.
1:54:49 There's so much
1:54:51 here. All right, we'll we'll continue
1:54:53 this tomorrow. We'll continue this
1:54:55 tomorrow. That's
1:54:57 amazing.
1:55:01 Um, oh, I got to change
1:55:04 my stop that share. I got to share this
1:55:07 and share
1:55:09 everything. Go back here. Let's go look
1:55:12 at... Let's go see how our screenplay is
1:55:14 doing. Wait, we'll do our images. Oh,
1:55:16 yeah. There's the woman. Oh, that's much
1:55:18 better. Look at that image,
1:55:20 people. Uh, let's say
1:55:26 uh that's really
1:55:33 good. She is sadly
1:55:36 hopeful reading about spooky quantum
1:55:39 effects.
1:55:41 That's pretty
1:55:42 slick. All right. Very, very, very
1:55:46 [ __ ] cool. This is This is This is
1:55:50 going to be a lot of nights of us
1:55:52 playing and figuring out what this thing
1:55:54 [ __ ] does. All right. Back to uh
1:55:56 Manus. I created some bold signature
1:55:58 lines. Get these [ __ ] serpents
1:56:01 out of my
1:56:03 [ __ ] prison plane.
1:56:08 No, I've had it with these [ __ ]
1:56:10 snakes. 25 years for a crime I didn't
1:56:13 commit, and now I got to deal with these
1:56:14 [ __ ] snakes on this
1:56:16 [ __ ] prison
1:56:17 flight.
1:56:20 Okay. All you did was copy the line and
1:56:26 add
1:56:28 crap to
1:56:30 it. Get creative.
1:56:34 [Laughter]
1:56:39 Oh my
1:56:41 god. All
1:56:44 right. Okay. That would beat stable
1:56:47 diffusion on character retention. Yeah,
1:56:49 I want to play with character retention
1:56:52 uh tomorrow. We'll we'll we'll play with
1:56:53 that. I want to I want to read up more
1:56:55 on this. Um this is I think this is a
1:56:58 bigger deal.
1:56:59 Again, I think that most people are
1:57:02 going to not even know that this
1:57:03 announcement
1:57:07 happened and they're going to just use
1:57:09 ChatGPT image generation like they
1:57:11 always have and they're not going to
1:57:14 quite get the magnitude of this. I think
1:57:16 this is a pretty big deal. Um, but we'll
1:57:20 see. Like let let's test this over a
1:57:22 bunch of days. Okay. When your cellmate
1:57:24 turns into your seatmate and your
1:57:27 seatmate turns into a snake. That's not
1:57:30 bad. The only thing worse than a life
1:57:32 sentence, a death sentence with fangs.
1:57:35 Not bad. They locked me up with killers.
1:57:37 Now they've strapped me in with
1:57:39 vipers. Not bad. Prison didn't break me,
1:57:42 but this plane just might. One bite at a
1:57:45 time. When the judge said flight
1:57:47 risk, this isn't what I had in mind.
1:57:50 Okay, number five is
1:57:53 genius. Five is
1:57:56 genius. Now go write this
1:58:04 thing. We had to slap it around a
1:58:06 little, but it got
1:58:10 there. It does have character
1:58:12 consistency. That's exciting. Paul
1:58:14 Ritzer said that no one knows about deep
1:58:16 research in chat GPT2. Yeah, I
1:58:21 listen. How do I even say
1:58:24 this? I would say at this point ChatGPT
1:58:28 has escaped me. Like I pay attention to
1:58:31 this stuff all the time. I have not kept
1:58:34 up with just what ChatGPT can
1:58:37 do and how these models are different
1:58:39 and when to use certain ones. Part of
1:58:41 the reason for that is Sam Altman said
1:58:43 when GPT-5 comes out, you're not going to
1:58:45 have to know which models to use because
1:58:48 it'll just figure it out. So, part of me
1:58:50 has just gotten lazy because of that.
1:58:53 Um, but that's just one tool. Like, I I
1:58:56 feel completely clueless
1:58:58 about
1:58:59 Claude. I feel completely clueless about
1:59:03 Gemini. Um, I've played with DeepSeek a
1:59:06 little bit. I've played with Manus a
1:59:08 little bit. Like I I haven't I haven't
1:59:10 scratched the surface with these
1:59:12 things.
1:59:14 Um I'm gonna dig deep on this image
1:59:16 stuff because there's something about
1:59:18 the fact that it understands both the
1:59:21 language. It's like bidirectional, like it
1:59:24 understands what's in the images and can
1:59:26 describe that and it understands if you
1:59:29 describe something what that looks like.
1:59:31 That's that feels really significant to
1:59:33 me. Um, this is exciting. Okay, so Manus
1:59:38 is off writing my screenplay. We're
1:59:41 going to have a produced screenplay by
1:59:42 the end of uh May, which is exciting.
1:59:45 We'll be in production by
1:59:47 June. And uh we probably got to lock
1:59:52 down. I don't know if we want Samuel L.
1:59:56 Jackson. That's a little derivative,
1:59:58 don't you think? I think we should find
2:00:01 who's the new upcoming Samuel L.
2:00:03 Jackson.
2:00:05 Yeah, we'll have to figure that out. So,
2:00:08 we're going to have to book them. So, it
2:00:09 we might have to push it to like July or
2:00:11 August because of their filming schedule
2:00:13 if they're if they're in other things,
2:00:15 you know. So, all right. And I got to
2:00:17 work that in between launching my play
2:00:19 on Broadway. All
2:00:21 right. All right,
2:00:23 everybody. Okay. "I can finally get
2:00:26 ChatGPT to generate an illustration for my
2:00:28 novel with zero artistic ability.
2:00:30 Beautiful. Love it. gives me more
2:00:32 reasons to start a new print on demand
2:00:34 brand. It c it sure does actually.
2:00:37 Claude is great for writing articles
2:00:39 actually. Yeah, the fact that that that
2:00:42 this image I I mean
2:00:45 um Ideogram was pretty good at this, but
2:00:47 it feels like this this is going to be
2:00:49 really good at things like t-shirts and
2:00:51 things like that. So, print on demand
2:00:53 might not be a bad call at all. So, all
2:00:57 right everybody, I'm gonna get my ass
2:00:59 out of here.
2:01:01 have yourself a fantastic evening. Go
2:01:03 play with ChatGPT's new image model. If
2:01:05 you don't know about it, just go play
2:01:06 with it. Tell it to do something. Tell
2:01:09 it to uh make a picture with 20 objects
2:01:12 in it and uh and
2:01:16 and you know, have it describe all 20
2:01:19 and then see if they're all in there.
2:01:21 That'll be a good one. And make
2:01:23 transparent stickers. "Claude's great for
2:01:25 article writing. I'm told it beats
2:01:27 ChatGPT for coding. Gemini is kind of
2:01:29 limited for creative short
2:01:31 things. Cool. Have a good night,
2:01:33 everybody. All right. Peace out. See
2:01:36 y'all later.