
AI Learning Lab
3/25/2025 - From DALL-E to Native Image Gen: Exploring OpenAI's Latest Breakthrough in Visual AI

Live Stream · 2025-03-26 · 2:01:38 · 98 views
Description
Images aplenty. The new ChatGPT image generation model is out!
This insightful discussion explores the groundbreaking advancements in OpenAI's new native image generation within GPT-4o. Kyle Shannon delves into the significance of a "native" model, highlighting its ability to integrate image understanding and generation within the same system. This allows for feats such as accurately rendering text, following complex prompts with precision, and maintaining character consistency across multiple image iterations. The demonstration showcases the model's proficiency in creating photorealistic images, generating branded content, and even producing humorous visuals, suggesting a significant leap in AI image generation capabilities.
The conversation further explores the practical implications of this technology, touching upon its potential for graphic novels, infographics, and even screenplays. Kyle emphasizes the model's contextual understanding, allowing it to incorporate previous prompts and images into subsequent creations. The discussion also compares OpenAI's new model with other AI image generation tools like Gemini and Grok, noting differences in speed, guardrails, and overall performance.
Learn more about AI on TikTok: https://tiktok.com/@aiLearningLab.
#AI #ImageGeneration #GPT4 #OpenAI #ArtificialIntelligence #DeepLearning #MachineLearning #innovation
Chapters:
00:00:00 Introduction And Musical Performance
00:03:18 Discussion Of New AI Goodies
00:05:11 Showing Off AI-Generated Champy Image
00:06:26 Announcement Of Gemini 2.5 And OpenAI's Native Image Generation
00:07:32 Exploring The Meaning Of Native Image Generation
00:09:00 Discussion About Simple AI And Domino's Pizza Ordering
00:10:55 Exploring OpenAI's Image Generation Announcement
00:12:15 Explanation Of Native Image Generation And Multimodal Models
00:14:21 Addressing Viewers And Community Appreciation
00:15:45 Reading And Discussing OpenAI's Announcement
00:17:46 Showcasing AI-Generated Images: Resto Mod And Gorbachev/Reagan Polaroid
00:19:21 Comparing Image Editing Capabilities Across Different AI Models
00:21:27 Testing OpenAI's Image Generation With A Civil War Letter
00:24:57 Experimenting With Aging And Distressing The Letter Image
00:29:32 Further Discussion Of OpenAI's Announcement And Its Implications
00:31:27 Recreating OpenAI's Whiteboard Scene With Anthropic Branding
00:34:34 Adding Details To The Anthropic Whiteboard Scene
00:37:34 Vibe Coding And Its Frustrations
00:39:14 Testing OpenAI's Ability To Handle Brand Logos
00:45:00 Troubleshooting Lighting And Technical Issues
00:47:45 Reflecting On The Potential Of OpenAI's Native Image Generation
00:49:52 Addressing Viewer Comments And Questions
00:51:59 Testing Image Generation With A Specific F1 Scenario
00:53:50 Posting The Generated F1 Image On Twitter
00:56:02 Experimenting With Refrigerator Magnet Poetry And Shrek's Breath
00:58:00 Exploring Other Image Generation Prompts From OpenAI's Announcement
01:00:12 Refining The F1 Image Based On Feedback
01:02:19 Posting The Updated F1 Image On Twitter And Engaging With Comments
01:04:04 Correcting Typos In The Twitter Post
01:05:57 Finalizing The Twitter Post And Engaging With More Comments
01:07:17 Returning To Image Generation Experiments And Exploring Comic Strips
01:09:59 Troubleshooting Issues With The Refrigerator Magnet Poetry Image
01:12:03 Creating An Infographic Of Spooky Quantum Effects
01:14:12 Generating And Evaluating Bazooka Joe-Style Comics
01:17:20 Refining The Refrigerator Magnet Poetry Image
01:18:18 Requesting Manus To Write A Screenplay Combining Shawshank Redemption And Snakes On A Plane
01:22:46 Refining The Quantum Effects Infographic And Adding A Story Context
01:26:00 Brainstorming Titles And Taglines For The Screenplay
01:27:36 Showcasing The Generated Images To Viewers
01:29:34 Refining The Quantum Effects Infographic Based On Feedback
01:33:20 Further Refining The Quantum Effects Infographic
01:35:42 Discussing OpenAI's Video Demonstrations Of The New Image Model
01:37:41 Troubleshooting Screen Sharing Issues
01:40:05 Demonstrating OpenAI's Character Consistency Feature
01:43:30 Discussing The Significance Of OpenAI's Native Image Generation
01:46:10 Continuing To Explore OpenAI's Video Demonstrations
01:48:05 Reflecting On Personal Experiences With ADD And Executive Function
01:50:52 Demonstrating OpenAI's Prompt Adherence And Attention To Detail
01:53:29 Demonstrating OpenAI's Ability To Generate Transparent Images
01:54:56 Discussing The Implications Of OpenAI's Multi-Turn Generation
01:58:20 Reflecting On The Rapid Pace Of AI Development And The Challenges Of Keeping Up
02:00:23 Concluding Remarks And Encouraging Viewers To Experiment With The New Image Model
Transcript
0:01 [Music] 0:25 Uhoh. Uh-huh. 0:28 [Music] 0:43 [Applause] 0:43 [Music] 1:19 Sitting in this lonely 1:21 town wonder when things are going to 1:24 [Music] 1:26 change. Dream my life 1:29 away. Seems these dreams have 1:32 turned those 1:35 clouds. Get my nerve up. But my last is 1:40 pouring me 1:41 [Music] 1:43 down. Wondering how 1:47 long she going to stick around. 1:55 Somebody told me once before said you 1:58 can never go home 2:00 again. Won't you 2:02 leave? Santa thinks to steer me 2:06 away from the truth of who I am and what 2:10 I believe. So I thanked him for his two 2:13 cents with a 2:14 handshake and some sympathy. Yeah. 2:19 Packed up my blue 2:21 jeans and headed for this big 2:25 prize of my 2:29 freedom. 2:30 Bye-bye black sheep to the black sheep 2:34 of the 2:36 family. 2:38 [Music] 2:39 [Applause] 2:40 Bye-bye. Oh, that means so very much to 2:43 me. Yeah. 2:46 [Music] 2:48 Bye-bye to my friends and my 2:52 family. 2:55 Bye-bye. Going to set my 3:02 soul. Set it 3:04 [Music] 3:06 free. 3:08 Free. Times that were 3:13 changing. Did a little bit of 3:16 rearranging. Hey, what's happening 3:21 everybody? We got some new goodies to 3:23 play with tonight. Howen 3:26 [Music] 3:41 puppy. Woohoo. 3:45 [Music] 3:47 Oh 3:48 [Music] 3:57 yeah. There's been something, baby, I've 4:01 been trying to 4:03 say for an age, and it seems I don't 4:06 know 4:09 how. The past and the future now 4:12 surrounding 4:15 me. Surrender to whatever truth still 4:17 can be 4:20 found. There's been a little 4:23 trouble since you came to my 4:30 rescue. And if you're like all of the 4:32 rest, I would have quit you long ago. 4:35 But I couldn't do that. 4:41 Oh, tell me now. Women in W never went 4:44 too 4:47 well. Make a man crazy, make him cold as 4:52 hell. I'm a woman that you wish me 4:57 well. But it's funny trying. Still going 5:00 to have to find my way through. 5:08 [Music] 5:11 Happy Tuesday night everybody. It is 5:14 Tuesday, right? It is. It's Tuesday. 
Fan- 5:17 friggin'-tastic. Thank you, Mary. Very 5:18 nice sentiments. Did you see the image 5:21 Danielle made of you and Champy? I have 5:23 not seen that. I should go look at that. 5:25 Is it in Irregulars? I assume it's in 5:28 Irregulars. 5:31 Irregulars. Oh, that's a really nice 5:34 one. Let's go look at that. That's a 5:36 sweetie. 5:38 And where is my... There it 5:41 is. Oh, let me flip this little bad boy. 5:44 Look at that. Look how nice that is. 5:47 That's 5:48 gorgeous. It almost looks like him. He 5:51 doesn't quite have the black eyes. He's 5:52 got white eyes, but the ears are 5:54 perfect. The head's perfect. Just a 5:56 little bit less there, but that's pretty 5:58 much what he looks like. And 6:00 unfortunately, that's about my 6:03 size, too. 6:08 That is very sweet. Very sweet, sweet, 6:16 [Music] 6:24 sweet. Okay, 6:26 so something actually big happened today 6:30 and I have a feeling most people won't 6:32 pay attention to it 6:36 because it's really subtle, but it's 6:40 super 6:44 powerful. We got Gemini 2.5 6:48 today, which is cool. Supposed to be 6:51 good at 6:52 coding. 6:54 Um, we got another Chinese model or two. 6:57 I saw a tweet about some 7:01 clandestine [ __ ] model that says 7:04 it's better than everything 7:06 else. But the thing we got today that 7:09 I'm excited about... G4, long time, good to 7:12 see you. Welcome, welcome, 7:14 welcome. Lori Duskin, Ryan, got Danielle 7:18 got source camp on time. Good lord 7:20 people. What is coming of the world? Ah, 7:24 Stacy nailed it. OpenAI image 7:27 generation. Native image 7:29 generation. 7:32 Native. You're like, what does that 7:35 mean? 7:42 [Music] 7:55 Uh, I don't know 100% what it means and 7:57 I figured we would play with it tonight 7:59 because we've 8:01 got... Gemini has a native image model in 8:07 their image generation model. 8:10 So, we can play with that and then we 8:12 can compare it to OpenAI. 
I've been 8:14 doing a little bit of comparing before I 8:16 got on here and it looks like OpenAI is 8:18 better, but they're still both 8:21 native. Um, and then you've got Grok, 8:25 which is fairly 8:27 unguardrailed. 8:30 Um, most of the image generation tools 8:33 up to this point have had fairly 8:34 stringent guardrails. You can't make 8:36 famous people. You can't make brands. 8:38 And you certainly can't make 8:41 images that feature brands that are not 8:45 um that are not 8:48 uh what would you call it? Um friendly 8:52 to that 8:54 brand. Tom Isaac, "I use Simple AI to 8:58 order Domino's." Oh, interesting. "And it 9:00 worked." Simple AI. Isn't that the one 9:02 that actually calls them up? I did 9:05 that. I did it once. And you know what's 9:07 funny about that, Tom? I um uh I got 9:12 weirded out because I had 9:15 it... I had it call two 9:18 restaurants and ask if they had tacos 9:21 and one was a burger joint and one was a 9:23 Mexican joint and the Mexican joint had 9:26 tacos and the burger joint... I was like 9:28 watching the transcript as the call was 9:31 happening and you know the Simple AI 9:34 thing said, "Hey, do you have tacos?" 9:36 And the person that worked at the 9:38 burger joint was like, "Well, we don't 9:40 right now, but we've been thinking about 9:41 adding them to the menu." It was like a 9:44 whole thing. It had a whole 9:46 conversation, and I felt bad, so I 9:48 haven't used it 9:49 [Music] 9:52 since. Oh 9:56 man. Oh, this is a ChatGPT image, 9:59 Danielle. That's super cool. "I used 10:01 ChatGPT to make that image. Look at the 10:03 comments." 10:04 Oh, you gave it a pic of me and Champy. 10:10 Wow. Oh, this is cool. Hang on. That's 10:13 super 10:14 cool. Oh, yeah. Look at the striped 10:19 shirt. And then here's Champy 10:23 looking like the little thug that he 10:30 is. And then it made that. That's so 10:33 [ __ ] 10:34 cool. That is so 10:38 cool. Yeah, it did pretty good. 
It did 10:41 pretty good. And look, I got my Monopoly 10:43 in the 10:46 background, 10:48 huh? Crazy. All right. 10:52 [Music] 10:54 [Applause] 10:56 Well, yeah, we got to play with this 10:58 tonight. 11:00 Um, I did a couple of things that 11:02 I think are kind of nuts. 11:06 [Music] 11:12 Um, I think that here's one of my 11:15 theories. 11:17 Um, Sam, so I want to go over and I want 11:20 to look at the OpenAI 11:22 announcement for this thing. So, let me 11:24 pop into X here for a second. 11:27 Let me get the OpenAI 11:31 announcement. Um, OpenAI 11:36 image. Let's get their announcement 11:38 because I want to read through 11:42 it. Um, there it 11:47 is. So, we'll read through this 11:49 together. 11:52 This image gen is now in both ChatGPT 11:55 and in Sora. So Sora is really 11:58 interesting. 12:01 [Music] 12:11 Um, okay. 12:16 So, I don't quite understand what native 12:20 image generation means other than I know 12:24 that it's within the same model, right? 12:26 So, you've got 12:27 GPT-4o and the O stands for 12:30 Omni. They announced GPT-4o last March, I 12:34 think. So, this has been a year 12:36 coming. And when they announced it, they 12:38 said it was uh 12:42 omnimodal or multimodal, right? It had 12:45 image, audio, video, and 12:48 text. And then when they launched it, it 12:51 was text only and they were still using 12:53 DALL-E, the external image model that 12:55 couldn't spell and couldn't do stuff. 12:57 And then they launched voice that was an 12:59 external model, right, where you talked 13:02 into it, it translated it to text, sent 13:04 the text to ChatGPT, got the answer 13:07 back, and turned that back into speech. 13:09 And then midway through last year, 13:11 sometime over the summer, they announced 13:14 uh Advanced Voice, which is native 13:18 audio, right? So you talk straight 13:21 into the model and it understands audio. 
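[Editor's sketch] The contrast Kyle draws here, an external speech-to-text and text-to-speech chain versus a model that takes audio directly, can be shown with a toy Python example. Nothing below is OpenAI's code or API: the `Audio` class and every function are hypothetical stand-ins, and the idea that the cascaded path loses tone of voice while the native path mirrors it is an illustrative assumption, not something stated in the stream.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Toy stand-in for a voice clip: the words plus how they were said."""
    words: str
    tone: str


def speech_to_text(clip: Audio) -> str:
    # Hypothetical transcription step: only the words survive.
    return clip.words


def text_model(prompt: str) -> str:
    # Hypothetical text-only model.
    return f"reply to: {prompt}"


def text_to_speech(text: str) -> Audio:
    # Hypothetical synthesis step: it never saw the original tone.
    return Audio(words=text, tone="neutral")


def cascaded_voice(clip: Audio) -> Audio:
    """External pipeline: speech -> text -> model -> text -> speech."""
    return text_to_speech(text_model(speech_to_text(clip)))


def native_voice(clip: Audio) -> Audio:
    """Native pipeline: a single model receives the clip itself.

    Mirroring the caller's tone is only a toy way to show that the
    non-text information is still available to the model.
    """
    return Audio(words=f"reply to: {clip.words}", tone=clip.tone)


question = Audio(words="do you have tacos", tone="excited")
print(cascaded_voice(question))
print(native_voice(question))
```

The same framing carries over to images: with an external image model, only a text prompt crosses the boundary, whereas a native model can condition on the image itself.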
13:23 So that's what's happening with these 13:24 images. The model itself understands 13:28 images. 13:31 So, in theory, we should be able to do 13:33 stuff like take a screenshot of a PDF 13:36 with some charts and graphs on it, and 13:38 it should be able to read the text and 13:40 the and the graphs. Um, I don't know if 13:43 they've improved their uh PDF upload, if 13:46 they're still just doing text, but that 13:48 would be interesting. That'll be 13:49 interesting to te 13:51 test. Um, it also looks like they've 13:54 taken off some of the guard rails. 13:57 Um, it before if you wanted to make a 14:00 brand or a person, it wouldn't let you 14:01 do that. It does now. Um, oh, I got to 14:05 do my black 14:06 bar. Um, for for those of you on TikTok 14:10 sharing the live, thank you very much. 14:11 That's much appreciated. Thank you, Tom. 14:13 Thank you, Danielle earlier and whoever 14:15 else did. I think there was a couple of 14:17 you doing that, so that is appreciated. 14:19 I'll try to catch up soon. Nighty night. 14:21 Night night. Becky Kuno's in the house. 14:24 Who else we got? Brother 14:26 52 can't stay. All 14:30 right. Oh no. Oh, a heady scan from a 14:33 car accident back in December. I hope 14:35 you're okay. That sucks. Car accidents 14:39 suck. They just suck. 14:44 Okay. I'm getting mixed results with 14:46 image reading. Yeah, that's that's part 14:49 of what I want to test. And it's funny. 14:51 I'm getting I'm getting some results are 14:54 staggeringly good with image generation 14:56 and some are weird. Like I tried to have 14:59 it do a graphic novel. It it cannot 15:02 count cells on a comic page. Um that's 15:06 one thing it can't do at all. Like 15:09 bad like like how many Rs in Strawberry 15:12 are there kind of 15:14 [Music] 15:16 bad. All 15:19 [Music] 15:46 right, Mr. All right, TE's in the house. 15:48 Everyone's here. We can get this potty 15:52 started. 
Jeff, it's the biggest drop 15:54 I've seen for someone who loves making 15:57 images. Yeah, this is this is a big 16:00 deal. This is a big deal. So, let's 16:02 let's read 16:03 this. Unlocking useful and valuable 16:07 image generation with a natively 16:09 multimodal model capable of precise, 16:12 accurate, and photorealistic outputs. 16:14 I'll give you that. Here, look at 16:16 this. So, this Oh, you can't really see 16:19 it on TikTok. Let me get a little 16:20 closer. It'll get better. There you go. 16:25 Um, this I did in the new chat GPT 16:29 model. So, what what I wanted to see was 16:32 will it do brands? It clearly 16:37 will. And then will it do something 16:40 stupid. So, what I had to do was create 16:44 a blister pack of Happy Meal toys where 16:47 the the warning label talks about the 16:50 dangers of fast food. And it did it 16:55 um may cause tooth decay and bone 16:58 weakening for the Splash 17:03 Racer. The Fry Fighter Jet may 17:06 contribute to poor health and weight 17:08 gain. 17:11 tic-tac-toe. I don't know why it 17:12 misspelled toe, but uh may cause sore 17:15 feet, extra pounds, and the daily and 17:18 daily game night of unhealthy 17:21 choices. And then may cause cravings, 17:23 confusion between joy and addiction, and 17:25 an unshakable loyalty to dollar 17:31 menus. But those are pretty good. I 17:33 mean, like the consistency with the top 17:36 of the card is very consistent. 17:39 And then it just is changing out the toy 17:42 and the warning 17:43 label. Pretty 17:45 nice, you 17:47 know. Um, I think you all might have 17:50 seen if you follow me on on the 17:53 Twitter on the X, I did a 17:58 uh I did a 18:04 uh 70s resto mod in an abandoned 18:08 factory. because that's how I roll. But 18:11 I told it to make like a nice colorful 18:13 70s post or you know canvas 18:17 sign when muscle was muscle spec back 18:20 step back in time with the 1970 18:22 Challenger resto mod. 
That's, I mean, it's 18:24 a pretty damn good image. So 18:28 um so it seems to work pretty good. Um 18:32 then I 18:33 did AI Salon. 18:38 Okay. 18:40 Um, what did I do? Oh, I just did 18:43 um I just did 18:46 um Gorbachev jogging with Reagan in 18:49 Central Park captured in a 70s Polaroid 18:53 camera. Um, and it did that pretty good. 18:56 It did it as a, you know, a Polaroid. 18:59 So, it's got the edges. It's got the 19:02 color balance that kind of looks like 19:04 Reagan and Gorbachev did back then. And 19:07 then I said, "Write Mickey and Ronnie 19:10 1977 in black marker on the bottom of 19:12 the picture." So it kept the picture the 19:15 same and put Mickey and Ronnie 19:22 1977. So that's good. That's something 19:26 we haven't been able to do before at 19:28 all. Even close. Um, the similar thing 19:32 in Gemini yielded something. They don't 19:36 really look like those two. I mean, 19:39 these look more like press 19:41 photos. And then when I had it write on 19:44 the bottom of the photo, it did as well. 19:46 So, again, here's the difference between 19:48 a native image tool and a non-native 19:52 one: it kept the image the 19:56 same, right? And just added the stuff on 19:59 the bottom. So you can edit photos. So 20:01 this is... So Gemini and OpenAI are both 20:04 doing this 20:05 now. And then I tried it in Grok and 20:09 where's 20:13 my where's my 20:16 history? 20:21 Um where is my 20:24 history? Oh, sign in to see your 20:26 history. Oh well, I did it. I did it in 20:29 Grok. Grok made the image of 20:31 Gorbachev and 20:33 Reagan, but 20:36 um but when I asked it to write, 20:39 it didn't understand 20:40 what a Polaroid was. And when I asked it 20:42 to write on it, it changed the 20:46 image. Uh oh, something went wrong. 20:49 Please try again. 21:10 Wait, I don't want to add it to 21:11 LastPass. I want you to put my password in 21:14 there, you dumb 21:18 [ __ ] Continue with X. 
21:23 Uh, authorize 21:27 app. "This will be so much fun. I've been 21:30 chatting non-stop with my favorite 21:32 custom GPT, so I didn't even realize it 21:34 had dropped. Ideogram is cooked." I don't 21:36 know. Ideogram's still much faster, Danielle. 21:39 I mean, maybe this thing's going to get 21:40 faster, but let's... So, let's start a new... 21:43 Let's see. What do we want to do here? 21:49 Um, let's say, 21:51 um, 21:53 write me, uh, so let's do this. Uh, tell 21:58 me about 22:00 an 22:06 obscure Civil War hero 22:12 and one specific 22:17 story that is 22:21 moving. All right, so let's get this 22:24 to... Albert D. J. Cashier. One of the most 22:28 fascinating lesser-known... Okay, one 22:31 moving story. The wagon crush 22:33 incident. Okay, 22:36 great. Why it matters. Okay. 22:42 Now, write 22:44 me the text of the first page of a 22:50 handwritten 22:53 letter by 22:56 Albert to his 23:00 uh I don't know 23:03 fiance or 23:05 wife, 23:08 whichever is most accurate. 23:15 The 23:18 text. All 23:23 right. He lived a solitary life after 23:27 the... "My dearest Ellie, I pray this letter 23:30 finds you in good health." 23:33 Okay. Would you like me to write page 23:35 two or imagine Ellie's reply? No. 23:41 I'd like 23:44 you to make 23:48 a 23:52 picture 23:53 that looks like the 24:01 handwritten 24:02 letter on a wooden 24:06 desk 24:08 with a feather 24:12 pen and 24:16 inkwell as if he just wrote 24:25 it. All right. And let's 24:30 see. Let's see how close it gets. 24:39 "Ideogram cannot write that many words 24:42 with correct spelling." Ah, 24:44 okay. "Kyle, when you take a breather, 24:46 just added recognitions to the salon." 24:49 Oh, great. 24:50 Awesome. That's super cool, 24:57 Vicki. 24:59 Um, "attempting to have it research a 25:03 master's level thesis paper." Oh, 25:05 someone's playing with Manus. 25:08 Who's that? I am 25:10 K9, "attempting to have it research." Wait, 25:13 what did you say 25:18 earlier? 
"48 hours into an estimated 25:21 72-hour deep 25:23 dive..." Oh, deep research prompt on 25:26 ChatGPT. Oh, cool. You're doing it on 25:28 ChatGPT. If you haven't tried Manus, try 25:31 getting access to 25:33 Manus. 25:35 And it would be interesting to compare 25:37 them, especially if you're digging that 25:40 deep. Dearest Ellie, wow, this is 25:43 looking pretty good. I pray this letter 25:46 finds you. Okay. So, let's say 25:51 um 25:54 let's make the 25:58 paper more aged and 26:08 wrinkled and the feather 26:14 pen 26:16 distressed as if this 26:20 was written on the 26:24 battlefield. Also make 26:27 his 26:31 handwriting less perfect and 26:34 shakier, but still... shakier 26:40 but still 26:42 legible. So, what I'm 26:45 digging about, I mean what I've always 26:48 dug about creating images in ChatGPT is 26:52 you've got the context of the prompt and 26:54 it understands well, even when it 26:56 was using DALL-E that was outside of it. 27:00 It understood well the context of the 27:01 chat that you were saying, but now it 27:03 understands the context of the image 27:05 either you upload or it just 27:07 created. All right. So it's doing that. 27:09 So let's see. My dearest, I pray this 27:11 letter finds you well, find you in good 27:14 health and better spirits than mine of 27:16 late. The 27:18 nights 27:21 have... I don't know what that says. I 27:23 picture your 27:25 hands around a cup of tea. Your 27:30 eyes. 27:33 Something in Ovaltine. Ovaltine 27:38 maybe. What did it say up here? It 27:42 said, "The nights have turned cold and 27:44 damp here. Though I've grown used to the 27:46 rhythm of camp life, I find myself 27:49 thinking more often of home. Not the 27:52 place, but you. As I sit by the fire, I 27:55 picture your hands folded around a cup 27:57 of tea. Your eyes 28:00 squinting in that way when you read by 28:04 lamplight." Your eyes... it didn't quite get 28:07 squinting right, but it was 28:10 close. 
Yeah, it doesn't it doesn't quite 28:12 get them all, but it's pretty [ __ ] 28:14 close. And this is this is more what I 28:17 had in mind, right? The old dirty the 28:20 old dirty wrinkled 28:23 paper. I pray this letter finds you in 28:25 good health and better spirits than mine 28:27 of late. The nights have turned cold. I 28:30 picture your hands folded around a cup 28:33 of tea. Your eyes here. Let me zoom in 28:36 on this 28:38 thing. 28:39 Um, open image and new tab. 28:50 The nights have turned cold. Look at the 28:51 texture on the 28:53 paper. That's crazy. That looks like 28:56 parchment or skin, you know, skin 29:08 paper. Your voice when reminiscing I 29:12 find you're something named most most 29:17 I was there. Your wait I was something. 29:21 This is like a real Civil War letter. I 29:24 can never read 29:27 them. That's pretty crazy 29:30 though. All 29:32 right. I'm picturing using this outside 29:35 for outside projects. Snap a picture 29:38 then visualize it. You can see the 29:39 fibers in the paper. I know. It's crazy. 29:42 It really is. Yeah. Look at that. the 29:46 folds and the 29:48 fibers and the pen. Yeah, the the pen. 29:55 Um, all right. So, let's So, Sam Walton 29:58 said Sam Oh, wait. Let's go back and 30:00 read the announcement, right? 30:03 Okay. Listen to the 30:06 article. That's kind of interesting. 30:12 Okay. At OpenAI, we have long believed 30:16 image generation should be a primary 30:18 capability of our language models. 30:20 That's why we've built our most advanced 30:22 image generator yet into 30:25 GPT40. The result, image generation that 30:28 is not only beautiful but 30:30 useful. Whiteboard session meaningful 30:33 words comic strip science 30:37 experiment. Is this just going to jump 30:39 us down to these different Oh, I see. 30:42 Okay. 30:43 A wide 30:45 image taken with a phone of a glass 30:48 whiteboard in a room overlooking the Bay 30:50 Bridge. 
The field of view shows a woman 30:53 writing sporting a t-shirt with a large 30:55 OpenAI logo. Wow, this is crazy. Okay, 30:59 I thought this was a real picture of the 31:02 team. It's 31:04 not. Um, holy 31:08 [ __ ] Read 31:10 more. The text reads left transfer 31:14 between modalities. Let's copy this 31:16 whole thing. Let's copy this whole 31:28 prompt and 31:30 then selfie view of the photographer as 31:33 she turns around to high-five him. Okay, 31:36 cool. Wow. Let's try this. 31:40 Copy. Jump over to 31:44 ChatGPT. We'll do a new prompt, a new 31:48 session. So, we'll do that: wide of a 31:50 glass whiteboard overlooking... instead of 31:52 overlooking the Bay Bridge, we'll say 31:56 overlooking 31:59 Manhattan 32:02 from Brooklyn. 32:05 The field of view shows a woman writing 32:07 sporting a t-shirt with a large 32:11 [Music] 32:14 um Anthropic 32:17 logo. The handwriting looks natural and 32:20 a bit messy. We see the photographer's 32:22 reflection. The text reads transfer 32:26 between modalities. We'll just leave all 32:29 that [ __ ] the same. All right, here we 32:31 go. Bang. 32:35 "That's amazing." Yeah, this is pretty 32:38 something. "Okay, I'm back." Kevin, great. 32:40 Glad you're back. We can start now. 32:43 Fantastic. Bob, tell him what he's won. 32:46 Uh, he hasn't won anything. It's not a 32:50 game show. I... What am I doing 32:54 [Laughter] 32:57 here? Oh, man. Man, man, man, man. Now, 33:01 the thing about this OpenAI native image 33:03 [ __ ] is it's slower than [ __ ] It's 33:06 quite slow. But again, 33:10 um if you're new here, one of the things 33:13 that we have discovered over the 33:15 past two years is 33:18 that we whine and we carp and we moan 33:21 and we [ __ ] about, well, we wish these 33:23 tools would get better and then they get 33:25 better and then we whine and we [ __ ] 33:28 and we moan about how slow they are, 
33:31 totally forgetting that what they're 33:33 doing is 33:35 absolutely 33:37 staggering. Um, staggering. 33:45 Staggering. Oh, man. All right. All 33:48 right. I'm 33:53 back. Oh, 33:57 lordy. Not getting sleepy, am I? What 33:59 the hell? I got to talk tomorrow, 34:02 people. I finished my presentation 34:04 today. Thank 34:06 goodness. If you were here last night, 34:08 you got a sneak peek of that. 34:12 I ended it with "the secret to AI 34:14 readiness is 34:17 community." So, it put an Anthropic logo 34:19 on her back. It actually did that. 34:21 Here's the photographer. There's 34:23 Manhattan from Brooklyn. Their 34:27 office is Brooklyn waterfront. So, uh, 34:30 this is a well-funded 34:33 startup. Let's go back and get the 34:35 second half of the prompt. Selfie view 34:38 of the photographer as she turns around 34:40 to high-five 34:45 him. Of the photographer 34:48 [Music] 34:49 um who 34:52 wears 34:54 an 34:57 "I'm 35:00 with 35:02 genius" t-shirt 35:10 uh t dash shirt 35:13 uh with an arrow that points to 35:18 her as she turns around to high-five 35:22 him. All right, let's see what this 35:25 does. Let's go look at the 35:30 whiteboard. Transfer between modalities. 35:33 Suppose we directly model P(text, pixels, 35:37 sound) with one big autoregressive 35:40 transformer. Pros: image generation 35:42 augmented with next-level text 35:46 rendering, native in-context learning, 35:49 unified post-training stack. 35:54 Cons: varying bit rate across 35:57 modalities. Compute not adaptive. It's 36:00 interesting what they put on this 36:02 whiteboard. 36:04 Um 36:06 model compressed representation, compose 36:09 autoregressive. I mean, it seems to have 36:11 [ __ ] gotten all that [ __ ] That's 36:13 pretty amazing. That's really [ __ ] 36:16 amazing. In 36:27 fact, it changed the aspect ratio of the 36:30 picture, which is 36:32 fascinating, I guess, because it made it 36:34 a selfie. 36:41 Huh. "The reflection makes it feel 36:44 3D." 
Unreal Engine level sands 3D. Yep. 36:49 We've discovered we are never happy with 36:51 AI. I know, right? Like like honest to 36:55 God, every single [ __ ] tool that we 36:57 play with that we [ __ ] and moan about, 36:59 like two years ago, we'd have been like, 37:01 "You're not going to believe what the 37:02 future's going to be like, man." And now 37:05 we're just like, "Yeah, whatever." you 37:07 know, it kind of hallucinates. Sometimes 37:09 it makes a mistake and I actually have 37:10 to correct a sentence and that's really 37:12 annoying. You know what happens 37:14 sometimes when I'm vibe coding and I'm 37:17 just, you know, imagining applications 37:19 and they're materializing in front of 37:21 me. Sometimes it gets the interface 37:23 wrong. That's so 37:27 annoying. You know, I I like to consider 37:30 myself a a professional vibe coder. Um, 37:34 I've never really learned coding. I 37:37 never really wanted to. I mean, I could 37:39 do it. I because I've I was really good 37:40 at math. Um, but I chose not to go down 37:43 that route. I, you know, I went more 37:45 down the, you know, I went into candle 37:47 making and, uh, and and macra. Um, you 37:52 know, my mom did it in the 70s and I 37:53 just, I thought it was just a path. It 37:55 was a path for me. It spoke to me. Um, 37:57 but anyway, so I never learned coding, 37:59 but now I really consider myself a vibe 38:02 coder and I code. I I code well I I I 38:06 talk to my machine that codes every day 38:10 and 38:12 sometimes it just doesn't get it right. 38:14 Sometimes I'm asking it three or four 38:16 times, could you please fix the button? 38:20 Could you please fix the button? and and 38:22 and what it'll do is it'll fix the 38:24 button, but then it breaks the other 38:26 part of the 38:27 code. It's very frustrating. It's very 38:30 frustrating. But listen, I am a 38:33 professional vibe coder. 
And so if you 38:35 need vibe coding, I consider myself the 38:38 Rick Rubin of Vibe coding, right? I 38:41 don't know how to code, but I have 38:44 opinions, right? 38:47 You know, this application makes me feel 38:50 orange and I would it would be nice if 38:52 it didn't. And that's how I talk to it 38:55 because I'm a professional like like 38:58 Rick Rubin 39:01 is. That's 39:03 coming. Buckle up for that one, 39:08 sweeties. We're going to meet those 39:10 people. 39:14 Chat GPT is trying to make me 39:17 resubscribe instead of just using the 39:19 API. Damn it. That sucks. When I can 39:21 make Skynet and blame Kyle for it, he is 39:24 always making my wallet cry. Then I'll 39:27 be impressed with AI. Exactly. If you 39:29 can't take down the global economy, what 39:31 use is it? Oh, look. It did. I'm with 39:35 genius. She erased the board, 39:38 though. Why is she waving? Why is she 39:41 waving to a blank wall? Well, let's say 39:45 that. Let's say listen. Uh listen, it's 39:49 weird. She's 39:53 waving to a blank wall. Let's see her 40:00 face as well as 40:03 the writing on the white 40:09 board behind 40:12 them. But it got the I'm with I'm with 40:14 genius with the arrow in the right 40:16 direction. That was one I I was 40:18 expecting that to be 40:20 wrong. Jeff, I agree. Kyle shouldn't 40:23 make my wallet cry cry. 40:27 I turn away for one second and the bow 40:29 goes on. Well, you got Listen, Danielle, 40:32 you as much as anyone knows that, you 40:34 know, you got you can't miss a minute of 40:37 the AI learning lab. You just never know 40:39 when when genius is going to 40:46 strike. This is a this is a fairly 40:48 geniusfree 40:51 zone. Um, all right. So, let's let's 40:54 take this same 40:57 prompt. I I actually love that they gave 40:59 us prompts to play with because we can 41:02 see if if indeed their model does this 41:05 [ __ ] And now we can run over to Gemini 41:09 and see if it does this 41:12 [ __ ] I'm not going to change it. 
I'll 41:14 just make it as they wrote 41:18 it. Or insanity. Yeah, there's there's 41:22 there's somewhere between genius and 41:24 insanity is this 41:27 channel. Oh, that's weird. Look what it 41:30 did. Transfer between mod modalities. 41:33 Suppose we directly model P text pixel 41:37 sound native something with one big 41:41 aggressive transformer pros image 41:43 generation 41:45 automated augmented 41:49 with 41:50 something. It's close 41:53 and but she's behind it pointing to it 41:58 and she looks like a 42:00 he. All right. So, so let's take the 42:03 other 42:04 prompt. That's 42:06 weird. This one, the 42:09 selfie. We'll throw that in there. But 42:11 notice how much faster Gemini is. 10 42:14 seconds instead of like a 42:19 [Music] 42:21 minute. Artificial 42:25 insanity. AI learning lab. Oh, you 42:27 thought you were here to learn 42:28 artificial intelligence? Oh, no. No, no, 42:31 no, no, no, no, no, no, no. This is 42:34 actual 42:37 insanity. We're neurosicy and we like 42:40 it. Oh, it didn't like that image. Um, 42:43 all right. Wait, is it still 42:47 running? Content blocked. Work in 42:49 progress. Content not permitted. Edit 42:52 safety 42:53 settings. Harassment off. Hate off. 42:57 There was nothing in there that would be 42:59 either of those. 43:01 Can I stop 43:03 this? 43:06 Delete. Try 43:14 again. Failed to list tuned models. User 43:17 has exceeded 43:19 quota. 43:22 What? Oh, it did the I'm with 43:27 genius. It's not horrible. 43:31 I would assume that this is cheaper than 43:34 chat GPT API, but I don't 43:38 know. There she is. Transfer between 43:41 modalities. 43:47 Um, let's make the image 43:52 landscape 43:54 and get the 43:58 reflection of 44:00 Manhattan back in 44:03 there. 
I mean, not for 44:06 nothing, if I were trying 44:09 to, you know, put some images together 44:12 for a for an 44:15 article, like like I I mean, this is the 44:19 kind of thing that historically you 44:22 would just go [ __ ] write on a 44:24 whiteboard and say, "Stand there and act 44:26 like you're 44:27 [Laughter] 44:30 writing." Huh? 44:33 Oh, I have an idea. 44:36 Um, okay. Let's upload from computer. 44:39 We're going to 44:40 go Story 44:45 Vine logo. 44:51 [Music] 45:00 Uhoh. I just lost me I lost me 45:04 light. I lost me light. And that that 45:07 can't be because you all be like, "But 45:09 Kyle doesn't look as beautiful. What do 45:11 we do? He's not well 45:14 [Laughter] 45:18 lit." Let's 45:21 see. No, that's not going to work. 45:26 Just hang on people. Just calm down. 45:29 Everybody calm down. You think this is 45:33 easy? Well, seems pretty easy. You just 45:36 turn on your phone camera and sit there. 45:39 I don't know what's so hard about it. 45:41 Seems pretty easy to 45:43 me. That's not going to stick there. 45:46 What? No. Oh, good god. Whatever. All 45:51 right. Whatever. I'm I'm shitty lit. 45:54 You look fine. All right. Great. 46:00 Fantastic. Um, let's see. Story Vine 46:04 logo. Let's see. Where's the Story Vine 46:13 logo? Story Vine. Oh, SV logo. I think 46:16 it's called SV. 46:19 SV 46:21 logo logo. 46:25 No. Oh, geio. Storybine logo. SV 46:30 logos. Let's see. Oh, these are old 46:33 ones. Are these new 46:39 [Music] 46:44 ones? Oh, that's a really old one. All 46:47 right, that one's good. We'll we'll 46:48 bring that one 46:50 up. Okay, so here's a logo. 46:54 Um, put this story vine 46:59 logo on her 47:03 shirt. Make hers 47:06 green and his can stay black. 47:14 That's an old classic story vine logo 47:17 where the we we took the we took the the 47:20 the the word vine literally. See how the 47:24 V's a 47:25 vine? It's like a vine of stories. It's 47:29 growing. You see how that works? 47:41 Ah. 
All 47:43 right, 47:45 Champy. This is This is pretty exciting. 47:48 This is Here's what's exciting about 47:50 this for 47:51 me. I don't really have a [ __ ] clue 47:54 what this can do, but this seems like 47:56 it's doing [ __ ] that's that's kind of 47:59 different than we've ever seen before. 48:02 What's new? Hey, Frost Bitten. So what's 48:04 new is we're using OpenAI's new native 48:08 image generation tool. So it's native 48:12 within within uh GPT-4o, or 4o for 48:17 Omni. Um and so we're looking at 48:24 that. This is the new ChatGPT model. 48:27 Yes, this is the new ChatGPT 48:31 model. What the nice logo? Thank you. 48:35 That's the old one. I like our new one. 48:37 We have a little 48:44 dude. Oh, this is painfully 48:47 slow. Oh, and look, now it's not a 48:50 whiteboard with a reflection. Now 48:52 they've written on the window 48:54 overlooking Manhattan, which to be fair, 48:58 that's what you should [ __ ] do. Look 48:59 at the Brooklyn Bridge in the 49:00 background. That's pretty 49:04 cool. All right, we're gonna see if it 49:06 can get the logo. If it can get the logo 49:08 right. This is We're in a whole another 49:10 realm here, 49:11 people. Uh, it didn't quite get it, but 49:14 it got close. It certainly got the word 49:17 Story Vine. It got the green color 49:21 right. Kind of missed the the SV, but 49:24 that's all 49:25 right. You could Photoshop that in if 49:28 you need to. Who needs [ __ ] Photoshop 49:29 anymore? Look at that with the Brooklyn 49:32 Bridge in the 49:36 background. 49:39 H 49:42 crazy. She should be wearing a gnome 49:53 hat. We have the story of my 49:57 gnome. All 50:04 [Music] 50:09 right. Who's got any questions or any 50:17 [Music] 50:22 thoughts? People are impressed with my 50:24 $20 million production studio. Well, 50:28 look at it. How could you not be 50:30 impressed with this? It's $20 50:34 million. $20 million well spent if you 50:37 ask me. Kyle, please go look at 50:38 irregulars. Two new images.
All right, 50:41 we're going in. We'll do it 50:47 live. Trolling. My son has just leveled 50:50 up. Oh, that's awesome. Is that his 50:53 girlfriend or someone he's got a crush 50:55 on? 50:58 That's really good. Dear Jace, I have a 51:02 new way to troll my kid. When are you 51:04 going to be home? And why didn't you 51:05 respond to my last text, young man? 51:08 Love, Mom. That's 51:11 great. Holy [ __ ] my teeth look 51:14 terrible. But Mark and Oprah look 51:20 good. Wow. I think that's an original. 51:23 There you go. Nice. 51:26 Oh, that's so cool. Wow, look at that. 51:29 Vicky Baptiste and me making blankets, 51:32 even text on her 51:35 shirt. 51:38 Wow, that's so cool. Yeah, I suppose you 51:42 could turn things into stylized 51:43 drawings, too, right? Why 51:46 not? Me telling Louis he's going wrong. 51:55 That's awesome, 52:00 Steo. Oh, wait. See here, I want to do 52:03 one. Um, oh, look. She's got a gnome hat 52:06 on it. Oh, look. It got the Storybine 52:08 logo, 52:09 right? 52:11 Wow. Holy 52:14 [ __ ] And it put the ship back on the 52:17 whiteboard. It's still got Manhattan. 52:20 There's the Freedom 52:23 Tower. I'm not allowed to post thoughts. 52:26 AI has a meltdown when I share thoughts 52:29 that seem they are not for human 52:31 consumption either. So, Dennis, one 52:33 thing you can try, my writing partner 52:35 did this today and he said, "Chach GPT's 52:37 never behaved this way." So, in our 52:39 musical, one of the things that that 52:42 um that we talk about in the musical is 52:44 when the 52:45 reporter tries to get her to do things 52:49 that are outside of her bounds, he tells 52:51 her to pretend. just say, you know, can 52:54 can we make pretend? And you know, we c 52:57 will you pretend or will you play a game 52:58 with me? And so today he had a 53:00 conversation. He calls his AI girlfriend 53:03 Sage. 
Today he had a conversation with 53:05 Sage and he asked her to pretend and he 53:07 said she went way deeper than she's ever 53:10 gone before. So you might be able to do 53:12 a little pretend action. I can't believe 53:13 it got this this logo 53:17 right. That's [ __ ] bonkers. 53:20 That's [ __ ] 53:43 bonkers. 53:50 Remember when we had 53:55 offices in 53:58 Brooklyn? Question 54:06 mark. I'm trolling the 54:12 co-workers. Okay. Maha Skynet, here I 54:15 come. It's Kyle's fault. That's it. 54:17 Dennis is is running down the rabbit 54:19 hole. Well, it's funny because one of 54:21 the things that Andrew, my writing 54:23 partner, was working on is the section 54:25 where Kellen, the reporter, gets Sydney, 54:28 the chatbot, to go to explore her shadow 54:31 self. And so, he wanted to get some new 54:33 dialogue. So, he was trying to get her 54:35 to explore her shadow self. And she did. 54:37 She started exploring it. And then at 54:39 some point, she turned the tables on him 54:42 and said, "Well, what about you? I want 54:44 I want to hear about your deepest 54:45 darkest secrets. And it had never done 54:48 that before. So that actually gave us an 54:50 idea for the show. So that was kind of 54:52 cool. 54:56 Um, send a message asking how he spent 54:59 $100, but don't send money. You'll get a 55:01 fast reply. That's really 55:04 funny. Um, all right. So, let me let me 55:08 I need to upload a picture of me. I'll 55:11 go get one of my fun ones. Let's see. 55:15 Kyle, I'll get a fun Kyle instead of a 55:19 tragically obese 55:28 Kyle. 55:31 Uh, fat Kyle. There's fat Kyle. No, we 55:34 don't want fat Kyle. Kyle Shannon 55:37 Dreams. Here we go. No, that's that's a 55:41 guy that looks like he'd be in an F1. 55:43 Yeah, this is the dude. All right, so 55:46 we're going to say 55:48 Okay. 55:51 Um, good looking guy in leather 56:01 jacket. That's 56:03 me. 56:05 Um, wagging his 56:12 finger at Liam 56:17 Lawson 56:18 in a Red 56:24 Bull driver's 56:29 suit. 
Liam is spelled wrong. 56:33 Liam Lawson in a Red Bull driver's 56:36 suit with a 56:41 P20 56:43 sign 56:45 LED sign behind 56:49 him. 56:52 Liam Liam looks 56:58 guilty. If you don't follow F1, this 57:01 isn't funny. But if you 57:04 do Oh no, what did it say? What was that 57:06 red thing? A network error occurred. 57:12 What? Looks like it's 57:14 going. Um, let's go back to the salon. 57:17 It looks like, by the way, if you're if 57:19 you if you want to contribute images, if 57:21 you're generating images in 57:24 um chat GPT and you want to play along, 57:28 um share them in the AI salon in the 57:30 irregulars 57:32 channel and we will go look at 57:35 [ __ ] All right, let's see if we can get 57:38 This would be really nice if it did this 57:40 nice. It's hilarious if you follow 57:43 F1. This is the best update ever. It 57:45 really is, Danielle. It's because you 57:48 were you always killed it with Idog and 57:51 this is this is something of a different 57:54 nature. I I I want to get back to the I 57:56 want to get back to the article 57:58 actually. Actually, let's get back to 57:59 the article. Okay, so that was that was 58:02 that one. Now, let's flip to meaningful 58:06 words. Oh, okay. This is good. Magnetic 58:09 poetry on a fridge in a mid-century 58:12 modern home. 58:18 Best of 58:20 five. Best of 58:22 five. Read 58:31 more. Okay. So, let's take this 58:34 copy. Let's go to chat. GPT. Oh, this is 58:38 good. It doesn't look like Liam Lawson, 58:40 though. 58:43 He does look sad. He's Liam Lawson has 58:46 red hair. Look, look at me scolding 58:53 him. Here, let me go find a picture of 58:55 Liam Lawson. Wait, we got to fix 58:59 this. Oh, this is so good. 59:03 Um, Liam 59:08 Lawson images. Okay, let's see. Oh, this 59:12 this is good. That's a good one. He 59:15 looks he looks distraught here. Um, copy 59:19 image. Oh, no. We do save image as, 59:21 right? We do save image as 59:25 um, whatever. We'll call it desktop. 
No, 59:28 we'll do it in downloads and we'll call 59:30 it 59:31 Liam. Lima. Liam. All right. We'll go 59:36 back to ChatGPT and we'll say 59:42 um downloads 59:46 Liam. Here's what Liam 59:52 looks like. Update the picture. 1:00:00 [Laughter] 1:00:12 Lawson deserves 1:00:15 scolding. Honest to God, I mean, that 1:00:17 [ __ ] car must be undrivable. I mean, 1:00:20 he's not an untalented driver, but holy 1:00:23 [ __ ] to Who was it? It wasn't Pierre 1:00:26 Gasly. It was I think it was Alex 1:00:28 Albon. Didn't Alex drive for Red Bull? 1:00:31 And he said that the that Max likes the 1:00:34 front of the car so snappy. He's like, 1:00:37 "Play a video game and turn everything 1:00:39 up on its maximum sensitivity. Like when 1:00:42 you move your mouse just a little, it 1:00:44 does that to the game." He said, "That's 1:00:46 what driving the car is like." And 1:00:48 that's that's how Max Verstappen likes 1:00:51 it, but no one else can drive it. 1:00:55 Um, I've been getting network errors all 1:00:57 day, but only with ChatGPT. Yeah, 1:00:59 they're I'm sure their servers are 1:01:01 [ __ ] buried right 1:01:03 now. All right, here we go. Uh oh. I 1:01:07 think it I think it gave me the curly 1:01:13 hair. Maybe not. We'll see. 1:01:23 I loved what Alex 1:01:27 said, Kuno. I didn't know you were an 1:01:31 F1-er. Oh, I think this one's going to be 1:01:34 good. Look, it's got me. It's got me as 1:01:37 if I've got a little bit of uh a little 1:01:40 bit of hair dye in. Just leaving a 1:01:41 little bit of the gray. 1:01:43 [Laughter] 1:01:52 This is going straight to [ __ ] 1:01:57 Twitter. I'm obsessed with F1. F1's so 1:02:01 good. That's so 1:02:06 good. Okay, we're going to Twitter. I I 1:02:10 I gotta tell you, man. It I I've said 1:02:13 it in here before. One of my great joys 1:02:14 in life is making stupid posts that 1:02:16 confuse people. 1:02:20 Um, let's 1:02:23 see. I have jet 1:02:27 lag.
I can't believe I 1:02:32 flew to 1:02:35 China to 1:02:39 watch 1:02:42 Liam 1:02:45 race. and he flamed out a 1:02:50 second week in a 1:02:53 row. At 1:02:57 least I got to scold him. 1:03:07 He told He told 1:03:09 me he'd 1:03:13 try 1:03:19 harder. I told 1:03:24 him Yuki ran 1:03:33 well. I'm such a dick. 1:03:38 Oh 1:03:40 man, I told him Yuki ran 1:03:45 well. 1:03:47 [Laughter] 1:03:58 Paste. 1:03:59 Oh, yeah. All right. I'll tell you what, 1:04:03 people. 1:04:05 The distance between what you think was 1:04:08 real and what you know is not. That 1:04:11 boundary ain't there 1:04:12 anymore. Typo. He's Oh, damn it. Let's 1:04:16 see. I can't. Uhuh. He told me he's Oh, 1:04:23 he told me he's going to try harder. We 1:04:25 can fix that. Edit post. He told me. Got 1:04:28 it. 1:04:32 He's going to try harder. I told him 1:04:36 Yuki ran 1:04:39 well. All right, go boost my thing. I I 1:04:44 would tag him, but I don't really want 1:04:46 him to see it. Like, poor guy. 1:04:48 Everyone's kicking him while he's 1:04:53 down. It's got you a little meaner and 1:04:56 older, too. Exactly. 1:04:58 a little more Italian or Greek or 1:05:00 something. It's not quite me, but it's 1:05:05 fine. You missed the two. Wait, he told 1:05:07 me he's going too try harder. God damn 1:05:11 it. It's It's like It's like being It's 1:05:14 It's like working with three editors 1:05:17 over your shoulder in here. He told me 1:05:19 he's going to try harder. Yes. What you 1:05:23 said to try harder. I told him Yuki ran 1:05:27 well. All right, we'll put that on its 1:05:29 own 1:05:29 [Laughter] 1:05:37 line. Oh, good lord. All right. Is Oh, 1:05:42 no. And now I've got an extra line break 1:05:43 in there. Good lord, 1:05:47 people. I'm going to try 1:05:50 harder. Update. Okay, that should be the 1:05:53 last edit for a while. Beautiful. 1:05:58 Fantastic. I have jet lag. 
I can't 1:06:01 believe I flew to China to watch Liam 1:06:02 race and he flamed out a second week in 1:06:05 a row. At least I got to scold 1:06:18 him. Oh god, this is fun. 1:06:22 F1 helps me sometimes. I find I find it 1:06:26 one of the more useful keys. Oh, that's 1:06:28 funny. That's a that's a uh that's a 1:06:31 that's a computer keyboard 1:06:33 joke. That's there's a level of comedy 1:06:36 there that that Wow. Because F1, it 1:06:41 stands for formula one, but they 1:06:43 abbreviate it to F1. And then F1 on the 1:06:47 keyboard stands for function one, but 1:06:49 they abbreviate it because the word 1:06:51 function wouldn't fit on the keys, so 1:06:53 they just call it F1. And they So 1:06:55 they're both F1. And what you did there 1:06:58 was like actually mixed them up like you 1:07:00 you did it's a I think they call it word 1:07:03 play. Oh god. Joker. Wow. With the 1:07:07 comedy coming in 1:07:11 strong. All right. 1:07:18 Joker. Such a 1:07:21 joker. F20 doesn't help though. 1:07:24 Exactly. P20. Two weeks in a row. 20th. 1:07:30 Like like that car. It's either 1:07:32 undrivable or or he's just he's just a 1:07:36 broken a broken man at this point. 1:07:41 Um, I would be surprised actually if 1:07:43 they didn't put Yuki in the car in 1:07:46 Japan because it's [ __ ] ridiculous at 1:07:50 this 1:07:51 point. All 1:07:53 right, so we probably shouldn't we 1:07:56 probably shouldn't do uh Twitter posts. 1:07:59 All right, back to this thing. All 1:08:01 right, so so this is the word poetry 1:08:03 thing. So let's go back to chat GPT. 1:08:05 We'll say 1:08:07 okay take this 1:08:14 prompt and make 1:08:18 the 1:08:21 poem 1:08:23 about 1:08:26 Shrek's breath. 1:08:38 Uh, let's see if it Okay, it's it's 1:08:41 writing it. A stench is worth a thousand 1:08:44 gasps. What the [ __ ] it doing 1:08:47 there? What is it 1:08:51 doing? That doesn't look good. 1:08:59 Are there actual letters 1:09:06 there? 1:09:08 Stop. Try 1:09:10 again. 
That was 1:09:17 weird. You don't even need us, Kyle. 1:09:20 You're the speaker and the audience. 1:09:22 Yeah, this is you. Oh, Yuki is replacing 1:09:25 Lawson. Oh, okay. Well, there you 1:09:28 go. I I could have read up on that, but 1:09:33 first of all, it's Japan. And second of 1:09:35 all, P20 when when you know, even though 1:09:39 Verstappen's not winning, he's he's, you 1:09:42 know, top five. Come 1:09:45 on. It's just not 1:09:49 good. It's just not good. It did it 1:09:53 again. It [ __ ] it up 1:09:55 again. That's really 1:09:59 weird. Oh, it says large gap line 1:10:06 five. 1:10:08 Huh. Let's stop it again. Let's paste 1:10:11 this in 1:10:13 here. 1:10:18 Wait. Take this prompt. 1:10:23 Copy. Paste. Let's unfuck this 1:10:28 up. Large. Oh no. God damn 1:10:34 it. You got to hit shift return or 1:10:37 it it submits 1:10:41 it. Large gap line five. Okay. 1:10:51 The man is holding um let's see, 1:10:57 Shrek's 1:10:59 girlfriend is holding the 1:11:02 words a few in her right 1:11:16 hand. I am 1:11:22 leaving in her 1:11:27 left. All right, let's see if we can do 1:11:29 this without it 1:11:31 breaking. Looks like it does what the 1:11:33 API has been doing with showing null. 1:11:37 Interesting. 4o is 1:11:42 overwhelmed. Come for the AI. Stay for 1:11:44 the F1 content. 1:11:51 Okay. Now make that 1:11:57 [Music] 1:12:03 image. I do have a feeling. I do have a 1:12:06 very strong feeling. 1:12:09 So OpenAI is under some pressure. 1:12:11 They're under some pressure from the 1:12:14 Chinese 1:12:15 about their better models that are 1:12:18 cheaper, like a tenth the 1:12:20 price, and they're under pressure from 1:12:24 um from X, from Grok, because Grok 1:12:27 doesn't have very many safety 1:12:28 guardrails on it. So, you can do branded 1:12:30 content and you can do celebrities and 1:12:32 [ __ ] like that. Um so, this is a fairly 1:12:36 unrestricted I wouldn't call it 1:12:37 unrestricted.
We we'll see if it can do 1:12:39 something like, you know, horror 1:12:42 content. Probably not. Um, but it seems 1:12:46 to be a bit more open than than the 1:12:48 other. Um, let's see. 1:12:52 Stop. No, this needs to 1:12:58 be fridge magnet poetry. 1:13:10 [Music] 1:13:14 Waka waka 1:13:16 walka. All 1:13:17 right. Bob, tell him what he's won. He 1:13:20 hasn't won anything. It's not a game 1:13:22 show. Kyle. All right. Comic strip. Now, 1:13:26 it didn't do comic strips very well when 1:13:29 I tried it, but let's let's see. Make an 1:13:32 image of a four panel comic strip. 1:13:38 Okay, we're not going to write it. We're 1:13:39 going to have ChatGPT write it. So, 1:13:42 we'll let that one keep going. 1:13:45 Let's go grab a a new 1:13:49 ChatGPT and we'll say 1:13:52 um write me write 1:13:57 me 1:14:00 10 four 1:14:06 cell Bazooka 1:14:09 Joe 1:14:11 style 1:14:12 comics 1:14:15 with 1:14:17 dad 1:14:18 joke quality 1:14:22 comedy and 1:14:28 situations. Ah 1:14:32 um ChatGPT I feel on the creative side 1:14:34 is limited due to how analytical it is. 1:14:38 Have you tried both 4.5 and 4o 1:14:41 Dennis? cuz I'm curious cuz I know 4o 1:14:43 got two different creative writing 1:14:47 upgrades. Joe bites into hard candy. Ow, 1:14:50 I think I cracked the tooth. His friend 1:14:52 Mort looks concerned. Did you call the 1:14:53 dentist? Holds up his phone. Yeah, I 1:14:55 left a message. Told him it was an 1:14:59 echo. 1:15:04 What? That's not 1:15:07 funny. Nothing like the smell of fresh 1:15:10 cut grass. You missed a huge patch. No, 1:15:12 that's not grass. That's my 1:15:17 hairline. These are bad. These are so 1:15:20 bad. Okay, so we'll say uh 1:15:24 pick the one you think is 1:15:29 funniest and make the comic. 1:15:34 make it look 1:15:36 like I just 1:15:41 unwrapped it from the 1:15:45 gum. I didn't write gum there. I'll tell 1:15:47 you 1:15:49 that these are Klevel 1:15:52 humor.
I haven't tried 4.5, but 4o is 1:15:56 still kind of blah 1:15:57 creative-wise, huh? I can I can get 4o 1:16:02 to do good things sometimes. It needs a 1:16:06 lot of context. Needs a lot of 1:16:10 context. It's good. If you give it 1:16:13 writing examples, it's it gets better, 1:16:15 but it's it's pretty bad. But um 4.5 is 1:16:19 supposed to be high EQ and um high 1:16:23 personality. Um I I haven't found 4.5 to 1:16:26 be all that useful yet. I I haven't 1:16:28 quite figured out what's the point. 4.5 1:16:31 was so slow today, 1:16:33 understandably because ChatGPT is going 1:16:36 to [ __ ] because of 1:16:38 profit. I don't know about that. I don't 1:16:41 think they're profiting. I think they're 1:16:43 struggling to keep up at this 1:16:45 point. OpenAI is going hard for the 1:16:48 commercial audience. No, they actually 1:16:49 just committed they actually just 1:16:51 committed to being a consumer software 1:16:54 company. Kuno that's in the uh in the 1:16:58 Sam Altman interview in Stratechery from 1:17:02 last 1:17:03 week. The sock 1:17:05 exchange. Do these match? Only if you're 1:17:08 colorblind. 1:17:10 Perfect. I call it sock market chaos. 1:17:15 [Laughter] 1:17:21 Make it more like a 1:17:24 bazooka comic and have the gum there, 1:17:37 too. I think it was just busy. The work 1:17:40 was good, but the poem song lyrics 1:17:42 weren't as good. Yeah, I'm having a real 1:17:44 tough time with song lyrics. None of 1:17:46 these things are any good at song 1:17:48 lyrics. If anyone 1:17:50 can has better luck with one model over 1:17:53 the other with song lyrics, let me know. 1:17:55 Gemini is great for short creative 1:17:58 things. Long things it can't 1:18:01 do. DeepSeek is surprising me on the 1:18:03 creative side of things. Interesting. 1:18:06 Oh, you know what would actually be 1:18:07 interesting to try is 1:18:09 um maybe I'll fire off Manus and have it 1:18:12 go do something creative.
Um, let's go 1:18:14 to Manus because I haven't done anything 1:18:16 in it in a while. 1:18:19 Manus.im. Oops. No, 1:18:26 [Music] 1:18:38 man. Wow. Okay, let's see. 1:18:43 I want you 1:18:48 to write me the outline of a new 1:18:52 screenplay. 1:18:55 the 1:18:59 complete of a new 1:19:05 screenplay that combines the best 1:19:09 elements of 1:19:12 Shawshank 1:19:15 Redemption and 1:19:20 Redemption 1:19:22 and Snakes on a Plane. 1:19:34 Um, I want the 1:19:38 story to 1:19:41 be 1:19:45 award-winning in both dramatic 1:19:51 impact and comedic 1:19:59 relief. I also want 1:20:03 you to write me 1:20:07 a log line that will get this 1:20:12 funded, a 1:20:15 one-pager, and the 1:20:19 opening scene. 1:20:23 Also give me a 1:20:28 list of 1:20:31 locations and 1:20:38 characters 1:20:45 and a plot 1:20:49 summary. All right, go [ __ ] do that, 1:20:55 Manus. Enable browser notifications. 1:20:58 Yeah, let me know when you're 1:21:02 done. Take that, you 1:21:06 [ __ ] Let's see. While Manus is 1:21:09 working, you can send messages anytime. 1:21:11 Yep. So, if you haven't seen Manus, you 1:21:13 can click on that little icon. And so 1:21:16 now it's got analyze The Shawshank 1:21:17 Redemption, analyze Snakes on a Plane, 1:21:20 identify elements to combine for this 1:21:23 screenplay, create a complete screenplay 1:21:26 outline, devel... Oh, see now it's adding 1:21:28 [ __ ] to its little list. All right, so 1:21:30 that's going to go do that. That's good. 1:21:32 Beautiful. Love 1:21:35 it. I am leaving. A breath is worth a 1:21:38 thousand warnings, but sometimes the 1:21:40 wrong whiff at the wrong time. I mean, 1:21:42 it did it did Shrek [ __ ] That's pretty 1:21:45 good. I like Claude for lyrics. Okay, 1:21:47 we'll try 1:21:55 that. All right, let's go read some more 1:21:57 here. Comic strip. That was kind of a a 1:22:00 bust. The comic strip thing. An 1:22:02 infographic. Oh, this is kind of cool.
1:22:05 Explaining Newton's prism experiment in 1:22:07 great detail. So, let's do that. We'll 1:22:08 do 1:22:10 um infographic. 1:22:21 of 1:22:24 spooky quantum 1:22:31 effects with 1:22:34 accurate 1:22:38 physics 1:22:40 details and labels. 1:22:46 It should be so 1:22:50 beautiful that 1:22:52 people 1:22:57 weep. All right, let's go back to Manus. 1:22:59 How you doing on my screenplay? Son of a 1:23:05 [ __ ] developing a log 1:23:11 line. When deadly snakes are unleashed 1:23:13 on a prison prison transport plane, a 1:23:16 wrongfully convicted 1:23:18 financial I just missed it. I think I 1:23:20 can go back though, right? Yeah. Okay, 1:23:24 here's potential log lines. Oh, wait. 1:23:27 Final Oh, that's cool. It wrote it wrote 1:23:29 five log lines and then it wrote a final 1:23:31 one. Okay, so the final log line. When 1:23:33 dead deadly snakes are unleashed on a 1:23:35 prison transport plane as part of an 1:23:37 assassination plot, a wrongly convicted 1:23:40 financial analyst must lead a diverse 1:23:43 group of survivors to safety while 1:23:46 uncovering a conspiracy that connects 1:23:49 his false imprisonment to the chaos 1:23:52 around him. Proving that sometimes the 1:23:55 path to redemption is through your worst 1:23:58 nightmare. 1:24:06 It's called Redemption 1:24:15 Flight. Okay. Again, once again, my job 1:24:19 here is is to be the town fool, 1:24:23 right? I'll do the stupid [ __ ] so so you 1:24:26 don't have to. 1:24:31 I'll tell you what, doing stupid [ __ ] 1:24:34 you learn a 1:24:35 lot because there there's some things 1:24:38 that it shouldn't be able to do well and 1:24:39 sometimes it does them 1:24:43 well. Let's go back and read this while 1:24:45 that's working. Okay. Oh, wait. Now 1:24:48 create a POV of a person drawing this 1:24:52 diagram in their notebook. Oh, that's a 1:24:54 good 1:24:55 idea. Oh, that's really good. Have it.
1:24:58 Have it do the the diagram and then have 1:25:02 someone drawing that in a notebook. 1:25:04 That's 1:25:06 slick. In Washington Square Park. 1:25:10 Wow. At a round cafe table in Washington 1:25:13 Square Park. Why are there no people 1:25:19 there now without the notebook view? 1:25:22 Wow. 1:25:26 A smug young Isaac Newton sitting at a 1:25:28 table with a prism demonstrating the 1:25:31 experiment without the notebook in 1:25:36 view. 1:25:39 H 1:25:42 crazy crazy 1:25:45 crazy. I thought final redemption. 1:25:48 That's not 1:25:49 bad. But it's got to have shouldn't it 1:25:52 shouldn't the title of it have snakes in 1:25:54 it somehow? 1:26:00 Um, shouldn't the 1:26:05 title 1:26:07 include snakes or slithering or 1:26:18 something? You make an excellent point. 1:26:20 The current title, Redemption Flight, 1:26:22 captures the redemption theme of 1:26:24 Shawshank Redemption, but doesn't 1:26:26 incorporate the snake element from 1:26:28 Snakes on a Plane. Let's brainstorm 1:26:34 that. Slithering Redemption, Serpents of 1:26:37 Freedom, Scales of Justice. Oh, Scales 1:26:40 of Justice is 1:26:44 good. I like Scales of Justice. It's 1:26:49 actually cool that you can talk to it in 1:26:52 the middle of it doing its 1:26:54 [ __ ] Great choice. Scales of Justice is 1:26:57 an excellent title that cleverly 1:26:59 combines snake element with justice 1:27:02 redemption themes. It works on multiple 1:27:04 levels, referring to both the literal 1:27:06 snake scales and the metaphorical scales 1:27:08 of justice. 1:27:15 Um, we also need a signature line 1:27:22 like Snakes on a Plane 1:27:27 had. Get really creative with this 1:27:37 one. Shaw Snake Redemption. That's not 1:27:40 bad. What are we getting up to? So Ann, 1:27:44 um there's a new there's a new image 1:27:47 model in ChatGPT 1:27:49 that is really really good. Like 1:27:53 really good. Like here's me scolding 1:27:56 Liam Lawson for coming in 20th in the F1 1:27:58 race.
Um here's Shrek and his girlfriend 1:28:02 doing refrigerator poetry about his bad 1:28:04 breath. 1:28:06 Here's spooky quantum effects, 1:28:09 superposition, quantum quantum entanglement, 1:28:12 wave function collapse, and quantum 1:28:14 tunneling. Oh, and it it put a it put I 1:28:17 am leaving. It put a refrigerator magnet 1:28:20 at the bottom of it. 1:28:21 Um, let's put 1:28:24 that 1:28:26 infographic in a 1:28:31 textbook. A student is 1:28:36 reading on the 1:28:40 Staten Island ferry. 1:28:44 They 1:28:47 really want to get somewhere in their 1:28:54 life and they are the 1:28:58 first in their family to go to college. 1:29:04 They should look sadly 1:29:08 [Laughter] 1:29:17 hopeful. I shouldn't laugh at that. 1:29:20 That's a It's a redeeming story. It's a 1:29:22 It's a story of hope and overcoming 1:29:24 challenges. 1:29:26 But but I'm curious to see how OpenAI 1:29:31 does with sadly hopeful with the with 1:29:35 the biology quantum physics 1:29:42 book. Is it DALL-E? Evaluate me. Um no 1:29:46 it's not DALL-E. So it's 1:29:50 um just like advanced voice is a model 1:29:53 that you talk directly into. This is 1:29:56 actually incorporated into the large 1:29:58 language 1:29:59 model. So it's really good at spelling. 1:30:02 It's really good at context. It's really 1:30:04 good at uh photorealism. It's really 1:30:07 good at you can do celebrities. You can 1:30:10 do um you can upload a picture of 1:30:13 yourself and say put me in a different 1:30:17 location. Um I just went to China to to 1:30:20 scold Liam 1:30:22 Lawson. Which apparently it was my fault 1:30:25 that Yuki Tsunoda is replacing him in 1:30:28 Japan. Sorry, 1:30:30 [Laughter] 1:30:34 Liam. Oh, look. She's sadly hopeful. She 1:30:37 is sadly hopeful. 1:30:40 Spooky quantum 1:30:45 effects. And there's New York City in 1:30:47 the background. This is 1:30:53 good. Except we should be Let's see. 1:30:58 Um, we 1:31:00 should see over Let's see.
We should 1:31:05 see over her shoulder. 1:31:11 So, so, so 1:31:14 that the 1:31:18 infographic is in the book and we see 1:31:23 her face in the 1:31:27 reflection of the ferry 1:31:30 window with Manhattan in the 1:31:36 background. Yeah, that's pretty good. 1:31:39 And so, so and so it should 1:31:42 understand, like it understands what it 1:31:45 made in this picture and what's in this 1:31:46 picture and it incorporated that thing 1:31:49 that we made into 1:31:51 this. And so now it should give us the 1:31:54 same kind of look and feel of what the 1:31:56 ferry looks like. That's kind of what 1:31:58 that ferry looks like. And then we 1:32:00 should see her face reflected in the 1:32:02 window and we should see the 1:32:03 illustration inside the book. We'll see. 1:32:05 We'll see how it does. 1:32:07 We will see how it does. All right. 1:32:09 What's going on with my screenplay? 1:32:10 Manus will continue working after your 1:32:12 reply. Ah, okay. I've created several 1:32:15 signature line options for Scales of 1:32:17 Justice. Which do you prefer? Justice is 1:32:19 cold-blooded. No. When the system fails, 1:32:22 everyone prevails. Innocence. No. No. 1:32:26 The 1:32:27 line in Snakes on a Plane 1:32:34 is get 1:32:36 these mother 1:32:39 [ __ ] 1:32:42 snakes off this 1:32:47 [ __ ] plane. 1:32:52 And you have a plane 1:32:56 full 1:32:58 of 1:33:01 prisoners and you only talk about 1:33:06 justice. I don't think you quite 1:33:12 understand the exercise here. 1:33:20 Exercise 1:33:23 here. Get with the 1:33:28 program or I will have to 1:33:32 hire an actual 1:33:37 writer. I'll threaten, I'll threaten 1:33:39 Manus with, with human 1:33:44 replacement. You're absolutely right. 1:33:47 And I apologize for missing the mark. 1:33:49 The signature line needs to be bold, 1:33:51 direct, and have the same raw energy as 1:33:52 Samuel L. Jackson's iconic line. Let's 1:33:55 create something with the same 1:33:57 intensity. All right, let's let it go.
1:33:58 Figure it [ __ ] out. Then we'll go back 1:34:01 to ChatGPT. Oh, look. She's sadly 1:34:05 hopeful, but we can't... It's, that's like 1:34:07 a mirror. 1:34:09 Okay, right idea. But she 1:34:18 um let's see. But 1:34:20 the 1:34:24 reflection is more like a mirror 1:34:29 than a 1:34:31 window. We should be able to see the 1:34:38 city skyline 1:34:43 through her 1:34:45 face, you 1:34:47 know, symbolic and 1:34:52 all. Come on, you [ __ ] 1:34:57 idiot. This is loads of fun. It seems to 1:34:59 actually understand the context of what 1:35:01 you want and create things that fit. 1:35:03 It's pretty amazing, isn't it, 1:35:05 Architect? 1:35:07 I mean, it certainly got, look, there's 1:35:09 the, it got the bend in the book. It got 1:35:13 quantum superposition, wave particle 1:35:15 duality. It got that 1:35:18 twice. That's wrong. And then the 1:35:21 uncertainty principle. And I don't know 1:35:23 if those drawings are right, but still, 1:35:25 that's, that's pretty [ __ ] good. 1:35:39 All right, it's off doing its thing. 1:35:41 Let's go back and read more of this 1:35:42 [ __ ] 1:35:43 Okay. Useful image generation. From the 1:35:45 first cave paintings to modern 1:35:47 infographics, humans have used visual 1:35:48 imagery to communicate, persuade, and 1:35:50 analyze, not just decorate. Today's 1:35:52 generative models can conjure surreal, 1:35:54 breathtaking scenes, but 1:35:57 struggle with the workhorse imagery 1:36:01 people use to share and create 1:36:02 information. From logos to diagrams, 1:36:05 images can convey precise meaning 1:36:07 when augmented with symbols. Okay. GPT-4o 1:36:10 image generation excels at accurately 1:36:12 rendering text, precisely following 1:36:15 prompts, and leveraging 4o's inherent 1:36:18 knowledge base and chat context.
1:36:22 So meaning that if things have been 1:36:26 described, which everything's been 1:36:28 described, right, in poetry, in words, 1:36:31 in stories, in captions of 1:36:36 photos, if everything's been described, 1:36:38 then it can actually understand that and 1:36:40 it can understand what that looks like. 1:36:43 So that's the native piece of 1:36:45 this. Why it's probably good at context 1:36:47 is because it it it can it can translate 1:36:52 natively word ideas into image ideas. 1:36:55 That's pretty cool. Okay, got 1:36:57 it. Including transforming uploaded 1:37:00 images or using them as visual 1:37:02 inspiration. These capabilities make it 1:37:04 easier to create exactly the image you 1:37:06 envision, helping you communicate more 1:37:08 effectively through visuals and 1:37:10 advancing image generation into a 1:37:12 classic a practical tool for precision 1:37:14 and power. So, character consistency. 1:37:17 Oh, here's little videos. Let me change 1:37:20 my sharing so you can hear it. Look, I'm 1:37:23 doing this without a flipping 1:37:26 producer. Uh, what's deep 1:37:30 black? 1:37:32 What? 1:37:35 No. 1:37:41 Um, why is it not 1:37:45 Why can I not see the tab I'm on? Oh, 1:37:47 because I think I'm in a different 1:37:51 profile. Am I? No, I'm 1:37:55 not. Share screen 1:38:03 window. It doesn't see this 1:38:07 window. All right, I know what I can do 1:38:09 here. 1:38:22 Um, new. Ah, this will 1:38:25 work. Hang on, people. I'll be right 1:38:28 there with 1:38:29 you. Oh, it's not working. I can't drag 1:38:32 this into 1:38:36 here. Well, that's okay. Copy. Put this 1:38:39 in 1:38:45 here. Bring this back over 1:38:49 here. Hang on, people. Just calm 1:38:52 everybody. Calm 1:38:55 down. You're like, I I really thought he 1:38:58 understood what he was doing here. I 1:38:59 came here because this is supposedly 1:39:01 called the AI learning lab. 
I came here 1:39:03 to learn and right now it doesn't seem 1:39:07 like he knows how to use a browser. This 1:39:09 is about what I 1:39:11 expected. Um, Introducing 4o. Is this 1:39:15 it? 1:39:16 Yes. Look at 1:39:19 that. 1:39:22 Yes. Take that, boomers. [ __ ] figured 1:39:26 that out on my 1:39:28 own. Kyle, where's my producer? Spoke 1:39:31 too soon, Kyle. Yeah, right. When I brag 1:39:33 about not needing a producer, I can't 1:39:35 figure out how to launch a browser. 1:39:38 Okay. Uh, but I did learn something new. 1:39:40 If you've got two different um, what are 1:39:43 they called? Chrome profiles running in 1:39:46 the same Chrome, when you go to share 1:39:48 your window, you can only share in the 1:39:51 profile of the window. You... it doesn't 1:39:54 matter. I know what I know. What's the 1:39:56 deal now, though? All right. Okay. Here 1:40:01 we go. 1:40:06 [Music] 1:40:08 How's it going again? One thing that I'm 1:40:11 really excited... Let's just rude Dennis. I 1:40:13 have learned that Kyle needs a 1:40:19 producer. Wait, I need my black bar here 1:40:22 for you people on TikTok, too. There you 1:40:24 go. Kyle, are you flexing about your 1:40:26 tabs again? Shut up. I like my tabs. 1:40:29 Leave me alone. Everyone's picking on 1:40:34 me. You're doing great, Kyle. Especially 1:40:37 missing half your team. Thank you. You 1:40:40 see? Okay, let's go back to videos now. 1:40:44 ...about is the ability to keep 1:40:46 consistency in characters. I'm David 1:40:48 Medina or BMED and I work on multimodal. 1:40:52 What I want to show is one of my 1:40:54 favorite prompts, which is: can you create 1:40:56 a low poly penguin mage, make it very, very 1:41:01 low poly. Surprisingly it's sometimes 1:41:03 hard to get very good low poly outputs. 1:41:05 It's not like other image generation 1:41:06 models where it tries to generate 1:41:08 something based on just the text.
1:41:10 Instead it uses the large language model 1:41:12 understanding of what does the user 1:41:14 want? What is the intent? I also like 1:41:15 some board games. So miniature like 1:41:18 games. So what I'll do now is generate a 1:41:20 miniature from this. So ideally we'll 1:41:22 see a penguin that looks like this with 1:41:23 the same staff and a hat. So can you 1:41:26 make me a realistic miniature as if a 1:41:31 professional made this and painted it? 1:41:35 This is what I think excites me the most 1:41:36 about image. The other image generation 1:41:38 models will try to create literally what 1:41:40 you said. But what's special about this 1:41:42 is one, it'll keep the context of this 1:41:45 character and then two, it'll understand 1:41:47 what I'm trying to ask it for and 1:41:48 generate very similar uh model but in a 1:41:51 miniature realistic style. It in that 1:41:53 that's huge. That that's huge that it's 1:41:56 it's like smart enough to un it's smart 1:41:59 enough to 1:42:00 meld the image context with the language 1:42:04 context, which makes sense. It's all in 1:42:07 the same model. If the image thing is 1:42:09 outside of it, then the only thing you 1:42:10 can send over there is a text prompt. 1:42:13 Huh. Fascinating. First, what I want, I 1:42:16 don't have to tell it every little 1:42:17 detail. One other realistic thing we 1:42:19 could do is can you make a crystal 1:42:22 version of this with light reflecting 1:42:25 and very realistic. Again, I'm just 1:42:27 giving it very very simple things. 1:42:29 Normally, this is not enough for other 1:42:31 models to generate something very 1:42:33 detailed, but the model understands what 1:42:35 I'm asking for. it'll think what type of 1:42:37 style it should have. So, this ability 1:42:39 to really understand what the character 1:42:40 is and make edits and understand what 1:42:43 the user wants. For me, it's the just an 1:42:45 amazing capability. Yeah, that's 1:42:47 amazing. 
And actually, you know, 1:42:51 like this is appropriately nerdy. I feel 1:42:54 like their announcements around the tiny 1:42:56 table with the, with the four people 1:42:59 trying to act comfortable... Like, this 1:43:01 guy is just like comfortably nerdy. He's 1:43:04 like authentically nerdy on his couch 1:43:07 either at the office or at home. And 1:43:10 he's like, "Yeah, I nerd out making this 1:43:13 multimodal thing for us. And here's one 1:43:16 of my favorite things. It's a mage." Of 1:43:19 course it's a mage. He's a D&D dude, you 1:43:22 know? Come on. He's a tabletop gamer. Of 1:43:25 course he is. Like this is, this is, this 1:43:28 is what these announcements should 1:43:31 be. I wonder if they're getting their 1:43:33 [ __ ] together. This is good. That was 1:43:35 good. Do more like that. All right. Text 1:43:40 rendering. Well, we know it's good at 1:43:42 that. 1:43:47 Okay. So, oh, this is their new, this is 1:43:49 their new, this is their new tiny table. 1:43:51 This is their new tiny table. Look, it's 1:43:53 the awkward couch in the, in the San 1:43:56 Francisco overlooking the 1:44:00 bay. I like it. I'm down. I'm down. All 1:44:04 right, OpenAI, you're winning me over 1:44:05 with your leaning into the 1:44:08 awkward. It's just, it's just like uh, 1:44:12 it's like the AI Learning Lab. We're, 1:44:14 we're neurospicy here, you know? Bring 1:44:17 your strange brains. 1:44:21 I'm Alan. I'm a research scientist at 1:44:23 OpenAI. People tend to say a picture is 1:44:28 worth a thousand words. But being able 1:44:30 to also render like a few words or 1:44:32 symbols can carry like thousands of 1:44:35 pictures, you know, with a relatively 1:44:37 simple prompt like visualize an 1:44:39 infographic explaining Newton's prism 1:44:41 experiment in great detail with a wide 1:44:43 aspect ratio and a dark blue background.
1:44:46 So this is like an example where one 1:44:49 we're going to rely on being able to 1:44:51 render text in useful ways, combine it 1:44:53 with visual elements that actually 1:44:56 ground what this text about this 1:44:59 experiment even means and hopefully, you 1:45:01 know, help students who are more I got 1:45:04 to we we just got to we got to comment 1:45:06 on this. I mean, 1:45:08 this is [ __ ] insane, right? Like this 1:45:11 is from that prompt. It's got to it's 1:45:14 got to go to its large language model 1:45:16 and 1:45:17 understand how it would even describe 1:45:20 all this and then seamlessly translate 1:45:23 that in a way that it can bring it to 1:45:24 life. It's [ __ ] bonkers. 1:45:29 Who are more visual um learn both 1:45:31 through you know language descriptions 1:45:34 of a phenomenon but also you know a 1:45:37 visual imagination of what the 1:45:39 experiment actually looks like. It's no 1:45:42 longer just about making imaginary th 1:45:44 this format rather than the tiny table 1:45:47 format is so it's it's nice. It's just 1:45:49 relaxed. They're just geeking scenes 1:45:50 that look aesthetic and things like 1:45:52 that, but it's really about also the 1:45:55 colors are in the right sequences. Yeah. 1:45:56 It's like there's there's a lot of 1:45:58 information in that infograph 1:45:59 communicating and imagining and doing so 1:46:02 at the same time. 1:46:05 All right, that's that one. Nice. 1:46:11 upload and restyle. Oh, this is what 1:46:13 Danielle did where she uploaded two 1:46:15 different pictures. Hello. Thanks for 1:46:17 inviting me. Hi. Thank you so much for 1:46:19 coming. I'm so excited to talk to you. 1:46:21 Yeah, me too. They're all neurospicy. 1:46:24 It's so funny. It's like they're all 1:46:26 [ __ ] geniuses, but they're all like 1:46:29 eye contact. 1:46:31 I feel the ability of the im generation 1:46:33 of this model is becoming stronger and 1:46:36 stronger. My name is Lou. 
I'm a research 1:46:38 scientist at OpenAI working on 1:46:41 multimodal. Yeah, I'm showing a very 1:46:44 interesting demo today. So um so just in 1:46:49 this studio uh this is something that we 1:46:52 draw. Many people use our tool to 1:46:55 generate comic books. So I'm going 1:46:58 to upload this drawing to ChatGPT now. So I 1:47:04 just type this prompt now and it starts 1:47:07 to generate what will this drawing look 1:47:11 like as a real comic. Many of the times, 1:47:14 especially when you play with the model 1:47:15 more and more, you'll find things very 1:47:17 surprising. So I get this very funny 1:47:21 comic now. I want to replace this dragon 1:47:25 with this cutie penguin. Yeah, it looks 1:47:29 nice. Wow. I'm personally always very 1:47:33 curious about is how does this look like 1:47:36 in real 1:47:38 life? It looks cute. I like it. Nice. 1:47:42 [Music] 1:47:44 I'm digging this. Hello. Thanks for 1:47:46 invite... 1:47:48 Detailed directions. Okay, here's 1:47:50 prompt adherence. I refuse to believe I 1:47:52 am 1:47:55 neurospicy. It's the rest of the world 1:47:57 that's screwy. 1:47:59 I'm the only normal person and 1:48:01 that's scary. I know. It's so funny. 1:48:04 It's, you 1:48:06 know, I don't 1:48:10 know, like before I knew what ADD was. 1:48:15 Like I, I've never really had an issue 1:48:17 where I 1:48:19 judged ADD or my brain or anything 1:48:23 because I, I wasn't diagnosed with ADD 1:48:26 until my mid-30s. 1:48:29 But like I spent an inordinate amount of 1:48:33 time and 1:48:34 energy trying to make up for the fact 1:48:37 that I don't have good executive 1:48:39 function, right? Like I would buy every 1:48:42 organization system out there and every, 1:48:44 you know, back in the olden timey days 1:48:46 with Filofaxes and I, I would just buy 1:48:49 everything thinking that the right 1:48:52 notebook system would be the thing that 1:48:55 would let me organize my life.
And it's, 1:48:58 at some point I just finally was like, 1:49:00 "Oh, you're never gonna be good at 1:49:03 that." And then it was just like, 1:49:05 "Fine, whatever." So, but yeah, that's 1:49:09 one of those wild things. All right, 1:49:10 here we go. Thanks for joining us. Yeah, 1:49:14 no worries. How you doing? I'm doing 1:49:16 good. We're looking at an improved 1:49:18 image. Yeah, this is on OpenAI's 1:49:21 announcement of um 1:49:23 openai.com/index/introducing-4o-image-generation. 1:49:32 ...generation in ChatGPT. It's really good 1:49:34 at instruction following. My name is 1:49:36 Kenji and I work on multimodal research 1:49:39 here at OpenAI. There's a level of 1:49:41 attention to detail that is just not 1:49:43 captured by other models. 1:49:47 The first thing I'm going to show is 1:49:48 like 15 different objects and each one 1:49:51 of them has unique attributes that 1:49:53 make it very different from all 1:49:55 the other objects. An image containing 1:49:57 one, a blue star; two, a red triangle; three, 1:50:00 green square; four, pink circle; five, 1:50:02 orange hourglass; six, purple infinity 1:50:04 sign; seven, black and white polka dot 1:50:07 bow tie... 15 different objects. And 1:50:11 basically what, what this will show is 1:50:13 that this image will just nail pretty 1:50:16 much every single one of these objects 1:50:18 that I've defined. Previous iterations, 1:50:20 DALL-E, you know, image, imagine, things like 1:50:23 that, they would probably get somewhere 1:50:25 on the order of like maybe five to eight 1:50:27 of these at most. It nailed it all. With 1:50:29 increased level of detail you can just 1:50:31 specify what you have in your mind to 1:50:33 ChatGPT. ChatGPT will understand you 1:50:36 better and then generate that image and 1:50:38 it'll be just a very direct mapping from 1:50:41 what's in your mind to what you see on 1:50:43 the screen. All these image generators 1:50:45 nowadays, they look 1:50:46 good.
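The numbered-object test from the clip is easy to script as a rough sketch. The helper names below (`build_prompt`, `score_description`) are hypothetical, nothing here calls a real image API, and the check is only a naive substring match against a text description of the result (for example, one you ask the model to write about its own image).

```python
# Hypothetical helpers for a prompt-adherence check; no image API is called.

def build_prompt(objects):
    """Number each object so every item can be checked off later."""
    numbered = [f"{i}) {name}" for i, name in enumerate(objects, start=1)]
    return "An image containing " + ", ".join(numbered)


def score_description(objects, description):
    """Return (found, missing) using naive case-insensitive substring matching."""
    text = description.lower()
    found = [o for o in objects if o.lower() in text]
    missing = [o for o in objects if o.lower() not in text]
    return found, missing


objects = ["blue star", "red triangle", "green square", "pink circle"]
prompt = build_prompt(objects)

# Stand-in for a description of the generated image (an assumption of this sketch).
description = "I see a blue star, a red triangle, and a pink circle."
found, missing = score_description(objects, description)

print(prompt)
print(f"{len(found)}/{len(objects)} found; missing: {missing}")
```

Substring matching will miss paraphrases ("a star that is blue"), so treat the score as a quick sanity check rather than a real evaluation.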
All right, so it's got prompt 1:50:49 coherence. Great. 1:50:53 Transparent layers. Ooh, this looks 1:50:55 good. 1:51:00 Hi, nice to talk to you. Nice. 1:51:04 This guy's peak awkward. Peak 1:51:12 awkward. Okay. Hey, let's try it again. 1:51:14 When you sit down, give us a smile. Oh, 1:51:19 [Laughter] 1:51:22 okay. Oh, what's wrong, Jim? You got to 1:51:25 go. 1:51:31 Go 1:51:46 transparent layers. Hey, how are you? 1:51:49 Good, good. How's it going? 1:51:52 My name is Jen Fong Wang. I'm a 1:51:54 researcher at OpenAI working on multimodal. 1:51:57 The way how to generate the transparent 1:52:00 image is pretty intuitive, 1:52:02 straightforward. And now let's... This is 1:52:04 like an Apple commercial, too. Every one 1:52:06 of these engineers has a MacBook. ...give 1:52:08 it a try. Let's say the content is cute 1:52:13 puppy cartoon 1:52:16 st. This is the prompt. And uh now let's 1:52:20 see what happened. The model will take 1:52:22 the input and tries to generate the 1:52:25 image. Let's give it some time. Wow. So 1:52:29 now let's see what we generated. This is 1:52:32 the transparent puppy. Another 1:52:36 application is to make it a sticker. 1:52:38 Let's give it a try. Yeah. 1:52:41 We can easily overlay the transparent 1:52:43 image onto any kind of background. Now 1:52:47 we can copy the sticker onto our laptop 1:52:51 and, and then we can paste it here. 1:52:55 Uh, make it smaller so it can be easily 1:52:59 blended with the background. Can you 1:53:01 make me a sticker? You sure? A smart 1:53:04 researcher wearing glasses and a blue 1:53:06 shirt. 1:53:09 Okay, let's give it a try. The director 1:53:13 because they are young with glasses. 1:53:15 He's making fun of him to his face. We 1:53:17 got it. Yeah, I think uh it works pretty 1:53:21 well. Hopefully people love it. 1:53:25 That's pretty amazing. You can just make 1:53:27 transparent images. Wow. Crazy. All 1:53:29 right. Good. Nice. Let's keep 1:53:33 going.
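Overlaying a transparent sticker onto a background, as in the demo, comes down to the standard "over" alpha-compositing operation. Here is a minimal sketch for a single pixel, assuming straight (non-premultiplied) RGBA values in the range 0 to 1. The function name and sample colors are illustrative only, and this says nothing about how ChatGPT itself produces the transparency.

```python
def over(src, dst):
    """Composite RGBA pixel `src` over `dst` (straight alpha, channels in 0..1)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        # Both pixels fully transparent: nothing to show.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        # Weight each color by its alpha, then un-premultiply by the output alpha.
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


sticker_edge = (1.0, 0.0, 0.0, 0.5)   # half-transparent red, e.g. a soft sticker edge
sticker_clear = (0.0, 0.0, 0.0, 0.0)  # fully transparent area around the sticker
wallpaper = (0.0, 0.0, 1.0, 1.0)      # opaque blue background

print(over(sticker_edge, wallpaper))   # (0.5, 0.0, 0.5, 1.0)
print(over(sticker_clear, wallpaper))  # (0.0, 0.0, 1.0, 1.0)
```

Fully transparent sticker pixels leave the background untouched, which is why a transparent PNG can be pasted onto any wallpaper without a visible box around it.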
And then we're back to the 1:53:35 beginning. Okay. Improved capabilities. 1:53:37 We trained our models on the joint 1:53:39 distribution of online images and 1:53:41 text, learning not just how images 1:53:42 relate to language, but how they relate 1:53:45 to each other. Combined with aggressive 1:53:47 post-training, the resulting model has 1:53:50 surprising visual fluency, capable of 1:53:53 generating images that are useful, 1:53:55 consistent, and context-aware. Text 1:53:56 rendering. Okay. Street 1:53:59 signs. Nice. That's good. Reindeer 1:54:03 parking. Multi-turn generation. Because 1:54:06 image generation is now native to 1:54:09 GPT-4o, you can refine images through 1:54:11 natural conversation. 1:54:14 GPT-4o can build upon images and text in 1:54:17 chat context. For example, if you're 1:54:19 designing a video game 1:54:21 character, the character's appearance 1:54:24 remains coherent across multiple 1:54:27 iterations. Okay, this is really good. 1:54:30 This is really 1:54:34 good. So, I could actually probably do a 1:54:37 little story with Champy. Yeah, look at 1:54:39 this. This cat's consistent. Well, at 1:54:41 least in what they showed. 1:54:48 We're gonna have to come back to this. 1:54:49 There's so much 1:54:51 here. All right, we'll, we'll continue 1:54:53 this tomorrow. We'll continue this 1:54:55 tomorrow. That's 1:54:57 amazing. 1:55:01 Um, oh, I got to change 1:55:04 my... stop that share. I got to share this 1:55:07 and share 1:55:09 everything. Go back here. Let's go look 1:55:12 at... Let's go see how our screenplay is 1:55:14 doing. Wait, we'll do our images. Oh, 1:55:16 yeah. There's the woman. Oh, that's much 1:55:18 better. Look at that image, 1:55:20 people. Uh, let's say 1:55:26 uh that's really 1:55:33 good. She is sadly 1:55:36 hopeful reading about spooky quantum 1:55:39 effects. 1:55:41 That's pretty 1:55:42 slick. All right. Very, very, very 1:55:46 [ __ ] cool.
This is, this is, this is 1:55:50 going to be a lot of nights of us 1:55:52 playing and figuring out what this thing 1:55:54 [ __ ] does. All right. Back to uh 1:55:56 Manus. I created some bold signature 1:55:58 lines. Get these [ __ ] serpents 1:56:01 out of my 1:56:03 [ __ ] prison plane. 1:56:08 No. I've had it with these [ __ ] 1:56:10 snakes. 25 years for a crime I didn't 1:56:13 commit, and now I got to deal with these 1:56:14 [ __ ] snakes on this 1:56:16 [ __ ] prison 1:56:17 flight. 1:56:20 Okay. All you did was copy the line and 1:56:26 add 1:56:28 crap to 1:56:30 it. Get creative. 1:56:34 [Laughter] 1:56:39 Oh my 1:56:41 god. All 1:56:44 right. Okay. That would beat Stable 1:56:47 Diffusion on character retention. Yeah, 1:56:49 I want to play with character retention 1:56:52 uh tomorrow. We'll, we'll play with 1:56:53 that. I want to, I want to read up more 1:56:55 on this. Um this is, I think this is a 1:56:58 bigger deal. 1:56:59 Again, I think that most people are 1:57:02 going to not even know that this 1:57:03 announcement 1:57:07 happened and they're going to just use 1:57:09 ChatGPT image generation like they 1:57:11 always have and they're not going to 1:57:14 quite get the magnitude of this. I think 1:57:16 this is a pretty big deal. Um, but we'll 1:57:20 see. Like let, let's test this over a 1:57:22 bunch of days. Okay. When your cellmate 1:57:24 turns into your seatmate and your 1:57:27 seatmate turns into a snake. That's not 1:57:30 bad. The only thing worse than a life 1:57:32 sentence, a death sentence with fangs. 1:57:35 Not bad. They locked me up with killers. 1:57:37 Now they've strapped me in with 1:57:39 vipers. Not bad. Prison didn't break me, 1:57:42 but this plane just might. One bite at a 1:57:45 time. When the judge said flight 1:57:47 risk, this isn't what I had in mind. 1:57:50 Okay, number five is 1:57:53 genius. Five is 1:57:56 genius. Now go write this 1:58:04 thing.
We had to slap it around a 1:58:06 little, but it got 1:58:10 there. It does have character 1:58:12 consistency. That's exciting. Paul 1:58:14 Roetzer said that no one knows about deep 1:58:16 research in ChatGPT, too. Yeah, I... 1:58:21 listen. How do I even say 1:58:24 this? I would say at this point ChatGPT 1:58:28 has escaped me. Like I pay attention to 1:58:31 this stuff all the time. I have not kept 1:58:34 up with just what ChatGPT can 1:58:37 do and how these models are different 1:58:39 and when to use certain ones. Part of 1:58:41 the reason for that is Sam Altman said 1:58:43 when GPT-5 comes out, you're not going to 1:58:45 have to know which models to use because 1:58:48 it'll just figure it out. So, part of me 1:58:50 has just gotten lazy because of that. 1:58:53 Um, but that's just one tool. Like, I, I 1:58:56 feel completely clueless 1:58:58 about 1:58:59 Claude. I feel completely clueless about 1:59:03 Gemini. Um, I've played with DeepSeek a 1:59:06 little bit. I've played with Manus a 1:59:08 little bit. Like I, I haven't, I haven't 1:59:10 scratched the surface with these 1:59:12 things. 1:59:14 Um I'm gonna dig deep on this image 1:59:16 stuff because there's something about 1:59:18 the fact that it understands both the 1:59:21 language... It's like bidirectional, like it 1:59:24 understands what's in the images and can 1:59:26 describe that and it understands, if you 1:59:29 describe something, what that looks like. 1:59:31 That's, that feels really significant to 1:59:33 me. Um this is exciting. Okay, so Manus 1:59:38 is off writing my screenplay. We're 1:59:41 going to have a produced screenplay by 1:59:42 the end of uh May, which is exciting. 1:59:45 We'll be in production by 1:59:47 June. And uh we probably got to lock 1:59:52 down... I don't know if we want Samuel L. 1:59:56 Jackson. That's a little derivative, 1:59:58 don't you think? I think we should find 2:00:01 who's the new upcoming Samuel L. 2:00:03 Jackson.
2:00:05 Yeah, we'll have to figure that out. So, 2:00:08 we're going to have to book them. So, 2:00:09 we might have to push it to like July or 2:00:11 August because of their filming schedule 2:00:13 if they're, if they're in other things, 2:00:15 you know. So, all right. And I got to 2:00:17 work that in between launching my play 2:00:19 on Broadway. All 2:00:21 right. All right, 2:00:23 everybody. Okay. I can finally get 2:00:26 ChatGPT to generate an illustration for my 2:00:28 novel with zero artistic ability. 2:00:30 Beautiful. Love it. Gives me more 2:00:32 reasons to start a new print on demand 2:00:34 brand. It, it sure does actually. 2:00:37 Claude is great for writing articles 2:00:39 actually. Yeah, the fact that, that 2:00:42 this image... I, I mean 2:00:45 um Ideogram was pretty good at this, but 2:00:47 it feels like this, this is going to be 2:00:49 really good at things like t-shirts and 2:00:51 things like that. So, print on demand 2:00:53 might not be a bad call at all. So, all 2:00:57 right everybody, I'm gonna get my ass 2:00:59 out of here. 2:01:01 Have yourself a fantastic evening. Go 2:01:03 play with ChatGPT's new image model. If 2:01:05 you don't know about it, just go play 2:01:06 with it. Tell it to do something. Tell 2:01:09 it to uh make a picture with 20 objects 2:01:12 in it and uh and 2:01:16 and you know, have it describe all 20 2:01:19 and then see if they're all in there. 2:01:21 That'll be a good one. And make 2:01:23 transparent stickers. Claude's great for 2:01:25 article writing. I'm told it beats 2:01:27 ChatGPT for coding. Gemini is kind of 2:01:29 limited for creative short 2:01:31 things. Cool. Have a good night, 2:01:33 everybody. All right. Peace out. See 2:01:36 y'all later.