Apple’s Big Reveals, OpenAI’s Multi-Step Models, and Firefly Does Video

AI transcript
0:00:01 (upbeat music)
0:00:04 – It’s not what a human curator probably would do,
0:00:08 but the weirdness is almost a feature instead of a bug.
0:00:09 Apple feels like the only company
0:00:11 that could actually make that happen
0:00:13 since they have kind of permissioning
0:00:15 and access across all the apps.
0:00:19 Is our model that’s trained on this very limited data
0:00:23 like truly the best model for making that image or video?
0:00:25 Putting my investor hat on,
0:00:28 this is not a case where I’m necessarily worried
0:00:29 for startups.
0:00:34 This was a big week in technology.
0:00:36 For one, Apple held its annual flagship event
0:00:39 in Cupertino, releasing new hardware models
0:00:42 of the iPhone, AirPods and Watch,
0:00:45 plus a big step forward in Apple Intelligence.
0:00:46 We also had new models drop,
0:00:48 including OpenAI’s o1 models,
0:00:50 focused on multi-step reasoning
0:00:54 and Adobe’s sneak peek of their new Firefly video model.
0:00:58 So in today’s episode, we break down all this and more,
0:01:00 including new AI features from Spotify,
0:01:04 getting 70% week-over-week feature retention,
0:01:06 IQ versus EQ benchmarks,
0:01:09 and of course, what it all signals about what’s to come.
0:01:12 This episode was recorded on Thursday, September 12th
0:01:14 in our A16Z San Francisco office
0:01:17 with deal partners, Justine and Olivia Moore.
0:01:19 Now, if you do like this kind of episode,
0:01:22 where we break down the latest and greatest in technology,
0:01:26 let us know by sharing the episode and tagging A16Z.
0:01:27 Let’s get started.
0:01:31 As a reminder, the content here
0:01:33 is for informational purposes only,
0:01:35 should not be taken as legal, business, tax,
0:01:36 or investment advice,
0:01:38 or be used to evaluate any investment or security,
0:01:40 and is not directed at any investors
0:01:43 or potential investors in any A16Z fund.
0:01:45 Please note that A16Z and its affiliates
0:01:46 may also maintain investments
0:01:49 in the companies discussed in this podcast.
0:01:51 For more details, including a link to our investments,
0:01:54 please see a16z.com/disclosures.
0:02:02 Earlier this week on Monday,
0:02:04 Apple unveiled a bunch of new products,
0:02:08 iPhone 16, AirPods 4, the Watch Series 10,
0:02:10 but also Apple Intelligence, right?
0:02:11 So a bunch of hardware,
0:02:14 but it seems like they’re upgrading the software stack.
0:02:17 Olivia, you got early access to this, is this right?
0:02:19 – Yes, it was actually open to anyone,
0:02:20 but you had to go through a bunch of steps.
0:02:23 So with some of these new operating systems,
0:02:26 they offer early access to what they call developers,
0:02:29 but if you snoop around on your phone a little bit,
0:02:31 anyone is able to download it and get access
0:02:33 a few weeks ahead of the rest of the crowd.
0:02:38 I think access to iOS 18 should be launching September 16th
0:02:41 for anyone with an iPhone 15 or 16.
0:02:43 – Okay, so you’ve been playing around with this
0:02:46 for two weeks or so, Apple Intelligence.
0:02:46 What did you find?
0:02:47 What are the new features?
0:02:49 Maybe just break them down first,
0:02:51 and then where’d you find the really inspiring,
0:02:53 wow, this might change the world kind of stuff?
0:02:55 And then where maybe do you think things are falling short?
0:02:59 – So Apple Intelligence is a set of new AI powered features
0:03:01 that are native to the iOS operating system.
0:03:04 So they’re already built in to the Apple apps
0:03:05 and to the phone itself.
0:03:08 And we’ve heard that they might charge for it down the line,
0:03:12 but at least right now it’s free to everyone who has iOS 18,
0:03:13 which is really exciting.
0:03:15 A lot of the features honestly were things
0:03:18 that have been available via standalone apps
0:03:19 that you had to download,
0:03:22 and maybe that you had to pay for for a couple years.
0:03:25 One of the classic examples is what they call
0:03:27 visual intelligence, which is actually just uploading
0:03:30 a photo of a dog and getting a report
0:03:31 on what dog breed it possibly is,
0:03:33 or uploading a photo of a plant,
0:03:35 which is nice to not have to have a separate app,
0:03:37 but is it really game changing?
0:03:38 Probably not.
0:03:41 Similarly, they have a new photo editor
0:03:45 where you can intelligently identify a person
0:03:47 in the background of your nice selfie
0:03:49 and one click to remove.
0:03:49 Is it helpful?
0:03:51 Yes, is it that much better
0:03:54 than all of the tools available via Adobe
0:03:56 and other products to do it in a more intense way?
0:03:59 I would say probably not.
0:04:02 I think we both felt probably the most game-changing
0:04:05 and exciting features were actually around media search,
0:04:08 ’cause everyone has hundreds, if not in our case,
0:04:11 thousands or tens of thousands of photos and videos
0:04:12 saved on our phone.
0:04:15 And I think iOS has slowly been getting smarter
0:04:18 about trying to auto-label people or locations,
0:04:20 but this is a really big step forward.
0:04:23 So now in natural language, you can search
0:04:25 either by a person’s name or description,
0:04:28 by a location, by an object,
0:04:30 and it searches not only photos,
0:04:32 but also within videos, which is awesome.
0:04:35 Our mom texted me earlier this week asking,
0:04:38 do you have that video from a couple years ago in Maine
0:04:41 of the seal popping out as we were kayaking?
0:04:44 And I was able to use the new Apple Intelligence
0:04:45 to search and find it.
0:04:47 Yes, it was like 30 seconds into a two-minute video.
0:04:48 Exactly, you might have seen it.
0:04:50 Yeah, and you would never see it in the thumbnail
0:04:53 of the video ’cause it’s just like the ocean,
0:04:56 but I probably would have just ignored her texts before
0:04:58 ’cause I would have known I could never find it,
0:05:01 and this time I could search kayak, Maine, seal,
0:05:03 and it pulled up the video right away.
0:05:04 Yeah. That’s amazing.
0:05:07 I mean, I like to joke how many gigabytes, terabytes,
0:05:10 petabytes of just food photos exist somewhere
0:05:12 on all of our phones that we’ll never see again.
0:05:15 So sounds like that was maybe the most useful feature
0:05:16 that you found.
0:05:18 Yeah, and it also lets you create things
0:05:20 with that media in a new way.
0:05:23 Everyone remembers those sometimes creepy,
0:05:26 sometimes charming memories videos
0:05:28 where it tries to remind you of what you were doing
0:05:31 one day, two years ago, or some trip that you took.
0:05:33 Now you can actually give it natural language input,
0:05:36 like “put together a movie of all my fun times
0:05:38 at the beach,” and it does that.
0:05:39 So I think that’s something that Apple
0:05:42 is uniquely positioned to do since they’re the one
0:05:44 that kind of owns all of that media.
0:05:45 That was pretty exciting.
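Apple hasn’t published how this natural-language media search works under the hood. The standard open technique, though, is to embed photos and sampled video frames into the same vector space as the text query and rank by similarity. A minimal sketch of that approach, assuming the sentence-transformers package; the file names are hypothetical:

```python
# A minimal sketch of embedding-based media search using a CLIP model
# from sentence-transformers. Apple's actual pipeline is not public;
# the file names below are hypothetical.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")

# Embed the media library once, offline. Videos can be handled by
# sampling a frame every few seconds and embedding each frame.
paths = ["maine_kayak_frame.jpg", "birthday.jpg", "lunch.jpg"]
image_embs = model.encode([Image.open(p) for p in paths])

# Embed the natural-language query, then rank media by cosine similarity.
query_emb = model.encode("a seal popping out of the water near a kayak")
scores = util.cos_sim(query_emb, image_embs)[0]
best = int(scores.argmax())
print(f"Top match: {paths[best]} (similarity {float(scores[best]):.2f})")
```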
0:05:48 The one maybe disappointment for me
0:05:49 or the thing that feels yet to come
0:05:54 is like a true AI native upgrade to Siri.
0:05:56 It feels like, especially since the launch of,
0:06:01 for example, ChatGPT voice, Siri feels so antiquated
0:06:03 as kind of a voice companion.
0:06:05 And they made some improvements to Siri,
0:06:08 like she’s better able to maybe understand
0:06:11 if you interrupt yourself in the middle of a question.
0:06:14 But it still is not action-oriented.
0:06:16 I would love to be able to say like Siri,
0:06:19 call an Uber to this address and have her do that.
0:06:21 And Apple feels like the only company
0:06:23 that could actually make that happen
0:06:25 since they have kind of permissioning and access
0:06:26 across all the apps.
0:06:29 – I mean, I feel like Siri, it almost only comes up
0:06:31 when you don’t want Siri to show up.
0:06:33 But there were a few other updates, right?
0:06:37 Notification summaries, maybe the kind of upgrades
0:06:39 that you would only see on your phone
0:06:41 because that’s where you have that root access.
0:06:42 Do you guys have any thoughts,
0:06:44 you were talking about this, maybe the next evolution
0:06:47 of AI-native software on these devices?
0:06:50 Like when will we get that from Apple?
0:06:53 And maybe any thoughts around what that might look like?
0:06:55 – It does feel like this release was a big step forward
0:06:56 in a couple of ways.
0:06:59 They could have done things like the object removal
0:07:01 or the better search in Photos for a long time.
0:07:02 And they had just not done it.
0:07:04 And I think a lot of people felt like
0:07:06 they’re just choosing not to do it.
0:07:08 They’re just choosing to let the third-party app ecosystem
0:07:10 do these sorts of things.
0:07:13 But I think these releases show that they are serious
0:07:15 about doing some of these things internally
0:07:19 and having them be native to the iOS ecosystem.
0:07:22 I personally will be really curious to see if they do more
0:07:25 around the original purpose of the phone, which is calls.
0:07:28 Historically, there’s very little you can do
0:07:29 around a call, right?
0:07:31 Like you make a call, maybe you can merge calls,
0:07:33 maybe you can add someone,
0:07:35 but like they have not wanted to touch it much.
0:07:37 And I think the new call recordings
0:07:40 and transcripts feature is pretty crazy
0:07:43 because historically they’ve made it impossible
0:07:45 to do a voice recording or anything like that
0:07:46 when you’re making a call.
0:07:48 And now they’re actually enabling,
0:07:50 hey, we’re gonna have this AI co-pilot
0:07:52 that makes a little noise at the beginning
0:07:54 that sits and listens to your calls.
0:07:56 And eventually you could see them saying,
0:07:58 hey, there was a takeaway from this call,
0:07:59 which was schedule a meeting.
0:08:01 And like in your Apple calendar,
0:08:02 it should be able to do that
0:08:04 and send the invite to someone.
0:08:06 – So now if you launch a call,
0:08:07 you can press a button on the top left
0:08:08 that says record.
0:08:11 It does play like a little two to three second chime
0:08:12 that the other person will hear
0:08:14 that says this call is being recorded.
0:08:16 But then once the call is completed,
0:08:18 it saves a transcript down in your Apple Notes
0:08:20 as well as some takeaways.
0:08:23 I think the quality is okay, but not great.
0:08:25 I would imagine they’ll improve it over time.
0:08:29 But again, there’s so many people and so many apps now
0:08:31 that have a lot of users and make a lot of money
0:08:33 from things that seem small,
0:08:35 like older people who have doctor’s appointments
0:08:37 over the phone and they need to record
0:08:40 and transcribe the calls for their relatives.
0:08:42 That actually does a shocking amount of volume.
0:08:45 And so I think this new update is showing them
0:08:48 maybe pushing towards some of those use cases
0:08:50 and allowing consumers to do that more easily.
0:08:51 – Yeah, and maybe just to round this out,
0:08:53 this idea to both of your points,
0:08:55 there are so many third party developers
0:08:57 who have historically created these apps.
0:09:00 I mean, you mentioned the ability to detect a plant.
0:09:02 If you go on AppMagic or data.ai,
0:09:07 like you can see there are pretty massive apps
0:09:08 that that’s their single use case,
0:09:10 but it works, people need it.
0:09:11 What happens to those companies?
0:09:15 What does it signal in terms of Apple’s decision
0:09:16 to capitalize on that
0:09:18 and maybe less so have this open ecosystem
0:09:20 for third parties?
0:09:22 – Yeah, I think it kind of raises an interesting question
0:09:25 about what is the point of a consumer product often?
0:09:26 Like, is it just a utility
0:09:29 in which case Apple might be able to replace it?
0:09:31 Or does it become a social product or a community?
0:09:34 Say there’s a standalone plant app
0:09:37 and then there’s the Apple plant identifier.
0:09:39 You might stay on the standalone plant app
0:09:42 if you have uploaded all of these photos of plants
0:09:44 that you want to store there.
0:09:46 And now you have friends around the country
0:09:48 who like similar types of plants
0:09:49 and they upload and comment,
0:09:52 like it becomes like a Strava for plants type thing,
0:09:54 which sounds ridiculous,
0:09:55 but there’s massive communities
0:09:58 of these like vertical social networks.
0:10:01 And so I think there’s still like huge opportunity
0:10:03 for independent consumer developers.
0:10:04 The question is just like,
0:10:06 how do you move beyond being a pure utility
0:10:08 to keeping people on the product
0:10:11 for other things that they can’t get from like an Apple?
0:10:12 – Yeah, I agree.
0:10:14 I think putting my investor hat on,
0:10:16 this is not a case where I’m necessarily worried
0:10:17 for startups.
0:10:19 I think what Apple showed through this also
0:10:21 is they’re going to build
0:10:22 probably the simplest,
0:10:25 most utility oriented version of these features.
0:10:28 And they’re not gonna do an extremely complex build out
0:10:31 with lots of workflows and lots of customization.
0:10:34 So yes, they might kill some of these standalone apps
0:10:37 that are being run as cash flow generating side projects,
0:10:40 but I don’t see them as much of a real risk
0:10:43 to some of the deeper venture backed companies
0:10:45 that are maybe more ambitious in the product scope.
0:10:52 – If we think about utility, right?
0:10:54 One of the ways that you drive utility
0:10:55 is through a better model.
0:10:57 So maybe we should talk about some of the new models
0:10:58 that came out this week.
0:11:00 We’ll pivot to OpenAI first.
0:11:01 As of today, as we’re recording,
0:11:04 they released their new 01 models,
0:11:06 which are focused on multi-step reasoning
0:11:08 instead of just answering directly.
0:11:10 In fact, I think the model even says like,
0:11:12 I thought about this for 32 seconds.
0:11:15 The article they released said that the model performed
0:11:18 similarly to PhD students on challenging benchmark tasks
0:11:20 in physics, chemistry, and biology,
0:11:23 and that it excels in math and coding.
0:11:25 They even said that on a qualifying exam
0:11:28 for the International Mathematics Olympiad, GPT-4o,
0:11:29 so the previous model,
0:11:31 correctly solved only 13% of problems
0:11:33 while the reasoning model that they just released
0:11:35 scored an 83%.
0:11:36 So it’s a huge difference.
0:11:38 And this is something actually like a lot of researchers
0:11:39 have been talking about, right?
0:11:43 This next step, I guess opening thoughts on the model
0:11:45 and maybe what it signifies, what you’re seeing.
0:11:46 – Yeah, it’s a huge evolution
0:11:49 and it’s been very hyped for a long time.
0:11:50 So I think people are really excited
0:11:52 to see it come out and perform well.
0:11:54 I think even beyond the like really complex,
0:11:57 like physics, biology, chemistry stuff,
0:12:00 we could see the older models struggle
0:12:01 even with basic reasoning.
0:12:03 So we saw this with the whole, like,
0:12:06 “how many Rs are in strawberry” fiasco.
0:12:07 – Yes, yeah.
0:12:08 – And I think what that essentially comes from
0:12:11 is these models are like next token predictors.
0:12:14 And so they’re not necessarily like thinking logically about,
0:12:17 oh, I’m predicting that this should be the answer.
0:12:19 But like, if I actually think deeply about the next step,
0:12:22 should I check that there are this many Rs in strawberries?
0:12:24 Is there like another database I can search?
0:12:27 What would a human do to verify and validate
0:12:29 whatever solution they came up with to a problem?
0:12:32 And I think over the past,
0:12:33 I don’t know, year, year and a half or so,
0:12:37 researchers had found that you could do that decently
0:12:39 well yourself through prompting,
0:12:42 like when you asked a question saying,
0:12:46 “think about this step by step and explain your reasoning,”
0:12:49 and the models would give different answers
0:12:52 on like basic questions than they would get
0:12:54 if you just asked the question.
0:12:55 And so I think it’s really powerful
0:12:57 to bring that into the models themselves.
0:12:59 So they’re like self-reflective
0:13:02 instead of requiring the user to know how to prompt,
0:13:04 like chain of thought reasoning.
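The prompting technique described here is easy to try against any chat-completion API. A minimal sketch using OpenAI’s Python SDK, assuming an OPENAI_API_KEY in the environment; the model name is illustrative, and with o1 the extra instruction becomes unnecessary since the reasoning is built in:

```python
# A minimal sketch of chain-of-thought prompting with OpenAI's Python
# SDK (assumes OPENAI_API_KEY is set; the model name is illustrative).
from openai import OpenAI

client = OpenAI()
question = "How many times does the letter 'r' appear in 'strawberry'?"

# Direct ask: pure next-token prediction often gets this wrong.
direct = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)

# Step-by-step ask: instructing the model to reason out loud first
# tends to produce a different (and more often correct) answer.
cot = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": question + " Think about this step by step: spell the"
                   " word out letter by letter, then give the final count.",
    }],
)

print("Direct:", direct.choices[0].message.content)
print("Step-by-step:", cot.choices[0].message.content)
```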
0:13:05 – I agree.
0:13:06 I think it’s really exciting actually
0:13:09 for categories like consumer ed tech,
0:13:11 where a huge share, in some months
0:13:14 like a majority, of ChatGPT usage is by people
0:13:16 with .edu email addresses,
0:13:18 or who have been using it to generate essays.
0:13:20 But historically it’s been pretty limited
0:13:22 to writing, history, those kinds of things.
0:13:25 Because as you said, these models are just famously bad
0:13:27 at math and science and other subjects
0:13:30 that require maybe deeper and more complex reasoning.
0:13:32 And so a lot of the products we’ve seen there
0:13:34 because the models are limited
0:13:37 have been the take a picture of my math homework
0:13:41 and go find the answer online, which is fine and good.
0:13:44 And a lot of those companies will make a lot of money.
0:13:46 But I think we have an opportunity now
0:13:48 to build deeper ed tech products
0:13:51 that change how people actually learn
0:13:54 because the models are able to correctly reason
0:13:56 through the steps and explain them to the user.
0:13:59 – And when you use it today, you can see the steps
0:14:02 in which it thought about something.
0:14:05 So by default, it doesn’t show you all the steps,
0:14:07 but if you want or need to see the steps
0:14:09 like for a learning process, you can get them.
0:14:11 – I did test it out right before this
0:14:13 with the classic consulting question,
0:14:16 how many golf balls can fit in a 747?
0:14:17 – And?
0:14:20 – And o1, the new model, got it completely correct
0:14:21 in 23 seconds.
0:14:23 I tested it on 4o, the old model.
0:14:25 It was off by 2x, 3x.
0:14:27 And it took longer to generate.
0:14:30 So very small sample size, but promising early results there.
0:14:31 – No, it’s important.
0:14:33 I think I saw you tweet something about this recently,
0:14:35 this ed tech angle or slant on this.
0:14:37 A lot of people are up in arms saying
0:14:39 this technology is being used in classrooms.
0:14:41 And I think you had a really interesting take,
0:14:44 which was like, okay, this is actually pushing us
0:14:47 to force teachers to educate in a way
0:14:49 where you can’t use this technology.
0:14:51 And you have to think and develop reasoning.
0:14:52 – It’s funny, I had found a TikTok
0:14:54 that was going viral that showed
0:14:56 there’s all these new Chrome extensions for students
0:14:59 where you can attach it to Canvas or whatever system
0:15:01 you’re using to take a test or do homework.
0:15:04 And you just screenshot the question now directly.
0:15:07 And it pulls up the answer and tells you it’s A, B, C, or D.
0:15:10 And in some ways it’s like, okay, cheating.
0:15:12 Do you really wanna pay for your kid to go to college
0:15:13 to be doing that?
0:15:16 But on the other hand, before all of these models
0:15:19 and these tools, most kids were still just googling
0:15:22 those questions and picking multiple choice.
0:15:24 And you could argue a multiple choice question
0:15:27 for a lot of subjects is probably not actually
0:15:29 the best way to encourage learning
0:15:30 or to encourage the type of learning
0:15:33 that’s actually gonna make them successful in life.
0:15:35 – Or to even assess true understanding.
0:15:37 Like when someone does a multiple choice answer,
0:15:39 you have no idea if they guessed randomly,
0:15:40 if they got to the right answer
0:15:43 but had the wrong process and they were lucky,
0:15:44 or if they actually knew what they were doing.
0:15:46 – Yeah, and I think the calculator comparison
0:15:50 has been made before in terms of AI’s impact on learning.
0:15:53 But similar to the fact that now that we have calculators,
0:15:54 it took a while, it took decades,
0:15:57 but they teach kids math differently
0:15:58 and maybe focus on different things.
0:16:01 than they did pre-calculator, when it was all by hand.
0:16:04 I’m hoping and thinking the same will happen with AI
0:16:07 where eventually the quality of learning is improved,
0:16:09 maybe because it’s easier to cheat
0:16:13 on the things that are not as helpful for true understanding.
0:16:15 – Right, and I mean, given that
0:16:17 this just came out today,
0:16:19 is this a signal of what’s to come
0:16:21 for all of the other models
0:16:24 or at least the large foundational models?
0:16:26 Or do you see some sort of separation
0:16:28 in the way different companies approach their models
0:16:31 and think about how they think per se?
0:16:32 – It’s a great question.
0:16:35 I think we’re starting to see a little bit of a divergence
0:16:39 between general intelligence and emotional intelligence.
0:16:40 And so if you’re building a model
0:16:42 that’s generally intelligent
0:16:45 and you maybe want it to have the right answers
0:16:47 to these complex problems,
0:16:50 whether it’s physics, math, logic, whatever.
0:16:54 And I think folks like OpenAI or Anthropic or Google
0:16:55 are probably focused on having
0:16:58 these strong general intelligence models.
0:17:01 And so they’ll all probably implement similar things
0:17:03 and are likely doing so now.
0:17:05 And then there’s sort of this newer branch of companies,
0:17:06 I would say, that are saying,
0:17:08 hey, we don’t actually want the model
0:17:10 that’s the best in the world at solving math problems
0:17:11 or coding.
0:17:14 We are building a consumer app
0:17:17 or we are building an enterprise customer support agent
0:17:18 or whatever.
0:17:21 And we want one that feels like talking to a human
0:17:22 and is truly empathetic
0:17:24 and can take on different personalities
0:17:26 and is more emotionally intelligent.
0:17:29 And so I think we’re reaching this really interesting
0:17:32 branching off point where you have probably most
0:17:34 of the big labs focused on general intelligence
0:17:37 and other companies focused on emotional intelligence
0:17:39 and the longer tail of those use cases.
0:17:40 – That is so interesting.
0:17:42 Do we have benchmarks for that?
0:17:44 As in, there’s obviously benchmarks for
0:17:45 how it does on math.
0:17:49 And because we’re not quite at perfection
0:17:51 in terms of utility, that’s what people are measuring.
0:17:53 But have you guys seen any sort of–
0:17:53 – I haven’t.
0:17:56 So I feel like for certain communities of consumers
0:17:59 using it for like therapy or companionship or whatever,
0:18:02 if you go on the subreddits for those products
0:18:05 or communities, you will find users
0:18:08 that have created their own really primitive benchmarks
0:18:09 of like, hey, I took these 10 models
0:18:11 and I asked them all of these questions
0:18:13 and here’s how I scored them.
0:18:15 But I don’t think there’s been like emotional intelligence
0:18:17 benchmarks at scale.
0:18:18 – A Redditor might create it.
0:18:19 – Yeah. – I would not be surprised.
0:18:20 – Yes.
0:18:22 – Maybe, if you’re listening to this,
0:18:24 definitely reach out if you’re building that.
0:18:31 – Hey, it’s Steph.
0:18:34 You might know that before my time at A16Z,
0:18:36 I used to work at a company called The Hustle.
0:18:38 And then we were acquired by HubSpot
0:18:40 where I helped build their podcast network.
0:18:42 But while I’m not there anymore,
0:18:44 I’m still a big fan of HubSpot podcasts,
0:18:47 especially My First Million.
0:18:48 In fact, I’ve listened to pretty much
0:18:51 all 600 of their episodes.
0:18:52 My First Million is perfect for those of you
0:18:55 who are always trying to stay ahead of the curve
0:18:58 or in some cases, take matters into your own hands
0:19:00 by building the future yourself.
0:19:03 Hosted by my friends Sam Parr and Shaan Puri,
0:19:05 who have each built and sold eight-figure businesses
0:19:07 to Amazon and HubSpot,
0:19:08 the show explores business ideas
0:19:10 that you can start tomorrow.
0:19:12 Plus, Sam and Sean jam alongside guests
0:19:15 like Mr. Beast, Rob Dyrdek, Tim Ferriss,
0:19:18 and every so often, you’ll even find me there.
0:19:20 From gas station pizza and egg carton businesses
0:19:23 doing millions, all the way up to several guests
0:19:25 making their first billion.
0:19:26 Go check out My First Million
0:19:28 wherever you get your podcasts.
0:19:30 (upbeat music)
0:19:38 – I think that also relates to the idea
0:19:41 that these models ultimately in themselves aren’t products.
0:19:43 They’re embedded in products.
0:19:48 I think Olivia, you shared a Spotify Daylist tweet
0:19:50 about how that was just like a really great way
0:19:52 for an incumbent, because all of the incumbents
0:19:55 aren’t trying to embed these models in some way.
0:19:57 You said it was a really great case study
0:19:58 of how to do that well.
0:20:00 – Yeah, so Spotify Daylist, we both love this feature.
0:20:02 – I’m gonna bring up mine to see what it says.
0:20:03 This is a risky move.
0:20:04 – It is risky.
0:20:07 – I never share my Spotify Wrapped
0:20:09 because basically it’s just an embarrassment.
0:20:10 – But that’s part of the emotional–
0:20:13 – Okay, mine is gentle, wistful Thursday afternoon.
0:20:15 – That’s actually much better than it could have been for you.
0:20:16 – Yeah, I get a lot of scream.
0:20:18 – I say that lovingly.
0:20:20 – So basically what Spotify Daylist does
0:20:22 is it’s a new feature in Spotify
0:20:25 that kind of analyzes all of your past listening behavior
0:20:29 and it curates a playlist based on the types of music,
0:20:32 emotionally, genre-wise, mood-wise
0:20:34 that you typically listen to at that time.
0:20:37 – And it makes three a day, I think, or four a day?
0:20:38 – Oh, wow. – By default.
0:20:39 – Yes. – And then it switches
0:20:41 out every six or so hours.
0:20:42 – Exactly.
0:20:44 And the feature was very successful.
0:20:46 So Spotify CEO tweeted recently,
0:20:49 I think it was something like 70% of users
0:20:51 are returning week over week,
0:20:53 which is a really, really good retention,
0:20:56 especially since it’s not easy to get to within Spotify.
0:20:59 You have to go to the search bar and search Daylist.
0:21:01 – Mine is pinned now, if you click it enough.
0:21:03 (all laughing)
0:21:04 – That’s not surprising.
0:21:05 – It’s really fun.
0:21:08 And I think why it works so well,
0:21:10 and so many other incumbents have just tried
0:21:13 to tack on a generalist AI feature,
0:21:15 but this one is great because it utilizes
0:21:18 existing data that it has on you,
0:21:20 executed in a way that doesn’t feel invasive,
0:21:22 but instead feels delightful,
0:21:24 and it’s not just like a fun one-off novelty,
0:21:27 but actually the recommendations are quite good.
0:21:30 So you will end up listening to it fairly often,
0:21:33 and that’s why I think people come back week over week,
0:21:36 as well as it still has that novelty of like,
0:21:38 it said something crazy about me,
0:21:40 I’m gonna screenshot on my Instagram
0:21:42 and make sure my friends know
0:21:43 that this is how I’m feeling right now.
0:21:46 – Yeah, the daylists have gone totally viral
0:21:48 among Gen Z teens in particular.
0:21:50 They’re posting all over, like TikTok and Twitter,
0:21:53 like the crazy words in their daylists.
0:21:55 ‘Cause I think what Spotify does is it takes the data,
0:21:57 it runs it through an LLM and asks,
0:21:59 what’s sort of a fun description of this playlist?
0:22:01 But since it’s not a human,
0:22:02 the descriptions are often like,
0:22:05 normcore, black cat frightened,
0:22:07 like panics Thursday morning,
0:22:09 and you’re like, what is this even mean?
0:22:10 – But it resonates a little bit.
0:22:11 – Yeah, but you’re like– – It’s true.
0:22:13 – My mind is right,
0:22:15 but I’m also confused in a way
0:22:16 that will keep me coming back
0:22:18 to see what the next daylist is.
0:22:19 – Yes.
0:22:21 – And yeah, it’s like inherently viral
0:22:24 in a way that I’ve only seen on RAPT,
0:22:26 probably for Spotify.
0:22:27 – I would say another example
0:22:29 of a good implementation of AI
0:22:32 in a similarly both interesting,
0:22:35 but also viral way would be on Twitter, grok,
0:22:37 their new AI chatbot.
0:22:40 A lot of the read my tweets and roast my account
0:22:43 or draft a tweet with this person’s tone
0:22:44 based on all the tweets.
0:22:46 Similarly, that’s taking the existing data
0:22:48 they have on you and creating something
0:22:50 that’s like fun and shareable and interesting
0:22:51 and doesn’t feel invasive
0:22:53 ’cause you’re going and making the request
0:22:55 versus it like pushing something into your feed.
0:22:56 – Totally, yeah.
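Both examples follow the same pattern: run data the product already has through an LLM with a playful prompt. Spotify hasn’t published its Daylist pipeline, so the data shape, prompt, and model name below are assumptions, but a sketch of the idea, again with OpenAI’s Python SDK:

```python
# A minimal sketch of the Daylist-style pattern: existing user data in,
# playful label out. Spotify's real pipeline is not public; the data
# shape, prompt, and model name here are all assumptions.
from openai import OpenAI

client = OpenAI()

listening_snapshot = {
    "time_of_day": "Thursday afternoon",
    "top_genres": ["indie folk", "dream pop"],
    "moods": ["mellow", "wistful"],
}

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": (
            "Write a quirky three-to-five-word playlist title for someone "
            f"listening to {', '.join(listening_snapshot['top_genres'])} in a "
            f"{' and '.join(listening_snapshot['moods'])} mood on a "
            f"{listening_snapshot['time_of_day']}. Slightly weird is good."
        ),
    }],
)
print(response.choices[0].message.content)  # e.g. "gentle wistful thursday afternoon"
```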
0:22:59 – Yeah, and maybe the takeaway is this idea
0:23:01 that the best model doesn’t necessarily
0:23:02 equal the best product.
0:23:03 – Yeah. – Yeah.
0:23:05 – I think he quote tweeted Nick St. Pierre,
0:23:07 who said, “Remember DALL-E 3,”
0:23:09 when it came out and everyone was talking
0:23:11 about how the prompt coherence is so good.
0:23:13 And then his point was,
0:23:15 how many people are still using this model anymore?
0:23:17 And the answer, I think, is not many.
0:23:19 – Yeah, I think there’s a couple of angles to that.
0:23:21 So for a day list in particular,
0:23:25 it’s not the most accurate LLM description
0:23:26 of what your music taste is.
0:23:29 Like it’s not what a human curator probably would do,
0:23:33 but the weirdness is almost a feature instead of a bug.
0:23:34 Like this is sort of an example
0:23:36 of the emotional intelligence versus the general intelligence,
0:23:38 which is like, it knows what the person wants.
0:23:40 So not like a dry,
0:23:44 oh, you listen to like soft country on Thursday mornings.
0:23:46 I think the other part is on the creative tool side
0:23:48 we’ve seen, which is there’s different styles
0:23:49 for different people,
0:23:52 but also like how do products fit in your workflow?
0:23:54 How easy are they to use?
0:23:55 Are there social features?
0:23:57 Is there easy remixing?
0:23:59 Like all of the things that make consumer products
0:24:01 grow and retain historically
0:24:03 can drive a worse model to do better
0:24:06 than a quote unquote, better model.
0:24:09 – Yeah, and I think it differs across modalities.
0:24:13 Again, Spotify is probably using an LLM to generate these.
0:24:16 And it’s not the most complicated LLM in the world, right?
0:24:17 But it’s good enough
0:24:20 to generate interesting enough descriptions.
0:24:22 I would say for most text models
0:24:24 and even most image models,
0:24:27 the gap between like great open source model
0:24:30 or great publicly available model
0:24:32 and like best in class private model,
0:24:36 there’s a gap, but it’s not like a gulf necessarily.
0:24:38 Versus in video and music
0:24:41 and some of the other more complex modalities,
0:24:43 there is still a pretty big gulf
0:24:45 between what the best companies have privately
0:24:50 and what is maybe available via open source or others.
0:24:51 And so I think we’ve seen,
0:24:54 at least if the text and image trend continues,
0:24:58 that will probably shrink over time across modalities.
0:25:00 And what that means again, is it’s not,
0:25:03 does this team have the best researchers,
0:25:04 especially for consumer products,
0:25:07 but does this team have the best understanding
0:25:10 of the workflows, the structure of the output,
0:25:12 the integrations, the consumer behavior
0:25:14 and emotionality behind it
0:25:16 that will allow them to build the best product,
0:25:19 even if it’s not the best model,
0:25:20 but the model is good enough
0:25:22 to fulfill the use case, totally.
0:25:25 – Right, how important is it for these companies
0:25:27 that are using stuff that’s open source
0:25:29 to fine tune it for their own use case?
0:25:30 Like how important is it for them
0:25:34 to modify the model itself versus just being clever
0:25:37 with retention hacks or product design, things like that?
0:25:41 – I think it depends on the exact product and use case.
0:25:44 I mean, we’ve seen cases where people go totally viral
0:25:48 by taking a base Flux or Stable Diffusion or whatever
0:25:51 and allowing people to upload 10 images to it,
0:25:54 and it trains a LoRA of you and makes really cool images,
0:25:56 but the company didn’t fine-tune their own model for it.
0:25:58 – Like all the avatar apps that have made the 10s,
0:26:01 if not in some cases, hundreds of millions of dollars.
0:26:03 Like maybe there’s a fine tune there,
0:26:04 but it’s probably pretty basic.
0:26:08 – Yes, but then I think, so on the consumer side,
0:26:11 usually the base models through prompting
0:26:13 or designing the product experience around it,
0:26:15 you can get like pretty far.
0:26:17 I think in enterprise,
0:26:19 we’re starting to see a little bit more
0:26:20 need to fine tune models around,
0:26:22 I’ve talked to a bunch of companies, for example,
0:26:26 doing product photography for furniture or like big items
0:26:28 where you don’t have to do a big photo shoot,
0:26:30 you can just have AI generate the images
0:26:34 and you might, for that, want to fine tune the base model
0:26:37 on a giant data set of couches from every possible angle.
0:26:39 So it gets an understanding of like,
0:26:41 how do you generate the back shot
0:26:44 when you only have the side or the front shot of the couch?
0:26:46 – ‘Cause the bar is just so much higher there
0:26:49 in terms of the usable output versus a consumer product
0:26:53 where in so many cases, the randomness is part of the fun.
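As a concrete sketch of that consumer pattern: avatar-style apps typically train a small LoRA on a user’s uploaded photos, then load it on top of a frozen base model at generation time. Training itself is omitted below; the checkpoint name, LoRA path, and trigger token are illustrative, and this assumes the diffusers package with a GPU available:

```python
# A minimal sketch of applying a per-user LoRA on top of a base
# diffusion model with diffusers. The checkpoint, LoRA path, and
# trigger token are illustrative; training the LoRA is omitted.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # SDXL and Flux use their own pipeline classes
    torch_dtype=torch.float16,
).to("cuda")

# Load the small LoRA trained on this user's ~10 uploaded photos
# (hypothetical local path to the saved weights).
pipe.load_lora_weights("./loras/user_1234")

# "<user>" stands in for whatever trigger token the LoRA was trained on.
image = pipe(
    "portrait of <user> as a renaissance oil painting",
    num_inference_steps=30,
).images[0]
image.save("avatar.png")
```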
0:27:01 – Well, on the note of models, let’s talk about one more.
0:27:04 Adobe released their Firefly video model.
0:27:06 Firefly was released in March, 2023,
0:27:08 but that was text to image.
0:27:11 And so now they’re releasing this video model.
0:27:13 They released a sneak peek on Wednesday.
0:27:16 They also said, interestingly enough, since March, 2023,
0:27:19 the community has now generated 12 billion,
0:27:22 what they say, images and vectors, which is a ton.
0:27:25 And now they’re, again, they’re moving toward video
0:27:27 and they released a few generations
0:27:30 that were all created in under two minutes each.
0:27:31 What are your thoughts?
0:27:32 – Adobe is a super interesting case
0:27:34 because how they describe their models
0:27:37 is they only train on the safest,
0:27:40 most responsibly licensed data
0:27:42 and they don’t train on user data.
0:27:44 And so I think historically,
0:27:46 they’ve been a little bit sort of hamstrung
0:27:49 in terms of just the pure text to image
0:27:51 or probably text to video quality
0:27:53 because when you really limit the dataset
0:27:56 compared to all the other players in the space,
0:27:58 the outputs typically aren’t as high quality.
0:28:00 I will say where they’ve done super well
0:28:04 is like, how do you bring AI into products
0:28:05 that people are already using?
0:28:07 I don’t know if this is counted
0:28:10 in the Firefly numbers of generations, I guess it was,
0:28:13 but they’ve gotten really good at like, within Photoshop,
0:28:15 you can now do generative expand
0:28:17 where you got a photo that was portrait
0:28:19 and wanted it to be landscape or whatever,
0:28:22 and you can just drop the photo in, hit the crop button,
0:28:25 drag out the sides, and then firefly generates
0:28:27 everything that should have been around
0:28:30 the original image kind of like filling in the blanks.
0:28:32 – And I’ve also seen even like viral TikTok trends
0:28:36 of someone uploading a photo of themselves standing somewhere
0:28:38 and then using generative fill to kind of zoom out
0:28:41 and see whatever the AI thinks they were standing on,
0:28:44 which I think is like reflective of the fact
0:28:48 that Adobe for the first time has made that available.
0:28:51 Like typically they’ve been desktop-based, pretty heavy
0:28:55 in a positive way, like complex products,
0:28:58 but with AI, they’ve now put firefly on the web for free,
0:29:01 they have a mobile app in Adobe Express,
0:29:04 like they’re really going after consumers
0:29:07 in a way that I think we haven’t seen them do before.
0:29:10 I will say like reading the blog posts
0:29:11 for the new video model,
0:29:14 it did seem very focused on professional video creators
0:29:17 and how to embed into their workflows.
0:29:19 Like, okay, you have one shot,
0:29:21 what’s the natural next shot that comes out of that
0:29:23 and how do we help you generate that
0:29:26 versus a pure consumer video generator?
0:29:27 – Yeah, which makes sense I think
0:29:30 because what has really resonated with them in image
0:29:33 is I would say generative fill and generative expand,
0:29:35 which is sort of taking an existing asset
0:29:38 and saying, you know, if this was bigger,
0:29:41 what would be included or I want to erase this thing,
0:29:42 which they really shine in honestly,
0:29:45 like I still use those features all the time.
0:29:46 – Yeah, yeah.
0:29:47 – I know they’ve announced in the past
0:29:51 that they’re also going to be bringing some other models,
0:29:53 video models into their products.
0:29:55 Like I think Sora and Pika and others.
0:29:58 And so I don’t at least see this as their attempt
0:30:02 to like be the dominant killer all in one video model,
0:30:06 but maybe starting to integrate with some of their own tech.
0:30:09 – They have a really interesting opportunity
0:30:12 because they have so many users of saying,
0:30:15 okay, if we just want to have the best possible AI
0:30:19 creative co-pilot is our model
0:30:21 that’s trained on this very limited data,
0:30:24 like truly the best model for making that image or video,
0:30:27 or should we give users a choice between our model
0:30:29 and these like four other models
0:30:31 that we basically offer within our ecosystem,
0:30:34 which I think if they do go that latter route,
0:30:36 which they’ve sort of signaled they will,
0:30:38 is a really interesting distribution opportunity
0:30:41 for startups because most startups have no way
0:30:44 of reaching the hundreds of millions of consumers
0:30:47 at once that are using Adobe products.
0:30:48 – That’s a great point.
0:30:49 So I didn’t even realize this,
0:30:50 but they’ve said that they likely want
0:30:52 to bring in these other models
0:30:54 and they can be the one that says they’re creator-first
0:30:56 and make sure that they’re only using certain rights,
0:30:58 but then they can integrate these other models
0:31:00 and maintain their dominance
0:31:02 with however many people have Adobe subscriptions.
0:31:03 – Yes, exactly.
0:31:06 I think they’ve talked about that extensively for video
0:31:10 and I think they reiterated that with maybe Pika specifically
0:31:11 with the most recent release,
0:31:13 but before they had talked about kind of Sora
0:31:15 and other video models as well.
0:31:16 They’re pretty interesting.
0:31:17 I think even for years,
0:31:20 they’ve allowed outside companies and developers
0:31:22 to build plugins on top of the Adobe suite.
0:31:24 And some of them seem like things
0:31:26 that Adobe itself would want to build,
0:31:29 like for example, a way to kind of productize
0:31:31 your preset editing settings
0:31:33 and allow anyone else to use those.
0:31:35 You might think that Adobe could do that,
0:31:38 but if I were them, I would be thinking,
0:31:40 hey, actually we may not build
0:31:43 the AI native version of Adobe ourselves,
0:31:45 but we will become stickier as a product
0:31:49 if we let others build those AI native tools
0:31:51 and make them usable in Adobe
0:31:54 versus sending those people off to build their own products
0:31:57 and pull users away from the Adobe suite.
0:31:59 I think we still feel like there will be one,
0:32:03 if not several, standalone AI-native Adobes
0:32:06 that come out of this era, but yeah, we’ll see.
0:32:07 – Well, to your point, it does feel like the model
0:32:10 that was shown in their article was more based on,
0:32:12 like you said, people who come with existing content,
0:32:16 who can up-level it or chop it up in some unique way,
0:32:17 but not so much the, as you said, AI-native
0:32:19 “let’s start from nothing,
0:32:21 let’s start from a text prompt” kind of approach.
0:32:23 Well, this has been great.
0:32:24 Stuff is moving so quickly.
0:32:26 So we’ll have to do this again in a few weeks
0:32:27 when I’m sure there’ll be many more models,
0:32:29 many more sandbox announcements, all that.
0:32:31 – Yes, awesome.
0:32:32 Thank you for having us.
0:32:33 – Of course, thank you.
0:32:37 All right, you heard it here.
0:32:40 If you’d like to see these kind of timely episodes continue,
0:32:42 you’ve got to let us know.
0:32:44 Put in your vote by sharing this episode
0:32:49 or leaving us a review at ratethispodcast.com/a16z.
0:32:50 You can also always reach us
0:32:55 with future episode suggestions at podpitches@a16z.com.
0:32:56 Thanks so much for listening
0:32:58 and we will see you next time.
0:33:00 (upbeat music)

This week in consumer tech: Apple’s big reveals, OpenAI’s multi-step reasoning, and Adobe Firefly’s video model.

Olivia Moore and Justine Moore, Partners on the a16z Consumer team, break down the latest announcements and how these product launches will shape the tech ecosystem, transform AI-powered experiences, and impact startups competing for attention in a fast-moving market.

 

Resources: 

Find Justine on Twitter: https://x.com/venturetwins

Find Olivia on Twitter: https://x.com/omooretweets

 

Stay Updated: 

Let us know what you think: https://ratethispodcast.com/a16z

Find a16z on Twitter: https://twitter.com/a16z

Find a16z on LinkedIn: https://www.linkedin.com/company/a16z

Subscribe on your favorite podcast app: https://a16z.simplecast.com/

Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
