Born in Silicon Valley: Revolutionizing Data Preparation for Machine Learning
Back in August, MLtwist CEO and co-founder David Smith had the opportunity to sit down with Match Relevant for a deep dive into the cutting-edge solutions that are transforming the world of data preparation for AI and machine learning models. We talked about unifying data across different annotation systems and streamlining the process for data scientists and ML engineers to give them back valuable time needed to refine and test their models.
0:00
[Applause] [Music] I’m Jake aren vill Royale born and raised in Silicon Valley and here to
0:06
take you behind the scenes to share what it’s like to be a startup Founder The Journey they’re on the problems they face and the products they build in an
0:12
effort to transform Industries I’m excited to have with us today David Smith co-founder and CEO of MLtwist
0:19
David welcome to the show Hey Jake great to be here thanks for having you here we
0:24
have a little bit of a background here we’re both from the Bay Area and we’ll talk a little bit more about that but
0:30
we’ll also get into your company MLtwist before we do that a little bit more about David he is um not just the
0:38
founder but also has held leadership roles at companies like Google Oracle DoubleClick Neustar and has been
0:46
through four Acquisitions he has focused on enabling strategic data for AI and
0:51
launched first-of-its-kind data partnerships with companies like Oracle Google J.D. Power Twitter etc. David holds
0:59
a Bachelor of Science degree in computer science and engineering from UC Davis
1:04
and happy to have him here with us today before we jump in here David where are you calling from today San Jose
1:10
California hundreds of AI startups are launching every month battling to build their founding teams as a leader your
1:16
job is to get results when it comes to hiring that’s where it gets tough so you go out and you try a recruitment firm
1:21
but they don’t understand your story they’re off Target and when they send you candidates it’s a waste of time we believe you should never have your time
1:28
wasted that’s why we launched Match Relevant because your story is more than just an open role it’s your Founder’s Journey the
1:34
problem you’re solving the product you’re building and why it matters when we work with companies we make sure we understand your whole story so we go out
1:40
and do a search we’re on target it’s worth their time they’re interested and more importantly it’s worth yours and
1:46
when it comes to hiring Engineers we work to make sure we get it right by deploying a team of seasoned CTOs that
1:52
have built some of Silicon Valley’s best companies they can collaborate with you in the technical interviewing process
1:57
they can be a sounding board or they can run for you when it comes to building teams there’s no time to waste let’s
2:03
make it count if you have a role that needs to be filled book a time with a hiring guide at matchrelevant.com
2:08
and learn how we do it great we’re both from there so it’s surprising we don’t
2:14
have more of a connected Network at some level maybe we do probably be whole other
2:22
podcast exactly both been at Oracle and I’m sure there’s some lines across at
2:28
some point but anyway um yeah want to um just kind of walk through um your background a little bit why don’t you
2:34
kind of give us a how did you get into technology and into the startup ecosystem yeah so I guess technology I
2:43
I’ve always been you know that that kind of stereotypical Dungeons and Dragons
2:49
and into computers at a young age all that stuff Magic the Gathering in high
2:54
school and I ended up um getting a computer engineering degree from
3:00
UC Davis but I graduated during the bust so I couldn’t get a job right away uh
3:09
out of college and I actually ended up in Europe uh doing different things I
3:15
was a bartender at one point uh in Paris which was pretty cool and uh ended up
3:21
working as a sales engineer for uh at the time what I didn’t know but uh was
3:27
an early iteration of NLP and then I ended up
3:46
eventually joining a company that got bought by DoubleClick that company got bought by Google got to be a part
3:53
of that and then from then on ended up working at some other companies like MarketShare that got bought by Neustar
4:00
that got bought by Golden Gate Capital and I left them to join Oracle just before Golden Gate sold that business to
4:07
TransUnion so it’s been a crazy ride and I’m back in Silicon Valley yeah that’s great yeah
4:16
there’s a lot of twists and turns in your career I like the title of
4:21
your company MLtwist um you know there’s so much conversation happening
4:27
around AI and data and what it all means um talk to us a little bit about what
4:34
inspired you to start your current company with the background you have and what was the problem you saw that you
4:40
thought you know what there’s an opportunity here to to dig in and build something I’ve been dealing with data
4:46
for a long time like at all those companies that I was talking about even the one in Europe data has always been kind
4:53
of front and center and AI sort of added a whole new dimension to everything so
4:58
data’s always been interesting to work with where did you get it who’s touched it where has it
5:04
been what rights do you have and that’s before you even start talking about the quality right so what I thought was
5:12
really interesting was when AI so back in 2021 or 2020 when AI was becoming
5:18
more and more prominent and even before then I’d been heavily involved in in making sure data was ready for for
5:24
models whether they were machine learning or some of the early Bayesian models it dawned on me that there’s a
5:32
lot more to be done around around working with data specifically for AI
5:38
that some of the more traditional ETL extract transform load Technologies
5:43
didn’t really deal with so that for me was kind of an opportunity to say hey I
5:48
love working with data there’s something some good I think I can do out there let’s let’s give this a shot and and and
5:55
see where it takes us yeah you know for engineers this is probably very easy to
6:02
understand but for the Layman that is not a data engineer or really isn’t technical when we talk about LLMs large
6:10
language models and ingesting data you know it’s bringing data into systems and
6:16
models that you can then take and and hopefully make better decisions with that data create some sort of
6:23
opportunity out of it maybe it’s helping optimize how you run your business or maybe Revenue opportunities but you talk
6:30
about strategic data that you provide for companies kind of let’s take a step
6:35
back and and talk talk to us about in the data space what what are you solving
6:41
for the companies that might already have LLMs or they might already be you know pretty deep into trying to make use
6:48
of their data where do you fit there yeah so a lot of our customers they
6:54
actually have already built the AI the first the first iteration of it and a lot of them go oh this is we we can do
7:01
this we need it to be better right so a lot of our customers are trying to solve some problem that problem might be I
7:08
need to identify a fault in a in some material coming down a manufacturing
7:14
line it might be uh I have a news article and I want to automatically
7:19
classify all of the you know all of the references to buildings it might be an
7:24
investor who’s like I want to take uh the transcript from an earnings report and automatically you know extract all
7:31
the all the information or look look for certain signals so companies the the the
7:37
thing that’s tempting and the thing that’s also you got to be careful with is AI is it can kind of let your
7:42
imagination run wild so a lot of these companies are doing different things we’ll talk about some of the specifics
7:48
later but what you’ll do is you’ll typically take some thing that you have
7:55
throw it at AI get AI to do something to to make a very basic understanding of
8:00
what what you’re throwing at it and then you’ll go great I need you to do that every single time and that’s when it
8:07
gets tricky because what you’ll find is that the model will do pretty well on like a very small data set or a data set
8:13
that you threw at it but to get it to do better and better on more and more
8:18
versions of that data let’s say it’s you know back to the crack in some product coming down the line the cracks
8:25
might look different there there’s always going to be like an exception to the rule and what you end up with is you
8:31
end up with a need to throw more and more data at your model and that’s kind
8:36
of when you start to decide okay am I am I am I going to do this myself is this
8:42
is this my thing I’m gonna you know or or do I partner do I use other
8:48
technologies that are more into getting the data ready for my model and then I can just focus on building the model
8:55
itself so when you look at a company for example I don’t know the US government
9:01
or one of your clients by the way or you know an organization that’s got tons of
9:07
data they’ve already put an LLM together they’ve got the data in there and then
9:12
it starts to crack or break or just isn’t functioning when they start layering more data in there what does your
9:20
platform do or how do you help solve that problem so if you look at the
9:26
AI data space right now you’re going to see hundreds of companies that are all
9:31
building Technologies focused on AI data and what MLtwist is more focused on is
9:39
the the process of getting that data ready so my co-founder Audrey she’s
9:46
she’s in data operations it’s a it’s a fairly it’s a role that’s becoming more and more prominent so effectively the
9:53
people like who get the data ready for the AI models and they’re not necessarily Engineers or data
9:59
scientists like they can be lawyers looking through client briefs and trying to tell a model what what they’re
10:05
looking for or it can be you know we talking about the government could be a security specialist That’s like looking
10:10
for something uh that somebody should not have on their body so these people
10:15
they kind of need to be set up in an environment where they can easily identify use their human you know their
10:22
their intelligence their expertise to to Signal stuff to a model and then have
10:28
the model pick up on it and MLtwist is focused on making that happen at a
10:33
super high level so what you end up with if you go into the weeds is you kind of
10:40
end up with a hundred things that need to happen and if any one of those things doesn’t go the way it should it can
10:47
actually ruin the other pieces so MLtwist kind of facilitates all that by
10:54
making those 100 things more automated and giving people the flexibility to use different data tools to work on the data
11:01
that they need and then change the data tool or keep that data tool and add a
11:07
new one for some new data that they’re working on in in a in an environment that’s kind of like no code so you don’t
11:14
need to write code to to get this to work and and it’s giving these data
11:19
operations people the ability to have the flexibility to work on data without needing to be data scientists
11:27
or uh Engineers themselves that’s great you know creating a product or a
11:32
platform or a solution is one thing but actually getting it to the right people is another when you’re out there
11:39
presenting or pitching or getting new customers who are you selling to is it
11:44
the people you talked about that don’t have to be Engineers that are going to use this no code platform to do their
11:51
job better and if it is them how are you getting to them like what’s what’s your go to market strategy that’s working for
11:57
you there yeah so it’s uh surprisingly it’s actually well surprisingly it’s it’s product managers so often times
12:04
product managers have an AI remit they are mandated to either build AI or
12:10
integrate AI into a product that they’re managing so they typically collaborate with a team of data scientists to make
12:17
that happen and eventually those data scientists will either say we need help
12:22
or there’s some sort of model drift so the model you know was doing great and now the model’s not doing great anymore
12:28
and we think we have data problem so we’ll typically talk with them to to
12:33
talk through okay hey like what talk us through your process what tools are you using how are you going about it what
12:38
are you trying to do with the data what kind of data do you have those are all things that are are are organic to the
12:45
conversation um data operations people are typically within somewhere within
12:51
those teams they’re influencers and and and they’re very critical to that conversation as well um so it’s kind of
12:58
like when you look at AI it’s actually it’s a massive team effort across engineering operations experts and and
13:06
and typically product management is one of the roles that leads that yeah that’s
13:11
great so just walk me through the product itself so if you’re listening and you’re a product manager and maybe
13:17
you’re involved in the data strategy for your company and maybe you’re hitting some walls or just in general trying to
13:23
figure out like hey what are the tools out there that can really make my job better what do they get are they logging
13:29
into a system with a dashboard are they seeing like data buckets that they need
13:35
to connect like what’s what’s it look like walk us through that a little bit yeah that’s right so if you to to kind
13:41
of like in an Ideal World if you could wave a magic wand and just have your data ready for AI that you would you
13:48
would probably do that right like if if that option existed that’s what you would do because you’re not actually
13:55
focused like your business is not going to to really make margin by getting the
14:00
data ready your business is going to make its margin by actually building good AI that works and then is is correctly deployed and and and continues
14:07
to advance usually those are part and parcel so the AI is is linked to the
14:12
data prep our platform is someone logs into a dashboard points it to
14:18
unstructured data where it lives can decide hey this unstructured data is going to go through this type of AI to
14:24
take a first pass at it it’s what we call pre-labeling then that data then goes into an AI data tool again this is
14:32
stuff that um the customer can select or if they don’t they they they have options if they’re not they’re not sure
14:38
what to select they assign it to experts so either within their own company or
14:43
people that they approved outside of their company to work on that data and then after that the data goes through a
14:50
semi-automated quality control process where MLtwist looks at the raw JSON the files that get created from
14:58
that whole process that is effectively the work that the person did and creates a report we call it an HRR
15:05
human readable report um that takes all this language called JSON and turns it into something that is easier to
15:12
understand and then when that’s all said and done they hit the go button and then the data takes it from whatever tool
15:19
they were using in whatever format it it spits out data in and changes it into
15:24
the format that the data scientists actually use for their own models so as a product manager as a user you don’t
15:30
really have to know all of that all you really are is like hey I want to work on
15:36
some data today I want to process this data I want to you know so so you don’t
15:41
you don’t really need to know behind the scenes what’s going on you’re more like point it at data visualize it work on it
15:47
and then eventually get it to the next tool got it so what’s the
15:53
benefit to the company or to the product manager by using your platform so these are all things that
15:59
they need to figure out right so so most of the industry today in our experience
16:05
actually are not like using there’s a lot of tools there’s billions of dollars that have been spent on different tools
16:11
out there for AI data it’s still interesting to me that oftentimes a lot of the companies we talk to do
16:17
open-source stuff they’re like no this tool is unique to us it does the thing that we need to do but eventually what
16:25
data scientists want to get out of their data continues to evolve and and outpaces the tooling that’s available in
16:31
the market today so these product managers will probably need to either build their own tools
16:38
themselves and hire a bunch of Engineers and to maintain build maintain and then continue to upgrade or they’ll take a
16:45
third party tool and even then the third party tool has an API every single tool
16:51
worth its salt has has an API integration because they know things need to be prepared in a certain way
16:57
before it can be pushed to their tool and then worked on with within their tool so those product managers also need
17:03
to get engineering resources to do the API integration um and then what happens
17:09
is typically they’ll have an update where they need to keep that tool but there’s like a new data file so maybe
17:15
you were working on video and now you need to do text or audio and now you
17:20
need another tool and you kind of need to like continue to to build this these
17:25
Integrations to push your data into these tools pull them from these tools and
17:31
effectively that’s your world if you’re a product manager MLtwist kind of comes in goes hey you don’t need
17:38
to do that like you you can just use the platform it’ll push the data where it needs to go and then it’ll it’ll pull
17:44
the data and then you still get to leverage all these amazing tools that are out there in the ecosystem that’s great sounds like it
17:51
solves a lot of problems and complexity where everyone’s still trying to figure out what to do with the data and I think
17:57
the biggest question is if you already set it up and and your your system starts to break down like who do you go
18:04
to it sounds like you can help with that aspect of it um you know when you talk
18:11
about Ai and there’s so many different routes to go in it it’s I think the next
18:16
transformation that we’re hitting we’ve both been through the dot-com boom and bust and so you know we’ve gone
18:22
through different Transformations from Cloud to mobile now AI uh I think is probably going to be bigger than any of
18:29
them except for the internet obviously but when you look at it um you know you
18:36
were you were sharing that the we’ve already ingested the world’s data on the
18:41
internet in within you know a year or maybe a couple years um kind of where do
18:47
we go from here I mean yeah what’s give us your perspective there yeah it’s
18:53
pretty crazy I mean even a year or two ago this idea that like um so there’s
18:59
reports out there that OpenAI effectively have like gone through all of the world’s all the data they can get
19:05
their hands on to train their llms um I don’t know if that’s correct or not they have they haven’t commented on it but to
19:11
me is an astounding concept and there’s two there’s kind of like three it’s it’s
19:17
sort of like pick a path right um and there’s different camps on what to do so
19:23
one Camp is saying well we’re generating new data at a rate that is unprecedented so like just wait a year and then you
19:29
have more data to feed your llm um there’s another Camp that’s like oh well
19:34
we should create data you know we should do augmented data synthetic data these other Concepts we should we should like
19:41
create data to train them all um but I think the camp that is really has kind
19:46
of made the most Headway in terms of ability to pursue that and get results is going back to the data that you
19:54
ingested and then improving the quality there’s a lot to be talked about in
19:59
terms of what does that actually mean but I believe that’s what we’re seeing so we are now seeing an uptick and
20:06
people first off like there’s Gartner reported that 4% of companies in a in
20:12
one of the reports that they ran said that their data is AI ready so yes there are some companies that are ahead of the
20:18
curve but the average company probably does not have its own data ready for AI and for those companies that are ahead
20:25
of the curve and kind of like gone through everything in the kitchen sink what we’re seeing is that those companies are going back to the data and
20:31
going oh we need to we need to make this data better to improve our AI in fact
20:37
arguably that’s better than throwing more data at it there’s research by Andrew Ng professor over at Stanford who
20:45
says that and he you know one of the thought leaders in AI he says he he had
20:50
a presentation that showed that you need to throw roughly two to three times the amount of data to your model to get the
20:58
same type of performance as if that data was of good quality versus like
21:03
okay quality and there’s other research that shows if you throw bad data at your model and quality that’s lower the model
21:10
actually degrades and goes backwards in performance so I think you’re going to see a lot of people going to their data
21:16
sets and then going okay let’s let’s make this data better yeah well you know
21:22
we’re hearing from almost every CEO of Fortune 500 companies that we talk to
21:27
that they need a strategy in place for AI and they’re going to their CTO and their data teams and saying what are we going
21:33
to do with AI what’s our strategy how can we become a better company generate more Revenue whatever the strategy is
21:40
that it’s all centered around the data so there’s this huge gap of what providers can do for these companies
21:47
that are still trying to figure out their own strategy and then once they do have a strategy what tools they’re going
21:52
to use that are going to help them maximize their people and their time to to get the value out of it so
21:59
I I really I like the space you’re in I know there’s a lot of competition out there what’s the biggest challenge you
22:04
face today as a company I would go back to what I mentioned before a lot of people are
22:09
still uh so one of the things I get told all the time is hey I’m using open source it’s free and what you tend to
22:17
find is you tend to find well you still do need to pay for the engineers who are supporting the open source
22:23
tooling and being able to continue to build on that so one
22:28
of the concepts is just companies getting further along the AI curve is kind of the way I think about it because
22:35
eventually there there’s a reason why these other platforms the other data tools exist these other companies exist
22:42
and it’s because eventually things things kind of hit a Breaking Point and it makes sense to to to take advantage
22:49
of some of the third-party technology that’s out there um so I guess what
22:56
I’m really saying is it’s a long road we’re at we’re all at the beginning of
23:01
it when it comes to AI a lot of companies have a lot of initiatives and as they move from kind of this let’s
23:08
experiment and get some ideas down to hey we really need to improve the performance of of the models that we’re
23:15
building or the models that we’re leveraging that’s kind of going to naturally and organically push teams to
23:22
see if there are other companies who’ve built solutions that can help them navigate some of the craziness that’s
23:27
going on in the world of makes sense I I want to talk a little bit about data pipelines you know they’ve been around
23:33
for a long time why did data pipelines for AI need to be different
23:39
yeah data pipeline so so data pipelines have always been like it’s it’s kind of notorious they’ve always been around
23:45
they’ve always been these things that have been they are they are special you you typically every company has a team
23:52
of data Engineers supporting data pipelines AI data is weird what happened with AI data is you had this idea of
24:00
quality so back in the day uh let’s say that you needed to export your customer list from Salesforce to
24:08
Google Ad Manager well you have a list of people and the question is pretty simple did that list make it over to the
24:14
other side right it was like a checkbox a 0 or 1 there’s a way to know if you did the job the data was good was there
24:21
any corruption did some of the names not fully make it across things like that what AI
24:27
did was it introduced this concept of quality in a different sense where it
24:32
was more based on a human’s on a person’s perception so you could have
24:38
done everything right the format’s right the data you know the data
24:43
made its way across but if the thing that is being described was wrong like you say it’s a cat and it’s a
24:50
dog or things like that um it can look right but it can still be wrong and
24:56
that’s where AI data started to add complexity it’s this idea that we’re now Beyond data engineers and data
25:04
scientists and and and computer engineers and that AI is actually like it’s a human it’s a human concept and
25:11
with that you need subject matter experts to effectively get involved in
25:16
the data pipeline process and say yeah that that looks right that that looks wrong um and that has caused all sorts
25:23
of differences in the way that data needs to be worked on that are sort of
25:28
very off-the-beaten path of like I’ve got a database here I’m trying to get the data over to that that side and you
25:35
know and and then we’re we’re good so for me that was an opportunity to say
25:40
hey like let’s let let’s start to create technology that is very very tuned to
25:46
this to the concept of AI data versus some of the other ETL that’s out there
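
The contrast David draws here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MLtwist's implementation: the names `LabeledItem`, `transfer_ok`, and `label_agreement` are hypothetical, and the "expert review" is a hand-written dictionary standing in for a subject matter expert's judgment. The first check is the classic yes/no ETL question (did every record make it across); the second measures whether well-formed labels are actually right.

```python
from dataclasses import dataclass


@dataclass
class LabeledItem:
    # Hypothetical record: an item plus the label an annotation tool produced for it.
    item_id: str
    label: str


def transfer_ok(source_ids: list[str], destination_ids: list[str]) -> bool:
    """ETL-style check: every record arrived, with none missing or duplicated."""
    return sorted(source_ids) == sorted(destination_ids)


def label_agreement(items: list[LabeledItem], expert_review: dict[str, str]) -> float:
    """Share of expert-reviewed items whose label matches the expert's answer."""
    reviewed = [item for item in items if item.item_id in expert_review]
    if not reviewed:
        raise ValueError("no items were reviewed by an expert")
    matches = sum(1 for item in reviewed if item.label == expert_review[item.item_id])
    return matches / len(reviewed)


items = [
    LabeledItem("img-1", "cat"),
    LabeledItem("img-2", "cat"),  # well-formed, but the expert says this is a dog
    LabeledItem("img-3", "dog"),
]
expert_review = {"img-1": "cat", "img-2": "dog", "img-3": "dog"}

# The transfer check passes even though one label is wrong.
delivered = transfer_ok(["img-1", "img-2", "img-3"], [item.item_id for item in items])
agreement = label_agreement(items, expert_review)
print(f"delivered intact: {delivered}, expert agreement: {agreement:.2f}")
```

The point of keeping the two functions separate is that the first can be fully automated, while the second only means something once a person with the right expertise has looked at the data.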
25:51
got it you know there’s a lot of Buzz about AI and transparently you know
25:58
all of our customers today are AI startups and that wasn’t the case you know a couple years ago we know there’s
26:06
real opportunities and Innovation happening and a lot of money being thrown at that Innovation there’s also a
26:12
lot of buzz on social media what do you think the current state of AI is from your
26:17
perspective so I mean ml twist started genan 2021 I
26:23
think AI didn’t really get its its oh my gosh this is real moment until
26:28
you know when OpenAI did their thing and that was several years
26:34
later and it’s really impressive it on the flip side a lot of our customers are
26:41
still interested in the good old-fashioned image recognition other parts of uh of other
26:49
types of AI that are out there machine learning so I’m not trying to say that
26:55
there a lot of companies raising money because everyone now has always understood and I think now now really
27:00
understands that this is game changer this is like you said it’s you know how do you compare it to to the the to the
27:07
internet um on the flip side there’s going to be
27:13
a curve of companies there’s a difference between B to C and B2B B to C
27:18
you can have a large language model hallucinate and it’s like it’s like you know with search like you can have for
27:25
search results that are not very good that’s that’s okay in the B2B world you’re going to find a lot of use cases
27:32
where that’s not okay and what you’re seeing is you’re going to see I think a lot of businesses try to figure out how
27:39
do we take um these people who who are very very good at their at what they do
27:46
and then use AI to help them that is still a a thing that’s progressing so I
27:51
think it’s a long-winded answer to say there is a lot of opportunity and at the
27:57
same time I think that there are there’s still a long way to go before we we
28:03
really start to see I think what AI is going to how it’s going to transform
28:08
businesses yeah well I I think that makes a lot of sense I know we’re kind
28:13
of in the early stages it seems like we’re progressing very quickly you know as a company you often times have a
28:19
North star and you shoot for it but you make a lot of left and right decisions has has there been any major pivots or
28:25
shifts you’ve had to make when you started with your concept and where you’re at today as a company yeah so one of the things that
28:33
happened to us fairly early was we won an award from the US government that
28:38
was more focused on Building Technology to extract to interpret data and from
28:47
that we were like oh okay we can do this so not only do you have to grab the data and and kind you also have to figure out
28:53
the AI piece of of of interpreting it um
28:58
what we noticed was that we were spending a ton of our time working out the the the data flow piece and not as
29:06
much time in the modeling piece so it’s this program that the Department of Energy awarded us called an SBIR Small
29:14
Business Innovation I believe Research award from that we kind of did away with
29:21
with the idea of okay we should be building the the the AI thing and we should instead be far more focused on
29:27
just getting the data ready and then allow our customers to focus on building
29:33
the models with the idea that eventually the models can be plugged into the data so that they can assist with making with
29:40
the data preparation process um but let customers like focus at what they’re good at and then let us and then there’s
29:47
enough work to do on the data prep side so I would say the reason why I would call it kind of a failure is we did not proceed
29:53
to like the phase two of that award uh but in doing the phase one and building what we had uh built we
30:00
realized there were applications for other other entities other companies so for example we recently did a
30:07
webinar with Sandia National Laboratories that uses MLtwist to process data for
30:13
the TSA and get that data ready we’re not building the threat detection
30:18
algorithms we’re fully focused on just the data processing the data transformation the data labeling getting
30:24
everything ready for those companies so in a way by not advancing to the phase
30:30
two and by having built in this this data processing piece within the phase one that we were able to to kind of
30:37
continue going forward with it was a kind of a a major transformation for the
30:43
company of okay this this is something that people want and people and people need and and and we took the business
30:49
that way yeah that’s great you know every company goes through breakthroughs
30:54
personally as well as sometimes technically or otherwise what’s been a breakthrough for you that you felt has
31:01
really been an accelerator for the company I would say there’s there’s a couple things um the biggest is just the
31:09
advancement of the so so in that story I just said at the time the base models were still not that good nowadays you
31:16
plug into you know so we’re on all the different clouds uh GCP
31:22
Azure AWS they all have these base models that are getting better and better and better so for me a
31:29
breakthrough like Facebook just launched their Segment Anything Model version 2 which allows you to throw an image at it
31:35
and then it does a better job of like trying to identify what’s going on so the idea that you can that
31:41
the world is multimodel and that you can throw a lot of these models to data and
31:47
then get your data quality up by still integrating expertise I think for
31:53
us has been uh has been a big thing the other one was and this is going to sound silly but the HRRs the human readable
32:00
reports uh were kind of transformational that was sort of inspired by Audrey co-founder who ended up saying listen
32:07
like that’s great that you’ve got all these JSONs but like me and the rest of the team don’t understand what the heck’s going on in these things like
32:13
like you could but they’re a nightmare to comb through so the other breakthrough was um taking not
32:20
the it was effectively taking that and turning it into something that people could actually look at and derive
32:27
insights from which then influenced them going back to the data and fixing things that looked off or checking
32:33
things that looked off and validating no they look off but that’s actually correct um so those are two
32:39
things that I think were fairly uh big for MLtwist yeah that’s great well
32:46
we’re heading into 2025 already you know a quarter a little over a quarter away
32:51
what are you excited about what’s on the road map for MLtwist the goal for MLtwist was
32:58
always to take what we’re doing and then uh make it so that uh a person a data
33:04
operations person doesn’t actually need to know what’s happening behind the scenes so today if you look at the
33:10
platform uh customers are still fairly aware of hey we need to use this technology and that technology or they’ll
33:17
use some of the defaults that we have but the ability to effectively start to
33:22
automate those concepts like I was asked by someone close to me who’s a stamp
33:28
collector they said hey I need to get these I have like a poster of all
33:33
these stamps and I need to effectively visualize each stamp one by one and then I
33:39
need to identify which stamp belongs to where it is on that poster and I was
33:44
thinking to myself this is you know as as AI starts to advance we’re going to
33:49
get more and more people who are not data scientists who are not Engineers who want to do cool stuff but need help
33:57
like organizing and if you’re going to build AI you actually do have to start you do have to use the data
34:03
you have to you’re going to have to do that yourself and apply your own intelligence your own magic but then
34:09
once you apply it you can then fairly easily port that to some of the base
34:15
models and train them and then you’re off to the races so in terms of where we’re working to go
34:22
it’s important that we get the core right and then after that can we take the understanding people have to
34:30
have about their AI data pipelines and kind of make it so that they do not need to know those details and they can just
34:37
focus on what they’re trying to do so that is kind of our take of where AI would fit in our world for
34:43
ourselves on things that we can build and develop yeah that’s great well you
34:48
know every company that starts has a journey and a path it sounds like you’re on a good one what’s something you wish
34:54
you would have known before you started the company was that on the list of questions I’m kidding so I think what
35:03
would have been good to know was when you raise like how important
35:11
raising is um I kind of I think a little bit naively thought hey I’m a coder I’m
35:18
going to build something that’s useful and then we’re going to get lots of customers and continue to do that
35:25
what I’ve learned is that there are people who can do that um but in
35:32
today’s world getting investors on board who believe in what you’re doing
35:38
believe in you know like what you’re doing and then want to be a part of it um has been really really powerful
35:47
um so I think before a lot of times you’re like just very focused
35:53
on the product and then you realize no like you need to have backers and then that translates into the team you need
35:59
to make sure that the people who are also especially when you’re a smaller team are on board also believe in
36:06
what’s happening and are aware of the problems that are out there so it can start if you’re a solo founder it
36:12
can start with an investor but then it very quickly translates once you use that Capital to bring on Engineers to
36:19
bring on teams to the people around you so I think another
36:24
it’s just the old adage if you want to go fast go alone if you want to go far go together I would say that that’s
36:31
probably one of the biggest things that I um I have a new appreciation for uh
36:38
now that we’re a few years down the road yeah well that’s great I love that
36:43
hopefully others will learn from that too um as we wrap up here David I want to thank you for your time and really
36:50
having the courage to come on and tell your story and for all the listeners spending your time with us today it means a lot to me that you’ve made it to
36:57
the show um I’m the host Jake Aaron Villarreal signing off for now but can’t wait
37:02
to catch up with you all on the next episode until then David everyone else
37:08
take care if you like what we’re doing don’t forget to subscribe leave a review on Apple Podcasts or wherever you listen
37:15
and follow us on YouTube where we go behind the scenes to learn what it takes to be a startup founder
37:21