Let’s say you want to create an AI bot on topic X, so that when you ask it in-depth questions about that topic, it can understand you and answer in natural language.
There are two ways to do it (that I know of):
- You fine-tune an existing model with lots of examples
- You use an external vector db with embeddings of the knowledge, run a semantic search against it, and return the matches
What would be the differences between these two approaches?
Let’s give an example of creating a legal AI bot:
- It seems that with fine-tuning, you are mostly tuning the model to do the right kind of reasoning and “how to” respond to queries, as opposed to actually feeding it the complete knowledge of the law plus the lawsuit at hand.
- So you will need to store all the law knowledge plus the lawsuit data into an external vector db to be queried.
If you just do the vector embeddings, it will be able to return the information you need very well, but it probably won’t be very good at using those pieces of information in its reasoning, is that right? Or would the base model already be smart enough to handle that?
And if you only do fine-tuning, honestly I wouldn’t even know how you would fine-tune a model on a huge corpus of new information. Could someone explain how that would work?
Is my understanding correct?
data into an external vector db
You don’t need a “vector DB”. You can use any “normal” SQL DB to store vectors. I use SQLite, MySQL and PostgreSQL to store vectors in the same DB table row as the text.
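To illustrate the point about keeping vectors in the same row as the text, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `docs`, the column names, the placeholder passages, and the three-dimensional hand-written vectors are all assumptions made for illustration; in a real application the vectors would come from an embedding model and would be much longer. The similarity is computed in Python after reading the rows, which is fine for small tables but is not meant as a scalable design.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


conn = sqlite3.connect(":memory:")
# Text and its embedding live in the same row; the vector is stored as JSON text.
conn.execute(
    "CREATE TABLE docs ("
    "id INTEGER PRIMARY KEY, body TEXT NOT NULL, embedding TEXT NOT NULL)"
)

# Hypothetical passages with made-up toy vectors (not real embeddings).
rows = [
    ("Sample passage about contracts.", [0.9, 0.1, 0.0]),
    ("Sample passage about negligence.", [0.1, 0.9, 0.1]),
    ("Sample passage about appeals.", [0.0, 0.2, 0.9]),
]
conn.executemany(
    "INSERT INTO docs (body, embedding) VALUES (?, ?)",
    [(body, json.dumps(vec)) for body, vec in rows],
)


def search(conn, query_vec, top_k=2):
    """Return the top_k (score, body) pairs, most similar first."""
    scored = []
    for body, emb in conn.execute("SELECT body, embedding FROM docs"):
        scored.append((cosine(query_vec, json.loads(emb)), body))
    scored.sort(reverse=True)
    return scored[:top_k]


# A toy query vector that points mostly in the "contracts" direction.
results = search(conn, [0.8, 0.2, 0.1])
for score, body in results:
    print(f"{score:.3f}  {body}")
```

The same layout carries over to MySQL or PostgreSQL by swapping the connection and, if preferred, storing the vector in a binary or array column instead of JSON text.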
If you just do the vector embeddings
And if you only do fine-tuning
Many experienced system designers would most likely use a combination of these three methods when building an application like the one you mention (because one size does not “fit all” cases):
- Full text DB search
- Semantic (vector based) search
- Querying a fine-tuned (FT) generative AI model
For example, for short search strings, traditional full-text DB searches work better than vector-based searches.
For longer text strings, vector-based searches are often better than full-text DB searches.
Then, if you do not get a “high-enough score” based on your own use-case threshold, you would fall back to a generative AI query.
You could then store the results of the LLM query in your DB and make them part of your full-text or vector-based search in the future, so there is no need to query the generative AI again; you can get the same results from the DB for the same query.
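The flow described above (full-text search for short queries, vector search for longer ones, a fallback to the generative model when the score is too low, and caching of the generated answer) could be sketched roughly as follows. Everything named here is an assumption for illustration: the threshold `SCORE_THRESHOLD`, the cut-off `SHORT_QUERY_WORDS`, the word-overlap `keyword_score` standing in for a real full-text search, the bag-of-words `toy_embed` standing in for a real embedding model, and `fake_llm` standing in for a call to an actual generative model.

```python
from collections import Counter
import math

SCORE_THRESHOLD = 0.75   # hypothetical use-case threshold
SHORT_QUERY_WORDS = 3    # hypothetical cut-off between "short" and "long" queries


def keyword_score(query, text):
    """Stand-in for full-text search: fraction of query words found in text."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0


def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def vector_score(query, text):
    """Cosine similarity between the toy embeddings of query and text."""
    a, b = toy_embed(query), toy_embed(text)
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fake_llm(query):
    """Stand-in for a generative model call."""
    return f"[generated answer for: {query}]"


def answer(query, store, llm=fake_llm):
    """store is a list of (key_text, answer_text) pairs; returns (answer, source)."""
    is_short = len(query.split()) <= SHORT_QUERY_WORDS
    score_fn = keyword_score if is_short else vector_score

    best_score, best_answer = 0.0, None
    for key_text, answer_text in store:
        score = score_fn(query, key_text)
        if score > best_score:
            best_score, best_answer = score, answer_text

    if best_answer is not None and best_score >= SCORE_THRESHOLD:
        return best_answer, "db"

    # Score too low: fall back to the generative model and cache the result
    # under the query text so the same query is served from the store next time.
    generated = llm(query)
    store.append((query, generated))
    return generated, "llm"


doc = "an appeal must be filed within the deadline"
store = [(doc, doc)]
print(answer("appeal deadline", store))
question = "how long do I have to file an appeal after judgment"
print(answer(question, store))  # first time: falls back to the model
print(answer(question, store))  # second time: served from the store
```

In this sketch the cache is keyed on the query text, so only identical or very similar queries hit it. Whether to cache generated answers at all, and for how long, depends on how much you trust them for your use case.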