We are entering the era of small & highly efficient models!
 
Context
A few days ago, I reported on a new state-of-the-art, open-source model that outperforms all other models, including GPT-4, on text-to-SQL tasks.
This model is SQLCoder-70B.
In a nutshell, Defog fine-tuned Meta’s recent CodeLlama-70B on its own hand-crafted dataset to build a new model.
The outcome? Well, see for yourself:

The model greatly outperforms GPT-4 on a wide range of SQL tasks!
Read more: You can read all about it, and test the model, here.
From SQLCoder-70B to SQLCoder-7B
Unfortunately, 70B-parameter models are still too large to consider for offline integrations or to run on your laptop.
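To put rough numbers on "too large", here is a back-of-the-envelope sketch of the memory needed just to hold the weights. The precisions used (2 bytes per parameter for fp16, 0.5 bytes for 4-bit quantization) are my own illustrative assumptions, and the estimate ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumed storage formats (illustrative, not taken from the model card).
PRECISIONS = {"fp16": 2.0, "int4": 0.5}

for name, params in [("70B", 70e9), ("7B", 7e9)]:
    for precision, nbytes in PRECISIONS.items():
        gb = weight_memory_gb(params, nbytes)
        print(f"{name} @ {precision}: ~{gb:.1f} GB")
```

Under these assumptions a 70B model needs on the order of 140 GB in fp16, while a 7B model needs about 14 GB, which is much closer to what a laptop can hold.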
Model Distillation
Model distillation is a machine learning process that teaches a smaller, simpler “student” model to act like a bigger, more complex “teacher” model. By learning from the teacher’s outputs, the student can make similar decisions without needing…
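To make the teacher/student idea concrete, here is a minimal sketch of a classic soft-label distillation loss: the student is penalized by how far its temperature-softened output distribution is from the teacher's. This is a generic textbook formulation, not necessarily what Defog used for SQLCoder; the function names, the temperature value, and the toy logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention for keeping gradient magnitudes
    comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher's "soft labels"
    q = softmax(student_logits, temperature)  # student's prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Toy logits over three candidate next tokens (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # prefers a different token

print("close student loss:", distillation_loss(close_student, teacher))
print("far student loss:  ", distillation_loss(far_student, teacher))
```

A student that mimics the teacher's distribution gets a loss near zero, while one that disagrees gets a larger loss; training the smaller model to minimize this quantity is what pushes it to "act like" the bigger one.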