Unification of capabilities. We have significantly simplified the interface of the /embeddings endpoint by merging the five separate models shown above (text-similarity, text-search-query, text-search-doc, code-search-text, and code-search-code) into a single new model. This single representation performs better than our previous embedding models across a diverse set of text search, sentence similarity, and code search benchmarks.
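As a rough illustration of what the unified interface means for callers, the sketch below builds (but does not send) two requests to the /embeddings endpoint, one for prose and one for code, using the same model name for both. The full URL, the model name `text-embedding-ada-002`, and the helper `build_embedding_request` are assumptions for this example and are not stated in the text above.

```python
import json
import urllib.request

# Assumptions: neither the full URL nor the model name appears in the text above.
EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"
MODEL = "text-embedding-ada-002"


def build_embedding_request(text, api_key):
    """Build (but do not send) a POST request asking for one embedding."""
    body = json.dumps({"model": MODEL, "input": text}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# The same model is requested whether the input is prose or code,
# so there is no per-task model selection on the caller's side.
prose_req = build_embedding_request("How do I reverse a list?", "YOUR_API_KEY")
code_req = build_embedding_request("def rev(xs): return xs[::-1]", "YOUR_API_KEY")
```

Sending either request with `urllib.request.urlopen` would require a valid API key; the point here is only that one model name covers both kinds of input.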
Longer context. The context length of the new model is increased by a factor of four, from 2048 to 8192 tokens, making it more convenient to work with long documents.
Smaller embedding size. The new embeddings have only 1536 dimensions, one-eighth the size of davinci-001 embeddings, making the new embeddings more cost-effective when working with vector databases.
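To make the storage effect concrete, here is a back-of-the-envelope sketch of raw vector storage at the two sizes. It assumes each value is stored as a 32-bit float and uses a hypothetical corpus of one million vectors; real vector databases add index overhead that is not modeled here.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float

NEW_DIMS = 1536
OLD_DAVINCI_DIMS = NEW_DIMS * 8  # "one-eighth the size" implies 12288 dimensions


def storage_bytes(num_vectors, dims, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw bytes needed to store num_vectors dense vectors of the given size."""
    return num_vectors * dims * bytes_per_value


num_docs = 1_000_000  # hypothetical corpus size
new_gb = storage_bytes(num_docs, NEW_DIMS) / 1e9
old_gb = storage_bytes(num_docs, OLD_DAVINCI_DIMS) / 1e9
print(f"new: {new_gb:.2f} GB, old: {old_gb:.2f} GB")
# prints "new: 6.14 GB, old: 49.15 GB"
```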
Reduced price. We have reduced the price of new embedding models by 90% compared to old models of the same size. The new model achieves performance better than or similar to the old Davinci models at a 99.8% lower price.
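The percentages above can be restated as price ratios. The short sketch below does that arithmetic with exact fractions; the helper name `price_ratio` is made up for this example, and no actual prices are assumed.

```python
from fractions import Fraction


def price_ratio(percent_reduction):
    """Return old_price / new_price for a given percentage price cut.

    The percentage is passed as a string so it is parsed exactly,
    avoiding floating-point rounding in values such as 99.8.
    """
    remaining = Fraction(100) - Fraction(percent_reduction)
    return Fraction(100) / remaining


same_size_ratio = price_ratio("90")    # a 90% cut: old price is 10x the new price
davinci_ratio = price_ratio("99.8")    # a 99.8% cut: old price is 500x the new price
print(same_size_ratio, davinci_ratio)  # prints "10 500"
```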
Overall, the new embedding model is a much more powerful tool for natural language processing and code tasks. We are excited to see how our customers will use it to create even more capable applications in their respective fields.