Hacker News new | past | comments | ask | show | jobs | submit login

As an individual, I love the idea of pushing to simplify even further to understand these core concepts. For the ecosystem, I like that vector stores make these features accessible to environments outside of Python.



If you ask ChatGPT to give you a cosine similarity function that works against two arrays of floating numbers in any programming language you'll get the code that you need.

Here's one in JavaScript (my prompt was "cosine similarity function for two javascript arrays of floating point numbers"):

    function cosineSimilarity(vecA, vecB) {
        if (vecA.length !== vecB.length) {
            throw "Vectors do not have the same dimensions";
        }
        let dotProduct = 0.0;
        let normA = 0.0;
        let normB = 0.0;
        for (let i = 0; i < vecA.length; i++) {
            dotProduct += vecA[i] * vecB[i];
            normA += vecA[i] ** 2;
            normB += vecB[i] ** 2;
        }
        if (normA === 0 || normB === 0) {
            throw "One of the vectors is zero, cannot compute similarity";
        }
        return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
    }
Vector stores really aren't necessary if you're dealing with less than a few hundred thousand vectors - load them up in a bunch of in-memory arrays and run a function like that against them using brute-force.


i get triggered that the norm vars aren't ever really the norms but their squares


I love it!




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: