The people maintaining the repository likely have to organise meetings with legal and business teams. When no agreement can be reached, doing nothing is the easiest way forward. And they have to do all of this on top of their normal work.
Sending out DMCAs is a different process, likely handled by the legal team.
The PR is not in their repository; it's in the fork. A pull request means "pull this from my repository, here's a link"; GitHub just presents it in a convenient interface.
(But you're right, those who send DMCAs are likely to just send a link to the original repository :)
Surely you can delete the pull request from your own list of open pull requests, right? The problem isn't that the information exists in theory on GitHub; it's that it is still listed right there when you go to check the health of the project's open issues/PRs.
> "Model weights aren't part of the release for now, to respect OpenAI TOS and LLaMA license."
Makes sense if they originally licensed the model weights from Meta. Fortunately you can get the weights via torrent without agreeing to the license by visiting Facebook's repository and getting the magnet link yourself: https://github.com/facebookresearch/llama/pull/73/files
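For anyone who hasn't scripted a torrent download before, here's a rough sketch using the libtorrent Python bindings (my choice of library; any torrent client works just as well, and the actual magnet URI has to be copied from that PR):

import libtorrent as lt  # pip install libtorrent, or your distro's python-libtorrent package
import time

# Placeholder: paste the magnet URI from the PR linked above.
magnet_uri = "magnet:?xt=..."

ses = lt.session()
params = lt.parse_magnet_uri(magnet_uri)
params.save_path = "./llama-weights"
handle = ses.add_torrent(params)

# Poll until the torrent finishes downloading and starts seeding.
while not handle.status().is_seeding:
    s = handle.status()
    print(f"{s.progress * 100:.1f}% complete ({s.download_rate / 1e6:.2f} MB/s)")
    time.sleep(10)
print("download complete")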
Didn't explore much, but it seems alpaca-lora gets better results on coding tasks. One example I tried was "Implement quicksort in Python." This is the result with Code Alpaca:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [arr[i] for i in range(1, len(arr)) if arr[i] < pivot]
    right = [arr[i] for i in range(1, len(arr)) if arr[i] > pivot]
    return quicksort(left) + [pivot] + quicksort(right)
The alpaca-lora version was shorter and much cleaner, not to mention it works (the Code Alpaca version above is broken). It also matches what ChatGPT generates for me.
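For the curious, one concrete way the Code Alpaca version above breaks: it silently drops elements equal to the pivot (a quick check of my own, not model output):

data = [3, 1, 3, 2]
print(quicksort(data))  # [1, 2, 3] (the duplicate 3 disappears)
print(sorted(data))     # [1, 2, 3, 3]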
The results are pretty good; I wish they'd just publish the models so we can run inference locally (not many people have access to 8xA100s to train it themselves, though I appreciate that they included the training data and instructions too).
Anyone with a few hundred bucks to spare can do it by renting GPUs from a cloud provider. It only cost Stanford about $600 to create Alpaca from LLaMA: $100 to generate instructions with GPT-3 and $500 to rent cloud GPUs. The license restriction is due to the use of GPT-3 output to train a model.
More like $50 or even $5 or less for the cloud GPUs. Alpaca-7B's compute costs were close to $50 and that was before the 100x cost savings of using LoRA.
A 4-bit LoRA fine-tune of this project would cost less than $5 to train, even up to 30B/33B.
I’d love to see a crowdsourcing platform to donate to specific fine-tuning projects. I would gladly throw some money at someone to do the labor and release the models to the public.
If a couple of us get together and throw in some money we could train it on Lambda Labs hardware like the OP suggests. I would volunteer to do it myself but I don’t know enough about training models to guarantee I am not wasting money with a stupid mistake.
LLaMA may not be licensed for people to share, since you need to apply to Facebook to get the weights for non-commercial use. I think it's more of a license issue.
Hopefully similar work can be done with LoRA so the fine-tuning is not as expensive.
If someone were to add noise to the LLaMA weights and then retrain a little, would anyone be able to tell? Could that org then pass it off as their own training, MIT-licensed for the good of humanity?
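Mechanically the noise part is trivial; something like this in PyTorch (a sketch of my own, just to illustrate what perturbing the weights would mean; whether retraining afterwards makes it undetectable is a separate question):

import torch

def perturb_weights(model, scale=1e-3):
    """Add small Gaussian noise to every parameter, in place."""
    with torch.no_grad():
        for p in model.parameters():
            # Scale the noise relative to the magnitude of each tensor.
            p.add_(scale * p.abs().mean() * torch.randn_like(p))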
> The code runs on a 8xA100 80GB, but can also run on 8xA10040GB or 4xA100 with lower batch size and gradient accumulation steps. To get the GPUs, I suggest using Lambda Labs, best pricing for the best hardware.
I wonder how much the fine-tuning cost in total.
Also, does anyone have some sort of table/formula that relates MB/GB of training data to $ for fine-tuning?
Stanford only spent about $500 to fine-tune LLaMA for human instruction-following with 52k instructions generated by GPT-3. This probably cost less. Using GPT to generate the instruction data instead of humans is the massive cost reduction; the actual GPU training for fine-tuning is relatively cheap.
Far, far less. Alpaca-7B's compute cost was around $60-$70 for Stanford, and around $0.60 (yes, 60 cents) for equivalent fine-tunes using the Parameter-Efficient Fine-Tuning (PEFT) strategy of Low-Rank Adapters (LoRA).
The repo above can be replicated for similar costs. Easily less than $10 for up to 30B using LoRA (which requires only 24GB of VRAM for 30B/33B and smaller).
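For anyone curious what "using LoRA" looks like in practice, the gist with Hugging Face's peft library is roughly this (a sketch with alpaca-lora-style hyperparameters, not this repo's actual code; the checkpoint name is an assumption):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "decapoda-research/llama-7b-hf"  # assumption: any LLaMA checkpoint in HF format works
model = AutoModelForCausalLM.from_pretrained(base, load_in_8bit=True, device_map="auto")

# Freeze the base model and train only small low-rank adapter matrices
# injected into the attention projections; that's where the cost savings come from.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the base model's parameters

From there the training loop is unchanged; only the tiny adapter weights get updated and saved, which is why the VRAM and dollar costs drop so much.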
I feel like the whole Open Source ML scene is slowed down by a strong chilling effect. Everyone seems to be afraid to release models.
Meanwhile, other models are freely available, up to Alpaca 30B:
https://github.com/underlines/awesome-marketing-datascience/...