Why did you mix proprietary Llama with Free Software?

#5
by JLouisBiz - opened

When you mix the proprietary Llama model with Free Software, your package cannot really enter true Free Software or Open Source spaces, such as operating system distributions.

Why do that?

Mixing a proprietary model like Llama (which uses restrictive licensing, not true Free Software/Open Source) with a Free Software model can create several significant disadvantages, especially for projects aiming to align with Free Software (FSF) or Open Source (OSI) principles. Here are the key drawbacks:

1. Violation of Free Software Principles

  • Freedom Restrictions: Free Software (as defined by the FSF) requires four essential freedoms: use, study, modify, and distribute. Proprietary components (like Llama) deny these freedoms.
  • Copyleft Incompatibility: Strong copyleft licenses (GPL) require derivative works to also be free. Mixing with proprietary code taints the project, making it non-compliant.

2. Exclusion from Major Free Software Repositories

  • Linux Distributions Reject It: Major distros (Debian, Fedora, etc.) enforce Free Software guidelines (e.g., Debian's DFSG). If a package includes proprietary dependencies, it cannot be included in official repos.
  • F-Droid (for mobile apps) Rejects Non-Free Dependencies: Even a small proprietary component can disqualify an app from being listed.

3. Community Distrust & Fragmentation

  • Free Software Advocates Avoid It: Many developers/users actively avoid projects with proprietary dependencies, reducing adoption.
  • Forking Risk: The community may fork the project to remove proprietary parts, splitting development efforts.

4. Legal & Compliance Risks

  • License Incompatibility: Proprietary licenses (like Meta’s Llama license) impose usage restrictions (e.g., an acceptable-use policy and a 700 million monthly-active-user threshold) that clash with Free Software terms.
  • Patents & IP Risks: Proprietary models may carry hidden legal risks (patents, data restrictions) that Free Software seeks to avoid.

5. Ethical & Philosophical Conflicts

  • Undermines Software Freedom: Mixing proprietary and free components sends a mixed message, weakening advocacy for software freedom.
  • Encourages Vendor Lock-in: Users may become dependent on proprietary tech, contrary to Free Software goals.

6. Maintenance & Longevity Issues

  • Dependency on a Single Vendor: If the proprietary model changes licensing or disappears, the project may break.
  • No Community Control: Free Software thrives on community contributions, but proprietary components block full transparency and collaboration.

Conclusion

For a project to truly belong in Free Software/Open Source ecosystems, it must avoid proprietary dependencies entirely. Mixing them leads to legal, ethical, and practical problems, limiting adoption and trust within the community.

Yeah, I was thinking the same. If they wanted to release it under the MIT license, why even use a component with a restrictive license? There are so many Apache-licensed LLMs available.

Got an eyebrow raise from me as well. Right now the model's status is proprietary, not MIT.

LLaMA's text encoder makes the model and the generations subject to the llama3.1 license when deployed via inference.py.

@cai-qi Please address at the earliest! I suggest checking if mistralai/Mistral-7B-Instruct-v0.3 or Qwen/Qwen2.5-7B-Instruct would fit as drop-in replacements.
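If someone does experiment with such a swap, the encoder side might look roughly like this. A minimal Python sketch assuming the pipeline only needs pooled hidden states from a causal LM; the `mean_pool` and `embed` helpers are illustrative, not HiDream's actual inference.py code:

```python
import torch


def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Mask-aware mean pooling over the sequence dimension."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden.dtype)  # (B, T, 1)
    summed = (last_hidden * mask).sum(dim=1)                   # (B, H)
    counts = mask.sum(dim=1).clamp(min=1)                      # (B, 1)
    return summed / counts


def embed(texts, model, tokenizer, device="cpu"):
    """Encode prompts with a causal LM and pool its hidden states into embeddings."""
    # Note: some tokenizers (e.g., Mistral's) ship without a pad token,
    # so one may need tokenizer.pad_token = tokenizer.eos_token first.
    batch = tokenizer(texts, return_tensors="pt", padding=True).to(device)
    with torch.no_grad():
        out = model(**batch, output_hidden_states=True)
    return mean_pool(out.hidden_states[-1], batch["attention_mask"])


# Illustrative usage (downloads a ~15 GB checkpoint, so not executed here):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   enc_id = "mistralai/Mistral-7B-Instruct-v0.3"  # one of the suggested candidates
#   tok = AutoTokenizer.from_pretrained(enc_id)
#   lm = AutoModelForCausalLM.from_pretrained(enc_id, torch_dtype=torch.bfloat16)
#   embed(["a photo of a cat"], lm, tok)  # -> tensor of shape (1, hidden_size)
```

Whether the rest of the pipeline accepts embeddings of a different width without retraining is a separate question.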

We might be able to use this? https://huggingface.co/OpenSciLM/Llama-3.1_OpenScholar-8B (AllenAI, Meta, and university collab; they don't say anything about adhering to the Llama 3.1 license, so it seems like a true Apache 2.0 finetune of Llama 3.1.) There are also some other finetunes from universities and small companies under Apache 2.0/MIT licenses. Almost all of these finetunes also have a page on arXiv. Might be that they are able to change their license when significant research is involved.

> Might be that they are able to change their license when significant research is involved.

As far as I know, unfortunately it's not legally possible. OpenSciLM/Llama-3.1_OpenScholar-8B is a finetune of meta-llama/Llama-3.1-8B, and this clause stands until a court directly says it doesn't:

> i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials;

OpenSciLM/Llama-3.1_OpenScholar-8B is a "product" & an "AI model" derivative that must include the llama3.1 license file.

Meta likely doesn't bother with OpenSciLM/Llama-3.1_OpenScholar-8B because it's, on its surface, a research project, similar to how they don't bother with various uncensored assistants and roleplay tunes & merges. But OpenSciLM still broke the license, and users still need to be cautious to avoid Meta banging on their door: the proper legal beat-up would come first, and listening to the explanation that it was OpenSciLM who granted them the wrong license second.

There still could be some options that use LlamaForCausalLM, if LlamaForCausalLM is a must for this model. There were some projects meant to train new models on LLaMA's architecture but with an open license, though most must be heavily outdated today, like openlm-research/open_llama_7b_v2.
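A quick way to sanity-check whether such an openly licensed checkpoint even fits the same slot is to compare its config against what the pipeline expects. A minimal sketch; the field names follow standard Hugging Face config.json keys, and the 4096 expected hidden size is an assumption about this model, not a documented requirement:

```python
def is_llama_drop_in(config: dict, expected_hidden_size: int = 4096) -> bool:
    """True if a checkpoint exposes LlamaForCausalLM with a matching width."""
    return (
        "LlamaForCausalLM" in config.get("architectures", [])
        and config.get("hidden_size") == expected_hidden_size
    )


# In practice the dict would come from, e.g.:
#   from transformers import AutoConfig
#   cfg = AutoConfig.from_pretrained("openlm-research/open_llama_7b_v2").to_dict()
example_cfg = {"architectures": ["LlamaForCausalLM"], "hidden_size": 4096}
print(is_llama_drop_in(example_cfg))  # True
```

Matching the architecture class is necessary but not sufficient: tokenizer vocabulary and the embedding distribution the downstream model was trained against also have to line up.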

> We might be able to use this? https://huggingface.co/OpenSciLM/Llama-3.1_OpenScholar-8B (AllenAI, Meta, University collab, they don't say anything about adhering to LLAMA 3.1 license, seems like a true Apache 2.0 finetune of LLAMA 3.1) Also there are some other finetunes from universities and some small companies with apache 2.0/MIT license. Almost all of these finetunes also have a page on arxiv. Might be that they are able to change their license when significant research is involved.

https://huggingface.co/OpenSciLM/Llama-3.1_OpenScholar-8B/discussions/2

Actually no, because that organization wrongly used that model: they tried to relicense it, but you cannot change the license of a derivative. So the point is that the model you are referencing is proprietary.
