How "open" open-source AI/LLM tools really are

Written by

Vanessa

11.4.2025

The term "open-source" is familiar from software development: it refers to programs whose source code is freely accessible and which may be used, modified and redistributed. There are different types of open-source licenses. Some impose very few conditions (so-called "permissive open-source licenses"), while others carry restrictions, e.g. with regard to commercial use or the condition that further developments must be made available to the general public (often referred to as "restrictive open-source licenses").

However, with the rise of artificial intelligence (AI) and large language models (LLMs) in particular, "open-source" has taken on a new dimension. For AI and LLMs, it is not just about the code, but also about training data, model weights (i.e. the "memory of the model") and architecture. This leads to misunderstandings and confusion about how "open" an AI model or LLM actually is.

What does "open-source" mean in the context of AI?

"Open-source" can be interpreted differently in the AI context. For many AI models labeled as "open-source", only the model weights are released, while training data and training processes usually remain proprietary.

There is no generally accepted definition of open-source in the context of AI. However, the definitions of the Open Source Initiative (OSI) and the Linux Foundation, for example, are helpful:

Linux Foundation: The Linux Foundation's Model Openness Framework (MOF) provides a detailed classification and distinguishes three levels of openness based on the availability of components such as model parameters, training code and data sets.

Open Source Initiative: The OSI defines open-source AI as systems that can be freely used, studied, modified and redistributed. It emphasizes that all aspects of the system must be accessible, without restrictions for commercial or private use.

Why is this important for your company?

For companies that want to use AI tools commercially or even develop them further, it is crucial to understand the rights and obligations associated with the use of such models. Not all AI models designated as "open-source" allow unrestricted commercial use, modification or redistribution. It is therefore essential to check the license conditions carefully.

Checklist: What should you look out for?

Understand the product: Analyze whether the AI tool meets your business requirements and what functions it offers.
Compare different sources: Versions of a tool that are available on platforms such as GitHub may have been published under different license conditions than the version on the provider's official website.
Check the license conditions: Read the license conditions very carefully and make sure that you are working from the current, valid version. Pay particular attention to whether the license allows commercial use, modification and redistribution.

Commercial use:
First of all, you need to distinguish whether you want to use "only" the output commercially, the AI/LLM tool itself, or both. Based on this, check whether commercial use is permitted for your intended application. Permissive licenses, for example, often allow commercial use of the tool with wording such as: "You are free to use, modify, distribute, and sell the software and its outputs for any purpose, including commercial use, without restrictions, provided that the original copyright notice and disclaimers are retained."

However, if the license states, for example, "You may apply the Outputs for personal use, academic research, or derivative product development but not for commercial purposes without explicit consent", even the commercial use of the Outputs is not freely permitted.

Modification and distribution:
Formulations such as "You may modify, distribute, and use the software under the terms of this License" generally mean that modification and distribution are permitted.

On the other hand, formulations such as "You will not copy, transfer, lease, lend, sell, or sublicense the entire or part of the Services without [...]'s authorization" or "You will not engage in activities to steal network data, such as: reverse engineering, reverse assembly, reverse compilation, translation, or attempting to discover the source code, models, algorithms, and system source code or underlying components of the software" stipulate that modification and distribution are prohibited without express permission.
Pay attention to restrictions: Although many tools allow free commercial use (at least of the output), they impose certain restrictions. Even permissive licenses such as Apache 2.0 or MIT attach conditions, and licenses such as OpenRAIL-M additionally restrict certain uses. Typical wording in the license conditions includes:
- Distribution only under the same license (copyleft principle, typical of copyleft rather than permissive licenses): "If you distribute a modified version, you must license it under the same terms as this License."
- Use of names prohibited: "You may not use the name, trademarks, or logos of the original authors without prior written permission."
- Attribution required: "You must retain the original copyright notice and attribution in all copies or substantial portions of the software."
Clarify liability issues: Many open-source licenses exclude liability and warranty. Think about how you want to deal with possible risks and whether additional agreements or insurance are required.
Document your use: Keep a record of which open-source components you use, which licenses apply and how you comply with the license conditions.
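The checklist above can be turned into a simple inventory. The following Python sketch is only an illustration of what such a record might look like: the class `LicenseRecord`, the function `open_points`, the field names and the example entry are all hypothetical and do not describe any real tool or license. It is a documentation aid, not a substitute for reading the license text.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LicenseRecord:
    """One entry in a hypothetical inventory of AI components in use."""
    component: str
    license_name: str
    source: str                     # where this version was obtained
    checked_on: date                # when the license text was last reviewed
    commercial_use_of_tool: bool
    commercial_use_of_output: bool
    modification: bool
    redistribution: bool
    conditions: list[str] = field(default_factory=list)


def open_points(record: LicenseRecord, needs: set[str]) -> list[str]:
    """Return the intended uses that the recorded license does not permit."""
    permitted = {
        "commercial_use_of_tool": record.commercial_use_of_tool,
        "commercial_use_of_output": record.commercial_use_of_output,
        "modification": record.modification,
        "redistribution": record.redistribution,
    }
    # Unknown needs are treated as not permitted, so they surface for review.
    return sorted(n for n in needs if not permitted.get(n, False))


# Hypothetical example entry; the values are invented for illustration.
example = LicenseRecord(
    component="example-model",
    license_name="Example Research License (hypothetical)",
    source="provider website",
    checked_on=date(2025, 4, 11),
    commercial_use_of_tool=False,
    commercial_use_of_output=True,
    modification=True,
    redistribution=False,
    conditions=["retain copyright notice", "no use of provider trademarks"],
)

# List the planned uses that still need clarification or explicit consent.
print(open_points(example, {"commercial_use_of_tool", "redistribution"}))
```

Recording the source and the review date alongside the permissions reflects two points from the checklist: license conditions can differ between sources, and they can change over time.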

Conclusion and takeaway

The use of open-source AI tools offers many advantages, but also requires a careful understanding of the respective license conditions. Even if an AI tool is advertised as "open-source", you should first check whether it is actually sufficiently "open" for your company's needs. By thoroughly checking and observing the above points, you can avoid legal pitfalls and safely exploit the full potential of these technologies. If you would like an overview of the most important license conditions for the most popular AI tools, please send us an email.


We all read the headlines back in January: "DeepSeek – open-source competition to OpenAI and other LLMs". But what does open-source actually mean in the context of AI/LLM tools, and how "open" are the tools in reality?
