Semjon Mössinger
Why companies are not yet introducing AI coding tools
Abstract
AI tools such as ChatGPT, GitHub Copilot or Claude Code are already commonplace in many development teams; software development is one of the areas with the highest AI adoption. At the same time, many companies do not yet allow the use of such tools - usually for understandable reasons relating to confidentiality, data protection, security and compliance. This article categorises the typical concerns, assesses them according to their likelihood of occurrence and impact and shows a pragmatic approach, including countermeasures. The basic idea: a controlled introduction with clear guidelines creates security. Finally, specific recommendations are made for three roles: AI enthusiasts, gatekeepers and decision-makers.
Introduction
The term "AI tool" is used here to refer to AI coding assistants and coding agents: tools that are based on Large Language Models (LLMs), are integrated into IDEs or platforms and access the code directly during development. A typical process can be described simply: (1) The assistant sends source code snippets and prompts to a model. (2) In the case of AI programming agents, the AI tool can also access the developer's system. (3) The model provides code suggestions or explanatory information, for example for troubleshooting or as research results. (4) The developer checks the result and decides what to incorporate into the code base.
This process gives rise to four areas of risk: what goes out (transfer of code and data to third parties), what the agent does on the developer's system, what comes in (quality, security and licensing issues with AI-generated results) and what changes in the organisation (impact on processes, roles, competencies and digital sovereignty). The risks are real - but can be significantly reduced through controlled use, clear rules and technical controls.
Sharing code and data with third parties
With GitHub Copilot - as with most AI programming assistants - the "heart" of the system, the Large Language Model (LLM), does not run locally on the user's own hardware but in the cloud, often in the USA. This first raises the question of source code confidentiality, followed by the more specific aspect of data protection.
Confidentiality of the code
In companies, AI tools are usually used via business licences when they are officially released. These typically use cloud LLMs (e.g. from OpenAI, Google or Anthropic) and often guarantee in the terms of use that transferred content will not be used as training data and will not be stored permanently. Advantage: very good models at manageable costs (often approx. €20-30 per developer per month; for those who want to exploit the potential of agent-based coding, this quickly rises to €100-200) with low implementation and operating costs.
If you want to avoid code being processed outside the EU, you can now book EU data residency with some providers (e.g. GitHub Copilot). This reduces regulatory risks, but is generally more expensive. Even EU hosting with US providers does not offer absolute legal security, for example due to access options under US law.
In practice, many organisations opt for one of these two cloud variants for reasons of cost, convenience and functionality. For software without special protection requirements, this is often justifiable as long as clear rules apply.
Alternatively, there are local or European-hosted LLMs: on the developer's computer, on centralised hardware (e.g. GPU server) or with an EU provider. Many assistants (e.g. JetBrains AI Assistant, Kilo Code, Cline, OpenCode) can be connected to these. The disadvantage: the most powerful "top models" of the major US providers are generally not available in such scenarios, and integrations (e.g. in GitHub/GitLab) are only available with effort and to a limited extent. This option is particularly useful where other US cloud services are already consistently avoided or where particularly strict protection requirements apply.
Checklist (regardless of the hosting model):
No secrets in the repository (this was already mandatory before and becomes even more important with AI). The Coding Agents section shows that further measures are necessary here.
Use the tools' ignore/exclude functions to exclude sensitive paths/files.
If necessary, completely exclude particularly sensitive repositories (e.g. using a separate IDE/environment without an AI plugin).
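The first two checklist items can be illustrated with a minimal Python sketch. It is not the mechanism of any particular AI tool: the exclude patterns, the regular expressions and the function names (is_excluded, find_secrets, may_share) are assumptions chosen for illustration, and real tools use their own ignore-file formats. The secret patterns are deliberately simple and will miss many real-world secrets; a dedicated secret scanner is the better choice in practice.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical exclude patterns (assumption): paths that must never be
# handed to an AI tool. Note that "*" here also matches "/" characters.
EXCLUDE_PATTERNS = ["*.env", "*.pem", "secrets/*", "config/prod/*"]

# Illustrative, deliberately incomplete secret patterns (assumption).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"(?i)\b(password|passwd|secret|api_key)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def is_excluded(path: str, patterns: list[str] = EXCLUDE_PATTERNS) -> bool:
    """Return True if the path matches any exclude pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def find_secrets(text: str) -> list[str]:
    """Return the sorted names of all secret patterns found in the text."""
    return sorted(name for name, rx in SECRET_PATTERNS.items() if rx.search(text))


def may_share(path: str, text: str) -> tuple[bool, list[str]]:
    """Decide whether a file may be shared, and list the reasons if not."""
    if is_excluded(path):
        return False, ["excluded_path"]
    hits = find_secrets(text)
    return (not hits), hits


if __name__ == "__main__":
    print(may_share("src/main.py", "print('hello')"))
    print(may_share("secrets/db.txt", "anything"))
    print(may_share("src/cfg.py", 'api_key = "abcd1234"'))
```

A check of this kind could run as a pre-commit hook or in CI, so that secrets are caught before they reach the repository and therefore before any AI tool can read them.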
There is a more specific risk if LLM operators inadvertently disclose user prompts - for example due to configuration or implementation errors (see here for a specific example) or because employees fall victim to a cyberattack (e.g. phishing). It therefore makes sense to adhere to the checklist even if you trust the operator in principle.
Open source projects are not normally affected by the confidentiality issue because the code is public anyway.
Important: Many providers treat explicit feedback (e.g. "thumbs up/down") contractually differently from pure usage, which can lead to the content of a chat being used for quality improvement or training after all. This should be taken into account in policies.
Table 1 provides an overview of the options.