
Open source software is the backbone of the modern technology landscape. Enterprises small and large, across industries, rely on open source projects to power critical applications and infrastructure. With the rise of AI-driven code generation tools, developers have a whole new frontier to explore. But while AI-generated contributions can supercharge productivity, they also raise new concerns around security, safety and governance. Below we explore the dynamics of open source projects, how AI-generated code can influence enterprise software, and the considerations and best practices that help your AI-generated contributions maintain appropriate code security and remain sustainable to adopt.

The dynamics of open source and its influence on enterprise software

The rise of open source in the enterprise

Open source software has grown from a community-driven effort to an essential building block of enterprise solutions. As more companies recognize the cost, flexibility and innovation benefits of using open source libraries and frameworks, the once niche community has evolved into a massive, globally distributed development ecosystem.

AI-generated contributions

In parallel, AI-powered coding assistants, such as GitHub Copilot and similar tools, have become increasingly sophisticated. They can:

  • Generate boilerplate code for tasks like creating APIs or microservices
  • Suggest enhancements or optimizations based on best practices gleaned from analyzing large code repositories
  • Prototype new features or bug fixes rapidly

When combined with open source projects, these AI-generated snippets and contributions have the potential to accelerate development cycles and expand the capabilities of software used by enterprises worldwide.

The security and reliability of AI-generated code

While AI-generated code can be a huge time-saver, it can also pose unique risks if not properly vetted and managed.

Potential vulnerabilities

Unintended security holes: 

AI training data may contain outdated or vulnerable code patterns. The model might replicate these patterns in its suggestions, inadvertently introducing vulnerabilities such as SQL injection, insecure data handling and cross-site scripting (XSS).

AI coding assistants are typically trained on vast swaths of publicly available repositories, which include code with known and sometimes undisclosed security vulnerabilities. For example, a model might suggest a code snippet for user authentication that relies on outdated hashing algorithms, such as MD5 or SHA-1, instead of more secure methods like bcrypt or Argon2. Models may also replicate unsafe patterns for database handling, leading to SQL injection holes if not caught by code reviewers.
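As a purely illustrative sketch (function and table names are hypothetical), the contrast in both cases can come down to a few lines. The first pattern in each pair mirrors code a model may reproduce from older training data; the second uses a salted, slow key-derivation function from Python's standard library (dedicated libraries such as bcrypt or argon2-cffi are equally valid choices) and a parameterized query that leaves escaping to the database driver.

    import hashlib, hmac, os, sqlite3

    # Insecure pattern often seen in older repositories: fast, unsalted MD5.
    def hash_password_insecure(password: str) -> str:
        return hashlib.md5(password.encode()).hexdigest()

    # Safer sketch: slow, salted key derivation from the standard library.
    def hash_password(password: str) -> tuple[bytes, bytes]:
        salt = os.urandom(16)
        return salt, hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)

    def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
        return hmac.compare_digest(candidate, digest)

    # Insecure pattern: string formatting invites SQL injection.
    def find_user_insecure(conn: sqlite3.Connection, username: str):
        return conn.execute(f"SELECT * FROM users WHERE name = '{username}'").fetchone()

    # Safer sketch: a parameterized query lets the driver handle escaping.
    def find_user(conn: sqlite3.Connection, username: str):
        return conn.execute("SELECT * FROM users WHERE name = ?", (username,)).fetchone()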

Malicious or backdoor insertion:

A concerning potential risk is the manipulation of training data or user prompts to deliberately introduce backdoors or malicious code. For example, attackers “poison” the open source ecosystem by committing hidden vulnerabilities into repositories that become part of the AI training set. An AI coding assistant is prompted with a snippet that includes a concealed Trojan function or backdoor credentials, which the AI then consistently reproduces as a “suggested fix” for similar problems.
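To make the review burden concrete, here is a hypothetical illustration of the kind of "suggested fix" that should raise flags: the function looks like routine credential checking, but the second branch silently accepts a hard-coded maintenance password regardless of what is in the user database. The user_store mapping is an assumed placeholder, and verify_password reuses the helper from the earlier sketch.

    # Hypothetical example of a concealed backdoor; shown only so reviewers know what to look for.
    def authenticate(username: str, password: str, user_store: dict) -> bool:
        record = user_store.get(username)
        if record and verify_password(password, record["salt"], record["digest"]):
            return True
        # Backdoor: grants access to anyone who knows the hard-coded credential.
        if password == "svc_maint_2024!":
            return True
        return False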

Lack of context and overreliance on “magic” solutions:

AI might generate code that appears correct on the surface but fails under specific architectural or environmental conditions. If developers rely too heavily on AI suggestions without in-depth understanding, subtle design flaws or logical errors can slip into production. For example, a snippet for an authentication service is secure in isolation but interacts with other components in a way that leaks user session data. Or the AI suggests a library for cryptographic operations without understanding your specific compliance requirements, such as FIPS validation, resulting in an insecure or non-compliant implementation.

A related issue is that developers can become over-reliant on AI and, as a result, never develop a deeper understanding of the larger codebase, which in turn can lead to poor human decisions and more security errors.

Use of deprecated or unsupported libraries:

AI-based suggestions may reference or rely on libraries or dependencies that are deprecated, unmaintained or have unpatched security issues. For example, a generated code snippet includes a library that hasn't received updates in years and has documented security vulnerabilities with Common Vulnerabilities and Exposures (CVE) identifiers assigned to them. Developers remain unaware until a security audit flags critical issues in production.
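One lightweight mitigation is to check any dependency an assistant proposes against a public vulnerability database before merging. The sketch below queries the OSV.dev API for a given PyPI package and version; the package name and version shown are placeholders, and dedicated tools such as pip-audit or the dependency scanners in your CI pipeline do the same job more thoroughly.

    import json
    import urllib.request

    def known_vulnerabilities(package: str, version: str) -> list[str]:
        """Return OSV advisory IDs affecting the given PyPI package version."""
        payload = json.dumps({
            "version": version,
            "package": {"name": package, "ecosystem": "PyPI"},
        }).encode()
        request = urllib.request.Request(
            "https://api.osv.dev/v1/query",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        return [vuln["id"] for vuln in result.get("vulns", [])]

    # Illustrative check of an older release an assistant might suggest pinning.
    print(known_vulnerabilities("example-package", "1.0.0"))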

Hard-coded secrets or credentials:

In some instances, AI may suggest placeholders for credentials, sometimes even hard-coded secrets, particularly if it has seen such patterns in training data. For example, generated code may include a line such as DB_PASSWORD = "root123" purely as an example, but a rushed developer might forget to replace or externalize it, potentially leading to serious security breaches.
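A minimal sketch of the safer alternative, assuming the credential is supplied through the environment or a secrets manager (the variable name is illustrative):

    import os

    # Risky pattern an assistant might emit as a placeholder:
    # DB_PASSWORD = "root123"

    # Safer sketch: read the secret from the environment and fail fast if it is missing.
    DB_PASSWORD = os.environ.get("DB_PASSWORD")
    if not DB_PASSWORD:
        raise RuntimeError("DB_PASSWORD is not set; provide it via the environment or a secrets manager")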

License and copyright compliance: 

AI models often train on vast repositories of open source code with various licenses. Enterprises must confirm that AI-generated code does not violate license terms or contain copyrighted snippets that could expose them to legal risks.

Although this is not strictly a security vulnerability, it can still have a high impact. AI tools may inadvertently produce code copied from existing copyrighted works or license-restricted code. For example, AI-based coding assistants that autocomplete code can reproduce proprietary library code that can't legally be integrated into an open source project under certain licenses. Copyleft licenses, such as the GPL, may require open sourcing the entire project if the generated snippet is considered a derivative work.

Code quality concerns

Lack of context: AI assistants may not fully understand the context or architecture of an entire application. This can result in solutions that appear to work but harbor design flaws that surface later in the software development lifecycle.

Insufficient documentation: AI tools can provide code but often produce limited or generic documentation. This makes it harder for open source contributors and enterprise teams to maintain the code effectively.

Key considerations when using AI-generated code

Rigorous code review practices

Human oversight is still crucial, no matter how sophisticated the AI tool is. Create formal processes for the thorough peer review of AI-generated code, focusing on the following:

Security testing: Automated security scanning tools, such as static analysis and dynamic testing, can detect common vulnerabilities early.

Best practice checks: Validate architecture, compliance and coding standards.

License and compliance checks

Enterprises must maintain strict due diligence regarding the licensing of AI-generated code. Some actions include:

  • Automated license scanning: Use tools to detect potential license conflicts (a simple sketch follows this list)
  • Clear documentation: Keep records of how and where AI-generated code was obtained. This is especially important when distributing software to end users or third parties
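As an illustrative first pass only, the snippet below prints the license each installed Python package declares in its metadata. Declared metadata can be missing or inaccurate, so dedicated license scanners remain the authoritative check, but even this quick inventory can surface obvious conflicts early.

    from importlib.metadata import distributions

    # List each installed distribution with the license it declares in its metadata.
    for dist in distributions():
        name = dist.metadata.get("Name", "unknown")
        declared = dist.metadata.get("License") or "not declared"
        print(f"{name}: {declared}")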

Ethical and data privacy concerns

Depending on the domain, code might handle sensitive data or be subject to regulatory requirements like GDPR and HIPAA. Vet AI-generated code for how it processes, stores or transmits personal or confidential information.

Continuous learning and training

Teams should stay updated on:

  • Evolving AI tools: AI coding assistants continuously improve. New versions might have better security or more refined code generation capabilities
  • Open source updates: Security patches and updates in the open source community should be tracked in real time to quickly fix any newly discovered vulnerabilities

Governance and policy

Create an internal policy defining how AI-generated code is introduced and managed within the organization. Include guidelines for code acceptance criteria, attribution and licensing, security compliance, and maintenance and support.

Suggestions and future actions

Implement “security by design”

Adopt a shift-left mindset
Move security checks and best practices to the earliest stages of the software development lifecycle (SDLC). Rather than treating security as a final checklist item, embed security into requirements gathering, design and coding. This helps prevent security flaws from becoming deeply ingrained in AI-generated code.

Integrate security tools in developer workflows
Equip developers with static and dynamic analysis tools directly within their coding environment. For AI-generated code, tools like linters, vulnerability scanners and code coverage analyzers become even more critical, as they can highlight vulnerabilities the AI might inadvertently introduce.
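As a minimal sketch of such a gate, assuming Bandit and pip-audit are available in the developer or CI environment (tool choices and source paths will vary by project):

    import subprocess
    import sys

    # Run static analysis and a dependency audit; both tools exit non-zero on findings.
    checks = [
        ["bandit", "-r", "src"],   # static analysis of first-party code
        ["pip-audit"],             # known-vulnerability scan of installed dependencies
    ]

    failed = False
    for command in checks:
        print("Running:", " ".join(command))
        if subprocess.run(command).returncode != 0:
            failed = True

    sys.exit(1 if failed else 0)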

Conduct thorough threat modeling and architecture reviews
Before integrating AI-generated modules into your application, perform a detailed security assessment. Map out possible attack vectors, identify critical data pathways and evaluate how AI-generated components might introduce new risks, such as reliance on outdated libraries and missed edge cases.

Maintain ongoing security education
Regularly train development teams on secure coding practices—especially as they relate to AI-assisted outputs. Education should include awareness of common vulnerabilities, license compliance issues and responsible usage guidelines for AI tools.

Increase collaboration between AI tool providers and open source communities

Improve training models

  • Update training data: AI models should be retrained or updated frequently using validated open source projects that follow modern security and coding standards
  • Remove known vulnerabilities: Scrub datasets for libraries or code snippets with documented flaws. Work closely with open source maintainers to ensure these vulnerabilities are flagged and excluded from future training iterations

Offer greater transparency

  • Provide generation logs: Whenever feasible, AI platforms can supply metadata or logs showing how a snippet of code was formed, including references to specific training data. This helps developers trace potential issues to their source
  • Disclose model limitations: Providers should clearly communicate known limitations—such as the inability to detect certain classes of vulnerabilities or incomplete support for complex libraries. This allows developers to compensate with additional reviews

Create open channels for feedback

  • Bug bounties and vulnerability disclosure: Encourage open source contributors to report insecure AI-generated code, and reward discoveries that enhance overall model robustness
  • Community-led testing: Invite open source maintainers to pilot new AI features or capabilities, providing real-world feedback that helps refine AI tools

Encourage responsible AI practices

Proper attribution and licensing

  • Acknowledge contributors: AI-generated code may incorporate or build on open source libraries. Ensure that original authors, human or AI, are properly credited and that the terms of licenses such as GPL, MIT and Apache are upheld
  • Establish clear ownership: Align with organizational policies and open source guidelines to clarify who owns AI-generated output. This can prevent legal disputes over intellectual property

Maintain ethical boundaries

  • Avoid malicious use: Clearly define the acceptable use of AI-generated code. For instance, strictly prohibit the creation of malicious software or knowingly infringing code
  • Respect user privacy: If your AI tool or solution processes user data to inform its suggestions, adhere to relevant privacy regulations, such as GDPR, as well as to ethical guidelines

Promote community and corporate responsibility

  • Adopt shared principles: Establish consensus-driven frameworks on responsible AI development and usage, incorporating perspectives from both open source communities and corporate stakeholders
  • Encourage ongoing dialogue: Host regular webinars, forums or community events where developers, researchers and maintainers can discuss emerging challenges in AI code generation—ultimately shaping better standards and guidelines

Conclusion

IBM's Granite code generation models are designed to be safer through a combination of responsible AI practices, including training on trusted data, open sourcing the models and implementing safety guardrails. Furthermore, these models are trained on license-permissible data, collected following AI ethics principles and guided by the IBM Corporate Legal team for trustworthy enterprise usage.

Red Hat Enterprise Linux AI is a foundation model platform to consistently develop, test and run large language models (LLMs) to power enterprise applications. It integrates the open source-licensed Granite LLM family from IBM Research.

Lastly, using models like the IBM Granite code models to generate code presents great potential for both open source and enterprise developers. It speeds up innovation and reduces time to market. However, this new frontier also amplifies existing security, licensing and governance challenges. By blending robust security practices, thorough reviews, automated testing and transparent policies, companies and communities can mitigate the inherent risks while reaping the benefits of faster, smarter and more innovative development.


About the author

Huzaifa Sidhpurwala is a Senior Principal Product Security Engineer for AI security, safety and trustworthiness on the Red Hat Product Security team.

 