Navigating New Application Security Challenges Posed By GenAI

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs - .



GenAI applications—software powered by large language models (LLMs) are changing the way we interact with digital platforms. These advanced applications are designed to understand, interpret, and generate human-like text, code and various forms of media, making our digital experiences more seamless and personalized than ever before. With the increasing availability of LLMs, we can expect to see even more innovative applications of this technology in the future. However, it is important to carefully consider the potential security challenges related to GenAI applications and the underline LLM.


While security challenges in machine learning models have been studied for some time, such as the potential for adversarial attacks where input data can be manipulated to mislead the model. However, challenges specific to LLMs are still relatively unexplored and pose a blind spot for researchers and practitioners. LLMs are distinct from other software tools and machine learning elements in terms of their functionality, the way GenAI applications employ them, and the way users engage with them. For these reasons, for the development and use of GenAI applications, it's crucial to implement GenAI security best practices, Zero Trust architecture, posture management solutions, and conduct red team exercises. Microsoft is at the forefront of not only deploying GenAI applications but also ensuring the security of these applications, their related data, and their users.


In this blog post, we will review the unique cybersecurity challenges that GenAI apps and the underline LLMs pose according to their special behavior, their unique use, and their interaction with the users.


LLMs are versatile, probabilistic and black box

The unique properties of LLMs that empower GenAI apps make them more vulnerable compared to other systems and poses a new challenge to main conventional cybersecurity practices. Especially the fact that LLMs are versatile, generate responses using probabilities and, to a great extent, still a black box with inner workings that are largely untraceable.


LLMs are versatile

Typical software components handle precise and bounded inputs and outputs, but LLMs engage with users and applications through natural language, code, and various forms of media. This unique capability enables the LLM to manage multiple tasks and process a diverse array of inputs. However, this distinct feature also opens the door to creative malicious prompt injections.


LLMs use probabilities to generate response

A significant factor enhancing the user experience with LLMs is their use of probabilistic methods to create responses. The nature of LLM outputs is inherently variable, subject to change with each iteration. Consequently, the same inquiry can prompt a range of different responses from an LLM. Moreover, for tackling complex challenges, LLMs might apply reasoning techniques like Chain of Thought Prompting, which breaks down intricate tasks into smaller, more manageable sub-tasks. which means it can result in varying steps, conclusions and result for the same task.


LLMs are still a black box

Despite being a highly active field of study, LLMs remain black box in several respects. The processes by which they produce responses and the information they gather throughout their training and finetuning phases are not entirely comprehensible. As an illustration, researchers have adeptly inserted a backdoor that instructs the LLM to produce benign code until triggered by a specific input indicating the year is 2024, at which point the LLM begins to produce code with vulnerabilities. This backdoor endures even through the safety training stage and remains undetected by users until activated.


From the cybersecurity perspective, those unique characteristics that make LLMs incredibly powerful also render them more susceptible to vulnerabilities. Additionally, these attributes complicate the processes of monitoring, validating, and restricting LLMs. The vast, unbounded, and non-deterministic nature of LLMs' inputs and outputs presents significant challenges in terms of surveillance, anomaly detection, and validation. This raises critical questions about how, for example, one can effectively identify and mitigate undesired behaviors in entities with such extensive and unpredictable usage patterns.


GenAI applications enable high connectivity and autonomy for LLMs

To fully leverage the potential of LLMs, they are usually integrated into a GenAI application that provides specific logic, broader functionality, and access to various data . GenAI applications enhance the capabilities of LLMs and increase their value to users. However, they also widen the potential targets for exploitation, vulnerabilities, and the impact that attackers might exploit.


GenAI apps have broad attack Surface

GenAI applications allow LLM to interact with users, plugins, and data sources. Some plugins could be crafted by untrusted sources or be exposed to the internet. Data sources may be public and can be initialized or poisoned by attackers. Considering that LLMs are designed to follow natural language instructions each of these entry points can be used by an attacker to compromise it. One main concern is Jailbreak the LLM, when the attacker bypasses the LLM restrictions using direct or indirect prompt injection. An attacker can craft malicious instructions and inject them into the application directly through the user interface or indirectly using the many data sources that the application has access to as a websites, emails or documents. Moreover, since the LLM itself has significant value, the broad attack surface opens the door to different kinds of model thefts in which the attacker is trying to steal the knowledge that the underlying LLM has acquired.


GenAI apps enable high LLM autonomy and functionality

GenAI applications employ LLMs to comprehend user queries, identify ensuing actions, access data sources, and activate plugins. Certain operations carry significant consequences, such as executing code, erasing or leaking sensitive data outside the organization. For instance, an attacker obtained a remote code execution using GenAI app, during this attack the attacker successfully leaked sensitive credentials and used them for its own purposes. Another example is an attacker successfully manipulate a Chat With Code plugin to disclose private GitHub repositories without obtaining explicit consent from the user. Those attacks are allowed by the structure and the wide functionality of GenAI applications.


The extensive array of capabilities, diverse information inputs, and significant autonomy empower LLMs to tackle complex assignments without needing user direction. Nevertheless, this also makes them vulnerable to numerous risks and intensifies the consequences of cybersecurity breaches. A primary and effective method to alleviate these concerns is the implementation of zero-trust architecture during the design phase of GenAI apps.


GenAI applications use natural language and enjoy the trust of the end-user

A key element in the expansion of GenAI is its user-friendliness for individuals without technical expertise, and the capacity of LLMs to interpret natural language directives and engage in human-like dialogues. In terms of cybersecurity, this feature influences both sides – the underline LLM and the user. The LLM is programmed to follow natural language commands, and the user regards the GenAI app and the underline LLM as a dependable source of information.


GenAI apps interact with the users using natural language

In contrast to traditional systems where an attacker must navigate through various information sources and execute multiple phases of an attack to access sensitive data, with LLMs, an attacker who gains access to an internal GenAI application or manages to perform a direct or indirect prompt injection often only needs to simply request the information from the LLM in natural language, and the LLM will provide the necessary details, effectively streamlining the process for the attacker. Furthermore, attackers might deceive the LLM into performing tasks that consume computing resources, leading to denial of wallet or service attacks. For instance, an attacker can create a malicious prompt, causing the LLM to execute a significant number of actions or calculations.


Users perceive GenAI apps as a reliable information source

On the other hand, the convenient interaction expands the user base to include non-technical individuals who depend on the application outputs for their work and other interests. This opens opportunities for phishing attacks through GenAI applications. Attackers could exploit LLMs to steer users towards their domains or use hallucinated LLM-generated content to trick users into downloading or engaging with malicious material. Users often perceive the LLM as a reliable source of information, which means that in the event of a compromise, it would be easy for attackers to mislead them.


Microsoft is Leading the GenAI Security

Applications of GenAI hold immense possibilities, yet they face significant challenges in cybersecurity. LLMs are distinct from other software tools and machine learning elements in terms of their functionality, the way GenAI applications employ them, and the way users engage with them. Therefore, it is important to implement security best practices, Zero Trust architecture, posture management, and collaborate with professionals who are well-versed in security and GenAI. As a leader in deploying GenAI, Microsoft is at the forefront of integrating artificial intelligence with generative capabilities, paving the way for innovative applications across various industries. Microsoft’s approach to GenAI is not only focused on the development and deployment of these advanced technologies but also on the stringent security measures that accompany them.


Thanks for reading!

Leave a Reply

Your email address will not be published. Required fields are marked *


This site uses Akismet to reduce spam. Learn how your comment data is processed.