Overview
In 2023, it came to light that Microsoft AI researchers had inadvertently exposed 38TB of sensitive internal data while publishing open-source training materials on GitHub. The exposure was discovered and reported to Microsoft in June 2023. The data included private keys, passwords, internal Teams messages, and backups of two employee workstations. The cause was a misconfigured Azure Shared Access Signature (SAS) token used to share files. Instead of limiting access to the intended training datasets, the token opened the entire storage account to anyone who had the link.
What are SAS Tokens?
In Azure, a Shared Access Signature (SAS) token is a mechanism for granting temporary, customizable access to Azure Storage resources via a signed URL. Users can define the token’s permissions (such as read, write, or full control), its scope (a single file, a container, or an entire storage account), and its expiration time. While this flexibility enables tailored access control, it also introduces significant risks. A misconfigured SAS token can grant overly permissive access, such as full control over an entire storage account with an expiry set so far in the future that it never effectively lapses. Such a token acts much like the account key itself, and it undermines security if it is inadvertently exposed or mishandled.
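To make the three settings concrete, here is a minimal sketch that reads them back out of a SAS URL's query string using only the Python standard library. The field names (`sp` for permissions, `se` for expiry, `ss`/`srt` for an account SAS, `sr` for a service SAS) follow Azure's documented query parameters; the URL, the `describe_sas_url` helper, and the `SasSummary` type are made up for illustration, and the permission table covers only the most common codes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional
from urllib.parse import parse_qs, urlsplit

# Common single-letter codes in the `sp` (signed permissions) field.
PERMISSION_NAMES = {
    "r": "read", "w": "write", "d": "delete", "l": "list",
    "a": "add", "c": "create", "u": "update", "p": "process",
}

# Values of `sr` (signed resource) on a service-level SAS (subset).
RESOURCE_NAMES = {"b": "blob", "c": "container"}


@dataclass
class SasSummary:  # hypothetical helper type, not part of any Azure SDK
    permissions: List[str]
    expiry: Optional[datetime]
    scope: str


def parse_expiry(value: Optional[str]) -> Optional[datetime]:
    """Parse the `se` field, treating values without an offset as UTC."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def describe_sas_url(url: str) -> SasSummary:
    query = parse_qs(urlsplit(url).query)

    def first(key: str) -> Optional[str]:
        return query.get(key, [None])[0]

    # An account SAS carries `ss` (services) and `srt` (resource types).
    if first("ss") and first("srt"):
        scope = "account"
    else:
        scope = RESOURCE_NAMES.get(first("sr") or "", "unknown")

    permissions = [PERMISSION_NAMES.get(code, code) for code in first("sp") or ""]
    return SasSummary(permissions, parse_expiry(first("se")), scope)


# Placeholder URL with a fake signature; it is not the token from the incident.
EXAMPLE_URL = (
    "https://example.blob.core.windows.net/data"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2051-10-06T00:00:00Z&sig=FAKE"
)
summary = describe_sas_url(EXAMPLE_URL)
print(summary.scope, summary.permissions, summary.expiry)
```

Everything that determines what the link can do is visible in the URL itself, which is why a quick read of these fields before sharing a link is worthwhile.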
What Happened?
As part of their AI research, Microsoft researchers shared open-source training materials through a public GitHub repository, with a download link that pointed to Azure Storage. However, the Azure SAS token embedded in that link granted full access to the entire storage account rather than to the intended files. This exposed 38TB of data, including:
- Internal Microsoft system backups.
- Credentials and passwords.
- Sensitive personal messages from Teams conversations.
- Proprietary development tools and information.
This misstep left the storage account exposed for a prolonged period. Microsoft stated that it found no evidence of external exploitation, but the incident underscored how vulnerable sensitive data is to cloud misconfigurations.
Misuse of SAS Tokens
The root cause was an improperly configured SAS token. The exposed token was an Account SAS, which provided full access to everything in the linked storage account instead of restricting access to the specific dataset intended for sharing.
Key characteristics of SAS tokens contributed to the breach:
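- No Authentication Required: A SAS URL works as a bearer credential. Anyone who holds it gets the access it encodes without signing in through Azure Active Directory (AAD, now Microsoft Entra ID), which makes a misconfigured token inherently risky.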
- Broad Permissions: The token granted read, write, and delete permissions across the storage account.
- Effectively No Expiration: The token’s expiry was set decades in the future (2051), so its exposure remained a persistent risk.
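The characteristics above can be checked mechanically before a link is published. The following is a rough sketch, not an official tool: `audit_sas_url` is a hypothetical helper, the finding labels are invented for this example, and the seven-day lifetime threshold is an arbitrary assumption that a team would tune to its own policy.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return a list of risk labels for a SAS URL (empty list means none found)."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(url).query)
    findings = []

    # `ss` plus `srt` marks an account SAS, which spans the whole account.
    if "ss" in query and "srt" in query:
        findings.append("account-scope")

    # `w` (write) and `d` (delete) let the holder alter or remove data.
    permissions = query.get("sp", [""])[0]
    if any(code in permissions for code in "wd"):
        findings.append("write-or-delete")

    expiry_text = query.get("se", [""])[0]
    if not expiry_text:
        # Expiry may instead come from a stored access policy on the container.
        findings.append("no-expiry-in-url")
    else:
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_lifetime:
            findings.append("long-lived")

    return findings


# Placeholder URL with a fake signature, shaped like an over-permissive token.
risky_url = (
    "https://example.blob.core.windows.net/"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2051-10-06T00:00:00Z&sig=FAKE"
)
for finding in audit_sas_url(risky_url):
    print("risk:", finding)
```

A check like this only inspects what the URL declares. It cannot tell whether the link has already been shared or used, so it complements logging rather than replacing it.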
Lessons for the Tech Community
This incident provides critical lessons for organizations leveraging cloud storage and emphasizes the need for rigorous cloud security practices:
- Enable Logging and Monitoring: Use tools like Azure Monitor and Storage Logs to track access and identify anomalies.
- Conduct Regular Audits: Perform frequent reviews of configurations to ensure they align with security best practices and rectify any identified weaknesses.
- Restrict Credentials Storage: Do not store sensitive information, such as passwords or API keys, in shared or publicly accessible locations.
- Encrypt and Isolate Data: Secure sensitive data with encryption and separate public resources from private ones within cloud environments.
- Minimize Token Permissions: Use tokens with task-specific, minimal access rights to reduce exposure.
- Limit Usage of Account SAS Tokens: Rely on more secure alternatives, such as Service SAS or User Delegation SAS, and reserve Account SAS tokens for rare scenarios.
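One way to act on the audit and credential-storage points is to scan files for SAS-style links before they are pushed to a public repository. The sketch below is a simple heuristic, not a complete secret scanner: `find_sas_urls` is a hypothetical helper, and it assumes that a URL on a `*.core.windows.net` host carrying both `sv` and `sig` query parameters is a SAS link worth reviewing.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Match https URLs up to the first whitespace, quote, or bracket character.
URL_PATTERN = re.compile(r"https://[^\s\"'<>()\[\]]+")


def find_sas_urls(text):
    """Return URLs in `text` that look like Azure Storage SAS links."""
    hits = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        host = parts.hostname or ""
        # Heuristic: Azure Storage host plus a signed version and signature.
        if host.endswith(".core.windows.net") and "sv" in query and "sig" in query:
            hits.append(url)
    return hits


# Made-up README content with a fake signature.
readme = """
Download the model:
https://example.blob.core.windows.net/models/weights.bin?sv=2020-08-04&sp=r&sig=FAKE
Project page: https://example.com/docs
"""
for url in find_sas_urls(readme):
    print("review before publishing:", url)
```

Running such a check in a pre-commit hook or CI job would flag the link for a human to review; deciding whether the token's scope and lifetime are acceptable still requires looking at its fields.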
Conclusion
The Microsoft data exposure is a stark reminder of how simple misconfigurations can lead to large-scale security breaches, even in organizations with robust resources. By prioritizing the principle of least privilege, proactive monitoring, and continuous training, organizations can safeguard their data and build resilience against evolving cloud security challenges. Integrating these lessons into daily operations can help mitigate risks and ensure a more secure digital environment.