A small misunderstanding - big consequences
Most developers know the scenario: a GitHub repository that is supposed to remain private is accidentally made public - whether by a wrong click or a faulty setting. But what if its contents stay visible to the whole world even after the repository has been set back to private? Microsoft's AI tool Copilot now painfully shows how easily confidential information can be compromised, even when the data is no longer publicly accessible.
The discovery of the security vulnerability
Researchers from Lasso uncovered a serious problem that could put not only developers but also companies at risk. They found that Copilot, an AI tool from Microsoft, could read data from thousands of private repositories - even after those repositories had been made private again. The reason: while a repository was public, its contents were captured by crawlers and stored in caches such as Bing's. Even after the repository was set back to private, the cached data remained accessible via Copilot.
Hidden dangers for companies
What initially seems like a minor mistake can become a major problem for companies and developers. Private repositories contain not only code and documentation, but often also sensitive company data such as access keys, tokens and intellectual property. If this information is inadvertently disclosed, it can have serious consequences - especially if it is still accessible via Copilot. And it's not just small development teams that are affected: large companies such as Google, IBM, PayPal and even Microsoft itself are among those affected.
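To make the risk concrete, here is a minimal sketch of how a team might check text from a repository for the kinds of secrets mentioned above before it is ever made public. The pattern set and the function name find_secrets are assumptions made for this illustration, not part of any tool named in this article. The patterns cover only a few commonly documented credential formats; dedicated secret scanners use far larger rule sets.

```python
import re

# Illustrative patterns only (an assumption for this sketch); real secret
# scanners ship with far more rules than these three.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for lines matching a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


if __name__ == "__main__":
    # Fake, obviously invalid credential used purely as sample input.
    sample = "name = 'demo'\naws_key = 'AKIA" + "A" * 16 + "'\n"
    for line_number, name in find_secrets(sample):
        print(f"line {line_number}: possible {name}")
```

If such a check turns up a credential in a repository that was public even briefly, the safer assumption is that it has been copied: revoke and rotate it rather than rely on setting the repository back to private.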
Microsoft's reaction: a sluggish response?
Microsoft, the company behind Copilot, was alerted to the problem by the researchers back in November 2024. However, instead of taking immediate action, the company initially classified it as a "minor" problem. Although Microsoft removed the cache links from Bing a month later, the data remains accessible via Copilot. The question remains: has Microsoft really grasped the scope of this problem?
"Minor" is a dangerous term in the world of data
It is shocking how little weight Microsoft has attached to this incident, even though it is hugely relevant at a time when data protection is becoming increasingly important. Companies that depend on keeping information confidential must remain especially vigilant, because a small mistake here can quickly lead to a major disaster. Technology may be advanced, but it can only earn trust if it is backed by serious security precautions. Developers should therefore keep an eye not only on the access rights of their repositories, but also on the long-term consequences of unintentionally disclosed data.