Should this be public though?🤔

Introduction

Every day we hear about the various types of vulnerabilities that affect companies. Companies are usually quick to fix these and protect the integrity of their customer data. But what happens when an employee decides to share something they should not? Beyond social engineering attacks, cases of employees publicly sharing sensitive information are on the rise. To understand how bad this could be, I decided to use my bug bounty time to target a company that accepts reports of such leaks when they pose a substantial security risk to the company or its customer data. This blog will detail my findings and the setup needed to monitor disclosures. I will also go into technical depth on how companies can automate their monitoring process to discover this kind of data leakage before a malicious party gets access to it.

This blog starts with details of what disclosures were found and where. It then moves to the more technical side of how to monitor this kind of exposure. With regard to automation, it covers two distinct approaches.

Note: Many pieces of information shared here are redacted with companies’, users’ and employees’ privacy in mind.

Research

Choosing common websites

Before looking for disclosures, it was important to work out where an employee is most likely to share this data. Password dumps usually come to mind when talking about disclosures, but they were not relevant to this research because I wanted to see where employees, rather than outside attackers, leak information. For this purpose, I started gathering intel on which sites employees frequently use, depending on their departments and projects. Some of the common websites are:

1. Prezi, Emaze or any presentation site.

2. Forums and question sites like stackoverflow.com, productforums.google.com.

3. Task management and bookmark apps: Trello (thanks to Kushagra for this tip; a link to his blog is in the Credits section below), papaly.com (personal bookmarks).

After choosing some bug bounty targets, I began to search each site for my targets to get a better understanding of how many disclosures were out there.

Testing locations

Initially, when I started to scan each of the possible leak locations, I used simple Google dorks to get results. One important thing that helped me get good results was knowing a company's internal domains. If a company had domainx.com as its internal domain, I would look for it with the intext: operator of a Google dork. For example, if I were looking in Trello, I would search for site:trello.com intext:domainx.com.
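The dork pattern described here can be generated for several sites at once. The snippet below is only a minimal sketch: the site list and the build_dorks helper are hypothetical names chosen for illustration and are not part of LeakFinder.

```python
# Hypothetical helper: the site list and function name are illustrative only.
LEAK_SITES = ["trello.com", "papaly.com", "prezi.com", "stackoverflow.com"]


def build_dorks(internal_domain, sites=LEAK_SITES):
    """Return one Google dork per site, e.g. site:trello.com intext:domainx.com."""
    return [f"site:{site} intext:{internal_domain}" for site in sites]


if __name__ == "__main__":
    # Print the dorks so they can be pasted into a Google search box.
    for dork in build_dorks("domainx.com"):
        print(dork)
```

Each printed line can be used directly as a search query, which keeps the manual phase consistent across sites.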

For most of the research, Google dorks were enough to surface results. However, it was essential to analyze what was disclosed to identify the impact accurately. Here is a breakdown of what I found:

  1. Valid employee credentials.
  2. Internal presentations that included information about internal support processes for specific cases.
  3. Employees' expense sheets for their visits and company trips. (Image below)
  4. Links to internal Google Docs and Sheets. Even though I could not access them, they gave away project information through their titles.
  5. Dashboard apps like Datadog that were designed for internal use.
  6. User details (emails, numbers, names, stats, etc.) that are tracked in internal sites.


Examples:

Sample leaked Excel sheet with flights and schedules.

Sample of Papaly board where user creds could leak.

Understanding these leaks

Just because I found user data or an employee credential does not mean that a company follows lousy cybersecurity practices. Unlike monitoring hardcoded credentials in company-owned code, it is hard to watch for credentials and other disclosures that employees make on public sites. As companies grow, departments get bigger and more employees are hired. The responsibility to prevent this does not fall on the company alone. Employees have to be more aware of what they share on public sites, because it could give an attacker the information needed to understand how an internal application works.

This issue is not limited to one company. At the time of writing, I have found multiple companies suffering from it. Some have accepted it as a risk, and some (like Uber) took it seriously and fixed the disclosures within 24 to 48 hours. It is not like a simple CSRF, XSS or even RCE that can be fixed by a code patch. Instead, it requires active monitoring and alerts to track and record any mention of internal sites on public sites and any publication of internal documents. If your company has an internal domain that is not public knowledge, it is essential to monitor it. Chances are that at some point an employee will leak information without realizing it. Next, I will discuss how I used this research to create an automated process to find disclosures faster.

Automating the Process (with code)

Google dorks can be tedious sometimes because it takes time to search for specific domains manually. Instead, I decided to automate the whole process once I got the hang of what I was looking for. I combined Google and other APIs to make an automation script that helps find these kinds of leaks.

In this section, I will cover the setup necessary for the script so that companies can use it to monitor their internal domains for disclosures.

Note: This will only produce valid results if you are monitoring private domains. For example, if you monitor google.com you will find a lot of false positives, whereas looking for corp.google.com will give you valid disclosures.

Setting up Google Custom Search Engine (CSE)

The script uses Google Custom Search Engine (CSE) to search for the user-provided strings. To create a CSE, go to https://cse.google.com/cse/create/new. Once there, for Sites to Search enter: trello.com/*, papaly.com/*, codepen.io/*, productforums.google.com/*, prezi.com/*. If you set up the links properly, this is what it should look like:

Once done, give your CSE a title and the setup will be complete. Next, we need the CSE's cx value to put in the script. To get it, go to https://cse.google.com/cse/all. Select your CSE and the setup page should load. On the setup page, select Public URL and click the link. Copy the value of the cx parameter from that URL and save it.
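If you prefer not to copy the cx value by hand, it can be read out of the public URL with the standard library. This is a small sketch that assumes the public URL carries the identifier as a cx query parameter; the extract_cx name and the example ID are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def extract_cx(public_url):
    """Return the value of the cx query parameter, or None if it is absent."""
    query = parse_qs(urlparse(public_url).query)
    values = query.get("cx")
    return values[0] if values else None


if __name__ == "__main__":
    # EXAMPLE_ID is a placeholder, not a real search engine ID.
    print(extract_cx("https://cse.google.com/cse?cx=EXAMPLE_ID"))
```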

Setting up Google API

To query results from the CSE you will need an API key. The API key lets you retrieve the data in JSON format. To get the API key, go to https://developers.google.com/custom-search/json-api/v1/overview and click on Get a Key. This will allow you to create a project or select a project that you already have. Once you select or create a project, it will generate an API key and display it. Copy and save that key. It will be necessary when setting up the LeakFinder script.
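To show how the API key and the cx value fit together, here is a rough sketch of a query against the Custom Search JSON API. It is not LeakFinder's actual code: the function names are hypothetical, and it assumes the usual response shape in which matches are returned under an items list with title and link fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cx, query, start=1):
    """Build the request URL from the API key, the CSE cx value and the search string."""
    params = {"key": api_key, "cx": cx, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_links(response):
    """Pull (title, link) pairs out of a decoded JSON response."""
    return [(item.get("title", ""), item["link"]) for item in response.get("items", [])]


def search(api_key, cx, query):
    """Perform one request (needs network access and valid credentials)."""
    with urlopen(build_search_url(api_key, cx, query)) as resp:
        return extract_links(json.load(resp))
```

A call such as search(API_KEY, CX, "uberinternal.com") would then return the first page of matches across the sites configured in the CSE.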

Setup Trello API & Auth

LeakFinder contains a script to find and analyze Trello boards, so it is essential to have a Trello API key. This allows it to find board names based on the retrieved card and board links. For the setup, we will generate a Trello API key and a Trello authentication token.

Trello API Key

Go to https://trello.com/app-key, and it should display the API key for the user account that is logged into Trello. Copy and save that API key.

Trello Auth Token

On the same page (https://trello.com/app-key), the website explains how to generate a token. Because we are generating it for ourselves, we can click the `token` hyperlink. This will load an authorization page. Once access is approved, it will display the auth token. Save the token.
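As an illustration of what the key and token are used for, the sketch below turns a Trello board or card link into a REST API URL that returns the board name. It assumes that Trello's API accepts the short ID from the link in place of the full ID; the helper name and regular expression are hypothetical and not taken from LeakFinder.

```python
import re
from urllib.parse import urlencode

# Matches board links (/b/<id>) and card links (/c/<id>) on trello.com.
TRELLO_LINK = re.compile(r"https?://trello\.com/(b|c)/([A-Za-z0-9]+)")


def trello_api_url(link, api_key, token):
    """Return the API URL for the board behind a link, or None for other URLs."""
    match = TRELLO_LINK.match(link)
    if not match:
        return None
    kind, short_id = match.groups()
    # For a card link, ask for the board that the card belongs to.
    path = f"boards/{short_id}" if kind == "b" else f"cards/{short_id}/board"
    params = urlencode({"key": api_key, "token": token, "fields": "name"})
    return f"https://api.trello.com/1/{path}?{params}"
```

Fetching the returned URL should yield a small JSON object whose name field is the board title, which is often enough to judge whether a result deserves a closer look.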

Once the setup is done, we are ready to run the code. To download LeakFinder, run git clone https://github.com/rojan-rijal/LeakFinder.git and then set up the code with the install.sh script:

chmod +x install.sh
sudo ./install.sh

Provide the necessary API keys and we should be ready to go.

Once that is done, do a test run of the script. To run it, type python3 main.py domain.com. For example, to run it against uberinternal.com, you would use python3 main.py uberinternal.com.

Companies can also create a simple monitor directly through https://www.google.com/alerts. I cannot guarantee the efficiency and success of this service because I do not use it frequently.


Important

It is important to understand that not every result will contain sensitive information. Always analyze the results you get to see what is being disclosed.
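A simple first pass can help decide which results to read first, although it does not replace manual review. The sketch below is an assumption on my part rather than something LeakFinder does: the keyword list and the triage function are hypothetical, and results are taken to be (title, snippet) pairs.

```python
# Hypothetical keyword list; tune it to what matters for your company.
SENSITIVE_HINTS = ("password", "passwd", "api_key", "token", "credential", "secret")


def triage(results, internal_domain):
    """Split (title, snippet) results into 'review first' and 'review later' titles."""
    urgent, later = [], []
    for title, snippet in results:
        text = f"{title} {snippet}".lower()
        mentions_domain = internal_domain.lower() in text
        looks_sensitive = any(hint in text for hint in SENSITIVE_HINTS)
        if mentions_domain and looks_sensitive:
            urgent.append(title)
        else:
            later.append(title)
    return urgent, later
```

Everything in the second list still needs a human look; the split only changes the order in which results are read.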

HackerOne Reports on these bugs

While testing my script, I decided to target some public programs on HackerOne to see if I could find any accidental disclosures made by employees. Some of the reports are slowly being publicly disclosed. I wanted to list them here and give props to the companies for fixing them:


1. Shopify: Kudos to Shopify for taking down the disclosure within 2 hrs of the report. View Report

2. Uber: Uber's reports have not been disclosed yet, but I want to congratulate them for being active and fixing disclosures fast. In one case, a disclosure was reported over the weekend and it was taken down within 48 hours.

Credits

Special shout-out and credit to Kushagra Pathak, who researched the Trello leaks. His blog post, which goes in depth about what he found on Trello, is at: https://medium.freecodecamp.org/discovering-the-hidden-mine-of-credentials-and-sensitive-information-8e5ccfef2724