Shadow IT

How to roll-your-own SaaS discovery

Jacques Louw

Co-founder / CPO

May 2, 2022 | 12 min read

Identity security

We’ve compiled a list of methods for discovering SaaS use in your organization. In this guide, we’ll explore the pros and cons of each approach and introduce some novel ways to capture SaaS usage, discover unknown SaaS that employees are using, and determine how securely it is being used.

Over the past few years, there’s been massive growth in the number of SaaS apps used for work. With that comes new challenges – how do you allow employees to take advantage of all the SaaS the world has to offer without locking it all down and stifling innovation? How do you figure out if you can trust all these new third parties with access to your data? Well, the first step is figuring out which apps employees are actually using, so that’s where we’re starting.

Below are the options and approaches we’ve seen people take to SaaS discovery, each with its own pros and cons.

Why is SaaS discovery so hard?

Something to note straight off the bat is that with all the data-driven approaches we’re about to cover, you have to know how to extract SaaS use out of that data. That’s one of the reasons SaaS discovery is so hard. With the roll-your-own approaches in this post, you’ll be able to identify some common apps (like Trello, Slack, Dropbox, etc.), but what about all the new or lesser-known apps? Unfortunately, trying to keep track of all the SaaS apps that are available to employees is really difficult. There’s not really a great master list available on the Internet for you to cross-reference with your data.

That means that all of these roll-your-own approaches depend on you knowing what you’re looking for. If you must know what SaaS you’re looking for in order to determine if an asset is actually a SaaS app, you’re going to be left with quite a few blind spots, given that new apps seem to launch every day.

The second hurdle with a roll-your-own discovery approach is differentiating between SaaS access and SaaS usage. Just because an employee accesses a SaaS website doesn’t mean they’re using the app. Most of the data sources will produce a ton of domains, IPs, etc. for you to sift through, but differentiating access and usage based on this information alone will produce a large number of false positives unless you can correlate it with other data sources (we suggest some below). You will likely also want to know exactly who the users, owners, and administrators of the app are, which will be all but impossible from this “access” data alone.

Setting aside for the moment the difficulties in extracting information about SaaS usage, let’s run through your options for data sources and see which ones will give you the most useful data.

Collecting financial records

Looking through invoices can provide some visibility into paid SaaS apps, and it is probably the data source with the lowest false-positive rate. However, there are blind spots - you won’t see any free tier or trial accounts, nor will you get any useful business context about who’s using it, how they’re using it, if logins are secure, and what data it has access to. That said, it’s a quick and dirty way to get a partial view of SaaS usage, and might be the best place to start.
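As a rough illustration, here is a minimal Python sketch of that first pass over an expense export. The CSV layout (`vendor` and `amount` columns) and the `KNOWN_VENDORS` keyword list are assumptions made for the example, not a real finance-system format, and you would need to adapt both to your own data.

```python
import csv
import io

# Hypothetical mapping of invoice vendor keywords to SaaS app names.
KNOWN_VENDORS = {
    "slack": "Slack",
    "dropbox": "Dropbox",
    "trello": "Trello",
}

def find_paid_saas(csv_text):
    """Return {app_name: total_spend} for rows whose vendor matches a known keyword."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        vendor = row["vendor"].lower()
        for keyword, app in KNOWN_VENDORS.items():
            if keyword in vendor:
                totals[app] = totals.get(app, 0.0) + float(row["amount"])
                break  # one app per invoice line is enough
    return totals

# Made-up sample rows, including a card-statement style vendor string.
SAMPLE = """vendor,amount
Slack Technologies LLC,120.00
Office Supplies Inc,45.10
DROPBOX*TEAM,150.00
Slack Technologies LLC,120.00
"""

paid = find_paid_saas(SAMPLE)
```

Anything that is not on the keyword list is silently missed, which is the "you have to know what you’re looking for" problem in miniature.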

Network-level

Summary: Network-level data is the standard old-school approach. If you already have great network monitoring in place, it provides fairly broad visibility. There are some key limitations, though, especially around inferring usage from access and the lack of visibility outside the office.

SaaS apps are accessed over a network, so that seems like a sensible place to start looking for them. What if we just tried looking for all users accessing a SaaS app’s website? Let’s say we want to see if anyone is using Dropbox, so we do a Google search for all Dropbox domains and we find dropbox.com and a few regional domains as well. We then set about finding employees accessing those domains in our network logs - simple! Perhaps not so much…

As we mentioned in the intro, the best outcome you can hope for is to uncover SaaS access, not usage. This might seem like a subtle difference, but SaaS usage is what you want to find, not just information about which employees visited a SaaS website. If you’re looking at all app access, you’ll wind up with a massive list of SaaS, with only a portion of it indicating SaaS usage.

Since you can’t discover app usage with network data, you’d have to tie network traffic to a single employee to identify the user, then reach out to each employee to understand the business context of how they’re using the app. A network data approach can work if you have time to get that context by asking employees if they’re using the SaaS detected or by corroborating your findings with subscription invoices from the finance team.

A few ways to collect SaaS data at the network level are to ingest firewall, web proxy, DNS, and VPN logs. These inputs can give you some additional visibility into SaaS access, but you may still be left with significant blind spots to actual usage if you assume it all takes place on the corporate network or over a VPN. It’s also a painfully tedious process. That said, a manual process is still better than having no SaaS visibility at all.
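To make the log-matching idea concrete, here is a minimal Python sketch that counts lookups of known SaaS domains per client. The three-field log line format (`<timestamp> <client> <hostname>`) and the `SAAS_DOMAINS` list are assumptions for illustration, and real DNS or proxy logs will need their own parser.

```python
from collections import Counter

# Hypothetical list of SaaS domains to look for; extend it as you learn of more apps.
SAAS_DOMAINS = {
    "dropbox.com": "Dropbox",
    "slack.com": "Slack",
    "trello.com": "Trello",
}

def match_app(hostname):
    """Return the app whose domain equals the hostname or is a parent of it."""
    host = hostname.lower().rstrip(".")  # DNS logs often include a trailing dot
    for domain, app in SAAS_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return app
    return None

def count_access(log_lines):
    """Count (client, app) pairs from lines of the form '<timestamp> <client> <hostname>'."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed lines
        _, client, hostname = parts
        app = match_app(hostname)
        if app:
            counts[(client, app)] += 1
    return counts

# Made-up sample log lines.
LOG = [
    "2022-05-02T09:00:01Z 10.0.0.5 www.dropbox.com",
    "2022-05-02T09:00:02Z 10.0.0.5 notdropbox.com",
    "2022-05-02T09:00:03Z 10.0.0.7 files.slack.com.",
    "2022-05-02T09:00:04Z 10.0.0.5 dropbox.com",
]

access = count_access(LOG)
```

Matching on the full domain or a dot-prefixed suffix avoids counting lookalikes such as `notdropbox.com`. The result is still a count of access by client IP, not evidence of usage by a named employee.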

Endpoint-level

Summary: Endpoint data is hard to get, and of limited value. However, it may be useful if you already have this data available in a SIEM or if it’s otherwise easy to query.

Perhaps we’ll get closer to what we need (usage data instead of just access data and a low false positive rate) if we move up a level and get closer to the users? Users are going to be accessing the SaaS apps through some kind of endpoint and there are some things you could use to do discovery if you have some monitoring capability on that endpoint.

For example, many SaaS apps have desktop or mobile clients (thick clients) you install. You could look for e.g. the Slack client, or the OneDrive sync agent installed on the endpoint. However, many users prefer the in-browser version, so they may not have even installed the thick client and you wouldn’t see their usage by looking at their endpoint data.
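If your endpoint tooling can export installed software per machine, a check for thick clients could look something like the Python sketch below. The inventory shape (host name mapped to a list of installed application names) and the `THICK_CLIENTS` markers are assumptions for the example.

```python
# Hypothetical substrings that identify SaaS thick clients in a software inventory.
THICK_CLIENTS = {
    "slack": "Slack",
    "onedrive": "OneDrive",
    "dropbox": "Dropbox",
}

def saas_clients_by_host(inventory):
    """Map each host to the sorted SaaS apps whose client appears in its installed software."""
    found = {}
    for host, installed in inventory.items():
        apps = set()
        for name in installed:
            lowered = name.lower()
            for marker, app in THICK_CLIENTS.items():
                if marker in lowered:
                    apps.add(app)
        if apps:
            found[host] = sorted(apps)
    return found

# Made-up inventory export.
INVENTORY = {
    "laptop-01": ["Slack 4.25.0", "Microsoft OneDrive", "Firefox"],
    "laptop-02": ["Google Chrome"],
}

clients = saas_clients_by_host(INVENTORY)
```

Note that `laptop-02` produces no result even if its user spends all day in the browser versions of these apps, which is exactly the blind spot described above.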

All the good data, the application level data, is in the browser, which is technically on the endpoint but not really accessible through the endpoint without doing something very hacky. Perhaps we need to go a level deeper - either closer to the application or get inside the browser.

Application-level

Summary: Application level integrations are very useful for discovering unsanctioned SaaS apps that are integrated with the SaaS apps you already know about. But when used in isolation, they have massive blind spots. Application-level data is also a goldmine for finding out how securely employees use the app.

Focusing on the SaaS app directly makes a lot of sense if you need to get really high quality usage data. The challenge is that you need to integrate with the SaaS app to get at this data. And you can’t just integrate with an app like Slack or Trello. In general, these integrations must be within a specific account or tenant that your employees are using if you want to see any of their usage or security data. So, if you must already know about the tenant to discover the SaaS - is this approach useless for detecting unknown SaaS? Maybe, but there are some very useful edge cases.

For instance, integrations with SaaS apps that are known and sanctioned can be very useful, especially with those apps that are identity providers, like Microsoft Azure/365 and Google Workspace. Lots of SaaS apps let users log in with another SaaS app, which is called social login or sometimes single sign-on (SSO). When a user chooses “login using Google” on Salesforce with their corporate Google account, they are actually integrating (in a very limited way) Salesforce with Google Workspace. If you have application-level access (normally by calling the APIs) to known SaaS apps, you can discover these social logins, among other integrations with other SaaS apps. These SaaS-to-SaaS links then become very useful as a discovery mechanism.
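Once you have pulled the list of OAuth grants out of your identity provider, turning it into a discovery report is straightforward. The Python sketch below assumes the grants have already been exported into simple records with `user`, `app`, and `scopes` fields. Those field names are hypothetical rather than any provider’s actual API response, and the choice of which scopes count as sensitive is a policy assumption.

```python
from collections import defaultdict

# Assumption: scopes treated as high risk here; adjust to your own policy.
# "https://mail.google.com/" is the scope that grants full Gmail access.
SENSITIVE_SCOPES = {"https://mail.google.com/"}

def summarize_grants(grants):
    """Group OAuth grants by third-party app, listing users and flagging sensitive scopes."""
    apps = defaultdict(lambda: {"users": set(), "sensitive": False})
    for grant in grants:
        entry = apps[grant["app"]]
        entry["users"].add(grant["user"])
        if SENSITIVE_SCOPES & set(grant["scopes"]):
            entry["sensitive"] = True
    return dict(apps)

# Made-up grant records; "MailHelper" is a fictional app.
GRANTS = [
    {"user": "amy@example.com", "app": "Salesforce", "scopes": ["openid", "email"]},
    {"user": "bob@example.com", "app": "Salesforce", "scopes": ["openid"]},
    {"user": "bob@example.com", "app": "MailHelper", "scopes": ["https://mail.google.com/"]},
]

summary = summarize_grants(GRANTS)
```

Each key in the summary is a SaaS app you may not have known about, along with who connected it and whether it holds access you would want to review first.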

Something else to keep in mind: application-level access to known SaaS can also be incredibly useful for security beyond simple SaaS discovery. You could check authentication controls (like which users don’t have MFA enabled), sharing settings (perhaps the SaaS allows you to share documents publicly), unusual login events, other anomalous behavior, and so on.
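The MFA check, for example, reduces to a simple filter once you have a user listing from the app. This Python sketch assumes each user record carries hypothetical `email`, `mfa_enabled`, and `suspended` fields. Real admin APIs name these differently, so treat the field names as placeholders.

```python
def users_without_mfa(users):
    """Return sorted emails of non-suspended users that have no MFA enabled."""
    return sorted(
        user["email"]
        for user in users
        if not user.get("mfa_enabled", False) and not user.get("suspended", False)
    )

# Made-up user listing.
USERS = [
    {"email": "amy@example.com", "mfa_enabled": True},
    {"email": "bob@example.com", "mfa_enabled": False},
    {"email": "old@example.com", "mfa_enabled": False, "suspended": True},
    {"email": "cat@example.com"},  # missing field is treated as "no MFA"
]

missing_mfa = users_without_mfa(USERS)
```

Treating a missing field as "no MFA" errs on the side of flagging too many accounts rather than too few.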

Browser-level

Summary: Browser data is as good as you can get for SaaS discovery, but with the downside that you must build and deploy a browser extension to get at it.

What if I told you that you could get application-level usage data, beyond the events the applications expose through their APIs, without needing to know about the app first or fighting network encryption? The other methods in this guide allow you to get at the data using normal log processing techniques, SIEM queries, or even hacky scripts that call APIs, but for this kind of data there’s really only one reasonable option.

The only really viable way to get at this SaaS usage data is through a browser extension. The big hurdle with this approach is that you need to develop the extension and a backend where it can send data…AND you need to deploy that extension to all employees.

Deploying that browser extension might be as simple as setting the extension to default install itself in all managed browsers - that’s possible if you’re using Google Workspace. In other environments, it may be a bit more of a challenge. Fortunately, browser extensions don’t have the complexity of normal endpoint agents. They don’t have runtime dependencies, aren’t platform dependent, don’t need admin permissions to install, have automatic update mechanisms built-in, and don’t affect performance. At the end of the day, they’re just a special piece of JavaScript running in the browser.

If you are able to get access to the data in the browser (spoiler alert: we provide an easy - and free - out-of-the-box browser extension for SaaS discovery), there is almost limitless scope to what you can do with this data. You can observe not only access to SaaS websites, you can also see:

the user login,
whether that login was successful,
whether they used MFA to login,
which email they used to login,
whether they are the owner/administrator of the SaaS app tenant, and
all their behavior and settings in the app.

Best of all, there is no need to stream all this data to a single collection point where it becomes a privacy nightmare. By writing rules in the extension to look for specific issues, you can flag only security relevant events, redacted or anonymized as far as makes sense. You can even limit the scope to only monitor the app use when the employee logs into the SaaS app using their work account to further avoid employee privacy concerns.
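To show what such a rule might look like, here is a minimal sketch of the filtering logic. A real extension would be written in JavaScript; the logic is expressed in Python here only to keep the examples in this guide in one language. The observation fields (`app`, `email`, `used_mfa`), the `WORK_DOMAIN` value, and the event shape are all assumptions made for illustration.

```python
import hashlib

WORK_DOMAIN = "example.com"  # assumption: your corporate email domain

def to_security_event(observation):
    """Return a minimal, pseudonymized event for work-account logins without MFA, else None."""
    email = observation.get("email", "").lower()
    if not email.endswith("@" + WORK_DOMAIN):
        return None  # personal account: out of scope, nothing is reported
    if observation.get("used_mfa"):
        return None  # nothing security-relevant to flag
    return {
        "app": observation["app"],
        "issue": "login_without_mfa",
        # Report a truncated hash rather than the raw email address.
        "user_ref": hashlib.sha256(email.encode()).hexdigest()[:12],
    }

# Made-up observations of login activity.
personal = to_security_event({"app": "Trello", "email": "amy@gmail.com", "used_mfa": False})
secure = to_security_event({"app": "Trello", "email": "amy@example.com", "used_mfa": True})
flagged = to_security_event({"app": "Trello", "email": "Amy@Example.com", "used_mfa": False})
```

Only the third observation leaves the browser, and it carries an app name, an issue type, and a user reference instead of the full activity record. A hash of an email address is pseudonymous rather than truly anonymous, so how far to redact remains a judgment call.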

There’s a quick and easy solution to get the best out of the application and browser data approaches we’ve written about in the last two sections - and that’s with our free tool.

How can Push help?

We found that the most comprehensive approach is to collect data from both the application and browser level to give you full visibility and actionable security information. With our browser extension, we get the full breadth of coverage so you can discover all SaaS usage, and with our APIs, you get the depth of coverage you need to understand how employees are using SaaS and if they’re doing so securely. Our combined approach captures SaaS logins and adoption in real time and provides the best visibility and context for security teams.

Fixing SaaS security issues automatically by partnering with employees

What we then do with that data is where the magic happens… we can automatically guide employees via ChatOps (Slack and Teams for now, more to come!) to improve SaaS security. Some of those messages will help us enrich our data by asking employees questions they’ll actually know the answers to (“You logged into Slack from Mexico just now. Are you in Mexico?”), which provides you with a good snapshot of SaaS usage in your business and lets you make informed security decisions about SaaS use to better manage risks.

Employees can also make immediate improvements to your overall security posture. In case you’re curious about what that looks like, some of the prompts we push to employees are things like:

“We noticed this SaaS app you’re using has access to all your emails, are you still using it?” Y/N. If not, they can click a button to remove it and you’ll get an immediate reduction of your attack surface.

“It looks like you’re not using MFA for your account on this SaaS app. Can we get this set up really quickly?” or “An app you installed called ‘Dropbox’ is not the official Dropbox app, click here to remove it and install the verified app instead.”

If you’re interested in learning more, check out how we can help you discover SaaS use and secure it.

We’ll also be publishing a SaaS Discovery Evaluation Guide that will explore the off-the-shelf tools you may consider and help you evaluate which one is the best fit for your needs, as this really does depend on your tech stack. In that, we’ll share our experiences with those products and discuss what additional coverage and context they can provide, as well as where they fall short. Subscribe to our mailing list and follow us on Twitter @pushsecurity or LinkedIn to get a heads-up when that’s live so you can have a read.
