AI and Machine Learning

Everyday AI: predicting the work you need to do, with the work you’ve already done

July 10, 2019
Mike Colagrosso

Software Engineer, Google Drive

If you’re curious about how ML works in Drive, we’ll go “under the hood” in this post.


Nearly all of us have felt overworked and under-resourced in the workplace. Imagine what we could accomplish if we had extra help with mundane tasks like replying to emails, organizing files or even finding relevant data. With G Suite, we aim to do just that by providing assistive features to help you stay productive. For example, in Google Drive we use a feature called Priority that’s powered by machine learning (ML) to find and surface documents for you. If you’re curious about how ML works in Drive, we’ll go “under the hood” in this post. 

Refresh: what is Priority in Drive?
Before we explain the technology, let’s recap what Priority is. In Drive, there are page options on the left side of the homepage to help you access documents: 1.) Priority, 2.) My Drive and 3.) shared drives. The Priority page is a spot in Drive that uses several different ML techniques to continually surface relevant files for you. It looks like this.

https://storage.googleapis.com/gweb-cloudblog-publish/images/Priority_in_Drive.1008058718881038.max-1700x1700.png

When you click through to the Priority page, you’ll see that it has two parts:

  1. Priority cards, located in the top half of the page. These cards continually surface important content via predictive machine learning models. Using signals like upcoming meetings or people you frequently work with, Drive suggests relevant documents, spreadsheets, presentations and more. It can also suggest associated actions to take. We’ll explain how these two things work in a bit.
  2. Workspaces, located in the bottom half of the page. This is where Drive suggests clusters of files for a project that may need attention, based on signals like a common topic (such as a codename) or team members. As time goes on, Drive offers new workspace suggestions and flags files that you might want to add to an existing workspace, so that you can keep your workspaces fresh.

First, how does Drive know what files to surface for you? 
Priority uses several different ML models, each with its own purpose, to decide which “cards” to surface for you. Let’s break them down by the information they help gather.

  • Using signals from other G Suite apps you use. One deep learning model we use in Drive is Quick Access, which we constantly update and retrain. This model collects signals from across G Suite to predict which files you will open next. For example, files attached to Gmail threads or to upcoming Calendar meetings are signals that improve Quick Access ranking. Naturally, so are repeatedly edited Docs, Sheets, and Slides. Check out this research paper, which breaks down the Quick Access model in more detail and includes information on its multi-layer, feed-forward neural network architecture.
  • Learning from collaboration patterns. To suggest files in Drive's "Shared with me" section (located on the left side of your homepage under “shared drives”), we launched a new model that bases its suggestions on who you collaborate with frequently. This model informs the Priority page, too, by using the graph of who you share your files with, who you work with in Docs, Sheets, and Slides, who you meet with in Calendar, and who you talk to in Gmail and Hangouts Chat. Compared with the file suggestions from Quick Access, the collaborator model is more robust because it’s informed by your most frequent interactions. That means that when deciding which comment to show, Priority will favor the comment from your #1 collaborator over one from your #2.
  • Registering important comments to identify files to suggest. In G Suite, you can comment on Docs, Sheets, Slides, and even Microsoft Office files, PDFs, and images. How often you comment can be a great indicator of which files are important. We built a comment model on top of the ML models above to rank your closest collaborators’ comments higher. This model also informs our suggested actions, which we’ll explain a little further down.
  • Defining your “working set” to predict what files are important in the near term. We built a deep learning model that estimates the likelihood that a file will appear in your working set, that is, the set of files you will need to do your job for the week. This model comes in handy for the “workspaces” section. It works similarly to the Quick Access model, but it filters out files that you haven’t recently edited. Also, it's trained on data collected over an entire week instead of a single visit to Drive.
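
To make the collaborator idea concrete, here is a minimal sketch of how comments could be ordered by how often you interact with their authors. It is an illustration only, not Drive's implementation: the interaction log, the names, and the functions `rank_collaborators` and `pick_comment` are all hypothetical, and a simple frequency count stands in for the learned model described above.

```python
from collections import Counter

# Hypothetical interaction log: (collaborator, source) pairs gathered from
# sharing, co-editing, meetings, and messages. Not a real Drive data format.
interactions = [
    ("ana", "docs"), ("ana", "calendar"), ("ana", "gmail"),
    ("ben", "docs"), ("ben", "chat"),
    ("cy", "gmail"),
]

def rank_collaborators(interactions):
    """Return collaborators ordered from most to least frequent interaction."""
    counts = Counter(person for person, _source in interactions)
    # Sort by count (descending), then by name so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [person for person, _count in ordered]

def pick_comment(comments, ranking):
    """Pick the comment whose author ranks highest; unknown authors rank last."""
    position = {person: i for i, person in enumerate(ranking)}
    return min(comments, key=lambda c: position.get(c["author"], len(ranking)))

ranking = rank_collaborators(interactions)
comments = [
    {"author": "ben", "file": "Q3 plan", "text": "Can you check the totals?"},
    {"author": "ana", "file": "Launch doc", "text": "Needs your sign-off."},
]
top = pick_comment(comments, ranking)
print(ranking, top["file"])
```

In this toy data the most frequent collaborator's comment is chosen even though it appears second in the list, which mirrors the "#1 collaborator over #2" behavior described in the bullet on collaboration patterns.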

With these multiple ML models split across Priority cards and Workspaces, Priority can optimize for both precision and recall to surface exactly the right files when you need them.
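
For readers who want the two terms pinned down: precision is the share of suggested files that were actually needed, and recall is the share of needed files that were suggested. The sketch below computes both for a made-up set of suggestions. The file names and the `precision_recall` helper are assumptions for illustration, not part of Drive.

```python
def precision_recall(suggested, relevant):
    """Compute precision and recall for a set of file suggestions."""
    suggested, relevant = set(suggested), set(relevant)
    hits = suggested & relevant  # files that were both suggested and needed
    precision = len(hits) / len(suggested) if suggested else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: four files suggested, five files actually needed.
suggested = ["budget.xlsx", "roadmap.doc", "notes.doc", "old_draft.doc"]
relevant = ["budget.xlsx", "roadmap.doc", "notes.doc", "slides.ppt", "spec.doc"]

p, r = precision_recall(suggested, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here three of the four suggestions were needed (precision 0.75) and three of the five needed files were suggested (recall 0.60). Showing fewer, surer suggestions tends to raise precision, while showing more tends to raise recall.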

Next, how does Drive know what actions to suggest? 
We know your work encompasses more than just opening the file you need, which is why Drive is smart enough to not only surface relevant files for you, but to also suggest actions for you to take. For example, it can provide links to help you respond to comments from the Priority page (without having to switch to the document itself) or even suggest files you might need to review before an upcoming meeting. This is all made possible via the ML comment model we built and described above.

https://storage.googleapis.com/gweb-cloudblog-publish/images/Drive_suggested_actions.max-900x900_V16oafB.png

With so many comment threads running through our documents, you’d expect them to be hard for ML models to track. Not for Drive. The advantage of being in the cloud is that we can aggregate these otherwise hard-to-find signals to make useful suggestions, something that isn't possible with on-premises or hybrid content management systems. And the benefits show: in internal analysis, Drive users respond to comments 10–15 minutes faster via Priority than via other methods, thanks to the help of ML.

Last, how does Drive intelligently organize workspaces? 
Drive's ML models, which we outlined above, help create workspaces so that you can access files faster, essentially providing suggestions based on clues from the files.

Let’s say that you just finished a working session with a colleague. Throughout the session you both shared multiple files with each other and started collaborating in real-time with Docs, Sheets, and Slides. Drive clusters these files using your content and the “working set” machine learning model to propose a collection of five files as shown below:

https://storage.googleapis.com/gweb-cloudblog-publish/images/drive_suggested_workspace.max-1000x1000_6n2JeYF.png
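
As a rough sketch of how such a proposal could be assembled, the code below filters files by a working-set score and by recent edits, then keeps the top five. Everything here is hypothetical: the `FileInfo` fields, the thresholds, and the `propose_workspace` function are assumptions, and a shared collaborator stands in for the content-based clustering that Drive actually performs.

```python
from dataclasses import dataclass

@dataclass
class FileInfo:
    name: str
    working_set_score: float   # hypothetical model output in [0, 1]
    days_since_edit: int
    collaborators: frozenset   # people the file is shared with

def propose_workspace(files, colleague, max_files=5, max_age_days=7, min_score=0.5):
    """Return up to max_files file names for a suggested workspace."""
    candidates = [
        f for f in files
        if f.days_since_edit <= max_age_days      # drop files not recently edited
        and f.working_set_score >= min_score      # keep likely working-set files
        and colleague in f.collaborators          # stand-in for content clustering
    ]
    # Highest working-set score first.
    candidates.sort(key=lambda f: f.working_set_score, reverse=True)
    return [f.name for f in candidates[:max_files]]

files = [
    FileInfo("Project brief", 0.92, 0, frozenset({"dana"})),
    FileInfo("Budget", 0.81, 1, frozenset({"dana", "eli"})),
    FileInfo("Timeline", 0.77, 0, frozenset({"dana"})),
    FileInfo("Old retro", 0.85, 40, frozenset({"dana"})),   # edited too long ago
    FileInfo("Team lunch", 0.30, 0, frozenset({"dana"})),   # low score
    FileInfo("Hiring plan", 0.88, 2, frozenset({"eli"})),   # different collaborator
]
workspace = propose_workspace(files, "dana")
print(workspace)
```

The recency filter reflects the description of the working-set model above, which leaves out files you haven’t recently edited. The cap of five simply matches the example in this post.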

To begin working in the workspace, you just need to accept it by clicking “Save.” When you accept a suggested workspace, you maintain full control over its name and other files you add to it.

But as you know, projects evolve, and so do their supporting files. In addition to intelligently clustering a collection of files to seed a workspace, Drive suggests additional files to add to a workspace to keep it up-to-date:

https://storage.googleapis.com/gweb-cloudblog-publish/images/Drive_suggests_additiona_.0956075616361357.max-1700x1700.png

Spending time on valuable work 
Altogether, machine learning has helped Drive users find the files they need up to 50 percent faster, which means they can spend their time doing more valuable work instead. IT admins can also spend less time tagging, organizing or categorizing content on the backend. That time adds up.

Learn more about how your business can use Drive.
