Exploring helpful uses for BERT in your browser with Tensorflow.js
March 13, 2020
Posted by Philip Bayer, Creative Technologist; Ping Yu, Software Engineer; and Jason Mayes, Developer Advocate

There’s a lot of exciting research happening now exploring helpful uses of BERT for language. We wondered: what if we made BERT even more accessible, right in your web browser? What possible uses might this enable?

It’s easy to ask Google a question like “How tall is the statue of liberty?” and get the answer (305 feet) from the web. However, there's no easy way to ask natural language questions of specific content - like a news article, research paper, or blog post. You can try using the find-in-page search feature of your browser (CTRL + F) but that relies on direct word matching. Wouldn’t it be easier to type a question, instead of a word to find, and have the page highlight the answer for you?

To explore this idea, we’ve prototyped a Chrome Extension using the MobileBERT Q&A model that allows us to ask a question of any webpage. Using TensorFlow.js, the extension returns an answer based on the contents of the page. The model runs entirely on-device in the browser session, so nothing is ever sent to a server, maintaining privacy.

It’s an early stage experiment, and we’re sharing our findings here in this post to illustrate how such applications can be built from the open-source TensorFlow.js BERT model. It was helpful to explore examples where we got the answer we wanted, and examples where we didn’t get exactly what we expected. It gave us a peek at both its potential and its current limitations. We hope seeing these examples will help everyone be part of the conversation, and enable anyone to think about ideas for how machine learning can be helpful for language.
Using the chrome extension, ask a question about the article and receive an answer.

Our findings

Here are some results we found where we got helpful answers:
It’s just as interesting to explore examples where the model didn’t return exactly the answer we expected. Here are a couple examples we found:
  • This product page — Asking “What is the pitcher made of?” Returned “Ice mode pulses at staggered intervals to uniformly crush a pitcher of ice in seconds” instead of the answer “BPA-free polycarbonate pitcher”
  • This article — Asking “Were the sharks real?” returned a text “sharks! sharks,” but asking a related question “How did the sharks work?” gave us the more helpful answer of “mechanical sharks often malfunctioned”

How does the machine learning model work?

The MobileBERT Q&A model can be used to build a system that can answer users’ questions in natural language. It was created using a pre-trained BERT model fine-tuned on SQuAD 1.1 (Stanford Question Answering Dataset). This is a new method of pre-training language representations which obtain state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. We are pleased to announce that this model is now available in TensorFlow.js for your own use. The MobileBERT model is a compact BERT variant which can be deployed to resource-limited devices.
The model takes a passage and a question as input, then returns a segment of the passage that most likely answers the question. Everything happens client side in the web browser, because we are using TensorFlow.js. This means privacy is protected and no text from the website you are analyzing is ever sent to any server for classification.

TensorFlow.js BERT API

Using the model is super easy. Take a look at the following code snippet:
<!-- Load TensorFlow.js. This is required to use the qna model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"> </script>
<!-- Load the qna model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/qna"> </script>

<!-- Place your code in the script tag below. You can also use an external .js file -->
<script>
  // Notice there is no 'import' statement. 'qna' and 'tf' is
  // available on the index-page because of the script tag above.
  // Load the model.
  qna.load().then(model => {
    model.findAnswers(question, passage).then(answers => {
      console.log('Answers: ', answers);
    });
  });
</script>
As you can see, the first two lines load the TensorFlow.js library and the Q&A (question and answer) model from our hosted scripts, so we can perform Q&A search. This only needs to be called once - the model will stay loaded whilst it is kept in memory. We can then repeatedly call findAnswers() to which we pass two strings. The first is the question the user wishes to ask, and the second is the text within which we want to search (for example the text on the page). We will then get back a results object of the following structure:
[
  {
    text: string,
    score: number,
    startIndex: number,
    endIndex: number
  }
]
You get back an array of objects representing parts of the passage that best answer the question, along with a score representing its confidence of it being correct. We also get the indexes of the answer text for easy locating where the answer text is in the context string. That’s all there is to it! With this data you can now highlight the found text, return some richer results, or whatever creative idea you may choose to implement.
If you would like to try the MobileBERT Q&A model yourself we are happy to share that it is now open sourced and can be found on our Github repository here. If you make something you are proud of, share it with the TensorFlow.js team on social using #MadeWithTFJS - we would love to see what you create.
Next post
Exploring helpful uses for BERT in your browser with Tensorflow.js

Posted by Philip Bayer, Creative Technologist; Ping Yu, Software Engineer; and Jason Mayes, Developer Advocate

There’s a lot of exciting research happening now exploring helpful uses of BERT for language. We wondered: what if we made BERT even more accessible, right in your web browser? What possible uses might this enable?

It’s easy to ask Google a question like “How tall is the statue of liber…