Published on May 15th, 2017 | by Emergent Enterprise
Salesforce Created an Algorithm that Automatically Summarizes Text Using Machine Learning
E-E says: People don’t have the time or the patience to read long-winded emails or other documents. And everyone has that coworker who says in ten sentences what could be said in one. This algorithm could cut some of that text clutter by using machine learning to break down large blocks of text into bite-size summaries. It’s a deep learning executive summary for the common man. Would you take advantage of this technology if it were available? Share your thoughts below.
This year, people are expected to spend more than half their day reading e-mail, articles, or posts on social media, and it’s only going to get worse. To help solve this problem, researchers at Salesforce have developed an algorithm that uses machine learning to produce “surprisingly coherent and accurate” summaries, according to MIT Technology Review.
The other breakthrough concerns how the researchers train the system to learn and improve itself. They used two approaches: teacher forcing and reinforcement learning. Reinforcement learning is a method that draws inspiration from how animals learn, and it has been used by Google’s DeepMind to teach software how to play video games. In this instance, the model is allowed to generate a sequence of words, and the result is then scored with an automated evaluation metric known as ROUGE (Recall-Oriented Understudy for Gisting Evaluation). The algorithm updates itself to favor higher scores, leading to better outcomes with future summaries. In teacher forcing, the results are scored word by word against an established reference, which provides “very decent results” but doesn’t allow for much flexibility.
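To make the scoring step more concrete, here is a minimal sketch of a recall-oriented overlap score in the spirit of ROUGE-1 (unigram recall). It is a simplified illustration, not the official ROUGE toolkit or the researchers’ evaluation code: the function name `rouge1_recall`, the whitespace tokenization, and the lowercasing are all assumptions made for this example.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of the reference's words that also appear in the candidate.

    Simplified unigram recall: whitespace tokenization, lowercased,
    no stemming or stopword handling (assumptions for this sketch).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    # Clip each match so a word counts at most as often as it occurs
    # in the candidate summary.
    overlap = sum(min(count, cand_counts[word])
                  for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


reference = "the cat sat on the mat"
print(rouge1_recall("the cat sat", reference))   # 3 of 6 reference words matched
print(rouge1_recall("a dog barked", reference))  # no reference words matched
```

In a reinforcement learning setup like the one described above, a score of this kind would serve as the reward for a generated summary: the more of the human-written reference it recovers, the higher the reward.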
Researchers found that “ROUGE-optimized RL helps improve recall…and word level learning supervision ensures good language flow, making the summary more coherent and readable.” Using this scoring system, they found that their joint model, which combines the two training methods, scored higher than other approaches. Richard Socher, Salesforce’s chief scientist, noted that he didn’t think that he’d ever seen “such a large improvement in any [natural-language-processing] task.”
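One common way to build a joint model of this kind is to blend the two training signals into a single loss with a weighting factor. The sketch below assumes that form. The article does not give the researchers’ formula, so the function names, the baseline term, and the weight `gamma` are hypothetical and chosen only for illustration.

```python
import math


def teacher_forcing_loss(reference_word_probs):
    """Average negative log-likelihood of the reference summary's words.

    Each entry is the probability the model assigned to the correct
    next word of the reference, scored word by word.
    """
    return -sum(math.log(p) for p in reference_word_probs) / len(reference_word_probs)


def reinforcement_loss(sampled_word_probs, reward, baseline):
    """REINFORCE-style loss for one sampled summary.

    When the reward (e.g. a ROUGE score) beats the baseline, minimizing
    this loss raises the probability of the sampled words.
    """
    log_prob = sum(math.log(p) for p in sampled_word_probs)
    return -(reward - baseline) * log_prob


def joint_loss(ml_loss, rl_loss, gamma):
    """Weighted blend of both objectives; gamma is a hypothetical weight in [0, 1]."""
    return gamma * rl_loss + (1.0 - gamma) * ml_loss


ml = teacher_forcing_loss([0.5, 0.25])
rl = reinforcement_loss([0.5, 0.5], reward=0.6, baseline=0.4)
print(joint_loss(ml, rl, gamma=0.5))
```

With `gamma` near 1 the model is driven mostly by the summary-level score; with `gamma` near 0 it falls back to word-by-word supervision, which matches the trade-off between recall and language flow that the researchers describe.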
The results are pretty astonishing. The researchers provided several examples, each showing the original article, a human-generated summary, and a summary generated by their own model. In each case, the summaries are considerably shorter than the original text but contain the essentials in a readable form. Despite these advances, there’s still considerable work to be done in this field: MIT Technology Review spoke with Kristian Hammond, a professor at Northwestern University, who noted that the advance “shows the limits of relying purely on statistical machine learning,” but that it’s a step in the right direction.