AI researchers launch SuperGLUE, a rigorous benchmark for language understanding

Facebook AI Research, together with Google’s DeepMind, University of Washington, and New York University, today introduced SuperGLUE, a series of benchmark tasks for measuring the performance of modern, high-performance language-understanding AI.

SuperGLUE was made on the premise that deep learning models for conversational AI have “hit a ceiling” and need greater challenges. It uses Google’s BERT as a model performance baseline. Considered state of the art in many regards in 2018, BERT has been surpassed this year by a number of models, such as Microsoft’s MT-DNN, Google’s XLNet, and Facebook’s RoBERTa, all of which are based in part on BERT and achieve performance above a human baseline average.

SuperGLUE is preceded by the General Language Understanding Evaluation (GLUE) benchmark for language understanding, introduced in April 2018 by researchers from NYU, University of Washington, and DeepMind. SuperGLUE's tasks are designed to be more complicated than GLUE's, and to encourage the building of models capable of grasping more complex or nuanced language.

GLUE assigns a model a numerical score based on performance on nine English sentence understanding tasks for NLU systems, such as the Stanford Sentiment Treebank (SST-2) for deriving sentiment from a data set of online movie reviews. RoBERTa currently ranks first on GLUE’s numerical score leaderboard with state-of-the-art performance on 4 of 9 GLUE tasks.
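
As a rough illustration of how a benchmark like GLUE can reduce nine per-task results to a single number, the sketch below takes an unweighted average of per-task scores. The task names are GLUE's nine tasks, but the scores are made-up placeholders rather than leaderboard results, the function name `overall_score` is hypothetical, and the official leaderboard's exact aggregation may differ in its details.

```python
from statistics import mean

# Placeholder per-task scores on a 0-100 scale.
# These numbers are illustrative only and are NOT real benchmark results.
task_scores = {
    "CoLA": 60.0,
    "SST-2": 95.0,
    "MRPC": 90.0,
    "STS-B": 90.0,
    "QQP": 75.0,
    "MNLI": 88.0,
    "QNLI": 93.0,
    "RTE": 80.0,
    "WNLI": 65.0,
}


def overall_score(scores):
    """Return the unweighted (macro) average of the per-task scores."""
    if not scores:
        raise ValueError("no task scores given")
    return mean(scores.values())


print(round(overall_score(task_scores), 1))
```

Because every task counts equally in an average like this, a model that leads on only some tasks can still rank first overall, which is consistent with RoBERTa topping the leaderboard while holding the best result on 4 of 9 tasks.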

“SuperGLUE comprises new ways to test creative approaches on a range of difficult NLP tasks focused on innovations in a number of core areas of machine learning, including sample-efficient, transfer, multitask, and self-supervised learning. To challenge researchers, we selected tasks that have varied formats, have more nuanced questions, have yet to be solved using state-of-the-art methods, and are easily solvable by people,” Facebook AI researchers said in a blog post today.

The new benchmark includes eight tasks that test a system’s ability to follow reasoning, recognize cause and effect, or answer yes-or-no questions after reading a short passage. SuperGLUE also contains Winogender, a gender bias detection tool. A SuperGLUE leaderboard will be posted online. Details about SuperGLUE can be read in a paper published on arXiv in May and revised in July.

“Current question answering systems are focused on trivia-type questions, such as whether jellyfish have a brain. This new challenge goes further by requiring machines to elaborate with in-depth answers to open-ended questions, such as ‘How do jellyfish function without a brain?’” the post reads.

To help researchers create robust language-understanding AI, NYU today also released an updated version of Jiant, a general-purpose text understanding toolkit. Built on PyTorch, Jiant comes configured to work with HuggingFace PyTorch implementations of BERT and OpenAI’s GPT, as well as the GLUE and SuperGLUE benchmarks. Jiant is maintained by the NYU Machine Learning for Language Lab.

In other recent NLP news, Nvidia shared on Tuesday that its GPUs achieved the fastest training and inference times for BERT, and that it trained the largest Transformer-based NLP model ever made, with 8.3 billion parameters.

Content sourced from TNW
