News

A highly efficient, real-time text to speech system deployed on CPUs

Modern text-to-speech (TTS) systems have come a long way in using neural networks to mimic the nuances of the human voice. To generate humanlike audio, one second of speech can require a TTS system to output as many as 24,000 samples, and sometimes even more. The size and complexity of state-of-the-art models require massive computation, which often needs to run on GPUs or other specialized hardware. At Facebook, our long-term goal is to deliver high-quality, efficient voices to the billions of people in our community.
Read more

The Hateful Memes AI Challenge

We’ve built and are now sharing a data set designed specifically to help AI researchers develop new systems to identify multimodal hate speech. This content combines different modalities, such as text and images, making it difficult for machines to understand. The Hateful Memes data set contains 10,000+ new multimodal examples created by Facebook AI. We licensed images from Getty Images so that researchers can use the data set to support their work.
Read more

101 Cognitive Biases & Principles That Affect Your UX

A complete list of cognitive biases and design principles with tons of examples, checklists and quizzes to help you improve your user experience. Source: growth.design

Ultimate Guide to Natural Language Processing Courses

Selecting an online course that matches your requirements is frustrating if you have high standards. Most courses are not comprehensive, and much of the time spent on them is wasted. How would you feel if someone gave you a critical path, telling you exactly which modules, and in which order, would provide comprehensive, expert-level knowledge? Awesome. That is why I am going to help you with this guide to selecting a Natural Language Processing course, drawing on my eight years of practical experience in Machine Learning.
Read more

Python 3.9 – The Shape of Things to Come

Python 3.9 is scheduled for release on 5th October 2020. There is still a long way to go until then; however, with the latest alpha, 3.9.0a6, released last week and the beta version just around the corner, we can already discuss the new features. In this article, we explore some features that we have found interesting and that we think will make our code cleaner. As usual with each Python release, there are several big topics and a couple of smaller features.
Read more
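Two of the additions slated for 3.9 can be sketched briefly: the dictionary union operators from PEP 584 and the new string prefix/suffix methods from PEP 616. A minimal sketch (requires a 3.9 alpha/beta interpreter or later):

```python
# PEP 584: dictionary union operators | and |=
defaults = {"theme": "light", "lang": "en"}
overrides = {"lang": "fr"}
config = defaults | overrides  # right operand wins on key collisions
print(config)  # {'theme': 'light', 'lang': 'fr'}

# PEP 616: str.removeprefix() / str.removesuffix()
filename = "report_2020.csv"
print(filename.removeprefix("report_"))  # 2020.csv
print(filename.removesuffix(".csv"))     # report_2020
```

Unlike `dict.update()`, the `|` operator builds a new dictionary and leaves both operands untouched.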

Word2Vec: A Comparison Between CBOW, SkipGram & SkipGramSI

In this article, we will look at how the different neural network architectures for training a Word2Vec model behave in practice. The idea is to help you make an informed decision on which architecture to use given the problem you are trying to solve.
Read more
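The core difference between the two classic architectures is how they frame the prediction task over a context window. The following is an illustrative sketch (not gensim's actual internals): CBOW predicts the center word from its context, while skip-gram predicts each context word from the center word.

```python
def cbow_pairs(tokens, window=2):
    """CBOW: one (context, target) pair per position."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (target, context_word) pair per context word."""
    pairs = []
    for i, target in enumerate(tokens):
        for ctx in tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]:
            pairs.append((target, ctx))
    return pairs

sentence = "the cat sat on the mat".split()
print(cbow_pairs(sentence, window=1)[1])      # (['the', 'sat'], 'cat')
print(skipgram_pairs(sentence, window=1)[:2]) # [('the', 'cat'), ('cat', 'the')]
```

Because skip-gram emits one training pair per context word, it sees rare words more often, which is one reason it tends to represent infrequent words better than CBOW.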

The Dark Secrets Of BERT

BERT stands for Bidirectional Encoder Representations from Transformers. This model is basically a multi-layer bidirectional Transformer encoder (Devlin, Chang, Lee, & Toutanova, 2019), and there are multiple excellent guides about how it works generally, including the Illustrated Transformer. What we focus on is one specific component of the Transformer architecture known as self-attention. In a nutshell, it is a way to weigh the components of the input and output sequences so as to model relations between them, even long-distance dependencies.
Read more
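The weighing mechanism described above can be sketched as scaled dot-product attention. This is a toy, single-head version in plain Python with no learned projection matrices, just to show how every position's output becomes a similarity-weighted mix of all positions:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over toy vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # attention weights sum to 1
        # output = weighted average of all value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Three toy token embeddings; queries = keys = values is what makes it
# *self*-attention: each token attends over the whole sequence, so even
# distant positions can influence each other directly.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(x, x, x)
```

In BERT itself, the queries, keys, and values are learned linear projections of the token embeddings, and many such heads run in parallel per layer.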

10 Best Machine Learning Textbooks that All Data Scientists Should Read

Machine learning is an intimidating topic to tackle for the first time. The term encompasses so many fields, research topics and business use cases, that it can be difficult to even know where to start. To combat this, it’s often a good idea to turn to textbooks that will introduce you to the basic principles of your new field of research. This holds true for AI and machine learning, especially if you have a background in statistics or programming.
Read more

The Best NLP Papers From ICLR 2020

I went through 687 papers that were accepted to ICLR 2020 virtual conference (out of 2594 submitted – up 63% since 2019!) and identified 9 papers with the potential to advance the use of deep learning NLP models in everyday use cases. Source: topbots.com

A Hacker’s Guide to Efficiently Train Deep Learning Models

Three months ago, I participated in a data science challenge that took place at my company. The goal was to help a marine researcher better identify whales based on the appearance of their flukes. More specifically, we were asked to predict, for each image in a test set, the top 20 most similar images from the full database (train + test). This was not a standard classification task. I spent three months prototyping and ended up third on the final (private) leaderboard.
Read more
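The retrieval task described above reduces to ranking every other image by similarity in some embedding space. As a minimal sketch (the image ids and 2-d embeddings below are hypothetical; the real challenge would use features extracted from a trained network):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_similar(query_id, embeddings, k=20):
    """Return the ids of the k images most similar to the query image."""
    q = embeddings[query_id]
    scored = [(other, cosine(q, emb))
              for other, emb in embeddings.items() if other != query_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [img for img, _ in scored[:k]]

# Hypothetical fluke embeddings keyed by image id.
embeddings = {
    "whale_a": [0.9, 0.1],
    "whale_b": [0.8, 0.2],
    "whale_c": [0.1, 0.9],
}
print(top_k_similar("whale_a", embeddings, k=2))  # ['whale_b', 'whale_c']
```

With k = 20 and the full train + test database as the candidate pool, this is exactly the shape of the submission the challenge asked for; the hard part, of course, is learning embeddings where same-whale flukes actually land close together.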