What can we learn about the president’s tweets together?

tl;dr

We’re running a community-driven project to label Donald Trump’s tweets. Labeling these tweets will give the world more insight into what Trump is saying and how different people interpret what he says, and it will produce a new dataset for machine learning.

Participation is open (and encouraged) to everyone and the data is publicly available here.

The Project

Starting today, we’ll be hosting President Trump’s tweets on our public annotation server. Anyone can sign up, log in and contribute labels for the tweets.

Labeling Trump Tweets

Each time a member of the community logs in, they’ll be shown a tweet to label, along with the possible labels for each word (such as Person, Country or Insult). The same tweet will be shown to multiple members of the community. The resulting dataset will tell us a lot about what the President is saying as well as how we perceive it.

Trump tweets with entity annotations

Through that process, all of us will learn more about the President as well as how we understand him. The resulting dataset will be available to all participants as it evolves, and a weekly snapshot will be put online for anyone to use.
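To make the idea concrete, here is a minimal sketch of how word-level labels from several community members could be combined. It assumes one label per word per annotator, with "O" meaning no label. The example words, the annotator names and the `summarize` helper are all made up for illustration; they are not taken from a real tweet or from the actual export format of the project.

```python
from collections import Counter

# Invented example words (not a real tweet) and invented annotators.
# Each annotator gives one label per word; "O" means "no label".
tokens = ["Sleepy", "Bob", "visited", "Canada"]
annotations = {
    "annotator_a": ["Insult", "Person", "O", "Country"],
    "annotator_b": ["Insult", "Person", "O", "Country"],
    "annotator_c": ["O", "Person", "O", "Country"],
}


def summarize(tokens, annotations):
    """Return (word, majority_label, agreement) for each word.

    agreement is the fraction of annotators who chose the majority label.
    """
    rows = []
    for i, token in enumerate(tokens):
        votes = Counter(labels[i] for labels in annotations.values())
        label, count = votes.most_common(1)[0]
        rows.append((token, label, count / len(annotations)))
    return rows


for token, label, agreement in summarize(tokens, annotations):
    print(f"{token:10s} {label:8s} {agreement:.2f}")
```

Words where agreement is 1.0 are ones everybody read the same way; lower values point at the places where people disagree, which is often the more interesting part. Ties are broken arbitrarily in this sketch, so a real analysis would want to handle them explicitly.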

Background

President Donald Trump has been a prolific tweeter, with 33,551 tweets in the Trump Twitter Archive. This brings ample opportunities to apply natural language processing. One of our favourites is the DeepDrumpf bot, which uses neural networks to generate tweets in President Trump’s style.

Why do this?

But, we want more. As James O’Malley told the New York Times,

“When the president speaks, that’s really important — that changes the world. He can tank the stock market or start wars with his words. So having a greater understanding of what’s going on inside the West Wing is surely a really useful thing.”

When the president speaks, he speaks to us. Each of us understands him in our own way. By labeling this data together, as a diverse community, we’ll be able to see where we interpret his statements in the same way and where we disagree.

Inspiration & Machine Learning

For the last two years (we think), Kevin Quealy and Jasmine C. Lee have maintained and updated a list of the People, Places and Things President Trump has insulted or praised. Their project has been an inspiration for this one.

Our team put some work into parsing out the data from the New York Times and we are happy to share that data with the community as well.

LightTag’s annotation platform has a built-in machine learning model that learns from your labels. We’ll be using the New York Times dataset to bootstrap that model for this project.

Using LightTag's suggestions to label faster