SH-A Labels

These labels are meant to be used by archivists and academics building social media collections. Some are meant to be used at the item (post) level, others for the dataset as a whole, and some are appropriate at both levels.


Prob-a-Bot

This account is likely a bot.

New Account

This account was created shortly before data collection.

Content Warning

Contains violent imagery, hate speech, or similar content.

Imposter

This account claims to be someone it is not.

Noise

Advertising or other irrelevant content.

Community Control

This account has many authors.

Take Care

Take extra care in using this content.

Misinformation

This content is unverifiable, false, misleading, or misinformed.

Semi-Private Space

Users did not consider this to be a public statement.

Opt Out Provided

Users can opt out of collection or have their content removed from it.

Outreach Conducted

We have publicly shared information about data collection and contact options.

Part of a Whole

Dataset is part of a larger collection of diverse materials.

Privileged Information

This content cannot be accessed by all users.
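
For collections managed in code, the labels above could be recorded as simple annotations at either the item or the dataset level. The following is a minimal sketch, not a prescribed schema: the names `Label`, `Level`, `Annotation`, `labels_at`, and the identifiers `post-001`, `post-002`, and `collection-A` are all hypothetical, and the levels chosen in the example are illustrative assumptions, since the list above does not say which labels belong at which level.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The SH-A label names, as listed above."""
    PROB_A_BOT = "Prob-a-Bot"
    NEW_ACCOUNT = "New Account"
    CONTENT_WARNING = "Content Warning"
    IMPOSTER = "Imposter"
    NOISE = "Noise"
    COMMUNITY_CONTROL = "Community Control"
    TAKE_CARE = "Take Care"
    MISINFORMATION = "Misinformation"
    SEMI_PRIVATE_SPACE = "Semi-Private Space"
    OPT_OUT_PROVIDED = "Opt Out Provided"
    OUTREACH_CONDUCTED = "Outreach Conducted"
    PART_OF_A_WHOLE = "Part of a Whole"
    PRIVILEGED_INFORMATION = "Privileged Information"


class Level(Enum):
    """Where a label is applied: a single post or the whole dataset."""
    ITEM = "item"
    DATASET = "dataset"


@dataclass(frozen=True)
class Annotation:
    label: Label
    level: Level
    target_id: str   # hypothetical identifier for a post or a dataset
    note: str = ""   # optional free-text context from the archivist


def labels_at(annotations, level):
    """Return the sorted, de-duplicated label names applied at one level."""
    return sorted({a.label.name for a in annotations if a.level is level})


# Illustrative annotations; the level assigned to each label is an assumption.
annotations = [
    Annotation(Label.PROB_A_BOT, Level.ITEM, "post-001"),
    Annotation(Label.CONTENT_WARNING, Level.ITEM, "post-002", note="graphic image"),
    Annotation(Label.OUTREACH_CONDUCTED, Level.DATASET, "collection-A"),
]

print(labels_at(annotations, Level.ITEM))
print(labels_at(annotations, Level.DATASET))
```

Keeping the level on each annotation, rather than fixing it per label, leaves room for labels that are appropriate at both levels.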