This is how to spot an April Fool - and the same clues apply to fake news
April Fools and fake news are first cousins, say researchers.
They came to the conclusion after analysing hundreds of spoof April 1 articles posted on hundreds of websites over 14 years.
Experts in natural language processing from Lancaster University in the UK compared the language in the April Fools hoaxes with that of fake news stories and discovered similarities in the written structure.
"April Fools hoaxes are very useful because they provide us with a verifiable body of deceptive texts that give us an opportunity to find out about the linguistic techniques used when an author writes something fictitious disguised as a factual account," said lead author Edward Dearden.
"By looking at the language used in April Fools and comparing them with fake news stories we can get a better picture of the kinds of language used by authors of disinformation."
The researchers focused on specific features in the stories, such as the amount of detail used, vagueness, formality of writing style and complexity of language.
They then compared the April Fools stories with a "fake news" dataset previously compiled by different researchers.
Both types of story used less complex language, were easier to read and contained longer sentences than genuine news.
Important details for news stories, such as names, places, dates and times, were found less frequently within April Fools hoaxes and fake news.
First person pronouns, such as "we", were a prominent feature of both April Fools hoaxes and fake news. This goes against traditional thinking in deception detection, which suggests liars use fewer first person pronouns.
The researchers found that April Fools hoax stories, when compared to genuine news:
- Are generally shorter in length;
- Use more unique words;
- Use longer sentences;
- Are easier to read;
- Refer to vague events in the future;
- Contain more references to the present;
- Are less interested in past events;
- Contain fewer proper nouns; and
- Use more first person pronouns.
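Several of the features in this list can be approximated with simple text counts. The Python sketch below is an illustration only, not the researchers' code: the function name `extract_features`, the pronoun list and the rule that treats any capitalised word after the first word of a sentence as a proper noun are all assumptions made for the example.

```python
import re

# Assumed pronoun list for illustration; the study's own list is not given here.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def base(word):
    """Lower-case a word and drop any contraction suffix ("I'm" -> "i")."""
    return word.lower().split("'")[0]


def extract_features(text):
    """Return a few crude surface features for one article."""
    # Split into sentences after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = WORD.findall(text)
    if not words:
        raise ValueError("text contains no words")
    bases = [base(w) for w in words]

    # Rough stand-in for proper nouns: capitalised words that do not
    # open a sentence and are not first person pronouns.
    proper = 0
    for sentence in sentences:
        for token in WORD.findall(sentence)[1:]:
            if token[0].isupper() and base(token) not in FIRST_PERSON:
                proper += 1

    n_words = len(words)
    return {
        "length": n_words,
        "unique_ratio": len(set(bases)) / n_words,
        "avg_sentence_len": n_words / len(sentences),
        "proper_noun_ratio": proper / n_words,
        "first_person_ratio": sum(b in FIRST_PERSON for b in bases) / n_words,
    }


hoax_like = "We are launching a new flying bus. Our engineers say it will arrive soon."
news_like = "The mayor met Jane Smith in London on Monday."
print(extract_features(hoax_like))
print(extract_features(news_like))
```

On these two made-up sentences, the first scores higher on first person pronouns and the second on proper nouns, which is the direction of difference the researchers describe. A real analysis would need a proper tokeniser and part-of-speech tagger rather than these shortcuts.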
The researchers also created a machine learning "classifier" to identify whether articles are April Fools hoaxes, fake news or genuine news stories. The tool achieved an accuracy of 75% at identifying April Fools articles and 72% at identifying fake news stories.
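The article does not say which learning algorithm the classifier uses. As a rough idea of how a classifier can be built on numeric text features and scored for accuracy, here is a minimal Python sketch using a nearest-centroid rule, which is a deliberately simple stand-in and not the researchers' method. The function names and the two-number feature vectors (a proper noun ratio and a first person pronoun ratio) are hypothetical, and the figures are invented toy values rather than data from the study.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def fit_centroids(vectors, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid lies closest to the vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy training data: [proper_noun_ratio, first_person_ratio] (invented values).
train_x = [[0.20, 0.00], [0.18, 0.01], [0.05, 0.06], [0.04, 0.08]]
train_y = ["genuine", "genuine", "hoax", "hoax"]
centroids = fit_centroids(train_x, train_y)

# Toy held-out articles, one of which the simple rule gets wrong.
test_x = [[0.21, 0.00], [0.03, 0.09], [0.06, 0.05], [0.15, 0.02]]
test_y = ["genuine", "hoax", "hoax", "hoax"]
predictions = [predict(centroids, vec) for vec in test_x]
print(predictions, accuracy(predictions, test_y))
```

Accuracy here is simply the fraction of held-out articles labelled correctly, which is how a figure such as 75% is normally read.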
Alistair Baron, co-author of the paper, said: "Although there are many differences, our results suggest that April Fools and fake news articles share some similar features, mostly involving structural complexity.
"There are certain features in common between different forms of disinformation, and exploring these similarities may provide important insights for future research into deceptive news stories."
The research will be presented at the 20th International Conference on Computational Linguistics and Intelligent Text Processing later this month in La Rochelle, France.