This project predicts whether a given news article will go viral. Virality is determined by the number of Facebook interactions an article receives: shares, comments, and likes. Social shares serve as the measure of how well a story engages its audience.
- Predicts virality of news articles
- Analyzes Facebook interactions
- Understands sharing behavior
- Uses the News Category Dataset
Project Inspiration
Understanding sharing behavior is big business. As consumers become blind to traditional advertising, the push is on to go beyond simple pitches to tell engaging stories. Increasingly, the success of these endeavors is measured in social shares.
Project Structure
The project is structured around a main component, ViralityPredictor. This component analyzes a given news article and predicts its virality. It takes as input the article's category, its headline, and whether it contains images, and outputs a virality score. The system uses the News Category Dataset, which contains articles from Huffington Post since 2012.
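As a rough illustration of the interface described above, the sketch below models the three inputs and the score output. Only the name ViralityPredictor comes from this README; the `Article` dataclass, the `predict` method, the assumption that the score lies in [0, 1], and the `placeholder_score` function are hypothetical stand-ins, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Article:
    """Inputs named in the README: category, headline, presence of images."""
    category: str
    headline: str
    has_images: bool


class ViralityPredictor:
    """Hypothetical wrapper: delegates scoring to any callable model."""

    def __init__(self, score_fn: Callable[[Article], float]) -> None:
        self._score_fn = score_fn

    def predict(self, article: Article) -> float:
        score = self._score_fn(article)
        # Assumption: the virality score is probability-like, so clamp to [0, 1].
        return max(0.0, min(1.0, score))


def placeholder_score(article: Article) -> float:
    # Made-up rule standing in for a trained model; not the project's logic.
    return 0.5 + (0.25 if article.has_images else 0.0)


predictor = ViralityPredictor(placeholder_score)
score = predictor.predict(Article("POLITICS", "Example headline", True))
```

Passing the scoring function in keeps the wrapper independent of whichever trained model ends up producing the score.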
This project involves building a predictive model to determine the virality of news articles. The model uses a dataset of news articles from Huffington Post, filtered to include only articles written after 2016. Features such as engagement info (shares, interactions, comments), number of images in the article, and word statistics from the headlines are extracted. These features are then used to train machine learning models, specifically RandomForestClassifier and XGBoost. The models are evaluated based on their ability to predict whether a news article will go viral or not.
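The data preparation described above might look roughly like the following sketch, which uses only the standard library. The record fields, the date cutoff, the specific headline statistics, and the interaction threshold that defines "viral" are all assumptions made for illustration; the README does not state how the label is derived.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class ArticleRecord:
    """Hypothetical record layout; the real dataset columns may differ."""
    headline: str
    published: date
    shares: int
    comments: int
    likes: int
    image_count: int


def headline_stats(headline: str) -> dict:
    """Simple word statistics for a headline."""
    words = headline.split()
    return {
        "word_count": len(words),
        "char_count": len(headline),
        "avg_word_length": mean([len(w) for w in words]) if words else 0.0,
    }


def build_dataset(records, cutoff: date, viral_threshold: int):
    """Filter by date, then build feature rows and binary viral labels."""
    rows, labels = [], []
    for r in records:
        if r.published <= cutoff:
            continue  # keep only articles published after the cutoff
        features = headline_stats(r.headline)
        features["image_count"] = r.image_count
        interactions = r.shares + r.comments + r.likes
        rows.append(features)
        # Assumption: "viral" means total interactions reach the threshold.
        labels.append(int(interactions >= viral_threshold))
    return rows, labels
```

The resulting rows and labels could then be converted to a numeric matrix and passed to a classifier such as scikit-learn's `RandomForestClassifier` or XGBoost's `XGBClassifier`.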