Group Emotion Recognition is an AI application that recognises the emotions of groups of people in various social environments.
About This Project
This project aims to classify a group’s perceived emotion as Positive, Neutral or Negative. The dataset being used is the Group Affect Database 3.0 which contains “in the wild” photos of groups of people in various social environments.
Our solution is a hybrid machine learning system that builds on the model by Surace et al. and extends it with additional, more refined machine learning methods and experiments. This work has been published in the paper Group Emotion Recognition Using Machine Learning.
The Need for Emotion Recognition
So, first of all, why do we need emotion recognition?
Emotion recognition is important because it can:
- Improve the user's experience, whether as a customer, a learner, or a generic service user.
- Help improve services without the need to formally and continuously ask the user for feedback.
- Significantly improve people's quality of life when applied in public safety, healthcare, or assistive technology, allowing them to live in a safer environment or reducing the impact of disabilities and other health conditions.
Applications of Emotion Recognition
Emotion Recognition has applications in crowd analytics, social media, marketing, event detection and summarization, public safety, human-computer interaction, digital security surveillance, street analytics, image retrieval, etc.
The Rise of Group Emotion Recognition
The problem of emotion recognition for a group of people has been less extensively studied, but it is gaining popularity due to the massive amount of data available on social networking sites containing images of groups of people participating in social events.
Challenges Facing Group Emotion Recognition
Group emotion recognition is a challenging problem due to factors such as head and body pose variations, occlusions, variable lighting conditions, variance of actors, varied indoor and outdoor settings, and differing image quality.
Our solution is a pipeline-based approach that integrates two modules working in parallel, a bottom-up module and a top-down module, based on the idea that the emotion of a group of people can be deduced from both directions.
- The bottom-up module detects and extracts individual faces present in the image and passes them as input to an ensemble of pre-trained Deep Convolutional Neural Networks (CNNs).
- Simultaneously, the top-down module detects the labels associated with the scene and passes them as input to a Bayesian Network (BN) which predicts the probabilities of each class.
- In the final pipeline, the group emotion category predicted by the bottom-up module is passed as input to the Bayesian Network in the top-down module and an overall prediction for the image is obtained.
An overview of the full pipeline is shown in the figure below.
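The fusion step above can be sketched in miniature. In this illustrative snippet (not the project's actual code), the bottom-up module is reduced to averaging the softmax outputs of the CNN ensemble over all detected faces, and the top-down Bayesian Network is approximated by a naive-Bayes-style update using hypothetical conditional probabilities P(scene label | class); the bottom-up distribution is used as the prior, mirroring how the bottom-up prediction feeds the BN in the final pipeline. All numbers, label names, and function names here are made up for illustration:

```python
import numpy as np

CLASSES = ["Positive", "Neutral", "Negative"]

def bottom_up_predict(face_probs):
    """face_probs has shape (n_faces, n_models, 3): the softmax output of
    each CNN in the ensemble for each detected face. Averaging over faces
    and models yields one class distribution for the whole group."""
    return np.asarray(face_probs, dtype=float).mean(axis=(0, 1))

def top_down_predict(scene_labels, label_cpt, prior):
    """Naive-Bayes-style update: multiply the prior by P(label | class)
    for every scene label detected in the image, then renormalise."""
    posterior = np.asarray(prior, dtype=float).copy()
    for label in scene_labels:
        posterior *= label_cpt[label]
    return posterior / posterior.sum()

# Hypothetical conditional probabilities P(label | class) for two scene labels.
label_cpt = {
    "party":   np.array([0.7, 0.2, 0.1]),
    "funeral": np.array([0.1, 0.2, 0.7]),
}

# Made-up softmax outputs: 2 faces x 2 CNNs x 3 classes.
face_probs = [
    [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]],
    [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]],
]

bu = bottom_up_predict(face_probs)                         # bottom-up distribution
final = top_down_predict(["party"], label_cpt, prior=bu)   # fused prediction
print(CLASSES[int(np.argmax(final))])                      # prints "Positive"
```

The real system replaces the averaging with pre-trained deep CNNs and the naive update with a proper Bayesian Network, but the flow of evidence, faces up and scene labels down, is the same.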
The code has been open-sourced and is available on GitHub.
Please let me know if you have any feedback/suggestions.
Thank you 😃