Crowd Counting for Large Scale Event Management
Large sporting events, concerts, festivals and political demonstrations all face the same challenge: a huge number of people attending. Our local "Allianz Arena" here in Munich has up to 75,000 seats, and while organizing public transportation for a fan crowd of this magnitude is a challenge in itself (which we will cover in other blog posts), managing the crowd at the arena is equally challenging for the organizers and follows strict rules and timings.
At major political demonstrations, the number of participants becomes even larger. In most countries, they tend to draw more than 100,000 people, and it's not uncommon for events in big cities such as Berlin to approach 500,000 participants.
Being an organizer of events of that magnitude introduces multiple challenges. Most of these are centered around a seemingly very, very simple question:
How many? And where?
Answering this is not as easy as it seems. Crowds move dynamically and are sometimes very unpredictable. Their density varies between locations, and certain events (e.g. the end of a soccer match) lead to sudden and somewhat erratic changes. Until recently, organizers had very few options for gaining insight into this process.
In event management, the problem is instead handled by estimating upper bounds and then allocating enough resources (security personnel, emergency exits, etc.) for the expected numbers.
For computer and data scientists, however, this problem appeals quite naturally to the creative problem-solving instinct that every engineer most likely picked up even before knowing they would become one. Not being able to answer the question "How many? And where?" seems inadequate in times of machine learning, sophisticated computer vision and inexpensive IP cameras.
Let's have a look at what we can do to make the tasks and risks of event organizers a little more predictable.
Let's start small and take a moderately busy shopping mall like the one above. A straightforward first approach is to use so-called Object Detection Networks. These networks have been trained to recognize and localize objects in an input image. There are many different architectures, approaches and datasets for this problem, varying in the size of the network, the number of classes and the accuracy. The network used above is a very popular open source network and is optimized neither for detecting persons nor for detection speed. But even with this very naive first try, the network detects 31 persons in this image. It is far from perfect (persons behind the tree are not recognized), but it's not far off from a manual count. Let's try this with more people.
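Counting with an object detector boils down to filtering its output for the "person" class and discarding low-confidence hits. The following is a minimal sketch of that step only. The `Detection` type, the sample detections and the 0.5 threshold are assumptions made for illustration, not the output or settings of the network used in this article.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # class name predicted by the detector
    confidence: float  # score between 0 and 1
    box: tuple         # (x_min, y_min, x_max, y_max) in pixels


def count_persons(detections, min_confidence=0.5):
    """Count detections labelled 'person' at or above the threshold."""
    return sum(
        1
        for d in detections
        if d.label == "person" and d.confidence >= min_confidence
    )


# Hypothetical detector output for a single frame (made-up values)
detections = [
    Detection("person", 0.91, (12, 40, 60, 180)),
    Detection("person", 0.78, (70, 42, 115, 175)),
    Detection("potted plant", 0.66, (200, 90, 260, 190)),
    Detection("person", 0.31, (300, 50, 330, 120)),  # below threshold
]

print(count_persons(detections))  # 2
```

Lowering the threshold recovers more partially hidden persons but also lets in more false positives, so the count is only as good as the detector's scores.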
This approach already shows very obvious limitations. Detecting (and especially localizing) every person in the image is impossible for common Object Detection Networks, for several reasons. Their architecture usually allows only a certain number of objects per region of the image, which by design limits how many persons can be detected in a small area. Furthermore, the datasets the networks were originally trained on do not contain large crowds with labels for every person, so the networks never learned to recognize persons in dense crowds.
This becomes even more apparent for denser crowds like this scene from a sports event. The network is incapable of isolating single persons in the crowd and therefore can't even give a rough estimate in this case.
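To make the per-region limit concrete, here is a small sketch that assumes a detector which splits the image into a fixed grid and can output only a fixed number of boxes per grid cell. The grid size of 7 and the limit of 2 boxes per cell are illustrative assumptions, not properties of the network used above, and the person positions are made up.

```python
from collections import Counter


def max_detectable(head_points, image_size, grid=7, boxes_per_cell=2):
    """Upper bound on detections when each grid cell can hold only
    `boxes_per_cell` objects (an assumed, simplified detector design)."""
    width, height = image_size
    per_cell = Counter()
    for x, y in head_points:
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        per_cell[(row, col)] += 1
    # Every cell contributes at most `boxes_per_cell` detections
    return sum(min(n, boxes_per_cell) for n in per_cell.values())


# Hypothetical scene: 40 people packed into one corner, 5 spread out
dense = [(10 + (i % 8) * 10, 10 + (i // 8) * 10) for i in range(40)]
sparse = [(150, 350), (350, 350), (550, 350), (350, 550), (650, 650)]
people = dense + sparse

print(len(people))                         # 45 people in the scene
print(max_detectable(people, (700, 700)))  # 7 can be detected at most
```

All five isolated persons can be found, but the 40 people sharing one cell collapse to just two detections, which mirrors the undercounting seen in the dense scenes.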
The solution to this problem is called Crowd Counting, and it takes a fundamentally different approach. Instead of detecting and localizing individual persons in the image, it estimates the density of the crowd for every region of the image. Crowd Counting Networks are trained on datasets of big events like the one above and can be applied to a wide variety of crowds, anything from very sparse (0-50) to very dense (10,000 and more). Let's look at the above examples with a Crowd Counting network.
This approach now counts 2,228 persons in the image of the arena and 398 persons on the plaza. Imagine counting this manually - absolutely impractical. Modern Machine Learning gives Large Scale Event Management the tools needed to plan, manage and analyze events more efficiently and in real time.
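The density idea can be sketched in a few lines. A density map assigns every pixel a fraction of a person, so summing the whole map answers "How many?" and summing a region answers "And where?". The sketch below builds such a map from hypothetical annotated head positions by spreading each person over a small Gaussian blob, a common way to construct training targets for this kind of network. It does not show how a network predicts the map, and the image size, head positions and `sigma` are assumptions chosen for illustration.

```python
import math


def density_map(head_points, width, height, sigma=2.0):
    """Build a density map whose values sum to the number of heads.
    Head positions are integer pixel coordinates inside the image."""
    density = [[0.0] * width for _ in range(height)]
    radius = int(3 * sigma)
    for hx, hy in head_points:
        # Gaussian weights around the head, clipped to the image border
        weights = {}
        for y in range(max(0, hy - radius), min(height, hy + radius + 1)):
            for x in range(max(0, hx - radius), min(width, hx + radius + 1)):
                d2 = (x - hx) ** 2 + (y - hy) ** 2
                weights[(x, y)] = math.exp(-d2 / (2 * sigma ** 2))
        total = sum(weights.values())
        for (x, y), w in weights.items():
            density[y][x] += w / total  # each person adds up to exactly 1
    return density


def count_in_region(density, x0, y0, x1, y1):
    """Estimated number of persons in the rectangle [x0, x1) x [y0, y1)."""
    return sum(sum(row[x0:x1]) for row in density[y0:y1])


# Hypothetical 40x20 image: three heads on the left, one on the right
heads = [(5, 5), (8, 10), (12, 14), (30, 6)]
density = density_map(heads, width=40, height=20)

print(round(count_in_region(density, 0, 0, 40, 20), 3))   # whole image
print(round(count_in_region(density, 0, 0, 20, 20), 3))   # left half
print(round(count_in_region(density, 20, 0, 40, 20), 3))  # right half
```

Because the count is a sum rather than a list of boxes, persons that overlap heavily still contribute to the total, which is why this formulation copes with dense crowds.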
The images for this article were taken from the ShanghaiTech dataset, a large Crowd Counting benchmark.
How does Isarsoft leverage Crowd Counting?
Isarsoft deploys Crowd Counting whenever crowds become too dense to detect and localize individual objects with Object Detection Networks. The results can be viewed and analyzed in our dashboard. The data can be exported in many portable formats, enabling users to analyze it further or use it in presentations and process evaluations. Our algorithms can run directly on AI-capable smart cameras, or with common cameras and dedicated servers as an on-premises or cloud solution.
At Isarsoft, we believe that leveraging capabilities of modern AI to get insights into business processes should be accessible and straightforward to use for everyone, while being more efficient and faster to deploy than dedicated sensor hardware or manual surveys. Our solutions make it easy to integrate video analytics into existing infrastructure.
If you have any additional questions, we are happy to talk you through your use case.