Sporting events, concerts, festivals and large political demonstrations are attended by huge crowds. The Allianz Arena here in Munich, for example, has up to 75,000 seats, and while organizing public transportation for an event of this magnitude is a challenge in itself (which we will highlight in another blog entry), managing the crowd at the stadium is at least as challenging.
Large political demonstrations can be even bigger. In many countries they can draw more than 100,000 participants, and it is not uncommon to reach 500,000 in big cities like Berlin.
Organizing and securing such events comes with several challenges. Almost all of them come down to a very simple planning question:
How many people are expected? And where are they?
Answering this question is not as easy as it seems. Crowds are highly dynamic and difficult to predict. Their density varies between areas, and certain events (e.g. the end of a soccer match in a stadium) lead to sudden, unpredictable changes. Until recently, organizers had very few ways to monitor these dynamics in real time.
Instead, event management often works with estimates of peak loads and then plans enough resources (security personnel, emergency exits, etc.) to handle them.
For computer scientists and data scientists, on the other hand, this problem is a natural fit for modern approaches from machine learning, artificial intelligence and image processing. With such tools available, not being able to answer the question "How many? And where?" is no longer acceptable.
In the following sections, we will shed light on what we can do for organizers of large events using intelligent image analysis.
We start small with this moderately busy shopping street. A simple first approach is to test so-called object detection networks. These networks have been trained to detect and simultaneously localize objects in an input image. There are many architectures and datasets, which differ in the complexity of the network, the number of object classes and the accuracy achieved. The network used above is a popular open source network and is not optimized for detecting people or for computational efficiency. But even with this rather naive first approach, the network detects 31 people in the image. This is not perfect yet (e.g. the people behind the tree are not detected well), but the deviation from a manual count is small. Next we will test this with more people in the image.
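To make the counting step concrete, here is a minimal sketch of how a person count can be derived from a detector's output. The `Detection` structure, the `"person"` label and the 0.5 confidence threshold are assumptions for illustration; they do not describe the specific network used for the image above.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical detector output: class label, confidence, bounding box."""
    label: str
    score: float
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def count_people(detections: List[Detection], min_score: float = 0.5) -> int:
    """Count detections labelled 'person' whose confidence reaches min_score."""
    return sum(
        1 for det in detections
        if det.label == "person" and det.score >= min_score
    )


# Made-up example detections, not results from a real network.
detections = [
    Detection("person", 0.92, (34, 50, 80, 210)),
    Detection("person", 0.31, (300, 40, 330, 120)),  # weak hit, e.g. partly hidden
    Detection("bicycle", 0.88, (120, 150, 220, 230)),
    Detection("person", 0.77, (400, 60, 450, 220)),
]

print(count_people(detections))
```

Lowering `min_score` recovers more partially hidden people but also admits more false positives, which is one reason such counts deviate from a manual count.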
It looks like this approach is already reaching its limits. Detecting all the people in the image, and especially locating them, is impossible for most ordinary object detection networks. There are several reasons for this. Their architecture allows only a limited number of objects per region of the input image, which limits the number of detectable persons by design. In addition, the datasets on which these networks were trained do not include crowds of this kind in which each person is annotated. Therefore, the network was never trained to recognize individual people in a dense crowd.
This becomes even clearer for denser crowds, as can be seen in this scene from a sporting event. The network is unable to distinguish individuals in the crowd and fails to produce even a rough estimate.
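The architectural limit can be illustrated with a small sketch. It assumes a grid-based detector that splits the image into cells and predicts a fixed number of boxes per cell. The 7 x 7 grid, the two boxes per cell and the function names are illustrative assumptions, not properties of the network tested here.

```python
from collections import Counter
from typing import List, Tuple


def max_detections(grid_rows: int, grid_cols: int, boxes_per_cell: int) -> int:
    """Upper bound on the number of objects the detector can output per image."""
    return grid_rows * grid_cols * boxes_per_cell


def detectable_people(
    centers: List[Tuple[float, float]],
    image_size: Tuple[int, int],
    grid: Tuple[int, int],
    boxes_per_cell: int,
) -> int:
    """Best case: how many people fit under the per-cell cap."""
    width, height = image_size
    rows, cols = grid
    per_cell: Counter = Counter()
    for x, y in centers:
        # Assign each person to the grid cell containing their center point.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / height * rows), rows - 1)
        per_cell[(row, col)] += 1
    # Each cell can report at most boxes_per_cell objects.
    return sum(min(n, boxes_per_cell) for n in per_cell.values())


# Ten people packed into one corner cell plus one person standing apart.
crowd = [(10.0 + i, 10.0) for i in range(10)] + [(650.0, 650.0)]

print(max_detections(7, 7, 2))                          # global cap for the image
print(detectable_people(crowd, (700, 700), (7, 7), 2))  # people within the cap
```

In this toy setting only 3 of the 11 people can be reported at all, no matter how well the network was trained, because the dense group shares a single cell.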
The solution to this problem therefore takes a fundamentally different approach. This so-called crowd counting technique estimates the density of the crowd for each region of the image instead of detecting individual people. Crowd counting networks are trained on datasets from large events and can be applied to many different crowds, from quite small groups (0-50) to very dense crowds (even up to 10,000+). Below are a few examples of these networks:
This approach now counts 2,228 people in the sports arena image and 398 people on the court. A manual count would be completely impractical in these examples. Modern machine learning techniques give large event organizers entirely new tools to plan, manage and analyze their events. And all this in real time.
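How a density-based result answers "How many? And where?" can be sketched as follows. The example assumes the network outputs a density map, i.e. a grid of non-negative values whose sum is the estimated number of people. The tiny 4 x 4 map and the function names are made up for illustration and are not output from a real model.

```python
from typing import List

DensityMap = List[List[float]]


def total_count(density: DensityMap) -> float:
    """'How many?': the estimated count is the sum over the whole map."""
    return sum(sum(row) for row in density)


def region_counts(density: DensityMap, block: int) -> List[List[float]]:
    """'And where?': sum the density over non-overlapping block x block regions.

    Assumes a non-empty, rectangular density map.
    """
    rows, cols = len(density), len(density[0])
    return [
        [
            sum(
                density[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            )
            for c0 in range(0, cols, block)
        ]
        for r0 in range(0, rows, block)
    ]


# Hypothetical density map: the crowd thickens toward the bottom right.
density = [
    [0.0, 0.25, 0.5, 0.5],
    [0.0, 0.25, 0.5, 0.5],
    [0.0, 0.0, 1.0, 1.5],
    [0.0, 0.0, 1.5, 2.0],
]

print(total_count(density))       # estimated number of people in the image
print(region_counts(density, 2))  # estimated people per 2 x 2 region
```

Because the count is a sum of densities rather than a number of boxes, it has no per-region cap and can be fractional for a region that contains only part of a person.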
The images for this article were taken from the ShanghaiTech dataset, a large crowd counting benchmark.
How does Isarsoft use crowd counting?
Isarsoft uses crowd counting when crowds become too dense to locate individuals and the positions of individuals would add little value anyway. The results of the count can be viewed and analyzed in a dashboard. Furthermore, the data can be exported to several common formats for further analysis with other tools or for use in presentations and evaluations. The algorithms can be run directly on AI-enabled smart cameras or with ordinary cameras and a server solution. The server solution is available as an on-site or cloud solution.
Isarsoft's goal is for everyone to be able to use modern AI solutions to better understand their business processes, and for these solutions to be readily available and easy to use. Since it is a software solution, implementation is faster and cheaper than with dedicated sensors or manual analysis. The solutions can be easily integrated into existing infrastructure.
Possible application scenarios include measuring the number of passengers on train platforms, recording the number of visitors to major events and trade fairs, and supporting cities and their authorities with a powerful assistance tool.
If you have any further questions, please do not hesitate to contact us. We will be happy to discuss your specific application with you and advise you on the possibilities of video analysis.
More about Isarsoft
With Isarsoft Perception, your camera systems become part of your business intelligence. Whether the goal is to increase efficiency, customer satisfaction or safety, Isarsoft Perception provides the insights needed for informed decisions.
Contact us to learn more about how to turn security cameras into intelligent sensors.