Report: 73% of Internet Traffic Comes from Web Crawlers

Arkose Labs released its Q3 2021 Malicious Bot Report, which shows that malicious bots and fraud traffic accounted for 73% of internet traffic in the third quarter of 2021.


Two factors drive the growth of malicious bots. First, artificial intelligence technology is now widely available, which improves bot performance. Second, the black and gray market has commercialized attacks through "crime as a service" (CaaS), which speeds up the launch of new attacks and further expands that market.

Under "crime as a service," an individual who wants to attack an enterprise or organization but lacks the resources, skills, or time can pay another person or organization to launch the cyberattack. In other words, it lets people who have malicious intent but lack the skills become cybercriminals.

Crawlers Can Steal Data, Scam Users, and Disrupt Services

Malicious crawlers are used mainly to steal data, defraud users, or disrupt services. They bring heavy losses and risks to many fields; some attacks are cross-industry, and some target specific industries. The industries most attacked by malicious crawlers are technology (76%), gaming (29%), social media (46%), e-commerce (65%), and financial services (45%).

Ticketing sector. This is one of the most common targets of malicious crawlers, which can help organizations and individuals grab tickets and then resell them at high prices on the black market, making it difficult for normal consumers to buy tickets at reasonable prices.

Financial institutions. Malicious crawlers often try to break into user accounts, commit financial fraud, or steal sensitive information. In addition, some investment companies use web crawlers to obtain competitors' data and strategies in order to improve their own investment and trading performance. For example, hedge funds use web crawlers to collect and analyze non-traditional data such as inventory levels and pricing data to guide their investment decisions. It is reported that hedge funds paid $2 billion for this in 2020.

Online games. Online games are plagued by credential stuffing bots, which try to steal money or game items from user accounts and sell them online.

Airlines. 25.9% of airline traffic comes from malicious crawlers. Competitors or travel agents scrape flight prices and seat availability, which hurts airline revenue and customer experience. More seriously, some black and gray market operators use malicious crawlers to steal the air miles accumulated in user accounts and use them for illegal transactions or exchanges.

E-commerce. 18% of the traffic on e-commerce websites comes from malicious crawlers, which are used for content scraping, account takeover, credit card fraud, and coupon abuse.

Social information. Malicious crawlers are often used for content scraping, both to steal content and repost it on other channels and to collect information about competitors for unfair competition. This damages the interests and reputation of legitimate websites and distorts the entire network ecosystem. Websites often mistakenly believe that their traffic has increased when in fact they are being hit by malicious crawlers.

Account theft. Credential stuffing (sometimes translated as "database collision") is another major use of malicious crawlers, and accounts with weak or reused passwords are easily stolen.

Experts: Here's How to Identify Malicious Web Crawlers

Today's malicious crawlers use random IP addresses, anonymous proxies, modified identities, and human-like behavior, which makes them very difficult to detect and block. Experts from Dingxiang Defense Cloud Business Security pointed out that malicious crawlers can be analyzed and identified based on their behavior and attributes.


First, the access target. Malicious crawlers aim to obtain the core information of websites and apps, such as user data, product prices, and comment content, so they usually access only the pages that contain this information and ignore unrelated pages.

Second, the access behavior. Malicious crawlers are executed automatically by programs that follow preset processes and rules, so their behavior shows obvious regularity, rhythm, and consistency. This is very different from the randomness, flexibility, and diversity of normal users.

Third, the access device. Malicious crawlers aim to capture as much information as possible in the shortest time, so they use the same device to perform a large number of operations, including browsing, querying, and downloading. As a result, the device's access frequency, duration, depth, and other indicators become abnormal.

Fourth, the access IP address. To avoid being identified and banned, malicious crawlers change their IP addresses by various means, such as cloud services, routers, and proxy servers. As a result, the source region, network operator, network type, and other attributes of their IP addresses are inconsistent or deviate significantly from the normal user distribution.

Fifth, the access time period. Malicious crawlers usually crawl in batches when website traffic is low and monitoring is weak, such as late at night or in the early morning. As a result, indicators such as access volume and bandwidth usage become abnormal during these periods.

Sixth, big data modeling and mining. By collecting, processing, mining, and modeling the access data of normal users and malicious crawlers, a website can build a crawler identification model of its own, which improves identification accuracy and efficiency.
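As a rough illustration of how a few of these signals could be computed from access logs, the following Python sketch derives per-client indicators for access target, access behavior, and access time period. It is not Dingxiang's method. The record layout (client ID, ISO timestamp, path), the function names, the "sensitive" path prefixes, the night-hour window, and all thresholds are hypothetical assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def crawler_signals(records, sensitive_prefixes=("/price", "/user"),
                    night_hours=range(0, 6)):
    """Compute simple per-client indicators from (client_id, iso_ts, path) rows.

    The prefixes and night-hour window are illustrative assumptions.
    """
    by_client = defaultdict(list)
    for client_id, ts, path in records:
        by_client[client_id].append((datetime.fromisoformat(ts), path))

    signals = {}
    for client_id, hits in by_client.items():
        hits.sort(key=lambda h: h[0])
        times = [t for t, _ in hits]
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        # Coefficient of variation of the gaps between requests:
        # values near 0 indicate a machine-like, fixed rhythm.
        if len(gaps) >= 2 and mean(gaps) > 0:
            gap_cv = pstdev(gaps) / mean(gaps)
        else:
            gap_cv = None  # too little data to judge regularity
        n = len(hits)
        signals[client_id] = {
            "requests": n,
            "gap_cv": gap_cv,
            # Share of requests that hit pages holding core information.
            "sensitive_share": sum(p.startswith(sensitive_prefixes)
                                   for _, p in hits) / n,
            # Share of requests made during low-traffic hours.
            "night_share": sum(t.hour in night_hours for t in times) / n,
        }
    return signals


def looks_automated(sig, min_requests=5, max_gap_cv=0.1,
                    min_sensitive_share=0.9):
    """Flag a client whose rhythm is regular and whose targets are narrow.

    Thresholds are placeholders; a real system would tune them on its own data.
    """
    return (sig["requests"] >= min_requests
            and sig["gap_cv"] is not None
            and sig["gap_cv"] <= max_gap_cv
            and sig["sensitive_share"] >= min_sensitive_share)


# Hypothetical sample: one client polling price pages every 2 seconds at 03:00,
# and one visitor browsing irregularly in the afternoon.
records = [("bot", f"2021-09-01T03:00:{s:02d}", f"/price/{s}")
           for s in range(0, 12, 2)]
records += [
    ("human", "2021-09-01T14:00:00", "/"),
    ("human", "2021-09-01T14:00:07", "/about"),
    ("human", "2021-09-01T14:01:30", "/price/1"),
    ("human", "2021-09-01T14:01:41", "/cart"),
    ("human", "2021-09-01T14:05:02", "/price/2"),
]
signals = crawler_signals(records)
flagged = [c for c, s in signals.items() if looks_automated(s)]
```

Fixed rules like these are easy for a crawler to evade by adding jitter or spreading requests across devices and IP addresses, which is why the sixth point recommends a model trained on the website's own traffic. Indicators of this kind would more plausibly serve as input features to such a model than as a final verdict.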

Dingxiang: Block and Prevent Malicious Web Crawler Attacks

Malicious crawlers are becoming more intelligent and complex, and simply limiting access frequency or encrypting front-end pages is no longer an effective defense. Enterprises need to improve human-machine recognition, strengthen their ability to identify and intercept black-market activity, and restrict bots' access to the people or systems they target, thereby raising the cost of attack for malicious crawlers. Dingxiang provides enterprises with a full-process, multi-layered prevention and control solution that can effectively prevent malicious crawling.

First, use Dingxiang Defense Cloud to regularly check and harden the security of the platform's and app's operating environment, protect the app and client with code obfuscation, shell protection (packing), and other measures, and encrypt the communication link to secure the full end-to-end path.

Second, deploy Dingxiang Defense Cloud and Dingxiang Dinsight, which use big data matching and tracking together with multi-dimensional, in-depth analysis to accurately identify abnormal behavior and intercept malicious crawlers.

Within this layer, Dingxiang Defense Cloud's intelligent CAPTCHA uses artificial intelligence technology to block malicious crawlers from stealing data, and can verify, judge, and intercept malicious accounts and crawling behavior in real time at key steps such as registration, login, and query. Dingxiang Defense Cloud's device fingerprint technology can monitor and intercept risks such as code injection, hooking, emulators, cloud phones, rooting, and jailbreaking, and uses unique device identifiers to identify devices accurately and assess their risk.

Dingxiang Dinsight draws on multiple dimensions of information, such as business query requests, device fingerprint information collected by clients, and user behavior data, and applies security prevention and control strategies to identify and intercept malicious crawling behavior.

Finally, the Dingxiang Xintell intelligent model platform conducts in-depth analysis of risk data and business data, mines potential risks further, and builds dedicated risk control models, so that security policies can be iterated in real time and malicious attacks intercepted more effectively.



Copyright © 2024 AISECURIUS, Inc. All rights reserved