Techsoma

Cloudflare Says Perplexity AI is Secretly Crawling Websites to Steal User Data and Ignore Privacy Rules

by Ifeanyi Abraham
August 7, 2025
in Global News

The war over data privacy just got messier. Cloudflare has accused Perplexity AI, a fast-growing AI-powered search engine, of secretly crawling websites to collect data, even when those websites have explicitly instructed bots to stay out. According to Cloudflare, Perplexity has been disguising its identity, rotating IP addresses, and ignoring robots.txt files, which are standard tools websites use to say “do not scrape my content.”

If the allegations hold, Perplexity may be accessing and using content without permission, raising significant concerns about how AI companies collect user data and whether they follow the web's long-standing crawling conventions.

The Core Allegation

According to Cloudflare, Perplexity initially identifies itself correctly when crawling sites. However, when faced with network blocks or restrictions via robots.txt files, it allegedly switches tactics by:

  • Modifying its user agent to disguise crawling activity
  • Rotating IP addresses and ASNs to bypass restrictions
  • Using undeclared crawlers in addition to its public bots (PerplexityBot and Perplexity-User)
  • Ignoring robots.txt directives or, in some cases, not requesting the robots.txt file at all
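To illustrate why a modified user agent matters, here is a minimal Python sketch of the kind of string check a site operator might run on incoming requests. The two bot names come from the list above; the function name and the sample user-agent strings are hypothetical and only illustrative, not the exact strings Perplexity sends.

```python
# Tokens taken from the publicly declared bot names mentioned in the article.
DECLARED_TOKENS = ("perplexitybot", "perplexity-user")


def declares_perplexity(user_agent: str) -> bool:
    """Return True if the User-Agent header names a declared Perplexity bot."""
    ua = user_agent.lower()
    return any(token in ua for token in DECLARED_TOKENS)


# Illustrative header values (assumptions, not captured traffic).
declared_ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
browser_like_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

print(declares_perplexity(declared_ua))      # matches a declared token
print(declares_perplexity(browser_like_ua))  # looks like an ordinary browser
```

A check like this only works when the crawler identifies itself honestly. Once the header mimics an ordinary browser, the string alone gives the site nothing to block on, which is the behavior Cloudflare alleges.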

Evidence from Cloudflare’s Investigation

Cloudflare claims it launched an investigation after receiving multiple complaints from customers who had explicitly prohibited Perplexity’s crawlers through robots.txt and Web Application Firewall (WAF) rules. Despite these measures, customers reported that Perplexity continued accessing their content.

To verify, Cloudflare:

  • Created controlled test environments using brand-new domains, implementing strict robots.txt rules to block all bots
  • Observed that Perplexity’s bots still retrieved restricted content
  • Detected attempts by Perplexity to impersonate a generic browser agent, mimicking Google Chrome on macOS when its declared crawler was blocked
  • Traced undeclared crawlers using machine learning and network signal analysis across tens of thousands of domains and millions of requests per day
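For context on what "strict robots.txt rules to block all bots" looks like in practice, the following Python sketch parses a block-all robots.txt with the standard library's `urllib.robotparser` and asks whether a crawler may fetch a page. The domain `example.test` and the helper name `may_fetch` are placeholders, and this is not Cloudflare's actual test setup.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out of the whole site.
BLOCK_ALL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that honors robots.txt would run a check like this before each fetch.
print(may_fetch(BLOCK_ALL_ROBOTS_TXT, "PerplexityBot", "https://example.test/page"))
```

The file is purely advisory: nothing in the protocol stops a client that skips the check, which is why Cloudflare's test focused on whether restricted content was retrieved anyway.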

The Technical Breakdown

Cloudflare’s findings show that Perplexity’s undeclared crawlers were:

  • Using IP addresses outside Perplexity’s official IP range
  • Rotating through these IPs and switching ASNs to avoid detection
  • Conducting large-scale scraping, which Cloudflare says spanned tens of thousands of domains and millions of requests per day
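The first point above, traffic arriving from outside a crawler's published IP range, can be sketched with Python's standard `ipaddress` module. The ranges below are reserved documentation blocks used as placeholders, not Perplexity's real ranges, and the function name is hypothetical.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), standing in for
# whatever ranges a crawler operator officially publishes.
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def in_declared_range(ip: str) -> bool:
    """Return True if ip belongs to one of the declared crawler networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DECLARED_RANGES)


print(in_declared_range("192.0.2.17"))   # inside a declared placeholder range
print(in_declared_range("203.0.113.5"))  # outside every declared range
```

A range check like this can confirm that a self-identified bot is genuine, but it cannot by itself attribute anonymous traffic to a particular company. That gap is why Cloudflare says it relied on machine learning and network signals instead.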

In addition, Cloudflare reports that Perplexity continued to give detailed answers about the restricted test domains, even though those domains explicitly blocked its crawlers.

Why This Matters

For decades, the internet has operated on an implicit foundation of trust between site owners and automated crawlers. Protocols like robots.txt exist to balance functionality and fairness, ensuring that sites can manage automated access without resorting to aggressive measures.

Cloudflare’s statement underscores this principle:
“The Internet as we have known it for the past three decades is rapidly changing, but one thing remains constant: it is built on trust.”

Violations of this trust, the company warns, undermine the principles that allow the web and, by extension, AI systems built on top of it to function transparently.

The Broader Implications

This controversy isn’t just about one company. It raises broader questions about:

  • Ethical AI Development: Should AI companies honor standard web protocols, or is aggressive data acquisition a necessary evil in the race for better models?
  • Data Ownership and Consent: Who controls the content that AI scrapes, and how should consent be enforced?
  • Industry Regulation: Will this prompt calls for stronger governance and legal frameworks around AI-driven crawling?

Perplexity’s Response?

As of this writing, Perplexity has not issued an official statement addressing Cloudflare’s claims. The company, known for its rapid rise as a conversational AI competitor, now faces scrutiny not only from the tech community but potentially from regulators concerned with compliance and data ethics.

Bottom Line

The Cloudflare-Perplexity standoff signals the beginning of a larger battle over how AI companies acquire data, and whether transparency will remain a cornerstone of the internet or become collateral damage in the AI arms race.

