App Reviews · 12 min read

AI Assistant App Reviews: ChatGPT vs Claude vs Gemini vs Perplexity in 2026

In-depth analysis of negative reviews for the top AI assistant apps. See what users complain about most in ChatGPT, Claude, Gemini, Perplexity, and Copilot — and which app handles user feedback best.

AI assistant apps have gone from novelty to daily-use utilities in less than three years. By 2026, ChatGPT, Claude, Gemini, Perplexity, and Microsoft Copilot collectively account for hundreds of millions of monthly active users on iOS and Android. But behind the polished marketing and impressive demos, the App Store and Google Play tell a different story — one written in 1-star reviews.

We analyzed thousands of negative reviews across the top AI assistant apps to understand what frustrates users most, where each app stumbles, and what these complaints reveal about the future of consumer AI.

Why AI App Reviews Matter More Than You Think

AI assistants are different from other apps in three critical ways:

  • Trust is everything — Users hand over personal questions, work documents, and creative projects. A single hallucination or privacy slip can end the relationship
  • Switching cost is near zero — Unlike productivity tools, users haven't built up years of data inside an AI chatbot. Migration takes 30 seconds
  • Expectations move fast — A model that wowed users in January feels outdated by July. Reviews reflect this constant moving target

That makes negative reviews especially valuable. They're not just complaints — they're a real-time signal of where the AI race is being won or lost.

Top Complaints Across All AI Assistant Apps

1. Hallucinations and Confidently Wrong Answers (24%)

The single most common complaint across every AI app:

  • "It made up a citation that doesn't exist" — fabricated sources
  • "Told me a recipe with poison ingredients" — dangerous misinformation
  • "Gave me the wrong tax advice and I got audited" — high-stakes errors
  • "Insists 2+2=5 even when corrected" — refusal to admit mistakes
  • "Cited a court case that never happened" — legal hallucinations

Hallucinations are the original sin of LLM-based assistants, and three years of progress haven't fully solved them. Users have become more sophisticated — they no longer accept "AI sometimes makes mistakes" as a sufficient disclaimer.

2. Aggressive Subscription Pushes (19%)

  • "Paywalled after 3 messages" — overly restrictive free tiers
  • "Says 'upgrade for better answers' constantly" — interruption-based monetization
  • "Free version is unusable, just a tease" — degraded free experiences
  • "Charged me $20 to get the same answer Google gives free" — value perception
  • "Auto-renewed without warning" — billing dark patterns

The subscription complaint pattern intensifies whenever an AI app tightens its free tier. ChatGPT and Gemini both faced waves of negative reviews after restricting free access to their best models.

3. Privacy and Data Concerns (14%)

  • "Read my entire chat history without consent"
  • "Uses my conversations to train their model"
  • "Asks for contact and microphone permissions for no reason"
  • "Can't delete my history"
  • "Showed me an ad based on something I told it in private"

Privacy concerns are growing fastest in EU countries and have become the dominant complaint in German and French App Store markets specifically.

4. App Crashes and Connection Errors (12%)

  • "Crashes mid-response" — losing answers in progress
  • "Network error every other message" — unreliable connectivity handling
  • "App freezes when I try to upload an image"
  • "Can't continue a long conversation, hits memory limit"
  • "Voice mode disconnects constantly"

Stability complaints disproportionately affect Android users, where the variety of devices makes consistent performance harder to achieve.

5. Censorship and Over-Refusals (10%)

  • "Refused to help me write a fictional villain" — creative writing limits
  • "Won't even discuss medical symptoms" — overly cautious safety filters
  • "Lectures me about ethics when I just want a recipe" — moralizing responses
  • "Treats every question like I'm a criminal"
  • "Used to answer this, now it won't"

Over-refusal complaints have grown sharply across all AI apps. Users perceive safety filters as condescending, and many believe newer model versions are more restrictive than their predecessors — even when this isn't strictly true.

6. Memory and Context Failures (8%)

  • "Forgets what I said two messages ago"
  • "Can't remember my preferences between sessions"
  • "Loses track in long conversations"
  • "Memory feature doesn't actually remember anything important"
  • "Mixes up different conversations"

7. Voice Mode Issues (6%)

  • "Voice mode interrupts me constantly"
  • "Can't understand my accent"
  • "Voice is robotic and unnatural"
  • "Latency makes it unusable for real conversation"
  • "Drops the call when I pause to think"

8. Image Generation Quality (4%)

  • "Can't draw hands correctly in 2026"
  • "Image generation is worse than DALL-E was in 2023"
  • "Won't generate an image of myself even with permission"
  • "Watermarks ruin my images"

9. Slow Response Times (3%)

  • "Takes 30 seconds to respond to a simple question"
  • "Free tier is intentionally throttled"
  • "Was fast last month, now it's crawling"

App-by-App Deep Dive

ChatGPT (OpenAI)

Primary complaint: Subscription restrictions and frequent paywall prompts

Star average trend: Slowly declining as free tier tightens

Typical negative reviewer: Long-time free user who feels increasingly squeezed

Standout positive in negative reviews: Voice mode quality is praised even in 1-star reviews

Recurring 1-star theme: "It used to be smarter" — perception that GPT has been quantized or degraded

Claude (Anthropic)

Primary complaint: Over-refusals and overly cautious responses

Star average trend: Stable, with strong loyalty among existing users

Typical negative reviewer: Creative writer or developer hitting safety filters

Standout positive in negative reviews: Writing quality and thoughtfulness consistently praised

Recurring 1-star theme: "Refuses to help with basic questions" — though long-form writing reviews are overwhelmingly positive

Gemini (Google)

Primary complaint: Hallucinations and confidently incorrect answers

Star average trend: Improving slowly after a rough 2024

Typical negative reviewer: User who expected Google-level accuracy

Standout positive in negative reviews: Deep integration with Google services

Recurring 1-star theme: "Worse than just searching Google" — comparison to Google Search itself

Perplexity

Primary complaint: Source quality and citation accuracy

Star average trend: Strong, with passionate user base

Typical negative reviewer: Researcher who caught a fabricated source

Standout positive in negative reviews: Source-citation approach is the most-praised feature even in critical reviews

Recurring 1-star theme: "Cited a source that doesn't say what they claim"

Microsoft Copilot

Primary complaint: Identity confusion ("Is this Bing? GPT? Copilot?")

Star average trend: Volatile, swinging with each rebrand

Typical negative reviewer: User confused by overlapping Microsoft AI products

Standout positive in negative reviews: Office integration when it works

Recurring 1-star theme: "Microsoft keeps renaming and breaking it"

The Geographic Pattern of AI Complaints

Negative reviews vary significantly by country:

  • United States — Quality of answers and value for money dominate
  • Germany — Privacy and data handling lead complaints
  • France — Language quality and French-specific knowledge
  • Japan — Tone, politeness, and cultural appropriateness
  • India — Language support and pricing in local currency
  • Brazil — Portuguese language quality and local context

This geographic variance is critical for AI app developers. A single global model strategy underperforms in markets where users expect culturally and linguistically tuned responses.

What AI Apps Can Learn From Their Negative Reviews

Stop Pretending Hallucinations Are Solved

Users have heard "we improved factuality" too many times. The apps gaining trust are the ones that surface uncertainty visually — confidence indicators, source previews, and clear "I don't know" responses.

Free Tier Restrictions Backfire

Every aggressive paywall tightening generates a measurable spike in 1-star reviews. The apps with the most stable ratings give meaningful free access and let users self-select into paid plans based on usage volume, not feature gating.

Privacy Disclosures Need to Be Specific

"We respect your privacy" doesn't work anymore. Users want to know exactly: Is my chat used for training? Is it stored? Who can see it? Apps with detailed in-app privacy explainers receive significantly fewer privacy complaints.

Refusal Messages Should Explain

The refusal message users hate most is a bare "I can't help with that" with no explanation. Users tolerate refusals when they understand the reason. They rebel when they feel arbitrarily blocked.

Voice Mode Is the New Battleground

Mentions of voice mode in reviews tripled in 2025 and continue to grow. The app that nails low-latency, natural voice conversation will earn a generation of users who already prefer talking to typing.

How Reviews Are Changing AI Product Strategy

The most interesting insight from analyzing AI app reviews isn't what users complain about — it's how quickly successful AI companies respond. The fastest-improving apps in our analysis show a clear pattern:

  • Monitor 1-3 star reviews daily
  • Cluster complaints by theme weekly
  • Address top complaints in monthly release notes
  • Reply directly to negative reviewers when fixes ship
  • Track whether the same complaints recur

The slowest-improving apps treat reviews as a vanity metric. The fastest-improving apps treat them as a free product roadmap from their most engaged users.
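The last step in that loop — tracking whether the same complaints recur — can be sketched as a simple comparison of clustered theme counts across two review periods. The theme names and counts below are hypothetical, purely to illustrate the check:

```python
from collections import Counter

def recurring_themes(prev_period: Counter, this_period: Counter,
                     min_count: int = 5) -> list[str]:
    """Themes that stayed prominent across two consecutive review
    periods — candidates for the next release's fix list."""
    return sorted(
        theme for theme in this_period
        if this_period[theme] >= min_count
        and prev_period.get(theme, 0) >= min_count
    )

# Hypothetical monthly theme counts from clustered 1-3 star reviews.
january = Counter({"hallucination": 40, "paywall": 22, "crash": 4})
february = Counter({"hallucination": 35, "paywall": 30, "voice": 8})
```

A theme that clears the threshold two months in a row (here, `hallucination` and `paywall`) is a signal that last month's release notes did not actually fix the problem users complained about.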

Analyze AI App Reviews Yourself

Curious how a specific AI assistant is performing in your country, or how its rating has changed after a major update? Unstar.app lets you analyze negative reviews for any app on the App Store or Google Play. See word cloud breakdowns of common complaints, country-by-country rating differences, and version-specific trends — perfect for evaluating which AI tools are actually worth using and which are coasting on hype.

The AI assistant market in 2026 is finally maturing past the "wow, it can talk" phase. Users have higher expectations, sharper criticism, and zero patience for the same problems they were promised would be fixed last year. The apps that listen to negative reviews — not just count their average rating — will define the next phase of consumer AI.

ai apps, chatgpt, claude, gemini, perplexity, copilot, negative reviews, app analysis, ai assistants

Ready to analyze your app's negative reviews?

See what users really complain about — for free.

Try Unstar.app