A technical SEO audit finds the issues that prevent your content from ranking regardless of how good it is. A page with a crawl block, a canonical pointing to the wrong URL, or a 4-second load time on mobile is not competing effectively; no amount of content quality compensates for these fundamental barriers.

I run through this checklist whenever I take on a new site or a site experiences an unexplained ranking drop. It’s organized in the order that problems compound: crawl issues prevent indexation; indexation issues prevent ranking; ranking issues prevent traffic.

If you are rebuilding a slow site, start with our Core Web Vitals masterclass and the practical guide to boosting website speed for conversions. For broader search strategy, see AI SEO in 2026. Google publishes the official Search Essentials documentation if you want the source perspective.

1. Crawlability

robots.txt

# Fetch your robots.txt
curl https://yourdomain.com/robots.txt

Check for:

  • Accidentally blocking Googlebot: Disallow: / (common staging error left in production)
  • Blocking CSS or JavaScript that renders the page: search engines need to render JS
  • Blocking important directories that contain indexed content

# Correct robots.txt for most sites
User-agent: *
Disallow: /api/
Disallow: /admin/
# Do not disallow /_next/static/ or other CSS/JS paths: Googlebot needs them to render pages

Sitemap: https://yourdomain.com/sitemap.xml

Verify that Google can fetch your robots.txt in Google Search Console → Settings → robots.txt report (the standalone robots.txt Tester has been retired).
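
To check the same thing from a script, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The URL list and the helper name `blocked_urls` are illustrative assumptions, and Python's parser is simpler than Googlebot's (no wildcard support, first matching rule wins), so treat the result as a quick sanity check rather than Google's verdict.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Paste the output of `curl https://yourdomain.com/robots.txt` here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /admin/
"""

# Hypothetical list of URLs that must stay crawlable.
MUST_BE_CRAWLABLE = [
    "https://yourdomain.com/",
    "https://yourdomain.com/blog/post-slug/",
    "https://yourdomain.com/api/health",
]

print(blocked_urls(ROBOTS_TXT, MUST_BE_CRAWLABLE))
# → ['https://yourdomain.com/api/health']
```

An empty list means none of the URLs you care about are blocked. A leftover staging `Disallow: /` would return every URL in the list.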

Sitemap

# Verify sitemap is valid and accessible
curl https://yourdomain.com/sitemap.xml
curl https://yourdomain.com/sitemap_index.xml

# Validate XML (xmllint generally cannot fetch HTTPS URLs itself, so pipe the file in)
curl -s https://yourdomain.com/sitemap.xml | xmllint --noout -

Sitemap checklist:

  • All URLs return 200 status (no 301/404 in sitemap)
  • URLs are canonical (match the <link rel="canonical"> on each page)
  • <lastmod> dates are accurate (don’t set static dates that never update)
  • Total URL count is reasonable (sitemaps support up to 50,000 URLs; use sitemap index for more)
  • Submitted in Google Search Console → Sitemaps
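
Parts of this checklist can be automated offline once you have the sitemap XML. The sketch below uses only the standard library. The function name `audit_sitemap` and the "every entry shares one lastmod" heuristic are my own assumptions, and it does not check HTTP status codes, which needs a crawler.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS_PER_SITEMAP = 50_000


def audit_sitemap(xml_text: str) -> dict:
    """Run a few offline checks on the text of a single sitemap file."""
    root = ET.fromstring(xml_text)
    entries = []
    for url_el in root.findall("sm:url", SITEMAP_NS):
        loc = url_el.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url_el.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        entries.append((loc, lastmod))

    locs = [loc for loc, _ in entries]
    lastmods = {lastmod.strip() for _, lastmod in entries if lastmod}
    return {
        "url_count": len(locs),
        "over_limit": len(locs) > MAX_URLS_PER_SITEMAP,
        "duplicates": sorted(loc for loc, n in Counter(locs).items() if n > 1),
        "non_https": sorted({loc for loc in locs if not loc.startswith("https://")}),
        # Heuristic: one shared lastmod across several URLs suggests a static date.
        "static_lastmod": len(locs) > 1 and len(lastmods) == 1,
    }


SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/</loc><lastmod>2026-04-10</lastmod></url>
  <url><loc>https://yourdomain.com/blog/post-slug/</loc><lastmod>2026-04-10</lastmod></url>
  <url><loc>http://yourdomain.com/old-page</loc><lastmod>2026-04-10</lastmod></url>
</urlset>
"""

print(audit_sitemap(SAMPLE))
```

For the sample, the report flags the `http://` URL and the identical `lastmod` values. Both are worth a manual look before assuming something is wrong.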

Crawl budget issues

For large sites (10,000+ pages), crawl budget matters. Googlebot has finite crawling capacity per domain. Wasting it on low-value pages reduces crawling of important content.

# Common crawl budget wasters to block via robots.txt or noindex:
- Faceted navigation URLs (/?color=blue&size=large → millions of combinations)
- Session IDs in URLs (?sessionid=abc123)
- Sort and filter parameter duplicates (?sort=price_asc)
- Paginated parameter variations (?page=1 through ?page=500)
- Internal search results pages
- Printer-friendly URL versions

Google retired the URL Parameters tool from Search Console in 2022, so handle parameter variations with robots.txt rules, canonical tags, and consistent internal linking instead.
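
To see how much of a crawl or log export falls into these buckets, a small classifier is usually enough. This is a sketch: the parameter names in `WASTEFUL_PARAMS` are hypothetical examples taken from the list above, so swap in the ones your site actually generates.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names; replace with the ones your site actually uses.
WASTEFUL_PARAMS = {"sessionid", "sort", "color", "size", "print"}


def crawl_waste_reasons(url: str) -> list[str]:
    """Return the query parameters in url that match the wasteful list."""
    query = urlsplit(url).query
    params = {key.lower() for key, _ in parse_qsl(query, keep_blank_values=True)}
    return sorted(params & WASTEFUL_PARAMS)


def summarize(urls: list[str]) -> Counter:
    """Count how many URLs carry each wasteful parameter."""
    counts = Counter()
    for url in urls:
        counts.update(crawl_waste_reasons(url))
    return counts


CRAWLED = [
    "https://yourdomain.com/shoes?color=blue&size=large",
    "https://yourdomain.com/shoes?sort=price_asc",
    "https://yourdomain.com/blog/post-slug/",
    "https://yourdomain.com/cart?sessionid=abc123",
]

print(summarize(CRAWLED))
```

If one parameter accounts for a large share of crawled URLs, that is the first candidate for a robots.txt rule or a canonical fix.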

2. Indexation

Coverage report

In Google Search Console → Indexing → Pages, review:

  • Indexed: Pages Google has indexed. Verify count matches expectation.
  • Not indexed: Investigate each reason:
    • “Crawled - currently not indexed” → thin content, duplicate content, or Google judged insufficient quality
    • “Discovered - currently not indexed” → crawl budget or low priority
    • “Excluded by ‘noindex’ tag” → intentional or accidental noindex
    • “Duplicate without user-selected canonical” → canonical configuration issue

Checking index status programmatically

// Use Google's URL Inspection API to check index status
// For bulk checking, use Screaming Frog with Google Search Console integration

// Manual check: site: operator
// site:yourdomain.com -- shows indexed pages
// site:yourdomain.com/blog -- shows indexed pages in /blog section

noindex audit

# Find all pages with noindex meta tag using Screaming Frog or:
grep -r "noindex" ./src --include="*.astro" --include="*.tsx" --include="*.html"

Check:

  • Production pages don’t have <meta name="robots" content="noindex">
  • The X-Robots-Tag: noindex HTTP header isn’t set for content you want indexed
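  • robots.txt isn’t being used for noindex: Google stopped supporting noindex rules there in 2019, and a Disallow rule is not a substitute because it stops Googlebot from ever seeing an on-page noindex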
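
The first two checks can be run against a fetched page rather than the source tree. A minimal sketch, assuming you already have the response body and headers; `noindex_sources` is a hypothetical helper name, and it only looks at the `robots` and `googlebot` meta names.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            content = attr.get("content", "")
            self.directives.extend(d.strip().lower() for d in content.split(","))


def noindex_sources(html: str, headers: dict[str, str]) -> list[str]:
    """Return where a noindex directive was found (empty list if nowhere)."""
    sources = []
    parser = _RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    if "noindex" in parser.directives or "none" in parser.directives:
        sources.append("meta robots")
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            sources.append("X-Robots-Tag header")
    return sources


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(noindex_sources(page, {"Content-Type": "text/html"}))
# → ['meta robots']
```

Run it over the URLs in your sitemap: any sitemap URL that returns a non-empty list is sending Google contradictory signals.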

3. Canonicalization

Canonical issues are among the most common technical SEO problems. A duplicate URL without a canonical, or a canonical pointing to the wrong URL, splits link equity and confuses indexation.

<!-- Canonical should always point to the canonical version -->
<link rel="canonical" href="https://yourdomain.com/blog/post-slug/" />

Common canonical failures:

HTTP vs HTTPS:

https://yourdomain.com/page → canonical should NOT point to http://yourdomain.com/page

Trailing slash inconsistency:

https://yourdomain.com/page  →  canonical should match the URL served (with or without slash, not both)

www vs non-www:

Redirect all traffic to one version; canonical should match

Pagination canonicals:

<!-- Each paginated page should have its own canonical, NOT point to page 1 -->
<!-- Page 3 of blog listing: -->
<link rel="canonical" href="https://yourdomain.com/blog/?page=3" />
<!-- NOT: href="https://yourdomain.com/blog/" -->

Audit canonicals at scale:

# Screaming Frog: Configuration → Spider → Extraction → Add custom extraction for canonical
# Or use a sitemap audit to spot-check pages
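
For spot checks without a crawler, the failure modes above can be tested per page. The sketch below assumes you pass in the URL you requested and the HTML you got back; `canonical_issues` is a hypothetical helper. It only flags the specific failures listed in this section, not every canonical that differs from the page URL, since some of those are intentional.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in the document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_issues(page_url: str, html: str) -> list[str]:
    """Compare a page's canonical tag with the URL it was served from."""
    parser = _CanonicalParser()
    parser.feed(html)
    if not parser.canonicals:
        return ["missing canonical"]
    if len(parser.canonicals) > 1:
        return ["multiple canonical tags"]

    page, canon = urlsplit(page_url), urlsplit(parser.canonicals[0])
    if not canon.scheme or not canon.netloc:
        return ["canonical is not an absolute URL"]

    issues = []
    if canon.scheme != "https":
        issues.append("canonical is not HTTPS")
    if canon.netloc.lower() != page.netloc.lower():
        issues.append("host differs (www vs non-www?)")
    page_path, canon_path = page.path or "/", canon.path or "/"
    if page_path != canon_path and page_path.rstrip("/") == canon_path.rstrip("/"):
        issues.append("trailing slash mismatch")
    return issues


html = '<link rel="canonical" href="http://www.yourdomain.com/page/" />'
print(canonical_issues("https://yourdomain.com/page", html))
# → ['canonical is not HTTPS', 'host differs (www vs non-www?)', 'trailing slash mismatch']
```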

4. Core Web Vitals

Google uses CWV as a ranking signal. The thresholds that matter:

| Metric | Good | Needs Improvement | Poor |
| --- | --- | --- | --- |
| LCP (Largest Contentful Paint) | < 2.5s | 2.5s – 4s | > 4s |
| INP (Interaction to Next Paint) | < 200ms | 200ms – 500ms | > 500ms |
| CLS (Cumulative Layout Shift) | < 0.1 | 0.1 – 0.25 | > 0.25 |
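
If you are collecting these numbers from several pages, it helps to bucket them automatically. A small sketch of the thresholds above: the function name `rate` is an assumption, and I treat a value that sits exactly on a boundary as the better rating.

```python
# (upper bound for "good", upper bound for "needs improvement") per metric.
# LCP is in seconds, INP in milliseconds, CLS is unitless.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def rate(metric: str, value: float) -> str:
    """Classify a Core Web Vitals measurement as good / needs improvement / poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs improvement"
    return "poor"


# Hypothetical field measurements for one page.
measurements = {"LCP": 2.1, "INP": 350, "CLS": 0.3}
for metric, value in measurements.items():
    print(f"{metric}: {value} -> {rate(metric, value)}")
```

Feed it field data (real-user measurements) where you have it. Lab numbers from a single Lighthouse run can differ noticeably from what users experience.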

Check CWV:

# PageSpeed Insights API (field data + lab data)
curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://yourdomain.com&strategy=mobile&key=YOUR_API_KEY"

# Lighthouse CLI (emulates a mobile device by default; add --preset=desktop for desktop)
lighthouse https://yourdomain.com --output=json --output-path=./lighthouse-report.json

# CrUX (real user data): Google Search Console → Experience → Core Web Vitals

LCP fixes (most impactful):

  • Preload the LCP image: <link rel="preload" as="image" href="hero.webp" fetchpriority="high">
  • Never lazy-load the LCP element
  • Use AVIF/WebP and appropriate sizes
  • Use a fast CDN

CLS fixes:

  • Set explicit width and height on all images and videos
  • Reserve space for late-loading embeds (ads, iframes)
  • Avoid inserting content above existing content after load

INP fixes:

  • Reduce JavaScript execution time on user interaction
  • Use scheduler.postTask() or requestIdleCallback() for non-urgent work
  • Avoid long tasks (> 50ms) on the main thread.

5. Mobile Usability

Google uses mobile-first indexing; the mobile version of your page is what’s indexed and ranked.

Google retired the Search Console Mobile Usability report in late 2023, so check pages with Lighthouse or Chrome DevTools device emulation instead. Common failures:

  • Text too small to read (below 12px font)
  • Clickable elements too close together (touch targets under 48×48px)
  • Content wider than screen (horizontal scroll on mobile)
  • Viewport not configured: missing <meta name="viewport" content="width=device-width, initial-scale=1">
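
The viewport check is the easiest of these to automate. A minimal sketch, assuming you have the page HTML as a string; `viewport_issues` is a hypothetical helper and it only checks the `width` setting.

```python
from html.parser import HTMLParser


class _ViewportParser(HTMLParser):
    """Captures the content attribute of <meta name="viewport">, if present."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "viewport":
            self.content = attr.get("content") or ""


def viewport_issues(html: str) -> list[str]:
    """Return problems with the page's viewport configuration."""
    parser = _ViewportParser()
    parser.feed(html)
    if parser.content is None:
        return ["missing viewport meta tag"]

    # Parse "width=device-width, initial-scale=1" into a dict of settings.
    settings = {}
    for part in parser.content.split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            settings[key.strip().lower()] = value.strip().lower()

    if settings.get("width") != "device-width":
        return ["viewport width is not device-width"]
    return []


good = '<meta name="viewport" content="width=device-width, initial-scale=1">'
print(viewport_issues(good))                  # → []
print(viewport_issues("<head></head>"))       # → ['missing viewport meta tag']
```

Font sizes and tap-target spacing depend on rendered layout, so those still need a browser-based tool.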

6. Internal Linking

Search engines find and prioritize pages through internal links. Pages that are difficult to reach from the homepage are treated as low-priority.

Crawl depth audit:

Level 1: Homepage
Level 2: Category pages, main navigation targets  (1 click from home)
Level 3: Individual product/post pages            (2 clicks from home)
Level 4+: Archived, low-priority content         (3+ clicks -- may be under-crawled)

Important pages should be reachable within 3 clicks from the homepage. If a critical page is only linked from deep in paginated archives, add it to a relevant category page or the sitemap.

Orphaned pages: Pages with no internal links are invisible to Googlebot (unless in the sitemap). Use Screaming Frog to find pages in your sitemap that aren’t linked internally.
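
Both checks reduce to simple graph operations once you have an export of internal links. The sketch below assumes a hypothetical `links` mapping of page to the pages it links to (for example, built from a crawler export) and uses breadth-first search to count clicks from the homepage.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphaned(sitemap_urls: list[str], links: dict[str, list[str]], home: str) -> list[str]:
    """Sitemap URLs that no crawled page links to."""
    linked = {target for targets in links.values() for target in targets} | {home}
    return sorted(set(sitemap_urls) - linked)


# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/post-a/"],
    "/blog/post-a/": ["/blog/post-b/"],
    "/blog/post-b/": ["/blog/archive-2019/"],
}
SITEMAP = ["/", "/blog/", "/products/", "/blog/post-a/", "/landing/old-campaign/"]

depths = click_depths(LINKS, "/")
print([page for page, depth in depths.items() if depth > 3])  # → ['/blog/archive-2019/']
print(orphaned(SITEMAP, LINKS, "/"))                          # → ['/landing/old-campaign/']
```

Pages deeper than three clicks are candidates for a link from a category page. Orphans either need an internal link or should come out of the sitemap.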

7. Structured Data

Structured data makes pages eligible for rich results (review stars, recipe cards, event dates) in search; FAQ accordions are now shown only for a narrow set of authoritative sites. Verify implementation with the Schema Markup Validator and Google’s Rich Results Test.

<!-- Article schema for blog posts -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Technical SEO Audit Checklist",
  "author": {
    "@type": "Person",
    "name": "Curtis Harrison",
    "url": "https://yourdomain.com/about"
  },
  "datePublished": "2026-04-10",
  "dateModified": "2026-04-10",
  "publisher": {
    "@type": "Organization",
    "name": "Veduis",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yourdomain.com/logo.png"
    }
  }
}
</script>
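
Before pasting pages into a validator one by one, a script can catch the obvious breakage: JSON-LD that does not parse, or an Article block missing fields. This is a sketch only. `EXPECTED_ARTICLE_FIELDS` mirrors the example above rather than any official list of required properties, and the function name is an assumption.

```python
import json
from html.parser import HTMLParser

# Fields used in the example above; an assumption, not Google's requirements.
EXPECTED_ARTICLE_FIELDS = ("headline", "author", "datePublished", "dateModified", "publisher")


class _JsonLdParser(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def article_schema_problems(html: str) -> list[str]:
    """Report unparseable JSON-LD and Article objects missing expected fields."""
    parser = _JsonLdParser()
    parser.feed(html)
    problems, articles = [], 0
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON-LD: {exc.msg}")
            continue
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Article":
                articles += 1
                problems += [f"Article missing {field}" for field in EXPECTED_ARTICLE_FIELDS
                             if field not in item]
    if articles == 0 and not problems:
        problems.append("no Article JSON-LD found")
    return problems


PAGE = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Technical SEO Audit Checklist",
 "author": {"@type": "Person", "name": "Curtis Harrison"},
 "datePublished": "2026-04-10"}
</script>"""

print(article_schema_problems(PAGE))
# → ['Article missing dateModified', 'Article missing publisher']
```

A clean result here only means the markup is well-formed. Eligibility for rich results still has to be confirmed in the Rich Results Test.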

8. Redirect Audit

# Find redirect chains (A → B → C is a chain; should be A → C)
# Use Screaming Frog: Mode → Spider, check "Follow Internal Redirects"

# Check for 302 instead of 301 (temporary vs permanent)
curl -I https://yourdomain.com/old-url
# Should see: HTTP/2 301 (not 302) for permanent content moves

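Redirect chains add latency at every hop, waste crawl requests, and can dilute the signals passed to the final URL; Googlebot also gives up after a limited number of hops. Flatten chains to single-hop redirects. Convert 302 redirects on permanently moved content to 301.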

9. Page Speed for SEO

Beyond CWV, overall page speed affects crawl efficiency and user experience:

# Check Time To First Byte (TTFB)
curl -o /dev/null -s -w "TTFB: %{time_starttransfer}s\n" https://yourdomain.com

# A TTFB under 800ms is good; under 200ms is excellent

TTFB fixes: enable HTTP/2, use a CDN, improve server response time, implement caching.

Running the Audit: Tool Stack

| Task | Tool |
| --- | --- |
| Full-site crawl | Screaming Frog (£149/year) |
| Google’s view of your site | Google Search Console (free) |
| Performance + CWV | PageSpeed Insights, Lighthouse |
| Log file analysis | Screaming Frog Log File Analyser |
| Backlink audit | Ahrefs or Semrush |
| Structured data testing | Google Rich Results Test |

Run a full audit quarterly. Run the CWV and indexation checks monthly. Set up Google Search Console email alerts for coverage drops and manual actions; these require immediate attention.