# SEO Setup Checklist
Complete technical SEO playbook based on the ActiveWizards Astro site (147 pages, 115 blog posts, 800+ redirects). Every pattern is production-tested. Covers on-page SEO, structured data, GEO (Generative Engine Optimization), and performance.
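
## 1. Meta Tags

### Required on every page

```html
<title>Page Title | Brand</title>
<meta name="description" content="A unique 120-155 character summary of the page." />
<link rel="canonical" href="https://mysite.com/blog/my-post/" />
```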
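
### Implementation in Astro SEO component

```astro
---
const { title, description, canonical } = Astro.props;
const siteUrl = 'https://mysite.com';
const canonicalUrl = canonical || new URL(Astro.url.pathname, siteUrl).href;
---
<title>{title}</title>
<meta name="description" content={description} />
<link rel="canonical" href={canonicalUrl} />
```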

### Title tag rules

| Rule | Limit | Example or rationale |
|---|---|---|
| Length | 50-60 characters | "Building RAG Pipelines with LangChain \| ActiveWizards" |
| Format | `Page Title \| Brand` | Keeps brand consistent, page-specific first |
| Primary keyword | Within first 60 chars | Front-load the target keyword |
| Unique per page | No duplicates | Every page has a distinct title |

### Meta description rules

| Rule | Limit | Example or rationale |
|---|---|---|
| Length | 120-155 characters | Long enough for context, short enough to not truncate |
| Include keyword | Naturally | Not keyword-stuffed |
| Include CTA | When appropriate | "Learn how..." or "See our approach to..." |
| Unique per page | No duplicates | Every page has a distinct description |
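
The title and description rules above lend themselves to an automated check at build time. The sketch below is one possible approach, not the site's actual tooling: `PageMeta` and `auditMeta` are hypothetical names, and the limits are copied from the two tables.

```typescript
// Length limits taken from the two tables above.
const TITLE_RANGE = { min: 50, max: 60 };
const DESCRIPTION_RANGE = { min: 120, max: 155 };

// Hypothetical shape for one page's metadata.
interface PageMeta {
  path: string;
  title: string;
  description: string;
}

// Returns one message per violated rule (length out of range or duplicate text).
function auditMeta(pages: PageMeta[]): string[] {
  const problems: string[] = [];
  const titleOwners = new Map<string, string>();
  const descriptionOwners = new Map<string, string>();

  for (const { path, title, description } of pages) {
    if (title.length < TITLE_RANGE.min || title.length > TITLE_RANGE.max) {
      problems.push(`${path}: title is ${title.length} chars, want 50-60`);
    }
    if (
      description.length < DESCRIPTION_RANGE.min ||
      description.length > DESCRIPTION_RANGE.max
    ) {
      problems.push(`${path}: description is ${description.length} chars, want 120-155`);
    }

    const titleOwner = titleOwners.get(title);
    if (titleOwner !== undefined) {
      problems.push(`${path}: title duplicates ${titleOwner}`);
    } else {
      titleOwners.set(title, path);
    }

    const descriptionOwner = descriptionOwners.get(description);
    if (descriptionOwner !== undefined) {
      problems.push(`${path}: description duplicates ${descriptionOwner}`);
    } else {
      descriptionOwners.set(description, path);
    }
  }
  return problems;
}
```

Running this over every page's frontmatter and failing the build when the returned list is non-empty keeps the limits enforced as content grows.
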

### Canonical URL rules

- Always use an absolute URL: `https://mysite.com/blog/my-post/`
- Include the trailing slash consistently (match your Astro config)
- Self-referencing canonical on every page (even when not deduping)
- For paginated content: each page gets its own canonical
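
A small helper can enforce the canonical rules in one place. This is a minimal sketch under stated assumptions: `canonicalFor` is a hypothetical name, the site uses trailing slashes, and query strings and fragments are dropped (skip that step if your pagination relies on query parameters).

```typescript
// Builds an absolute canonical URL with a trailing slash.
// Assumption: query strings and fragments never belong in a canonical URL on this site.
function canonicalFor(pathname: string, siteUrl: string): string {
  const url = new URL(pathname, siteUrl);
  url.search = "";
  url.hash = "";

  // Leave file-like paths such as /rss.xml untouched.
  const lastSegment = url.pathname.split("/").pop() ?? "";
  const looksLikeFile = lastSegment.includes(".");
  if (!url.pathname.endsWith("/") && !looksLikeFile) {
    url.pathname += "/";
  }
  return url.href;
}
```

For example, `canonicalFor("/blog/my-post", "https://mysite.com")` yields `https://mysite.com/blog/my-post/`.
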
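
## 2. Open Graph Tags

```html
<meta property="og:type" content="website" />
<meta property="og:title" content="Page Title | Brand" />
<meta property="og:description" content="A unique 120-155 character summary of the page." />
<meta property="og:url" content="https://mysite.com/blog/my-post/" />
<meta property="og:image" content="https://mysite.com/images/brand/og-default.png" />
```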
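
### Article-specific OG tags

```html
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2025-11-15T00:00:00Z" />
<meta property="article:modified_time" content="2026-01-20T00:00:00Z" />
<meta property="article:author" content="https://mysite.com/about/" />
<meta property="article:tag" content="RAG" />
```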

### OG image specifications

- Dimensions: 1200 x 630 pixels (1.91:1 ratio)
- Format: PNG for text-heavy, JPG for photo-heavy
- File size: Under 1MB (under 300KB preferred)
- Text safe area: Keep important text in the center 800x400 area
- Fallback: Set a site-wide default OG image for pages without a custom one
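
### Implementation pattern

```astro
---
// Use the page's image when provided, otherwise the site-wide default.
const ogImage = Astro.props.ogImage || '/images/brand/og-default.png';
// Social crawlers need an absolute URL, so resolve relative paths against the site origin.
const ogImageUrl = ogImage.startsWith('http')
  ? ogImage
  : new URL(ogImage, 'https://mysite.com').href;
---
```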
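
## 3. Twitter Cards

```html
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Page Title | Brand" />
<meta name="twitter:description" content="A unique 120-155 character summary of the page." />
<meta name="twitter:image" content="https://mysite.com/images/brand/og-default.png" />
```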

### Card types

| Type | Use case | Image size |
|---|---|---|
| `summary_large_image` | Blog posts, landing pages | 1200x630 |
| `summary` | Generic pages | 144x144 minimum |

### Validation

Test at cards-dev.twitter.com/validator before launch.
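
## 4. JSON-LD Structured Data

### Base pattern (every page)

```astro
---
// `schema` is the page's JSON-LD object (prop name assumed).
const { schema } = Astro.props;
---
<script type="application/ld+json" set:html={JSON.stringify(schema)} />
```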

### Organization (site-wide, on homepage)

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Company",
  "url": "https://mysite.com",
  "logo": {
    "@type": "ImageObject",
    "url": "https://mysite.com/images/logo.png"
  },
  "sameAs": [
    "https://linkedin.com/company/your-company",
    "https://github.com/your-company"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "email": "contact@mysite.com",
    "contactType": "customer service"
  }
}
```

### WebSite (homepage)

```json
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Your Site",
  "url": "https://mysite.com",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://mysite.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
```

### TechArticle (blog posts)

```json
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Building Production RAG Pipelines",
  "description": "Complete guide to RAG in production.",
  "url": "https://mysite.com/blog/rag-pipelines/",
  "datePublished": "2025-11-15T00:00:00Z",
  "dateModified": "2026-01-20T00:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Igor Bobriakov",
    "url": "https://mysite.com/about/"
  },
  "publisher": {
    "@type": "Organization",
    "name": "ActiveWizards",
    "url": "https://mysite.com",
    "logo": {
      "@type": "ImageObject",
      "url": "https://mysite.com/images/brand/og-default.svg"
    }
  },
  "about": "How to build production RAG pipelines",
  "dependencies": "LangChain",
  "proficiencyLevel": "Expert",
  "softwareVersion": "0.3.x",
  "timeRequired": "PT12M",
  "keywords": "RAG, LangChain, Vector DB"
}
```
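
The `timeRequired` value in the schema above is an ISO 8601 duration. One way to derive it from the article body is sketched below; the 200 words-per-minute reading speed is an assumed rule of thumb, not a figure from this playbook, and both function names are hypothetical.

```typescript
// Assumed reading speed; adjust to taste.
const WORDS_PER_MINUTE = 200;

// Estimates reading time in whole minutes, never less than one.
function readingMinutes(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.max(1, Math.ceil(words / WORDS_PER_MINUTE));
}

// Formats whole minutes as an ISO 8601 duration, e.g. 12 -> "PT12M", 75 -> "PT1H15M".
function toIsoDuration(minutes: number): string {
  const hours = Math.floor(minutes / 60);
  const rest = minutes % 60;
  const hourPart = hours > 0 ? `${hours}H` : "";
  const minutePart = rest > 0 || hours === 0 ? `${rest}M` : "";
  return `PT${hourPart}${minutePart}`;
}
```

The same `readingMinutes` result can feed the visible "12 min read" label, so the displayed value and the schema value stay in sync.
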

### BreadcrumbList

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://mysite.com/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Blog",
      "item": "https://mysite.com/blog/"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "RAG Pipelines"
    }
  ]
}
```
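
Writing breadcrumb JSON by hand for every page is error-prone, so it is usually generated. The following is a sketch only: `Crumb`, `ListItem`, and `breadcrumbList` are hypothetical names, and the last crumb omits `item` to mirror the example above.

```typescript
// One breadcrumb entry; leave `path` undefined for the current page.
interface Crumb {
  name: string;
  path?: string;
}

interface ListItem {
  "@type": "ListItem";
  position: number;
  name: string;
  item?: string;
}

// Builds a BreadcrumbList object with 1-based positions and absolute URLs.
function breadcrumbList(crumbs: Crumb[], siteUrl: string) {
  const itemListElement = crumbs.map((crumb, index): ListItem => {
    const entry: ListItem = {
      "@type": "ListItem",
      position: index + 1,
      name: crumb.name,
    };
    if (crumb.path !== undefined) {
      entry.item = new URL(crumb.path, siteUrl).href;
    }
    return entry;
  });

  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement,
  };
}
```

Calling it with Home, Blog, and a final crumb without a path reproduces the three-item structure shown in the JSON.
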

### FAQPage (for pages with FAQ sections)

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is RAG?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Retrieval-Augmented Generation combines..."
      }
    },
    {
      "@type": "Question",
      "name": "When should I use RAG vs fine-tuning?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Use RAG when your knowledge base changes frequently..."
      }
    }
  ]
}
```
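
If FAQ entries already live in frontmatter or a data file, the schema can be built from them rather than duplicated. A minimal sketch, assuming a hypothetical `Faq` shape; `toJsonLd` additionally escapes `<` so that answer text cannot terminate the surrounding script tag early.

```typescript
// Hypothetical shape for one FAQ entry.
interface Faq {
  question: string;
  answer: string;
}

// Builds a FAQPage object in the same structure as the JSON above.
function faqPageSchema(faqs: Faq[]) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map(({ question, answer }) => ({
      "@type": "Question",
      name: question,
      acceptedAnswer: { "@type": "Answer", text: answer },
    })),
  };
}

// Serializes any schema object for embedding in a JSON-LD script tag.
// "<" becomes the JSON escape \u003c, which parsers decode back to "<".
function toJsonLd(schema: object): string {
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}
```
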

### Validation

Test all JSON-LD at search.google.com/test/rich-results.

## 5. GEO (Generative Engine Optimization)

GEO optimizes content for AI-powered search (Google AI Overviews, Perplexity, ChatGPT search). It extends traditional SEO with structured signals that LLMs parse.

### Frontmatter GEO fields

```yaml
title: "Building Production RAG Pipelines with LangChain"
problem: "How to build a production RAG pipeline"
technology: "LangChain"
technologyVersion: "0.3.x"
persona: "ML Engineer"
```

### TechArticle schema extensions for GEO

```json
{
  "@type": "TechArticle",
  "about": "How to build a production RAG pipeline",
  "dependencies": "LangChain",
  "proficiencyLevel": "Expert",
  "softwareVersion": "0.3.x",
  "timeRequired": "PT12M"
}
```
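
The frontmatter fields and the schema extensions above share values, which suggests a direct mapping. The sketch below infers that mapping from the matching example values (`problem` to `about`, `technology` to `dependencies`, `technologyVersion` to `softwareVersion`); it is an assumption, not a documented rule, and `persona` is left unmapped because the source does not say how it relates to `proficiencyLevel`.

```typescript
// Frontmatter fields from the YAML example above.
interface GeoFrontmatter {
  title: string;
  problem?: string;
  technology?: string;
  technologyVersion?: string;
  persona?: string;
}

interface TechArticleGeo {
  "@type": "TechArticle";
  headline: string;
  about?: string;
  dependencies?: string;
  softwareVersion?: string;
}

// Copies GEO frontmatter into TechArticle properties, skipping empty fields.
function geoFields(frontmatter: GeoFrontmatter): TechArticleGeo {
  const schema: TechArticleGeo = {
    "@type": "TechArticle",
    headline: frontmatter.title,
  };
  if (frontmatter.problem) schema.about = frontmatter.problem;
  if (frontmatter.technology) schema.dependencies = frontmatter.technology;
  if (frontmatter.technologyVersion) schema.softwareVersion = frontmatter.technologyVersion;
  return schema;
}
```
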

### Content patterns that improve GEO

### Content structure for GEO

```markdown
## TL;DR
- Key point 1 (direct answer to the primary query)
- Key point 2
- Key point 3

## The Problem
[1-2 paragraphs defining the problem clearly]

## Architecture / Approach
[Technical explanation with diagram]

## Implementation
[Step-by-step with code examples]

## Benchmarks / Results
[Concrete metrics, comparison table]

## FAQ
[3-5 question-answer pairs targeting long-tail queries]
```
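
## 6. Sitemap

### Configuration

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  // `site` is required: the integration builds absolute URLs from it.
  site: 'https://mysite.com',
  integrations: [sitemap()],
});
```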

### Submit to GSC

Submit `https://mysite.com/sitemap-index.xml` in the Sitemaps report of Google Search Console.

### Sitemap rules

- Include all indexable pages
- Exclude 404 page, admin pages, draft pages
- Update `lastmod` when content changes
- Maximum 50,000 URLs per sitemap file (Astro handles splitting automatically)
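
The exclusion rule can be expressed as a predicate and handed to the sitemap integration's `filter` option, which receives each page's full URL. The path lists below are assumptions drawn from the rules above; adjust them to your routes.

```typescript
// Exact paths and path prefixes to keep out of the sitemap (assumed values).
const EXCLUDED_PATHS = new Set(["/404", "/404/", "/404.html"]);
const EXCLUDED_PREFIXES = ["/admin/", "/draft/"];

// Returns true when a page URL should appear in the sitemap.
function isIndexable(pageUrl: string): boolean {
  const { pathname } = new URL(pageUrl);
  if (EXCLUDED_PATHS.has(pathname)) return false;
  return !EXCLUDED_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}
```

In `astro.config.mjs` this would be wired up as `sitemap({ filter: isIndexable })`.
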

## 7. robots.txt

### Standard configuration

```
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /draft/
Disallow: /api/

Sitemap: https://mysite.com/sitemap-index.xml
```

### With astro-robots-txt integration

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import robotsTxt from 'astro-robots-txt';

export default defineConfig({
  site: 'https://mysite.com',
  integrations: [robotsTxt()],
});
```

Auto-generates `robots.txt` with a sitemap reference.

## 8. Heading Hierarchy

### Rules

| Rule | Why |
|---|---|
| Single H1 per page | Tells search engines the primary topic |
| H1 matches the page's target keyword | Strongest on-page signal |
| Logical H2-H4 nesting | H3 under H2, H4 under H3; never skip levels |
| H2s define major sections | Each H2 could be a standalone search result |
| No heading tags for styling | Use CSS classes, not heading elements, to control text size |
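
The single-H1 and no-skipped-levels rules can be checked mechanically against built HTML. This is a rough sketch with hypothetical function names; the regular expression is a naive heading matcher that suits generated markup, not a full HTML parser.

```typescript
// Extracts heading levels (1-6) from HTML in document order.
function headingLevels(html: string): number[] {
  const levels: number[] = [];
  const pattern = /<h([1-6])[\s>]/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    levels.push(Number(match[1]));
  }
  return levels;
}

// Reports a missing or repeated H1 and any jump of more than one level downward.
function checkHeadingLevels(levels: number[]): string[] {
  const problems: string[] = [];
  const h1Count = levels.filter((level) => level === 1).length;
  if (h1Count !== 1) {
    problems.push(`expected exactly one H1, found ${h1Count}`);
  }

  let previous = 0;
  for (const level of levels) {
    if (previous !== 0 && level > previous + 1) {
      problems.push(`H${previous} is followed by H${level} (skipped level)`);
    }
    previous = level;
  }
  return problems;
}
```
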

### Example structure

```html
<h1>Building Production RAG Pipelines with LangChain</h1>
  <h2>Architecture Overview</h2>
    <h3>Document Ingestion Layer</h3>
    <h3>Vector Store Selection</h3>
  <h2>Implementation Guide</h2>
    <h3>Step 1: Chunking Strategy</h3>
    <h3>Step 2: Embedding Pipeline</h3>
    <h3>Step 3: Retrieval Configuration</h3>
  <h2>Performance Benchmarks</h2>
  <h2>FAQ</h2>
```
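
### Heading anchors (for deep linking)

Use `rehype-slug` + `rehype-autolink-headings`:

```js
// astro.config.mjs
import { defineConfig } from 'astro/config';
import rehypeSlug from 'rehype-slug';
import rehypeAutolinkHeadings from 'rehype-autolink-headings';

export default defineConfig({
  markdown: {
    rehypePlugins: [
      rehypeSlug, // adds an id to every heading
      [rehypeAutolinkHeadings, {
        behavior: 'wrap', // wrap the heading text in the link
        properties: { class: 'heading-anchor' },
      }],
    ],
  },
});
```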

Style anchors to inherit heading color (not accent):

```css
.heading-anchor {
  color: inherit;
  text-decoration: none;
}

.heading-anchor:hover {
  text-decoration: underline;
  text-decoration-color: var(--color-accent);
  text-underline-offset: 4px;
}
```

## 9. Image Optimization

### Alt text rules

| Rule | Example or rationale |
|---|---|
| Descriptive, not decorative | `alt="RAG pipeline architecture diagram showing ingestion, embedding, and retrieval stages"` |
| Include keyword naturally | Not `alt="RAG"` but `alt="RAG pipeline architecture"` |
| Keep under 125 characters | Screen readers truncate longer text |
| Empty alt for decorative images | `alt=""` for background images, dividers |
| No "image of" or "picture of" | Screen readers already announce it as an image |
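
These alt text rules are simple enough to lint. The sketch below uses a hypothetical `checkAlt` helper; the `decorative` flag is an assumption about how a caller would mark background images and dividers, and 125 characters is treated as the upper bound.

```typescript
// Returns the alt text rules that a single image violates.
function checkAlt(alt: string | undefined, decorative = false): string[] {
  if (alt === undefined) return ["missing alt attribute"];
  if (decorative) {
    return alt === "" ? [] : ['decorative image should use alt=""'];
  }

  const problems: string[] = [];
  const trimmed = alt.trim();
  if (trimmed === "") problems.push("empty alt on a content image");
  if (alt.length > 125) problems.push(`alt is ${alt.length} chars, keep it to 125 or fewer`);
  if (/^(image|picture) of\b/i.test(trimmed)) {
    problems.push('drop the "image of" / "picture of" prefix');
  }
  return problems;
}
```
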

### Performance

- Use `loading="lazy"` on all images below the fold
- Use `loading="eager"` on the first image (above fold)
- Set explicit `width` and `height` attributes (prevents CLS)
- Use WebP or AVIF format where possible
- Compress with tools like squoosh.app or `sharp`

### SVG diagrams

For technical diagrams (D2, Mermaid, GraphViz):

```css
/* Light background for dark-text diagrams */
.prose img[src$=".svg"] {
  background: #f8f9fa;
  padding: 2rem;
  border: 1px solid var(--color-border);
  display: block;
  margin: 2rem auto;
  max-width: 100%;
  cursor: zoom-in;
}

/* Dark-themed diagrams need no background */
.prose img[src*="-dark.svg"] {
  background: transparent;
  padding: 0;
}
```

## 10. Internal Linking

### Strategy

| Pattern | Purpose | Example |
|---|---|---|
| Related posts section | Keep readers on site, pass link equity | 3 related posts by tag overlap |
| Related service link | Connect blog posts to service pages | "Related Service: AI Agent Engineering" |
| Contextual body links | Natural mentions of other pages | "See our RAG case study" |
| Breadcrumb navigation | Establish hierarchy | Home > Blog > Post Title |
| Footer links | Ensure all top-level pages are reachable | Services, Case Studies, Blog, About |

### Related posts algorithm (tag-based)

```ts
// Score each post by number of overlapping tags (case-insensitive).
const currentTags = new Set(post.data.tags.map(t => t.toLowerCase()));

const relatedPosts = allPosts
  .filter(p => p.id !== post.id)
  .map(p => ({
    post: p,
    score: p.data.tags.filter(t => currentTags.has(t.toLowerCase())).length,
  }))
  .filter(r => r.score > 0)
  // Highest score first; ties go to the newer post. Assumes publishedAt is a Date.
  .sort((a, b) =>
    b.score - a.score ||
    b.post.data.publishedAt.getTime() - a.post.data.publishedAt.getTime())
  .slice(0, 3)
  .map(r => r.post);
```

### Tag-to-service mapping

Map blog post tags to service pages for cross-linking:

```ts
// src/data/service-mapping.ts

// Shape inferred from the entries below.
interface ServiceLink {
  slug: string;
  title: string;
  description: string;
}

// Keys are lowercase tags.
const TAG_SERVICE_MAP: Record<string, ServiceLink> = {
  'rag': { slug: 'ai-agent-engineering', title: 'AI Agent Engineering', description: '...' },
  'langchain': { slug: 'ai-agent-engineering', title: 'AI Agent Engineering', description: '...' },
  'data pipeline': { slug: 'data-engineering', title: 'Data Engineering', description: '...' },
  'spark': { slug: 'data-engineering', title: 'Data Engineering', description: '...' },
};

// Returns the service for the first tag that has a mapping, or null if none match.
export function findRelatedService(tags: string[]): ServiceLink | null {
  for (const tag of tags) {
    const service = TAG_SERVICE_MAP[tag.toLowerCase()];
    if (service) return service;
  }
  return null;
}
```

## 11. Page Speed

### Targets

| Metric | Target | Tool |
|---|---|---|
| LCP | < 2.5s | PageSpeed Insights |
| INP | < 200ms | PageSpeed Insights |
| CLS | < 0.1 | PageSpeed Insights |
| Total page weight | < 500KB | Chrome DevTools Network |
| Time to First Byte | < 200ms | WebPageTest |
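
The targets can double as a performance budget in CI. The sketch below assumes measurements arrive as plain numbers from whatever tool you use; the metric key names are hypothetical, and the thresholds are copied from the table.

```typescript
// Targets from the table above; a measurement passes when strictly below its limit.
const TARGETS = {
  lcpMs: 2500,
  inpMs: 200,
  cls: 0.1,
  pageWeightKb: 500,
  ttfbMs: 200,
} as const;

type MetricName = keyof typeof TARGETS;
type Measurements = Partial<Record<MetricName, number>>;

// Returns the names of measured metrics that miss their target.
function overBudget(measured: Measurements): MetricName[] {
  return (Object.keys(TARGETS) as MetricName[]).filter((name) => {
    const value = measured[name];
    return value !== undefined && value >= TARGETS[name];
  });
}
```

Metrics that were not measured are ignored rather than treated as failures, so partial reports still work.
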

### Astro advantages

- Zero JS by default: pages ship as pure HTML+CSS unless you add interactive components
- Static generation: all pages are pre-built at deploy time (no server rendering)
- Automatic code splitting: only page-specific CSS/JS loads per route

### Optimization checklist

- [ ] Fonts preloaded with `<link rel="preload">`
- [ ] Font display: `font-display: swap` on all `@font-face` rules
- [ ] Images have explicit `width` and `height`
- [ ] Below-fold images use `loading="lazy"`
- [ ] `astro-compress` is the last integration (compresses HTML/CSS/JS)
- [ ] No render-blocking JS in `<head>` (use `async` or `defer`)
- [ ] CSS is inlined or preloaded (Astro handles this automatically)
- [ ] Cloudflare caching enabled (cache-control headers via CF dashboard)

### External font loading pattern

`html
`

## 12. Redirect Strategy

### Rules

| Rule | Why |
|---|---|
| Use 301 for permanent moves | Signals a permanent move, so search engines consolidate ranking signals on the new URL |
| Never use 302 for permanent moves | 302 signals a temporary move, so search engines may keep the old URL indexed |
| No redirect chains | A -> B -> C is bad; redirect A -> C directly |
| No redirect loops | A -> B -> A causes an infinite loop |
| Redirect old URLs after migration | Preserve all SEO equity from the old site |
| Redirect with trailing slash consistency | Match your site's convention |
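
With hundreds of rules, chains and loops are easier to catch with a script than by eye. The sketch below parses `source destination status` lines and follows each rule to its end; both function names are hypothetical, and wildcard rules such as `/news/*` are treated as literal paths, which is a simplification.

```typescript
// Parses redirect rules, skipping comments and blank lines.
function parseRedirects(text: string): Map<string, string> {
  const redirects = new Map<string, string>();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const [from, to] = line.split(/\s+/);
    if (from !== undefined && to !== undefined) redirects.set(from, to);
  }
  return redirects;
}

// Follows every rule; reports paths with more than one hop and paths that revisit a URL.
function findChainsAndLoops(redirects: Map<string, string>): {
  chains: string[][];
  loops: string[][];
} {
  const chains: string[][] = [];
  const loops: string[][] = [];

  redirects.forEach((_target, start) => {
    const path = [start];
    let current = redirects.get(start);
    while (current !== undefined) {
      if (path.includes(current)) {
        loops.push([...path, current]);
        return; // stop following this rule once a loop is found
      }
      path.push(current);
      current = redirects.get(current);
    }
    if (path.length > 2) chains.push(path);
  });

  return { chains, loops };
}
```

A loop is reported once per rule that participates in it, so an A -> B -> A loop yields two entries.
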

### Cloudflare Pages _redirects file

```
# Old CMS URL → new Astro URL
/contact/ /contact-us/ 301

# Removed page → most relevant existing page
/old-service/ /services/ 301

# Changed slug
/blog/old-post-name/ /blog/new-post-name/ 301

# Section consolidation
/news/* /blog/ 301
```

### Migration redirect generation

When migrating from a CMS, generate redirects programmatically:

```python
import json

# Load migration mapping (one JSON object per line; blank lines are skipped)
with open('data/migration/url-map.jsonl') as f:
    mappings = [json.loads(line) for line in f if line.strip()]

# Generate _redirects file
with open('site/aw/public/_redirects', 'w') as f:
    f.write('# Redirects for CMS migration\n\n')
    for m in mappings:
        f.write(f"{m['old_url']} {m['new_url']} 301\n")

print(f"Generated {len(mappings)} redirect rules")
```

### Verify redirects

```bash
# Check a specific redirect
curl -I https://mysite.com/old-page/
# Should show: HTTP/2 301, location: https://mysite.com/new-page/

# Bulk check: print the status code for every source path
while read -r line; do
  # Skip blank lines and comments
  case "$line" in ''|\#*) continue ;; esac
  url=$(echo "$line" | awk '{print $1}')
  status=$(curl -o /dev/null -s -w "%{http_code}" "https://mysite.com${url}")
  echo "$status $url"
done < public/_redirects
```

## 13. Schema.org by Page Type

### Quick reference

| Page Type | Schema | Key Properties |
|---|---|---|
| Homepage | Organization + WebSite | name, url, logo, sameAs |
| Blog listing | CollectionPage | name, description |
| Blog post | TechArticle | headline, datePublished, author, keywords |
| Case study | Article + Review | headline, about, reviewBody |
| Service page | Service | name, description, provider |
| Contact page | ContactPage | name, description |
| About page | AboutPage + Person | name, jobTitle, sameAs |

### Service page schema

```json
{
  "@context": "https://schema.org",
  "@type": "Service",
  "name": "AI Agent Engineering",
  "description": "Design and deploy production AI agent systems.",
  "provider": {
    "@type": "Organization",
    "name": "Your Company",
    "url": "https://mysite.com"
  },
  "areaServed": "Worldwide",
  "serviceType": "AI Engineering Consulting"
}
```

## 14. Pre-Launch Checklist

### Technical SEO

- [ ] Every page has a unique `<title>` (50-60 chars)
- [ ] Every page has a unique `<meta name="description">` (120-155 chars)
- [ ] Every page has a self-referencing `<link rel="canonical">`
- [ ] `robots.txt` exists and references sitemap
- [ ] Sitemap generated and accessible at `/sitemap-index.xml`
- [ ] Sitemap submitted to GSC
- [ ] JSON-LD structured data on all pages (validated with Rich Results Test)
- [ ] All old URLs redirected (301) to new equivalents
- [ ] No redirect chains or loops
- [ ] 404 page exists and returns HTTP 404 status code

### On-page SEO

- [ ] Single H1 per page containing primary keyword
- [ ] Logical heading hierarchy (H1 > H2 > H3, no skips)
- [ ] All images have descriptive `alt` text
- [ ] All images have explicit `width` and `height`
- [ ] External links open in new tab with `rel="noopener noreferrer"`
- [ ] Internal links use consistent trailing slash convention

### Social / Sharing

- [ ] OG tags on every page (title, description, image, url, type)
- [ ] Twitter card tags on every page
- [ ] Default OG image exists (1200x630)
- [ ] Validated with Facebook/Twitter/LinkedIn debuggers

### Performance

- [ ] PageSpeed Insights score > 90 (mobile)
- [ ] LCP < 2.5s, INP < 200ms, CLS < 0.1
- [ ] Fonts preloaded with `font-display: swap`
- [ ] `astro-compress` enabled

### Analytics

- [ ] GA4 installed and receiving data
- [ ] GSC property verified
- [ ] Key conversion events configured (e.g., `generate_lead`)
- [ ] IndexNow key deployed (optional but recommended)

### Content

- [ ] Every blog post has tags for related posts algorithm
- [ ] TL;DR present on long-form articles
- [ ] FAQ section on key pages (for Featured Snippets + GEO)
- [ ] Reading time calculated and displayed
- [ ] Article dates visible (published + updated)