How to Export SharePoint Pages to Markdown (2026 Guide)

SharePoint is where Microsoft 365 teams store years of institutional knowledge. Meeting notes, project wikis, engineering docs, HR policies — it all accretes there. But when you need that content somewhere else — a static site, an AI knowledge base, a Markdown-first documentation system — SharePoint doesn’t have an export button.

This guide covers the realistic methods to get clean Markdown out of SharePoint pages.

Why Export SharePoint to Markdown?

  • Migrating off SharePoint — to Notion, Confluence, a Git-based docs repo, or a modern static site
  • Building an internal AI knowledge base — feeding a RAG system on Claude, ChatGPT Enterprise, or Microsoft Copilot with clean chunks
  • Archiving retiring sites — before a Microsoft 365 tenant rotation or license downgrade
  • Publishing internal docs externally — turning internal knowledge into a public documentation site
  • Cross-team portability — Markdown is readable in any tool, forever

Method 1: Save Chrome Extension (one page at a time)

Save converts any SharePoint page to clean Markdown with a single click.

What Save captures from SharePoint:

  • Page title, creation date, author, last-modified timestamp
  • Page body with heading structure preserved
  • Inline images as Markdown references
  • Tables rendered as Markdown tables
  • Embedded documents linked as Markdown URLs
  • Web parts: text blocks, quick links, news, lists — extracted as flattened content

What Save strips:

  • Site navigation, left-pane, and ribbon
  • Comment and @mention panels (the comments themselves are kept at the bottom of the export)
  • Promoted-page callouts and admin-only badges

When it’s the right tool: exporting 1–20 pages, working with content you can only access via your browser session, quick one-off conversions.

Method 2: Microsoft Graph API

For bulk exports — entire sites or whole document libraries — the Microsoft Graph API is the canonical path.

Typical flow:

  1. Register an app in Microsoft Entra ID (formerly Azure AD) with the Sites.Read.All and Files.Read.All permissions
  2. Get an access token via client-credentials or device-code flow
  3. Call /sites/{site-id}/pages to list pages
  4. For each page, call /sites/{site-id}/pages/{page-id}/microsoft.graph.sitePage?$expand=canvasLayout to get the content model (without the $expand, the layout is not returned)
  5. Walk the canvasLayout structure and emit Markdown
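
Step 5 is the least obvious part of the flow, so here is a minimal Python sketch of it. It assumes the expanded page JSON follows the shape I understand the Graph sitePage resource to use (horizontalSections containing columns containing webparts, plus an optional verticalSection) and that text web parts carry their HTML in innerHtml. The helper names and the sample data are hypothetical; check the field names against a real response from your tenant before relying on them.

```python
from typing import Dict, Iterator, List

# Assumed @odata.type for text web parts; verify against your own Graph responses.
TEXT_WEB_PART = "#microsoft.graph.textWebPart"


def iter_web_parts(page: Dict) -> Iterator[Dict]:
    """Yield web parts in reading order: horizontal sections first, then the vertical one."""
    layout = page.get("canvasLayout") or {}
    for section in layout.get("horizontalSections", []):
        for column in section.get("columns", []):
            yield from column.get("webparts", [])
    vertical = layout.get("verticalSection") or {}
    yield from vertical.get("webparts", [])


def text_fragments(page: Dict) -> List[str]:
    """Collect the HTML of every text web part; other web part types are skipped."""
    return [
        part.get("innerHtml", "")
        for part in iter_web_parts(page)
        if part.get("@odata.type") == TEXT_WEB_PART
    ]


# Hand-written sample shaped like an expanded page (not real tenant data).
sample_page = {
    "title": "Team Handbook",
    "canvasLayout": {
        "horizontalSections": [
            {"columns": [
                {"webparts": [
                    {"@odata.type": TEXT_WEB_PART, "innerHtml": "<h2>Overview</h2>"},
                    {"@odata.type": "#microsoft.graph.standardWebPart", "data": {}},
                ]},
            ]},
        ],
        "verticalSection": {"webparts": [
            {"@odata.type": TEXT_WEB_PART, "innerHtml": "<p>Side note</p>"},
        ]},
    },
}

fragments = text_fragments(sample_page)
```

Each fragment is still HTML at this point; it would go through an HTML-to-Markdown converter next. Non-text web parts (quick links, news, lists) are skipped here and would need their own rendering rules.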

Pros: scalable, scriptable, works unattended, preserves structured metadata.

Cons: app registration and admin consent, throttling (HTTP 429 responses) at large scale, requires a developer.
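
Throttling is the con that most often breaks an unattended export, so it is worth sketching how a script might react to it. The function below is a hypothetical helper, not part of any SDK: it assumes throttled responses arrive as HTTP 429 or 503, optionally with a Retry-After header in seconds, and falls back to exponential backoff with jitter when the header is missing.

```python
import random
from typing import Dict, Optional


def retry_delay(status: int, headers: Dict[str, str], attempt: int,
                cap: float = 60.0) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if no retry is needed.

    `headers` is expected to use the exact key "Retry-After"; normalise header
    names first if your HTTP client returns them in another case.
    """
    if status not in (429, 503):
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        # The server told us how long to wait; honour it as-is.
        return float(retry_after)
    # No usable hint: back off exponentially (1s, 2s, 4s, ...) up to `cap`, plus jitter.
    return min(cap, float(2 ** attempt)) + random.uniform(0.0, 1.0)
```

A caller would sleep for the returned number of seconds and repeat the request, giving up after a fixed number of attempts.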

Open-source helpers: PnP PowerShell has Get-PnPPage, which returns a page object you can post-process to Markdown. The Office365-REST-Python-Client library offers similar access for Python shops.

Method 3: PowerShell + PnP.PowerShell

The community-maintained PnP.PowerShell module is the go-to for SharePoint admins. A basic export loop:

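# Newer PnP.PowerShell releases may also need -ClientId with your own app registration
Connect-PnPOnline -Url https://tenant.sharepoint.com/sites/YourSite -Interactive
# Modern pages are list items in the Site Pages library (named "Site Pages" on English-language sites)
$pages = Get-PnPListItem -List "Site Pages" -PageSize 500
foreach ($page in $pages) {
  $name = $page["FileLeafRef"]                # file name, e.g. Home.aspx
  if ($name -notlike "*.aspx") { continue }   # skip folders and other files
  $html = $page["CanvasContent1"]             # page body as HTML; empty on classic pages
  # Pipe $html through a converter (pandoc, turndown, etc.)
  $html | Out-File "$name.html" -Encoding utf8
}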

Then convert each HTML file to Markdown with Pandoc or Turndown.
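
The conversion step can be scripted as well. The sketch below assumes Pandoc is installed and on the PATH, and that the exported .html files sit in a single folder; the function names are made up for illustration. It targets GitHub-flavoured Markdown (gfm) so tables survive, which is a choice on my part rather than a requirement.

```python
import subprocess
from pathlib import Path
from typing import List


def pandoc_command(html_file: Path) -> List[str]:
    """Build the Pandoc invocation that turns one HTML file into a .md file next to it."""
    md_file = html_file.with_suffix(".md")
    return [
        "pandoc",
        "--from=html",
        "--to=gfm",       # GitHub-flavoured Markdown, keeps pipe tables
        "--wrap=none",    # do not hard-wrap paragraphs
        str(html_file),
        "-o", str(md_file),
    ]


def convert_all(folder: Path) -> None:
    """Run Pandoc over every .html file in `folder`; raises if Pandoc reports an error."""
    for html_file in sorted(folder.glob("*.html")):
        subprocess.run(pandoc_command(html_file), check=True)
```

Calling convert_all(Path("export")) would process a folder named export. A file such as Home.aspx.html becomes Home.aspx.md, so rename afterwards if the .aspx part bothers you.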

When it shines: bulk exports on admin-accessible tenants, migration projects.

Method 4: SharePoint Online “Export to PDF” + OCR

The nuclear option: print every page to PDF, then OCR the PDFs into Markdown. Not recommended — you lose all structure, table layouts, and link fidelity. Mentioned only so you don’t waste time trying it.

What about OneNote?

OneNote is a sibling problem with a different answer:

  • Best path: open the OneNote web app (onenote.com) and use Save on each page — captures pages as clean Markdown
  • Alternative: Microsoft Graph API has /me/onenote/pages endpoints that return page content as HTML, which you convert to Markdown
  • Avoid: OneNote’s built-in “Print to PDF” — same structure loss as SharePoint’s
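
For the Graph route, the HTML-to-Markdown step does not have to involve an external tool. Below is a deliberately tiny converter built only on Python's standard library, as a stand-in for Pandoc or Turndown. It is an illustration under narrow assumptions: it handles headings, paragraphs, and flat list items only, drops links and inline formatting, does not handle blocks nested inside list items, and emits any stray text outside those tags (such as a page title in the head) as a plain paragraph. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### ",
            "h4": "#### ", "h5": "##### ", "h6": "###### "}
BLOCKS = set(HEADINGS) | {"p", "li"}


class MiniMarkdown(HTMLParser):
    """Collects the text of block elements and prefixes it with Markdown markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []      # finished Markdown blocks
        self.prefix = None   # marker for the block currently being read
        self.buffer = []     # text pieces of the current block

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKS:
            self.flush()
            self.prefix = HEADINGS.get(tag, "- " if tag == "li" else "")

    def handle_endtag(self, tag):
        if tag in BLOCKS:
            self.flush()

    def handle_data(self, data):
        self.buffer.append(data)

    def flush(self):
        # Collapse runs of whitespace so source indentation does not leak through.
        text = " ".join("".join(self.buffer).split())
        if text:
            self.lines.append((self.prefix or "") + text)
        self.buffer = []
        self.prefix = None


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text after the last closing tag
    return "\n\n".join(parser.lines)
```

For real notebooks, with tables, images, and nested lists, a full converter is the safer choice; this only shows the shape of the step.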

Handling SharePoint’s quirks

A few gotchas worth knowing:

  • Web parts can embed arbitrary content — news rotators, events, quick links. Save flattens these to plain Markdown; the Graph API gives you structured JSON that you can render your own way.
  • Modern vs. classic pages have very different HTML. Save auto-detects; scripts need to branch on the page type.
  • Permissions travel with the user. If you’re exporting for migration, make sure you have Site Owner access, because some sections may be hidden from standard members.
  • Large images are inlined as base64 in some exports. Save gives you URL-linked references; strip the base64 from scripted exports if you hit file-size issues.
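
The base64 clean-up from the last bullet can be done with a single regular expression. This sketch assumes the scripted export has already been converted to Markdown and that inlined images appear as standard Markdown image syntax with a data: URI; the function name and placeholder text are my own.

```python
import re

# Matches Markdown images whose target is a base64 data URI,
# e.g. ![Logo](data:image/png;base64,iVBORw0KGgo=)
DATA_URI_IMAGE = re.compile(r"!\[([^\]]*)\]\(data:image/[^;)]+;base64,[^)]*\)")


def strip_inline_images(markdown: str) -> str:
    """Replace base64-inlined images with a short placeholder; URL-linked images are kept."""
    def placeholder(match):
        alt = match.group(1) or "untitled"
        return "[image removed: " + alt + "]"

    return DATA_URI_IMAGE.sub(placeholder, markdown)
```

If you would rather keep the images, the same pattern can be used to decode each payload to a file and rewrite the reference to point at it.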

The practical choice for most teams

  • 1–20 pages: Save extension. Free tier handles 3/mo; Plus unlocks unlimited for $5.99.
  • 20–500 pages: PnP PowerShell + Pandoc conversion. Spin up a script, run it once.
  • 500+ pages: Graph API with a proper ingestion pipeline, scheduled.
  • You want the content in Claude or ChatGPT right now: Save — exports are already optimised for LLM context windows.

SharePoint content was never meant to stay locked there. Markdown is the portable format. Pick the method that matches your scale.

Written by

Jean-Sébastien Wallez

I've been making internet products for 10+ years. I built Save on weekends because I wanted my own reading library in clean Markdown for Claude and Obsidian. I write here about web clipping, AI workflows, and the small things that make a personal knowledge base actually useful.