How to Export SharePoint Pages to Markdown (2026 Guide)
SharePoint is where Microsoft 365 teams store years of institutional knowledge. Meeting notes, project wikis, engineering docs, HR policies — it all accretes there. But when you need that content somewhere else — a static site, an AI knowledge base, a Markdown-first documentation system — SharePoint doesn’t have an export button.
This guide covers the realistic methods to get clean Markdown out of SharePoint pages.
Why Export SharePoint to Markdown?
- Migrating off SharePoint — to Notion, Confluence, a Git-based docs repo, or a modern static site
- Building an internal AI knowledge base — feeding a RAG system on Claude, ChatGPT Enterprise, or Microsoft Copilot with clean chunks
- Archiving retiring sites — before a Microsoft 365 tenant rotation or license downgrade
- Publishing internal docs externally — turning internal knowledge into a public documentation site
- Cross-team portability — Markdown is readable in any tool, forever
Method 1: Save Chrome Extension (one page at a time)
Save converts any SharePoint page to clean Markdown with a single click.
What Save captures from SharePoint:
- Page title, creation date, author, last-modified timestamp
- Page body with heading structure preserved
- Inline images as Markdown references
- Tables rendered as Markdown tables
- Embedded documents linked as Markdown URLs
- Web parts: text blocks, quick links, news, lists — extracted as flattened content
What Save strips:
- Site navigation, left-pane, and ribbon
- Comment and @mention panels (the comments themselves are kept at the bottom)
- Promoted-page callouts and admin-only badges
When it’s the right tool: exporting 1–20 pages, working with content you can only access via your browser session, quick one-off conversions.
Method 2: Microsoft Graph API
For bulk exports — entire sites or whole document libraries — the Microsoft Graph API is the canonical path.
Typical flow:
- Register a Microsoft Entra ID (formerly Azure AD) app with Sites.Read.All and Files.Read.All permissions
- Get an access token via client-credentials or device-code flow
- Call /sites/{site-id}/pages to list pages
- For each page, call /sites/{site-id}/pages/{page-id} (in Graph v1.0, cast to microsoft.graph.sitePage and add $expand=canvasLayout) to get the content model
- Walk the canvasLayout structure and emit Markdown
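The last two steps of that flow can be sketched in Python. This is a minimal illustration, not a full exporter: it assumes the page JSON follows the Graph v1.0 sitePage shape (canvasLayout containing horizontalSections, columns, webparts, and an optional verticalSection), and it only collects the HTML of text web parts. The helper names page_url, iter_web_parts, and text_html_fragments are hypothetical, and token handling and the HTTP call are left out.

```python
from typing import Iterator

# @odata.type value Graph uses for rich-text web parts.
TEXT_WEB_PART = "#microsoft.graph.textWebPart"


def page_url(site_id: str, page_id: str) -> str:
    """Build a Graph v1.0 URL that asks for one page with its canvas layout expanded."""
    return (
        "https://graph.microsoft.com/v1.0"
        f"/sites/{site_id}/pages/{page_id}/microsoft.graph.sitePage"
        "?$expand=canvasLayout"
    )


def iter_web_parts(page: dict) -> Iterator[dict]:
    """Yield every web part: horizontal sections first, then the vertical section."""
    layout = page.get("canvasLayout") or {}
    for section in layout.get("horizontalSections", []):
        for column in section.get("columns", []):
            yield from column.get("webparts", [])
    vertical = layout.get("verticalSection") or {}
    yield from vertical.get("webparts", [])


def text_html_fragments(page: dict) -> list:
    """Return the innerHtml of each text web part, in reading order."""
    return [
        part["innerHtml"]
        for part in iter_web_parts(page)
        if part.get("@odata.type") == TEXT_WEB_PART and part.get("innerHtml")
    ]
```

Each fragment is still HTML, so it needs a pass through an HTML-to-Markdown converter. Non-text web parts (quick links, news, and so on) are skipped here and would need their own rendering rules.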
Pros: scalable, scriptable, works unattended, preserves structured metadata.
Cons: Azure AD setup, throttling at large scale, requires a developer.
Open-source helpers: PnP PowerShell's Get-PnPPage returns a page object you can post-process into Markdown. The Office365-REST-Python-Client library offers similar access for Python shops.
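On the throttling point: Graph and SharePoint Online signal throttling with HTTP 429 (sometimes 503) and usually a Retry-After header giving a wait in seconds. A retry wrapper along the following lines is one way to respect that. It is a sketch under assumptions: call_with_retry and the send callable (which returns a status, headers, and body tuple) are hypothetical names, and the exponential fallback is an arbitrary choice for when the header is missing.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]
THROTTLED = (429, 503)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it is no longer throttled or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status not in THROTTLED or attempt == max_attempts:
            break
        # Header names are case-insensitive, so normalise before the lookup.
        lowered = {key.lower(): value for key, value in headers.items()}
        retry_after = lowered.get("retry-after", "")
        # Prefer the server's hint; otherwise back off exponentially.
        delay = float(retry_after) if retry_after.isdigit() else float(2 ** attempt)
        sleep(delay)
    return response
```

Passing sleep as a parameter keeps the function easy to test without real waiting. In a real pipeline, send would wrap a single HTTP request made with whichever client you already use.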
Method 3: PowerShell + PnP.PowerShell
PnP.PowerShell, the community-maintained open-source module for Microsoft 365, is the go-to for SharePoint admins. A basic export loop:
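# Recent PnP.PowerShell releases may also need -ClientId <your Entra app ID> for interactive sign-in
Connect-PnPOnline -Url https://tenant.sharepoint.com/sites/YourSite -Interactive
# Get-PnPPage requires -Identity, so enumerate the Site Pages library instead
$items = Get-PnPListItem -List "Site Pages" -PageSize 500
foreach ($item in $items) {
    $name = $item["FileLeafRef"]       # file name, e.g. Home.aspx
    $html = $item["CanvasContent1"]    # canvas HTML of a modern page; empty for folders and classic pages
    if (-not $html) { continue }
    # Save the raw HTML; convert it afterwards with pandoc, turndown, etc.
    $html | Out-File -FilePath "$name.html" -Encoding utf8
}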
Then convert each HTML file to Markdown with Pandoc or Turndown.
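A small Python wrapper can run that conversion over a whole export folder. This sketch assumes pandoc is installed and on PATH and that the exported files end in .html. The function names are hypothetical, and GitHub-flavoured Markdown (gfm) is just one reasonable output format.

```python
import subprocess
from pathlib import Path


def pandoc_command(src: Path) -> list:
    """Return the pandoc invocation that converts one HTML file to Markdown."""
    dest = src.with_suffix(".md")
    return [
        "pandoc",
        "--from=html",
        "--to=gfm",       # GitHub-flavoured Markdown, which keeps pipe tables
        "--wrap=none",    # do not hard-wrap paragraphs
        str(src),
        "--output",
        str(dest),
    ]


def convert_directory(directory: Path) -> list:
    """Convert every *.html file in a directory; return the Markdown paths written."""
    converted = []
    for src in sorted(directory.glob("*.html")):
        # check=True raises CalledProcessError if pandoc exits with an error.
        subprocess.run(pandoc_command(src), check=True)
        converted.append(src.with_suffix(".md"))
    return converted
```

Calling convert_directory(Path("export")) would turn export/Home.aspx.html into export/Home.aspx.md, and likewise for every other file in the folder.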
When it shines: bulk exports on admin-accessible tenants, migration projects.
Method 4: SharePoint Online “Export to PDF” + OCR
The nuclear option: print every page to PDF, then OCR the PDFs into Markdown. Not recommended — you lose all structure, table layouts, and link fidelity. Mentioned only so you don’t waste time trying it.
What about OneNote?
OneNote is a sibling problem with a different answer:
- Best path: open the OneNote web app (onenote.com) and use Save on each page — captures pages as clean Markdown
- Alternative: Microsoft Graph API has /me/onenote/pages endpoints that return page content as HTML, which you convert to Markdown
- Avoid: OneNote’s built-in “Print to PDF”, which has the same structure loss as SharePoint’s
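To give a feel for the HTML-to-Markdown step, here is a deliberately small converter built on Python's standard html.parser. It is an illustration, not a replacement for Pandoc or Turndown: it handles only headings, paragraphs, flat list items, links, and bold/italic, and real OneNote HTML (with its positioned containers, tables, and images) needs a fuller tool. All names in it are hypothetical.

```python
from html.parser import HTMLParser

BLOCKS = {"p", "li"} | {f"h{n}" for n in range(1, 7)}
INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}


class MiniMarkdown(HTMLParser):
    """Convert a small subset of HTML to Markdown blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished Markdown blocks
        self.buf = []      # pieces of the block being built
        self.prefix = ""   # "# ", "- ", ... for the current block
        self.href = None   # target of the link being built

    def _flush(self):
        # Collapse whitespace and store the block if it has any text.
        text = " ".join("".join(self.buf).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.buf, self.prefix = [], ""

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKS:
            self._flush()
            if tag == "li":
                self.prefix = "- "
            elif tag != "p":
                self.prefix = "#" * int(tag[1]) + " "
        elif tag in INLINE:
            self.buf.append(INLINE[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.buf.append("[")

    def handle_endtag(self, tag):
        if tag in BLOCKS:
            self._flush()
        elif tag in INLINE:
            self.buf.append(INLINE[tag])
        elif tag == "a":
            self.buf.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        self.buf.append(data)


def html_to_markdown(html: str) -> str:
    """Return Markdown for the supported subset, one blank line between blocks."""
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(parser.blocks)
```

Unsupported tags are ignored and their text is kept, so nothing is silently dropped, but nested lists and tables come out flattened.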
Handling SharePoint’s quirks
A few gotchas worth knowing:
- Web parts can embed arbitrary content — news rotators, events, quick links. Save flattens these to plain Markdown; the Graph API gives you structured JSON that you can render your own way.
- Modern vs. classic pages have very different HTML. Save auto-detects; scripts need to branch on the page type.
- Permissions travel with the user. If you’re exporting for migration, make sure you have Site Owner access, because some sections may be hidden from standard members.
- Large images are inlined as base64 in some exports. Save gives you URL-linked references; strip the base64 from scripted exports if you hit file-size issues.
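For the base64 issue in the last point, a regular expression over the generated Markdown is usually enough. The sketch below assumes inline images appear as standard Markdown image syntax with a data:image/...;base64 URI. The function name and the placeholder text are hypothetical, and images embedded as raw HTML img tags would need a separate pattern.

```python
import re

# Markdown image whose target is a base64 data URI, e.g. ![alt](data:image/png;base64,AAAA...)
DATA_URI_IMAGE = re.compile(
    r"!\[([^\]]*)\]\(data:image/[A-Za-z0-9.+-]+;base64,[A-Za-z0-9+/=\s]+\)"
)


def strip_inline_images(markdown: str, placeholder: str = "[image removed: {alt}]") -> str:
    """Replace base64-inlined images with a short placeholder; leave URL images alone."""

    def replace(match):
        alt = match.group(1) or "untitled"
        return placeholder.format(alt=alt)

    return DATA_URI_IMAGE.sub(replace, markdown)
```

Images that point at a normal URL are left untouched, so only the oversized inline payloads disappear from the file.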
The practical choice for most teams
- 1–20 pages: Save extension. Free tier handles 3/mo; Plus unlocks unlimited for $5.99.
- 20–500 pages: PnP PowerShell + Pandoc conversion. Spin up a script, run it once.
- 500+ pages: Graph API with a proper ingestion pipeline, scheduled.
- You want the content in Claude or ChatGPT right now: Save — exports are already optimised for LLM context windows.
SharePoint content was never meant to stay locked there. Markdown is the portable format. Pick the method that matches your scale.