.dbseospider files are zip archives of an internal Screaming Frog ProjectInstanceData crawl folder. They let you move, store, and reload DB-mode crawls without keeping a full Screaming Frog installation on the analysis machine. All packaging helpers are importable directly from the top-level screamingfrog package:
from screamingfrog import (
    pack_dbseospider,
    pack_dbseospider_from_db_id,
    unpack_dbseospider,
    export_dbseospider_from_seospider,
    load_seospider_db_project,
)

Environment variable

Set SCREAMINGFROG_PROJECT_DIR to the full path of the ProjectInstanceData directory when it is not in a standard location:
export SCREAMINGFROG_PROJECT_DIR="/data/sf/ProjectInstanceData"
If the variable is not set, the helpers check the following default locations in order:
  1. %APPDATA%\ScreamingFrogSEOSpider\ProjectInstanceData (Windows)
  2. ~/.ScreamingFrogSEOSpider/ProjectInstanceData (macOS / Linux)
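
The lookup order above can be sketched in plain Python. This is an illustration only: resolve_project_root is a hypothetical name, not part of the screamingfrog package, and the sketch assumes that an explicit SCREAMINGFROG_PROJECT_DIR value is used as-is while the defaults are only used when they exist on disk. The library's real resolution logic may differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_project_root(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Hypothetical helper: pick a ProjectInstanceData directory.

    `env` defaults to os.environ; passing a dict makes the lookup easy to test.
    """
    env = os.environ if env is None else env

    # An explicit override wins (assumption: it is not validated here).
    override = env.get("SCREAMINGFROG_PROJECT_DIR")
    if override:
        return Path(override)

    candidates = []
    # 1. Windows default, only meaningful when %APPDATA% is defined.
    appdata = env.get("APPDATA")
    if appdata:
        candidates.append(Path(appdata) / "ScreamingFrogSEOSpider" / "ProjectInstanceData")
    # 2. macOS / Linux default under the home directory.
    candidates.append(Path.home() / ".ScreamingFrogSEOSpider" / "ProjectInstanceData")

    # Return the first default location that actually exists.
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

Calling resolve_project_root() with no arguments reads the real environment. It returns None when neither the variable nor a default location is available, which is a reasonable cue to pass project_root explicitly.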

pack_dbseospider

Zip a crawl folder from ProjectInstanceData into a .dbseospider file.
from screamingfrog import pack_dbseospider

dbseospider = pack_dbseospider(
    r"C:\Users\Antonio\.ScreamingFrogSEOSpider\ProjectInstanceData\<project_id>",
    r"C:\Users\Antonio\my-crawl.dbseospider",
)
print(dbseospider)  # C:\Users\Antonio\my-crawl.dbseospider

Parameters

Parameter     Type         Description
project_dir   str | Path   Path to the crawl subdirectory inside ProjectInstanceData.
output_file   str | Path   Destination .dbseospider file path. The .dbseospider extension is added automatically if omitted.

Returns the output file path as a Path. Raises FileNotFoundError if project_dir does not exist, and ValueError if it is not a directory.
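
Because a .dbseospider file is an ordinary zip archive, the documented contract can be mirrored with the standard library alone. The sketch below is not the library's implementation: pack_folder_sketch is a hypothetical name, and storing entries relative to the crawl folder root is an assumption about the archive layout.

```python
import zipfile
from pathlib import Path


def pack_folder_sketch(project_dir, output_file) -> Path:
    """Illustrative only: zip a crawl folder, following the documented contract."""
    project_dir = Path(project_dir)
    output_file = Path(output_file)

    if not project_dir.exists():
        raise FileNotFoundError(f"Crawl folder not found: {project_dir}")
    if not project_dir.is_dir():
        raise ValueError(f"Not a directory: {project_dir}")

    # Append the extension when the caller left it off.
    if output_file.suffix != ".dbseospider":
        output_file = output_file.with_name(output_file.name + ".dbseospider")

    with zipfile.ZipFile(output_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            if path.is_file():
                # Assumption: entries are stored relative to the crawl folder root.
                archive.write(path, path.relative_to(project_dir).as_posix())
    return output_file
```

The sketch expects output_file to live outside project_dir. Otherwise the archive being written would be picked up by the directory walk.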

pack_dbseospider_from_db_id

Package a DB-mode crawl by its UUID crawl ID instead of a full directory path.
from screamingfrog import pack_dbseospider_from_db_id

dbseospider = pack_dbseospider_from_db_id(
    "7c356a1b-ea14-40f3-b504-36c3046432a2",
    r"C:\Users\Antonio\my-crawl.dbseospider",
)

Parameters

Parameter      Type                Default    Description
db_id          str                 required   UUID directory name from ProjectInstanceData.
output_file    str | Path          required   Destination .dbseospider file path.
project_root   str | Path | None   None       Override the ProjectInstanceData root. Uses SCREAMINGFROG_PROJECT_DIR or the default path when None.

Returns the output file path as a Path.
Use list_crawls() to discover the available db_id values without opening Derby or starting Java.
from screamingfrog import list_crawls

for info in list_crawls():
    print(info.db_id, info.url, info.urls_crawled)

unpack_dbseospider

Extract a .dbseospider file into a directory.
from screamingfrog import unpack_dbseospider

unpack_dbseospider(
    r"C:\Users\Antonio\my-crawl.dbseospider",
    r"C:\Users\Antonio\unpacked_crawl",
)

Parameters

Parameter          Type         Description
dbseospider_file   str | Path   Path to the .dbseospider zip archive to extract.
output_dir         str | Path   Destination directory. Created if it does not exist.

Returns the output directory path as a Path. Raises FileNotFoundError if dbseospider_file does not exist.
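
Since the archive is a plain zip, it can also be inspected without extracting anything. The following is a minimal sketch using only the standard library. list_archive_members is a hypothetical helper, not something the screamingfrog package is known to provide.

```python
import zipfile
from pathlib import Path


def list_archive_members(dbseospider_file):
    """Return (name, uncompressed size in bytes) pairs for the files in an archive."""
    dbseospider_file = Path(dbseospider_file)
    if not dbseospider_file.exists():
        raise FileNotFoundError(f"Archive not found: {dbseospider_file}")
    with zipfile.ZipFile(dbseospider_file) as archive:
        # Skip directory entries; only real files carry crawl data.
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]
```

This can be a quick sanity check before unpacking a large crawl onto a machine with limited disk space.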

export_dbseospider_from_seospider

Convert a .seospider crawl file into a .dbseospider archive in one step. Internally this:
  1. Forces storage.mode=DB in spider.config (unless ensure_db_mode=False).
  2. Runs the Screaming Frog CLI via --load-crawl to generate a DB crawl in ProjectInstanceData.
  3. Detects the newly created crawl directory.
  4. Packages it into a .dbseospider file.
  5. Cleans up the temporary export directory (unless cleanup_exports=False).
from screamingfrog import export_dbseospider_from_seospider

dbseospider = export_dbseospider_from_seospider(
    r"C:\Users\Antonio\schema-discovery\actionnetwork_crawl\crawl.seospider",
    r"C:\Users\Antonio\actionnetwork.dbseospider",
)

Parameters

Parameter            Type                   Default    Description
crawl_path           str | Path             required   Path to the .seospider source file.
output_file          str | Path             required   Destination .dbseospider file path.
project_root         str | Path | None      None       Override the ProjectInstanceData root.
spider_config_path   str | Path | None      None       Override the spider.config path used by ensure_storage_mode.
cli_path             str | None             None       Override the CLI executable path.
export_dir           str | Path | None      None       Directory for temporary CLI exports. A temp directory is created when None.
export_tabs          Iterable[str] | None   None       Tabs to export during the CLI load. Defaults to ["Internal:All"].
bulk_exports         Iterable[str] | None   None       Bulk exports to include during the CLI load.
save_reports         Iterable[str] | None   None       Reports to save during the CLI load.
export_format        str                    "csv"      Export file format.
export_profile       str | None             None       Named export profile (e.g. "kitchen_sink").
headless             bool                   True       Run the CLI in headless mode.
overwrite            bool                   True       Overwrite existing output files.
ensure_db_mode       bool                   True       Temporarily force storage.mode=DB in spider.config before running the CLI.
cleanup_exports      bool                   True       Delete the temporary export directory after packaging.

Returns the output .dbseospider file path as a Path.
If your ProjectInstanceData directory is in a non-default location, set SCREAMINGFROG_PROJECT_DIR or pass project_root=.... Without this, the helper cannot detect which directory was newly created by the CLI.
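
One plausible way to detect a newly created crawl directory is to snapshot the subdirectories of ProjectInstanceData before the CLI runs and compare afterwards. The source does not say how the helper does this, so treat the sketch below as an assumption. snapshot_dirs and detect_new_crawl_dir are hypothetical names.

```python
from pathlib import Path


def snapshot_dirs(project_root):
    """Return the names of the immediate subdirectories of project_root."""
    return {p.name for p in Path(project_root).iterdir() if p.is_dir()}


def detect_new_crawl_dir(project_root, before):
    """Return the single directory that appeared since `before` was taken."""
    created = snapshot_dirs(project_root) - before
    if len(created) != 1:
        # Zero or several new folders means the result is ambiguous.
        raise RuntimeError(
            f"Expected exactly one new crawl directory, found {len(created)}"
        )
    return Path(project_root) / created.pop()
```

The sketch also shows why the root matters: if the comparison runs against the wrong directory, no new folder appears and the detection fails.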

load_seospider_db_project

Like export_dbseospider_from_seospider, but returns the raw DB crawl directory path instead of packaging it. Useful when you want to inspect or manipulate the crawl folder before zipping.
from screamingfrog import load_seospider_db_project

project_dir = load_seospider_db_project(
    "./crawl.seospider",
    ensure_db_mode=True,
    cleanup_exports=True,
)
print(project_dir)  # Path to the new crawl dir inside ProjectInstanceData
Accepts the same parameters as export_dbseospider_from_seospider except output_file. Returns the detected ProjectInstanceData crawl directory as a Path.
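
As an example of inspecting the crawl folder before zipping it, the sketch below counts the files under the returned directory and totals their size. summarize_crawl_dir is a hypothetical helper written with the standard library only. It makes no assumptions about the Derby file layout.

```python
from pathlib import Path


def summarize_crawl_dir(project_dir):
    """Return (file_count, total_bytes) for everything under project_dir."""
    files = [p for p in Path(project_dir).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

The returned tuple gives a rough idea of how large the resulting archive could be before you package the folder.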

Full round-trip example

1. Convert a .seospider crawl to .dbseospider

from screamingfrog import export_dbseospider_from_seospider

export_dbseospider_from_seospider(
    r"C:\Users\Antonio\schema-discovery\actionnetwork_crawl\crawl.seospider",
    r"C:\Users\Antonio\actionnetwork.dbseospider",
)

2. Load the archive for analysis

from screamingfrog import Crawl

crawl = Crawl.load(r"C:\Users\Antonio\actionnetwork.dbseospider")
pages_404 = crawl.pages().filter(status_code=404).collect()

3. Unpack if you need to inspect the raw Derby files

from screamingfrog import unpack_dbseospider

unpack_dbseospider(
    r"C:\Users\Antonio\actionnetwork.dbseospider",
    r"C:\Users\Antonio\unpacked_actionnetwork",
)