跳到内容

管道选项

管道选项允许在转换管道期间自定义模型的执行。这包括 OCR 引擎、表格模型以及可以通过设置 do_xyz = True 启用的增强选项。

这是 Docling 中所有可用管道选项的自动生成 API 参考。

pipeline_options

属性

granite_picture_description module-attribute

granite_picture_description = PictureDescriptionVlmOptions(
    repo_id="ibm-granite/granite-vision-3.1-2b-preview",
    prompt="What is shown in this image?",
)

granite_vision_vlm_conversion_options module-attribute

granite_vision_vlm_conversion_options = (
    HuggingFaceVlmOptions(
        repo_id="ibm-granite/granite-vision-3.1-2b-preview",
        prompt="OCR this image.",
        response_format=MARKDOWN,
        inference_framework=TRANSFORMERS,
    )
)

granite_vision_vlm_ollama_conversion_options module-attribute

granite_vision_vlm_ollama_conversion_options = (
    ApiVlmOptions(
        url=AnyUrl(
            "http://localhost:11434/v1/chat/completions"
        ),
        params={"model": "granite3.2-vision:2b"},
        prompt="OCR the full page to markdown.",
        scale=1.0,
        timeout=120,
        response_format=MARKDOWN,
    )
)

smoldocling_vlm_conversion_options module-attribute

smoldocling_vlm_conversion_options = HuggingFaceVlmOptions(
    repo_id="ds4sd/SmolDocling-256M-preview",
    prompt="Convert this page to docling.",
    response_format=DOCTAGS,
    inference_framework=TRANSFORMERS,
)

smoldocling_vlm_mlx_conversion_options module-attribute

smoldocling_vlm_mlx_conversion_options = (
    HuggingFaceVlmOptions(
        repo_id="ds4sd/SmolDocling-256M-preview-mlx-bf16",
        prompt="Convert this page to docling.",
        response_format=DOCTAGS,
        inference_framework=MLX,
    )
)

smolvlm_picture_description module-attribute

smolvlm_picture_description = PictureDescriptionVlmOptions(
    repo_id="HuggingFaceTB/SmolVLM-256M-Instruct"
)

AcceleratorDevice

Bases: str, Enum

运行模型推理的设备

属性

AUTO class-attribute instance-attribute

AUTO = 'auto'

CPU class-attribute instance-attribute

CPU = 'cpu'

CUDA class-attribute instance-attribute

CUDA = 'cuda'

MPS class-attribute instance-attribute

MPS = 'mps'

AcceleratorOptions

Bases: BaseSettings

方法

属性

cuda_use_flash_attention2 class-attribute instance-attribute

cuda_use_flash_attention2: bool = False

device class-attribute instance-attribute

device: Union[str, AcceleratorDevice] = 'auto'

model_config class-attribute instance-attribute

model_config = SettingsConfigDict(
    env_prefix="DOCLING_",
    env_nested_delimiter="_",
    populate_by_name=True,
)

num_threads class-attribute instance-attribute

num_threads: int = 4

check_alternative_envvars classmethod

check_alternative_envvars(data: Any) -> Any

从“替代”环境变量 OMP_NUM_THREADS 设置 num_threads。仅当替代环境变量有效且常规环境变量未设置时,才使用替代环境变量。

注意:带有参数“aliases”的标准 pydantic 设置机制不提供相同的功能。如果设置了别名环境变量,并且用户尝试在设置初始化时覆盖该参数,Pydantic 会将 __init__() 中提供的参数视为额外输入,而不是简单地覆盖该参数的环境变量值。

validate_device

validate_device(value)

ApiVlmOptions

Bases: BaseVlmOptions

属性

concurrency class-attribute instance-attribute

concurrency: int = 1

headers class-attribute instance-attribute

headers: Dict[str, str] = {}

kind class-attribute instance-attribute

kind: Literal['api_model_options'] = 'api_model_options'

params class-attribute instance-attribute

params: Dict[str, Any] = {}

prompt instance-attribute

prompt: str

response_format instance-attribute

response_format: ResponseFormat

scale class-attribute instance-attribute

scale: float = 2.0

timeout class-attribute instance-attribute

timeout: float = 60

url class-attribute instance-attribute

url: AnyUrl = AnyUrl(
    "http://localhost:11434/v1/chat/completions"
)

BaseOptions

Bases: BaseModel

选项的基类。

属性

kind class-attribute

kind: str

BaseVlmOptions

Bases: BaseModel

属性

kind instance-attribute

kind: str

prompt instance-attribute

prompt: str

EasyOcrOptions

Bases: OcrOptions

EasyOCR 引擎的选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

confidence_threshold class-attribute instance-attribute

confidence_threshold: float = 0.5

download_enabled class-attribute instance-attribute

download_enabled: bool = True

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

kind class-attribute

kind: Literal['easyocr'] = 'easyocr'

lang class-attribute instance-attribute

lang: List[str] = ['fr', 'de', 'es', 'en']

model_config class-attribute instance-attribute

model_config = ConfigDict(
    extra="forbid", protected_namespaces=()
)

model_storage_directory class-attribute instance-attribute

model_storage_directory: Optional[str] = None

recog_network class-attribute instance-attribute

recog_network: Optional[str] = 'standard'

use_gpu class-attribute instance-attribute

use_gpu: Optional[bool] = None

HuggingFaceVlmOptions

Bases: BaseVlmOptions

属性

inference_framework instance-attribute

inference_framework: InferenceFramework

kind class-attribute instance-attribute

kind: Literal['hf_model_options'] = 'hf_model_options'

llm_int8_threshold class-attribute instance-attribute

llm_int8_threshold: float = 6.0

load_in_8bit class-attribute instance-attribute

load_in_8bit: bool = True

prompt instance-attribute

prompt: str

quantized class-attribute instance-attribute

quantized: bool = False

repo_cache_folder property

repo_cache_folder: str

repo_id instance-attribute

repo_id: str

response_format instance-attribute

response_format: ResponseFormat

InferenceFramework

Bases: str, Enum

属性

MLX class-attribute instance-attribute

MLX = 'mlx'

OPENAI class-attribute instance-attribute

OPENAI = 'openai'

TRANSFORMERS class-attribute instance-attribute

TRANSFORMERS = 'transformers'

OcrEngine

Bases: str, Enum

有效 OCR 引擎的枚举。

属性

EASYOCR class-attribute instance-attribute

EASYOCR = 'easyocr'

OCRMAC class-attribute instance-attribute

OCRMAC = 'ocrmac'

RAPIDOCR class-attribute instance-attribute

RAPIDOCR = 'rapidocr'

TESSERACT class-attribute instance-attribute

TESSERACT = 'tesseract'

TESSERACT_CLI class-attribute instance-attribute

TESSERACT_CLI = 'tesseract_cli'

OcrMacOptions

Bases: OcrOptions

Mac OCR 引擎的选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

framework class-attribute instance-attribute

framework: str = 'vision'

kind class-attribute

kind: Literal['ocrmac'] = 'ocrmac'

lang class-attribute instance-attribute

lang: List[str] = ['fr-FR', 'de-DE', 'es-ES', 'en-US']

model_config class-attribute instance-attribute

model_config = ConfigDict(extra='forbid')

recognition class-attribute instance-attribute

recognition: str = 'accurate'

OcrOptions

Bases: BaseOptions

OCR 选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

kind class-attribute

kind: str

lang instance-attribute

lang: List[str]

PaginatedPipelineOptions

Bases: PipelineOptions

属性

accelerator_options class-attribute instance-attribute

accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)

allow_external_plugins class-attribute instance-attribute

allow_external_plugins: bool = False

artifacts_path class-attribute instance-attribute

artifacts_path: Optional[Union[Path, str]] = None

create_legacy_output class-attribute instance-attribute

create_legacy_output: bool = True

document_timeout class-attribute instance-attribute

document_timeout: Optional[float] = None

enable_remote_services class-attribute instance-attribute

enable_remote_services: bool = False

generate_page_images class-attribute instance-attribute

generate_page_images: bool = False

generate_picture_images class-attribute instance-attribute

generate_picture_images: bool = False

images_scale class-attribute instance-attribute

images_scale: float = 1.0

PdfBackend

Bases: str, Enum

有效 PDF 后端的枚举。

属性

DLPARSE_V1 class-attribute instance-attribute

DLPARSE_V1 = 'dlparse_v1'

DLPARSE_V2 class-attribute instance-attribute

DLPARSE_V2 = 'dlparse_v2'

DLPARSE_V4 class-attribute instance-attribute

DLPARSE_V4 = 'dlparse_v4'

PYPDFIUM2 class-attribute instance-attribute

PYPDFIUM2 = 'pypdfium2'

PdfPipeline

Bases: str, Enum

属性

STANDARD class-attribute instance-attribute

STANDARD = 'standard'

VLM class-attribute instance-attribute

VLM = 'vlm'

PdfPipelineOptions

Bases: PaginatedPipelineOptions

PDF 管道的选项。

属性

accelerator_options class-attribute instance-attribute

accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)

allow_external_plugins class-attribute instance-attribute

allow_external_plugins: bool = False

artifacts_path class-attribute instance-attribute

artifacts_path: Optional[Union[Path, str]] = None

create_legacy_output class-attribute instance-attribute

create_legacy_output: bool = True

do_code_enrichment class-attribute instance-attribute

do_code_enrichment: bool = False

do_formula_enrichment class-attribute instance-attribute

do_formula_enrichment: bool = False

do_ocr class-attribute instance-attribute

do_ocr: bool = True

do_picture_classification class-attribute instance-attribute

do_picture_classification: bool = False

do_picture_description class-attribute instance-attribute

do_picture_description: bool = False

do_table_structure class-attribute instance-attribute

do_table_structure: bool = True

document_timeout class-attribute instance-attribute

document_timeout: Optional[float] = None

enable_remote_services class-attribute instance-attribute

enable_remote_services: bool = False

force_backend_text class-attribute instance-attribute

force_backend_text: bool = False

generate_page_images class-attribute instance-attribute

generate_page_images: bool = False

generate_parsed_pages class-attribute instance-attribute

generate_parsed_pages: bool = False

generate_picture_images class-attribute instance-attribute

generate_picture_images: bool = False

generate_table_images class-attribute instance-attribute

generate_table_images: bool = Field(
    default=False,
    deprecated="Field `generate_table_images` is deprecated. To obtain table images, set `PdfPipelineOptions.generate_page_images = True` before conversion and then use the `TableItem.get_image` function.",
)

images_scale class-attribute instance-attribute

images_scale: float = 1.0

ocr_options class-attribute instance-attribute

ocr_options: OcrOptions = EasyOcrOptions()

picture_description_options class-attribute instance-attribute

table_structure_options class-attribute instance-attribute

table_structure_options: TableStructureOptions = (
    TableStructureOptions()
)

PictureDescriptionApiOptions

基类: PictureDescriptionBaseOptions

属性

batch_size class-attribute instance-attribute

batch_size: int = 8

concurrency class-attribute instance-attribute

concurrency: int = 1

headers class-attribute instance-attribute

headers: Dict[str, str] = {}

kind class-attribute

kind: Literal['api'] = 'api'

params class-attribute instance-attribute

params: Dict[str, Any] = {}

picture_area_threshold class-attribute instance-attribute

picture_area_threshold: float = 0.05

prompt class-attribute instance-attribute

prompt: str = 'Describe this image in a few sentences.'

provenance class-attribute instance-attribute

provenance: str = ''

scale class-attribute instance-attribute

scale: float = 2

timeout class-attribute instance-attribute

timeout: float = 20

url class-attribute instance-attribute

url: AnyUrl = AnyUrl(
    "http://localhost:8000/v1/chat/completions"
)

PictureDescriptionBaseOptions

Bases: BaseOptions

属性

batch_size class-attribute instance-attribute

batch_size: int = 8

kind class-attribute

kind: str

picture_area_threshold class-attribute instance-attribute

picture_area_threshold: float = 0.05

scale class-attribute instance-attribute

scale: float = 2

PictureDescriptionVlmOptions

基类: PictureDescriptionBaseOptions

属性

batch_size class-attribute instance-attribute

batch_size: int = 8

generation_config class-attribute instance-attribute

generation_config: Dict[str, Any] = dict(
    max_new_tokens=200, do_sample=False
)

kind class-attribute

kind: Literal['vlm'] = 'vlm'

picture_area_threshold class-attribute instance-attribute

picture_area_threshold: float = 0.05

prompt class-attribute instance-attribute

prompt: str = 'Describe this image in a few sentences.'

repo_cache_folder property

repo_cache_folder: str

repo_id instance-attribute

repo_id: str

scale class-attribute instance-attribute

scale: float = 2

PipelineOptions

Bases: BaseModel

基本管道选项。

属性

accelerator_options class-attribute instance-attribute

accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)

allow_external_plugins class-attribute instance-attribute

allow_external_plugins: bool = False

create_legacy_output class-attribute instance-attribute

create_legacy_output: bool = True

document_timeout class-attribute instance-attribute

document_timeout: Optional[float] = None

enable_remote_services class-attribute instance-attribute

enable_remote_services: bool = False

RapidOcrOptions

Bases: OcrOptions

RapidOCR 引擎的选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

cls_model_path class-attribute instance-attribute

cls_model_path: Optional[str] = None

det_model_path class-attribute instance-attribute

det_model_path: Optional[str] = None

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

kind class-attribute

kind: Literal['rapidocr'] = 'rapidocr'

lang class-attribute instance-attribute

lang: List[str] = ['english', 'chinese']

model_config class-attribute instance-attribute

model_config = ConfigDict(extra='forbid')

print_verbose class-attribute instance-attribute

print_verbose: bool = False

rec_keys_path class-attribute instance-attribute

rec_keys_path: Optional[str] = None

rec_model_path class-attribute instance-attribute

rec_model_path: Optional[str] = None

text_score class-attribute instance-attribute

text_score: float = 0.5

use_cls class-attribute instance-attribute

use_cls: Optional[bool] = None

use_det class-attribute instance-attribute

use_det: Optional[bool] = None

use_rec class-attribute instance-attribute

use_rec: Optional[bool] = None

ResponseFormat

Bases: str, Enum

属性

DOCTAGS class-attribute instance-attribute

DOCTAGS = 'doctags'

MARKDOWN class-attribute instance-attribute

MARKDOWN = 'markdown'

TableFormerMode

Bases: str, Enum

TableFormer 模型的模式。

属性

ACCURATE class-attribute instance-attribute

ACCURATE = 'accurate'

FAST class-attribute instance-attribute

FAST = 'fast'

TableStructureOptions

Bases: BaseModel

表格结构的选项。

属性

do_cell_matching class-attribute instance-attribute

do_cell_matching: bool = True

mode class-attribute instance-attribute

TesseractCliOcrOptions

Bases: OcrOptions

TesseractCli 引擎的选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

kind class-attribute

kind: Literal['tesseract'] = 'tesseract'

lang class-attribute instance-attribute

lang: List[str] = ['fra', 'deu', 'spa', 'eng']

model_config class-attribute instance-attribute

model_config = ConfigDict(extra='forbid')

path class-attribute instance-attribute

path: Optional[str] = None

tesseract_cmd class-attribute instance-attribute

tesseract_cmd: str = 'tesseract'

TesseractOcrOptions

Bases: OcrOptions

Tesseract 引擎的选项。

属性

bitmap_area_threshold class-attribute instance-attribute

bitmap_area_threshold: float = 0.05

force_full_page_ocr class-attribute instance-attribute

force_full_page_ocr: bool = False

kind class-attribute

kind: Literal['tesserocr'] = 'tesserocr'

lang class-attribute instance-attribute

lang: List[str] = ['fra', 'deu', 'spa', 'eng']

model_config class-attribute instance-attribute

model_config = ConfigDict(extra='forbid')

path class-attribute instance-attribute

path: Optional[str] = None

VlmModelType

Bases: str, Enum

属性

GRANITE_VISION class-attribute instance-attribute

GRANITE_VISION = 'granite_vision'

GRANITE_VISION_OLLAMA class-attribute instance-attribute

GRANITE_VISION_OLLAMA = 'granite_vision_ollama'

SMOLDOCLING class-attribute instance-attribute

SMOLDOCLING = 'smoldocling'

VlmPipelineOptions

Bases: PaginatedPipelineOptions

属性

accelerator_options class-attribute instance-attribute

accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)

allow_external_plugins class-attribute instance-attribute

allow_external_plugins: bool = False

artifacts_path class-attribute instance-attribute

artifacts_path: Optional[Union[Path, str]] = None

create_legacy_output class-attribute instance-attribute

create_legacy_output: bool = True

document_timeout class-attribute instance-attribute

document_timeout: Optional[float] = None

enable_remote_services class-attribute instance-attribute

enable_remote_services: bool = False

force_backend_text class-attribute instance-attribute

force_backend_text: bool = False

generate_page_images class-attribute instance-attribute

generate_page_images: bool = True

generate_picture_images class-attribute instance-attribute

generate_picture_images: bool = False

images_scale class-attribute instance-attribute

images_scale: float = 1.0

vlm_options class-attribute instance-attribute