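Pipeline options
Pipeline options let you customize how models are executed during the conversion pipeline. This includes the OCR engines, the table model, and enrichment options that can be enabled by setting do_xyz = True.
This is the auto-generated API reference for all pipeline options available in Docling.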
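As an illustration of the do_xyz = True pattern, the following minimal sketch toggles a few of the flags documented below on PdfPipelineOptions. The helper name build_pdf_pipeline_options is hypothetical, and the import path docling.datamodel.pipeline_options is assumed to match the installed Docling version.

```python
def build_pdf_pipeline_options():
    """Hypothetical helper: PDF options with OCR off and formula enrichment on."""
    # Imported lazily so this sketch can be loaded without Docling installed.
    # Assumption: the module path matches your installed Docling version.
    from docling.datamodel.pipeline_options import PdfPipelineOptions

    options = PdfPipelineOptions()
    options.do_ocr = False                # documented default: True
    options.do_formula_enrichment = True  # documented default: False
    # Nested option objects can be adjusted in place as well.
    options.table_structure_options.do_cell_matching = False
    return options
```

The flags are plain boolean fields, so they can also be passed as keyword arguments to the constructor instead of being assigned afterwards.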
pipeline_options
Classes
- AcceleratorDevice – Devices to run model inference.
- AcceleratorOptions
- ApiVlmOptions
- BaseOptions – Base class for options.
- BaseVlmOptions
- EasyOcrOptions – Options for the EasyOCR engine.
- HuggingFaceVlmOptions
- InferenceFramework
- OcrEngine – Enum of valid OCR engines.
- OcrMacOptions – Options for the Mac OCR engine.
- OcrOptions – OCR options.
- PaginatedPipelineOptions
- PdfBackend – Enum of valid PDF backends.
- PdfPipeline
- PdfPipelineOptions – Options for the PDF pipeline.
- PictureDescriptionApiOptions
- PictureDescriptionBaseOptions
- PictureDescriptionVlmOptions
- PipelineOptions – Base pipeline options.
- RapidOcrOptions – Options for the RapidOCR engine.
- ResponseFormat
- TableFormerMode – Modes for the TableFormer model.
- TableStructureOptions – Options for the table structure.
- TesseractCliOcrOptions – Options for the TesseractCli engine.
- TesseractOcrOptions – Options for the Tesseract engine.
- VlmModelType
- VlmPipelineOptions
Attributes
- granite_picture_description
- granite_vision_vlm_conversion_options
- granite_vision_vlm_ollama_conversion_options
- smoldocling_vlm_conversion_options
- smoldocling_vlm_mlx_conversion_options
- smolvlm_picture_description
granite_picture_description module-attribute
granite_picture_description = PictureDescriptionVlmOptions(
    repo_id="ibm-granite/granite-vision-3.1-2b-preview",
    prompt="What is shown in this image?",
)
granite_vision_vlm_conversion_options module-attribute
granite_vision_vlm_conversion_options = (
    HuggingFaceVlmOptions(
        repo_id="ibm-granite/granite-vision-3.1-2b-preview",
        prompt="OCR this image.",
        response_format=MARKDOWN,
        inference_framework=TRANSFORMERS,
    )
)
granite_vision_vlm_ollama_conversion_options module-attribute
granite_vision_vlm_ollama_conversion_options = (
    ApiVlmOptions(
        url=AnyUrl(
            "https://:11434/v1/chat/completions"
        ),
        params={"model": "granite3.2-vision:2b"},
        prompt="OCR the full page to markdown.",
        scale=1.0,
        timeout=120,
        response_format=MARKDOWN,
    )
)
smoldocling_vlm_conversion_options module-attribute
smoldocling_vlm_conversion_options = HuggingFaceVlmOptions(
    repo_id="ds4sd/SmolDocling-256M-preview",
    prompt="Convert this page to docling.",
    response_format=DOCTAGS,
    inference_framework=TRANSFORMERS,
)
smoldocling_vlm_mlx_conversion_options module-attribute
smoldocling_vlm_mlx_conversion_options = (
    HuggingFaceVlmOptions(
        repo_id="ds4sd/SmolDocling-256M-preview-mlx-bf16",
        prompt="Convert this page to docling.",
        response_format=DOCTAGS,
        inference_framework=MLX,
    )
)
smolvlm_picture_description module-attribute
smolvlm_picture_description = PictureDescriptionVlmOptions(
    repo_id="HuggingFaceTB/SmolVLM-256M-Instruct"
)
AcceleratorDevice
AcceleratorOptions
Bases: BaseSettings
Methods
- check_alternative_envvars – Set num_threads from the "alternative" envvar OMP_NUM_THREADS.
- validate_device
Attributes
- cuda_use_flash_attention2 (bool)
- device (Union[str, AcceleratorDevice])
- model_config
- num_threads (int)
cuda_use_flash_attention2 class-attribute instance-attribute
cuda_use_flash_attention2: bool = False
model_config class-attribute instance-attribute
model_config = SettingsConfigDict(
    env_prefix="DOCLING_",
    env_nested_delimiter="_",
    populate_by_name=True,
)
num_threads class-attribute instance-attribute
num_threads: int = 4
check_alternative_envvars classmethod
check_alternative_envvars(data: Any) -> Any
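Set num_threads from the "alternative" envvar OMP_NUM_THREADS. The alternative envvar is used only if it is valid and the regular envvar is not set.
Notice: The standard pydantic settings mechanism with the parameter "aliases" does not provide the same functionality. If the alias envvar is set and the user tries to override the parameter at settings instantiation, Pydantic treats the parameter provided in __init__() as an extra input instead of simply overriding the envvar value for that parameter.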
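The precedence described above can be sketched with the standard library alone. This is not Docling's implementation: resolve_num_threads is a hypothetical helper, and the regular variable name DOCLING_NUM_THREADS is an assumption derived from env_prefix="DOCLING_" in model_config.

```python
import os
from typing import Mapping, Optional

DEFAULT_NUM_THREADS = 4  # documented default of AcceleratorOptions.num_threads


def resolve_num_threads(
    explicit: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper mirroring the precedence described above."""
    env = os.environ if env is None else env
    # 1. A value passed at initialization always wins.
    if explicit is not None:
        return explicit
    # 2. The regular variable (assumed name: prefix + field name) comes next.
    regular = env.get("DOCLING_NUM_THREADS")
    if regular is not None:
        return int(regular)
    # 3. The alternative variable is used only if it parses as an integer.
    alternative = env.get("OMP_NUM_THREADS")
    if alternative is not None:
        try:
            return int(alternative)
        except ValueError:
            pass  # invalid alternative value: fall through to the default
    return DEFAULT_NUM_THREADS


print(resolve_num_threads(env={"OMP_NUM_THREADS": "8"}))  # 8
print(resolve_num_threads(env={"OMP_NUM_THREADS": "8", "DOCLING_NUM_THREADS": "2"}))  # 2
print(resolve_num_threads(explicit=16, env={"OMP_NUM_THREADS": "8"}))  # 16
```

Passing the environment as a mapping keeps the sketch easy to test without mutating os.environ.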
validate_device
validate_device(value)
ApiVlmOptions
Bases: BaseVlmOptions
Attributes
- concurrency (int)
- headers (Dict[str, str])
- kind (Literal['api_model_options'])
- params (Dict[str, Any])
- prompt (str)
- response_format (ResponseFormat)
- scale (float)
- timeout (float)
- url (AnyUrl)
concurrency class-attribute instance-attribute
concurrency: int = 1
headers class-attribute instance-attribute
headers: Dict[str, str] = {}
kind class-attribute instance-attribute
kind: Literal['api_model_options'] = 'api_model_options'
params class-attribute instance-attribute
params: Dict[str, Any] = {}
prompt instance-attribute
prompt: str
scale class-attribute instance-attribute
scale: float = 2.0
timeout class-attribute instance-attribute
timeout: float = 60
url class-attribute instance-attribute
url: AnyUrl = AnyUrl(
    "https://:11434/v1/chat/completions"
)
BaseOptions
BaseVlmOptions
EasyOcrOptions
Bases: OcrOptions
Options for the EasyOCR engine.
Attributes
- bitmap_area_threshold (float)
- confidence_threshold (float)
- download_enabled (bool)
- force_full_page_ocr (bool)
- kind (Literal['easyocr'])
- lang (List[str])
- model_config
- model_storage_directory (Optional[str])
- recog_network (Optional[str])
- use_gpu (Optional[bool])
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
confidence_threshold class-attribute instance-attribute
confidence_threshold: float = 0.5
download_enabled class-attribute instance-attribute
download_enabled: bool = True
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
kind class-attribute
kind: Literal['easyocr'] = 'easyocr'
lang class-attribute instance-attribute
lang: List[str] = ['fr', 'de', 'es', 'en']
model_config class-attribute instance-attribute
model_config = ConfigDict(
    extra="forbid", protected_namespaces=()
)
model_storage_directory class-attribute instance-attribute
model_storage_directory: Optional[str] = None
recog_network class-attribute instance-attribute
recog_network: Optional[str] = 'standard'
use_gpu class-attribute instance-attribute
use_gpu: Optional[bool] = None
HuggingFaceVlmOptions
Bases: BaseVlmOptions
Attributes
- inference_framework (InferenceFramework)
- kind (Literal['hf_model_options'])
- llm_int8_threshold (float)
- load_in_8bit (bool)
- prompt (str)
- quantized (bool)
- repo_cache_folder (str)
- repo_id (str)
- response_format (ResponseFormat)
kind class-attribute instance-attribute
kind: Literal['hf_model_options'] = 'hf_model_options'
llm_int8_threshold class-attribute instance-attribute
llm_int8_threshold: float = 6.0
load_in_8bit class-attribute instance-attribute
load_in_8bit: bool = True
prompt instance-attribute
prompt: str
quantized class-attribute instance-attribute
quantized: bool = False
repo_cache_folder property
repo_cache_folder: str
repo_id instance-attribute
repo_id: str
InferenceFramework
Bases: str, Enum
Attributes
- MLX
- OPENAI
- TRANSFORMERS
MLX class-attribute instance-attribute
MLX = 'mlx'
OPENAI class-attribute instance-attribute
OPENAI = 'openai'
TRANSFORMERS class-attribute instance-attribute
TRANSFORMERS = 'transformers'
OcrEngine
Bases: str, Enum
Enum of valid OCR engines.
Attributes
- EASYOCR
- OCRMAC
- RAPIDOCR
- TESSERACT
- TESSERACT_CLI
EASYOCR class-attribute instance-attribute
EASYOCR = 'easyocr'
OCRMAC class-attribute instance-attribute
OCRMAC = 'ocrmac'
RAPIDOCR class-attribute instance-attribute
RAPIDOCR = 'rapidocr'
TESSERACT class-attribute instance-attribute
TESSERACT = 'tesseract'
TESSERACT_CLI class-attribute instance-attribute
TESSERACT_CLI = 'tesseract_cli'
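Because OcrEngine derives from both str and Enum, its members behave like plain strings in comparisons and serialization. The sketch below uses a stand-in class, OcrEngineSketch (a hypothetical name), that copies the documented values so it runs without Docling installed. It only illustrates the standard-library behavior of str-based enums.

```python
import json
from enum import Enum


class OcrEngineSketch(str, Enum):
    """Stand-in that mirrors the documented OcrEngine members."""

    EASYOCR = "easyocr"
    OCRMAC = "ocrmac"
    RAPIDOCR = "rapidocr"
    TESSERACT = "tesseract"
    TESSERACT_CLI = "tesseract_cli"


# Look a member up by its string value, e.g. when parsing a CLI argument.
engine = OcrEngineSketch("tesseract_cli")
print(engine is OcrEngineSketch.TESSERACT_CLI)

# Members compare equal to their raw string value.
print(engine == "tesseract_cli")

# Being str instances, members serialize to JSON without a custom encoder.
print(json.dumps({"engine": OcrEngineSketch.EASYOCR}))
```

The same pattern applies to the other str-based enums on this page, such as InferenceFramework and PdfBackend.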
OcrMacOptions
Bases: OcrOptions
Options for the Mac OCR engine.
Attributes
- bitmap_area_threshold (float)
- force_full_page_ocr (bool)
- framework (str)
- kind (Literal['ocrmac'])
- lang (List[str])
- model_config
- recognition (str)
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
framework class-attribute instance-attribute
framework: str = 'vision'
kind class-attribute
kind: Literal['ocrmac'] = 'ocrmac'
lang class-attribute instance-attribute
lang: List[str] = ['fr-FR', 'de-DE', 'es-ES', 'en-US']
model_config class-attribute instance-attribute
model_config = ConfigDict(extra='forbid')
recognition class-attribute instance-attribute
recognition: str = 'accurate'
OcrOptions
Bases: BaseOptions
OCR options.
Attributes
- bitmap_area_threshold (float)
- force_full_page_ocr (bool)
- kind (str)
- lang (List[str])
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
kind class-attribute
kind: str
lang instance-attribute
lang: List[str]
PaginatedPipelineOptions
Bases: PipelineOptions
Attributes
- accelerator_options (AcceleratorOptions)
- allow_external_plugins (bool)
- artifacts_path (Optional[Union[Path, str]])
- create_legacy_output (bool)
- document_timeout (Optional[float])
- enable_remote_services (bool)
- generate_page_images (bool)
- generate_picture_images (bool)
- images_scale (float)
accelerator_options class-attribute instance-attribute
accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)
allow_external_plugins class-attribute instance-attribute
allow_external_plugins: bool = False
artifacts_path class-attribute instance-attribute
artifacts_path: Optional[Union[Path, str]] = None
create_legacy_output class-attribute instance-attribute
create_legacy_output: bool = True
document_timeout class-attribute instance-attribute
document_timeout: Optional[float] = None
enable_remote_services class-attribute instance-attribute
enable_remote_services: bool = False
generate_page_images class-attribute instance-attribute
generate_page_images: bool = False
generate_picture_images class-attribute instance-attribute
generate_picture_images: bool = False
images_scale class-attribute instance-attribute
images_scale: float = 1.0
PdfBackend
Bases: str, Enum
Enum of valid PDF backends.
Attributes
- DLPARSE_V1
- DLPARSE_V2
- DLPARSE_V4
- PYPDFIUM2
DLPARSE_V1 class-attribute instance-attribute
DLPARSE_V1 = 'dlparse_v1'
DLPARSE_V2 class-attribute instance-attribute
DLPARSE_V2 = 'dlparse_v2'
DLPARSE_V4 class-attribute instance-attribute
DLPARSE_V4 = 'dlparse_v4'
PYPDFIUM2 class-attribute instance-attribute
PYPDFIUM2 = 'pypdfium2'
PdfPipeline
PdfPipelineOptions
Bases: PaginatedPipelineOptions
Options for the PDF pipeline.
Attributes
- accelerator_options (AcceleratorOptions)
- allow_external_plugins (bool)
- artifacts_path (Optional[Union[Path, str]])
- create_legacy_output (bool)
- do_code_enrichment (bool)
- do_formula_enrichment (bool)
- do_ocr (bool)
- do_picture_classification (bool)
- do_picture_description (bool)
- do_table_structure (bool)
- document_timeout (Optional[float])
- enable_remote_services (bool)
- force_backend_text (bool)
- generate_page_images (bool)
- generate_parsed_pages (bool)
- generate_picture_images (bool)
- generate_table_images (bool)
- images_scale (float)
- ocr_options (OcrOptions)
- picture_description_options (PictureDescriptionBaseOptions)
- table_structure_options (TableStructureOptions)
accelerator_options class-attribute instance-attribute
accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)
allow_external_plugins class-attribute instance-attribute
allow_external_plugins: bool = False
artifacts_path class-attribute instance-attribute
artifacts_path: Optional[Union[Path, str]] = None
create_legacy_output class-attribute instance-attribute
create_legacy_output: bool = True
do_code_enrichment class-attribute instance-attribute
do_code_enrichment: bool = False
do_formula_enrichment class-attribute instance-attribute
do_formula_enrichment: bool = False
do_ocr class-attribute instance-attribute
do_ocr: bool = True
do_picture_classification class-attribute instance-attribute
do_picture_classification: bool = False
do_picture_description class-attribute instance-attribute
do_picture_description: bool = False
do_table_structure class-attribute instance-attribute
do_table_structure: bool = True
document_timeout class-attribute instance-attribute
document_timeout: Optional[float] = None
enable_remote_services class-attribute instance-attribute
enable_remote_services: bool = False
force_backend_text class-attribute instance-attribute
force_backend_text: bool = False
generate_page_images class-attribute instance-attribute
generate_page_images: bool = False
generate_parsed_pages class-attribute instance-attribute
generate_parsed_pages: bool = False
generate_picture_images class-attribute instance-attribute
generate_picture_images: bool = False
generate_table_images class-attribute instance-attribute
generate_table_images: bool = Field(
    default=False,
    deprecated="Field `generate_table_images` is deprecated. To obtain table images, set `PdfPipelineOptions.generate_page_images = True` before conversion and then use the `TableItem.get_image` function.",
)
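The replacement suggested by the deprecation message could look roughly like the sketch below. collect_table_images is a hypothetical helper. It assumes the document was converted with PdfPipelineOptions.generate_page_images = True, that the converted document exposes its tables as doc.tables, and that TableItem.get_image takes the document as its argument and may return None when no image is available.

```python
from typing import Any, List


def collect_table_images(doc: Any) -> List[Any]:
    """Hypothetical helper: return the image of every table in a converted document.

    Assumes `doc.tables` is iterable and each entry exposes `get_image(doc)`,
    and that conversion ran with `generate_page_images = True`.
    """
    images = []
    for table in doc.tables:
        image = table.get_image(doc)
        if image is not None:  # skip tables for which no image is available
            images.append(image)
    return images
```

The function relies only on duck typing, so it can be exercised with simple stub objects before wiring it to a real conversion result.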
images_scale class-attribute instance-attribute
images_scale: float = 1.0
picture_description_options class-attribute instance-attribute
picture_description_options: (
    PictureDescriptionBaseOptions
) = smolvlm_picture_description
table_structure_options class-attribute instance-attribute
table_structure_options: TableStructureOptions = (
    TableStructureOptions()
)
PictureDescriptionApiOptions
Bases: PictureDescriptionBaseOptions
Attributes
- batch_size (int)
- concurrency (int)
- headers (Dict[str, str])
- kind (Literal['api'])
- params (Dict[str, Any])
- picture_area_threshold (float)
- prompt (str)
- provenance (str)
- scale (float)
- timeout (float)
- url (AnyUrl)
batch_size class-attribute instance-attribute
batch_size: int = 8
concurrency class-attribute instance-attribute
concurrency: int = 1
headers class-attribute instance-attribute
headers: Dict[str, str] = {}
kind class-attribute
kind: Literal['api'] = 'api'
params class-attribute instance-attribute
params: Dict[str, Any] = {}
picture_area_threshold class-attribute instance-attribute
picture_area_threshold: float = 0.05
prompt class-attribute instance-attribute
prompt: str = 'Describe this image in a few sentences.'
provenance class-attribute instance-attribute
provenance: str = ''
scale class-attribute instance-attribute
scale: float = 2
timeout class-attribute instance-attribute
timeout: float = 20
url class-attribute instance-attribute
url: AnyUrl = AnyUrl(
    "https://:8000/v1/chat/completions"
)
PictureDescriptionBaseOptions
Bases: BaseOptions
Attributes
- batch_size (int)
- kind (str)
- picture_area_threshold (float)
- scale (float)
batch_size class-attribute instance-attribute
batch_size: int = 8
kind class-attribute
kind: str
picture_area_threshold class-attribute instance-attribute
picture_area_threshold: float = 0.05
scale class-attribute instance-attribute
scale: float = 2
PictureDescriptionVlmOptions
Bases: PictureDescriptionBaseOptions
Attributes
- batch_size (int)
- generation_config (Dict[str, Any])
- kind (Literal['vlm'])
- picture_area_threshold (float)
- prompt (str)
- repo_cache_folder (str)
- repo_id (str)
- scale (float)
batch_size class-attribute instance-attribute
batch_size: int = 8
generation_config class-attribute instance-attribute
generation_config: Dict[str, Any] = dict(
    max_new_tokens=200, do_sample=False
)
kind class-attribute
kind: Literal['vlm'] = 'vlm'
picture_area_threshold class-attribute instance-attribute
picture_area_threshold: float = 0.05
prompt class-attribute instance-attribute
prompt: str = 'Describe this image in a few sentences.'
repo_cache_folder property
repo_cache_folder: str
repo_id instance-attribute
repo_id: str
scale class-attribute instance-attribute
scale: float = 2
PipelineOptions
Bases: BaseModel
Base pipeline options.
Attributes
- accelerator_options (AcceleratorOptions)
- allow_external_plugins (bool)
- create_legacy_output (bool)
- document_timeout (Optional[float])
- enable_remote_services (bool)
accelerator_options class-attribute instance-attribute
accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)
allow_external_plugins class-attribute instance-attribute
allow_external_plugins: bool = False
create_legacy_output class-attribute instance-attribute
create_legacy_output: bool = True
document_timeout class-attribute instance-attribute
document_timeout: Optional[float] = None
enable_remote_services class-attribute instance-attribute
enable_remote_services: bool = False
RapidOcrOptions
Bases: OcrOptions
Options for the RapidOCR engine.
Attributes
- bitmap_area_threshold (float)
- cls_model_path (Optional[str])
- det_model_path (Optional[str])
- force_full_page_ocr (bool)
- kind (Literal['rapidocr'])
- lang (List[str])
- model_config
- print_verbose (bool)
- rec_keys_path (Optional[str])
- rec_model_path (Optional[str])
- text_score (float)
- use_cls (Optional[bool])
- use_det (Optional[bool])
- use_rec (Optional[bool])
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
cls_model_path class-attribute instance-attribute
cls_model_path: Optional[str] = None
det_model_path class-attribute instance-attribute
det_model_path: Optional[str] = None
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
kind class-attribute
kind: Literal['rapidocr'] = 'rapidocr'
lang class-attribute instance-attribute
lang: List[str] = ['english', 'chinese']
model_config class-attribute instance-attribute
model_config = ConfigDict(extra='forbid')
print_verbose class-attribute instance-attribute
print_verbose: bool = False
rec_keys_path class-attribute instance-attribute
rec_keys_path: Optional[str] = None
rec_model_path class-attribute instance-attribute
rec_model_path: Optional[str] = None
text_score class-attribute instance-attribute
text_score: float = 0.5
use_cls class-attribute instance-attribute
use_cls: Optional[bool] = None
use_det class-attribute instance-attribute
use_det: Optional[bool] = None
use_rec class-attribute instance-attribute
use_rec: Optional[bool] = None
ResponseFormat
TableFormerMode
TableStructureOptions
Bases: BaseModel
Options for the table structure.
Attributes
- do_cell_matching (bool)
- mode (TableFormerMode)
do_cell_matching class-attribute instance-attribute
do_cell_matching: bool = True
TesseractCliOcrOptions
Bases: OcrOptions
Options for the TesseractCli engine.
Attributes
- bitmap_area_threshold (float)
- force_full_page_ocr (bool)
- kind (Literal['tesseract'])
- lang (List[str])
- model_config
- path (Optional[str])
- tesseract_cmd (str)
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
kind class-attribute
kind: Literal['tesseract'] = 'tesseract'
lang class-attribute instance-attribute
lang: List[str] = ['fra', 'deu', 'spa', 'eng']
model_config class-attribute instance-attribute
model_config = ConfigDict(extra='forbid')
path class-attribute instance-attribute
path: Optional[str] = None
tesseract_cmd class-attribute instance-attribute
tesseract_cmd: str = 'tesseract'
TesseractOcrOptions
Bases: OcrOptions
Options for the Tesseract engine.
Attributes
- bitmap_area_threshold (float)
- force_full_page_ocr (bool)
- kind (Literal['tesserocr'])
- lang (List[str])
- model_config
- path (Optional[str])
bitmap_area_threshold class-attribute instance-attribute
bitmap_area_threshold: float = 0.05
force_full_page_ocr class-attribute instance-attribute
force_full_page_ocr: bool = False
kind class-attribute
kind: Literal['tesserocr'] = 'tesserocr'
lang class-attribute instance-attribute
lang: List[str] = ['fra', 'deu', 'spa', 'eng']
model_config class-attribute instance-attribute
model_config = ConfigDict(extra='forbid')
path class-attribute instance-attribute
path: Optional[str] = None
VlmModelType
Bases: str, Enum
Attributes
GRANITE_VISION class-attribute instance-attribute
GRANITE_VISION = 'granite_vision'
GRANITE_VISION_OLLAMA class-attribute instance-attribute
GRANITE_VISION_OLLAMA = 'granite_vision_ollama'
SMOLDOCLING class-attribute instance-attribute
SMOLDOCLING = 'smoldocling'
VlmPipelineOptions
Bases: PaginatedPipelineOptions
Attributes
- accelerator_options (AcceleratorOptions)
- allow_external_plugins (bool)
- artifacts_path (Optional[Union[Path, str]])
- create_legacy_output (bool)
- document_timeout (Optional[float])
- enable_remote_services (bool)
- force_backend_text (bool)
- generate_page_images (bool)
- generate_picture_images (bool)
- images_scale (float)
- vlm_options (Union[HuggingFaceVlmOptions, ApiVlmOptions])
accelerator_options class-attribute instance-attribute
accelerator_options: AcceleratorOptions = (
    AcceleratorOptions()
)
allow_external_plugins class-attribute instance-attribute
allow_external_plugins: bool = False
artifacts_path class-attribute instance-attribute
artifacts_path: Optional[Union[Path, str]] = None
create_legacy_output class-attribute instance-attribute
create_legacy_output: bool = True
document_timeout class-attribute instance-attribute
document_timeout: Optional[float] = None
enable_remote_services class-attribute instance-attribute
enable_remote_services: bool = False
force_backend_text class-attribute instance-attribute
force_backend_text: bool = False
generate_page_images class-attribute instance-attribute
generate_page_images: bool = True
generate_picture_images class-attribute instance-attribute
generate_picture_images: bool = False
images_scale class-attribute instance-attribute
images_scale: float = 1.0
vlm_options class-attribute instance-attribute
vlm_options: Union[HuggingFaceVlmOptions, ApiVlmOptions] = (
    smoldocling_vlm_conversion_options
)