PDF-Extract-Kit 0.1.0 documentation

Welcome to the PDF-Extract-Kit Documentation

pdf-extract-kit

High-Quality Document Parsing Toolkit

Tutorial

Getting Started

  • Installation
    • Best Practices
  • Model Weights Download
    • [Recommended] Method 1: snapshot_download
    • Method 2: Git LFS
  • Quick Start
    • Layout Detection Example
    • Formula Detection Example

Core Algorithm Modules

  • Layout Detection Algorithm
    • Introduction
    • Model Usage
  • Formula Detection Algorithm
    • Introduction
    • Model Usage
  • Formula Recognition Algorithm
    • Introduction
    • Model Usage
  • OCR (Optical Character Recognition) Algorithm
    • Introduction
    • Model Usage
  • Table Recognition Algorithm
    • Introduction
    • Model Usage
  • Reading Order Algorithm

Task Extensions

  • Code Implementation
    • Task Definition and Registration
    • Model Definition and Registration
    • Example Script
    • Support Type Extension
    • Batch Processing Extension
  • Documentation Supplement
  • Model Performance Evaluation

Supported Models

  • The Supported Models

Model Performance Evaluation

  • Layout Detection Evaluation
  • Formula Detection Evaluation
  • Formula Recognition Evaluation
  • OCR Evaluation
  • Table Recognition Evaluation
  • Reading Order Evaluation
  • PDF Content Extraction Evaluation [End-to-End]

PDF Projects

  • Document Content Extraction Project
    • Introduction
    • Project Usage
  • Document Translation Project
  • Model Acceleration Project

By OpenDataLab

© Copyright 2024, PDF-Extract-Kit Contributors.