Merge pull request #365 from brianxiadong/feat-brianxiadong-gitbook
Feat: Add GitBook Document Reader. Related issue: #278
chickenlj authored Jan 13, 2025
2 parents 1209a8f + 5d7c916 commit b9cd1c0
Showing 8 changed files with 858 additions and 0 deletions.
111 changes: 111 additions & 0 deletions community/document-readers/gitbook-document-reader/README.md
@@ -0,0 +1,111 @@
# GitBook Document Reader

GitBook Document Reader is a component of the Spring AI ecosystem designed to read and process documents from the GitBook platform. It converts GitBook documents into Spring AI Document objects, facilitating subsequent AI processing and analysis.

## Features

- Support for reading GitBook documents via API
- Automatic metadata extraction (title, description, path, etc.)
- Configurable metadata fields
- Custom API endpoint support
- Markdown content preservation

## Getting Started

### Maven Dependency

Add the following dependency to your `pom.xml` file:

```xml
<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>gitbook-document-reader</artifactId>
    <version>${version}</version>
</dependency>
```

### Basic Usage

```java
// Create a GitBook document reader instance with the required parameters
GitbookDocumentReader reader = new GitbookDocumentReader(
    "your-api-token",
    "your-space-id"
);

// Get the list of documents
List<Document> documents = reader.get();
```
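
Hard-coding the token as in the snippet above is fine for a quick trial, but credentials usually come from the environment. The following is a minimal sketch, assuming hypothetical environment variable names `GITBOOK_API_TOKEN` and `GITBOOK_SPACE_ID` (neither is defined by this module). The `GitbookCredentials` helper is also hypothetical; it only validates the two values so they can be passed to the `GitbookDocumentReader` constructor.

```java
import java.util.Map;

// Hypothetical helper, not part of gitbook-document-reader.
class GitbookCredentials {

    // Assumed variable names; pick whatever fits your deployment.
    static final String TOKEN_VAR = "GITBOOK_API_TOKEN";
    static final String SPACE_VAR = "GITBOOK_SPACE_ID";

    final String apiToken;
    final String spaceId;

    GitbookCredentials(String apiToken, String spaceId) {
        this.apiToken = apiToken;
        this.spaceId = spaceId;
    }

    // Reads both values from the given environment map and fails fast
    // if either one is missing or blank.
    static GitbookCredentials fromEnv(Map<String, String> env) {
        return new GitbookCredentials(require(env, TOKEN_VAR), require(env, SPACE_VAR));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }
}
```

With this in place, `GitbookCredentials.fromEnv(System.getenv())` yields an `apiToken` and `spaceId` that can be handed to `new GitbookDocumentReader(...)` exactly as in the basic usage snippet.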

### Advanced Configuration

```java
// Create a reader with custom configuration
List<String> metadataFields = Arrays.asList("title", "description", "parent", "type");
GitbookDocumentReader reader = new GitbookDocumentReader(
    "your-api-token",
    "your-space-id",
    "custom-api-url",   // Optional custom API endpoint
    metadataFields      // Optional metadata fields to include
);
```

## Configuration Parameters

| Parameter | Description | Required | Default Value |
|-----------|-------------|----------|---------------|
| apiToken | GitBook API token for authentication | Yes | - |
| spaceId | ID of the GitBook space to read from | Yes | - |
| apiUrl | Custom API endpoint URL | No | Default GitBook API URL |
| metadataFields | List of metadata fields to include | No | null |
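
The two optional parameters in the table imply fallbacks when they are left as `null`. The sketch below shows one way such defaults could be resolved. It is illustrative only and not the reader's actual code: `ReaderSettings` is a hypothetical name, `https://api.gitbook.com/v1` is GitBook's public API base URL and is only assumed to be the default the reader uses, and treating a `null` field list as "no optional fields" is likewise an assumption.

```java
import java.util.List;

// Hypothetical illustration; not the reader's actual implementation.
class ReaderSettings {

    // Assumed default: GitBook's public API base URL.
    static final String DEFAULT_API_URL = "https://api.gitbook.com/v1";

    final String apiUrl;
    final List<String> metadataFields;

    ReaderSettings(String apiUrl, List<String> metadataFields) {
        // A null or blank URL falls back to the default endpoint.
        this.apiUrl = (apiUrl == null || apiUrl.trim().isEmpty()) ? DEFAULT_API_URL : apiUrl;
        // A null field list is treated as "no optional fields" (an assumption).
        this.metadataFields = (metadataFields == null) ? List.of() : List.copyOf(metadataFields);
    }
}
```

Resolving defaults once in the constructor keeps the rest of the code free of `null` checks.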

### Available Metadata Fields
- `title`: The page title
- `description`: The page description
- `parent`: The parent page information
- `type`: The page type
- `path`: The page path (always included)

## Important Notes

1. API Token is required for authentication
2. Space ID must be provided to identify the GitBook space
3. Each document's ID is set to the GitBook page ID
4. Empty content pages are automatically skipped
5. The `path` metadata field is always included regardless of metadata field configuration
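
Notes 4 and 5 can be sketched as two small functions: one that drops pages without content, and one that builds the metadata map with `path` always present. This is a simplified illustration under stated assumptions, not the reader's actual code: `PageFilterSketch` and its `Page` class are hypothetical stand-ins, and "empty" is assumed to mean null or whitespace-only content.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of notes 4 and 5; not the reader's actual code.
class PageFilterSketch {

    // Simplified stand-in for a GitBook page.
    static final class Page {
        final String id;
        final Map<String, Object> attributes;
        final String content;

        Page(String id, Map<String, Object> attributes, String content) {
            this.id = id;
            this.attributes = attributes;
            this.content = content;
        }
    }

    // Keeps "path" unconditionally, then adds each requested field
    // that is actually present on the page.
    static Map<String, Object> selectMetadata(Page page, List<String> requestedFields) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        Object path = page.attributes.get("path");
        if (path != null) {
            metadata.put("path", path);
        }
        if (requestedFields != null) {
            for (String field : requestedFields) {
                Object value = page.attributes.get(field);
                if (value != null) {
                    metadata.put(field, value);
                }
            }
        }
        return metadata;
    }

    // Drops pages whose content is null or whitespace-only.
    static List<Page> withContent(List<Page> pages) {
        List<Page> kept = new ArrayList<>();
        for (Page page : pages) {
            if (page.content != null && !page.content.trim().isEmpty()) {
                kept.add(page);
            }
        }
        return kept;
    }
}
```

For a page with `path`, `title`, and `type` attributes, requesting `title` and `description` would yield a map holding only `path` and `title`, since `description` is absent and `type` was not requested.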

## Example Code

```java
import com.alibaba.cloud.ai.reader.gitbook.GitbookDocumentReader;
import org.springframework.ai.document.Document;
import java.util.Arrays;
import java.util.List;

public class GitbookReaderExample {
    public static void main(String[] args) {
        // Create reader instance with metadata fields
        List<String> metadataFields = Arrays.asList("title", "description");
        GitbookDocumentReader reader = new GitbookDocumentReader(
            "your-api-token",
            "your-space-id",
            null, // Use default API URL
            metadataFields
        );

        // Read documents
        List<Document> documents = reader.get();

        // Process document content and metadata
        for (Document doc : documents) {
            System.out.println("Document ID: " + doc.getId());
            System.out.println("Content: " + doc.getContent());
            System.out.println("Metadata: " + doc.getMetadata());
        }
    }
}
```

## License

This project is licensed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
111 changes: 111 additions & 0 deletions community/document-readers/gitbook-document-reader/pom.xml
@@ -0,0 +1,111 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Copyright 2024-2025 the original author or authors.
~
~ Licensed under the Apache License, Version 2.0 (the "License");
~ you may not use this file except in compliance with the License.
~ You may obtain a copy of the License at
~
~ https://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.alibaba.cloud.ai</groupId>
        <artifactId>spring-ai-alibaba</artifactId>
        <version>${revision}</version>
        <relativePath>../../../pom.xml</relativePath>
    </parent>

    <artifactId>gitbook-document-reader</artifactId>
    <name>gitbook-document-reader</name>
    <description>gitbook reader for Spring AI Alibaba</description>
    <packaging>jar</packaging>
    <url>https://github.com/alibaba/spring-ai-alibaba</url>
    <scm>
        <url>https://github.com/alibaba/spring-ai-alibaba</url>
        <connection>git://github.com/alibaba/spring-ai-alibaba.git</connection>
        <developerConnection>git@github.com:alibaba/spring-ai-alibaba.git</developerConnection>
    </scm>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven-deploy-plugin.version>3.1.1</maven-deploy-plugin.version>
    </properties>

    <dependencies>

        <dependency>
            <groupId>com.alibaba.cloud.ai</groupId>
            <artifactId>spring-ai-alibaba-core</artifactId>
            <version>${project.parent.version}</version>
        </dependency>

        <!-- test dependencies -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>io.projectreactor</groupId>
            <artifactId>reactor-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>io.micrometer</groupId>
            <artifactId>micrometer-observation-test</artifactId>
            <scope>test</scope>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <version>${spring-boot.version}</version>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-deploy-plugin</artifactId>
                <version>${maven-deploy-plugin.version}</version>
                <configuration>
                    <skip>true</skip>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>

</project>