Commit
Merge pull request #369 from brianxiadong/feat-brianxiadong-gitlab
Feat : GitLab Document Reader , close #279
Showing 12 changed files with 1,506 additions and 0 deletions.
251 changes: 251 additions & 0 deletions
community/document-readers/gitlab-document-reader/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,251 @@ | ||
# GitLab Document Reader | ||
|
||
[English](#english) | [中文](#chinese) | ||
|
||
<a name="english"></a> | ||
## English | ||
|
||
GitLab Document Reader is a Spring AI document reader implementation that reads issues and repository files from GitLab projects and converts them into documents. It supports public repositories and provides flexible filtering options. | ||
|
||
### Features | ||
|
||
#### GitLab Issue Reader | ||
- Read issues from GitLab projects or groups | ||
- Filter issues by: | ||
  - State (open, closed, all) | ||
  - Labels | ||
  - Milestone | ||
  - Author | ||
  - Assignee | ||
  - Created/Updated date ranges | ||
  - And more... | ||
- Support for issue metadata including: | ||
  - State | ||
  - URL | ||
  - Labels | ||
  - Creation date | ||
  - Author | ||
  - Assignee | ||
|
||
#### GitLab Repository Reader | ||
- Read files from GitLab repositories | ||
- Support for: | ||
  - Single file reading | ||
  - Directory traversal | ||
  - Recursive file listing | ||
  - File pattern filtering (glob patterns) | ||
- File metadata including: | ||
  - File path | ||
  - File name | ||
  - Size | ||
  - URL | ||
  - Last commit ID | ||
  - Content SHA256 | ||
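
The `Content SHA256` metadata entry is a hex-encoded SHA-256 digest of the file content. The README does not say whether the reader copies the value reported by GitLab or computes it locally, so the following is only a minimal sketch of how a caller might recompute the digest to verify downloaded content. The class name `ContentDigestSketch` and the method `sha256Hex` are hypothetical and not part of this module.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper, not part of gitlab-document-reader.
public class ContentDigestSketch {

    // Returns the lowercase hex SHA-256 digest of the given bytes.
    public static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(content));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for file content fetched from a repository.
        byte[] fileContent = "# GitLab Document Reader\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha256Hex(fileContent));
    }
}
```

Comparing this value against the document's metadata is one way to detect truncated or re-encoded content, assuming the metadata digest covers the raw file bytes.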
|
||
### Usage | ||
|
||
#### Reading Issues | ||
|
||
Basic usage to read all open issues: | ||
```java | ||
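import java.util.List; | ||
import org.springframework.ai.document.Document; | ||
// GitLabIssueReader comes from this module. | ||
// Arguments: GitLab host URL, namespace, project name. | ||
GitLabIssueReader reader = new GitLabIssueReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
// get() returns the fetched issues as Spring AI documents. | ||
List<Document> documents = reader.get(); | ||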
``` | ||
|
||
Advanced filtering with configuration: | ||
```java | ||
import java.time.LocalDateTime; | ||
import java.util.Arrays; | ||
// Filter: closed issues, labels "bug" and "critical", created within the last 30 days. | ||
GitLabIssueConfig config = GitLabIssueConfig.builder() | ||
    .state(GitLabIssueState.CLOSED) | ||
    .labels(Arrays.asList("bug", "critical")) | ||
    .createdAfter(LocalDateTime.now().minusDays(30)) | ||
    .build(); | ||
|
||
GitLabIssueReader reader = new GitLabIssueReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name", | ||
    null, | ||
    config | ||
); | ||
List<Document> documents = reader.get(); | ||
``` | ||
|
||
#### Reading Repository Files | ||
|
||
Basic usage to read a single file: | ||
```java | ||
GitLabRepositoryReader reader = new GitLabRepositoryReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
List<Document> documents = reader.setRef("main") | ||
    .setFilePath("README.md") | ||
    .get(); | ||
``` | ||
|
||
Reading all markdown files recursively: | ||
```java | ||
GitLabRepositoryReader reader = new GitLabRepositoryReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
List<Document> documents = reader.setRef("main") | ||
    .setPattern("**/*.md") | ||
    .setRecursive(true) | ||
    .get(); | ||
``` | ||
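
The README does not state which glob dialect `setPattern` uses. As a rough illustration only, the sketch below shows how a pattern such as `**/*.md` behaves under the JDK's own `java.nio.file.PathMatcher` glob rules; the reader's actual matching may differ. The class `GlobPatternSketch` and its `matches` method are hypothetical names used for this example.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;

// Hypothetical helper, not part of gitlab-document-reader.
public class GlobPatternSketch {

    // True when the repository-relative path matches the glob under java.nio rules.
    public static boolean matches(String glob, String repoPath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Path.of(repoPath));
    }

    public static void main(String[] args) {
        List<String> paths = List.of("README.md", "docs/guide.md", "docs/api/index.md", "src/Main.java");
        for (String path : paths) {
            System.out.println(path + " -> " + matches("**/*.md", path));
        }
    }
}
```

Under these JDK rules, `**/*.md` matches `docs/guide.md` but not a root-level `README.md`, because the pattern requires a directory separator. If root-level files are missing from your results, it may be worth trying `*.md` as well.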
|
||
### Dependencies | ||
|
||
Add the following dependency to your project: | ||
|
||
```xml | ||
<dependency> | ||
<groupId>com.alibaba.cloud.ai</groupId> | ||
<artifactId>gitlab-document-reader</artifactId> | ||
<version>${spring-ai-alibaba.version}</version> | ||
</dependency> | ||
``` | ||
|
||
The GitLab Document Reader uses the GitLab4J API library internally for GitLab integration; Maven resolves it automatically as a transitive dependency. | ||
|
||
### Limitations | ||
|
||
- Only supports public repositories | ||
- Rate limits apply based on GitLab's API restrictions | ||
- File size limits apply based on GitLab's API restrictions | ||
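
Because rate limits apply, callers that read many issues or files may want to retry failed reads with a growing delay. The following is a generic sketch of exponential backoff, not something this module provides. `RetrySketch` and `withBackoff` are hypothetical names, and the failure in `main` is simulated rather than a real GitLab response.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of gitlab-document-reader.
public class RetrySketch {

    // Runs the action until it succeeds or maxAttempts is reached,
    // doubling the wait after each failed attempt.
    public static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        // All attempts failed: rethrow the most recent failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated call that fails twice before succeeding.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated rate limit");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In practice the wrapped action could be a call such as `reader.get()`. Whether a failure is actually caused by rate limiting depends on the exception the reader surfaces, which the README does not specify.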
|
||
### License | ||
|
||
This project is licensed under the Apache License 2.0 - see the LICENSE file for details. | ||
|
||
--- | ||
|
||
<a name="chinese"></a> | ||
## 中文 | ||
|
||
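GitLab Document Reader is a Spring AI document reader implementation that reads issues and repository files from GitLab projects and converts them into documents. It supports access to public repositories and provides flexible filtering options. | ||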
|
||
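### Features | ||
|
||
#### GitLab Issue Reader | ||
- Read issues from GitLab projects or groups | ||
- Filter issues by: | ||
  - State (open, closed, all) | ||
  - Labels | ||
  - Milestone | ||
  - Author | ||
  - Assignee | ||
  - Created/Updated time ranges | ||
  - More... | ||
- Supported issue metadata includes: | ||
  - State | ||
  - URL | ||
  - Labels | ||
  - Creation time | ||
  - Author | ||
  - Assignee | ||
|
||
#### GitLab Repository Reader | ||
- Read files from GitLab repositories | ||
- Supported features: | ||
  - Single file reading | ||
  - Directory traversal | ||
  - Recursive file listing | ||
  - File pattern filtering (glob patterns) | ||
- File metadata includes: | ||
  - File path | ||
  - File name | ||
  - Size | ||
  - URL | ||
  - Last commit ID | ||
  - Content SHA256 | ||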
|
||
### Usage | ||
|
||
#### Reading Issues | ||
|
||
Basic usage (read all open issues): | ||
```java | ||
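import java.util.List; | ||
import org.springframework.ai.document.Document; | ||
// GitLabIssueReader comes from this module. | ||
// Arguments: GitLab host URL, namespace, project name. | ||
GitLabIssueReader reader = new GitLabIssueReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
// get() returns the fetched issues as Spring AI documents. | ||
List<Document> documents = reader.get(); | ||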
``` | ||
|
||
Advanced filtering with configuration: | ||
```java | ||
import java.time.LocalDateTime; | ||
import java.util.Arrays; | ||
// Filter: closed issues, labels "bug" and "critical", created within the last 30 days. | ||
GitLabIssueConfig config = GitLabIssueConfig.builder() | ||
    .state(GitLabIssueState.CLOSED) | ||
    .labels(Arrays.asList("bug", "critical")) | ||
    .createdAfter(LocalDateTime.now().minusDays(30)) | ||
    .build(); | ||
|
||
GitLabIssueReader reader = new GitLabIssueReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name", | ||
    null, | ||
    config | ||
); | ||
List<Document> documents = reader.get(); | ||
``` | ||
|
||
#### Reading Repository Files | ||
|
||
Basic usage (read a single file): | ||
```java | ||
GitLabRepositoryReader reader = new GitLabRepositoryReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
List<Document> documents = reader.setRef("main") | ||
    .setFilePath("README.md") | ||
    .get(); | ||
``` | ||
|
||
Reading all Markdown files recursively: | ||
```java | ||
GitLabRepositoryReader reader = new GitLabRepositoryReader( | ||
    "https://gitlab.com", | ||
    "namespace", | ||
    "project-name" | ||
); | ||
List<Document> documents = reader.setRef("main") | ||
    .setPattern("**/*.md") | ||
    .setRecursive(true) | ||
    .get(); | ||
``` | ||
|
||
### Dependencies | ||
|
||
Add the following dependency to your project: | ||
|
||
```xml | ||
<dependency> | ||
<groupId>com.alibaba.cloud.ai</groupId> | ||
<artifactId>gitlab-document-reader</artifactId> | ||
<version>${spring-ai-alibaba.version}</version> | ||
</dependency> | ||
``` | ||
|
||
GitLab Document Reader uses the GitLab4J API library internally for GitLab integration; it is managed automatically as a transitive dependency. | ||
|
||
### Limitations | ||
|
||
- Only supports public repositories | ||
- Subject to GitLab API rate limits | ||
- Subject to GitLab API file size limits | ||
|
||
### License | ||
|
||
This project is licensed under the Apache License 2.0; see the LICENSE file for details. |
117 changes: 117 additions & 0 deletions
community/document-readers/gitlab-document-reader/pom.xml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
~ Copyright 2024-2025 the original author or authors. | ||
~ | ||
~ Licensed under the Apache License, Version 2.0 (the "License"); | ||
~ you may not use this file except in compliance with the License. | ||
~ You may obtain a copy of the License at | ||
~ | ||
~ https://www.apache.org/licenses/LICENSE-2.0 | ||
~ | ||
~ Unless required by applicable law or agreed to in writing, software | ||
~ distributed under the License is distributed on an "AS IS" BASIS, | ||
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
~ See the License for the specific language governing permissions and | ||
~ limitations under the License. | ||
--> | ||
|
||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>com.alibaba.cloud.ai</groupId> | ||
<artifactId>spring-ai-alibaba</artifactId> | ||
<version>${revision}</version> | ||
<relativePath>../../../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>gitlab-document-reader</artifactId> | ||
<name>gitlab-document-reader</name> | ||
<description>gitlab reader for Spring AI Alibaba</description> | ||
<packaging>jar</packaging> | ||
<url>https://github.com/alibaba/spring-ai-alibaba</url> | ||
<scm> | ||
<url>https://github.com/alibaba/spring-ai-alibaba</url> | ||
<connection>git://github.com/alibaba/spring-ai-alibaba.git</connection> | ||
<developerConnection>git@github.com:alibaba/spring-ai-alibaba.git</developerConnection> | ||
</scm> | ||
|
||
<properties> | ||
<maven.compiler.source>17</maven.compiler.source> | ||
<maven.compiler.target>17</maven.compiler.target> | ||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
<maven-deploy-plugin.version>3.1.1</maven-deploy-plugin.version> | ||
</properties> | ||
|
||
<dependencies> | ||
|
||
<dependency> | ||
<groupId>com.alibaba.cloud.ai</groupId> | ||
<artifactId>spring-ai-alibaba-core</artifactId> | ||
<version>${project.parent.version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.gitlab4j</groupId> | ||
<artifactId>gitlab4j-api</artifactId> | ||
<version>6.0.0-rc.8</version> | ||
</dependency> | ||
|
||
<!-- test dependencies --> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-starter-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>io.projectreactor</groupId> | ||
<artifactId>reactor-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>io.micrometer</groupId> | ||
<artifactId>micrometer-observation-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
</dependencies> | ||
|
||
<build> | ||
<plugins> | ||
<plugin> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-maven-plugin</artifactId> | ||
<version>${spring-boot.version}</version> | ||
</plugin> | ||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-deploy-plugin</artifactId> | ||
<version>${maven-deploy-plugin.version}</version> | ||
<configuration> | ||
<skip>true</skip> | ||
</configuration> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
|
||
<repositories> | ||
<repository> | ||
<id>spring-milestones</id> | ||
<name>Spring Milestones</name> | ||
<url>https://repo.spring.io/milestone</url> | ||
<snapshots> | ||
<enabled>false</enabled> | ||
</snapshots> | ||
</repository> | ||
</repositories> | ||
|
||
</project> |