-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(document-readers): add mysql implements #379
Changes from 2 commits
243b039
a113617
6f7396c
0cb25ae
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
# MySQL Document Reader | ||
|
||
MySQL Document Reader 是一个基于Spring AI的文档读取器实现,用于从MySQL数据库中读取数据并将其转换为文档格式。 | ||
|
||
MySQL Document Reader is a Spring AI-based document reader implementation that reads data from MySQL database and converts it into document format. | ||
|
||
## 特性 | Features | ||
|
||
- 使用纯JDBC实现,无需额外的连接池或ORM框架 | ||
- 支持自定义内容列和元数据列 | ||
- 完善的错误处理和资源自动关闭 | ||
- 支持自定义SQL查询 | ||
- 遵循Spring AI的Document接口规范 | ||
|
||
- Pure JDBC implementation, no additional connection pool or ORM framework required | ||
- Support for custom content and metadata columns | ||
- Comprehensive error handling and automatic resource cleanup | ||
- Support for custom SQL queries | ||
- Compliant with Spring AI Document interface specification | ||
|
||
## 依赖要求 | Dependencies | ||
|
||
```xml | ||
<dependencies> | ||
<!-- Spring AI Core --> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-core</artifactId> | ||
</dependency> | ||
|
||
<!-- MySQL JDBC Driver --> | ||
<dependency> | ||
<groupId>mysql</groupId> | ||
<artifactId>mysql-connector-java</artifactId> | ||
<version>8.0.33</version> | ||
</dependency> | ||
</dependencies> | ||
``` | ||
|
||
## 使用方法 | Usage | ||
|
||
### 1. 创建MySQL资源配置 | Create MySQL Resource Configuration | ||
|
||
```java | ||
MySQLResource resource = new MySQLResource( | ||
"localhost", // MySQL主机地址 | MySQL host address | ||
3306, // MySQL端口号 | MySQL port number | ||
"your_db", // 数据库名称 | Database name | ||
"username", // 用户名 | Username | ||
"password", // 密码 | Password | ||
"SELECT * FROM your_table", // SQL查询语句 | SQL query | ||
Arrays.asList("title", "content"), // 文档内容字段 | Document content fields | ||
Arrays.asList("id", "created_at") // 元数据字段 | Metadata fields | ||
); | ||
``` | ||
|
||
### 2. 创建文档读取器 | Create Document Reader | ||
|
||
```java | ||
MySQLDocumentReader reader = new MySQLDocumentReader(resource); | ||
``` | ||
|
||
### 3. 获取文档 | Get Documents | ||
|
||
```java | ||
List<Document> documents = reader.get(); | ||
``` | ||
|
||
### 4. 处理文档 | Process Documents | ||
|
||
```java | ||
for (Document doc : documents) { | ||
// 获取文档内容 | Get document content | ||
String content = doc.getContent(); | ||
|
||
// 获取元数据 | Get metadata | ||
Map<String, Object> metadata = doc.getMetadata(); | ||
|
||
// 进行后续处理 | Process further | ||
} | ||
``` | ||
|
||
## 配置说明 | Configuration | ||
|
||
### MySQLResource 参数 | Parameters | ||
|
||
| 参数 Parameter | 说明 Description | 默认值 Default | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 来个默认值?就是 127.0.0.1:3306 root root? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. resolved |
||
|------|------|--------| | ||
| host | MySQL服务器地址 MySQL server address | 无 None | | ||
| port | MySQL服务器端口 MySQL server port | 无 None | | ||
| database | 数据库名称 Database name | 无 None | | ||
| username | 用户名 Username | 无 None | | ||
| password | 密码 Password | 无 None | | ||
| query | SQL查询语句 SQL query | 无 None | | ||
| contentColumns | 文档内容字段列表 Document content field list | null (使用所有字段 use all fields) | | ||
| metadataColumns | 元数据字段列表 Metadata field list | null (不使用元数据 no metadata) | | ||
|
||
## 注意事项 | Notes | ||
|
||
1. 确保MySQL JDBC驱动已正确配置 | ||
Ensure MySQL JDBC driver is properly configured | ||
|
||
2. 建议使用合适的WHERE条件和LIMIT子句限制查询结果集大小 | ||
Recommend using appropriate WHERE conditions and LIMIT clauses to restrict result set size | ||
|
||
3. 对于大型结果集,注意内存使用情况 | ||
For large result sets, be mindful of memory usage | ||
|
||
4. 敏感信息(如密码)建议使用配置文件或环境变量管理 | ||
Sensitive information (like passwords) should be managed using configuration files or environment variables | ||
|
||
5. 建议在生产环境中使用连接池管理数据库连接 | ||
Recommend using connection pools for database connections in production environment | ||
|
||
## 示例 | Examples | ||
|
||
### 基本查询示例 | Basic Query Example | ||
|
||
```java | ||
MySQLResource resource = new MySQLResource( | ||
"localhost", | ||
3306, | ||
"test_db", | ||
"test_user", | ||
"test_password", | ||
"SELECT id, title, content FROM articles WHERE status = 'published' LIMIT 100", | ||
Arrays.asList("title", "content"), // 内容字段 | Content fields | ||
Arrays.asList("id") // 元数据字段 | Metadata fields | ||
); | ||
|
||
MySQLDocumentReader reader = new MySQLDocumentReader(resource); | ||
List<Document> documents = reader.get(); | ||
``` | ||
|
||
### 自定义查询示例 | Custom Query Example | ||
|
||
```java | ||
MySQLResource resource = new MySQLResource( | ||
"localhost", | ||
3306, | ||
"blog_db", | ||
"blog_user", | ||
"blog_password", | ||
""" | ||
SELECT | ||
p.id, | ||
p.title, | ||
p.content, | ||
u.username as author, | ||
p.created_at | ||
FROM posts p | ||
JOIN users u ON p.author_id = u.id | ||
WHERE p.status = 'published' | ||
AND p.created_at >= DATE_SUB(NOW(), INTERVAL 7 DAY) | ||
""", | ||
Arrays.asList("title", "content", "author"), // 内容字段 | Content fields | ||
Arrays.asList("id", "created_at") // 元数据字段 | Metadata fields | ||
); | ||
``` | ||
|
||
## 许可证 | License | ||
|
||
[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
~ Copyright 2024-2025 the original author or authors. | ||
~ | ||
~ Licensed under the Apache License, Version 2.0 (the "License"); | ||
~ you may not use this file except in compliance with the License. | ||
~ You may obtain a copy of the License at | ||
~ | ||
~ https://www.apache.org/licenses/LICENSE-2.0 | ||
~ | ||
~ Unless required by applicable law or agreed to in writing, software | ||
~ distributed under the License is distributed on an "AS IS" BASIS, | ||
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
~ See the License for the specific language governing permissions and | ||
~ limitations under the License. | ||
--> | ||
|
||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>com.alibaba.cloud.ai</groupId> | ||
<artifactId>spring-ai-alibaba</artifactId> | ||
<version>${revision}</version> | ||
<relativePath>../../../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>mysql-document-reader</artifactId> | ||
<name>Spring AI Alibaba MySQL Document Reader</name> | ||
<description>Spring AI Alibaba MySQL Document Reader Implementation</description> | ||
|
||
<dependencies> | ||
<!-- Spring AI Core --> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-core</artifactId> | ||
</dependency> | ||
|
||
<!-- MySQL JDBC Driver --> | ||
<dependency> | ||
<groupId>mysql</groupId> | ||
<artifactId>mysql-connector-java</artifactId> | ||
<version>8.0.33</version> | ||
</dependency> | ||
|
||
<!-- Test Dependencies --> | ||
<dependency> | ||
<groupId>org.junit.jupiter</groupId> | ||
<artifactId>junit-jupiter</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.mockito</groupId> | ||
<artifactId>mockito-core</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.assertj</groupId> | ||
<artifactId>assertj-core</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
|
||
</project> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 以空行结尾 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. resolved |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,159 @@ | ||
/* | ||
* Copyright 2024-2025 the original author or authors. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* https://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package com.alibaba.cloud.ai.reader.mysql; | ||
|
||
import org.springframework.ai.document.Document; | ||
import org.springframework.ai.document.DocumentReader; | ||
|
||
import java.sql.*; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 不要用全量引入,用增量引入 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. resolved |
||
import java.util.*; | ||
|
||
/** | ||
* MySQL document reader implementation Uses JDBC to connect and fetch data from MySQL | ||
* | ||
* @author brianxiadong | ||
**/ | ||
public class MySQLDocumentReader implements DocumentReader { | ||
|
||
private final MySQLResource mysqlResource; | ||
|
||
public MySQLDocumentReader(MySQLResource mysqlResource) { | ||
this.mysqlResource = mysqlResource; | ||
} | ||
|
||
@Override | ||
public List<Document> get() { | ||
List<Document> documents = new ArrayList<>(); | ||
try { | ||
// Register MySQL JDBC driver | ||
Class.forName("com.mysql.cj.jdbc.Driver"); | ||
|
||
// Create database connection | ||
try (Connection connection = createConnection()) { | ||
documents = executeQueryAndProcessResults(connection); | ||
} | ||
} | ||
catch (ClassNotFoundException e) { | ||
throw new RuntimeException("MySQL JDBC driver not found", e); | ||
} | ||
catch (SQLException e) { | ||
throw new RuntimeException("Error executing MySQL query: " + e.getMessage(), e); | ||
} | ||
return documents; | ||
} | ||
|
||
/** | ||
* Create database connection | ||
*/ | ||
private Connection createConnection() throws SQLException { | ||
return DriverManager.getConnection(mysqlResource.getJdbcUrl(), mysqlResource.getUsername(), | ||
mysqlResource.getPassword()); | ||
} | ||
|
||
/** | ||
* Execute query and process results | ||
*/ | ||
private List<Document> executeQueryAndProcessResults(Connection connection) throws SQLException { | ||
List<Document> documents = new ArrayList<>(); | ||
try (Statement statement = connection.createStatement(); | ||
ResultSet resultSet = statement.executeQuery(mysqlResource.getQuery())) { | ||
|
||
List<String> columnNames = getColumnNames(resultSet.getMetaData()); | ||
while (resultSet.next()) { | ||
Map<String, Object> rowData = extractRowData(resultSet, columnNames); | ||
String content = buildContent(rowData); | ||
Map<String, Object> metadata = buildMetadata(rowData); | ||
documents.add(new Document(content, metadata)); | ||
} | ||
} | ||
return documents; | ||
} | ||
|
||
/** | ||
* Get list of column names | ||
*/ | ||
private List<String> getColumnNames(ResultSetMetaData metaData) throws SQLException { | ||
List<String> columnNames = new ArrayList<>(); | ||
int columnCount = metaData.getColumnCount(); | ||
for (int i = 1; i <= columnCount; i++) { | ||
columnNames.add(metaData.getColumnName(i)); | ||
} | ||
return columnNames; | ||
} | ||
|
||
/** | ||
* Extract row data | ||
*/ | ||
private Map<String, Object> extractRowData(ResultSet resultSet, List<String> columnNames) throws SQLException { | ||
Map<String, Object> rowData = new HashMap<>(); | ||
for (int i = 0; i < columnNames.size(); i++) { | ||
String columnName = columnNames.get(i); | ||
Object value = resultSet.getObject(i + 1); | ||
rowData.put(columnName, value); | ||
} | ||
return rowData; | ||
} | ||
|
||
/** | ||
* Build document content | ||
*/ | ||
private String buildContent(Map<String, Object> rowData) { | ||
StringBuilder contentBuilder = new StringBuilder(); | ||
List<String> contentColumns = mysqlResource.getContentColumns(); | ||
|
||
if (contentColumns == null || contentColumns.isEmpty()) { | ||
// If no content columns specified, use all columns | ||
for (Map.Entry<String, Object> entry : rowData.entrySet()) { | ||
appendColumnContent(contentBuilder, entry.getKey(), entry.getValue()); | ||
} | ||
} | ||
else { | ||
// Only use specified content columns | ||
for (String column : contentColumns) { | ||
if (rowData.containsKey(column)) { | ||
appendColumnContent(contentBuilder, column, rowData.get(column)); | ||
} | ||
} | ||
} | ||
return contentBuilder.toString().trim(); | ||
} | ||
|
||
/** | ||
* Append column content | ||
*/ | ||
private void appendColumnContent(StringBuilder builder, String column, Object value) { | ||
builder.append(column).append(": ").append(value).append("\n"); | ||
} | ||
|
||
/** | ||
* Build metadata | ||
*/ | ||
private Map<String, Object> buildMetadata(Map<String, Object> rowData) { | ||
Map<String, Object> metadata = new HashMap<>(); | ||
metadata.put(MySQLResource.SOURCE, mysqlResource.getJdbcUrl()); | ||
|
||
List<String> metadataColumns = mysqlResource.getMetadataColumns(); | ||
if (metadataColumns != null) { | ||
for (String column : metadataColumns) { | ||
if (rowData.containsKey(column)) { | ||
metadata.put(column, rowData.get(column)); | ||
} | ||
} | ||
} | ||
return metadata; | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里需要引入版本吗?我记得 spring ai alibaba 引入 spring boot parent bom,里面是有 mysql jar 包的,可以复用?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
替换成了mysql-connector-j