Fix multiple issues

ZhangYonghao
2026-03-21 20:32:19 +08:00
parent f2c371b87d
commit 10d463a55f
12 changed files with 1021 additions and 275 deletions

README.md Normal file

@@ -0,0 +1,242 @@
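# Education Agent HTML Publishing Service
This service receives knowledge-point explanation HTML generated by an agent, saves the content on the server under a unique random filename, and returns a directly accessible link.
The current main implementation uses FastAPI, and the API is served under the `/api` prefix by default. Generated HTML files are stored in `backend/data/generated_html/` by default, and metadata is recorded in `backend/data/html_generator.db`.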
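## Use cases
- An education agent generates knowledge-point explanation pages
- Wrapping an HTML fragment produced by a model into a complete page
- Returning a shareable, standalone link to the agent
- Automatically cleaning up expired files after a set number of days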
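## Recommended integration
On the Tencent Cloud agent side, we recommend using this service's OpenAPI document directly:
- OpenAPI URL: `https://your-domain/openapi.json`
- API docs URL: `https://your-domain/docs`
If you use the Tencent Cloud agent's API plugin or HTTP tool node, we recommend importing the OpenAPI document above directly; otherwise, you can configure the request parameters by hand following this document.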
## API overview
### 1. Generate HTML
- Method: `POST`
- Path: `/api/html/generate`
- Content-Type: `application/json`
#### Request headers
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `Content-Type` | string | Yes | Always `application/json` |
| `X-API-Key` | string | No | Required when the server is configured with `API_KEY` |
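#### Request body parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `html_content` | string | Yes | HTML content generated by the agent; either a complete HTML document or an HTML fragment |
| `title` | string | No | Page title, at most 120 characters |
| `source` | string | No | Source identifier; we recommend the agent, workflow, or plugin name; at most 80 characters |
| `request_id` | string | No | Request trace ID for troubleshooting, at most 120 characters |
| `ttl_days` | integer | No | File retention in days; defaults to 7, maximum 30 |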
#### Compatible aliases
To ease migration from other agents or older configurations, the service also accepts the following aliases:
- `html` -> `html_content`
- `content` -> `html_content`
- `expire_days` -> `ttl_days`
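The alias table above can be read as a small renaming step applied before validation. The sketch below is illustrative only: `ALIASES` and `normalize_aliases` are hypothetical names for this example, and the choice to drop the alias key after copying it is an assumption of the sketch rather than documented service behavior.

```python
# Hypothetical helper mirroring the alias table above; not the service's own code.
ALIASES = {
    "html": "html_content",
    "content": "html_content",
    "expire_days": "ttl_days",
}


def normalize_aliases(payload: dict) -> dict:
    """Return a copy of payload with alias keys renamed to their canonical names."""
    normalized = dict(payload)
    for alias, target in ALIASES.items():
        # An explicitly provided canonical key always wins over its alias.
        if alias in normalized and target not in normalized:
            normalized[target] = normalized[alias]
        # Assumption of this sketch: the alias key itself is discarded.
        normalized.pop(alias, None)
    return normalized
```

With this shape, a legacy payload that sends `html` and `expire_days` ends up with `html_content` and `ttl_days`, while a payload that already uses the canonical names passes through unchanged.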
#### Request example
```json
{
  "title": "Pythagorean theorem explained",
  "source": "tencent-education-agent",
  "request_id": "lesson-math-20260321-001",
  "ttl_days": 7,
  "html_content": "<section><h1>Pythagorean theorem</h1><p>In a right triangle, a² + b² = c².</p></section>"
}
```
#### Successful response example
```json
{
  "message": "HTML file generated successfully",
  "unique_id": "vE7k7llv3zXc1A08",
  "url": "https://your-domain/api/html/vE7k7llv3zXc1A08/content",
  "query_url": "https://your-domain/api/html/vE7k7llv3zXc1A08",
  "title": "Pythagorean theorem explained",
  "source": "tencent-education-agent",
  "request_id": "lesson-math-20260321-001",
  "size_bytes": 1286,
  "created_at": "2026-03-21T12:00:00.000000",
  "expires_at": "2026-03-28T12:00:00.000000"
}
```
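#### Field descriptions
| Response field | Type | Description |
| --- | --- | --- |
| `message` | string | Description of the processing result |
| `unique_id` | string | Unique ID generated by the system |
| `url` | string | URL of the HTML page, which can be opened directly |
| `query_url` | string | URL for querying this record's metadata |
| `title` | string \| null | Page title |
| `source` | string \| null | Source identifier |
| `request_id` | string \| null | Request trace ID |
| `size_bytes` | integer | Actual size of the generated file, in bytes |
| `created_at` | string | Creation time (UTC) |
| `expires_at` | string | Expiry time (UTC) |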
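### 2. Query metadata of a generated HTML file
- Method: `GET`
- Path: `/api/html/{unique_id}`
The response has the same structure as the generate endpoint and can be used later to check whether a link is still valid.
### 3. Access the HTML content
- Method: `GET`
- Path: `/api/html/{unique_id}/content`
This URL returns `text/html` content directly. The agent side usually only needs the `url` field returned by the generate endpoint.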
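## Agent calling conventions
We recommend that the Tencent Cloud agent's tool description follow these rules:
1. `html_content` must be complete, displayable explanatory content. Prefer a structured explanation over a single sentence.
2. Prefer semantic tags such as `section`, `article`, `h1`, `h2`, `p`, `ul`, `ol`, and `table`.
3. If only an HTML fragment is generated, the server automatically adds a basic `html/head/body` wrapper.
4. Prefer static display content; do not write it as a front-end application.
5. On success, present the returned `url` directly; do not construct the filename yourself.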
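## Content restrictions and security rules
Security filtering is enabled by default, and the following content is rejected:
- `<script>`
- `<iframe>`
- `<object>` / `<embed>` / `<base>`
- `<form>`
- `<link>`
- `meta refresh`
- Inline event attributes, such as `onclick=`
- The `javascript:` protocol
This prevents an agent from publishing arbitrary dynamic scripts directly under the production domain.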
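### Recommended constraints
- Serve generated HTML from a separate, isolated domain, such as `html.example.com`
- Use HTTPS site-wide
- Do not set login-session cookies on that domain
- Do not set `ALLOW_UNSAFE_HTML` to `true` in production
- A single HTML payload is limited to `200000` bytes by default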
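Because the size limit is expressed in bytes rather than characters, non-ASCII content uses up the budget faster than its character count suggests. A minimal sketch, assuming UTF-8 encoding (which is what the validator in this commit measures); `utf8_size` is a hypothetical helper name:

```python
def utf8_size(html: str) -> int:
    """Return the size of html in bytes when encoded as UTF-8."""
    return len(html.encode("utf-8"))


ascii_size = utf8_size("abc")    # three ASCII characters
cjk_size = utf8_size("勾股定理")  # four CJK characters, three bytes each
```

A page written mostly in Chinese therefore reaches the default `200000` byte limit at roughly a third as many characters as an ASCII-only page.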
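## Common status codes
| Status code | Meaning |
| --- | --- |
| `201` | HTML created successfully |
| `400` | HTML content is unsafe or a business parameter is invalid |
| `401` | API key is incorrect |
| `404` | Record does not exist or has expired |
| `422` | Request body does not match the required format |
| `500` | Server error |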
## Running locally
### Backend
```bash
cd backend
pip install -r requirements.txt
python server.py
```
Alternatively, run uvicorn directly:
```bash
cd backend
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
### Frontend demo page
```bash
cd frontend
npm install
npm run dev
```
To point the demo at a specific backend address, set:
```bash
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
```
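## Environment variables
`backend/.env.example` provides an example; the core settings are:
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| `APP_NAME` | string | `HTML Knowledge API` | Service name |
| `API_PREFIX` | string | `/api` | API prefix |
| `PUBLIC_BASE_URL` | string | `http://localhost:8000` | Publicly reachable base URL of the service |
| `ALLOWED_ORIGINS` | JSON array / string | `["http://localhost:3000"]` | CORS allowlist; this is the value in `.env.example`, and the code falls back to `["*"]` when the variable is unset |
| `HTML_STORAGE_DIR` | string | `./data/generated_html` | Directory where HTML files are stored |
| `DEFAULT_RETENTION_DAYS` | integer | `7` | Default retention in days |
| `MAX_RETENTION_DAYS` | integer | `30` | Maximum allowed retention in days |
| `MAX_HTML_LENGTH` | integer | `200000` | Maximum size of a single HTML payload, in bytes |
| `API_KEY` | string | empty | Enables `X-API-Key` authentication when non-empty |
| `ALLOW_UNSAFE_HTML` | boolean | `false` | Disables HTML security filtering; not recommended in production |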
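`ALLOWED_ORIGINS` is listed as "JSON array / string", meaning either form is accepted. The sketch below shows one way such a value can be parsed, closely following `_get_list_env` from this commit's `config.py`; `parse_origins` is a hypothetical standalone name that takes the raw string instead of reading the environment.

```python
import json


def parse_origins(raw: str, default: list[str]) -> list[str]:
    """Parse a JSON array or a comma-separated string into a list of origins."""
    value = raw.strip()
    if not value:
        return default
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        parsed = None
    # A valid JSON array is used as-is, minus empty entries.
    if isinstance(parsed, list):
        return [str(item).strip() for item in parsed if str(item).strip()]
    # Anything else is treated as a comma-separated list.
    return [item.strip() for item in value.split(",") if item.strip()]
```

Accepting both forms lets the same variable work in `.env` files that use JSON syntax and in deployment dashboards where a plain comma-separated value is easier to type.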
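## Minimal parameters for the Tencent Cloud agent
If you want to keep the call as simple as possible, the agent can send only this one field and the request will still work:
```json
{
  "html_content": "<section><h1>Newton's first law</h1><p>An object on which no external force acts remains at rest or in uniform straight-line motion.</p></section>"
}
```
We recommend that the agent send at least:
```json
{
  "title": "Newton's first law explained",
  "source": "tencent-education-agent",
  "request_id": "physics-lesson-001",
  "html_content": "<section>...</section>"
}
```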
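## Directory overview
- `backend/app/main.py`: FastAPI application entry point
- `backend/app/routers/html.py`: HTML generation, query, and access endpoints
- `backend/app/schemas.py`: request and response models
- `backend/app/models.py`: database models
- `backend/server.py`: local startup script
- `frontend/app/page.tsx`: demo page for manual integration testing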
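## Suggested next steps
If you want to extend the service further, consider these three items first:
1. Add business-level call logging and audit records to the `generate` endpoint.
2. Give the agent a stricter HTML output prompt template to reduce non-compliant tags.
3. Set `PUBLIC_BASE_URL` to a dedicated subdomain to further isolate risk from the main site.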


@@ -1,12 +1,10 @@
-# Application settings
-APP_NAME="HTML Generator API"
+APP_NAME="HTML Knowledge API"
 API_PREFIX="/api"
-# Frontend settings
-FRONTEND_BASE_URL="http://localhost:3000"
-# Allowed CORS origins
-ALLOWED_ORIGINS=["*"]
-# Static file directory
-STATIC_DIR="../frontend/public/static"
+PUBLIC_BASE_URL="http://localhost:8000"
+ALLOWED_ORIGINS=["http://localhost:3000"]
+HTML_STORAGE_DIR="./data/generated_html"
+DEFAULT_RETENTION_DAYS=7
+MAX_RETENTION_DAYS=30
+MAX_HTML_LENGTH=200000
+API_KEY=""
+ALLOW_UNSAFE_HTML=false


@@ -1,12 +1,10 @@
-# Application settings
-APP_NAME="HTML Generator API"
+APP_NAME="HTML Knowledge API"
 API_PREFIX="/api"
-# Frontend settings
-FRONTEND_BASE_URL="http://localhost:3000"
-# Allowed CORS origins
-ALLOWED_ORIGINS=["*"]
-# Static file directory
-STATIC_DIR="../frontend/public/static"
+PUBLIC_BASE_URL="http://localhost:8000"
+ALLOWED_ORIGINS=["http://localhost:3000"]
+HTML_STORAGE_DIR="./data/generated_html"
+DEFAULT_RETENTION_DAYS=7
+MAX_RETENTION_DAYS=30
+MAX_HTML_LENGTH=200000
+API_KEY=""
+ALLOW_UNSAFE_HTML=false


@@ -1,3 +1,4 @@
+import json
 import os
 from pathlib import Path
@@ -7,16 +8,74 @@ backend_dir = Path(__file__).resolve().parent.parent
 load_dotenv(backend_dir / ".env")
+def _get_int_env(name: str, default: int) -> int:
+    value = os.getenv(name, "").strip()
+    if not value:
+        return default
+    try:
+        return int(value)
+    except ValueError:
+        return default
+def _get_bool_env(name: str, default: bool = False) -> bool:
+    value = os.getenv(name)
+    if value is None:
+        return default
+    return value.strip().lower() in {"1", "true", "yes", "on"}
+def _get_list_env(name: str, default: list[str]) -> list[str]:
+    value = os.getenv(name, "").strip()
+    if not value:
+        return default
+    try:
+        parsed = json.loads(value)
+    except json.JSONDecodeError:
+        parsed = None
+    if isinstance(parsed, list):
+        return [str(item).strip() for item in parsed if str(item).strip()]
+    return [item.strip() for item in value.split(",") if item.strip()]
+def _get_path_env(name: str, default: Path) -> Path:
+    value = os.getenv(name, "").strip()
+    if not value:
+        return default.resolve()
+    path = Path(value)
+    if not path.is_absolute():
+        path = backend_dir / path
+    return path.resolve()
 class Settings:
-    app_name = "HTML Generator API"
-    api_prefix = "/api"
+    app_name = os.getenv("APP_NAME", "HTML Knowledge API")
+    api_prefix = os.getenv("API_PREFIX", "/api").rstrip("/") or "/api"
     backend_dir = backend_dir
     data_dir = backend_dir / "data"
     database_path = data_dir / "html_generator.db"
     database_url = f"sqlite:///{database_path.as_posix()}"
-    allowed_origins = ["*"]
-    frontend_base_url = os.environ.get("FRONTEND_BASE_URL", "http://localhost:3000")
-    static_dir = backend_dir / "../frontend/public/static"
+    allowed_origins = _get_list_env("ALLOWED_ORIGINS", ["*"])
+    public_base_url = os.getenv("PUBLIC_BASE_URL", "http://localhost:8000").rstrip("/")
+    html_storage_dir = _get_path_env(
+        "HTML_STORAGE_DIR",
+        data_dir / "generated_html",
+    )
+    default_retention_days = max(1, _get_int_env("DEFAULT_RETENTION_DAYS", 7))
+    max_retention_days = max(
+        default_retention_days,
+        _get_int_env("MAX_RETENTION_DAYS", 30),
+    )
+    max_html_length = max(1024, _get_int_env("MAX_HTML_LENGTH", 200_000))
+    api_key = os.getenv("API_KEY", "").strip()
+    allow_unsafe_html = _get_bool_env("ALLOW_UNSAFE_HTML", False)
 settings = Settings()


@@ -1,6 +1,6 @@
 from collections.abc import Generator
-from sqlalchemy import create_engine
+from sqlalchemy import create_engine, inspect, text
 from sqlalchemy.orm import declarative_base, sessionmaker
 from app.config import settings
@@ -15,6 +15,36 @@ SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
 Base = declarative_base()
+def ensure_database_schema() -> None:
+    inspector = inspect(engine)
+    if "html_files" not in inspector.get_table_names():
+        return
+    existing_columns = {
+        column["name"]
+        for column in inspector.get_columns("html_files")
+    }
+    column_migrations = {
+        "title": "ALTER TABLE html_files ADD COLUMN title VARCHAR(120)",
+        "source": "ALTER TABLE html_files ADD COLUMN source VARCHAR(80)",
+        "request_id": "ALTER TABLE html_files ADD COLUMN request_id VARCHAR(120)",
+        "size_bytes": "ALTER TABLE html_files ADD COLUMN size_bytes INTEGER",
+        "expires_at": "ALTER TABLE html_files ADD COLUMN expires_at DATETIME",
+    }
+    with engine.begin() as connection:
+        for column_name, statement in column_migrations.items():
+            if column_name not in existing_columns:
+                connection.execute(text(statement))
+        connection.execute(
+            text(
+                "CREATE INDEX IF NOT EXISTS ix_html_files_expires_at "
+                "ON html_files (expires_at)"
+            )
+        )
 def get_db() -> Generator:
     db = SessionLocal()
     try:


@@ -4,9 +4,9 @@ from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
 from app.config import settings
-from app.database import Base, engine, SessionLocal
-from app.models import HTMLFile
+from app.database import Base, SessionLocal, engine, ensure_database_schema
 from app.routers import html
+from app.routers.html import cleanup_expired_files
 logging.basicConfig(
     level=logging.INFO,
@@ -14,18 +14,19 @@ logging.basicConfig(
 )
 logger = logging.getLogger(__name__)
+settings.html_storage_dir.mkdir(parents=True, exist_ok=True)
 Base.metadata.create_all(bind=engine)
+ensure_database_schema()
-# Delete expired records
-db = SessionLocal()
-try:
-    deleted_count = HTMLFile.delete_expired_records(db)
-    if deleted_count > 0:
-        logger.info(f"Deleted {deleted_count} expired HTML file records")
-finally:
-    db.close()
-app = FastAPI(title=settings.app_name)
+app = FastAPI(
+    title=settings.app_name,
+    version="2.0.0",
+    description=(
+        "Store agent-generated educational HTML pages and return a direct access URL. "
+        "The generated OpenAPI document can be imported directly into Tencent Cloud "
+        "Agent plugins."
+    ),
+)
 app.add_middleware(
     CORSMiddleware,
@@ -38,6 +39,20 @@ app.add_middleware(
 app.include_router(html.router, prefix=settings.api_prefix)
-@app.get("/")
+@app.on_event("startup")
+def cleanup_on_startup() -> None:
+    db = SessionLocal()
+    try:
+        deleted_count = cleanup_expired_files(db)
+        if deleted_count > 0:
+            logger.info("Deleted %s expired HTML files during startup", deleted_count)
+    finally:
+        db.close()
+@app.get("/", summary="Health check")
 def health_check() -> dict[str, str]:
-    return {"message": "HTML Generator API is running"}
+    return {
+        "message": "HTML Knowledge API is running",
+        "openapi_url": "/openapi.json",
+    }


@@ -1,7 +1,7 @@
 from datetime import datetime, timedelta
-from sqlalchemy import DateTime, Integer, String
-from sqlalchemy.orm import Mapped, mapped_column, Session
+from sqlalchemy import DateTime, Integer, String, and_, or_
+from sqlalchemy.orm import Mapped, Session, mapped_column
 from app.database import Base
@@ -11,20 +11,57 @@ class HTMLFile(Base):
     id: Mapped[int] = mapped_column(Integer, primary_key=True, index=True)
     unique_id: Mapped[str] = mapped_column(
-        String(32), unique=True, index=True, nullable=False
+        String(32),
+        unique=True,
+        index=True,
+        nullable=False,
     )
     filename: Mapped[str] = mapped_column(String(255), nullable=False)
+    title: Mapped[str | None] = mapped_column(String(120), nullable=True)
+    source: Mapped[str | None] = mapped_column(String(80), nullable=True)
+    request_id: Mapped[str | None] = mapped_column(String(120), nullable=True)
+    size_bytes: Mapped[int | None] = mapped_column(Integer, nullable=True)
     created_at: Mapped[datetime] = mapped_column(
-        DateTime, default=datetime.utcnow, nullable=False
+        DateTime,
+        default=datetime.utcnow,
+        nullable=False,
+    )
+    expires_at: Mapped[datetime | None] = mapped_column(
+        DateTime,
+        nullable=True,
+        index=True,
     )
     updated_at: Mapped[datetime] = mapped_column(
-        DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False
+        DateTime,
+        default=datetime.utcnow,
+        onupdate=datetime.utcnow,
+        nullable=False,
     )
     @classmethod
-    def delete_expired_records(cls, db: Session, days: int = 5) -> int:
-        """Delete records older than the given number of days."""
-        cutoff_date = datetime.utcnow() - timedelta(days=days)
-        deleted = db.query(cls).filter(cls.created_at < cutoff_date).delete()
-        db.commit()
-        return deleted
+    def list_expired_records(
+        cls,
+        db: Session,
+        default_retention_days: int,
+    ) -> list["HTMLFile"]:
+        now = datetime.utcnow()
+        fallback_cutoff = now - timedelta(days=default_retention_days)
+        return (
+            db.query(cls)
+            .filter(
+                or_(
+                    cls.expires_at < now,
+                    and_(
+                        cls.expires_at.is_(None),
+                        cls.created_at < fallback_cutoff,
+                    ),
+                )
+            )
+            .all()
+        )
+    def resolved_expires_at(self, default_retention_days: int) -> datetime:
+        if self.expires_at is not None:
+            return self.expires_at
+        return self.created_at + timedelta(days=default_retention_days)


@@ -1,8 +1,13 @@
-import os
-import secrets
 import logging
+import re
+import secrets
+import tempfile
+from datetime import datetime, timedelta
+from html import escape
+from pathlib import Path
-from fastapi import APIRouter, HTTPException, status, Depends
+from fastapi import APIRouter, Depends, Header, HTTPException, status
+from fastapi.responses import FileResponse
 from sqlalchemy.orm import Session
 from app.config import settings
@@ -13,74 +18,308 @@ from app.schemas import HTMLGenerateRequest, HTMLGenerateResponse
 router = APIRouter(prefix="/html", tags=["html"])
 logger = logging.getLogger(__name__)
+DANGEROUS_HTML_PATTERNS = (
+    (re.compile(r"<\s*script\b", re.IGNORECASE), "script tags are not allowed"),
+    (re.compile(r"<\s*iframe\b", re.IGNORECASE), "iframe tags are not allowed"),
+    (re.compile(r"<\s*(?:object|embed|base)\b", re.IGNORECASE), "embedded active content is not allowed"),
+    (re.compile(r"<\s*form\b", re.IGNORECASE), "form tags are not allowed"),
+    (re.compile(r"<\s*link\b", re.IGNORECASE), "external stylesheet or import tags are not allowed"),
+    (
+        re.compile(r"<\s*meta\b[^>]*http-equiv\s*=\s*['\"]?\s*refresh", re.IGNORECASE),
+        "automatic refresh or redirect is not allowed",
+    ),
+    (re.compile(r"\son[a-z]+\s*=", re.IGNORECASE), "inline event handlers are not allowed"),
+    (re.compile(r"javascript\s*:", re.IGNORECASE), "javascript URLs are not allowed"),
+)
-def generate_unique_id() -> str:
-    return secrets.token_urlsafe(16)
+CONTENT_SECURITY_POLICY = "; ".join(
+    [
+        "default-src 'none'",
+        "img-src 'self' data: https:",
+        "style-src 'unsafe-inline'",
+        "font-src 'self' data: https:",
+        "media-src https:",
+        "script-src 'none'",
+        "connect-src 'none'",
+        "object-src 'none'",
+        "base-uri 'none'",
+        "form-action 'none'",
+        "frame-ancestors 'none'",
+    ]
+)
-@router.post("/generate", response_model=HTMLGenerateResponse, status_code=status.HTTP_201_CREATED)
-def generate_html(request: HTMLGenerateRequest, db: Session = Depends(get_db)):
+def require_api_key(x_api_key: str | None = Header(default=None, alias="X-API-Key")) -> None:
+    if not settings.api_key:
+        return
+    if x_api_key != settings.api_key:
+        raise HTTPException(
+            status_code=status.HTTP_401_UNAUTHORIZED,
+            detail="Invalid API key",
+        )
+def build_content_url(unique_id: str) -> str:
+    return f"{settings.public_base_url}{settings.api_prefix}/html/{unique_id}/content"
+def build_query_url(unique_id: str) -> str:
+    return f"{settings.public_base_url}{settings.api_prefix}/html/{unique_id}"
+def generate_unique_id(db: Session) -> str:
+    for _ in range(10):
+        unique_id = secrets.token_urlsafe(12).replace("-", "").replace("_", "")
+        if not db.query(HTMLFile.id).filter(HTMLFile.unique_id == unique_id).first():
+            return unique_id
+    raise RuntimeError("Unable to generate a unique id")
+def build_html_document(raw_html: str, title: str | None) -> str:
+    normalized_html = raw_html.strip()
+    if re.search(r"<!doctype\s+html|<html\b", normalized_html, re.IGNORECASE):
+        return normalized_html
+    escaped_title = escape(title or "Knowledge point explanation")
+    return f"""<!DOCTYPE html>
+<html lang="zh-CN">
+<head>
+<meta charset="utf-8" />
+<meta name="viewport" content="width=device-width, initial-scale=1" />
+<title>{escaped_title}</title>
+<style>
+:root {{
+color-scheme: light;
+}}
+* {{
+box-sizing: border-box;
+}}
+body {{
+margin: 0;
+background: #f5f7fb;
+color: #18202a;
+font-family: "PingFang SC", "Microsoft YaHei", sans-serif;
+line-height: 1.75;
+}}
+main {{
+max-width: 960px;
+margin: 0 auto;
+padding: 32px 20px 48px;
+}}
+</style>
+</head>
+<body>
+<main>
+{normalized_html}
+</main>
+</body>
+</html>
+"""
+def validate_html_safety(html_content: str) -> None:
+    if settings.allow_unsafe_html:
+        return
+    for pattern, message in DANGEROUS_HTML_PATTERNS:
+        if pattern.search(html_content):
+            raise HTTPException(
+                status_code=status.HTTP_400_BAD_REQUEST,
+                detail=f"Unsafe HTML rejected: {message}",
+            )
+def write_html_file(target_path: Path, html_content: str) -> None:
+    target_path.parent.mkdir(parents=True, exist_ok=True)
+    temporary_path: Path | None = None
     try:
-        # Delete expired records first
-        deleted_count = HTMLFile.delete_expired_records(db)
+        with tempfile.NamedTemporaryFile(
+            "w",
+            encoding="utf-8",
+            delete=False,
+            dir=target_path.parent,
+            suffix=".tmp",
+        ) as temporary_file:
+            temporary_file.write(html_content)
+            temporary_path = Path(temporary_file.name)
+        temporary_path.replace(target_path)
+    finally:
+        if temporary_path and temporary_path.exists():
+            temporary_path.unlink(missing_ok=True)
+def delete_stored_file(filename: str) -> None:
+    file_path = settings.html_storage_dir / filename
+    if file_path.exists():
+        file_path.unlink(missing_ok=True)
+def cleanup_expired_files(db: Session) -> int:
+    expired_records = HTMLFile.list_expired_records(
+        db,
+        settings.default_retention_days,
+    )
+    if not expired_records:
+        return 0
+    for record in expired_records:
+        delete_stored_file(record.filename)
+        db.delete(record)
+    db.commit()
+    return len(expired_records)
+def get_record_or_404(unique_id: str, db: Session) -> HTMLFile:
+    html_file = (
+        db.query(HTMLFile)
+        .filter(HTMLFile.unique_id == unique_id)
+        .first()
+    )
+    if html_file is None:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="HTML file not found",
+        )
+    expires_at = html_file.resolved_expires_at(settings.default_retention_days)
+    if expires_at <= datetime.utcnow():
+        delete_stored_file(html_file.filename)
+        db.delete(html_file)
+        db.commit()
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="HTML file has expired",
+        )
+    return html_file
+def build_response(html_file: HTMLFile) -> HTMLGenerateResponse:
+    return HTMLGenerateResponse(
+        message="HTML file generated successfully",
+        unique_id=html_file.unique_id,
+        url=build_content_url(html_file.unique_id),
+        query_url=build_query_url(html_file.unique_id),
+        title=html_file.title,
+        source=html_file.source,
+        request_id=html_file.request_id,
+        size_bytes=html_file.size_bytes or 0,
+        created_at=html_file.created_at,
+        expires_at=html_file.resolved_expires_at(settings.default_retention_days),
+    )
+@router.post(
+    "/generate",
+    response_model=HTMLGenerateResponse,
+    status_code=status.HTTP_201_CREATED,
+    summary="Generate and publish an HTML explanation page",
+    description=(
+        "Accepts agent-generated HTML, stores it with a unique random filename, "
+        "and returns a direct access URL."
+    ),
+)
+def generate_html(
+    request: HTMLGenerateRequest,
+    _: None = Depends(require_api_key),
+    db: Session = Depends(get_db),
+) -> HTMLGenerateResponse:
+    html_path: Path | None = None
+    try:
+        deleted_count = cleanup_expired_files(db)
         if deleted_count > 0:
-            logger.info(f"Deleted {deleted_count} expired HTML file records")
-        # Generate a unique ID
-        unique_id = generate_unique_id()
-        # Make sure the static file directory exists
-        static_dir = settings.static_dir.resolve()
-        static_dir.mkdir(parents=True, exist_ok=True)
-        # Build the HTML file path
+            logger.info("Deleted %s expired HTML files", deleted_count)
+        validate_html_safety(request.html_content)
+        unique_id = generate_unique_id(db)
         html_filename = f"{unique_id}.html"
-        html_path = static_dir / html_filename
-        # Write the HTML content
-        with open(html_path, "w", encoding="utf-8") as f:
-            f.write(request.html_content)
-        # Save to the database
+        html_path = settings.html_storage_dir / html_filename
+        html_document = build_html_document(request.html_content, request.title)
+        expires_at = datetime.utcnow() + timedelta(
+            days=request.ttl_days or settings.default_retention_days
+        )
+        size_bytes = len(html_document.encode("utf-8"))
+        write_html_file(html_path, html_document)
         html_file = HTMLFile(
             unique_id=unique_id,
             filename=html_filename,
+            title=request.title,
+            source=request.source,
+            request_id=request.request_id,
+            size_bytes=size_bytes,
+            expires_at=expires_at,
         )
         db.add(html_file)
         db.commit()
         db.refresh(html_file)
-        # Build the full URL
-        html_url = f"{settings.frontend_base_url}/static/{html_filename}"
-        return HTMLGenerateResponse(
-            message="HTML file generated successfully",
-            unique_id=unique_id,
-            url=html_url
-        )
-    except Exception as e:
-        logger.error(f"Failed to generate HTML file: {str(e)}")
+        return build_response(html_file)
+    except HTTPException:
+        raise
+    except Exception as exc:
+        logger.exception("Failed to generate HTML file")
+        db.rollback()
+        if html_path and html_path.exists():
+            html_path.unlink(missing_ok=True)
         raise HTTPException(
             status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=f"Failed to generate HTML file: {str(e)}"
-        )
+            detail=f"Failed to generate HTML file: {exc}",
+        ) from exc
-@router.get("/{unique_id}")
-def get_html_file(unique_id: str, db: Session = Depends(get_db)):
-    html_file = db.query(HTMLFile).filter(HTMLFile.unique_id == unique_id).first()
-    if not html_file:
+@router.get(
+    "/{unique_id}",
+    response_model=HTMLGenerateResponse,
+    summary="Query metadata for a generated HTML file",
+)
+def get_html_file(unique_id: str, db: Session = Depends(get_db)) -> HTMLGenerateResponse:
+    html_file = get_record_or_404(unique_id, db)
+    file_path = settings.html_storage_dir / html_file.filename
+    if not file_path.exists():
+        db.delete(html_file)
+        db.commit()
         raise HTTPException(
             status_code=status.HTTP_404_NOT_FOUND,
-            detail="HTML file does not exist"
+            detail="HTML file has been removed from storage",
         )
-    # Build the full URL
-    html_url = f"{settings.frontend_base_url}/static/{html_file.filename}"
-    return {
-        "message": "HTML file queried successfully",
-        "unique_id": html_file.unique_id,
-        "url": html_url
-    }
+    return build_response(html_file)
+@router.get(
+    "/{unique_id}/content",
+    summary="Serve the generated HTML content",
+    response_description="The generated HTML page",
+)
+def get_html_content(unique_id: str, db: Session = Depends(get_db)) -> FileResponse:
+    html_file = get_record_or_404(unique_id, db)
+    file_path = settings.html_storage_dir / html_file.filename
+    if not file_path.exists():
+        db.delete(html_file)
+        db.commit()
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="HTML file has been removed from storage",
+        )
+    return FileResponse(
+        path=file_path,
+        media_type="text/html",
+        headers={
+            "Content-Security-Policy": CONTENT_SECURITY_POLICY,
+            "X-Content-Type-Options": "nosniff",
+            "Referrer-Policy": "no-referrer",
+            "Cache-Control": "public, max-age=300",
+        },
+    )


@@ -1,23 +1,95 @@
 from datetime import datetime
-from pydantic import BaseModel
+from pydantic import BaseModel, Field, root_validator, validator
+from app.config import settings
 class HTMLGenerateRequest(BaseModel):
-    html_content: str
+    html_content: str = Field(
+        ...,
+        description="Required HTML content or HTML fragment.",
+    )
+    title: str | None = Field(
+        default=None,
+        max_length=120,
+        description="Optional page title shown in the generated HTML document.",
+    )
+    source: str | None = Field(
+        default=None,
+        max_length=80,
+        description="Optional source identifier such as a Tencent agent name.",
+    )
+    request_id: str | None = Field(
+        default=None,
+        max_length=120,
+        description="Optional trace id used for debugging and log correlation.",
+    )
+    ttl_days: int | None = Field(
+        default=None,
+        ge=1,
+        description="Optional retention days for the file.",
+    )
+    @root_validator(pre=True)
+    def normalize_aliases(cls, values: dict) -> dict:
+        alias_map = {
+            "html": "html_content",
+            "content": "html_content",
+            "expire_days": "ttl_days",
+        }
+        for alias, target in alias_map.items():
+            if target not in values and alias in values:
+                values[target] = values[alias]
+        return values
+    @validator("html_content")
+    def validate_html_content(cls, value: str) -> str:
+        content = value.strip()
+        if not content:
+            raise ValueError("html_content cannot be empty")
+        if len(content.encode("utf-8")) > settings.max_html_length:
+            raise ValueError(
+                f"html_content exceeds the limit of {settings.max_html_length} bytes"
+            )
+        return content
+    @validator("title", "source", "request_id")
+    def normalize_optional_text(cls, value: str | None) -> str | None:
+        if value is None:
+            return None
+        normalized = value.strip()
+        return normalized or None
+    @validator("ttl_days")
+    def validate_ttl_days(cls, value: int | None) -> int | None:
+        if value is None:
+            return None
+        if value > settings.max_retention_days:
+            raise ValueError(
+                f"ttl_days cannot be greater than {settings.max_retention_days}"
+            )
+        return value
+    class Config:
+        extra = "ignore"
 class HTMLGenerateResponse(BaseModel):
     message: str
     unique_id: str
-    url: str
-class HTMLFileResponse(BaseModel):
-    id: int
-    unique_id: str
-    filename: str
+    url: str = Field(description="Direct URL that serves the generated HTML content.")
+    query_url: str = Field(description="Metadata URL for querying the generated record.")
+    title: str | None = None
+    source: str | None = None
+    request_id: str | None = None
+    size_bytes: int
     created_at: datetime
-    updated_at: datetime
-    class Config:
-        from_attributes = True
+    expires_at: datetime


@@ -1,87 +1,12 @@
-import http.server
-import socketserver
-import json
 import os
-import random
-import string
-from urllib.parse import urlparse, parse_qs
-PORT = 8000
+import uvicorn
-# Generate a unique ID
-def generate_unique_id():
-    return ''.join(random.choices(string.ascii_letters + string.digits, k=16))
-# Make sure the static file directory exists
-static_dir = os.path.join(os.path.dirname(__file__), '../frontend/public/static')
-if not os.path.exists(static_dir):
-    os.makedirs(static_dir, exist_ok=True)
-class MyHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
-    def do_POST(self):
-        if self.path == '/api/html/generate':
-            # Read the request body
-            content_length = int(self.headers['Content-Length'])
-            post_data = self.rfile.read(content_length)
-            try:
-                # Parse the JSON data
-                data = json.loads(post_data)
-                html_content = data.get('html_content', '')
-                if not html_content:
-                    self.send_response(400)
-                    self.send_header('Content-type', 'application/json')
-                    self.end_headers()
-                    self.wfile.write(json.dumps({'error': 'HTML content must not be empty'}).encode('utf-8'))
-                    return
-                # Generate a unique ID
-                unique_id = generate_unique_id()
-                # Build the HTML file path
-                html_filename = f"{unique_id}.html"
-                html_path = os.path.join(static_dir, html_filename)
-                # Write the HTML content
-                with open(html_path, 'w', encoding='utf-8') as f:
-                    f.write(html_content)
-                # Build the full URL
-                frontend_base_url = 'http://localhost:3000'
-                html_url = f"{frontend_base_url}/static/{html_filename}"
-                # Send the response
-                self.send_response(201)
-                self.send_header('Content-type', 'application/json')
-                self.end_headers()
-                response = {
-                    'message': 'HTML file generated successfully',
-                    'unique_id': unique_id,
-                    'url': html_url
-                }
-                self.wfile.write(json.dumps(response).encode('utf-8'))
-            except Exception as e:
-                self.send_response(500)
-                self.send_header('Content-type', 'application/json')
-                self.end_headers()
-                self.wfile.write(json.dumps({'error': f'Failed to generate HTML file: {str(e)}'}).encode('utf-8'))
-        else:
-            self.send_response(404)
-            self.end_headers()
-    def do_GET(self):
-        if self.path == '/':
-            # Health check
-            self.send_response(200)
-            self.send_header('Content-type', 'application/json')
-            self.end_headers()
-            self.wfile.write(json.dumps({'message': 'HTML Generator API is running'}).encode('utf-8'))
-        else:
-            # Static file serving
-            super().do_GET()
 if __name__ == "__main__":
-    with socketserver.TCPServer(("", PORT), MyHTTPRequestHandler) as httpd:
-        print(f"Server running at http://localhost:{PORT}")
-        httpd.serve_forever()
+    uvicorn.run(
+        "app.main:app",
+        host=os.getenv("HOST", "0.0.0.0"),
+        port=int(os.getenv("PORT", "8000")),
+        reload=os.getenv("RELOAD", "").strip().lower() in {"1", "true", "yes", "on"},
+    )


@@ -1,32 +1,37 @@
-import type { Metadata } from 'next';
-import './globals.css';
+import type { Metadata } from "next";
+import type { ReactNode } from "react";
+import "./globals.css";
 export const metadata: Metadata = {
-  title: 'HTML Generator',
-  description: 'Generate HTML files and return accessible links',
+  title: "HTML Knowledge API",
+  description: "Saves agent-generated HTML pages and returns an accessible link.",
 };
 export default function RootLayout({
   children,
-}: {
-  children: React.ReactNode
-}) {
+}: Readonly<{
+  children: ReactNode;
+}>) {
   return (
     <html lang="zh-CN">
       <body>
-        <div className="min-h-screen flex flex-col">
-          <header className="bg-primary text-primary-foreground py-4 px-6 shadow-md">
-            <div className="container mx-auto">
-              <h1 className="text-2xl font-bold">HTML Generator</h1>
+        <div className="flex min-h-screen flex-col">
+          <header className="border-b border-border bg-card/90 px-6 py-5 backdrop-blur">
+            <div className="mx-auto flex max-w-6xl items-center justify-between">
+              <div>
+                <p className="text-xs uppercase tracking-[0.35em] text-muted-foreground">
+                  Education Agent
+                </p>
+                <h1 className="text-2xl font-bold text-foreground">
+                  HTML Knowledge API
+                </h1>
+              </div>
             </div>
           </header>
-          <main className="flex-1 container mx-auto py-8 px-6">
-            {children}
-          </main>
-          <footer className="bg-muted text-muted-foreground py-4 px-6 border-t">
-            <div className="container mx-auto text-center">
-              <p>© 2026 HTML Generator</p>
+          <main className="flex-1 px-6 py-10">{children}</main>
+          <footer className="border-t border-border bg-card px-6 py-5 text-sm text-muted-foreground">
+            <div className="mx-auto max-w-6xl">
             </div>
           </footer>
         </div>


@@ -1,88 +1,214 @@
-import { useState } from 'react';
+"use client";
+import { FormEvent, useState } from "react";
+type GenerateResponse = {
+  message: string;
+  unique_id: string;
+  url: string;
+  query_url: string;
+  title?: string | null;
+  source?: string | null;
+  request_id?: string | null;
+  size_bytes: number;
+  created_at: string;
+  expires_at: string;
+};
+const apiBaseUrl =
+  process.env.NEXT_PUBLIC_API_BASE_URL?.replace(/\/$/, "") ||
+  "http://localhost:8000";
+const initialHtml = `<section>
+<h1>Pythagorean theorem</h1>
+<p>In a right triangle, if the two legs have lengths a and b and the hypotenuse has length c, then a² + b² = c².</p>
+<ul>
+<li>Applies to: right triangles</li>
+<li>Core relation: the sum of the squares of the two legs equals the square of the hypotenuse</li>
+<li>Common uses: finding side lengths, checking whether a triangle is a right triangle</li>
+</ul>
+</section>`;
 export default function Home() {
-  const [htmlContent, setHtmlContent] = useState('');
+  const [title, setTitle] = useState("Knowledge point explanation");
+  const [source, setSource] = useState("tencent-agent");
+  const [requestId, setRequestId] = useState("");
+  const [htmlContent, setHtmlContent] = useState(initialHtml);
   const [loading, setLoading] = useState(false);
-  const [result, setResult] = useState<{ url: string } | null>(null);
+  const [result, setResult] = useState<GenerateResponse | null>(null);
   const [error, setError] = useState<string | null>(null);
-  const handleSubmit = async (e: React.FormEvent) => {
-    e.preventDefault();
+  const handleSubmit = async (event: FormEvent<HTMLFormElement>) => {
+    event.preventDefault();
     setLoading(true);
     setError(null);
     setResult(null);
     try {
-      const response = await fetch('http://localhost:8000/api/html/generate', {
-        method: 'POST',
+      const response = await fetch(`${apiBaseUrl}/api/html/generate`, {
+        method: "POST",
         headers: {
-          'Content-Type': 'application/json',
+          "Content-Type": "application/json",
         },
-        body: JSON.stringify({ html_content: htmlContent }),
+        body: JSON.stringify({
+          title,
+          source,
+          request_id: requestId || undefined,
+          html_content: htmlContent,
+        }),
       });
+      const data = await response.json();
       if (!response.ok) {
-        throw new Error('Failed to generate HTML file');
+        throw new Error(data.detail || data.message || "Failed to generate HTML");
       }
-      const data = await response.json();
-      setResult({ url: data.url });
+      setResult(data);
+    } catch (submissionError) {
} catch (err) { setError(
setError('生成 HTML 文件失败,请稍后重试'); submissionError instanceof Error
console.error(err); ? submissionError.message
: "生成 HTML 失败,请稍后重试。"
);
} finally { } finally {
setLoading(false); setLoading(false);
} }
}; };
return ( return (
<div className="max-w-4xl mx-auto"> <div className="mx-auto max-w-5xl space-y-8">
<h2 className="text-2xl font-bold mb-6">HTML </h2> <section className="rounded-3xl border border-border bg-card p-8 shadow-sm">
<form onSubmit={handleSubmit} className="space-y-6"> <div className="max-w-3xl space-y-3">
<div> <p className="text-sm uppercase tracking-[0.3em] text-muted-foreground">
<label htmlFor="html-content" className="block text-sm font-medium mb-2"> HTML Explanation API
HTML </p>
</label> <h2 className="text-3xl font-bold text-foreground">
<textarea HTML
id="html-content" </h2>
value={htmlContent} <p className="text-muted-foreground">
onChange={(e) => setHtmlContent(e.target.value)} HTML
rows={10} </p>
className="w-full p-4 border border-border rounded-md bg-card focus:outline-none focus:ring-2 focus:ring-primary"
placeholder="请输入 HTML 内容..."
/>
</div> </div>
<div> </section>
<form
onSubmit={handleSubmit}
className="space-y-6 rounded-3xl border border-border bg-card p-8 shadow-sm"
>
<div className="grid gap-4 md:grid-cols-2">
<label className="space-y-2">
<span className="text-sm font-medium">title</span>
<input
value={title}
onChange={(event) => setTitle(event.target.value)}
className="w-full rounded-2xl border border-border bg-background px-4 py-3 outline-none ring-0 transition focus:border-primary"
placeholder="页面标题"
/>
</label>
<label className="space-y-2">
<span className="text-sm font-medium">source</span>
<input
value={source}
onChange={(event) => setSource(event.target.value)}
className="w-full rounded-2xl border border-border bg-background px-4 py-3 outline-none ring-0 transition focus:border-primary"
placeholder="来源,例如 tencent-agent"
/>
</label>
</div>
<label className="block space-y-2">
<span className="text-sm font-medium">request_id</span>
<input
value={requestId}
onChange={(event) => setRequestId(event.target.value)}
className="w-full rounded-2xl border border-border bg-background px-4 py-3 outline-none ring-0 transition focus:border-primary"
placeholder="可选的请求追踪 ID"
/>
</label>
<label className="block space-y-2">
<span className="text-sm font-medium">html_content</span>
<textarea
value={htmlContent}
onChange={(event) => setHtmlContent(event.target.value)}
rows={18}
className="w-full rounded-3xl border border-border bg-background px-4 py-4 font-mono text-sm outline-none ring-0 transition focus:border-primary"
placeholder="请输入完整 HTML 或 HTML 片段"
/>
</label>
<div className="flex flex-wrap items-center gap-4">
<button <button
type="submit" type="submit"
disabled={loading || !htmlContent.trim()} disabled={loading || !htmlContent.trim()}
className="bg-primary text-primary-foreground px-6 py-2 rounded-md hover:bg-primary/90 disabled:opacity-50 disabled:cursor-not-allowed transition-colors" className="rounded-full bg-primary px-6 py-3 text-sm font-semibold text-primary-foreground transition hover:opacity-90 disabled:cursor-not-allowed disabled:opacity-50"
> >
{loading ? '生成中...' : '生成 HTML 文件'} {loading ? "生成中..." : "生成 HTML 链接"}
</button> </button>
<p className="text-sm text-muted-foreground">
{apiBaseUrl}/api/html/generate
</p>
</div> </div>
</form> </form>
{error && ( {error ? (
<div className="mt-6 p-4 bg-destructive/10 text-destructive rounded-md"> <section className="rounded-3xl border border-destructive/20 bg-destructive/10 p-6 text-destructive">
{error} {error}
</div> </section>
)} ) : null}
{result && ( {result ? (
<div className="mt-6 p-4 bg-accent/10 text-accent rounded-md"> <section className="space-y-4 rounded-3xl border border-border bg-card p-8 shadow-sm">
<h3 className="font-medium mb-2"></h3> <div>
<p className="mb-2"> HTML 访</p> <h3 className="text-xl font-semibold"></h3>
<a <p className="text-sm text-muted-foreground">{result.message}</p>
href={result.url} </div>
target="_blank" <div className="grid gap-4 md:grid-cols-2">
rel="noopener noreferrer" <div className="rounded-2xl bg-background p-4">
className="text-primary hover:underline" <p className="text-xs uppercase tracking-[0.2em] text-muted-foreground">
> unique_id
{result.url} </p>
</a> <p className="mt-2 break-all font-mono text-sm">{result.unique_id}</p>
</div> </div>
)} <div className="rounded-2xl bg-background p-4">
<p className="text-xs uppercase tracking-[0.2em] text-muted-foreground">
size_bytes
</p>
<p className="mt-2 font-mono text-sm">{result.size_bytes}</p>
</div>
</div>
<div className="space-y-3 rounded-2xl bg-background p-4">
<p className="text-xs uppercase tracking-[0.2em] text-muted-foreground">
url
</p>
<a
href={result.url}
target="_blank"
rel="noreferrer"
className="break-all text-sm text-primary underline-offset-4 hover:underline"
>
{result.url}
</a>
</div>
<div className="space-y-3 rounded-2xl bg-background p-4">
<p className="text-xs uppercase tracking-[0.2em] text-muted-foreground">
query_url
</p>
<a
href={result.query_url}
target="_blank"
rel="noreferrer"
className="break-all text-sm text-primary underline-offset-4 hover:underline"
>
{result.query_url}
</a>
</div>
<p className="text-sm text-muted-foreground">
{new Date(result.created_at).toLocaleString("zh-CN")}
{new Date(result.expires_at).toLocaleString("zh-CN")}
</p>
</section>
) : null}
</div> </div>
); );
} }
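
The new error branch in `handleSubmit` builds its message from `data.detail || data.message`. A minimal sketch of a more defensive variant follows, assuming the backend keeps FastAPI's default error shapes: `HTTPException` responses carry `detail` as a string, while request validation errors (HTTP 422) carry `detail` as a list of objects with `loc` and `msg` fields, which would render as `[object Object]` if passed straight to `new Error(...)`. The helper name `extractErrorMessage` and the `ValidationItem` type are hypothetical and not part of this commit.

```typescript
// Hypothetical helper (not in the commit): turn an error payload into a readable string.
// Assumes FastAPI's default error bodies; adjust if the backend customizes them.
type ValidationItem = { loc?: unknown; msg?: unknown };

function extractErrorMessage(data: unknown, fallback: string): string {
  // Anything that is not a JSON object gives us nothing to read.
  if (typeof data !== "object" || data === null) {
    return fallback;
  }
  const { detail, message } = data as { detail?: unknown; message?: unknown };

  // HTTPException-style payload: { "detail": "some text" }
  if (typeof detail === "string" && detail.trim() !== "") {
    return detail;
  }

  // Validation-style payload: { "detail": [{ "loc": [...], "msg": "..." }, ...] }
  if (Array.isArray(detail)) {
    const parts = detail
      .map((item: unknown): string => {
        if (typeof item !== "object" || item === null) {
          return "";
        }
        const { loc, msg } = item as ValidationItem;
        if (typeof msg !== "string" || msg === "") {
          return "";
        }
        const where = Array.isArray(loc) ? loc.join(".") : "";
        return where ? `${where}: ${msg}` : msg;
      })
      .filter((part) => part !== "");
    if (parts.length > 0) {
      return parts.join("; ");
    }
  }

  // Generic payload: { "message": "some text" }
  if (typeof message === "string" && message.trim() !== "") {
    return message;
  }
  return fallback;
}
```

In the component this could replace the inline expression, for example `throw new Error(extractErrorMessage(data, "生成 HTML 失败"))`. Note that `response.json()` is still called before the `response.ok` check, so a non-JSON error body (such as a proxy's HTML error page) would surface as a parse error rather than through this helper.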