Integrating Dify with TextIn: extracting content from images/PDF/Excel and outputting it as Markdown/Excel

The HTTP node returns content like the following:

{
  "status_code": 200,
  "body": "{\"x_request_id\":\"5d54bffb67988fff2f59e055a3b54ea1\",\"duration\":137,\"message\":\"Success\",\"result\":{\"markdown\":\"[\\/www\\/wwwroot\\/tz.ncncy.com】\\n\\n\",\"success_count\":1,\"pages\":[{\"angle\":0,\"page_id\":1,\"content\":[{\"pos\":[3,8,223,8,223,29,3,29],\"id\":0,\"score\":0.98100000619888,\"type\":\"line\",\"text\":\"[\\/www\\/wwwroot\\/tz.ncncy.com】\"}],\"status\":\"Success\",\"height\":35,\"structured\":[{\"pos\":[3,11,223,11,223,25,3,25],\"type\":\"textblock\",\"id\":0,\"content\":[0],\"text\":\"[\\/www\\/wwwroot\\/tz.ncncy.com】\",\"outline_level\":-1,\"sub_type\":\"text\"}],\"durations\":117.35350799561,\"image_id\":\"\",\"width\":225}],\"valid_page_number\":1,\"total_page_number\":1,\"total_count\":1,\"detail\":[{\"paragraph_id\":0,\"page_id\":1,\"tags\":[],\"outline_level\":-1,\"text\":\"[\\/www\\/wwwroot\\/tz.ncncy.com】\",\"type\":\"paragraph\",\"position\":[3,11,223,11,223,25,3,25],\"content\":0,\"sub_type\":\"text\"}]},\"metrics\":[{\"angle\":0,\"page_id\":1,\"status\":\"Success\",\"duration\":135.72903442383,\"page_image_width\":225,\"page_image_height\":35}],\"code\":200,\"version\":\"3.15.13\"}",
  "headers": {
    "date": "Mon, 28 Apr 2025 03:11:58 GMT",
    "content-type": "application/json;charset=utf-8",
    "content-length": "1002",
    "connection": "keep-alive",
    "access-control-max-age": "86400",
    "access-control-allow-origin": "*",
    "access-control-allow-headers": "Content-Type,token,No-Cache,Pragma,Cache-Control,X-Requested-With,x-ti-app-id,x-ti-secret-code",
    "access-control-expose-headers": "X-Request-Id",
    "server": "Intsig Web Server",
    "strict-transport-security": "max-age=3600; includeSubDomains; preload",
    "x-request-id": "5d54bffb67988fff2f59e055a3b54ea1"
  },
  "files": []
}

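The "body" field of this response is itself a JSON string that contains a "markdown" field, so we use a code node to parse the body and extract the Markdown content: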

import json

def main(arg1: str) -> dict:
    # Parse arg1 (the JSON string from the HTTP node's body) into a dict
    data = json.loads(arg1)

    # Extract the markdown field from the result object
    markdown_content = data.get("result", {}).get("markdown", "")

    # Return the extracted Markdown content
    return {
        "result": markdown_content,
    }
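
If the HTTP request fails or returns something other than JSON, the code node above will raise an error or silently return an empty string. A slightly more defensive variant might look like the sketch below. It assumes arg1 is bound to the HTTP node's body string, and that any `code` other than 200 in the payload means failure (the sample response only shows the success case, so the error shape is a guess). The extra `error` output key is hypothetical and would need to be declared as an output variable of the code node.

```python
import json


def main(arg1: str) -> dict:
    # Assumption: arg1 is the HTTP node's "body" value, a JSON-encoded string.
    try:
        data = json.loads(arg1)
    except (TypeError, ValueError) as exc:
        # The body was missing or was not valid JSON.
        return {"result": "", "error": f"invalid JSON body: {exc}"}

    if not isinstance(data, dict):
        return {"result": "", "error": "unexpected body shape"}

    # The sample response carries code=200 on success; treating any other
    # value as a failure is an assumption, not documented behaviour.
    if data.get("code") != 200:
        return {"result": "", "error": str(data.get("message", "unknown error"))}

    # Only read "markdown" when "result" is actually an object.
    result = data.get("result")
    markdown = result.get("markdown", "") if isinstance(result, dict) else ""
    return {"result": markdown, "error": ""}
```

A downstream condition node could then branch on whether `error` is empty before passing `result` on.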

Finally, use a Markdown-to-file node to produce the output format you need.
