一元网络论坛


Looking for AI model training pointers! Mentor wanted.

Posted on 2024-8-28 14:43:54
I want to create an AI knowledge base for a company's internal training system. The database contains about 500 documents, tables, and images. I have followed online tutorials and figured out some things myself, but I ran into issues deploying Dify. I deployed it on a Singapore-based Alibaba Cloud (Aliyun) VPS (2 cores, 4 GB).
Maintaining the knowledge base involves an Excel table with around 3 million rows, containing thousands of related pieces of information. How can I upload each row and index it successfully? I have failed several times so far.
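To make the row-by-row question concrete, here is a minimal sketch of one common approach: turn each row into a self-describing text record and group the records into batches, so that a failed upload only needs one batch retried. It assumes the Excel sheet has been exported to CSV; the function names (`row_to_text`, `batched`, `iter_row_batches`) and the batch size are hypothetical, and the actual upload call is left out because it depends on the knowledge base being used.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def row_to_text(row: Dict[str, str]) -> str:
    """Turn one spreadsheet row into a single self-describing text record."""
    # Keep the column name next to each value so the record makes sense on its own.
    return "; ".join(f"{key}: {value}" for key, value in row.items() if value)


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` items until the input is exhausted."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def iter_row_batches(csv_file, batch_size: int = 500) -> Iterator[List[str]]:
    """Stream rows from an open CSV file and group them into batches."""
    reader = csv.DictReader(csv_file)
    # A generator keeps memory use flat even for a very large file.
    return batched((row_to_text(row) for row in reader), batch_size)


if __name__ == "__main__":
    # Tiny in-memory stand-in for the exported sheet (hypothetical columns).
    sample = io.StringIO("name,dept,note\nAlice,Sales,\nBob,IT,on leave\nCara,HR,new\n")
    for batch in iter_row_batches(sample, batch_size=2):
        print(batch)
```

Each batch could then be written to its own small text file or sent in one request, which keeps any single indexing job small.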
There are also a few books I'd like to index. However, I'm currently using Baidu's Embedding-V1, which frequently gets stuck and fails to index correctly; it can only handle small documents. My other knowledge bases manage to index most of the information, but the failure rate is still high. Is it necessary to insert each piece of knowledge into the knowledge base individually?
My main knowledge base is mostly complete. The chatbot mode uses GPT-4o. The answers are somewhat rough, and many pieces of information cannot be retrieved from the knowledge base. I haven't put it into use yet.
In addition, there are vector retrieval settings such as rerank and TopK that I don't understand how to set up or tune. Can you give me some guidance and answer questions in a simple, understandable way? If possible, I would like to arrange paid, long-term tutoring, and I am looking for someone who can become my mentor.
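For readers unfamiliar with the two settings mentioned above, the following is a minimal sketch of what TopK and rerank usually refer to in a retrieval pipeline: TopK keeps the k chunks whose vectors are closest to the query vector, and rerank re-scores only those candidates with a second, more careful scorer. Everything here is illustrative: the vectors are toy values, and `word_overlap` is a hypothetical stand-in for a real rerank model, which would normally be a trained model rather than a word count.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int) -> List[str]:
    """Stage 1 (TopK): keep the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def rerank(query: str, candidates: List[str],
           score: Callable[[str, str], float], top_n: int) -> List[str]:
    """Stage 2 (rerank): re-score the candidates and keep the best top_n."""
    return sorted(candidates, key=lambda text: score(query, text), reverse=True)[:top_n]


def word_overlap(query: str, text: str) -> float:
    """Toy scorer: fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


if __name__ == "__main__":
    # Toy chunks with made-up 2-dimensional embeddings.
    chunks = [
        ("refund policy for online orders", [0.9, 0.1]),
        ("office opening hours", [0.1, 0.9]),
        ("how to request a refund", [0.8, 0.3]),
    ]
    candidates = top_k([1.0, 0.0], chunks, k=2)
    print(candidates)
    print(rerank("request a refund", candidates, word_overlap, top_n=1))
```

A larger TopK gives the rerank step more candidates to choose from at the cost of more work per query, so neither setting is "trained" by the user; both are tuned by trying values against real questions.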