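In this example, we use Jina, PyTorch, and Hugging Face Transformers to build a production-ready, BERT-based financial question answering (QA) system. We adopt a passage reranking approach: we first retrieve the top 50 candidate answers, then rerank them with FinBERT-QA, a BERT-based model fine-tuned on the FiQA dataset that achieved state-of-the-art results.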
Refer to the tutorial for a step-by-step guide and detailed explanations.
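The passage reranking approach described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `retrieve` uses plain word overlap as a stand-in for the real first-stage retriever, and `rerank_score` is a hypothetical placeholder for FinBERT-QA, which in the actual system would score each (question, answer) pair with a fine-tuned BERT model. All function names here are assumptions made for the example.

```python
from typing import Callable, List, Set, Tuple

Passage = Tuple[int, str]  # (answer id, answer text)


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return set(text.lower().split())


def retrieve(question: str, passages: List[Passage], top_k: int = 50) -> List[Passage]:
    """Stage 1: cheap candidate retrieval by word overlap (stand-in retriever)."""
    q_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_tokens & tokenize(p[1])), reverse=True)
    return ranked[:top_k]


def rerank(question: str, candidates: List[Passage],
           score_fn: Callable[[str, str], float]) -> List[Passage]:
    """Stage 2: re-order only the retrieved candidates with a costlier scorer."""
    return sorted(candidates, key=lambda p: score_fn(question, p[1]), reverse=True)


def rerank_score(question: str, answer: str) -> float:
    """Hypothetical placeholder for FinBERT-QA: Jaccard similarity of token sets."""
    q, a = tokenize(question), tokenize(answer)
    union = q | a
    return len(q & a) / len(union) if union else 0.0


def search(question: str, passages: List[Passage], top_k: int = 50) -> List[Passage]:
    """Retrieve top_k candidates, then return them in reranked order."""
    return rerank(question, retrieve(question, passages, top_k), rerank_score)
```

The point of the two stages is cost: the expensive scorer only ever sees `top_k` candidates rather than the whole collection, which is why the second stage can afford a BERT-sized model.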
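Motivated by the financial industry's emerging demand for automatic, large-scale analysis of unstructured and structured data, QA systems can give companies a lucrative competitive advantage by supporting the decision making of financial advisers. The goal of our system is to search for a list of relevant answer passages given a question.
[Figure: example of a question and its ground-truth answer from the FiQA dataset]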
Clone the repository:
https://github.com/yuanbit/jina-financial-qa-search.git
We will use jina-financial-qa-search/ as our working directory.
Install the dependencies:
pip install -r requirements.txt
Download the data:
bash get_data.sh
We want to index a subset of the answer passages from the FiQA dataset, dataset/test_answers.csv:
398960 From http://financial-dictionary.thefreedictionary.com/Business+Fundamentals The facts that affect a company's underlying value. Examples of business fundamentals include debt, cash flow, supply of and demand for the company's products, and so forth. For instance, if a company does not have a sufficient supply of products, it will fail. Likewise, demand for the product must remain at a certain level in order for it to be successful. Strong business fundamentals are considered essential for long-term success and stability. See also: Value Investing, Fundamental Analysis. For a stock the basic fundamentals are the second column of numbers you see on the google finance summary page, P/E ratio, div/yeild, EPS, shares, beta. For the company itself it's generally the stuff on the 'financials' link (e.g. things in the quarterly and annual report, debt, liabilities, assets, earnings, profit etc.
19183 If your sole proprietorship losses exceed all other sources of taxable income, then you have what's called a Net Operating Loss (NOL). You will have the option to "carry back" and amend a return you filed in the last 2 years where you owed tax, or you can "carry forward" the losses and decrease your taxes in a future year, up to 20 years in the future. For more information see the IRS links for NOL. Note: it's important to make sure you file the NOL correctly so I'd advise speaking with an accountant. (Especially if the loss is greater than the cost of the accountant...)
327002 To be deductible, a business expense must be both ordinary and necessary. An ordinary expense is one that is common and accepted in your trade or business. A necessary expense is one that is helpful and appropriate for your trade or business. An expense does not have to be indispensable to be considered necessary. (IRS, Deducting Business Expenses) It seems to me you'd have a hard time convincing an auditor that this is the case. Since business don't commonly own cars for the sole purpose of housing $25 computers, you'd have trouble with the "ordinary" test. And since there are lots of other ways to house a computer other than a car, "necessary" seems problematic also.
To index the full dataset instead, change the path to answer_collection.tsv.
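As a rough illustration of what the indexing step consumes, the sketch below reads answer passages into (id, text) pairs. It assumes each row holds a numeric answer ID and the answer text separated by a tab, which is consistent with the sample rows above but is not confirmed by this document. `load_answers` is a hypothetical helper, not a function from the repository, and the sample text is made up.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple


def load_answers(handle: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (answer_id, text) pairs from a tab-separated stream.

    Assumption: column 0 is the numeric answer ID, column 1 the passage text.
    """
    # QUOTE_NONE keeps double quotes inside passages as literal characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2 or not row[0].strip().isdigit():
            continue  # skip blank, malformed, or header rows
        yield int(row[0]), row[1].strip()


# In-memory sample with invented text; for a real file, pass
# open(path, encoding="utf-8", newline="") instead.
sample = io.StringIO(
    'docid\ttext\n'
    '101\tA passage with "quoted" words inside.\n'
    '\n'
    '102\tAnother passage.\n'
)
answers = list(load_answers(sample))
```

Switching between the subset and the full collection then only means passing a different path, as long as both files share the same layout.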
python app.py index
At the end you will see the following:
✅ done in ⏱ 1 minute and 54 seconds 🐎 7.7/s
gateway@18904[S]:terminated
doc_indexer@18903[I]:recv ControlRequest from ctl▸doc_indexer▸⚐
doc_indexer@18903[I]:Terminating loop requested by terminate signal RequestLoopEnd()
doc_indexer@18903[I]:#sent: 56 #recv: 56 sent_size: 1.7 MB recv_size: 1.7 MB
doc_indexer@18903[I]:request loop ended, tearing down ...
doc_indexer@18903[I]:indexer size: 865 physical size: 3.1 MB
doc_indexer@18903[S]:artifacts of this executor (vecidx) is persisted to ./workspace/doc_compound_indexer-0/vecidx.bin
doc_indexer@18903[I]:indexer size: 865 physical size: 3.2 MB
doc_indexer@18903[S]:artifacts of this executor (docidx) is persisted to ./workspace/doc_compound_indexer-0/docidx.bin
We need to build a custom Executor to rerank the top 50 candidate answers. We can do this with the Jina Hub API. First, make sure the Jina Hub extension is installed:
pip install "jina[hub]"
We can build the custom Ranker, FinBertQARanker, by running:
jina hub build FinBertQARanker/ --pull --test-uses --timeout-ready 60000
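Conceptually, a reranker of this kind turns the model's output for each (question, answer) pair into a relevance score and sorts the candidates by it. The sketch below assumes a binary relevance classifier that emits two logits per pair (not relevant, relevant), a common setup for BERT passage reranking; whether FinBertQARanker works exactly this way is not stated here. The logits and answer IDs are invented so the example runs without a model, and `rank_by_relevance` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple


def relevance_probability(logits: Sequence[float]) -> float:
    """Softmax over [not_relevant, relevant] logits; returns P(relevant)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)


def rank_by_relevance(
    candidates: List[Tuple[int, Sequence[float]]]
) -> List[Tuple[int, float]]:
    """Score (answer_id, logits) pairs and sort by descending P(relevant)."""
    scored = [(answer_id, relevance_probability(logits))
              for answer_id, logits in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative values only; a real Ranker would obtain logits from the model.
candidates = [(101, [1.2, -0.3]), (102, [-2.0, 2.0]), (103, [0.0, 0.0])]
ranking = rank_by_relevance(candidates)
```

Sorting on the probability of the "relevant" class rather than on the raw logit keeps scores comparable across candidates, since each score lies between 0 and 1.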
We can now use our financial QA search engine by running:
python app.py search
Because the Ranker uses a BERT-based model, it may take some time to compute the relevance scores. You can try this list of questions from the FiQA dataset:
• What does it mean that stocks are “memoryless”?
• What would a stock be worth if dividends did not exist?
• What are the risks of Dividend-yielding stocks?
• Why do financial institutions charge so much to convert currency?
• Is there a candlestick pattern that guarantees any kind of future profit?
• 15 year mortgage vs 30 year paid off in 15
• Why is it rational to pay out a dividend?
• Why do companies have a fiscal year different from the calendar year?
• What should I look at before investing in a start-up?
• Where do large corporations store their massive amounts of cash?
Copyright (c) 2021 Jina's friend. All rights reserved.