What does "reinforcement learning algorithm" mean in Chinese?

reinforcement learning algorithm: definition

強化式學習演算法

reinforcement : n. 1. 增強，加固；補強物，強化物；補給品。2. 增援，支援；〈pl. 〉增援部隊，援軍；救援艦。
learning : n 學，學習；學問，學識；專門知識。 good at learning 善於學習。 a man of learning 學者。 New learn...
algorithm : n. 【數學】演算法；規則系統；演段。

Example sentences

Elaborate process descriptions of evaluating offers, belief revision and proposing counteroffers are presented. In particular, we analyze the use of Bayesian learning and reinforcement learning in the negotiation process, restructuring traditional Q-learning into a dynamic Q-learning algorithm by introducing current beliefs and a recency exploration bonus.

在該談判模型的基礎上引入學習機制，並分別對評估提議、更新信念、生成提議等談判過程作了詳細闡述，重點分析了貝葉斯學習和強化學習技術在自動談判中的應用。
Fourthly, the learning agent adjusts the domain model and user model adaptively by reinforcement learning and a genetic algorithm.

第四，作者研究了學習agent使用強化學習、遺傳演算法自適應地調整領域模型和用戶模型。
Based on the organization rules of Internet data, the distribution laws of hyperlinks and the naming rules of URLs, an algorithm for TVM rebuilding is established, and satisfactory experimental results are obtained by applying this algorithm. Furthermore, efforts are made to apply TVM to browsing navigation, web page classification and reinforcement learning algorithms.

結合網際網路資源的構建規則、鏈接分佈規律和URL命名規則，論文提出了樹藤共生數據模型的重建演算法，實驗結果驗證了樹藤共生模型的有效性與合理性，在此基礎上初步討論了樹藤共生模型在瀏覽導航、網頁分類和 reinforcement learning 演算法中的應用。
As an example, the parallel machine scheduling problem is mapped onto a non-constrained matrix construction graph, and an ACO algorithm is proposed to solve it. Compared with other best-performing algorithms, the proposed algorithm is very effective. The finite deterministic Markov decision process corresponding to the solution construction procedure of the ACO algorithm is illustrated in the terminology of reinforcement learning (RL) theory.

本章最後提出了解決并行機調度問題的蟻群演算法，該演算法把并行機調度問題映射為無約束矩陣解構造圖，並在演算法的信息素更新過程中應用了無約束矩陣解構造圖的局部歸一化螞蟻種子信息素更新規則，與其他幾個高性能演算法的模擬對比試驗證明這種方法是非常有效的。
Based on a survey of current intelligent search engines and combined with reinforcement learning techniques, the clustered distribution of similar web pages is used to present a heuristic search algorithm based on simulated annealing.

網路蜘蛛是智能搜索引擎中首先需要解決的問題。本文利用web網頁分佈群聚性的特點，結合鞏固學習方法，提出了一種新的啟發式搜索演算法。
By means of the proposed reinforcement learning algorithm and a modified genetic algorithm, a neural network controller whose weights are optimized can generate time-series small perturbation signals that convert the chaotic oscillations of chaotic systems into desired regular ones. Computer simulations on controlling the Hénon map and the logistic chaotic system have demonstrated the capacity of the presented strategy by stabilizing lower periodic orbits such as period-1 and period-2. Meanwhile, when the periodic control methodology is used, higher periods such as period-4 can also be successfully directed to the expected periodic orbits.

該控制方法無需了解系統的動態特性和精確的數學模型，也不需監督學習所要求的訓練數據，通過增強學習訓練方式，採用改進遺傳演算法優化神經網路權係數，使之成為混沌控制器，便可產生控制混沌系統的時間序列小擾動信號，模擬實驗結果表明它不僅能有效鎮定混沌周期1、2等低周期軌道，而且在周期控制技術基礎上，也可成功將高周期混沌軌道（如周期4軌道）變成期望周期行為。
(4) A new cooperation model called MACM is presented, and based on this model, an improved distributed reinforcement learning algorithm is also proposed.

（4）提出一種新的多agent協作模型MACM及一種改進的分散式強化學習演算法。
Based on the established evaluation index, this paper designs a genetic clustering algorithm that classifies the credit of suppliers through reinforcement learning from cases and clustering analysis, and provides a scientific method for credit evaluation.

在建立的評價指標基礎上，設計的遺傳聚類演算法通過對案例的強化學習和聚類分析，達到了對供應商信用分類的目的，為信用評價提供了科學的方法。
A reinforcement learning algorithm based on process reward and prioritized sweeping is presented as an interference-solving strategy.

本文提出了基於過程獎賞和優先掃除的強化學習演算法作為多機器人系統的衝突消解策略。
By introducing joint actions into traditional reinforcement learning, this paper presents a new multi-agent reinforcement learning algorithm based on behavior prediction and discusses several methods for predicting other agents' behaviors.

在傳統強化學習方式中引入組合動作的基礎上，本文提出了一種基於行為預測的多智能體強化學習方法，研究了對其他智能體行為進行預測的幾種可行方法。
The reinforcement learning algorithm is also introduced, since it is related to the ant colony algorithm and can be used in scheduling problems. 4. Some new concepts and scheduling algorithms for batch chemical processes were proposed in our studies.

由於蟻群演算法與人工智慧中的強化學習演算法之間有著某種聯繫，同時強化學習近年來也應用於求解調度問題，因此本文也涉及到了一些強化學習的主要演算法。
