漏网之鱼 LV
Posted on 2025-4-9 13:41:23
Let me recommend the seven most important papers in the NLP field (a list derived from Xueshufan's standard academic evaluation system):
1. Deep contextualized word representations
Authors: Matthew E. Peters / Mark Neumann / Mohit Iyyer / Matt Gardner / Christopher Clark / ... / Luke Zettlemoyer
Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
Full text: Xueshufan (xueshufan.com)
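The recipe in that abstract — word vectors as learned functions of the biLM's internal states — comes down to a task-specific weighted sum over the layers, ELMo_k = γ Σ_j s_j h_{k,j}. Here is a minimal numpy sketch of just that mixing step, with random arrays standing in for the pre-trained biLM activations and made-up weights (nothing below is the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
num_layers, seq_len, dim = 3, 5, 8               # biLM layers (incl. token layer), tokens, hidden size
h = rng.normal(size=(num_layers, seq_len, dim))  # h[j, k] = layer j's state for token k (stand-in)

# Task-specific scalars learned by the downstream model (hypothetical values here).
s_raw = np.array([0.2, 1.5, 0.3])                # raw layer weights, softmax-normalized below
gamma = 0.8                                      # overall scaling factor

s = np.exp(s_raw) / np.exp(s_raw).sum()
# ELMo_k = gamma * sum_j s_j * h[j, k]: one contextual vector per token.
elmo = gamma * np.einsum('j,jkd->kd', s, h)
print(elmo.shape)                                # (5, 8): seq_len x dim
```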
2. SQuAD: 100,000+ Questions for Machine Comprehension of Text
Authors: Pranav Rajpurkar / Jian Zhang / Konstantin Lopyrev / Percy Liang
Abstract: We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on dependency and constituency trees. We build a strong logistic regression model, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%). However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at this https URL
Full text: Xueshufan (xueshufan.com)
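Because every SQuAD answer is a span of the passage, systems are scored by token-overlap F1 against the gold span — that is the metric behind the 51.0% and 86.8% figures above. A simplified sketch of the computation (the official evaluation script additionally lowercases and strips punctuation and articles before comparing):

```python
from collections import Counter

def span_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer span and the gold span."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # per-token overlap counts
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(span_f1("the Norman conquest", "Norman conquest of England"))  # ~0.571
```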
3. GloVe: Global Vectors for Word Representation
Authors: Jeffrey Pennington / Richard Socher / Christopher D. Manning
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
Full text: Xueshufan (xueshufan.com)
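The "global log-bilinear regression model" here is a weighted least-squares fit of w_i · w̃_j + b_i + b̃_j to log X_ij, taken only over the nonzero co-occurrence counts. A toy numpy sketch of that objective trained with plain SGD — the counts, sizes, and learning rate are invented for illustration, and this is not the released GloVe code:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 4, 3                                   # toy vocabulary size, embedding dim
X = np.array([[0, 8, 1, 0],
              [8, 0, 3, 2],
              [1, 3, 0, 5],
              [0, 2, 5, 0]], dtype=float)     # symmetric co-occurrence counts (made up)

W  = rng.normal(scale=0.1, size=(V, d))       # word vectors w_i
Wt = rng.normal(scale=0.1, size=(V, d))       # context vectors w~_j
b, bt = np.zeros(V), np.zeros(V)              # biases b_i, b~_j

def f(x, x_max=100.0, alpha=0.75):
    """The paper's weighting function, capped at 1."""
    return np.minimum((x / x_max) ** alpha, 1.0)

lr = 0.05
for _ in range(200):                          # SGD over nonzero cells only
    for i, j in zip(*X.nonzero()):
        err = W[i] @ Wt[j] + b[i] + bt[j] - np.log(X[i, j])
        g = f(X[i, j]) * err                  # from d/d(err) of (1/2) f(X_ij) err^2
        gw, gwt = g * Wt[j], g * W[i]
        W[i] -= lr * gw
        Wt[j] -= lr * gwt
        b[i] -= lr * g
        bt[j] -= lr * g

loss = sum(f(X[i, j]) * (W[i] @ Wt[j] + b[i] + bt[j] - np.log(X[i, j])) ** 2
           for i, j in zip(*X.nonzero()))
print(f"weighted loss after training: {loss:.4f}")
```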
4. Sequence to Sequence Learning with Neural Networks
Authors: Ilya Sutskever / Oriol Vinyals / Quoc V. Le
Abstract: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Full text: Xueshufan (xueshufan.com)
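The architecture the abstract describes is compact enough to sketch: one LSTM reads the (reversed) source and its final hidden state becomes the fixed-dimensional sentence vector, from which a second LSTM decodes the target. A minimal PyTorch sketch under those definitions — not the original implementation, with toy vocabulary sizes:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=32):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # The fixed-dimensional "sentence vector" is the encoder's final (h, c) state.
        _, state = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)              # per-step logits over the target vocabulary

model = Seq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 7))           # batch of 2 source sequences
# Following the paper's trick, the source would be fed reversed: src.flip(1).
tgt = torch.randint(0, 120, (2, 5))
print(model(src, tgt).shape)                  # torch.Size([2, 5, 120])
```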
5. The Stanford CoreNLP Natural Language Processing Toolkit
Authors: Christopher D. Manning / Mihai Surdeanu / John Bauer / Jenny Finkel / Steven J. Bethard / David McClosky
Abstract: We describe the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. This toolkit is quite widely used, both in the research NLP community and also among commercial and government users of open source NLP technology. We suggest that this follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good quality analysis components, and not requiring use of a large amount of associated baggage.
Full text: Xueshufan (xueshufan.com)
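The "simple, approachable design" is an annotator pipeline: each annotator reads a shared document object and adds one layer of analysis on top of the layers before it. A conceptual Python sketch of that design only — CoreNLP itself is a Java toolkit, and none of the names below are its real API:

```python
class Document:
    """Shared object that accumulates annotation layers (conceptual stand-in)."""
    def __init__(self, text):
        self.text = text
        self.annotations = {}

def tokenize(doc):
    doc.annotations["tokens"] = doc.text.split()

def pos_tag(doc):
    # Placeholder tagger; CoreNLP plugs a trained model in here.
    doc.annotations["pos"] = ["NN" for _ in doc.annotations["tokens"]]

class Pipeline:
    def __init__(self, annotators):
        self.annotators = annotators          # run in order, later steps use earlier layers

    def annotate(self, text):
        doc = Document(text)
        for step in self.annotators:
            step(doc)
        return doc

doc = Pipeline([tokenize, pos_tag]).annotate("Stanford CoreNLP is a pipeline .")
print(doc.annotations["tokens"], doc.annotations["pos"])
```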
6. Distributed Representations of Words and Phrases and their Compositionality
Authors: Tomas Mikolov / Ilya Sutskever / Kai Chen / Greg Corrado / Jeffrey Dean
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
Full text: Xueshufan (xueshufan.com)
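Negative sampling, the "simple alternative to the hierarchical softmax" mentioned above, trains each (center, context) pair as a binary logistic classification against a handful of sampled noise words. A toy numpy sketch of one SGNS update — vocabulary size, word ids, and hyperparameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, lr = 50, 16, 0.1
W_in  = rng.normal(scale=0.1, size=(V, d))    # vectors for center (target) words
W_out = rng.normal(scale=0.1, size=(V, d))    # vectors for context/noise words

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context, negatives):
    """One gradient step on -log s(v_ctx.h) - sum_neg log s(-v_neg.h)."""
    h = W_in[center].copy()                   # freeze h while accumulating gradients
    grad_h = np.zeros_like(h)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        g = sigmoid(W_out[word] @ h) - label  # d(loss)/d(score) for logistic loss
        grad_h += g * W_out[word]
        W_out[word] -= lr * g * h
    W_in[center] -= lr * grad_h

# One training pair with 5 sampled negatives (the paper suggests 5-20
# negatives for small training sets).
sgns_step(center=3, context=7, negatives=rng.integers(0, V, size=5))
print(W_in[3][:4])                            # the updated target vector
```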
7. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Authors: Richard Socher / Alex Perelygin / Jean Y. Wu / Jason Chuang / Christopher D. Manning / Andrew Y. Ng / Christopher Potts
Abstract: Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank. It includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality. To address them, we introduce the Recursive Neural Tensor Network. When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effects of negation and its scope at various tree levels for both positive and negative phrases.
Full text: Xueshufan (xueshufan.com)
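The Recursive Neural Tensor Network composes two child vectors at each parse-tree node through a tensor-weighted bilinear term plus the standard recursive-net matrix: p = tanh([a;b]ᵀ V [a;b] + W [a;b]). A numpy sketch of that composition at a single node, with random stand-in parameters and an invented dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                          # word/phrase vector size (illustrative)
a, b = rng.normal(size=d), rng.normal(size=d)  # child vectors at a parse node
ab = np.concatenate([a, b])                    # stacked children, shape (2d,)

V = rng.normal(scale=0.1, size=(d, 2 * d, 2 * d))  # tensor: one bilinear slice per output dim
W = rng.normal(scale=0.1, size=(d, 2 * d))         # standard recursive-net matrix

tensor_term = np.array([ab @ V[k] @ ab for k in range(d)])  # [a;b]^T V[k] [a;b]
p = np.tanh(tensor_term + W @ ab)              # parent vector, fed upward to the next node
print(p)

# Sentiment is read off each node with a softmax classifier over p (not shown).
```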
Hope this helps!