TY - GEN
T1 - Sentence embedding evaluation using pyramid annotation
AU - Baumel, Tal
AU - Cohen, Raphael
AU - Elhadad, Michael
N1 - Funding Information:
This work was supported by the Lynn and William Frankel Center for Computer Sciences, Ben-Gurion University. We thank the reviewers for their extremely helpful advice and insight.
Publisher Copyright:
© 2016 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All Rights Reserved.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Word embedding vectors are used as input for a variety of tasks. Choosing the right model and features for producing such vectors is not trivial, and different embedding methods can greatly affect results. In this paper we repurpose the "Pyramid Method" annotations used for evaluating automatic summarization to create a benchmark for comparing embedding models when identifying paraphrases of text snippets containing a single clause. We present a method of converting pyramid annotation files into two distinct sentence embedding tests. We show that our method can produce a good amount of testing data, analyze the quality of that data, run tests on several leading embedding methods, and finally explain the downstream uses of our task and its significance.
AB - Word embedding vectors are used as input for a variety of tasks. Choosing the right model and features for producing such vectors is not trivial, and different embedding methods can greatly affect results. In this paper we repurpose the "Pyramid Method" annotations used for evaluating automatic summarization to create a benchmark for comparing embedding models when identifying paraphrases of text snippets containing a single clause. We present a method of converting pyramid annotation files into two distinct sentence embedding tests. We show that our method can produce a good amount of testing data, analyze the quality of that data, run tests on several leading embedding methods, and finally explain the downstream uses of our task and its significance.
UR - http://www.scopus.com/inward/record.url?scp=85093184489&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85093184489
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 145
EP - 149
BT - Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 7 August 2016
ER -