TY - UNPB
T1 - How Does That Sound?
T2 - Multi-Language SpokenName2Vec Algorithm Using Speech Generation and Deep Learning
AU - Elyashar, Aviad
AU - Puzis, Rami
AU - Fire, Michael
PY - 2020/7/21
Y1 - 2020/7/21
N2 - Searching for information about a specific person is an online activity
frequently performed by many users. In most cases, users submit
queries containing a name to web search engines to find what they
are looking for. Typically, web search engines provide just a few
accurate results associated with a name-containing query. Currently,
most solutions for suggesting synonyms in online search are based on
pattern matching and phonetic encoding; however, very often the
performance of such solutions is less than optimal. In this paper, we
propose SpokenName2Vec, a novel and generic approach which addresses the
similar name suggestion problem by utilizing automated speech
generation and deep learning to produce spoken name embeddings. These
sophisticated and innovative embeddings capture the way people
pronounce names in any language and accent. Utilizing the name
pronunciation can be helpful for both differentiating and detecting
names that sound alike, but are written differently. The proposed
approach was demonstrated on a large-scale dataset consisting of 250,000
forenames and evaluated using a machine learning classifier and 7,399
names with their verified synonyms. The performance of the proposed
approach was found to be superior to 10 other algorithms evaluated in
this study, including widely used phonetic and string similarity
algorithms, and two recently proposed algorithms. The results obtained
suggest that the proposed approach could serve as a useful and valuable
tool for solving the similar name suggestion problem.
AB - Searching for information about a specific person is an online activity
frequently performed by many users. In most cases, users submit
queries containing a name to web search engines to find what they
are looking for. Typically, web search engines provide just a few
accurate results associated with a name-containing query. Currently,
most solutions for suggesting synonyms in online search are based on
pattern matching and phonetic encoding; however, very often the
performance of such solutions is less than optimal. In this paper, we
propose SpokenName2Vec, a novel and generic approach which addresses the
similar name suggestion problem by utilizing automated speech
generation and deep learning to produce spoken name embeddings. These
sophisticated and innovative embeddings capture the way people
pronounce names in any language and accent. Utilizing the name
pronunciation can be helpful for both differentiating and detecting
names that sound alike, but are written differently. The proposed
approach was demonstrated on a large-scale dataset consisting of 250,000
forenames and evaluated using a machine learning classifier and 7,399
names with their verified synonyms. The performance of the proposed
approach was found to be superior to 10 other algorithms evaluated in
this study, including widely used phonetic and string similarity
algorithms, and two recently proposed algorithms. The results obtained
suggest that the proposed approach could serve as a useful and valuable
tool for solving the similar name suggestion problem.
KW - Computer Science - Computation and Language
U2 - 10.48550/arXiv.2005.11838
DO - 10.48550/arXiv.2005.11838
M3 - Preprint
BT - How Does That Sound?
ER -