CompanyName2Vec: Company Entity Matching based on Job Ads

Ran Ziv, Ilan Gronau, Michael Fire

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Entity Matching is an essential part of all real-world systems that take in structured and unstructured data coming from different sources. Typically, no common key is available for connecting records. Massive data cleaning and integration processes must be completed before any data analytics or further processing can be performed. Although record linkage is frequently regarded as a somewhat tedious but necessary step, it reveals valuable insights, supports data visualization, and guides further analytic approaches to the data. Here, we focus on organization entity matching. We introduce CompanyName2Vec, a novel algorithm to solve company entity matching (CEM) using a neural network model to learn company name semantics from a job ad corpus, without relying on any information on the matched company besides its name. Based on real-world data, we show that CompanyName2Vec outperforms other evaluated methods and solves the CEM challenge with an average success rate of 89.3%.

Original language: English
Title of host publication: Proceedings - 2022 IEEE 9th International Conference on Data Science and Advanced Analytics, DSAA 2022
Editors: Joshua Zhexue Huang, Yi Pan, Barbara Hammer, Muhammad Khurram Khan, Xing Xie, Laizhong Cui, Yulin He
Publisher: Institute of Electrical and Electronics Engineers
ISBN (Electronic): 9781665473309
State: Published - 1 Jan 2022
Event: 9th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2022 - Shenzhen, China
Duration: 13 Oct 2022 to 16 Oct 2022

Publication series

Name: Proceedings - 2022 IEEE 9th International Conference on Data Science and Advanced Analytics, DSAA 2022

Conference

Conference: 9th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2022
Country/Territory: China
City: Shenzhen
Period: 13/10/22 to 16/10/22

Keywords

  • CompanyName2Vec
  • Entity Matching
  • LSTM
  • Organization Name Matching

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
