New computerised method can disambiguate namesakes
New York: You very likely have a namesake who is nothing like you. A new computerised method has been developed that can tell the two of you apart.
Such name ambiguity often arises in bibliographic records, law enforcement and other areas.
Computer scientists from Indiana University-Purdue University Indianapolis (IUPUI) have developed a novel machine-learning method to provide better solutions to this perplexing problem.
“We can teach the computer to recognise names and disambiguate information accumulated from a variety of sources — Facebook, Twitter and blog posts, public records and other documents — by collecting features such as Facebook friends and keywords from people’s posts using the identical algorithm,” explained Mohammad al Hasan, Associate Professor, IUPUI.
Unlike existing methods, the new method can perform non-exhaustive classification: when a new record appears in streaming data, it can tell which individual the record belongs to.
“Our method grows and changes when new persons appear, enabling us to recognise the ever-growing number of individuals whose records were not previously encountered. While working in non-exhaustive setting, our model automatically detects such names and adjusts the model parameters accordingly,” added Hasan.
The researchers trained computers on records of different individuals who share a name, building a model that distinguishes between those individuals, including people about whom no information appeared in the training data.
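The idea of non-exhaustive classification can be illustrated with a minimal sketch. This is not the researchers' model, which the article does not specify in detail. It is a simple threshold rule, used here only to show how a streaming record can either be assigned to a known individual or trigger the creation of a new one. The class name `StreamingDisambiguator`, the `threshold` value and the sample features are all hypothetical.

```python
from dataclasses import dataclass, field


def jaccard(a: set, b: set) -> float:
    """Overlap between two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class StreamingDisambiguator:
    # Hypothetical cut-off: below this similarity a record is treated as a new person.
    threshold: float = 0.2
    # One accumulated feature set per individual seen so far.
    profiles: list = field(default_factory=list)

    def assign(self, features: set) -> int:
        """Return the index of the individual this record is assigned to."""
        best_idx, best_sim = -1, 0.0
        for idx, profile in enumerate(self.profiles):
            sim = jaccard(features, profile)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx == -1 or best_sim < self.threshold:
            # No known individual is close enough: open a new profile.
            self.profiles.append(set(features))
            return len(self.profiles) - 1
        # Otherwise fold the new evidence into the matched profile.
        self.profiles[best_idx] |= features
        return best_idx


# Three records that all carry the same name (illustrative data only).
model = StreamingDisambiguator()
first = model.assign({"data mining", "graphs", "coauthor:lee"})
second = model.assign({"graphs", "coauthor:lee", "clustering"})
third = model.assign({"cardiology", "coauthor:smith"})
```

In this toy run the first two records share enough features to be treated as one person, while the third shares none and so becomes a second, previously unseen individual.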
The researchers focused on three types of “features” — bits of information with some degree of predictive power to define a specific individual.
“Relational or association features to reveal persons with whom an individual is associated; text features, such as keywords in documents; and venue features to determine memberships or events with which an individual is currently or was formerly associated,” the study noted.
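As a rough illustration of how the three feature types might be combined, the sketch below stores each as a set and scores two records by a weighted overlap. The `Record` class, the `similarity` function and the weights are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    relational: frozenset  # people associated with the individual, e.g. co-authors or friends
    text: frozenset        # keywords drawn from documents or posts
    venue: frozenset       # memberships or events, e.g. conferences


def overlap(a: frozenset, b: frozenset) -> float:
    """Share of items the two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(r1: Record, r2: Record, weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted sum of per-feature overlaps; the weights are hypothetical."""
    w_rel, w_text, w_venue = weights
    return (
        w_rel * overlap(r1.relational, r2.relational)
        + w_text * overlap(r1.text, r2.text)
        + w_venue * overlap(r1.venue, r2.venue)
    )


# Two illustrative records carrying the same name.
r1 = Record(frozenset({"lee", "kim"}), frozenset({"graphs", "mining"}), frozenset({"CIKM"}))
r2 = Record(frozenset({"lee"}), frozenset({"graphs"}), frozenset({"CIKM"}))
score = similarity(r1, r2)
```

A higher score suggests the two records are more likely to describe the same person. A real system would learn how much each feature type matters rather than fix the weights by hand.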
The study was published in the proceedings of the 25th International Conference on Information and Knowledge Management.