EntityFingerprint.Fingerprint (entityfingerprint v0.3.0)
Functions
create(entity)
Creates a fingerprint for the given entity name. It supports special characters, emoji (because we all know that emoji in company names are coming), and entity types in non-Latin scripts.
Examples
iex(1)> EntityFingerprint.create("ФИЛИАЛ КОМПАНИИ С ОГРАНИЧЕННОЙ")
{:ok,
[
fingerprint: "filial kompanii ogranichennoy s",
original: "ФИЛИАЛ КОМПАНИИ С ОГРАНИЧЕННОЙ",
script: "cyrillic"
]}
iex(2)> EntityFingerprint.create("ООО КУРЬЕР-РЕГИОН СТОЛИЦА")
{:ok,
[
fingerprint: "kurerregion ooo stolitsa",
original: "ООО КУРЬЕР-РЕГИОН СТОЛИЦА",
script: "cyrillic"
]}
iex(3)> EntityFingerprint.create("Google Limited Liability Company")
{:ok,
[
fingerprint: "google llc",
original: "Google Limited Liability Company",
script: "latin"
]}
iex(4)> EntityFingerprint.create("현대해상화재보험")
{:ok,
[
fingerprint: "hyeondaehaesanghwajaeboheom",
original: "현대해상화재보험",
script: "hangul"
]}
iex(5)> EntityFingerprint.create(" 💩 Limited Liability Company")
{:ok,
[
fingerprint: "llc poop",
original: " 💩 Limited Liability Company",
script: "common"
]}
iex(6)> EntityFingerprint.create("佐贤鸣智(上海)企业管理咨询有限公司")
{:ok,
[
fingerprint: "guanlizixun shanghai zuoxianmingzhi",
original: "佐贤鸣智(上海)企业管理咨询有限公司",
script: "han"
]}
iex(3)> EntityFingerprint.Fingerprint.create("Google Limited Liability Company")
{:ok,
%{
script: "latin",
original: "Google Limited Liability Company",
fingerprint: "382621CA5922751BB77F398DD0B3CB1B4EACE596",
fingerprint_str: "google llc"
}}
iex(4)> EntityFingerprint.Fingerprint.create("현대해상화재보험")
{:ok,
%{
script: "hangul",
original: "현대해상화재보험",
fingerprint: "00044613027E2BF63B36225EAFF0EB48352A68E4",
fingerprint_str: "hyeondaehaesanghwajaeboheom"
}}
iex(5)> EntityFingerprint.Fingerprint.create(" 💩 Limited Liability Company")
{:ok,
%{
script: "common",
original: " 💩 Limited Liability Company",
fingerprint: "2881722FEEB9C5AB87B7519C8FB711455690C330",
fingerprint_str: "llc poop"
}}
iex(6)> EntityFingerprint.Fingerprint.create("佐贤鸣智(上海)企业管理咨询有限公司")
{:ok,
%{
script: "han",
original: "佐贤鸣智(上海)企业管理咨询有限公司",
fingerprint: "51DDB4F4AA0F7484E7D9AD5CA2A81C4CAFAB5A4C",
fingerprint_str: "guanlizixun shanghai zuoxianmingzhi"
}}
iex(7)> EntityFingerprint.Fingerprint.create("Siemens Aktiengesellschaft")
{:ok,
%{
script: "latin",
original: "Siemens Aktiengesellschaft",
fingerprint: "069BCC150A2D09F1968E220F48B2362A655A7685",
fingerprint_str: "ag siemens"
}}
iex(8)> EntityFingerprint.Fingerprint.create("New York, New York")
{:ok,
%{
script: "latin",
original: "New York, New York",
fingerprint: "DDDD9606DD438582C9642AA4DEB3C013CBC89148",
fingerprint_str: "new york"
}}
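The `fingerprint_str` values in the examples above look like a normalized key: lowercased, punctuation dropped, legal forms abbreviated, duplicate tokens removed, and the remaining tokens sorted. The following is a minimal sketch of that idea for Latin-script input only. The module `FingerprintSketch`, its `key/1` function, and the two-entry legal-form table are hypothetical and written for illustration; this is not the library's implementation, and it skips transliteration, emoji handling, and the hashed `fingerprint` value.

```elixir
defmodule FingerprintSketch do
  # Hypothetical, deliberately tiny table of legal-form replacements.
  # The library's real table is not shown in this document.
  @legal_forms %{
    "limited liability company" => "llc",
    "aktiengesellschaft" => "ag"
  }

  # Builds an order-insensitive key for a Latin-script entity name.
  def key(name) when is_binary(name) do
    name
    |> String.downcase()
    # Drop every character that is not a letter, digit, or whitespace.
    |> String.replace(~r/[^\p{L}\p{N}\s]/u, "")
    |> replace_legal_forms()
    |> String.split()
    # Remove repeated tokens, then sort so word order does not matter.
    |> Enum.uniq()
    |> Enum.sort()
    |> Enum.join(" ")
  end

  # Swaps each long legal form for its abbreviation.
  defp replace_legal_forms(text) do
    Enum.reduce(@legal_forms, text, fn {long, short}, acc ->
      String.replace(acc, long, short)
    end)
  end
end
```

Under these assumptions, `FingerprintSketch.key("New York, New York")` returns `"new york"` and `FingerprintSketch.key("Siemens Aktiengesellschaft")` returns `"ag siemens"`, matching the `fingerprint_str` values shown above for those two inputs.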
Thanks
- This library was heavily inspired by the Python tool alephdata/fingerprints.
- A Google Spreadsheet created by OCCRP.
- The ISO 20275: Entity Legal Forms Code List.
- Wikipedia also maintains an index of types of business entity.
See also
- Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
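
Key-collision clustering, as described in that documentation, groups records whose keys are identical. A rough sketch of how fingerprints could be used this way follows. `KeyCollisionSketch`, `clusters/2`, and the stand-in `simple_key` function are hypothetical names introduced here for illustration, not part of this library.

```elixir
defmodule KeyCollisionSketch do
  # Groups names whose keys collide and keeps only groups with more
  # than one member. `key_fun` is any function mapping a name to a key.
  def clusters(names, key_fun) when is_list(names) and is_function(key_fun, 1) do
    names
    |> Enum.group_by(key_fun)
    |> Enum.filter(fn {_key, members} -> length(members) > 1 end)
    |> Map.new()
  end
end

# Stand-in key function (an assumption, not the library's algorithm):
# lowercase, split on whitespace, drop duplicate tokens, sort, rejoin.
simple_key = fn name ->
  name
  |> String.downcase()
  |> String.split()
  |> Enum.uniq()
  |> Enum.sort()
  |> Enum.join(" ")
end

result = KeyCollisionSketch.clusters(["Siemens AG", "AG Siemens", "Google LLC"], simple_key)
# result is %{"ag siemens" => ["Siemens AG", "AG Siemens"]}
```

Based on the return shape shown in the examples, a key function wrapping this library would presumably look like `fn name -> {:ok, %{fingerprint: fp}} = EntityFingerprint.Fingerprint.create(name); fp end`, though error handling for non-`:ok` results is left out of this sketch.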