Looking for a Similar Name in Text Mining with Matlab

Some names have a first name, a middle name, and a last name. The last name is usually the family name, except in some countries such as China. For simplicity, many publications show only the last name in full and abbreviate the first and middle names. For example, Rahmadya Trias Handayanto will be written as R. T. Handayanto or Handayanto, R. T. (see e.g. IEEE: http://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=handayanto), so we want to predict that R. T. Handayanto is similar to Rahmadya Trias Handayanto. As an experiment, try storing a name in a Matlab variable.

As usual, I use '{' instead of '(' because I store the name in a cell array, which matches the records fetched from the database. We want to collect other names that are similar to it, such as r. t. handayanto. To extract the last name, see my other post on regular expressions: https://rahmadya.com/2014/04/18/how-to-search-last-name-using-matlab/. After that, we just add the abbreviated first and middle names in front of it, or keep only the last name if the name is a single word.
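To illustrate why the braces matter, here is a minimal sketch. The variable names (name, asCell, asChar) are my own for illustration and the record is typed in by hand rather than fetched from a database.

```matlab
% Hypothetical record: one author name held in a 1-by-1 cell array,
% similar to what a database fetch might return
name = {'Rahmadya Trias Handayanto'};

asCell = name(1);    % parentheses return a 1-by-1 cell
asChar = name{1};    % braces return the character array stored inside the cell

disp(class(asCell))  % cell
disp(class(asChar))  % char
```

String operations such as indexing single characters need the character array, which is why the brace form is used.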

The logic is that we first find the indices of the spaces that separate the first and middle names from the last name. After counting how many names there are, we represent the first and middle names by their first character followed by a period. I use if-else to detect whether the name consists of only one word (the last name).
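The steps above can be sketched roughly as follows. This is not necessarily the exact code from the original screenshots. The variable names (fullName, spaceIdx, shortName) are assumptions, and the sketch assumes the name is already in lower case with single spaces and no leading or trailing blanks.

```matlab
% Hypothetical input: a full name already converted to lower case
fullName = 'rahmadya trias handayanto';

% Positions of the spaces that separate the individual names
spaceIdx = find(fullName == ' ');

if isempty(spaceIdx)
    % Single name: treat it as the last name and keep it unchanged
    shortName = fullName;
else
    % The last name is everything after the final space
    lastName = fullName(spaceIdx(end)+1:end);
    % Start of each earlier name: position 1 and the character after
    % every space except the final one
    startIdx = [1, spaceIdx(1:end-1) + 1];
    shortName = '';
    for k = startIdx
        shortName = [shortName, fullName(k), '. '];  % initial plus period
    end
    shortName = [shortName, lastName];
end

disp(shortName)   % r. t. handayanto
```

A name that is already abbreviated, such as r. t. handayanto, passes through unchanged, so both forms end up as the same string.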

Try the conversion in the command window. You will see that rahmadya trias handayanto becomes r. t. handayanto, which is then compared, as a character string, with the other names that may be similar to it (see the bottom line).

For example, suppose we have 10 records:

You can see that Ahmed Abdul-hamid and A. Abdul-hamid have similar names, but A. Cain may be a different person from A. J. Cain.

The result shows that two names share the same last name and first-name abbreviation: id=3 and id=10. We use the function lower to convert all author names to lower case so that they are easy to compare. Of course, to write this code we have to understand how to manipulate matrices in Matlab.
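For readers who want to try the comparison themselves, here is a small sketch. The four-name list is made up from the names mentioned above and is not the original 10 records, so the ids differ from the result described. It also swaps the index-based logic for a compact regexprep call, which is my own shortcut rather than the method used in this post. The variable names are hypothetical.

```matlab
% Hypothetical author list (not the original 10 records)
authors = {'Ahmed Abdul-hamid'; 'A. J. Cain'; 'A. Abdul-hamid'; 'A. Cain'};
query   = 'Ahmed Abdul-hamid';

% Pattern: the first character of a word, the rest of that word, and the
% whitespace after it. Only words followed by a space match, so the last
% name is left untouched.
pattern = '(\S)\S*\s+';

% Lower-case everything, then replace each matched word by its initial,
% a period, and a space
shortAuthors = regexprep(lower(authors), pattern, '$1. ');
shortQuery   = regexprep(lower(query),   pattern, '$1. ');

% Exact comparison of the abbreviated forms; find returns the matching ids
matchIds = find(strcmp(shortAuthors, shortQuery));

disp(shortQuery)   % a. abdul-hamid
disp(matchIds)     % 1 and 3
```

Because the comparison is exact, A. Cain and A. J. Cain are kept apart, which matches the caution above that they may be different people.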