Gender Guesser
Current version: 0.10.0, Last updated: Oct. 8, 2014
Wudi <wudi@wudilabs.org>, Wudi Labs

NAME

Gender Guesser

SYNOPSIS

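include_once 'genderguesser.php';

$name = '王杰';

// Create the guesser and load a character lexicon.
$GenderGuesser = new GenderGuesser();
$GenderGuesser->loadLexicon('charlex_gender_full.lex');

// Probability that $name is a male name.
$male_prob = $GenderGuesser->getMaleProbability($name);

echo "$name: ";
if ($male_prob === false) {
    // Strict comparison, so a probability of 0.0 is not mistaken for an error.
    echo 'Error';
} elseif ($male_prob >= 0.5) {
    // Note: an exact 0.5 (neutral or unknown) is reported as male here.
    echo 'Male, probability: ' . sprintf('%2.2f', $male_prob * 100) . '%';
} else {
    echo 'Female, probability: ' . sprintf('%2.2f', (1.0 - $male_prob) * 100) . '%';
}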

DESCRIPTION

This class guesses a person's gender from their name, using a character lexicon.

Generally speaking, a name is related to gender. For example, names containing "杰", "志", or "宏" are usually male names, while names containing "琬", "佩", or "琳" are usually female names. Names made up of neutral characters such as "文", "安", or "清" are difficult to classify.

NOTICE

The data is encoded in UTF-8, so names should be passed as UTF-8 strings.
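Because of the UTF-8 requirement above, it may help to validate a name before passing it to the class. The following is a minimal sketch, not part of Gender Guesser: isValidUtf8 is a hypothetical helper name, and the check relies only on PHP's built-in PCRE support, where the "u" modifier makes preg_match() reject malformed UTF-8.

```php
<?php
// Hypothetical helper (not part of GenderGuesser): reports whether a string
// is well-formed UTF-8. With the "u" modifier PCRE validates the subject;
// preg_match() returns 1 for valid input and false for malformed input.
function isValidUtf8($text)
{
    return preg_match('//u', $text) === 1;
}

var_dump(isValidUtf8('王杰'));      // valid, provided this file is saved as UTF-8
var_dump(isValidUtf8("\xCD\xF5"));  // a byte pair that is not valid UTF-8
```

A name that fails this check probably comes from a different encoding and would need converting before lookup.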

CLASS METHODS

__construct ( [ mixed options [, string lexicon_path ] ] )

Constructor. If the optional parameters options and lexicon_path are given, they are used to initialize the options and to load the lexicon.

bool setOptions ( mixed options )

Sets options. The argument can be an array or a space-separated string.

To switch off an option, prefix it with "-". Options:

s - The name argument includes the family name.
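To make the option syntax concrete, here is a minimal sketch of how such a specification could be turned into on/off flags. It is not the class's own parser: parseOptionFlags is a hypothetical name, and the sketch assumes the array form is simply a list of the same tokens used in the string form.

```php
<?php
// Hypothetical illustration of the option syntax described above;
// this is NOT GenderGuesser's own parser.
function parseOptionFlags($options)
{
    // Accept either a list of tokens or a space-separated string.
    if (is_string($options)) {
        $options = preg_split('/\s+/', trim($options), -1, PREG_SPLIT_NO_EMPTY);
    }

    $flags = array();
    foreach ($options as $token) {
        if (strlen($token) > 1 && $token[0] === '-') {
            // A leading "-" switches the option off.
            $flags[substr($token, 1)] = false;
        } else {
            $flags[$token] = true;
        }
    }
    return $flags;
}

var_dump(parseOptionFlags('s'));          // "s" switched on
var_dump(parseOptionFlags(array('-s')));  // "s" switched off
```

With the real class, the equivalent calls would presumably be setOptions('s') and setOptions('-s').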

bool loadLexicon ( string path )

Loads a lexicon. Returns true on success, or false on failure.

string getLexiconComment ( )

Returns the comment of the loaded lexicon.

float getMaleProbability ( string name )

Returns the probability that the specified name is a male name.

A value greater than 0.5 means it is a male name; a value less than 0.5 means it is a female name; and exactly 0.5 means it is a neutral name or its characters cannot be found in the lexicon.
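The three cases above, plus the false value that the synopsis treats as an error, can be mapped to labels as in the sketch below. describeMaleProbability and its label strings are hypothetical and not part of the class; the strict comparison with false matters because a legitimate probability of 0.0 would otherwise be read as an error.

```php
<?php
// Hypothetical helper built on the documented return values;
// the label strings are illustrative only.
function describeMaleProbability($maleProb)
{
    if ($maleProb === false) {
        return 'error';
    }
    if ($maleProb > 0.5) {
        return 'male';
    }
    if ($maleProb < 0.5) {
        return 'female';
    }
    // Exactly 0.5: neutral name, or characters missing from the lexicon.
    return 'neutral or unknown';
}

echo describeMaleProbability(0.93), "\n";  // prints "male"
echo describeMaleProbability(0.5), "\n";   // prints "neutral or unknown"
```

In practice the argument would be the return value of getMaleProbability().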

HISTORY

v0.10.0 (2014-10-08)

v0.05.0 (2012-03-18)

v0.02.0 (2005-11-12)

AUTHOR

2005-2014, Wudi <wudi@wudilabs.org>, Wudi Labs