07-20-2003, 10:23 AM   #1
Max
Join Date: Jul 2001
Location: Montreal, Province of Quebec, CANADA
Posts: 15,036
Computer program detects author gender

Simple algorithm suggests words and syntax bear sex and genre stamp.
18 July 2003
PHILIP BALL


A. S. Byatt confuses the computer; will it see through George Eliot?




A new computer program can tell whether a book was written by a man or a woman. The simple scan of key words and syntax is around 80% accurate on both fiction and non-fiction [1, 2].

The program's success seems to confirm the stereotypical perception of differences in male and female language use. Crudely put, men talk more about objects, and women more about relationships.

Female writers use more pronouns (I, you, she, their, myself), say the program's developers, Moshe Koppel of Bar-Ilan University in Ramat Gan, Israel, and colleagues. Males prefer words that identify or determine nouns (a, the, that) and words that quantify them (one, two, more).
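To make the idea of counting marker words concrete, here is a minimal Python sketch. It is not the researchers' program: the two word lists contain only the examples quoted above (the study's full feature set is not given here), and the function name `marker_rates` and the sample sentence are made up for illustration.

```python
import re

# Word lists copied from the examples quoted in the article. The real
# feature set used in the study is not given here, so these are assumptions.
PRONOUNS = {"i", "you", "she", "their", "myself"}
DETERMINERS_QUANTIFIERS = {"a", "the", "that", "one", "two", "more"}

def marker_rates(text):
    """Return (pronoun_rate, determiner_rate) as fractions of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0, 0.0
    pronouns = sum(w in PRONOUNS for w in words)
    determiners = sum(w in DETERMINERS_QUANTIFIERS for w in words)
    return pronouns / len(words), determiners / len(words)

# Hypothetical sample sentence, invented for this example.
sample = "I hope you like the two books she gave their friends."
pronoun_rate, determiner_rate = marker_rates(sample)
print(f"pronouns: {pronoun_rate:.2f}, determiners/quantifiers: {determiner_rate:.2f}")
```

A higher pronoun rate would point toward the 'involved' style and a higher determiner and quantifier rate toward the 'informational' style, although a single sentence is far too little text to say anything reliable.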

So this article would already, through sentences such as this, have probably betrayed its author as male: there is a prevalence of plural pronouns (they, them), indicating the male tendency to categorize rather than personalize.

If I were female, the researchers imply, I'd be more likely to write sentences like this, which assume that you and I share common knowledge or engage us in a direct relationship. These differing styles have previously been called 'informational' and 'involved', respectively.

Koppel and colleagues trained their algorithm on a few test cases to identify the most prevalent fingerprints of gender and of fiction and non-fiction. They then set it searching for these fingerprints in 566 English-language works in a variety of genres, ranging from A Guide to Prague to A. S. Byatt's novel Possession - which, intriguingly, the program misclassified by gender, along with Kazuo Ishiguro's The Remains of the Day.
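The article does not say which learning algorithm the team used, so the following Python sketch swaps in a plain perceptron just to show the train-then-classify pattern described above. Everything in it is an assumption for illustration: the function names, the two features (pronoun rate and determiner rate per 100 words), and the toy numbers, which are invented and not taken from the study.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Learn weights and a bias that separate the +1 and -1 examples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified: nudge the boundary toward y
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

def predict(weights, bias, x):
    """Return +1 ('involved' style) or -1 ('informational' style)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1

# Invented toy data: (pronouns per 100 words, determiners per 100 words).
samples = [(8, 4), (9, 3), (7, 5), (3, 9), (2, 10), (4, 8)]
labels = [1, 1, 1, -1, -1, -1]  # +1 = involved, -1 = informational

weights, bias = train_perceptron(samples, labels)
correct = sum(predict(weights, bias, x) == y for x, y in zip(samples, labels))
accuracy = correct / len(samples)
print(f"weights={weights}, bias={bias}, training accuracy={accuracy:.0%}")
```

Accuracy on the training examples says little by itself; a figure such as the 80% quoted in the article only means something when it is measured on texts the classifier has not seen during training.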

Strikingly, the distinctions between male and female writers are much the same as those that, even more clearly, differentiate non-fiction and fiction. The program can tell these two genres apart with 98% accuracy. This is perhaps unsurprising, given that non-fiction is more informational and fiction more involved.

Most of the works studied were published after 1975. The Israeli team now intends to probe whether the differences extend further back in time - and so whether George Eliot was wasting her time disguising herself with a male nom de plume - and also whether they occur in other languages.


References
1. Koppel, M., Argamon, S. & Shimoni, A. R. Automatically categorizing written texts by author gender. Literary and Linguistic Computing, in the press (2003).
2. Argamon, S., Koppel, M., Fine, J. & Shimoni, A. R. Gender, genre, and writing style in formal written texts. Text, in the press (2003).



http://www.nature.com/nsu/030714/030714-13.html