Hard Light Productions Forums

Off-Topic Discussion => General Discussion => Topic started by: Dark Hunter on July 20, 2011, 11:55:24 pm

Title: Gender Guesser
Post by: Dark Hunter on July 20, 2011, 11:55:24 pm
This is a silly little thing I recently came across. (http://www.hackerfactor.com/GenderGuesser.php) Basically, it's a word analysis program that takes large amounts of text someone writes and tries to determine their gender based on their use of language.

I hardly think it's very accurate, but still fun. It is supposed to work better the longer the text you enter.

My results, based on a short paper I recently wrote for college:

Code: [Select]
Total words: 1420

Genre: Informal
  Female = 1698
  Male   = 3633
  Difference = 1935; 68.14%
  Verdict: MALE

Genre: Formal
  Female = 1549
  Male   = 2640
  Difference = 1091; 63.02%
  Verdict: MALE

(For the record, I am male.)
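
For anyone wanting to check the arithmetic in the output above: the percentage appears to be the Male score divided by the sum of both scores, cut off (not rounded) at two decimals. That is inferred from the numbers posted in this thread, not taken from the tool's source. The 40/60 cut-offs for a "Weak" verdict are likewise only a guess that fits the posted results, and the function name is hypothetical. A minimal Python sketch under those assumptions:

```python
def summarize(female, male):
    """Return (difference, male share in percent, verdict) for one genre."""
    total = female + male
    if total == 0:
        raise ValueError("no scored words, nothing to compare")
    difference = male - female
    # Integer division truncates to two decimals, which matches e.g. 68.14%.
    percent = (male * 10000 // total) / 100
    # Assumed thresholds: above 60 or below 40 counts as a firm verdict.
    if percent > 60:
        verdict = "MALE"
    elif percent > 50:
        verdict = "Weak MALE"
    elif percent >= 40:
        verdict = "Weak FEMALE"
    else:
        verdict = "FEMALE"
    return difference, percent, verdict


print(summarize(1698, 3633))
print(summarize(1549, 2640))
```

What the tool does at exactly 40%, 50% or 60% cannot be told from the posted results, so those boundary cases are arbitrary here.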
Title: Re: Gender Guesser
Post by: watsisname on July 21, 2011, 06:01:22 am
Amusing, it says I'm 70% MALE in informal writing, but 43% WEAK FEMALE in formal writing...
Apparently this also could be an indication that I'm secretly European. 
:shaking:
Title: Re: Gender Guesser
Post by: Klaustrophobia on July 21, 2011, 01:03:55 pm
I could probably make this thing explode with one of my lab reports or my senior design paper.
Title: Re: Gender Guesser
Post by: Colonol Dekker on July 21, 2011, 01:06:47 pm
Make Battuta use this.
Title: Re: Gender Guesser
Post by: Snail on July 21, 2011, 01:14:43 pm
Quote
Make Battuta use this.

hahahahaha
Title: Re: Gender Guesser
Post by: NGTM-1R on July 21, 2011, 01:17:31 pm
Inputting some of my fiction gave the complete range of results depending on what I gave it.

The only thing I can tell that actually changed, though, was that the Female bit failed the Bechdel Test. Which doesn't say good things about whoever wrote this.
Title: Re: Gender Guesser
Post by: LordPomposity on July 21, 2011, 01:22:43 pm
Quote
Make Battuta use this.

Inputting his Cyclops bug report:

Genre: Informal
  Female = 1043
  Male   = 1620
  Difference = 577; 60.83%
  Verdict: MALE

Genre: Formal
  Female = 658
  Male   = 857
  Difference = 199; 56.56%
  Verdict: Weak MALE

Weak emphasis could indicate European.
Title: Re: Gender Guesser
Post by: Luis Dias on July 21, 2011, 03:16:59 pm
Quote
Weak emphasis could indicate European.

Because Europeans are effeminate.
Title: Re: Gender Guesser
Post by: LordPomposity on July 21, 2011, 03:18:24 pm
For the record that's part of the program's output, not my commentary.
Title: Re: Gender Guesser
Post by: Rodo on July 21, 2011, 03:40:12 pm
well at least I'm not a female :P, but weak male??

I need to start going to the gym again.
Title: Re: Gender Guesser
Post by: StarSlayer on July 21, 2011, 03:47:41 pm
I was assuming there was going to be a bunch of Thai hookers and you had to guess which one was naturally a woman.
Title: Re: Gender Guesser
Post by: Commander Zane on July 21, 2011, 03:59:09 pm
And if you lost you had to drink liposuction fluid?
Title: Re: Gender Guesser
Post by: Scourge of Ages on July 21, 2011, 04:47:02 pm
I input a short story I wrote from a first-person perspective, as a female character. The results are:

Code: [Select]
Genre: Informal
  Female = 2817
  Male   = 7696
  Difference = 4879; 73.2%
  Verdict: MALE

Genre: Formal
  Female = 3823
  Male   = 4104
  Difference = 281; 51.77%
  Verdict: Weak MALE
Weak emphasis could indicate European.

I'm not sure what specifically that indicates.
Title: Re: Gender Guesser
Post by: Thaeris on July 21, 2011, 04:53:25 pm
Quote
well at least I'm not a female :P, but weak male??

I need to start going to the gym again.

For the record, at some level everyone can AT LEAST be classified as "weak female." :p
Title: Re: Gender Guesser
Post by: Mr. Vega on July 21, 2011, 05:55:21 pm
I put in an article by Ursula K. Le Guin (maybe one of the most feminine writers ever) and it gave me 70% male. Some guesser.
Title: Re: Gender Guesser
Post by: Luis Dias on July 21, 2011, 06:10:37 pm
Quote
Quote
well at least I'm not a female :P, but weak male??

I need to start going to the gym again.

For the record, at some level everyone can AT LEAST be classified as "weak female." :p

or "european".... I'm slightly pissed off at the innuendo there if no one else noticed...
Title: Re: Gender Guesser
Post by: Pred the Penguin on July 21, 2011, 07:11:28 pm
These are the results I got. I don't have anything longer on hand that might give a more accurate result... oh well. :blah:
Code: [Select]
Total words: 1055

Genre: Informal
  Female = 1881
  Male   = 2207
  Difference = 326; 53.98%
  Verdict: Weak MALE

Weak emphasis could indicate European.

Genre: Formal
  Female = 1030
  Male   = 1610
  Difference = 580; 60.98%
  Verdict: MALE
Title: Re: Gender Guesser
Post by: Polpolion on July 21, 2011, 09:50:26 pm
this is one of Battuta's short stories he posted on here a while ago, think it was about space tube babies and spaceships and cyborgs or something. Just for kicks. :p

Code: [Select]
Total words: 9028

Genre: Informal
  Female = 9173
  Male   = 15914
  Difference = 6741; 63.43%
  Verdict: MALE

Genre: Formal
  Female = 10705
  Male   = 10883
  Difference = 178; 50.41%
  Verdict: Weak MALE

Weak emphasis could indicate European.


and myself, from a term paper I wrote on Armor back in high school

Code: [Select]
Total words: 6799

Genre: Informal
  Female = 7449
  Male   = 16237
  Difference = 8788; 68.55%
  Verdict: MALE

Genre: Formal
  Female = 6695
  Male   = 10802
  Difference = 4107; 61.73%
  Verdict: MALE

Though all of my short stories that I've run through it seem to come up weak male in formal. Not too surprising, as my narrative writing is understandably different from my expository writing.
Title: Re: Gender Guesser
Post by: Mongoose on July 22, 2011, 12:51:21 am
I wonder penis what would happen penis if you wrote "penis" every five words penis.
Title: Re: Gender Guesser
Post by: Scourge of Ages on July 22, 2011, 01:04:31 am
Quote
I wonder penis what would happen penis if you wrote "penis" every five words penis.

Code: [Select]
Verdict: You're apparently a 13 year old weak male
Title: Re: Gender Guesser
Post by: Pred the Penguin on July 22, 2011, 01:54:17 am
Code: [Select]
Total words: 15
Too few words.  Try 300 words or more.

Genre: Informal
  Female = 0
  Male   = 25
  Difference = 25; 100%
  Verdict: MALE

Genre: Formal
  Female = 47
  Male   = 35
  Difference = -12; 42.68%
  Verdict: Weak FEMALE

Weak emphasis could indicate European.
:lol:
Title: Re: Gender Guesser
Post by: watsisname on July 22, 2011, 04:18:52 am
OH SNAP, ANOTHER WEAK FEMALE!  GET OVER HERE, WE GOTTA COMPARE OUR EUROPEAN SHOULDERBAGS!

It's like it's got this hint of sexism but you can't quite pinpoint exactly where!
Title: Re: Gender Guesser
Post by: Flipside on July 22, 2011, 04:27:58 am
Oh nice, a program that manages to combine racial and sexual stereotypes in one easy bundle....
Title: Re: Gender Guesser
Post by: Enigmatic Entity on July 22, 2011, 07:53:05 am
I think I get a formal average of 75% male and informal percentage of 50% :p based on lab reports and silly little stories and lists that I have written...
Title: Re: Gender Guesser
Post by: Nuke on July 22, 2011, 09:13:52 am
so i took a sample out of one of my old college reports and tested it. mind you this is circa 2002. and here are the results:

Total words: 1093

Genre: Informal
  Female = 1121
  Male   = 2375
  Difference = 1254; 67.93%
  Verdict: MALE

Genre: Formal
  Female = 1158
  Male   = 1748
  Difference = 590; 60.15%
  Verdict: MALE

not satisfied i looked at my most recent forum posts, found the biggest one on the first page and got this:

Total words: 472

Genre: Informal
  Female = 788
  Male   = 1078
  Difference = 290; 57.77%
  Verdict: Weak MALE

Weak emphasis could indicate European.

Genre: Formal
  Female = 496
  Male   = 470
  Difference = -26; 48.65%
  Verdict: Weak FEMALE

Weak emphasis could indicate European.

the first sample was obviously formal, since it got me an A in my psychology 101 class. the second sample was just a long forum post on gd from one of the recent space threads (so slightly technical), sans spelling errors. so i'd call it informal. using the right genre preference, i'd say their algorithms are adequate though far from perfect.

Title: Re: Gender Guesser
Post by: Lester on July 22, 2011, 10:59:32 am
Quote
Weak emphasis could indicate European.

Code: [Select]
Total words: 481

Genre: Informal
  Female = 558
  Male   = 877
  Difference = 319; 61.11%
  Verdict: MALE

Genre: Formal
  Female = 183
  Male   = 591
  Difference = 408; 76.35%
  Verdict: MALE

Not bad for a European, eh?

Conclusive proof that EU >>> US. Both in masculinity and otherwise.
Title: Re: Gender Guesser
Post by: Nuke on July 22, 2011, 11:12:12 am
for the record i am not european.
Title: Re: Gender Guesser
Post by: Colonol Dekker on July 22, 2011, 11:18:38 am
Proud English.
Title: Re: Gender Guesser
Post by: -Sara- on July 22, 2011, 01:14:21 pm
When I wrote something simple and normal about daily things, I get this:

Code: [Select]
Genre: Informal
  Female = 725
  Male   = 285
  Difference = -440; 28.21%
  Verdict: FEMALE

Genre: Formal
  Female = 676
  Male   = 350
  Difference = -326; 34.11%
  Verdict: FEMALE

But if I use more serious text about games or aliens, like what I wrote about the Shivans in the BP forum, it says something else again. So I'm not sure about these silly things. I guess it gives words a score or whatever, which then says how likely it is that a guy or a girl wrote it. If I write formally, I write the way my old teachers at school told me to write. And of course, because Dutch standards for learning English were not so high back then, I used what I read on forums, saw on television and saw in games. Having to write theses for college and art school also makes you look at native English texts to get your sentences just right. A good example is the teacher I had for the last few years, who was a Briton himself. He saw I had a keen interest in improving my English and asked if I wanted to go a step further by learning to use British English. I say colour and not color, and Americans and Europeans raise eyebrows when I pronounce the word 'either', while the British crack a smile and give a thumbs up. It sure shows that when you are not English, you go with what you learned. Monkey see, monkey do?

Taking a text from my post history that I wrote, trying to be a lot more formal, gives:

Code: [Select]
Genre: Informal
  Female = 2628
  Male   = 7291
  Difference = 4663; 73.5%
  Verdict: MALE

Genre: Formal
  Female = 3926
  Male   = 4735
  Difference = 809; 54.67%
  Verdict: Weak MALE

Weak emphasis could indicate European.

So I say, shenanigans! :P I might be an alternate rock baitch, but I'm sure not butch, lol.

So here comes the interesting part.. I tried playing around with that tool a bit! And the results? Amazing(ly wrong), but really funny too! Some really funny facts I collected by playing around with the tool  :lol::


So I guess that debunks it. :P It embraces sexism to the point that everything written by someone with not such a high IQ is considered female, while formal and elaborate texts are rated as male! And that by an Israeli university? We must release Herr Battuta on this topic at once! :D
Title: Re: Gender Guesser
Post by: Snail on July 22, 2011, 04:56:34 pm
Code: [Select]
Total words: 933

Genre: Informal
  Female = 180
  Male   = 2607
  Difference = 2427; 93.54%
  Verdict: MALE

Genre: Formal
  Female = 888
  Male   = 1406
  Difference = 518; 61.29%
  Verdict: MALE
Title: Re: Gender Guesser
Post by: Luis Dias on July 22, 2011, 05:41:37 pm
.....and to call europeans "WEAK FEMALES". I mean just gimme a gun and an address, will you?
Title: Re: Gender Guesser
Post by: MP-Ryan on July 22, 2011, 11:06:58 pm
Am I the only one who actually opened the cited paper that spawned this thing?  If one looks at the abstract...

Quote
This paper explores differences between male and female writing in a large subset of the
British National Corpus covering a range of genres. Several classes of simple lexical and
syntactic features that differ substantially according to author gender are identified, both in
fiction and in non-fiction documents. In particular, we find significant differences between
male- and female-authored documents in the use of pronouns and certain types of noun
modifiers: although the total number of nominals used by male and female authors is virtually
identical, females use many more pronouns and males use many more noun specifiers. More
generally, it is found that even in formal writing, female writing exhibits greater usage of
features identified by previous researchers as "involved" while male writing exhibits greater
usage of features which have been identified as "informational".  Finally, a strong correlation
between the characteristics of male (female) writing and those of nonfiction (fiction) is
demonstrated. 

...one will find the experimentally-derived rationale for the formula.  Don't criticize the tool for sexism; it's merely reporting on writing tendency correlations.  It has nothing whatsoever to do with IQ, stereotypes, or sexism.

(As an aside, it is unsurprising that much formal/technical writing returns a male result as it tends to be used in fields that, until very recently, were/are dominated by men.  Successful women tend to emulate male writing styles in those scenarios, and in doing so conform to the expectations of the reader.  Nowhere is this more evident than in journalism - a point the paper touches on).  As for weak emphasis indicating European, non-native English speakers should show fewer markers in general because they tend to think in their own native language - most of which actually incorporate gender directly into several aspects of language (while English generally does not).  French is an excellent example.

But don't let me get in the way of silly outrage, please do continue. *eyeroll*
Title: Re: Gender Guesser
Post by: Nuke on July 22, 2011, 11:11:42 pm
this implementation may be bogus but it makes you think that this is a thing computers can do given the proper algorithms. no doubt the basis of these algorithms is stereotypical data. you would have to do a lot more research into how males and females use language, and identify other cues than simply whether words used were weighted male or female. take it a step further and you could probably identify country of origin as well. i'd bet the cia has something that can do this.
Title: Re: Gender Guesser
Post by: MP-Ryan on July 22, 2011, 11:15:12 pm
Quote
no doubt the basis of these algorithms is stereotypical data.

Looks observationally-derived from a scientific methodology to me.  See my post above.
Title: Re: Gender Guesser
Post by: Nuke on July 22, 2011, 11:17:31 pm
Quote
Quote
no doubt the basis of these algorithms is stereotypical data.

Looks observationally-derived from a scientific methodology to me.  See my post above.

sorry we posted at exactly the same time there. and i wasnt gonna edit :D
Title: Re: Gender Guesser
Post by: MP-Ryan on July 22, 2011, 11:21:16 pm
Quote
sorry we posted at exactly the same time there. and i wasnt gonna edit :D

Fair enough. =)
Title: Re: Gender Guesser
Post by: Luis Dias on July 25, 2011, 06:45:06 am
It mostly seems statistical rubbish to me, conveyed to demonstrate the stereotypes that the researcher already had in his misogynistic mind.
Title: Re: Gender Guesser
Post by: MP-Ryan on July 25, 2011, 09:49:21 am
Quote
It mostly seems statistical rubbish to me, conveyed to demonstrate the stereotypes that the researcher already had in his misogynistic mind.

Maybe you should have read the paper before leaping to conclusions.  If you had, you'd see that the principle is based on 30 years of research and that the methodology used here is based on rigorous AUTOMATED (i.e. no human input) statistical analysis of published documents (including journal articles), half from writers of each gender, from a reputable source collection (British National Corpus).  Page 5 of the PDF, if you'd like to stop making silly assertions without a shred of evidence.

Or carry on, whichever.
Title: Re: Gender Guesser
Post by: Luis Dias on July 25, 2011, 10:54:53 am
Every statistical analysis is "automated" in the sense of being a computation, so I really don't get where you are going there with that word. The paper is fine and dandy, in that sense. It would only get interesting if it really predicted anything outside of its own database.
Title: Re: Gender Guesser
Post by: Enigmatic Entity on July 25, 2011, 12:24:09 pm
Your little deviations produce this result, you guys:

Genre: Informal
  Female = 107
  Male   = 317
  Difference = 210; 74.76%
  Verdict: MALE

Genre: Formal
  Female = 230
  Male   = 160
  Difference = -70; 41.02%
  Verdict: Weak FEMALE
Title: Re: Gender Guesser
Post by: MP-Ryan on July 25, 2011, 02:24:03 pm
Quote
Every statistical analysis is "automated" in the sense of being a computation, so I really don't get where you are going there with that word. The paper is fine and dandy, in that sense. It would only get interesting if it really predicted anything outside of its own database.

The statistical analysis was automated in the sense that a human being did not pick what things identified "male" or "female" writing.  The tool is BASED on the methodology and algorithm used in the paper - so it IS predicting outside of the database used to create it.


Read.  The.  Paper.
Title: Re: Gender Guesser
Post by: Polpolion on July 25, 2011, 03:35:10 pm
MP-Ryan is completely correct. If you even skim over pages 5-6 of the paper you'll see that the data was gathered quite objectively. An examination of the source code of the tool reveals its operation to be really quite simple: It splits the input into a list of words, and then searches that list for occurrences of particular words which have been determined to be useful (i.e. statistically viable) in distinguishing texts written by one gender from texts written by the other gender. Based on the word, it adds a certain weight to a score for either informal or formal words. Words suggesting a male author have a positive weight and words suggesting a female author have a negative weight, and the absolute value obviously denotes how strongly it suggests its respective gender. The algorithm the tool uses really has little to do with the algorithm used to determine the weights of each word, merely being an application of the findings of the study that used the algorithm. In fact, the tool on the website could have been written by a first year programming student.
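
The procedure described above can be sketched in a few lines of Python. The word lists and weights here are invented placeholders, not the tool's actual values, and the names `WEIGHTS` and `score_text` are hypothetical. Only the overall shape (split into words, look each one up, add its weight to a male or female total per genre) follows the description.

```python
import re

# Placeholder weights, invented for this sketch (NOT the tool's real values).
# Positive weight = suggests a male author, negative = a female author.
WEIGHTS = {
    "informal": {"the": 7, "is": 8, "with": -5, "her": -9},
    "formal": {"the": 6, "with": -4},
}


def score_text(text, weights):
    """Return (female_total, male_total) for one genre's weight table."""
    female = male = 0
    # Lowercase the text and split it into simple word tokens.
    for word in re.findall(r"[a-z']+", text.lower()):
        weight = weights.get(word, 0)  # words not in the table score nothing
        if weight > 0:
            male += weight
        else:
            female += -weight
    return female, male


sample = "The report is with her"
for genre, table in WEIGHTS.items():
    female, male = score_text(sample, table)
    print(genre, "Female =", female, "Male =", male)
```

Keeping one weight table per genre mirrors the separate Informal and Formal results the tool prints. The tokenizer is deliberately crude and may split words differently from the real tool.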
Title: Re: Gender Guesser
Post by: Dragon on July 25, 2011, 06:55:49 pm
TBH, I never expected this thing to work. If anybody could guess the writer's gender based on one example of his/her writing, it'd be a human psychologist who worked on that for these 30 years, and even then, he/she wouldn't be 100% successful. I tend to be skeptical about both statistics and psychology, and this is "statistical psychology", conducted by a machine to boot. In order for a computer to handle this, it'd need to have a real artificial intelligence (and even then, it'd still make mistakes from time to time).
Title: Re: Gender Guesser
Post by: Polpolion on July 25, 2011, 08:12:13 pm
Well if you're still skeptical of the study, you've got a tool here that can confirm its results. Even disregarding any support shown here, I'd bet money that it's right a fair amount more often than it's wrong.
Title: Re: Gender Guesser
Post by: NGTM-1R on July 25, 2011, 08:27:15 pm
I have to admit, I'm likewise fairly convinced that if you present it truthful data it will likely return a truthful answer.

Except about being European. That it's got a problem with.
Title: Re: Gender Guesser
Post by: Scourge of Ages on July 26, 2011, 12:18:26 am
I could never stomach reading papers like this, so if somebody (Ryan) who did could sum up: what is with the European thing?
Did they find that European writers tend to use more of the same words regardless of whether they were male or female?
Title: Re: Gender Guesser
Post by: Dark Hunter on July 26, 2011, 12:30:43 am
What I suspect is that the software was written to analyze American English. Europeans use the language a bit differently, so they're acknowledging that that might mess with its results.
Title: Re: Gender Guesser
Post by: Droid803 on July 26, 2011, 01:00:44 am
Code: [Select]
Total words: 1439

Genre: Informal
  Female = 346
  Male   = 4845
  Difference = 4499; 93.33%
  Verdict: MALE

Genre: Formal
  Female = 1048
  Male   = 2498
  Difference = 1450; 70.44%
  Verdict: MALE


Gave it the first few paragraphs of a research essay I wrote for a uni class on eroge/anime.
Yeah, I did a lot of what it said males generally do. :/

Guess a paper on the wonders of 2D women is manly.
Title: Re: Gender Guesser
Post by: S-99 on July 26, 2011, 05:04:06 am
(http://img36.imageshack.us/img36/5169/guesser.png)
Very interesting this gender guesser. I'm obviously not that inspired or caring to type more into it. Just wanted to see what it would do with such blatancy. I've had my fun.
Title: Re: Gender Guesser
Post by: Mikes on July 26, 2011, 10:33:22 am
So i typed the word "sex" in 300 times (hey it said 300 words minimum!!) and it couldn't decide the gender... mmmh ;)
Title: Re: Gender Guesser
Post by: Pred the Penguin on July 26, 2011, 11:00:57 am
It'll basically read that as 1 word...
Title: Re: Gender Guesser
Post by: Mongoose on July 26, 2011, 12:13:12 pm
You could try the lyrics to "I Just Had Sex" instead.
Title: Re: Gender Guesser
Post by: LordPomposity on July 26, 2011, 12:27:19 pm
What you gon' do with all that junk?
All that junk inside your trunk?
I'ma get, get, get, get, you drunk,
Get you love drunk off my hump.
My hump, my hump, my hump, my hump, my hump,
My hump, my hump, my hump, my lovely little lumps (Check it out)

I drive these brothers crazy,
I do it on the daily,
They treat me really nicely,
They buy me all these ices.
Dolce & Gabbana,
Fendi and that Donna
Karan, they be sharin'
All their money got me wearin' fly
Brother I ain't askin,
They say they love my ass ‘n,
Seven Jeans, True Religion's,
I say no, but they keep givin'
So I keep on takin'
And no I ain't taken
We can keep on datin'
I keep on demonstrating.

My love (love), my love, my love, my love (love)
You love my lady lumps (love),
My hump, my hump, my hump (love),
My humps they got you,

She's got me spending.
(Oh) Spendin' all your money on me and spending time on me.
She's got me spendin'.
(Oh) Spendin' all your money on me, up on me, on me

What you gon' do with all that junk?
All that junk inside that trunk?
I'ma get, get, get, get, you drunk,
Get you love drunk off my hump.
What you gon' do with all that ass?
All that ass inside them jeans?
I'm a make, make, make, make you scream
Make you scream, make you scream.
Cos of my hump (ha), my hump, my hump, my hump (what).
My hump, my hump, my hump (ha), my lovely lady lumps (Check it out)

I met a girl down at the disco.
She said hey, hey, hey yea let's go.
I could be your baby, you can be my honey
Let's spend time not money.
I mix your milk wit my cocoa puff,
Milky, milky cocoa,
Mix your milk with my cocoa puff, milky, milky riiiiiiight.

They say I'm really sexy,
The boys they wanna sex me.
They always standing next to me,
Always dancing next to me,
Tryin' a feel my hump, hump.
Lookin' at my lump, lump.
You can look but you can't touch it,
If you touch it I'ma start some drama,
You don't want no drama,
No, no drama, no, no, no, no drama
So don't pull on my hand boy,
You ain't my man, boy,
I'm just tryn'a dance boy,
And move my hump.

My hump, my hump, my hump, my hump,
My hump, my hump, my hump, my hump, my hump, my hump.
My lovely lady lumps (lumps)
My lovely lady lumps (lumps)
My lovely lady lumps (lumps)
In the back and in the front (lumps)
My lovin' got you,

She's got me spendin'.
(Oh) Spendin' all your money on me and spending time on me.
She's got me spendin'.
(Oh) Spendin' all your money on me, up on me, on me.

What you gon' do with all that junk?
All that junk inside that trunk?
I'ma get, get, get, get you drunk,
Get you love drunk off my hump.
What you gon' do with all that ass?
All that ass inside them jeans?
I'ma make, make, make, make you scream
Make you scream, make you scream.
What you gon' do with all that junk?
All that junk inside that trunk?
I'ma get, get, get, get you drunk,
Get you love drunk off this hump.
What you gon' do wit all that breast?
All that breast inside that shirt?
I'ma make, make, make, make you work
Make you work, work, make you work.

(A-ha, a-ha, a-ha, a-ha) [x4]

She's got me spendin'.
(Oh) Spendin' all your money on me and spendin' time on me
She's got me spendin'.
(Oh) Spendin' all your money on me, up on me, on me.


Code: [Select]
Genre: Informal
  Female = 292
  Male   = 232
  Difference = -60; 44.27%
  Verdict: Weak FEMALE

Weak emphasis could indicate European.

Genre: Formal
  Female = 845
  Male   = 430
  Difference = -415; 33.72%
  Verdict: FEMALE

Maybe they're on to something.
Title: Re: Gender Guesser
Post by: watsisname on July 27, 2011, 04:34:16 am
baaahahahahahaha