Tuesday, October 19, 2010

Text Analysis and Data Visualization Exercise Oct19

   W. W. Denslow's Three Bears (1903)

Three Bears is one of the world's oldest and most treasured fairy tales, and it has been passed on for nearly two centuries since its original narration in 1837. I chose it because it is a simple and familiar example, which is very helpful when learning new software. Using the text analysis and data visualization tool Voyeur Tools, it is simple to determine that the most commonly used subject term is "She" (19 times), followed by "Bear" (18 times). Now we can ask ourselves: is this story really about three bears, or is it about a girl? Scrolling down the list of frequently used words, we find that the plural "Bears" is used twelve times, while "Girl", as an alternative to "She", is used only twice. So far, we can gather that the story is going to be about bears. To cut down on the time and inefficiency of checking a word frequency list over and over, the best solution is to make a quick Wordle. Entering the text into a Wordle gives you the same word count results, but it also provides a handy visual that helps the reader see which words are used frequently, along with a general idea of how many times. I would say Voyeur Tools is an excellent program for text analysis, but I would also be wary of relying solely on it because of its lack of features.
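For anyone curious what a word frequency list involves, here is a minimal sketch in Python. It is not how Voyeur Tools or Wordle work internally; it only illustrates the basic idea of counting words. The function name `word_frequencies` and the sample sentence are made up for illustration and are not taken from Denslow's text.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word appears, ignoring case and punctuation."""
    # Lowercase everything so "She" and "she" are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Invented sample sentence (an assumption, not the story's actual wording).
sample = "She saw the bear. The bear saw her, and she ran from the three bears."

counts = word_frequencies(sample)

# Print the three most frequent words with their counts.
for word, n in counts.most_common(3):
    print(word, n)
```

Note that this simple approach counts "bear" and "bears" as separate words, which is the same behavior that made me check both entries in the frequency list.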


Next, I used ManyEyes and its many tool sets to further analyze the text. I started out with the word tree application. By entering a phrase in the search box, a user can find out not only how many times that phrase is used but also every passage that begins with the search term. The results are displayed in the form of a tree (seen below), and the user has the option to narrow the search by clicking the next word in a possible sentence. For example, if you search "three" you can then click "bears" to find every passage that starts with "three bears". Or you can click "beds" to find out what is so important about the three beds, or who they belong to. Next, I chose to search "she" because the Wordle showed it was the most frequent term. One might do this in order to learn everything there is to know about the main character. By clicking "She->Went" I could find out in seconds everywhere "she" (Goldilocks) goes. This type of technology would be ideal for the police. It could be used to screen a violent paroled convict's text messages and emails for unsafe terms like "drug", "meet", "hurt" or "beat", which would lead the police straight to potentially incriminating messages. This would be a great alternative to manually screening dozens of pages of text.
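The word tree idea can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not the ManyEyes implementation: the function name `word_tree`, the `width` parameter, and the sample text are all hypothetical, and the sketch ignores sentence boundaries for simplicity.

```python
import re
from collections import defaultdict


def word_tree(text, phrase, width=3):
    """Group the words that follow each occurrence of `phrase` by the next word.

    Returns a dict mapping each "next word" to the list of continuations
    (up to `width` words long) that start with it.
    """
    words = re.findall(r"[a-z']+", text.lower())
    key = phrase.lower().split()
    branches = defaultdict(list)
    # Stop early enough that at least one word follows the phrase.
    for i in range(len(words) - len(key)):
        if words[i:i + len(key)] == key:
            following = words[i + len(key): i + len(key) + width]
            branches[following[0]].append(" ".join(following))
    return dict(branches)


# Invented sample text (an assumption, not the story's actual wording).
sample = "She went into the house. She went up the stairs. She sat in the little chair."

tree = word_tree(sample, "she")
for next_word, continuations in tree.items():
    print(next_word, continuations)

# Narrowing the search, like clicking "She->Went" in the tool.
narrowed = word_tree(sample, "she went")
print(narrowed)
```

Searching for a longer phrase such as "she went" plays the same role as clicking the next word in the tree: it keeps only the branches that continue that way.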


Lastly, I used the phrase net tool:

The phrase net is unique, as it crosses the visuals of a word tree and a Wordle. It shows the frequency of the words through their size, while it also shows how the words are connected through arrows. The user can narrow the results by choosing which word the arrow should represent. For example, if you entered "the", it would yield every pair of words that has "the" between them (e.g., liked --- porridge, ran --- water, over --- woods). By doing this, you can find relationships hidden deep in a text, because the visual makes related word groupings clear (e.g., if you enter "girl" and get many results such as intelligent --- told, you can presume the girl is going to be smart). Though the phrase net is an effective combination of Wordle and word trees, the analysis is more focused when each of the tools is used individually.
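The pairing step behind a phrase net can be sketched in Python as well. Again, this is only my own illustration and not how ManyEyes does it: the function name `phrase_net` and the sample text are hypothetical, and the sketch only handles a single connecting word.

```python
import re
from collections import Counter


def phrase_net(text, connector):
    """Count every (left, right) word pair that has `connector` between them."""
    words = re.findall(r"[a-z']+", text.lower())
    link = connector.lower()
    edges = Counter()
    # Slide a three-word window over the text: left, middle, right.
    for left, middle, right in zip(words, words[1:], words[2:]):
        if middle == link:
            edges[(left, right)] += 1
    return edges


# Invented sample text (an assumption, not the story's actual wording).
sample = "She liked the porridge. She ran over the woods and liked the porridge again."

edges = phrase_net(sample, "the")
for (left, right), n in edges.most_common():
    print(left, "---", right, n)
```

In the real tool, each pair would become an arrow between two words, and the counts would set how large the words and arrows are drawn.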
