
scientio has got something - using 'concept structures'
Computer Science | Linguistics | Computational Linguistics | Natural Language Processing | Concept Mapping | Document Similarity | WordNet
Edmonds. 2007. Using concept structures for efficient document comparison and location. Conference Proceeding
I was so pleased to finally (and serendipitously, I might add) find a computer science article that describes what I was trying to do with my master's work, from outside the discipline.
This is a quick read. Spells out terms very clearly for non-adepts, so I found it to be quite accessible.

what are 'n-grams'
Information Technology | Computational Linguistics | Natural Language Processing | N-grams
From Wikipedia: "An n-gram is a sub-sequence of n items from a given sequence." So the scope and granularity of their application matters greatly!
At the 'word-level', n-grams are constituted by successive groups of n words.
N-grams can be used (as in NLP) for 'efficient approximate matching.'...
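To make the word-level case concrete, here is a minimal sketch in Python. The function names word_ngrams and ngram_overlap are my own, and the Jaccard overlap is only one common way to do approximate matching with n-grams, assumed here for illustration rather than taken from the Wikipedia article.

```python
def word_ngrams(text, n):
    """Return the successive groups of n words in text, as tuples."""
    words = text.split()
    # A text with fewer than n words yields an empty list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_overlap(a, b, n=2):
    """Jaccard overlap of the n-gram sets of two texts (illustrative choice)."""
    grams_a = set(word_ngrams(a, n))
    grams_b = set(word_ngrams(b, n))
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    print(word_ngrams("the quick brown fox", 2))
    print(ngram_overlap("the quick brown fox", "the quick red fox"))
```

Changing n changes the granularity: with n=1 the two example sentences share most of their words, while with n=2 only one word pair survives the comparison.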

visualizing textual clusters
Information Design | Methodology | Natural Language Processing | Statistics | Categories | Classification | Clustering | Evaluation | Visualization
I started my search on Google Scholar to identify recent articles on this topic.
"visualizing text clusters" produced 0 hits.
"visualizing clusters" produced 120 hits. (32 post 2006)
Found one great one:
Chen, K. & Liu, L. (2006). iVIBRATE: Interactive visualization-based framework for clustering large datasets. ACM Transactions on Information Systems (TOIS), 24, 245-294.
Georgia Tech seems to put out a lot of good stuff.
Then I decided to change tack, to see whether 'text content' or document clustering comes up:

reading for gist
Cognitive Science | Computer Science | Information Science | Computational Linguistics | Natural Language Processing | Gist | Reading | Visualization
Got to get going on my IAT 814 and 802 term projects.
Started with the idea of reading for gist, a model principally from
O'Halloran, K. (2003). Critical discourse analysis and language cognition. Edinburgh: Edinburgh University Press.
Wanted to get a deeper view on the concept, so did a Google Scholar search using combined SFUBC proxies and VPNs.
Found about 32 articles with the following strategy:
"reading for gist" -esl -teacher
Downloaded (and meta-scraped) 11...but only about 7 from this search.

Review of VisualText
Computer Science | Linguistics | Natural Language Processing | reviews | software

tex.tuals process
Information Science | Information Technology | Linguistics | Philosophy | Computational Linguistics | Natural Language Processing | Ontology | Argument Recognition | Context | Definition Recognition
As I read through a text (I've remarked on the paper-to-digital conversion process elsewhere) I want to be able to highlight and capture whole swaths of text.
To repurpose my captured snippets (bits, fits, blobs, fragments, portions), it is critically important that I am able to recontextualize them easily.
How can this recontextualization happen?
First, the bibliographic information must be embedded in each bit.
Also, its relative location in the linear flow of the text must be recorded, so as to be able to quickly pull up various degrees of context around the bit.
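As a rough sketch of those two requirements, a captured bit could carry its citation plus its character offsets in the source text. Everything named here (the Snippet class, its fields, the margin parameter) is hypothetical and only illustrates the idea; it is not how tex.tuals actually stores anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    citation: str  # bibliographic information embedded in the bit
    start: int     # character offset where the bit begins in the source text
    end: int       # character offset just past the end of the bit

    def text(self, source: str) -> str:
        """Return the captured bit itself."""
        return source[self.start:self.end]

    def context(self, source: str, margin: int) -> str:
        """Return the bit plus up to margin characters on either side."""
        lo = max(0, self.start - margin)
        hi = min(len(source), self.end + margin)
        return source[lo:hi]


if __name__ == "__main__":
    source = "alpha beta gamma delta epsilon"
    bit = Snippet(citation="O'Halloran (2003)", start=11, end=16)
    print(bit.text(source))
    print(bit.context(source, margin=5))
```

Because only offsets are stored, asking for a larger margin pulls up a wider degree of context without re-capturing anything, as long as the source text is still at hand.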