{"id":808,"date":"2010-06-08T16:34:40","date_gmt":"2010-06-08T15:34:40","guid":{"rendered":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/?p=808"},"modified":"2010-06-08T16:34:40","modified_gmt":"2010-06-08T15:34:40","slug":"sort-advanced-text-mining-tools-and-resources-for-knowledge-discovery","status":"publish","type":"post","link":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/2010\/06\/sort-advanced-text-mining-tools-and-resources-for-knowledge-discovery\/","title":{"rendered":"SORT &#8211; Advanced text mining tools and resources for knowledge discovery"},"content":{"rendered":"<p>Penultimate session of the day &#8211; Sophia Ananiadou from <a href=\"http:\/\/www.nactem.ac.uk\/\">NaCTeM (National Centre for Text Mining)<\/a><\/p>\n<p>What is text mining? &#8211; takes us from text to knowledge.<\/p>\n<ul>\n<li>Yields precise knowledge nuggets from a sea of information -&gt; Knowledge Extraction<\/li>\n<li>Extraction of &#8216;named entities&#8217; &#8211; e.g. names of people, institution names, diseases, genes, etc. etc.<\/li>\n<li>Discovery of concepts allows semantic annotation and enrichment of documents &#8211; improves information access (goes beyond index terms) and allows clustering and classification of documents<\/li>\n<li>Extracts relationships, events and even opinions, attitudes etc. 
&#8211; for further semantic enrichment<\/li>\n<\/ul>\n<p>Need a toolkit:<\/p>\n<ul>\n<li>Resources &#8211; lexica, grammars, ontologies, databases<\/li>\n<li>Tools &#8211; parsers, taggers, named entity recognisers<\/li>\n<li>Annotated corpora<\/li>\n<li>Domain adaptation<\/li>\n<\/ul>\n<p>Sophia talking in a bit more detail about how you go about doing text mining:<\/p>\n<ul>\n<li>Start with syntactic analysis<\/li>\n<li>Use Named Entity Recognition to extract terms\/semantic entities<\/li>\n<li>Use parsers to extract other aspects &#8211; events, sentiments etc.<\/li>\n<\/ul>\n<p>All this allows the creation of annotations &#8211; semantic metadata.<\/p>\n<p>Some examples of text mining applications:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.nactem.ac.uk\/software\/kleio\/\">Kleio<\/a> (<a href=\"http:\/\/www.nactem.ac.uk\/software\/kleio\/\">http:\/\/www.nactem.ac.uk\/software\/kleio\/<\/a>)<\/li>\n<li><a href=\"http:\/\/www-tsujii.is.s.u-tokyo.ac.jp\/medie\/\">Medie<\/a> (<a href=\"http:\/\/www-tsujii.is.s.u-tokyo.ac.jp\/medie\/\">http:\/\/www-tsujii.is.s.u-tokyo.ac.jp\/medie\/<\/a>)<\/li>\n<li><a href=\"http:\/\/text0.mib.man.ac.uk\/software\/facta\/main.html\">Facta<\/a> (<a href=\"http:\/\/text0.mib.man.ac.uk\/software\/facta\/main.html\">http:\/\/text0.mib.man.ac.uk\/software\/facta\/main.html<\/a>)<\/li>\n<\/ul>\n<p>Sophia suggests we should be integrating &#8216;Language Technology&#8217; into open and common e-research infrastructure to enable the use of text mining tools on the content. See the <a href=\"http:\/\/www.nactem.ac.uk\/u-compare.php\">U-Compare<\/a> tool from NaCTeM &#8211; <a href=\"http:\/\/www.nactem.ac.uk\/u-compare.php\">http:\/\/www.nactem.ac.uk\/u-compare.php<\/a><\/p>\n<p><strong>Q &amp; A<\/strong><\/p>\n<p>Q: (David Flanders) If I was a repository manager which tool would you recommend I play with first?<\/p>\n<p>A: All of them! 
Need to work out what you want to do and pick the appropriate tool<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Penultimate session of the day &#8211; Sophia Ananiadou from NaCTeM (National Centre for Text Mining) What is text mining? &#8211; takes us from text to knowledge. Yields precise knowledge nuggets from a sea of information -&gt; Knowledge Extraction Extraction of &#8216;named entities&#8217; &#8211; e.g. names of people, institution names, diseases, genes, etc. etc. Discovery of concepts [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[46],"class_list":["post-808","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-sort2010"],"_links":{"self":[{"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/posts\/808","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/comments?post=808"}],"version-history":[{"count":2,"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/posts\/808\/revisions"}],"predecessor-version":[{"id":810,"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/posts\/808\/revisions\/810"}],"wp:attachment":[{"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/media?parent=808"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.meanboyfriend.com\/overdue_ideas\/wp-json\/wp\/v2\/categories?post=808"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.meanboy
friend.com\/overdue_ideas\/wp-json\/wp\/v2\/tags?post=808"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}