{"id":479,"date":"2018-09-20T16:12:00","date_gmt":"2018-09-20T16:12:00","guid":{"rendered":"https:\/\/sites.library.ualberta.ca\/collections-information\/?p=479"},"modified":"2020-02-14T16:13:04","modified_gmt":"2020-02-14T16:13:04","slug":"text-and-data-mining-now-available-at-hathitrust-research-center","status":"publish","type":"post","link":"https:\/\/sites.library.ualberta.ca\/collections-information\/2018\/09\/20\/text-and-data-mining-now-available-at-hathitrust-research-center\/","title":{"rendered":"Text and Data Mining Now Available at HathiTrust Research Center"},"content":{"rendered":"\n<p>We recently received news about the\u00a0<a href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Fwiki.htrc.illinois.edu%2Fdisplay%2FCOM%2FGetting%2Bstarted&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEj6xqhiXyPp53s-OF26UO4-2PtPw\">HathiTrust Research Center<\/a>\u00a0(HTRC), which has been developing services and tools that allow researchers to apply text and data mining methods to the HathiTrust collection. Until now, these services have been available only for the portion of the collection that is out of copyright. 
With the development of a landmark HathiTrust policy and an updated release of\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__hathitrust.us14.list-2Dmanage.com_track_click-3Fu-3Dc9252a443583eb73bf936f046-26id-3D9554dfbead-26e-3D1405f303f7%26d%3DDwMFaQ%26c%3Dl45AxH-kUV29SRQusp9vYR0n1GycN4_2jInuKy6zbqQ%26r%3DnZrbeKxKvMFmlVPhvU-N-SyQiICHw5mybD0jOovXhp0%26m%3Dk-ItkGqEB9LX9TwKKwef2UooOoQD58AEERPWqjkQHpQ%26s%3DCo5Cuw0ltkgbNLOUqjuCmSony2iOWxpKO9AxM292HD4%26e%3D&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNH3tQKB3xv6pUBh4j2h9tcGSfSROg\" target=\"_blank\">HTRC Analytics<\/a>,\u00a0<em>HTRC now provides access to the text of the complete 16.7-million-item HathiTrust corpus for non-consumptive research, such as data mining and computational analysis, including items protected by copyright.<\/em><\/p>\n\n\n\n<p>\n\nThis extraordinary opportunity to use copyrighted materials for non-consumptive research purposes expands research access to the entire HathiTrust digital collection, which is sustained by HathiTrust\u2019s 140+ member libraries. 
Your community may access HTRC\u2019s easy-to-use computational tools ideal for beginners, as well as more complex tools to meet advanced data analysis needs.<br><a href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__hathitrust.us14.list-2Dmanage.com_track_click-3Fu-3Dc9252a443583eb73bf936f046-26id-3Da623d83983-26e-3D1405f303f7%26d%3DDwMFaQ%26c%3Dl45AxH-kUV29SRQusp9vYR0n1GycN4_2jInuKy6zbqQ%26r%3DnZrbeKxKvMFmlVPhvU-N-SyQiICHw5mybD0jOovXhp0%26m%3Dk-ItkGqEB9LX9TwKKwef2UooOoQD58AEERPWqjkQHpQ%26s%3DhRSCRUYK0IU6A6XoLLnC5eQ3pfwyAvPthMUoZS4eOHc%26e%3D&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNH1ItVw55znoV9qNvmV-kLZeGlR3Q\" target=\"_blank\" rel=\"noreferrer noopener\">HTRC Algorithms<\/a>: a set of tools for assembling collections of digitized text from the HathiTrust corpus and performing text analysis on them.&nbsp;<em>Including copyrighted items for ALL USERS.<\/em><\/p>\n\n\n\n<p><a href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__hathitrust.us14.list-2Dmanage.com_track_click-3Fu-3Dc9252a443583eb73bf936f046-26id-3Dab2be35653-26e-3D1405f303f7%26d%3DDwMFaQ%26c%3Dl45AxH-kUV29SRQusp9vYR0n1GycN4_2jInuKy6zbqQ%26r%3DnZrbeKxKvMFmlVPhvU-N-SyQiICHw5mybD0jOovXhp0%26m%3Dk-ItkGqEB9LX9TwKKwef2UooOoQD58AEERPWqjkQHpQ%26s%3D5APvOZuQPzxtxjhPKbGKyx22HZVKEC1bGn54CDecNmo%26e%3D&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHxXhx_0yO1SmRw01TCz97DN9p-RA\" target=\"_blank\" rel=\"noreferrer noopener\">Extracted Features Dataset<\/a>: dataset allowing non-consumptive analysis on specific features extracted from the full text of the HathiTrust corpus.&nbsp;<em>Including copyrighted items for ALL USERS.<\/em><\/p>\n\n\n\n<p><a 
href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__hathitrust.us14.list-2Dmanage.com_track_click-3Fu-3Dc9252a443583eb73bf936f046-26id-3Dd7074eba5a-26e-3D1405f303f7%26d%3DDwMFaQ%26c%3Dl45AxH-kUV29SRQusp9vYR0n1GycN4_2jInuKy6zbqQ%26r%3DnZrbeKxKvMFmlVPhvU-N-SyQiICHw5mybD0jOovXhp0%26m%3Dk-ItkGqEB9LX9TwKKwef2UooOoQD58AEERPWqjkQHpQ%26s%3DZ-V4dE3OfPpOoZ4-hawyTedfPgzRGY0dFd_DlcJThgA%26e%3D&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNGtog3-Zd2BL89c5j0zpKJ-_nnsqQ\" target=\"_blank\" rel=\"noreferrer noopener\">HathiTrust+Bookworm<\/a>: a tool for visualizing and analyzing word usage trends in the HathiTrust corpus.&nbsp;<em>Including copyrighted items for ALL USERS.<\/em><\/p>\n\n\n\n<p><a href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__hathitrust.us14.list-2Dmanage.com_track_click-3Fu-3Dc9252a443583eb73bf936f046-26id-3D057a325cc1-26e-3D1405f303f7%26d%3DDwMFaQ%26c%3Dl45AxH-kUV29SRQusp9vYR0n1GycN4_2jInuKy6zbqQ%26r%3DnZrbeKxKvMFmlVPhvU-N-SyQiICHw5mybD0jOovXhp0%26m%3Dk-ItkGqEB9LX9TwKKwef2UooOoQD58AEERPWqjkQHpQ%26s%3Ddpi1jvpnpZC8I1lq-Ekn1ixsBvDbOKfZRIjUdnwShHQ%26e%3D&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHBHUFQU4_XV1KrLcKqK6_tEu3czg\" target=\"_blank\" rel=\"noreferrer noopener\">HTRC Data Capsule<\/a>: a secure computing environment for researcher-driven text analysis on the HathiTrust corpus. 
All users may access public domain items.&nbsp;<em>Access to copyrighted items is available ONLY to member-affiliated researchers.<\/em> For more information, visit the&nbsp;<a href=\"https:\/\/www.google.com\/url?q=https%3A%2F%2Fwiki.htrc.illinois.edu%2Fdisplay%2FCOM%2FGetting%2Bstarted&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEj6xqhiXyPp53s-OF26UO4-2PtPw\">HathiTrust Research Center wiki<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We recently received news about the\u00a0HathiTrust Research Center\u00a0(HTRC), which has been developing services and tools that allow researchers to apply text and data mining methods to the HathiTrust collection. Until now, these services have been available only for the portion of the collection that is out of copyright. With the development of a [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-479","post","type-post","status-publish","format-standard","hentry","category-uncategorized","clearfix"],"_links":{"self":[{"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/posts\/479","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/comments?post=479"}],"version-history":[{"count":1,"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/posts\/479\/revisions"}],"predecessor-version":[{"id":480,"href":"https:\/\/sites.library.ualb
erta.ca\/collections-information\/wp-json\/wp\/v2\/posts\/479\/revisions\/480"}],"wp:attachment":[{"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/media?parent=479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/categories?post=479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.library.ualberta.ca\/collections-information\/wp-json\/wp\/v2\/tags?post=479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}