Topics on this page include content and text mining as well as topic modeling.
Content mining is the practice of gathering a large corpus of text, data, or images from various sources into one place and running scripts on it to answer a research question.
Costs are incurred for working with some of this data. Please contact a UofSC librarian if you are interested.
|Adam Matthew||Adam Matthew makes available for USC Libraries
|Cambridge University Press||Cost negotiated per request||Contact USC Libraries to initiate the process.|
|Gale Primary Resources||Some free; downloading large datasets incurs costs||Contact USC Libraries to initiate the process. Gale Artemis: Primary Sources, which searches across 23 of our Gale primary source databases covering 1500-2012, has a Term Frequency search option and a Term Clusters viewer. To download large datasets, USC Libraries will have to request the data on your behalf from our Gale sales representative. Requests can take up to 3 weeks to process.|
|IEEE||Cost negotiated per request||Contact USC Libraries to initiate the process. The library facilitates access on a case-by-case basis through negotiation of the vendor license.|
|Newsbank||Costs incurred||Contact USC Libraries to initiate the process. Restrictions are in place; TDM research costs between $6,000 and $8,000, and requests can take 6 to 8 weeks to process.|
|Oxford University Press||Costs incurred||Contact USC Libraries to initiate the process. Researchers may use resources for non-commercial text mining. OUP also offers consultation services with technical project managers to assist in planning projects, including "avoidance of any technical safeguard triggers OUP has in place to protect stability and security of website."|
|ProQuest||Cost negotiated per request; TDM Studio platform available||Contact USC Libraries to initiate the process. ProQuest allows free text mining of the newspapers to which USC Libraries have purchased perpetual access licenses; USC Libraries will have to request this data on your behalf. TDM Studio (2019 platform) lets select researchers use ProQuest resources, including newspapers, for research, for a fee.|
|SpringerLink||Free (with subscription)||Users can download subscribed and open access content for TDM purposes directly from the SpringerLink platform. Content can be downloaded via a web browser or with an HTTP GET request using a scripting tool such as curl, wget and Python's urllib, among others. No API key or other authentication is required. TDM researchers are requested to be considerate and limit their downloading speed to a reasonable rate.|
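
As a rough illustration of the SpringerLink entry above, the following Python sketch downloads a list of documents with plain HTTP GET requests through `urllib` and pauses between requests to keep the download rate reasonable. The function names, the output file naming, the `User-Agent` string, and the 2-second default delay are all assumptions made for this example; SpringerLink does not prescribe them, and the URLs you pass in must point to content you are entitled to access.

```python
import time
import urllib.request
from pathlib import Path

# Hypothetical identifier for this example; replace with your own contact details.
USER_AGENT = "tdm-research-script (contact: your-email@example.edu)"


def fetch(url, timeout=30):
    """Perform a plain HTTP GET and return the response body as bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, delay_seconds=2.0, fetcher=fetch, sleep=time.sleep):
    """Download each URL in turn, waiting delay_seconds between requests.

    Files are saved as doc_0000.bin, doc_0001.bin, ... in out_dir
    (an arbitrary naming scheme chosen for this sketch).
    Returns the list of saved paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request after the first to limit the download rate.
            sleep(delay_seconds)
        data = fetcher(url)
        target = out_dir / f"doc_{index:04d}.bin"
        target.write_bytes(data)
        saved.append(target)
    return saved
```

Calling `download_all(urls, "corpus")` with your own list of article URLs would save the files into a `corpus` folder. The `fetcher` and `sleep` parameters exist only so the loop can be tried out without touching the network; a longer `delay_seconds` is the simplest way to be more considerate of the platform.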