New methods for searching tobacco industry documents
Initial Award Abstract
Investigating new methods for searching and displaying data is crucial to dealing with the overwhelming amount of information in many tobacco industry document databases. People with different levels of expertise are expected to sort through search results and determine relevance. This study will examine how people use a tobacco industry document database and how searching can be made easier for them.
We will conduct a survey of the users of the Legacy Tobacco Documents Library and the Tobacco Control Archives to determine who uses the documents and why. In addition, we will ask users what features they would find useful when searching. Then, we will develop and test two novel methods for searching the tobacco industry documents. The first method will be based on the Flamenco search interface, which uses data about the documents to enable both keyword searching and browsing. The second method will be a text data-mining engine, built specifically for tobacco documents, that uses statistical techniques. This tool will help researchers by placing the connections among documents in context. For example, a user might be interested in when the tobacco companies started using a particular phrase or tactic.
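As a minimal illustration of the phrase example above, the following Python sketch counts, per year, how many documents contain a given phrase and reports the earliest year in which it appears. It is not the proposed data-mining engine. The record layout (a year paired with document text), the function names, and the sample records are all assumptions invented for this illustration.

```python
from collections import Counter


def phrase_timeline(documents, phrase):
    """Count, per year, the documents whose text contains the phrase.

    `documents` is assumed to be an iterable of (year, text) pairs;
    this layout is hypothetical, not the archives' actual schema.
    Matching is a simple case-insensitive substring test.
    """
    needle = phrase.lower()
    counts = Counter()
    for year, text in documents:
        if needle in text.lower():
            counts[year] += 1
    # Return a plain dict ordered by year for easy display.
    return dict(sorted(counts.items()))


def first_use(documents, phrase):
    """Return the earliest year in which the phrase appears, or None."""
    timeline = phrase_timeline(documents, phrase)
    return min(timeline) if timeline else None


# Invented toy records, for illustration only.
sample = [
    (1972, "Memo about packaging."),
    (1978, "Notes on the Example Phrase for the spring campaign."),
    (1981, "Schedule for the example phrase rollout."),
    (1981, "Budget review."),
]

timeline = phrase_timeline(sample, "example phrase")
earliest = first_use(sample, "example phrase")
```

A real engine would need to handle scanned-text errors, phrase variants, and undated documents, none of which this sketch attempts.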
The tobacco industry documents have already been valuable for advancing the public health objectives of reducing tobacco use and exposure. However, tobacco control researchers have done most document analyses. Developing new ways of searching could make the documents more available and open up analysis to other researchers and advocates. Our objective is to develop new methods of searching the tobacco industry documents by using informatics to discover new information about the tobacco industry. Using these methods will enable people to examine the industry in a different way and facilitate connections that were not easily made using a standard search engine. Despite the considerable effort and resources that have gone into establishing and maintaining these millions of documents, there have been no studies of methods to search the documents more effectively.
Who searches the internal tobacco industry documents and why?
Periodical: National Conference on Tobacco or Health
Authors: Michel MC, Bero LA